<a href="https://colab.research.google.com/github/Kiron-Ang/DSC/blob/main/vacation_recommender_system.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Vacation Recommender System
### Kiron Ang, November 2024

This notebook contains Python code for a content-based recommender system that helps plan a new vacation. Users enter the month and country of each previous vacation and then receive recommended months and countries for future trips.

The problem with a collaborative recommender system is sample selection bias. Relying on popular online reviews skews the recommendations toward trips that many people already take, such as beach vacations during the summer months. Less common trips should also appear in the system's recommendations, because the road less traveled can still be rewarding: climate, prices, and geopolitical conditions all change with the time of year.

In [2]:
print("Printing version numbers. . .")

!python -V

!pip install -U polars > output.txt
import polars
print("polars", polars.__version__)

!pip install -U scikit-learn > output.txt
import sklearn
print("scikit-learn", sklearn.__version__)

import ipywidgets
print("ipywidgets", ipywidgets.__version__)

import IPython
print("IPython", IPython.__version__)

Printing version numbers. . .
Python 3.10.12
polars 1.12.0
scikit-learn 1.5.2
ipywidgets 7.7.1
IPython 7.34.0


In [3]:
# www.un.org/en/about-us/member-states

countries = ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola", "Antigua and Barbuda", "Argentina", "Armenia", "Australia", "Austria", "Azerbaijan", "Bahamas", "Bahrain", "Bangladesh", "Barbados", "Belarus", "Belgium", "Belize", "Benin", "Bhutan", "Bolivia (Plurinational State of)", "Bosnia and Herzegovina", "Botswana", "Brazil", "Brunei Darussalam", "Bulgaria", "Burkina Faso", "Burundi", "Cabo Verde", "Cambodia", "Cameroon", "Canada", "Central African Republic", "Chad", "Chile", "China", "Colombia", "Comoros", "Congo", "Costa Rica", "Côte D'Ivoire", "Croatia", "Cuba", "Cyprus", "Czechia", "Democratic People's Republic of Korea", "Democratic Republic of the Congo", "Denmark", "Djibouti", "Dominica", "Dominican Republic", "Ecuador", "Egypt", "El Salvador", "Equatorial Guinea", "Eritrea", "Estonia", "Eswatini", "Ethiopia", "Fiji", "Finland", "France", "Gabon", "Gambia (Republic of The)", "Georgia", "Germany", "Ghana", "Greece", "Grenada", "Guatemala", "Guinea", "Guinea Bissau", "Guyana", "Haiti", "Honduras", "Hungary", "Iceland", "India", "Indonesia", "Iran (Islamic Republic of)", "Iraq", "Ireland", "Israel", "Italy", "Jamaica", "Japan", "Jordan", "Kazakhstan", "Kenya", "Kiribati", "Kuwait", "Kyrgyzstan", "Lao People’s Democratic Republic", "Latvia", "Lebanon", "Lesotho", "Liberia", "Libya", "Liechtenstein", "Lithuania", "Luxembourg", "Madagascar", "Malawi", "Malaysia", "Maldives", "Mali", "Malta", "Marshall Islands", "Mauritania", "Mauritius", "Mexico", "Micronesia (Federated States of)", "Monaco", "Mongolia", "Montenegro", "Morocco", "Mozambique", "Myanmar", "Namibia", "Nauru", "Nepal", "Netherlands (Kingdom of the)", "New Zealand", "Nicaragua", "Niger", "Nigeria", "North Macedonia", "Norway", "Oman", "Pakistan", "Palau", "Panama", "Papua New Guinea", "Paraguay", "Peru", "Philippines", "Poland", "Portugal", "Qatar", "Republic of Korea", "Republic of Moldova", "Romania", "Russian Federation", "Rwanda", "Saint Kitts and Nevis", "Saint Lucia", "Saint Vincent and the Grenadines", "Samoa", "San Marino", "Sao Tome and Principe", "Saudi Arabia", "Senegal", "Serbia", "Seychelles", "Sierra Leone", "Singapore", "Slovakia", "Slovenia", "Solomon Islands", "Somalia", "South Africa", "South Sudan", "Spain", "Sri Lanka", "Sudan", "Suriname", "Sweden", "Switzerland", "Syrian Arab Republic", "Tajikistan", "Thailand", "Timor-Leste", "Togo", "Tonga", "Trinidad and Tobago", "Tunisia", "Türkiye", "Turkmenistan", "Tuvalu", "Uganda", "Ukraine", "United Arab Emirates", "United Kingdom of Great Britain and Northern Ireland", "United Republic of Tanzania", "United States of America", "Uruguay", "Uzbekistan", "Vanuatu", "Venezuela, Bolivarian Republic of", "Viet Nam", "Yemen", "Zambia", "Zimbabwe"]
months = ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"]

country_dropdown = ipywidgets.Dropdown(options = countries, description = "Country:")
month_dropdown = ipywidgets.Dropdown(options = months, description = "Month:")

past_vacations = {}

def submit_survey(month, country):
    if month in past_vacations:
        past_vacations[month].append(country)
    else:
        past_vacations[month] = [country]
    print(f"Adding a {month} trip to {country}. . .")
    print(f"Vacations: {past_vacations}")

def on_submit(button):
    submit_survey(month_dropdown.value, country_dropdown.value)

submit_button = ipywidgets.Button(description = "Submit")
submit_button.on_click(on_submit)

print("Please use the form below to enter information")
print("about previous vacations that you enjoyed.")
print("Select the month that you traveled, along with")
print("the country that you visited. If your trip was")
print("longer than a month, then put down the month")
print("that you enjoyed the most. Fill out the form as")
print("many times as you need to. If you visited a")
print("country several times, please fill out the form")
print("for each time you visited.")
print("")

IPython.display.display(month_dropdown, country_dropdown, submit_button)

Please use the form below to enter information
about previous vacations that you enjoyed.
Select the month that you traveled, along with
the country that you visited. If your trip was
longer than a month, then put down the month
that you enjoyed the most. Fill out the form as
many times as you need to. If you visited a
country several times, please fill out the form
for each time you visited.



Dropdown(description='Month:', options=('January', 'February', 'March', 'April', 'May', 'June', 'July', 'Augus…

Dropdown(description='Country:', options=('Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola', 'Antigua a…

Button(description='Submit', style=ButtonStyle())
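Once a few trips have been submitted, `past_vacations` maps each month to a list of countries. The sketch below shows one way that dictionary could be summarized into a simple user profile. The helper name `summarize_vacations` and the example answers are hypothetical and are not part of the original notebook.

```python
from collections import Counter

def summarize_vacations(past_vacations):
    """Count trips per month and per country from a {month: [countries]} dict."""
    month_counts = Counter(
        {month: len(trips) for month, trips in past_vacations.items()}
    )
    country_counts = Counter(
        country for trips in past_vacations.values() for country in trips
    )
    return month_counts, country_counts

# Hypothetical survey answers, in the same shape that submit_survey builds.
example = {"July": ["Japan", "Italy"], "December": ["Japan"]}
month_counts, country_counts = summarize_vacations(example)
print(month_counts.most_common(1))    # [('July', 2)]
print(country_counts.most_common(1))  # [('Japan', 2)]
```

Because repeat visits are stored as repeated entries, a country visited several times naturally receives a higher count.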

In [34]:
# data.un.org

# skip_rows = 1 skips the first line of each file, so the second line is used as the header.
tourism = polars.read_csv("https://data.un.org/_Docs/SYB/CSV/SYB66_176_202310_Tourist-Visitors%20Arrival%20and%20Expenditure.csv", encoding = "latin-1", skip_rows = 1)
gdp = polars.read_csv("https://data.un.org/_Docs/SYB/CSV/SYB66_230_202310_GDP%20and%20GDP%20Per%20Capita.csv", encoding = "latin-1", skip_rows = 1)
# infer_schema = False reads every column of the crime file as a string, including "Year".
crime = polars.read_csv("https://data.un.org/_Docs/SYB/CSV/SYB66_328_202310_Intentional%20homicides%20and%20other%20crimes.csv", encoding = "latin-1", skip_rows = 1, infer_schema = False)

# Keep only the 2021 rows ("Year" is a string in the crime table).
tourism = tourism.filter(tourism["Year"] == 2021)
gdp = gdp.filter(gdp["Year"] == 2021)
crime = crime.filter(crime["Year"] == "2021")

# Keep only rows whose second column (the country name) is in the countries list.
tourism = tourism.filter(tourism[:, 1].is_in(countries))
gdp = gdp.filter(gdp[:, 1].is_in(countries))
crime = crime.filter(crime[:, 1].is_in(countries))

# Keep a single series from each table.
tourism = tourism.filter(tourism["Series"] == "Tourist/visitor arrivals (thousands)")
gdp = gdp.filter(gdp["Series"] == "GDP per capita (US dollars)")
crime = crime.filter(crime["Series"] == "Assault rate per 100,000 population")
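The `is_in(countries)` filters silently drop any country that has no matching row, whether because its name is spelled differently in the CSV or because no 2021 value was reported. A small check like the one below could list which names were lost. The helper `unmatched_names` is a hypothetical addition, and the two sample lists are taken from the start of `countries` and from the first rows of this notebook's tourism output.

```python
def unmatched_names(wanted, available):
    """Return names in `wanted` that never occur in `available`, keeping order."""
    available_set = set(available)
    return [name for name in wanted if name not in available_set]

# First entries of the `countries` list defined earlier in this notebook.
wanted = ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola", "Antigua and Barbuda"]
# Country names visible in the first rows of the filtered tourism table.
available = ["Albania", "Algeria", "Andorra", "Angola", "Antigua and Barbuda"]

print(unmatched_names(wanted, available))  # ['Afghanistan']
```

In the notebook itself, `available` could presumably be built from a filtered table with something like `tourism[:, 1].to_list()`.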



In [36]:
tourism

Region/Country/Area,Unnamed: 1_level_0,Year,Series,Tourism arrivals series type,Tourism arrivals series type footnote,Value,Footnotes,Source
i64,str,i64,str,str,str,str,str,str
8,"""Albania""",2021,"""Tourist/visitor arrivals (thousands)""","""TF""",,"""5,515""","""Excluding nationals residing abroad.""","""World Tourism Organization (UNWTO), Madrid, the UNWTO Statistics Database, last accessed December 2022."""
12,"""Algeria""",2021,"""Tourist/visitor arrivals (thousands)""","""VF""",,"""125""","""Including nationals residing abroad.""","""World Tourism Organization (UNWTO), Madrid, the UNWTO Statistics Database, last accessed December 2022."""
20,"""Andorra""",2021,"""Tourist/visitor arrivals (thousands)""","""TF""",,"""1,949""",,"""World Tourism Organization (UNWTO), Madrid, the UNWTO Statistics Database, last accessed December 2022."""
24,"""Angola""",2021,"""Tourist/visitor arrivals (thousands)""","""TF""",,"""64""",,"""World Tourism Organization (UNWTO), Madrid, the UNWTO Statistics Database, last accessed December 2022."""
28,"""Antigua and Barbuda""",2021,"""Tourist/visitor arrivals (thousands)""","""TF""",,"""169""","""Excluding nationals residing abroad.;Arrivals by air.""","""World Tourism Organization (UNWTO), Madrid, the UNWTO Statistics Database, last accessed December 2022."""
…,…,…,…,…,…,…,…,…
840,"""United States of America""",2021,"""Tourist/visitor arrivals (thousands)""","""TF""",,"""22,100""",,"""World Tourism Organization (UNWTO), Madrid, the UNWTO Statistics Database, last accessed December 2022."""
860,"""Uzbekistan""",2021,"""Tourist/visitor arrivals (thousands)""","""VF""",,"""1,881""",,"""World Tourism Organization (UNWTO), Madrid, the UNWTO Statistics Database, last accessed December 2022."""
704,"""Viet Nam""",2021,"""Tourist/visitor arrivals (thousands)""","""VF""",,"""157""","""Including nationals residing abroad.""","""World Tourism Organization (UNWTO), Madrid, the UNWTO Statistics Database, last accessed December 2022."""
894,"""Zambia""",2021,"""Tourist/visitor arrivals (thousands)""","""TF""",,"""554""",,"""World Tourism Organization (UNWTO), Madrid, the UNWTO Statistics Database, last accessed December 2022."""
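The `Value` column in this output has type `str` and uses commas as thousands separators (for example `5,515`), so it would need to be converted before any numeric comparison. A minimal sketch of that conversion in plain Python follows; the function name `parse_un_number` is hypothetical.

```python
def parse_un_number(text):
    """Convert a numeric string with comma thousands separators to a float."""
    return float(text.replace(",", ""))

# Values copied from the tourism table above.
print(parse_un_number("5,515"))   # 5515.0
print(parse_un_number("22,100"))  # 22100.0
print(parse_un_number("125"))     # 125.0
```

Inside polars, the same idea could likely be written as an expression along the lines of `polars.col("Value").str.replace_all(",", "").cast(polars.Float64)`, though that is an assumption about how the notebook would continue rather than something it does.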


In [37]:
gdp

Region/Country/Area,Unnamed: 1_level_0,Year,Series,Value,Footnotes,Source
i64,str,i64,str,str,str,str
4,"""Afghanistan""",2021,"""GDP per capita (US dollars)""","""373""",,"""United Nations Statistics Division, New York, National Accounts Statistics: Analysis of Main Aggregates (AMA) database, last accessed April 2023."""
8,"""Albania""",2021,"""GDP per capita (US dollars)""","""6,396""",,"""United Nations Statistics Division, New York, National Accounts Statistics: Analysis of Main Aggregates (AMA) database, last accessed April 2023."""
12,"""Algeria""",2021,"""GDP per capita (US dollars)""","""3,700""",,"""United Nations Statistics Division, New York, National Accounts Statistics: Analysis of Main Aggregates (AMA) database, last accessed April 2023."""
20,"""Andorra""",2021,"""GDP per capita (US dollars)""","""42,066""",,"""United Nations Statistics Division, New York, National Accounts Statistics: Analysis of Main Aggregates (AMA) database, last accessed April 2023."""
24,"""Angola""",2021,"""GDP per capita (US dollars)""","""2,044""",,"""United Nations Statistics Division, New York, National Accounts Statistics: Analysis of Main Aggregates (AMA) database, last accessed April 2023."""
…,…,…,…,…,…,…
548,"""Vanuatu""",2021,"""GDP per capita (US dollars)""","""3,073""",,"""United Nations Statistics Division, New York, National Accounts Statistics: Analysis of Main Aggregates (AMA) database, last accessed April 2023."""
704,"""Viet Nam""",2021,"""GDP per capita (US dollars)""","""3,756""",,"""United Nations Statistics Division, New York, National Accounts Statistics: Analysis of Main Aggregates (AMA) database, last accessed April 2023."""
887,"""Yemen""",2021,"""GDP per capita (US dollars)""","""302""",,"""United Nations Statistics Division, New York, National Accounts Statistics: Analysis of Main Aggregates (AMA) database, last accessed April 2023."""
894,"""Zambia""",2021,"""GDP per capita (US dollars)""","""1,095""",,"""United Nations Statistics Division, New York, National Accounts Statistics: Analysis of Main Aggregates (AMA) database, last accessed April 2023."""


In [38]:
crime

Region/Country/Area,Unnamed: 1_level_0,Year,Series,Value,Footnotes,Source
str,str,str,str,str,str,str
"""8""","""Albania""","""2021""","""Assault rate per 100,000 population""","""5.7""",,"""United Nations Office on Drugs and Crime (UNODC), Vienna, UNODC Statistics database, last accessed June 2023."""
"""12""","""Algeria""","""2021""","""Assault rate per 100,000 population""","""22.7""",,"""United Nations Office on Drugs and Crime (UNODC), Vienna, UNODC Statistics database, last accessed June 2023."""
"""32""","""Argentina""","""2021""","""Assault rate per 100,000 population""","""340.6""",,"""United Nations Office on Drugs and Crime (UNODC), Vienna, UNODC Statistics database, last accessed June 2023."""
"""36""","""Australia""","""2021""","""Assault rate per 100,000 population""","""289.0""",,"""United Nations Office on Drugs and Crime (UNODC), Vienna, UNODC Statistics database, last accessed June 2023."""
"""40""","""Austria""","""2021""","""Assault rate per 100,000 population""","""40.5""",,"""United Nations Office on Drugs and Crime (UNODC), Vienna, UNODC Statistics database, last accessed June 2023."""
…,…,…,…,…,…,…
"""756""","""Switzerland""","""2021""","""Assault rate per 100,000 population""","""7.5""",,"""United Nations Office on Drugs and Crime (UNODC), Vienna, UNODC Statistics database, last accessed June 2023."""
"""764""","""Thailand""","""2021""","""Assault rate per 100,000 population""","""13.2""",,"""United Nations Office on Drugs and Crime (UNODC), Vienna, UNODC Statistics database, last accessed June 2023."""
"""784""","""United Arab Emirates""","""2021""","""Assault rate per 100,000 population""","""1.5""",,"""United Nations Office on Drugs and Crime (UNODC), Vienna, UNODC Statistics database, last accessed June 2023."""
"""840""","""United States of America""","""2021""","""Assault rate per 100,000 population""","""280.1""",,"""United Nations Office on Drugs and Crime (UNODC), Vienna, UNODC Statistics database, last accessed June 2023."""

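The three tables describe each country by numeric features, which is the raw material for a content-based recommendation. The sketch below is only an illustration under stated assumptions, not the notebook's actual method: it uses two features (tourist arrivals and GDP per capita) for four countries whose 2021 values appear in the outputs above, rescales each feature to [0, 1], and ranks countries by Euclidean distance to one the user liked. The names `features`, `min_max_scale`, and `recommend` are hypothetical, and plain Python stands in for scikit-learn.

```python
import math

# 2021 values copied from the tourism and gdp tables shown in this notebook:
# (tourist/visitor arrivals in thousands, GDP per capita in US dollars).
features = {
    "Albania": (5515.0, 6396.0),
    "Algeria": (125.0, 3700.0),
    "Andorra": (1949.0, 42066.0),
    "Angola": (64.0, 2044.0),
}

def min_max_scale(table):
    """Rescale every feature column to the range [0, 1].

    Assumes each column has at least two distinct values.
    """
    columns = list(zip(*table.values()))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return {
        name: tuple((v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs))
        for name, row in table.items()
    }

def recommend(liked, table):
    """Rank the other countries by Euclidean distance to the liked one."""
    scaled = min_max_scale(table)
    others = [name for name in scaled if name != liked]
    return sorted(others, key=lambda name: math.dist(scaled[liked], scaled[name]))

print(recommend("Algeria", features))  # ['Angola', 'Albania', 'Andorra']
```

Scaling matters here because GDP per capita and arrival counts sit on very different numeric ranges; without it, the larger-valued feature would dominate the distance.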