# Final assignment: Content based filter to explore similar cities to go for holiday during COVID-19 Scenario
Data Science Capstone Project
Table of contents
Introduction: Business Problem
Data
Methodology
Analysis
Results and Discussion
Conclusion

#Introduction: Business Problem 
This project is about finding a correct destination to go during COVID-19 situation by having analysis of similar cities where COVID-19 cases are approximately same to due to x number of factors which are same.

To solve this problem, we will create a python script capable of recommending to a user where they should go on their next vacation, based on the similarities in the cities and to avoid travelling to cities where same type of situation is there for COVID-19 and are not safe to travel.

For this, we will create a content based filter, and we will feed it with the data taken from the Foursquare API. Using this tool, we will define the characteristics of each city, which will be the number of places it has in each category (Italian, Asian, Mediterranean restaurants, beaches, ports, mountains, parks, ...) Once we have the cities with their characteristics We will pass our user through the algorithm, and it will tell us which are the most promising cities for it.

# Data 
For the project we will get the coordinates of the cities through the geocodres API, and the characteristics of the cities from the Foursquare API. We could directly enter the latitude and longitude of the city and ask Foursquare to return the most interesting sites that are nearby, but that would give us bad results.

Why? Well, because the API will return a maximum of 100 sites, but these are organized into more than 500 categories. This would create an underfitting problem in the data. To solve this we will take two actions.

The first will be to group all these characteristics into 177 subgroups. By doing this we will group the most similar sites in the same category. 

The second, will be to ask the api how many places of each category there are in each city, so that it will not only return the closest or most important places, but we will be able to know how many Italian restaurants there are, how many Koreans, how many Americans, how many Mediterranean, ... Thus, asking specifically for each of the categories and taking into account that each category can return 50 sites, we will take into account thousands of sites, and not only the first 100.

Once we have all the data of all the cities, we will ask the user for cities that he has visited previously and what grade would he give them, and in this way we will be able to find what kind of situation prevails in each city and will be able to take a decision whether to travel to that city or not.

In [2]:
import requests
import pandas as pd
import numpy as np
import random
from geopy.geocoders import Nominatim
from IPython.display import Image 
from IPython.core.display import HTML 
from pandas.io.json import json_normalize
import folium # plotting library
from geopy import geocoders  
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
from jupyter_dash import JupyterDash
import plotly.express as px
import requests

In [5]:
List_cities=["Puerto del Rosario, Canary Islands","Cairo","Kusadasi, Turkey","Chamonix","Beijing","Cannes","Amsterdam","Bodrum","Iguazu National Park, Argentina","Courchevel","Berlin","Aberdare","Amritsar","Edimburgh","New York","Orlando", "Sydney","London","Paris","Venice","Manhattan","Cape Town","Las Vegas","Rome","Rio de Janeiro","Maldives","Hawaii","South Island, New Zealand", "Grand Canyon", "San Diego","Niagara Falls","San Francisco","Los Angeles","Dubai","Auckland","Singapore","Seychelles", "Bali","Durban","Bangkok","Iceland","Whitsunday Islands National Park","Cairns","Costa del Sol","Antigua","Melbourne","Mallorca","Lake District","Barbados","Bahamas","Abu Simbel","Bora Bora","Sharm el Sheikh", "Madrid","Algarve","Zermatt","Victoria Falls","Marbella","Masai Mara, Kenya","Chichen Itza","Disney World","Florence","Puerto Banus","Toronto","Taj Mahal","Great Wall of china", "Menorca","Monaco","Luxor","Hong Kong","Banff National Park","Sorrento","Key West","Koh Samui, Thailand","Cancun","Nice","Machu Picchu","Yosemite","Oahu","Florida Keys","Guam","Dublin","Vancouver","Ayers Rock","La Digue Island","Cayman Islands","Naples","St. Pete Beach, Florida", "Barcelona", "Ibiza","Adelaide","Airlie Beach Queensland",'Benidorm',"Buenos Aires","Prague","Cuba","Paphos","Valley of the kings","Galapagos Islands","Isle of Man"]
print(len(List_cities))

100


In [15]:
gn = geocoders.GeoNames(username="sergibago")
data=[]
for city in List_cities:
  try:
    loc=gn.geocode(city, timeout=None)
  except:
    loc=gn.geocode(city, timeout=None)
  print(loc)
  latitude=loc.latitude
  longitude=loc.longitude
  data.append([city,latitude,longitude])

Cities_df=pd.DataFrame(data,columns=["Name","Latitude","Longitude"])

Puerto del Rosario, Canary Islands, Spain
Cairo, Cairo, Egypt
Kusadasi, Aydın, Turkey
Chamonix, Auvergne-Rhône-Alpes, France
Beijing, Beijing, China
Cannes, Provence-Alpes-Côte d'Azur, France
Amsterdam, North Holland, Netherlands
Bodrum, Muğla, Turkey
Iguazú National Park, Misiones, Argentina
Courchevel, Auvergne-Rhône-Alpes, France
Berlin, Berlin, Germany
Aberdare, Wales, United Kingdom
Amritsar, Punjab, India
Edinburgh, Scotland, United Kingdom
New York, New York, United States
Orlando, Florida, United States
Sydney, New South Wales, Australia
London, England, United Kingdom
Paris, Île-de-France, France
Venice, Veneto, Italy
Manhattan, New York, United States
Cape Town, Western Cape, South Africa
Las Vegas, Nevada, United States
Rome, Latium, Italy
Rio de Janeiro, Rio de Janeiro, Brazil
Maldives, Maldives
Hawaii, Hawaii, United States
South Island, New Zealand
Grand Cess Canyon
San Diego, California, United States
Niagara Falls, Ontario, Canada
San Francisco, California, United State

In [16]:
Cities_df=Cities_df.sort_values(by=["Name"])
Cities_df=Cities_df.reset_index(drop=True)
display(Cities_df)

Unnamed: 0,Name,Latitude,Longitude
0,Aberdare,51.71438,-3.44918
1,Abu Simbel,22.37571,31.61170
2,Adelaide,-34.92866,138.59863
3,Airlie Beach Queensland,-20.26751,148.71471
4,Algarve,37.08367,-8.24902
...,...,...,...
95,Venice,45.43713,12.33265
96,Victoria Falls,-17.93285,25.83066
97,Whitsunday Islands National Park,-20.24872,148.98025
98,Yosemite,36.77606,-119.71903


In [17]:
map_cities = folium.Map(location=[0,0], zoom_start=3)
for lat, lon,Name in zip(Cities_df["Latitude"],Cities_df["Longitude"],Cities_df["Name"]):
    folium.Marker([lat,lon], popup=Name).add_to(map_cities)
map_cities

In [18]:
List_categories=["Aquarium","Arcade & Bowling","Casino","Cinema","Night club","Disco","Music","Art","Stadium","Theme Park","Water Park","Zoo","American Restaurant","African Restaurant","Italian Restaurant","Asian Restaurant","Bistro","Buffet","Cafeteria","Creperie","Bodega","Fast Food Restaurant","French Restaurant","Indian Resturant","Irish Pub","Italian restaurant","Latin American Restaurant","Mediterranean Restaurant","Mexican Restaurant","Seafood Restaurant","Steakhouse","Turkish Restaurant","Nightlife Spot","Bar","Beach Bar","Cocktail Bar","Karaoke","Pub","Sport bar","Brewery","Lounge","Nightclub","Golf","Bay","Beach","Surf spot","Botanical Garden","Bridge","Canal","Castle","Dive Spot","Field","Farm","Fishing spot","Forest","Garden","Harbour","Hill","Island","Lake","Lighthouse","Mountain","National Park","Park","Pedestrian Aera","Plaza","River","Ski Area","Stables","Vineyard","Volcano","Waterfall","Windmill","Government building","Library","Observatory","Office","Social Club","Spiritual Center","Antique shop","Arts store","Clothing store","Gift shop","Massage studio","Music store","Outlet","Airport","Bike rental","Boat rental","Ferry or Boat","Bus","Hotel","Resort","Motel","Hostel","Vacation Rental","Bed & Breakfast","Metro station","Pier","RV park"]
print(len(List_categories))

100


In [21]:
CLIENT_ID = # your Foursquare ID
CLIENT_SECRET = # your Foursquare Secret
VERSION = '20180604'
LIMIT = 100
radius=5000

In [22]:
Full_cities_df=pd.DataFrame(columns=List_categories,index=List_cities)
Full_cities_df=Full_cities_df.fillna(0).sort_index()
display(Full_cities_df)

Unnamed: 0,Aquarium,Arcade & Bowling,Casino,Cinema,Night club,Disco,Music,Art,Stadium,Theme Park,...,Bus,Hotel,Resort,Motel,Hostel,Vacation Rental,Bed & Breakfast,Metro station,Pier,RV park
Aberdare,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Abu Simbel,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Adelaide,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Airlie Beach Queensland,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Algarve,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Venice,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Victoria Falls,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Whitsunday Islands National Park,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Yosemite,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [28]:
from google.colab import drive
drive.mount('https://drive.google.com/file/d/1awEpibozVXc2FUwWEtW7z9SHsvL7PQ4W/view?usp=sharing')

ModuleNotFoundError: No module named 'google.colab'

In [25]:
import time

def Get_all_cities_values(Cities_df_lat_lon,List_categories):
  actual_export=1
  global radius, CLIENT_ID, CLIENT_SECRET,ACCESS_TOKEN,VERSION,LIMIT,Full_cities_df,Full_Data_Locations_df
  AllData=[]
  id_num=0
  Client_ID=CLIENT_ID[id_num]
  Client_SECRET=CLIENT_SECRET[id_num]
  print("Changed client id to: ",str(id_num), " token: ", str(Client_ID))
  for name, lat, lon in zip(Cities_df_lat_lon["Name"],Cities_df_lat_lon["Latitude"],Cities_df_lat_lon["Longitude"]):
    print(name)
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(Client_ID, Client_SECRET, lat, lon, VERSION, radius, LIMIT)
    result=requests.get(url).json() 
    num=len(result['response']['venues'])
    if(num>90):
      for search_query in List_categories:
        url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(Client_ID, Client_SECRET, lat, lon, VERSION, search_query, radius, LIMIT)
        result=requests.get(url).json()
        exported=False
        print(result)
        time.sleep(1)
        if(result['meta']['code']==403 or result['meta']['code']==429):
          while(result['meta']['code']==403 or result['meta']['code']==429):      
            if(exported==False):
              print("Export: ",str(actual_export))
              Full_Data_Locations_df=pd.DataFrame(AllData,columns=["City","Category","Name","Latitude","Longitude"])
            
              Path="/content/gdrive/MyDrive/Coursera_IBM_final_Capstone"+str(actual_export)+".csv"
              print(Path)
              Full_Data_Locations_df.to_csv(Path, index = True)
              AllData=[]
exported=True
              actual_export+=1;
            time.sleep(30)
            if(id_num==9):
              id_num=0
            else:
              id_num+=1
            Client_ID=CLIENT_ID[id_num]
            Client_SECRET=CLIENT_SECRET[id_num]
            print("Changed client id to: ",str(id_num), " token: ", str(Client_ID))
            url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(Client_ID, Client_SECRET, lat, lon, VERSION, search_query, radius, LIMIT)
          
            result=requests.get(url).json()
            print(result)

        exported=False
        results=result['response']['venues']
        print(results)
        count=0
        for i in results:
          Place_Name= i["name"]
          Lat= i["location"]["lat"]
          Lon= i["location"]["lng"]
          AllData.append([name,search_query,Place_Name,Lat,Lon])
          count+=1
        num_results = count
        Full_cities_df[search_query][name]=num_results
    
      Full_Data_Locations_df=pd.DataFrame(AllData,columns=["City","Category","Name","Latitude","Longitude"])
      Path="/content/gdrive/MyDrive/Coursera_IBM_final_Capstone"+str(actual_export)+".csv"
      print(Path)
      Full_Data_Locations_df.to_csv(Path, index = True)
      AllData=[]
      exported=True
      actual_export+=1;
  Full_Data_Locations_df=pd.DataFrame(AllData,columns=["City","Category","Name","Latitude","Longitude"])

IndentationError: unindent does not match any outer indentation level (<tokenize>, line 35)

In [27]:
Path="https://drive.google.com/file/d/1awEpibozVXc2FUwWEtW7z9SHsvL7PQ4W/view?usp=sharing"
#Full_cities_df.to_csv(Path, index = False)
Full_Data_Locations_df=pd.read_csv(Path)

ParserError: Error tokenizing data. C error: Expected 285 fields in line 131, saw 427


# Methodology 
Now we have all the data for all the cities. In this project we will base ourselves on these data to find out which are the most similar cities to each other, and which have the attributes that the client likes the most, to recommend the best possible vacations.

To do this, we will look at the percentage of sites in each category that each city has. That is, we will look for each city what percentage of beach it has, what percentage of mountains, which Italian restaurants, ...

It is important to evaluate cities by percentages of each category, and not by the number of sites they have in each category, since otherwise large cities would always win. That is, if the client was a fan of Italy, and of the beach, they would surely like things like Italian restaurants, art, beaches, beach bars and music stores. However, a city such as Barcelona could have many more Italian restaurants, art galleries, music stores and beaches, and the program would recommend this city rather than a small city in Italy, which is what the user would prefer. When using the percentages, although Barcelona still has many more places of those than for example Florence, Florence will be recommended much earlier, since the percentages of these things will be much higher than in Barcelona, ​​where there are many Italian restaurants, but many more Mediterranean. , Catalan or Spanish, and therefore Italian restaurants are overshadowed.

# Analysis 
First we look at the loaded data. We see that we have a total of +140.000 rows (that is, +140.000 sites) and each row has the attributes of city, category, name, latitude and longitude. We will use the name of the place, latitude and longitude at the end of the program, since when we make the recommendation of cities to the user we will show them on a map the sites that we think may interest him/her the most, so that they do not miss anything on their vacations!

In [29]:
display(Full_Data_Locations_df)

NameError: name 'Full_Data_Locations_df' is not defined

In [30]:

for index,row in Full_Data_Locations_df.iterrows():
  Full_cities_df.loc[row['City']][row['Category']]+=1
display(Full_cities_df.head(15))

NameError: name 'Full_Data_Locations_df' is not defined

In [31]:
Lat=Cities_df[Cities_df['Name']=='Hawaii']['Latitude'].values[0]
Lon=Cities_df[Cities_df['Name']=='Hawaii']['Longitude'].values[0]
hawaii_map = folium.Map(location=[Lat+0.5,Lon-0.5], zoom_start=10)
folium.Marker([Lat,Lon], popup="Hawaii").add_to(hawaii_map)
folium.Marker([21.300150, -157.846462], popup="Honolulu").add_to(hawaii_map)
folium.Circle([Lat, Lon], radius=5000, color='red', fill=False).add_to(hawaii_map)
hawaii_map

In [32]:
drop_index=[]
for e in range (len(Full_cities_df.index)):
  if(sum(Full_cities_df.iloc[e,:])<100):
    drop_index.append(Full_cities_df.index[e])
Citites_to_Drop=Full_cities_df[Full_cities_df.index.isin(drop_index)]
display(Citites_to_Drop)

Unnamed: 0,Aquarium,Arcade & Bowling,Casino,Cinema,Night club,Disco,Music,Art,Stadium,Theme Park,...,Bus,Hotel,Resort,Motel,Hostel,Vacation Rental,Bed & Breakfast,Metro station,Pier,RV park
Aberdare,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Abu Simbel,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Adelaide,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Airlie Beach Queensland,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Algarve,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Venice,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Victoria Falls,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Whitsunday Islands National Park,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Yosemite,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [33]:
Cities_dropped_df=Full_cities_df.drop(drop_index,axis=0)
display(Cities_dropped_df)
print(Cities_dropped_df.index)

Unnamed: 0,Aquarium,Arcade & Bowling,Casino,Cinema,Night club,Disco,Music,Art,Stadium,Theme Park,...,Bus,Hotel,Resort,Motel,Hostel,Vacation Rental,Bed & Breakfast,Metro station,Pier,RV park


Index([], dtype='object')


# Data processing
Let's start by normalizing our data set.

First we create a function to normalize all the data.

To normalize the data, what we will do is find the percentage of sites in each category that there are. For example, we will look at the total number of places found, what percentage are beaches, which mountains, which restaurants, ... If instead of normalizing doing the percentage we would normalize making the maximum number is 1 and the rest the proportional part ( for example, if there are 50 beaches and 25 mountains, the number in the beach category is 1 and 0.5 in the mountains category), because the algorithm seeks to maximize what the user prefers, the algorithm will determine what the user likes the most, and will look for the city that has the most in that category. If we did not do the percentages, the big cities would always win, because they are the ones that have the most things, when in truth what we want is not to find a city with many things, but a city of the same style as the ones that the user likes. Then, making the percentage, if the user wants a city that is 70% beach, 5% Italian restaurants, 10% resorts and 15% French restaurants, the algorithm will search for a city similar to this in percentages, and not for example, a city like Barcelona, ​​which may have many more beaches, Italian and French restaurants, and hotels and resorts, but it will not look anything like the city entered by the user.

In [34]:
def Apply_normalization(df):
  for e in range(len(df.index)):
    if(max(df.iloc[e,:])>0):
      df.iloc[e,:]=(df.iloc[e,:]/sum(df.iloc[e,:]))
  return df

In [35]:
Cities_grouped_nor=Cities_dropped_df.copy()
Cities_grouped_nor=Apply_normalization(Cities_grouped_nor)
display(Cities_grouped_nor.head(25))

Unnamed: 0,Aquarium,Arcade & Bowling,Casino,Cinema,Night club,Disco,Music,Art,Stadium,Theme Park,...,Bus,Hotel,Resort,Motel,Hostel,Vacation Rental,Bed & Breakfast,Metro station,Pier,RV park


In [36]:

def return_most_common_venues(row, num_top_venues):
    row_categories = row
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

In [37]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

columns = []
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

Cities_venues_sorted = pd.DataFrame(columns=columns,index=Cities_grouped_nor.index)
Cities_venues_sorted=Cities_venues_sorted.fillna(0).sort_index()

i=0
for ind,row in Cities_grouped_nor.iterrows():
    Cities_venues_sorted.iloc[i, 0:]=return_most_common_venues(row, num_top_venues)
    i+=1;

In [38]:
Cities_venues_sorted.head(25)

Unnamed: 0,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue


In [39]:
userInput = [
            {'City':'New York', 'rating':2},
            {'City':'Barcelona', 'rating':2.5},
            {'City':'Bora Bora', 'rating':5},
            {'City':'Melbourne', 'rating':5},
            {'City':'Bangkok', 'rating':3},
            {'City':'Barbados', 'rating':4},
            {'City':'Airlie Beach Queensland', 'rating':5},
            {'City':'Cancun', 'rating':5},
            {'City':'Berlin', 'rating':2.5},
            {'City':'Vancouver', 'rating':3},
            {'City':'San Francisco', 'rating':4},
            {'City':'Las Vegas', 'rating':3.5},
            {'City':'Cairo', 'rating':4},
         ] 
inputCities = pd.DataFrame(userInput)
inputCities

Unnamed: 0,City,rating
0,New York,2.0
1,Barcelona,2.5
2,Bora Bora,5.0
3,Melbourne,5.0
4,Bangkok,3.0
5,Barbados,4.0
6,Airlie Beach Queensland,5.0
7,Cancun,5.0
8,Berlin,2.5
9,Vancouver,3.0


In [40]:

UserCitiesPreferences=Cities_grouped_nor[Cities_grouped_nor.index.isin(inputCities["City"].tolist())]
display(UserCitiesPreferences.head(12))

Unnamed: 0,Aquarium,Arcade & Bowling,Casino,Cinema,Night club,Disco,Music,Art,Stadium,Theme Park,...,Bus,Hotel,Resort,Motel,Hostel,Vacation Rental,Bed & Breakfast,Metro station,Pier,RV park


In [41]:
UserCitiesPreferences_int=UserCitiesPreferences.copy().reset_index()
UserCitiesPreferences_int=UserCitiesPreferences_int.drop("index",axis=1)
User_profile=UserCitiesPreferences_int.transpose().dot(inputCities["rating"])
Top_15_user_preferences=User_profile.sort_values(ascending=False).head(15)
display(Top_15_user_preferences)

ValueError: matrices are not aligned

In [42]:
for city in inputCities["City"]:
  Cities_grouped_nor=Cities_grouped_nor[Cities_grouped_nor.index!=city]
display(Cities_grouped_nor)

Unnamed: 0,Aquarium,Arcade & Bowling,Casino,Cinema,Night club,Disco,Music,Art,Stadium,Theme Park,...,Bus,Hotel,Resort,Motel,Hostel,Vacation Rental,Bed & Breakfast,Metro station,Pier,RV park


In [43]:
recommendationTable_df = ((Cities_grouped_nor*User_profile).sum(axis=1))/(User_profile.sum())
recommendationTable_df = recommendationTable_df.sort_values(ascending=False)
display(recommendationTable_df.head(15))

NameError: name 'User_profile' is not defined

In [44]:
display(recommendationTable_df.tail(10))

NameError: name 'recommendationTable_df' is not defined

In [45]:
recommendationTable_df=recommendationTable_df[0:15]
recommendationTable_df=recommendationTable_df.sort_index()
recommendationTable_df=recommendationTable_df[recommendationTable_df.index.isin(recommendationTable_df.index[0:15].tolist())]
Found_Cities_venues_sorted=Cities_venues_sorted[Cities_venues_sorted.index.isin(recommendationTable_df.index.tolist())]
Found_Cities_venues_sorted["Match"]=recommendationTable_df.values
Found_Cities_venues_sorted=Found_Cities_venues_sorted.sort_values(by=["Match"],ascending=False)
columns=Found_Cities_venues_sorted.columns.tolist()
columns=columns[-1:]+columns[:-1]
Found_Cities_venues_sorted=Found_Cities_venues_sorted[columns]
display(Found_Cities_venues_sorted)

NameError: name 'recommendationTable_df' is not defined

In [46]:

Cities_Coordinates_df=Cities_df[Cities_df['Name'].isin(recommendationTable_df.index.tolist())].set_index("Name")
display(Cities_Coordinates_df)
Actual_full_Data_Locations_df=Full_Data_Locations_df[Full_Data_Locations_df['City'].isin(Cities_Coordinates_df.index.tolist())]
Actual_full_Data_Locations_df=Actual_full_Data_Locations_df[Actual_full_Data_Locations_df['Category'].isin(Top_15_user_preferences.index).tolist()]
Actual_full_Data_Locations_df['preference']=0
min_value=min(Top_15_user_preferences.values)
for i in range(len(Actual_full_Data_Locations_df.index)):
  Actual_full_Data_Locations_df.iloc[i,5]=(Top_15_user_preferences[Top_15_user_preferences.index == Actual_full_Data_Locations_df.iloc[i,1]][0]-(min_value-2))
display(Actual_full_Data_Locations_df)

NameError: name 'recommendationTable_df' is not defined

In [47]:
app = JupyterDash(__name__)

recommended_city=Found_Cities_venues_sorted.index[0]
cities_list=Found_Cities_venues_sorted.index.sort_values()
JupyterDash.infer_jupyter_proxy_config()
px.set_mapbox_access_token(open("/content/gdrive/MyDrive/mapbox_token.txt").read())
#df = px.data.carshare()

def serve_layout():
    return html.Div([html.H1('Recomended cities to visit and their top places',
                            style={'textAlign': 'center', 'color': '#D7DBDE',
                             'font-size': 45}),
                             html.Div([  html.Div(
                                            [ html.H2('Select a city:' ,
                                             style={'margin-right': '1em','font-size': '30px', 'color': '#D7DBDE','margin-left': '8em'})]
                                        ), dcc.Dropdown(id='input-city', 
                                                      options=[{'label': i, 'value': i} for i in cities_list],
                                                      value=recommended_city,
                                                      placeholder="Select a City",
                                                     style={'width':'80%', 'padding':'3px', 'font-size': '30px', 'color': '#000000', 'text-align-last' : 'center','align-items': 'center',}),], 
                                style={'width': '100%', 'display': 'flex', 'align-items': 'center', 'justify-content': 'center'}),
                                html.Div(dcc.Graph(id='city-plot', style={'width':'95%','height': '90vh'})),
                                ])

app.layout = serve_layout
                     
 # Callback decorator
@app.callback( Output('city-plot','figure'),
                [Input('input-city', 'value')])     
def get_graph(entered_city):
     global Cities_Coordinates_df;   

     Actual_Data_Locations_df=Actual_full_Data_Locations_df[Actual_full_Data_Locations_df['City']==entered_city]


     fig = px.scatter_mapbox(Actual_Data_Locations_df, lat="Latitude", lon="Longitude", color="Category", size="preference",text='Name',
              color_continuous_scale=px.colors.cyclical.IceFire, size_max=15, zoom=10)
     fig.update_layout()

     return fig

if __name__ == '__main__':
    app.run_server( mode="inline",host="localhost",port=9000,debug=True )

NameError: name 'Found_Cities_venues_sorted' is not defined

# Results and Discussion 
The result of this project is the recommended sites for a particular user. As has been seen and explained, the scores entered clearly correspond to a person with little interest in big cities, someone who enjoys relaxing vacations much more, in quiet places and especially with the beach. This way any user can decide whether he/she wants to visit any particular city or not due to COVID-19 situation. We can see, as of the 15 recommended cities, except Dubai, the other cities are relatively quiet cities, and most are beach, so it seems that the algorithm works quite well, and the recommendations are good.

# Conclusion 
In conclusion, we can say that this program is a good tool when planning a vacation during COVID-19 situation, since due to the wide range of places to go, and how quickly they all change, it is difficult to know where to go, and where you will find what you are looking for. You could choose to manually search for sites that seem good to you, and use applications such as google maps or Foursquare to find out if those sites really have what you are looking for, but it is always better if they can give it to you done, as in this case!

The final decision of where to go will be up to the client, but the recommendations are made.