## Final project

### Introduction

Food tourism is a growing industry. Some people travel to new cities just to taste new flavors and collect new culinary experiences. For this kind of tourist, it is particularly important to find a wide variety of restaurants near the city center. Foursquare can help to identify the next travel destination: by leveraging the information in its database, it is possible to check whether multiple restaurants are within walking distance of each other, so that the tourist can easily go from one to the next.

### Data
The data will be retrieved by querying the Foursquare server. Information on up to 50 restaurants around each city center will be retrieved as JSON messages. These messages contain relevant information such as the location of each restaurant and the type of food it serves.
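
As a rough illustration of the shape of such a message, the sketch below parses a hand-written, abridged response. The field names follow the columns that appear in the notebook output (`name`, `categories`, `location.lat`, `location.lng`, `location.distance`), and the sample values are copied from the first Dallas row. `summarize_venue` is a hypothetical helper name, not part of the notebook.

```python
import json

# Abridged, hand-written example of a venues/search response. The values are
# taken from the first Dallas row in the notebook output; this is not a live reply.
sample_response = """
{
  "response": {
    "venues": [
      {
        "name": "Ravenna Urban Italian Restaurant",
        "categories": [{"name": "Italian Restaurant"}],
        "location": {"lat": 32.780371, "lng": -96.801106, "distance": 560}
      }
    ]
  }
}
"""


def summarize_venue(venue):
    """Return (name, first category or None, lat, lng, distance from the centre)."""
    categories = venue.get("categories", [])
    category = categories[0]["name"] if categories else None
    location = venue["location"]
    return (venue["name"], category, location["lat"], location["lng"],
            location.get("distance"))


venues = json.loads(sample_response)["response"]["venues"]
summaries = [summarize_venue(v) for v in venues]
print(summaries)
```

The notebook itself does the equivalent flattening with `pd.json_normalize`, which is why the nested fields show up as dotted column names.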

### Methodology
First, restaurant data for Dallas and New York will be collected. This data will be transformed into dataframes for easy analysis and reshaped as required. To gain insight into the data, it will be visualized in a suitable format (here, maps). This is part of the exploratory research that will be performed. Then the results will be presented and discussed. Finally, a conclusion will be drawn.

### Results
* The spatial distribution of the restaurants differs between the two cities. In New York (NY) the restaurants are densely packed, while in Dallas they are more sparsely distributed.

* There are more restaurants around the city center in NY than in Dallas: the same query returned 50 results (the limit) for NY and 24 for Dallas.

* There is a greater variety of food in NY than in Dallas.
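
One way to put a number on "variety" is to count distinct categories. The sketch below is a minimal, standard-library illustration; `category_variety` is a hypothetical helper, and the input is only the ten Dallas categories visible in the notebook output, so it does not reproduce the full comparison between the two cities.

```python
from collections import Counter


def category_variety(categories):
    """Return (number of distinct categories, Counter of category frequencies)."""
    counts = Counter(c for c in categories if c is not None)
    return len(counts), counts


# Categories of the first ten Dallas rows shown in the notebook output
# (an illustrative subset, not the full 24-row result).
dallas_head = [
    "Italian Restaurant", "Mexican Restaurant", "American Restaurant",
    "Hotel Bar", "Theme Restaurant", "Italian Restaurant", "Restaurant",
    "Chinese Restaurant", "Fast Food Restaurant", "Restaurant",
]

distinct, counts = category_variety(dallas_head)
print(distinct, counts.most_common(3))
```

On the full dataframes, `df['categories'].nunique()` and `df['categories'].value_counts()` would give the same kind of summary.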

### Discussion
* New York has a higher population density, and this is also reflected in its restaurants. For this reason, there are plenty of restaurants within 1 km of the city center. This is a great opportunity for a food tourist who wants to visit multiple restaurants in a single afternoon.

* Because NY is denser, the retrieved data might be biased, since the search was limited to 1 km around the city center. There are more restaurants in this area in NY than in Dallas. Therefore, it might have been better to retrieve the 50 closest restaurants to the city center, rather than limiting the search to 1 km.
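
The alternative suggested in the last point could look roughly like the sketch below: rank candidates by their distance from the center and keep the nearest `n`, with no radius cut-off. It assumes the candidate list has already been fetched with a wide enough search area. `closest_venues` is a hypothetical helper, and the sample records reuse the names and `distance` values of the first five Dallas rows in the notebook output.

```python
def closest_venues(venues, n=50):
    """Return the n venues nearest to the search centre, ignoring any radius."""
    return sorted(venues, key=lambda v: v["distance"])[:n]


# Name and distance from the centre for the first five Dallas rows.
dallas_sample = [
    {"name": "Ravenna Urban Italian Restaurant", "distance": 560},
    {"name": "Enchilada's Restaurant", "distance": 619},
    {"name": "Pyramid Restaurant & Bar", "distance": 1116},
    {"name": "650 North Restaurant", "distance": 1146},
    {"name": "Brother's restaurant", "distance": 118},
]

nearest_three = [v["name"] for v in closest_venues(dallas_sample, n=3)]
print(nearest_three)
```

Comparing the distance of the 50th-closest restaurant in each city would then give a density measure that does not depend on a fixed radius.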

### Conclusion
* To conclude, New York is a more attractive city for this kind of food tourism, since there are plenty of restaurants within walking distance of each other. To do the same kind of tourism in Dallas, a car is required.
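
The notebook judges "walking distance" visually from the maps. A possible way to quantify it, sketched here under the assumption that straight-line distance is an acceptable proxy for walking distance, is the haversine formula applied to each restaurant and its nearest neighbour. `haversine_m` and `nearest_neighbour_distances` are hypothetical helpers, and the coordinates are three New York rows from the notebook output.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_neighbour_distances(points):
    """For each point, the distance in metres to its closest other point."""
    return [
        min(haversine_m(*p, *q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


# Illustrative subset of the New York results.
ny_points = [
    (40.715545, -73.998137),  # Deluxe Green Bo Restaurant
    (40.715696, -73.998667),  # Bo Ky Restaurant
    (40.713629, -73.99723),   # Golden Unicorn Restaurant
]

distances = nearest_neighbour_distances(ny_points)
print([round(d) for d in distances])
```

Averaging these nearest-neighbour distances over all rows of each city would turn the qualitative "denser" observation into a comparable figure.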

### Libraries

In [1]:
import numpy as np
import pandas as pd

import json
import requests

from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

import matplotlib.cm as cm
import matplotlib.colors as colors

from IPython.display import Image 
from IPython.core.display import HTML 

from sklearn.cluster import KMeans

import folium

print('Libraries imported.')

Libraries imported.


### Get Foursquare data

In [2]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (credential redacted; never publish real keys)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
ACCESS_TOKEN = 'YOUR_ACCESS_TOKEN' # your Foursquare Access Token
VERSION = '20210122'
LIMIT = 50

In [8]:
#search_query = ['Italian', 'French', 'American', 'Mexican', 'Indian', 'Turkish']               
search_query = ['Restaurant']

latitude_d = 32.7767
longitude_d = -96.7970

radius = 1000

# Results DataFrame
column_names = ['id','name','categories','referralId','hasPerk','location.lat','location.lng',
                'location.labeledLatLngs','location.distance','location.cc','location.country',
                'location.formattedAddress','location.address','location.postalCode','location.city',
                'location.state','location.crossStreet']
df_merged_d = pd.DataFrame(columns=column_names)

for search in search_query:
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}&limit={}'\
    .format(CLIENT_ID, CLIENT_SECRET, latitude_d, longitude_d,ACCESS_TOKEN, VERSION, search, radius, LIMIT)
    results = requests.get(url).json()

    # JSON to DataFrame
    venues = results['response']['venues']
    dataframe = pd.json_normalize(venues)
    df_merged_d = pd.concat([df_merged_d,dataframe])
    
# reset_index returns a new DataFrame, so the result must be assigned back
df_merged_d = df_merged_d.reset_index(drop=True)
print(df_merged_d.shape)
df_merged_d.head(5)

(24, 25)


Unnamed: 0,id,name,categories,referralId,hasPerk,location.lat,location.lng,location.labeledLatLngs,location.distance,location.cc,...,location.state,location.crossStreet,venuePage.id,delivery.id,delivery.url,delivery.provider.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.icon.name,location.neighborhood
0,4b15f41cf964a52002b623e3,Ravenna Urban Italian Restaurant,"[{'id': '4bf58dd8d48988d110941735', 'name': 'I...",v-1611367721,False,32.780371,-96.801106,"[{'label': 'display', 'lat': 32.78037110712935...",560,US,...,TX,at Field St.,,,,,,,,
1,41311c80f964a520320d1fe3,Enchilada's Restaurant,"[{'id': '4bf58dd8d48988d1c1941735', 'name': 'M...",v-1611367721,False,32.780968,-96.801241,"[{'label': 'display', 'lat': 32.78096829549494...",619,US,...,TX,Corner Elm & Field,509837611.0,,,,,,,
2,4f442266d4f2bdcc71def513,Pyramid Restaurant & Bar,"[{'id': '4bf58dd8d48988d14e941735', 'name': 'A...",v-1611367721,False,32.785974,-96.801532,"[{'label': 'display', 'lat': 32.78597407382344...",1116,US,...,TX,in The Fairmont Dallas,,1876257.0,https://www.grubhub.com/restaurant/pyramid-res...,grubhub,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_grubhub_20180129.png,
3,4b738b87f964a5206cb32de3,650 North Restaurant,"[{'id': '4bf58dd8d48988d1d5941735', 'name': 'H...",v-1611367721,False,32.786953,-96.795799,"[{'label': 'display', 'lat': 32.78695297241211...",1146,US,...,TX,in Marriott Hotel,,,,,,,,
4,57a77172498e3e87490d3153,Brother's restaurant,"[{'id': '56aa371be4b08b9a8d573538', 'name': 'T...",v-1611367721,False,32.777605,-96.796342,"[{'label': 'display', 'lat': 32.77760499894799...",118,US,...,TX,,,,,,,,,


In [9]:
latitude_ny = 40.7128
longitude_ny = -74.0060

df_merged_ny = pd.DataFrame(columns=column_names)

for search in search_query:
    url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}&limit={}'\
    .format(CLIENT_ID, CLIENT_SECRET, latitude_ny, longitude_ny,ACCESS_TOKEN, VERSION, search, radius, LIMIT)
    results = requests.get(url).json()

    # JSON to DataFrame
    venues = results['response']['venues']
    dataframe = pd.json_normalize(venues)
    df_merged_ny = pd.concat([df_merged_ny,dataframe])
    
# reset_index returns a new DataFrame, so the result must be assigned back
df_merged_ny = df_merged_ny.reset_index(drop=True)
print(df_merged_ny.shape)
df_merged_ny.head()

(50, 24)


Unnamed: 0,id,name,categories,referralId,hasPerk,location.lat,location.lng,location.labeledLatLngs,location.distance,location.cc,...,location.city,location.state,location.crossStreet,delivery.id,delivery.url,delivery.provider.name,delivery.provider.icon.prefix,delivery.provider.icon.sizes,delivery.provider.icon.name,venuePage.id
0,3fd66200f964a520ece31ee3,Golden Unicorn Restaurant 麒麟金閣,"[{'id': '4bf58dd8d48988d1f5931735', 'name': 'D...",v-1611367722,False,40.713629,-73.99723,"[{'label': 'display', 'lat': 40.71362850464683...",745,US,...,New York,NY,at Catherine St,2197825.0,https://www.seamless.com/menu/golden-unicorn-b...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,72966848.0
1,3fd66200f964a520ceea1ee3,Deluxe Green Bo Restaurant,"[{'id': '4bf58dd8d48988d145941735', 'name': 'C...",v-1611367722,False,40.715545,-73.998137,"[{'label': 'display', 'lat': 40.71554491813315...",730,US,...,New York,NY,btwn Elizabeth & Mott St,546663.0,https://www.seamless.com/menu/deluxe-green-bo-...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,348599123.0
2,3fd66200f964a520d5e31ee3,Jing Fong Restaurant 金豐大酒樓,"[{'id': '4bf58dd8d48988d1f5931735', 'name': 'D...",v-1611367722,False,40.715881,-73.997209,"[{'label': 'display', 'lat': 40.7158812029412,...",817,US,...,New York,NY,btwn Bayard & Canal St,296411.0,https://www.seamless.com/menu/jing-fong-restau...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,89860853.0
3,4a00df67f964a520ba701fe3,Bo Ky Restaurant 波記潮州小食,"[{'id': '4bf58dd8d48988d145941735', 'name': 'C...",v-1611367722,False,40.715696,-73.998667,"[{'label': 'display', 'lat': 40.71569636637641...",697,US,...,New York,NY,at Mott St,,,,,,,
4,4bc238adf8219c744286b410,Amore's Pizza Restaurant,"[{'id': '4bf58dd8d48988d1ca941735', 'name': 'P...",v-1611367722,False,40.71586,-74.009888,"[{'label': 'display', 'lat': 40.71585960614924...",472,US,...,New York,NY,Hudson Street,1431324.0,https://www.seamless.com/menu/cafe-amores-pizz...,seamless,https://fastly.4sqi.net/img/general/cap/,"[40, 50]",/delivery_provider_seamless_20180129.png,
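
The Dallas and New York cells repeat the same request code with different coordinates. One possible refactoring, sketched below, builds the query parameters in a single function and lets the HTTP library do the URL encoding. `build_search_params` is a hypothetical helper, and the credential strings are placeholders. The parameter names are the ones used in the URL template of the cells above.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.foursquare.com/v2/venues/search"


def build_search_params(lat, lng, query, radius, limit,
                        client_id, client_secret, oauth_token, version):
    """Collect the query-string parameters for one venues/search request."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",
        "oauth_token": oauth_token,
        "v": version,
        "query": query,
        "radius": radius,
        "limit": limit,
    }


# Placeholder credentials; the coordinates and settings are those of the Dallas cell.
params = build_search_params(
    32.7767, -96.7970, "Restaurant", radius=1000, limit=50,
    client_id="YOUR_CLIENT_ID", client_secret="YOUR_CLIENT_SECRET",
    oauth_token="YOUR_ACCESS_TOKEN", version="20210122",
)
print(SEARCH_URL + "?" + urlencode(params))
```

In the notebook, the dictionary could be passed directly as `requests.get(SEARCH_URL, params=params)`, and the same function called once per city.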


### Extract categories

In [10]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in df_merged_d.columns if col.startswith('location.')] + ['id']
df_merged_d_filtered = df_merged_d.loc[:, filtered_columns].copy()  # copy so the column assignment below does not touch df_merged_d

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        # fall back to the nested column name used by other Foursquare endpoints
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
df_merged_d_filtered['categories'] = df_merged_d_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
df_merged_d_filtered.columns = [column.split('.')[-1] for column in df_merged_d_filtered.columns]

print(df_merged_d_filtered.shape)
df_merged_d_filtered.head(10)

(24, 16)


Unnamed: 0,name,categories,lat,lng,labeledLatLngs,distance,cc,country,formattedAddress,address,postalCode,city,state,crossStreet,neighborhood,id
0,Ravenna Urban Italian Restaurant,Italian Restaurant,32.780371,-96.801106,"[{'label': 'display', 'lat': 32.78037110712935...",560,US,United States,"[1301 Main St (at Field St.), Dallas, TX 75202...",1301 Main St,75202,Dallas,TX,at Field St.,,4b15f41cf964a52002b623e3
1,Enchilada's Restaurant,Mexican Restaurant,32.780968,-96.801241,"[{'label': 'display', 'lat': 32.78096829549494...",619,US,United States,"[1304 Elm St (Corner Elm & Field), Dallas, TX ...",1304 Elm St,75202,Dallas,TX,Corner Elm & Field,,41311c80f964a520320d1fe3
2,Pyramid Restaurant & Bar,American Restaurant,32.785974,-96.801532,"[{'label': 'display', 'lat': 32.78597407382344...",1116,US,United States,"[1717 N Akard St (in The Fairmont Dallas), Dal...",1717 N Akard St,75201,Dallas,TX,in The Fairmont Dallas,,4f442266d4f2bdcc71def513
3,650 North Restaurant,Hotel Bar,32.786953,-96.795799,"[{'label': 'display', 'lat': 32.78695297241211...",1146,US,United States,"[650 N Pearl St (in Marriott Hotel), Dallas, T...",650 N Pearl St,75201,Dallas,TX,in Marriott Hotel,,4b738b87f964a5206cb32de3
4,Brother's restaurant,Theme Restaurant,32.777605,-96.796342,"[{'label': 'display', 'lat': 32.77760499894799...",118,US,United States,"[677 Charla Lane, Dallas, TX 75212, United Sta...",677 Charla Lane,75212,Dallas,TX,,,57a77172498e3e87490d3153
5,Campisi's Restaurant - Downtown Dallas,Italian Restaurant,32.781522,-96.798556,"[{'label': 'display', 'lat': 32.78152218733115...",556,US,United States,"[1520 Elm St (btwn Akard St. & Ervay St.), Dal...",1520 Elm St,75201,Dallas,TX,btwn Akard St. & Ervay St.,,4ab963a5f964a520297f20e3
6,veronas Restaurant,Restaurant,32.780529,-96.800896,"[{'label': 'display', 'lat': 32.7805290222168,...",560,US,United States,"[1301 Main St Ste 798, Dallas, TX 75202, Unite...",1301 Main St Ste 798,75202,Dallas,TX,,,599d51cf51950e1c2ff8d920
7,Dragon Palace Chinese Restaurant,Chinese Restaurant,32.780059,-96.800979,"[{'label': 'display', 'lat': 32.78005925294478...",527,US,United States,"[Village Plaza (Constant Spring Road), Dallas,...",Village Plaza,75202,Dallas,TX,Constant Spring Road,,5114228ce4b02cf782f651b4
8,A&W Restaurant,Fast Food Restaurant,32.782268,-96.796758,"[{'label': 'display', 'lat': 32.782268, 'lng':...",620,US,United States,"[1700 Pacific Ave, Dallas, TX 75201, United St...",1700 Pacific Ave,75201,Dallas,TX,,,5efcadb8f2e6990008d21a58
9,Pacific Deck Restaurant,Restaurant,32.78252,-96.79552,"[{'label': 'display', 'lat': 32.78252, 'lng': ...",662,US,United States,"[1910 Pacific Ave, Dallas, TX 75201, United St...",1910 Pacific Ave,75201,Dallas,TX,,City Center District,5aaba0d9610f04694b348abe


In [12]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in df_merged_ny.columns if col.startswith('location.')] + ['id']
df_merged_ny_filtered = df_merged_ny.loc[:, filtered_columns].copy()  # copy so the column assignment below does not touch df_merged_ny

# filter the category for each row
df_merged_ny_filtered['categories'] = df_merged_ny_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
df_merged_ny_filtered.columns = [column.split('.')[-1] for column in df_merged_ny_filtered.columns]

print(df_merged_ny_filtered.shape)
df_merged_ny_filtered.head(10)

(50, 15)


Unnamed: 0,name,categories,lat,lng,labeledLatLngs,distance,cc,country,formattedAddress,address,postalCode,city,state,crossStreet,id
0,Golden Unicorn Restaurant 麒麟金閣,Dim Sum Restaurant,40.713629,-73.99723,"[{'label': 'display', 'lat': 40.71362850464683...",745,US,United States,"[18 E Broadway (at Catherine St), New York, NY...",18 E Broadway,10002,New York,NY,at Catherine St,3fd66200f964a520ece31ee3
1,Deluxe Green Bo Restaurant,Chinese Restaurant,40.715545,-73.998137,"[{'label': 'display', 'lat': 40.71554491813315...",730,US,United States,"[66 Bayard St (btwn Elizabeth & Mott St), New ...",66 Bayard St,10013,New York,NY,btwn Elizabeth & Mott St,3fd66200f964a520ceea1ee3
2,Jing Fong Restaurant 金豐大酒樓,Dim Sum Restaurant,40.715881,-73.997209,"[{'label': 'display', 'lat': 40.7158812029412,...",817,US,United States,"[20 Elizabeth St (btwn Bayard & Canal St), New...",20 Elizabeth St,10013,New York,NY,btwn Bayard & Canal St,3fd66200f964a520d5e31ee3
3,Bo Ky Restaurant 波記潮州小食,Chinese Restaurant,40.715696,-73.998667,"[{'label': 'display', 'lat': 40.71569636637641...",697,US,United States,"[80 Bayard St (at Mott St), New York, NY 10013...",80 Bayard St,10013,New York,NY,at Mott St,4a00df67f964a520ba701fe3
4,Amore's Pizza Restaurant,Pizza Place,40.71586,-74.009888,"[{'label': 'display', 'lat': 40.71585960614924...",472,US,United States,"[147 Chambers St (Hudson Street), New York, NY...",147 Chambers St,10007,New York,NY,Hudson Street,4bc238adf8219c744286b410
5,Mudville Restaurant & Tap House,Wings Joint,40.715209,-74.008923,"[{'label': 'entrance', 'lat': 40.715239, 'lng'...",364,US,United States,[126 Chambers St (btwn W Broadway & Church St)...,126 Chambers St,10007,New York,NY,btwn W Broadway & Church St,45e5c256f964a52046431fe3
6,O'Hara's Restaurant & Pub,Pub,40.709894,-74.012836,"[{'label': 'display', 'lat': 40.70989378141622...",661,US,United States,"[120 Cedar St (at Greenwich St.), New York, NY...",120 Cedar St,10006,New York,NY,at Greenwich St.,49f125dcf964a52091691fe3
7,Royal Seafood Restaurant,Seafood Restaurant,40.717305,-73.997497,"[{'label': 'display', 'lat': 40.71730464970235...",875,US,United States,"[103-105 Mott St (btwn Canal & Hester St), New...",103-105 Mott St,10013,New York,NY,btwn Canal & Hester St,4bdd7814b0f5c928c4684ce3
8,TJ Byrnes Bar and Restaurant,Restaurant,40.709233,-74.003747,"[{'label': 'display', 'lat': 40.70923312629616...",440,US,United States,"[77 Fulton St (Gold St), New York, NY 10038, U...",77 Fulton St,10038,New York,NY,Gold St,4b4fdfc8f964a520801827e3
9,218 Restaurant,Chinese Restaurant,40.718833,-73.995895,"[{'label': 'display', 'lat': 40.71883283355385...",1085,US,United States,"[218 Grand St (btwn Elizabeth & Mott St.), New...",218 Grand St,10013,New York,NY,btwn Elizabeth & Mott St.,4bfdcd3ae529c928a589bb8c


### Visualization map

In [13]:
# Dallas
venues_map = folium.Map(location=[latitude_d, longitude_d], zoom_start=15) # generate map centred on the Dallas city center

# add the restaurants to the map as blue circle markers
for lat, lng, label in zip(df_merged_d_filtered.lat, df_merged_d_filtered.lng, df_merged_d_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        fill=True,
        color='blue',
        fill_color='blue',
        fill_opacity=0.6
        ).add_to(venues_map)

# display map
venues_map

In [14]:
# New York
venues_map = folium.Map(location=[latitude_ny, longitude_ny], zoom_start=15) # generate map centred on the New York city center

# add the restaurants to the map as blue circle markers
for lat, lng, label in zip(df_merged_ny_filtered.lat, df_merged_ny_filtered.lng, df_merged_ny_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        fill=True,
        color='blue',
        fill_color='blue',
        fill_opacity=0.6
        ).add_to(venues_map)

# display map
venues_map