# 1, Introduction

- Problem: Analyze the restaurants in Toronto's neighborhoods.
- Audience: Anyone who wants to open a restaurant in Toronto and needs to find a good location, see where existing restaurants are and how they are distributed across neighborhoods, and learn which types of restaurant are popular in each area.

# 2, Data

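- Foursquare location data: the Foursquare API lets us explore restaurants around Toronto and returns, for each venue, its name, latitude, longitude and category.
- List of postal codes of Canada from Wikipedia: the Toronto postal codes (those starting with M) with their boroughs and neighbourhoods. Each postal code is matched with coordinates from `Geospatial_Coordinates.csv`, and those coordinates are what we send to the Foursquare API.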

# 3, Methodology

- EDA: explore Toronto's neighborhoods and their restaurants, and find the most common restaurant types in each neighborhood.
- Clustering: group the neighborhoods into clusters with k-means, based on the mix of restaurant types in each area.

# 4, Get data, clean data and analyze

### Import libraries

In [73]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import folium
from bs4 import BeautifulSoup
import requests
import geocoder
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors

### Scrape data from Wikipedia

In [74]:
url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
content = requests.get(url = url).text
soup = BeautifulSoup(content, "html.parser")

In [75]:
s_table = soup.find("table", {"class": "wikitable"})
s_col = s_table.find_all("th")
s_cell = s_table.find_all("td")
cols = [i.get_text().replace("\n","") for i in s_col]
cells = [i.get_text().replace("\n","") for i in s_cell]
l1, l2, l3 = [], [], []
for idx, val in enumerate(cells):
    if idx % 3 == 0:
        l1.append(val)
    elif idx % 3 == 1:
        l2.append(val)
    else:
        l3.append(val)
vals = [l1, l2, l3]
df = pd.DataFrame(dict(zip(cols, vals)))
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
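
The modulo-based loop above works because the table has exactly three columns. As a minimal sketch of the same idea, the flat list of cell texts can also be cut into rows of `n_cols` items. The helper name `rows_from_cells` is an assumption made for illustration, and the sample cells are a hand-picked subset of the scraped rows.

```python
def rows_from_cells(cells, n_cols):
    """Group a flat list of cell texts into rows of n_cols items each."""
    if len(cells) % n_cols != 0:
        raise ValueError("cell count is not a multiple of the column count")
    return [cells[i:i + n_cols] for i in range(0, len(cells), n_cols)]


# Small sample mirroring two of the scraped rows.
cols = ["Postcode", "Borough", "Neighbourhood"]
cells = ["M1A", "Not assigned", "Not assigned",
         "M3A", "North York", "Parkwoods"]

rows = rows_from_cells(cells, len(cols))
records = [dict(zip(cols, row)) for row in rows]
print(records[1])
```

Passing `rows` and `cols` to `pd.DataFrame(rows, columns=cols)` would give the same frame as the three-list approach, and the length check fails early if the page layout changes.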


### Clean "Not assigned" data

In [76]:
df1 = df.drop(df[df.Borough == "Not assigned"].index)

In [78]:
def combine_nei(x):
    return ", ".join(x)
df1 = pd.DataFrame(df1.groupby(["Postcode", "Borough"]).Neighbourhood.apply(combine_nei)).reset_index()
df1.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
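
The grouping step joins all neighbourhoods that share a postcode into one comma-separated string. A possible shorter form, shown here on a small hand-made sample rather than the scraped frame, passes `", ".join` directly to `agg` and uses `as_index=False` so the group keys stay as columns.

```python
import pandas as pd

# Small hand-made sample; the real frame comes from the scraped table.
sample = pd.DataFrame({
    "Postcode": ["M1B", "M1B", "M1G"],
    "Borough": ["Scarborough", "Scarborough", "Scarborough"],
    "Neighbourhood": ["Rouge", "Malvern", "Woburn"],
})

# as_index=False keeps Postcode and Borough as ordinary columns,
# so no DataFrame wrapper or reset_index() call is needed.
combined = (
    sample.groupby(["Postcode", "Borough"], as_index=False)["Neighbourhood"]
          .agg(", ".join)
)
print(combined)
```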


In [79]:
test_na = df1[df1.Neighbourhood.str.contains("Not assigned")]
test_na

Unnamed: 0,Postcode,Borough,Neighbourhood
93,M9A,Queen's Park,Not assigned


In [81]:
df1["Neighbourhood"] = df1.Neighbourhood.replace("Not assigned", df1.Borough)
df1.loc[test_na.index]

Unnamed: 0,Postcode,Borough,Neighbourhood
93,M9A,Queen's Park,Queen's Park
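
A more explicit way to express "use the borough when the neighbourhood is not assigned" is a boolean mask. The sketch below uses a two-row sample frame (an assumption for illustration) and `Series.mask`, which replaces values where the condition is true with the aligned values from another Series.

```python
import pandas as pd

# Two-row sample: one unassigned neighbourhood, one regular row.
sample = pd.DataFrame({
    "Postcode": ["M9A", "M1G"],
    "Borough": ["Queen's Park", "Scarborough"],
    "Neighbourhood": ["Not assigned", "Woburn"],
})

# Rows whose neighbourhood is missing take the borough name instead.
missing = sample["Neighbourhood"] == "Not assigned"
sample["Neighbourhood"] = sample["Neighbourhood"].mask(missing, sample["Borough"])
print(sample)
```

Because the mask is row-aligned, each unassigned row receives its own borough, which makes the intent easier to verify than a value-based `replace`.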


### Add geo data (longitude and latitude)

In [82]:
geo_df = pd.read_csv("Geospatial_Coordinates.csv")
geo_df.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [83]:
df1 = pd.merge(left = df1, right = geo_df, 
               how = "left", left_on = "Postcode", right_on = "Postal Code").drop(columns = ["Postal Code"])
df1.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
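
With a left join, a postcode that has no match in the coordinates file silently ends up with missing latitude and longitude. One way to check for this, sketched on hand-made frames, is the `indicator=True` option of `pd.merge`, which adds a `_merge` column. The postcode `M0X` below is made up to force a mismatch.

```python
import pandas as pd

left = pd.DataFrame({"Postcode": ["M1B", "M1C", "M0X"]})  # "M0X" is made up
geo = pd.DataFrame({
    "Postal Code": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

merged = pd.merge(left, geo, how="left",
                  left_on="Postcode", right_on="Postal Code",
                  indicator=True)

# Rows marked "left_only" had no coordinates in the geo table.
unmatched = merged.loc[merged["_merge"] == "left_only", "Postcode"].tolist()
print(unmatched)
```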


### Explore Toronto

In [84]:
tor_df = df1[df1.Borough.str.contains("Toronto")].reset_index().drop(columns="index")
tor_df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879


In [85]:
tor_map = folium.Map(location=[tor_df.Latitude.mean(), tor_df.Longitude.mean()], zoom_start=11)

# add markers to map
for lat, lng, label in zip(tor_df['Latitude'], tor_df['Longitude'], tor_df['Neighbourhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(tor_map)  
    
tor_map

In [86]:
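import os

# Read the Foursquare credentials from the environment instead of hard-coding them
# (the environment variable names are placeholders, set them to match your setup)
CLIENT_ID = os.environ.get('FOURSQUARE_CLIENT_ID', '') # your Foursquare ID
CLIENT_SECRET = os.environ.get('FOURSQUARE_CLIENT_SECRET', '') # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
categoryId = '4d4b7105d754a06374d81259' # food category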

In [87]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT = 100, categoryId = '4d4b7105d754a06374d81259'):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT,
            categoryId)
        try:    
            # make the GET request
            results = requests.get(url).json()["response"]['groups'][0]['items']

            # return only relevant information for each nearby venue
            venues_list.append([(
                name, 
                lat, 
                lng, 
                v['venue']['name'], 
                v['venue']['location']['lat'], 
                v['venue']['location']['lng'],  
                v['venue']['categories'][0]['name']) for v in results])
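        except (requests.RequestException, ValueError, KeyError, IndexError) as err:
            # skip this neighbourhood, but report the failure instead of hiding it
            print("  no venues collected for {}: {!r}".format(name, err))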

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

In [89]:
tor_venues = getNearbyVenues(names=tor_df['Neighbourhood'],
                                   latitudes=tor_df['Latitude'],
                                   longitudes=tor_df['Longitude']
                                  )

The Beaches
The Danforth West, Riverdale
The Beaches West, India Bazaar
Studio District
Lawrence Park
Davisville North
North Toronto West
Davisville
Moore Park, Summerhill East
Deer Park, Forest Hill SE, Rathnelly, South Hill, Summerhill West
Rosedale
Cabbagetown, St. James Town
Church and Wellesley
Harbourfront
Ryerson, Garden District
St. James Town
Berczy Park
Central Bay Street
Adelaide, King, Richmond
Harbourfront East, Toronto Islands, Union Station
Design Exchange, Toronto Dominion Centre
Commerce Court, Victoria Hotel
Roselawn
Forest Hill North, Forest Hill West
The Annex, North Midtown, Yorkville
Harbord, University of Toronto
Chinatown, Grange Park, Kensington Market
CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara
Stn A PO Boxes 25 The Esplanade
First Canadian Place, Underground city
Christie
Dovercourt Village, Dufferin
Little Portugal, Trinity
Brockton, Exhibition Place, Parkdale Village
High Park, The Junction Sout
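
The function above indexes straight into the JSON response, so a venue with an empty `categories` list or a response without `groups` raises an exception. As a sketch of a more tolerant parser, the helper below (the name `parse_explore_items` is an assumption) walks the same keys that `getNearbyVenues` reads, and is exercised on a hand-written payload rather than a real API response.

```python
def parse_explore_items(payload, name, lat, lng):
    """Flatten one explore-style response into (neighbourhood, venue) rows."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        # A venue without categories gets None instead of raising IndexError.
        categories = venue.get("categories") or [{}]
        rows.append((
            name, lat, lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            categories[0].get("name"),
        ))
    return rows


# Hand-written payload for illustration; not a real API response.
fake_payload = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Pantheon",
                   "location": {"lat": 43.677621, "lng": -79.351434},
                   "categories": [{"name": "Greek Restaurant"}]}},
        {"venue": {"name": "No Category Cafe",
                   "location": {"lat": 43.6778, "lng": -79.3502},
                   "categories": []}},
    ]}]}
}

rows = parse_explore_items(fake_payload, "The Danforth West, Riverdale",
                           43.679557, -79.352188)
empty = parse_explore_items({}, "Nowhere", 0.0, 0.0)
print(rows[0])
```

Keeping the parsing separate from the HTTP call also makes it possible to test the row-building logic without network access or credentials.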

In [130]:
tor_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Beaches,43.676357,-79.293031,Seaspray Restaurant,43.678888,-79.298167,Asian Restaurant
1,"The Danforth West, Riverdale",43.679557,-79.352188,Pantheon,43.677621,-79.351434,Greek Restaurant
2,"The Danforth West, Riverdale",43.679557,-79.352188,Cafe Fiorentina,43.677743,-79.350115,Italian Restaurant
3,"The Danforth West, Riverdale",43.679557,-79.352188,Mezes,43.677962,-79.350196,Greek Restaurant
4,"The Danforth West, Riverdale",43.679557,-79.352188,Messini Authentic Gyros,43.677827,-79.350569,Greek Restaurant


### Explore restaurants in Toronto

In [90]:
tor_venues = tor_venues[tor_venues["Venue Category"].str.contains("Restaurant")].reset_index(drop = True)

In [106]:
tor_mapres = folium.Map(location=[tor_venues["Venue Latitude"].mean(), tor_venues["Venue Longitude"].mean()], zoom_start=12)

# add markers to map
for lat, lng, label in zip(tor_venues["Venue Latitude"], tor_venues["Venue Longitude"], tor_venues["Venue"]):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(tor_mapres)  
    
tor_mapres

In [92]:
tor_venues.groupby("Neighborhood").count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Adelaide, King, Richmond",45,45,45,45,45,45
Berczy Park,29,29,29,29,29,29
"Brockton, Exhibition Place, Parkdale Village",4,4,4,4,4,4
Business Reply Mail Processing Centre 969 Eastern,3,3,3,3,3,3
"CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara",2,2,2,2,2,2
"Cabbagetown, St. James Town",14,14,14,14,14,14
Central Bay Street,29,29,29,29,29,29
"Chinatown, Grange Park, Kensington Market",32,32,32,32,32,32
Christie,3,3,3,3,3,3
Church and Wellesley,42,42,42,42,42,42


In [93]:
tor_onehot = pd.get_dummies(tor_venues[["Venue Category"]], prefix = "", prefix_sep = "")
tor_onehot["Neighborhood"] = tor_venues["Neighborhood"]
tor_cols = list(tor_onehot.columns)
tor_cols.remove("Neighborhood")
tor_cols = ["Neighborhood"] + tor_cols
tor_onehot = tor_onehot[tor_cols]

In [94]:
tor_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,Belgian Restaurant,Brazilian Restaurant,Cajun / Creole Restaurant,Caribbean Restaurant,...,Seafood Restaurant,South American Restaurant,Southern / Soul Food Restaurant,Sushi Restaurant,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Theme Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,The Beaches,0,0,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [95]:
tor_group = tor_onehot.groupby("Neighborhood").mean().reset_index()
tor_group

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,Belgian Restaurant,Brazilian Restaurant,Cajun / Creole Restaurant,Caribbean Restaurant,...,Seafood Restaurant,South American Restaurant,Southern / Soul Food Restaurant,Sushi Restaurant,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Theme Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,"Adelaide, King, Richmond",0.0,0.088889,0.0,0.0,0.111111,0.0,0.022222,0.0,0.0,...,0.044444,0.0,0.0,0.066667,0.0,0.0,0.111111,0.0,0.066667,0.0
1,Berczy Park,0.0,0.034483,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,...,0.068966,0.0,0.0,0.068966,0.0,0.034483,0.034483,0.0,0.068966,0.0
2,"Brockton, Exhibition Place, Parkdale Village",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25
3,Business Reply Mail Processing Centre 969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0
5,"Cabbagetown, St. James Town",0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,...,0.0,0.0,0.0,0.071429,0.071429,0.0,0.071429,0.0,0.0,0.0
6,Central Bay Street,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.034483,0.0,0.0,0.034483,0.0,0.0,0.034483,0.0,0.034483,0.0
7,"Chinatown, Grange Park, Kensington Market",0.0,0.0,0.03125,0.0,0.0,0.03125,0.0,0.0,0.0625,...,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.15625,0.125
8,Christie,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Church and Wellesley,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,...,0.02381,0.0,0.0,0.166667,0.0,0.0,0.02381,0.02381,0.0,0.047619
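
Each value in this table is the share of a neighborhood's restaurants that fall into a category: one-hot encode the category, then average per neighborhood. The sketch below reproduces that on a three-row toy frame (neighborhoods "A" and "B" are made up) and compares it with `pd.crosstab(..., normalize="index")`, which should give the same shares in one call.

```python
import pandas as pd

# Toy data: neighbourhood "A" has two restaurants, "B" has one.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B"],
    "Venue Category": ["Greek Restaurant", "Italian Restaurant", "Greek Restaurant"],
})

# One-hot encode the category, then average the 0/1 columns per neighbourhood.
onehot = pd.get_dummies(venues["Venue Category"]).astype(int)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
by_mean = onehot.groupby("Neighborhood").mean()

# Row-normalised contingency table: counts divided by the row total.
by_crosstab = pd.crosstab(venues["Neighborhood"], venues["Venue Category"],
                          normalize="index")

print(by_mean)
print(by_crosstab)
```

A share of 0.5 in neighborhood "A" therefore means one of its two restaurants is Greek. With very few restaurants per neighborhood, these shares become coarse.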


In [96]:
num_top_venues = 10
list_cols = ["Neighborhood", "1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th", "10th"]
tor_top10 = pd.DataFrame(columns = list_cols)
for i in tor_group.index:
    temp_nei = tor_group.loc[i, "Neighborhood"]
    temp_df = pd.DataFrame(tor_group.loc[i, :][1:]).reset_index()
    temp_df.columns = ["Venue", "Freq"]
    temp_df["Freq"] = round(temp_df.Freq.astype(float), 2)
    temp_df = temp_df.sort_values("Freq", ascending = False).head(num_top_venues).reset_index(drop = True)
    tor_top10.loc[i, :] = np.append([temp_nei], list(temp_df.Venue.values))
    print("---{}---".format(temp_nei))
    print(temp_df)
    

---Adelaide, King, Richmond---
                           Venue  Freq
0                     Restaurant  0.11
1               Asian Restaurant  0.11
2                Thai Restaurant  0.11
3            American Restaurant  0.09
4             Italian Restaurant  0.07
5               Sushi Restaurant  0.07
6  Vegetarian / Vegan Restaurant  0.07
7             Seafood Restaurant  0.04
8         Gluten-free Restaurant  0.02
9               Greek Restaurant  0.02
---Berczy Park---
                           Venue  Freq
0             Italian Restaurant  0.14
1                     Restaurant  0.07
2  Vegetarian / Vegan Restaurant  0.07
3               Greek Restaurant  0.07
4               Sushi Restaurant  0.07
5            Moroccan Restaurant  0.07
6              French Restaurant  0.07
7             Seafood Restaurant  0.07
8      Middle Eastern Restaurant  0.03
9            Japanese Restaurant  0.03
---Brockton, Exhibition Place, Parkdale Village---
                        Venue  Freq
0     

In [133]:
tor_top10.head()

Unnamed: 0,Cluster Labels,Neighborhood,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,0,"Adelaide, King, Richmond",Restaurant,Asian Restaurant,Thai Restaurant,American Restaurant,Italian Restaurant,Sushi Restaurant,Vegetarian / Vegan Restaurant,Seafood Restaurant,Gluten-free Restaurant,Greek Restaurant
1,0,Berczy Park,Italian Restaurant,Restaurant,Vegetarian / Vegan Restaurant,Greek Restaurant,Sushi Restaurant,Moroccan Restaurant,French Restaurant,Seafood Restaurant,Middle Eastern Restaurant,Japanese Restaurant
2,0,"Brockton, Exhibition Place, Parkdale Village",Italian Restaurant,Japanese Restaurant,Restaurant,Vietnamese Restaurant,Asian Restaurant,Portuguese Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant
3,4,Business Reply Mail Processing Centre 969 Eastern,Fast Food Restaurant,Restaurant,Afghan Restaurant,Persian Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant
4,0,"CN Tower, Bathurst Quay, Island airport, Harbo...",American Restaurant,Tapas Restaurant,Persian Restaurant,Korean Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant
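
The count table earlier shows only 3 restaurants for "Business Reply Mail Processing Centre 969 Eastern", yet ten ranked categories are listed for it, so most of those ranks must have a frequency of zero. A possible refinement, sketched below with made-up shares, keeps only categories with a non-zero share before ranking. The helper name `top_categories` is an assumption.

```python
import pandas as pd


def top_categories(freqs, n=10):
    """Return up to n category names with a non-zero share, largest first."""
    nonzero = freqs[freqs > 0]
    return nonzero.sort_values(ascending=False).head(n).index.tolist()


# Made-up shares for a neighbourhood with only a few restaurants.
row = pd.Series({
    "Afghan Restaurant": 0.0,
    "Fast Food Restaurant": 0.67,
    "Persian Restaurant": 0.0,
    "Restaurant": 0.33,
})

print(top_categories(row))  # ['Fast Food Restaurant', 'Restaurant']
```

With this filter, a neighborhood with few restaurants gets a shorter ranking instead of one padded with categories it does not contain.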


In [110]:
rank_list = ["1st", "2nd", "3rd"]
for i in rank_list:
    print(i)
    print(tor_top10[i].value_counts())
    print("\n")

1st
Italian Restaurant               12
Restaurant                        8
Fast Food Restaurant              3
Sushi Restaurant                  3
Asian Restaurant                  3
Vegetarian / Vegan Restaurant     2
Dim Sum Restaurant                1
Portuguese Restaurant             1
American Restaurant               1
Thai Restaurant                   1
Vietnamese Restaurant             1
Greek Restaurant                  1
Japanese Restaurant               1
Name: 1st, dtype: int64


2nd
Restaurant                 9
Italian Restaurant         5
Afghan Restaurant          4
Sushi Restaurant           4
Japanese Restaurant        4
Mexican Restaurant         3
American Restaurant        2
Seafood Restaurant         1
Brazilian Restaurant       1
Tapas Restaurant           1
Fast Food Restaurant       1
Asian Restaurant           1
Comfort Food Restaurant    1
Vietnamese Restaurant      1
Name: 2nd, dtype: int64


3rd
Polish Restaurant                5
American Restaurant        

### Cluster the neighborhoods

In [98]:
k = 5
tor_group_clus = tor_group.drop(columns = ["Neighborhood"])
kmeans = KMeans(n_clusters = k, random_state = 0).fit(tor_group_clus)
kmeans.labels_

array([0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       2, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 1, 4, 0])
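
The value `k = 5` is fixed by hand here. One common way to sanity-check such a choice is to compare the k-means inertia (within-cluster sum of squares) over a range of `k` and look for the point where it stops dropping quickly. The sketch below runs on random share-like data as a stand-in for `tor_group_clus`, so its numbers say nothing about Toronto, and the range 2 to 7 is an arbitrary assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for tor_group_clus: 38 rows of shares that sum to 1.
X = rng.random((38, 6))
X = X / X.sum(axis=1, keepdims=True)

inertias = {}
for n_clusters in range(2, 8):
    model = KMeans(n_clusters=n_clusters, random_state=0, n_init=10).fit(X)
    inertias[n_clusters] = model.inertia_

for n_clusters, value in inertias.items():
    print(n_clusters, round(value, 4))
```

Replacing `X` with `tor_group_clus` would give the corresponding curve for the real data.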

In [99]:
# add clustering labels
tor_top10.insert(0, 'Cluster Labels', kmeans.labels_)

tor_merged = tor_df

# merge tor_df with tor_top10 to add the cluster label and top-10 restaurant types for each neighbourhood
tor_merged = pd.merge(tor_merged, tor_top10.set_index('Neighborhood'), left_on='Neighbourhood', right_on = "Neighborhood")

tor_merged.head() # check the last columns!

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,1,Asian Restaurant,Afghan Restaurant,Polish Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,0,Greek Restaurant,Sushi Restaurant,Italian Restaurant,Restaurant,Indian Restaurant,Caribbean Restaurant,Thai Restaurant,American Restaurant,Japanese Restaurant,Mediterranean Restaurant
2,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572,4,Fast Food Restaurant,Italian Restaurant,Sushi Restaurant,Persian Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant
3,M4M,East Toronto,Studio District,43.659526,-79.340923,0,Italian Restaurant,Comfort Food Restaurant,American Restaurant,Latin American Restaurant,Middle Eastern Restaurant,Seafood Restaurant,Restaurant,Thai Restaurant,Theme Restaurant,North Indian Restaurant
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,2,Dim Sum Restaurant,Afghan Restaurant,Polish Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant


In [113]:
tor_res = pd.merge(tor_group, tor_top10.set_index('Neighborhood'), left_on='Neighborhood', right_on = "Neighborhood")
tor_res

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Asian Restaurant,Belgian Restaurant,Brazilian Restaurant,Cajun / Creole Restaurant,Caribbean Restaurant,...,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,"Adelaide, King, Richmond",0.0,0.088889,0.0,0.0,0.111111,0.0,0.022222,0.0,0.0,...,Restaurant,Asian Restaurant,Thai Restaurant,American Restaurant,Italian Restaurant,Sushi Restaurant,Vegetarian / Vegan Restaurant,Seafood Restaurant,Gluten-free Restaurant,Greek Restaurant
1,Berczy Park,0.0,0.034483,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,...,Italian Restaurant,Restaurant,Vegetarian / Vegan Restaurant,Greek Restaurant,Sushi Restaurant,Moroccan Restaurant,French Restaurant,Seafood Restaurant,Middle Eastern Restaurant,Japanese Restaurant
2,"Brockton, Exhibition Place, Parkdale Village",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,Italian Restaurant,Japanese Restaurant,Restaurant,Vietnamese Restaurant,Asian Restaurant,Portuguese Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant
3,Business Reply Mail Processing Centre 969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,Fast Food Restaurant,Restaurant,Afghan Restaurant,Persian Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,American Restaurant,Tapas Restaurant,Persian Restaurant,Korean Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant
5,"Cabbagetown, St. James Town",0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,...,Restaurant,Italian Restaurant,Chinese Restaurant,American Restaurant,Thai Restaurant,Indian Restaurant,Taiwanese Restaurant,Sushi Restaurant,Caribbean Restaurant,Japanese Restaurant
6,Central Bay Street,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,Italian Restaurant,Japanese Restaurant,Chinese Restaurant,Middle Eastern Restaurant,Thai Restaurant,Korean Restaurant,Fast Food Restaurant,Falafel Restaurant,Modern European Restaurant,Mediterranean Restaurant
7,"Chinatown, Grange Park, Kensington Market",0.0,0.0,0.03125,0.0,0.0,0.03125,0.0,0.0,0.0625,...,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Chinese Restaurant,Mexican Restaurant,Dumpling Restaurant,Caribbean Restaurant,Comfort Food Restaurant,Thai Restaurant,Belgian Restaurant,Korean Restaurant
8,Christie,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,Italian Restaurant,Restaurant,American Restaurant,Theme Restaurant,North Indian Restaurant,Korean Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant
9,Church and Wellesley,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,...,Sushi Restaurant,Japanese Restaurant,Fast Food Restaurant,Restaurant,Vietnamese Restaurant,Korean Restaurant,Mediterranean Restaurant,Mexican Restaurant,Indian Restaurant,Persian Restaurant


In [100]:
tor_merged

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,1,Asian Restaurant,Afghan Restaurant,Polish Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,0,Greek Restaurant,Sushi Restaurant,Italian Restaurant,Restaurant,Indian Restaurant,Caribbean Restaurant,Thai Restaurant,American Restaurant,Japanese Restaurant,Mediterranean Restaurant
2,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572,4,Fast Food Restaurant,Italian Restaurant,Sushi Restaurant,Persian Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant
3,M4M,East Toronto,Studio District,43.659526,-79.340923,0,Italian Restaurant,Comfort Food Restaurant,American Restaurant,Latin American Restaurant,Middle Eastern Restaurant,Seafood Restaurant,Restaurant,Thai Restaurant,Theme Restaurant,North Indian Restaurant
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,2,Dim Sum Restaurant,Afghan Restaurant,Polish Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant
5,M4P,Central Toronto,Davisville North,43.712751,-79.390197,1,Asian Restaurant,Afghan Restaurant,Polish Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant
6,M4R,Central Toronto,North Toronto West,43.715383,-79.405678,0,Italian Restaurant,Mexican Restaurant,Chinese Restaurant,Fast Food Restaurant,Restaurant,Polish Restaurant,Malay Restaurant,Mediterranean Restaurant,Middle Eastern Restaurant,Modern European Restaurant
7,M4S,Central Toronto,Davisville,43.704324,-79.38879,0,Italian Restaurant,Sushi Restaurant,American Restaurant,French Restaurant,Restaurant,Seafood Restaurant,Greek Restaurant,New American Restaurant,Chinese Restaurant,Indian Restaurant
8,M4T,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316,0,Italian Restaurant,Restaurant,Polish Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant
9,M4V,Central Toronto,"Deer Park, Forest Hill SE, Rathnelly, South Hi...",43.686412,-79.400049,0,Vietnamese Restaurant,Sushi Restaurant,Restaurant,American Restaurant,Thai Restaurant,Tapas Restaurant,Korean Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant


In [101]:
# create map
map_clusters = folium.Map(location=[tor_df.Latitude.mean(), tor_df.Longitude.mean()], zoom_start=11)

# set color scheme for the clusters
colors_array = cm.rainbow(np.linspace(0, 1, k))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(tor_merged['Latitude'], tor_merged['Longitude'], tor_merged['Neighbourhood'], tor_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # cluster labels run from 0 to k-1, so index directly
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

# 5, Result and discussion
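Our analysis shows that generic restaurants (the "Restaurant" category), Italian, Afghan and Polish restaurants seem to be the most popular types in Toronto, judging by how often each appears in the top three ranks across neighborhoods. Afghan and Polish should be read with caution: in neighborhoods with only a handful of restaurants, the lower ranks are filled by categories with zero frequency, so these two may be overstated.
We split the 38 neighborhoods into 5 clusters.
- Cluster 0 (84.21%): the largest cluster, dominated by generic, Italian, Sushi and American restaurants
- Cluster 1 (5.26%): Asian, Afghan and Polish
- Cluster 2 (2.63%): Dim Sum, Afghan and Polish
- Cluster 3 (2.63%): Japanese, Afghan and Polish
- Cluster 4 (5.26%): Fast Food, Italian, Sushi, generic Restaurant, Afghan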
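
The cluster shares can be recomputed from the `kmeans.labels_` array printed in the clustering step. The short sketch below copies that array by hand and counts how many of the 38 neighborhoods fall into each cluster.

```python
from collections import Counter

# Labels copied from the kmeans.labels_ output shown in the clustering step.
labels = [0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
          2, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 1, 4, 0]

counts = Counter(labels)
shares = {cluster: round(100 * n / len(labels), 2)
          for cluster, n in sorted(counts.items())}

print(dict(sorted(counts.items())))  # {0: 32, 1: 2, 2: 1, 3: 1, 4: 2}
print(shares)  # {0: 84.21, 1: 5.26, 2: 2.63, 3: 2.63, 4: 5.26}
```

Clusters 2 and 3 each contain a single neighborhood and clusters 1 and 4 contain two each, so only cluster 0 describes a broad pattern.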

# 6, Conclusion
The purpose of this project was to identify which areas are suitable for opening a restaurant. By analyzing Foursquare data on restaurants around Toronto, stakeholders can shortlist locations and see which type of restaurant may suit each one (for example a general, Italian or American restaurant).

The final decision on the optimal restaurant location will be made by stakeholders, based on the specific characteristics of the neighborhoods and locations in each recommended zone.