<h1> Battle of the Neighborhoods Capstone Project 

First we have to import all relevant libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library
from numpy import cos, sin, arcsin, sqrt
from math import radians

<h3> Gathering data

Read data on neighborhoods and boroughs in Toronto from Wikipedia and then clean it to ensure it is in the format we need

In [2]:
df = pd.read_html( 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M' )
df = df[0]
df  = pd.DataFrame(df.values[1:], columns=df.iloc[0])
df = df[df['Borough'] != 'Not assigned']
df = df.reset_index(drop=True)
df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


Read the CSV with latitude and longitude values for each neighborhood and add that data to the dataframe.

In [3]:
df2 = pd.read_csv('Geospatial_Coordinates.csv')  
df = df.join(df2.set_index('Postal Code'), on='Postal Code')
df.head(10)

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
5,M9A,Etobicoke,"Islington Avenue, Humber Valley Village",43.667856,-79.532242
6,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
7,M3B,North York,Don Mills,43.745906,-79.352188
8,M4B,East York,"Parkview Hill, Woodbine Gardens",43.706397,-79.309937
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937


Use the Nominatim geocoder from the geopy library to find the center of Toronto and store it in a variable. This will be useful later.

In [4]:
address = 'Toronto, ON'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
toronto_center=[latitude,longitude]
print('The geographical center coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical center coordinates of Toronto are 43.6534817, -79.3839347.


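Create functions to calculate the distance between two points based on latitude and longitude. They will be used to calculate the distance between each neighborhood and the center of Toronto. Two functions are defined because the column names differ: haversine reads the 'Neighborhood Latitude' and 'Neighborhood Longitude' columns of the venues dataframe, and haversine2 reads the 'Latitude' and 'Longitude' columns of the neighborhoods dataframe.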

In [5]:
def haversine(row):
    lon1 = longitude
    lat1 = latitude
    lon2 = row['Neighborhood Longitude']
    lat2 = row['Neighborhood Latitude']
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * arcsin(sqrt(a)) 
    km = 6367 * c
    return km
def haversine2(row):
    lon1 = longitude
    lat1 = latitude
    lon2 = row['Longitude']
    lat2 = row['Latitude']
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * arcsin(sqrt(a)) 
    km = 6367 * c
    return km
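The two functions above differ only in the column names they read. As a minimal sketch (not the notebook's own code), the same formula could be written once with the coordinates passed as arguments. The name `haversine_km` and its parameters are assumptions introduced here; the radius of 6367 km is the value used above.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6367  # same radius as the notebook's functions


def haversine_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))


# Toronto center (from the geocoder output above) to Parkwoods
center = (43.6534817, -79.3839347)
parkwoods = (43.7532586, -79.3296565)
print(round(haversine_km(*center, *parkwoods), 2))
```

With a dataframe, such a function could be applied row by row, for example `df.apply(lambda r: haversine_km(latitude, longitude, r['Latitude'], r['Longitude']), axis=1)`.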

Here we rename the "Neighbourhood" column to "Neighborhood" to keep it uniform with the other dataframes. We then apply the haversine formula to calculate each neighborhood's distance from the center and store it in a new column of the dataframe.

In [6]:
toronto_data = df
toronto_data = toronto_data.rename(columns={"Neighbourhood": "Neighborhood"})
toronto_data.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494


In [7]:
toronto_data['Distance from Center (km)'] = toronto_data.apply(haversine2, axis=1)
toronto_data.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Distance from Center (km)
0,M3A,North York,Parkwoods,43.753259,-79.329656,11.914322
1,M4A,North York,Victoria Village,43.725882,-79.315572,9.741969
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1.875256
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,9.717018
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1.077193


<h3> Exploratory Analysis and Visualization

Now we plot out the various neighborhoods of Toronto just to see what we are working with. 

In [8]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(toronto_data['Latitude'], toronto_data['Longitude'], toronto_data['Borough'], toronto_data['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

We will use the Foursquare API to get venue information for the neighborhoods. Here we store our API credentials so that we can build the request URL and retrieve our data.

In [9]:
CLIENT_ID = 'EFJ35P4GBVMCALU211BQ5Y1V31FPPCOATDQC3M0O0BO4FZOL' # your Foursquare ID
CLIENT_SECRET = 'ONWJKL4B531JGADVSU23ZVACVLQF4NLJ3OH25Z01KSFRH0BC' # your Foursquare Secret
VERSION = '20201028' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value
radius = 500 # define radius

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: EFJ35P4GBVMCALU211BQ5Y1V31FPPCOATDQC3M0O0BO4FZOL
CLIENT_SECRET:ONWJKL4B531JGADVSU23ZVACVLQF4NLJ3OH25Z01KSFRH0BC
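Hard-coding and printing credentials exposes them to anyone who reads the notebook. As a hedged alternative, the sketch below reads them from environment variables and masks the secret before printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions made for this example, not Foursquare conventions.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are assumptions for this sketch.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError("Missing environment variable: {}".format(missing)) from None


def mask(secret, visible=4):
    """Show only the first few characters of a secret when printing it."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)
```

The returned pair could then be assigned to `CLIENT_ID` and `CLIENT_SECRET`, and `mask(CLIENT_SECRET)` printed in place of the raw value.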


In [10]:
toronto_data.loc[0, 'Neighborhood']
neighborhood_latitude = toronto_data.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = toronto_data.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = toronto_data.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Parkwoods are 43.7532586, -79.3296565.


In [11]:
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=EFJ35P4GBVMCALU211BQ5Y1V31FPPCOATDQC3M0O0BO4FZOL&client_secret=ONWJKL4B531JGADVSU23ZVACVLQF4NLJ3OH25Z01KSFRH0BC&v=20201028&ll=43.7532586,-79.3296565&radius=500&limit=100'
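The URL above is assembled with string formatting. A minimal sketch of the same request URL built with the standard library's `urlencode`, which escapes parameter values, is shown below. The function name `build_explore_url` and the placeholder credentials are assumptions; the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the explore URL with the same query parameters as the cell above."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in the ll value unescaped
    return EXPLORE_ENDPOINT + "?" + urlencode(params, safe=",")


# Placeholder credentials, not real ones
print(build_explore_url("MY_ID", "MY_SECRET", "20201028", 43.7532586, -79.3296565, 500, 100))
```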

In [12]:
results = requests.get(url).json()

Now that we have our venue data, we create functions to parse the data and put the venue data into a pandas dataframe that we can work with.

In [13]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [14]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Brookbanks Park,Park,43.751976,-79.33214
1,Variety Store,Food & Drink Shop,43.751974,-79.333114


In [15]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

2 venues were returned by Foursquare.


In [16]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
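In getNearbyVenues, the lookup `v['venue']['categories'][0]['name']` would raise an IndexError for a venue whose category list is empty. The sketch below shows one defensive way to flatten the items, working on plain dictionaries. The helper names and the sample items are hypothetical; only the field names follow the response structure used above.

```python
def first_category_name(venue, default=None):
    """Return the name of a venue's first category, or `default` if it has none."""
    categories = venue.get("categories") or []
    return categories[0]["name"] if categories else default


def flatten_items(neighborhood, lat, lng, items):
    """Turn explore 'items' into tuples like those built in getNearbyVenues."""
    rows = []
    for item in items:
        venue = item["venue"]
        rows.append((
            neighborhood,
            lat,
            lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            first_category_name(venue),
        ))
    return rows


# Hypothetical items shaped like the response fields used above
sample_items = [
    {"venue": {"name": "Brookbanks Park",
               "location": {"lat": 43.751976, "lng": -79.33214},
               "categories": [{"name": "Park"}]}},
    {"venue": {"name": "Uncategorized Place",
               "location": {"lat": 43.75, "lng": -79.33},
               "categories": []}},
]
rows = flatten_items("Parkwoods", 43.753259, -79.329656, sample_items)
```

Rows with a missing category end up with `None` in the last position, which a later filter such as `str.contains("Restaurant")` would need to handle, for instance by passing `na=False`.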

In [17]:
toronto_venues = getNearbyVenues(names=toronto_data['Neighborhood'],
                                   latitudes=toronto_data['Latitude'],
                                   longitudes=toronto_data['Longitude']
                                  )

Parkwoods
Victoria Village
Regent Park, Harbourfront
Lawrence Manor, Lawrence Heights
Queen's Park, Ontario Provincial Government
Islington Avenue, Humber Valley Village
Malvern, Rouge
Don Mills
Parkview Hill, Woodbine Gardens
Garden District, Ryerson
Glencairn
West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale
Rouge Hill, Port Union, Highland Creek
Don Mills
Woodbine Heights
St. James Town
Humewood-Cedarvale
Eringate, Bloordale Gardens, Old Burnhamthorpe, Markland Wood
Guildwood, Morningside, West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor, Wilson Heights, Downsview North
Thorncliffe Park
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Scarborough Village
Fairview, Henry Farm, Oriole
Northwood Park, York University
East Toronto, Broadview North (Old East York)
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
Kennedy Park, Ionview, East Birchmo

In [18]:
print(toronto_venues.shape)
toronto_venues.head()

(2139, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
1,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
2,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena
3,Victoria Village,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant
4,Victoria Village,43.725882,-79.315572,Tim Hortons,43.725517,-79.313103,Coffee Shop


In [19]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 273 unique categories.


We now apply the haversine formula again, this time to the venues dataframe, to attach each venue's neighborhood distance from the center. We use the neighborhood's coordinates rather than the venue's own coordinates because the later analysis works at the neighborhood level.

In [20]:
toronto_venues['Distance from Center (km)'] = toronto_venues.apply(haversine, axis=1)
toronto_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Distance from Center (km)
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park,11.914322
1,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop,11.914322
2,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena,9.741969
3,Victoria Village,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant,9.741969
4,Victoria Village,43.725882,-79.315572,Tim Hortons,43.725517,-79.313103,Coffee Shop,9.741969


In [21]:
toronto_venues.shape

(2139, 8)

Now we derive two dataframes from the venue data: one with all restaurants, and a subset of it with only sushi restaurants.

In [22]:
toronto_res = toronto_venues[toronto_venues['Venue Category'].str.contains("Restaurant")]
toronto_res.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Distance from Center (km)
3,Victoria Village,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant,9.741969
10,"Regent Park, Harbourfront",43.65426,-79.360636,Impact Kitchen,43.656369,-79.35698,Restaurant,1.875256
29,"Regent Park, Harbourfront",43.65426,-79.360636,Cluny Bistro & Boulangerie,43.650565,-79.357843,French Restaurant,1.875256
31,"Regent Park, Harbourfront",43.65426,-79.360636,El Catrin,43.650601,-79.35892,Mexican Restaurant,1.875256
53,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,Lac Vien Vietnamese Restaurant,43.721259,-79.468472,Vietnamese Restaurant,9.717018


In [23]:
toronto_res.shape

(482, 8)

In [24]:
toronto_sus = toronto_res[toronto_res['Venue Category'].str.contains("Sushi Restaurant")]
toronto_sus.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Distance from Center (km)
72,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,Tokyo Sushi,43.665885,-79.386977,Sushi Restaurant,1.077193
163,"Garden District, Ryerson",43.657162,-79.378937,Spring Sushi,43.656253,-79.38066,Sushi Restaurant,0.573314
404,Berczy Park,43.644771,-79.373306,Oyshi Sushi,43.64234,-79.375853,Sushi Restaurant,1.291265
432,Leaside,43.70906,-79.363452,Kintako Japanese Restaurant,43.711597,-79.363962,Sushi Restaurant,6.391792
464,Central Bay Street,43.657952,-79.387383,Japango,43.655268,-79.385165,Sushi Restaurant,0.568913


In [25]:
toronto_sus.shape

(30, 8)

We have our two dataframes. Now let's plot them on heatmaps to see the density of both restaurants in general and sushi restaurants in Toronto.

<h4>Heatmap of All Restaurants in Toronto

In [26]:
from folium.plugins import HeatMap
heat_data = [[row['Venue Latitude'],row['Venue Longitude']] for index, row in toronto_res.iterrows()]
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)
folium.TileLayer('cartodbpositron').add_to(map_toronto) #cartodbpositron cartodbdark_matter
HeatMap(heat_data).add_to(map_toronto)
folium.Marker(toronto_center).add_to(map_toronto)


map_toronto

<h4>Heatmap of Sushi Restaurants in Toronto

In [27]:
sushi_data = [[row['Venue Latitude'],row['Venue Longitude']] for index, row in toronto_sus.iterrows()]
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)
folium.TileLayer('cartodbpositron').add_to(map_toronto) #cartodbpositron cartodbdark_matter
HeatMap(sushi_data).add_to(map_toronto)
folium.Marker(toronto_center).add_to(map_toronto)


map_toronto

From these heatmaps we can see that both restaurants in general and sushi restaurants are clustered around the center of the city. Sushi restaurants become much less frequent outside the center. A good target zone for a sushi restaurant appears to be just northwest of the city's center: this area is close to the center, and the sushi heatmap shows no sushi restaurants there.

Now we get our data ready for our k-means clustering model. To do this we one-hot encode the restaurant dataframe, since other venues are irrelevant for this analysis, and average the encoded columns per neighborhood. Then we create a dataframe that stores each neighborhood's most frequent restaurant categories.

In [28]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_res[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_res['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Asian Restaurant,Belgian Restaurant,Brazilian Restaurant,Cajun / Creole Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Dim Sum Restaurant,Doner Restaurant,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Hakka Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean BBQ Restaurant,Korean Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant,Moroccan Restaurant,New American Restaurant,Portuguese Restaurant,Ramen Restaurant,Restaurant,Seafood Restaurant,Southern / Soul Food Restaurant,Sushi Restaurant,Taiwanese Restaurant,Thai Restaurant,Theme Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
3,Victoria Village,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0
10,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
29,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
31,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
53,"Lawrence Manor, Lawrence Heights",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1


In [29]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Asian Restaurant,Belgian Restaurant,Brazilian Restaurant,Cajun / Creole Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Dim Sum Restaurant,Doner Restaurant,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,French Restaurant,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Hakka Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean BBQ Restaurant,Korean Restaurant,Latin American Restaurant,Malay Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant,Moroccan Restaurant,New American Restaurant,Portuguese Restaurant,Ramen Restaurant,Restaurant,Seafood Restaurant,Southern / Soul Food Restaurant,Sushi Restaurant,Taiwanese Restaurant,Thai Restaurant,Theme Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0
2,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Bedford Park, Lawrence Manor East",0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.111111,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.111111,0.0,0.111111,0.0,0.0,0.0,0.0
4,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.076923,0.0,0.0,0.076923,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.153846,0.153846,0.0,0.076923,0.0,0.076923,0.0,0.0,0.076923,0.0
5,"Brockton, Parkdale Village, Exhibition Place",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,"Business reply mail Processing Centre, South C...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Canada Post Gateway Processing Centre,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Cedarbrae,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0
9,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.15,0.1,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.05,0.0,0.0,0.05,0.05,0.05,0.05,0.05,0.0,0.05,0.0,0.1,0.0,0.0,0.05,0.0
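To make the two steps above concrete: averaging one-hot columns per neighborhood gives, for each category, its share of that neighborhood's restaurants. The plain-Python sketch below computes the same shares from (neighborhood, category) pairs, omitting zero entries. The function name and the sample rows are hypothetical and not taken from the Foursquare results.

```python
from collections import Counter, defaultdict


def category_frequencies(pairs):
    """Map each neighborhood to {category: share of its restaurants}.

    `pairs` is an iterable of (neighborhood, category) tuples. The result equals
    the mean of the one-hot columns per neighborhood, with zero entries omitted.
    """
    counts = defaultdict(Counter)
    for neighborhood, category in pairs:
        counts[neighborhood][category] += 1
    return {
        hood: {cat: n / sum(counter.values()) for cat, n in counter.items()}
        for hood, counter in counts.items()
    }


# Hypothetical rows, not taken from the Foursquare results
sample = [
    ("A", "Sushi Restaurant"),
    ("A", "Italian Restaurant"),
    ("A", "Italian Restaurant"),
    ("A", "Thai Restaurant"),
    ("B", "Sushi Restaurant"),
]
freqs = category_frequencies(sample)
print(freqs["A"]["Italian Restaurant"])  # 2 of A's 4 restaurants
```

A neighborhood with a single restaurant gets a share of 1.0 for that category, which is why rows such as Agincourt above contain a lone 1.0.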


In [30]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False) 
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [31]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Restaurant'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Restaurant'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
0,Agincourt,Latin American Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
1,"Bathurst Manor, Wilson Heights, Downsview North",Sushi Restaurant,Restaurant,Chinese Restaurant,Middle Eastern Restaurant,Vietnamese Restaurant,Doner Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant
2,Bayview Village,Japanese Restaurant,Chinese Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
3,"Bedford Park, Lawrence Manor East",Italian Restaurant,Indian Restaurant,Sushi Restaurant,Greek Restaurant,Restaurant,Comfort Food Restaurant,Thai Restaurant,American Restaurant,Belgian Restaurant,Dumpling Restaurant
4,Berczy Park,Seafood Restaurant,Restaurant,Comfort Food Restaurant,Greek Restaurant,Vegetarian / Vegan Restaurant,Italian Restaurant,Japanese Restaurant,Eastern European Restaurant,French Restaurant,Sushi Restaurant
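The column-naming loop above relies on an exception to fall back to the 'th' suffix. That works for ten columns but would yield "21th" if num_top_venues exceeded 20. The sketch below shows an explicit ordinal helper and a plain-Python ranking of categories by frequency. Both function names are assumptions, and the alphabetical tie-break is a choice made for this sketch rather than the order pandas produces for tied values.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 3rd, 4th, 11th."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


def top_categories(frequencies, n):
    """Return the n most frequent categories; ties are broken alphabetically."""
    ranked = sorted(frequencies.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]


# Hypothetical frequencies for one neighborhood
shares = {"Sushi Restaurant": 0.25, "Italian Restaurant": 0.5, "Thai Restaurant": 0.25}
columns = ["{} Most Common Restaurant".format(ordinal(i + 1)) for i in range(2)]
print(dict(zip(columns, top_categories(shares, 2))))
```

Note that, as in the tables above, categories with a frequency of zero still fill the lower ranks when a neighborhood has fewer distinct categories than requested columns.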


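Now we run our model on the dataset, and each neighborhood gets a cluster label based on its restaurant category frequencies. The distance from the center is not among the clustering features; it is carried along in the merged dataframe for interpreting the clusters afterwards.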

In [32]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 0, 1, 1, 1, 1, 1, 1, 1])
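For readers unfamiliar with what the fit above does, here is a minimal pure-Python sketch of the basic k-means iteration (Lloyd's algorithm) on frequency vectors. It is illustrative only: scikit-learn's KMeans uses k-means++ initialization by default and further refinements, whereas this sketch takes fixed initial centroids. The function name and the sample vectors are hypothetical.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable.

    `points` and `centroids` are lists of equal-length numeric tuples.
    Returns (labels, centroids).
    """
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        new_labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        centroids = updated
    return labels, centroids


# Hypothetical (sushi share, Italian share) vectors for four neighborhoods
points = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
labels, centers = kmeans(points, centroids=[points[0], points[2]])
print(labels)
```

Neighborhoods with similar category mixes end up with the same label; the label numbers themselves carry no meaning beyond identifying the group.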

Here is our dataset with cluster labels for each neighborhood

In [33]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = toronto_data

# join the sorted venue data onto toronto_data so each neighborhood keeps its latitude/longitude
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
toronto_merged = toronto_merged.dropna()
toronto_merged.head() # check the last columns!

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Distance from Center (km),Cluster Labels,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
1,M4A,North York,Victoria Village,43.725882,-79.315572,9.741969,3.0,Portuguese Restaurant,Vietnamese Restaurant,Dim Sum Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1.875256,1.0,French Restaurant,Restaurant,Mexican Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,9.717018,1.0,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1.077193,1.0,Italian Restaurant,Sushi Restaurant,Restaurant,Chinese Restaurant,Portuguese Restaurant,Mexican Restaurant,Vietnamese Restaurant,Doner Restaurant,French Restaurant,Filipino Restaurant
6,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353,22.838436,4.0,Fast Food Restaurant,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Falafel Restaurant,Ethiopian Restaurant


<h4>Map of each neighborhood restaurant cluster

In [34]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels'].astype(int)):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
folium.Marker(toronto_center).add_to(map_clusters)
map_clusters

In [35]:
toronto_merged.shape

(62, 17)

Now we look at the data for each neighborhood cluster to work out where the best location to open a sushi restaurant might be.

In [36]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Distance from Center (km),Cluster Labels,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
7,North York,10.582592,0.0,Japanese Restaurant,Dim Sum Restaurant,Italian Restaurant,Asian Restaurant,Caribbean Restaurant,Chinese Restaurant,Restaurant,Dumpling Restaurant,German Restaurant,French Restaurant
10,North York,7.937155,0.0,Japanese Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
13,North York,8.758226,0.0,Japanese Restaurant,Dim Sum Restaurant,Italian Restaurant,Asian Restaurant,Caribbean Restaurant,Chinese Restaurant,Restaurant,Dumpling Restaurant,German Restaurant,French Restaurant
34,North York,15.191392,0.0,Caribbean Restaurant,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
39,North York,14.832285,0.0,Japanese Restaurant,Chinese Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
70,Etobicoke,12.835371,0.0,Chinese Restaurant,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant


In [37]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Distance from Center (km),Cluster Labels,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
2,Downtown Toronto,1.875256,1.0,French Restaurant,Restaurant,Mexican Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
3,North York,9.717018,1.0,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
4,Downtown Toronto,1.077193,1.0,Italian Restaurant,Sushi Restaurant,Restaurant,Chinese Restaurant,Portuguese Restaurant,Mexican Restaurant,Vietnamese Restaurant,Doner Restaurant,French Restaurant,Filipino Restaurant
9,Downtown Toronto,0.573314,1.0,Japanese Restaurant,Fast Food Restaurant,Ramen Restaurant,Italian Restaurant,Middle Eastern Restaurant,Ethiopian Restaurant,Mexican Restaurant,Chinese Restaurant,New American Restaurant,Vietnamese Restaurant
15,Downtown Toronto,0.719526,1.0,Restaurant,American Restaurant,Japanese Restaurant,Seafood Restaurant,Moroccan Restaurant,Comfort Food Restaurant,New American Restaurant,German Restaurant,Vegetarian / Vegan Restaurant,Italian Restaurant
18,Scarborough,19.889469,1.0,Restaurant,Mexican Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
20,Downtown Toronto,1.291265,1.0,Seafood Restaurant,Restaurant,Comfort Food Restaurant,Greek Restaurant,Vegetarian / Vegan Restaurant,Italian Restaurant,Japanese Restaurant,Eastern European Restaurant,French Restaurant,Sushi Restaurant
22,Scarborough,18.721466,1.0,Korean BBQ Restaurant,Mexican Restaurant,Vietnamese Restaurant,Hakka Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
23,East York,6.391792,1.0,Sushi Restaurant,Restaurant,Mexican Restaurant,Vietnamese Restaurant,Dim Sum Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
24,Downtown Toronto,0.568913,1.0,Italian Restaurant,Thai Restaurant,Japanese Restaurant,Indian Restaurant,Portuguese Restaurant,French Restaurant,Vegetarian / Vegan Restaurant,Korean Restaurant,Modern European Restaurant,New American Restaurant


In [38]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Distance from Center (km),Cluster Labels,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
31,West Toronto,4.996073,2.0,Middle Eastern Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
71,Scarborough,12.856397,2.0,Middle Eastern Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
74,Central Toronto,2.760599,2.0,Indian Restaurant,Middle Eastern Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant


In [39]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Distance from Center (km),Cluster Labels,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
1,North York,9.741969,3.0,Portuguese Restaurant,Vietnamese Restaurant,Dim Sum Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant


In [40]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Distance from Center (km),Cluster Labels,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
6,Scarborough,22.838436,4.0,Fast Food Restaurant,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Falafel Restaurant,Ethiopian Restaurant
89,Etobicoke,19.004174,4.0,Fast Food Restaurant,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Falafel Restaurant,Ethiopian Restaurant
102,Etobicoke,11.357574,4.0,Fast Food Restaurant,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Falafel Restaurant,Ethiopian Restaurant
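
The per-cluster listings above can also be condensed into one summary table. The following is only a minimal sketch on a made-up toy frame, not the notebook's data: the frame `toy` and the output column names `neighborhoods`, `mean_distance_km` and `top_first_choice` are hypothetical, while the input column names are copied from toronto_merged.

```python
import pandas as pd

# Toy stand-in for toronto_merged (hypothetical values, reduced columns).
toy = pd.DataFrame({
    "Borough": ["North York", "Downtown Toronto", "Downtown Toronto",
                "Scarborough", "Etobicoke"],
    "Distance from Center (km)": [8.0, 1.0, 3.0, 22.0, 12.0],
    "Cluster Labels": [0.0, 1.0, 1.0, 4.0, 4.0],
    "1st Most Common Restaurant": ["Japanese Restaurant", "Italian Restaurant",
                                   "Italian Restaurant", "Fast Food Restaurant",
                                   "Fast Food Restaurant"],
})

# One row per cluster: size, mean distance, and the most frequent top-ranked type.
summary = toy.groupby("Cluster Labels").agg(
    neighborhoods=("Borough", "size"),
    mean_distance_km=("Distance from Center (km)", "mean"),
    top_first_choice=("1st Most Common Restaurant", lambda s: s.mode().iat[0]),
)
print(summary)
```

A table like this makes it easier to compare clusters side by side before reading the individual rows.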


Here we split toronto_merged into two dataframes: mer_res, the neighborhoods with no sushi restaurant among their ten most common restaurant types, and mer_sus, the neighborhoods that have one.

In [41]:
# Keep the rows in which no column contains 'Sushi' (case-insensitive)
mer_res = toronto_merged[~toronto_merged.apply(lambda r: r.str.contains('Sushi', case=False).any(), axis=1)]
mer_res

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Distance from Center (km),Cluster Labels,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
1,M4A,North York,Victoria Village,43.725882,-79.315572,9.741969,3.0,Portuguese Restaurant,Vietnamese Restaurant,Dim Sum Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1.875256,1.0,French Restaurant,Restaurant,Mexican Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,9.717018,1.0,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
6,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353,22.838436,4.0,Fast Food Restaurant,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Falafel Restaurant,Ethiopian Restaurant
7,M3B,North York,Don Mills,43.745906,-79.352188,10.582592,0.0,Japanese Restaurant,Dim Sum Restaurant,Italian Restaurant,Asian Restaurant,Caribbean Restaurant,Chinese Restaurant,Restaurant,Dumpling Restaurant,German Restaurant,French Restaurant
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,0.573314,1.0,Japanese Restaurant,Fast Food Restaurant,Ramen Restaurant,Italian Restaurant,Middle Eastern Restaurant,Ethiopian Restaurant,Mexican Restaurant,Chinese Restaurant,New American Restaurant,Vietnamese Restaurant
10,M6B,North York,Glencairn,43.709577,-79.445073,7.937155,0.0,Japanese Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
13,M3C,North York,Don Mills,43.7259,-79.340923,8.758226,0.0,Japanese Restaurant,Dim Sum Restaurant,Italian Restaurant,Asian Restaurant,Caribbean Restaurant,Chinese Restaurant,Restaurant,Dumpling Restaurant,German Restaurant,French Restaurant
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,0.719526,1.0,Restaurant,American Restaurant,Japanese Restaurant,Seafood Restaurant,Moroccan Restaurant,Comfort Food Restaurant,New American Restaurant,German Restaurant,Vegetarian / Vegan Restaurant,Italian Restaurant
18,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,19.889469,1.0,Restaurant,Mexican Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant


In [42]:
mer_res.shape

(45, 17)

In [43]:
# Keep the rows in which at least one column contains 'Sushi' (case-insensitive)
mer_sus = toronto_merged[toronto_merged.apply(lambda r: r.str.contains('Sushi', case=False).any(), axis=1)]
mer_sus

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Distance from Center (km),Cluster Labels,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1.077193,1.0,Italian Restaurant,Sushi Restaurant,Restaurant,Chinese Restaurant,Portuguese Restaurant,Mexican Restaurant,Vietnamese Restaurant,Doner Restaurant,French Restaurant,Filipino Restaurant
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,1.291265,1.0,Seafood Restaurant,Restaurant,Comfort Food Restaurant,Greek Restaurant,Vegetarian / Vegan Restaurant,Italian Restaurant,Japanese Restaurant,Eastern European Restaurant,French Restaurant,Sushi Restaurant
23,M4G,East York,Leaside,43.70906,-79.363452,6.391792,1.0,Sushi Restaurant,Restaurant,Mexican Restaurant,Vietnamese Restaurant,Dim Sum Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
28,M3H,North York,"Bathurst Manor, Wilson Heights, Downsview North",43.754328,-79.442259,12.146661,1.0,Sushi Restaurant,Restaurant,Chinese Restaurant,Middle Eastern Restaurant,Vietnamese Restaurant,Doner Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant
30,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568,0.327407,1.0,Restaurant,Thai Restaurant,American Restaurant,Sushi Restaurant,Modern European Restaurant,New American Restaurant,Gluten-free Restaurant,Vegetarian / Vegan Restaurant,Japanese Restaurant,Latin American Restaurant
36,M5J,Downtown Toronto,"Harbourfront East, Union Station, Toronto Islands",43.640816,-79.381752,1.418408,1.0,Restaurant,Italian Restaurant,Indian Restaurant,Japanese Restaurant,New American Restaurant,Chinese Restaurant,Mexican Restaurant,Seafood Restaurant,Sushi Restaurant,Vegetarian / Vegan Restaurant
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,3.861179,1.0,Greek Restaurant,Italian Restaurant,Restaurant,Indian Restaurant,American Restaurant,Sushi Restaurant,Caribbean Restaurant,Dumpling Restaurant,German Restaurant,French Restaurant
47,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,5.759978,1.0,Italian Restaurant,Sushi Restaurant,Fast Food Restaurant,Restaurant,Vietnamese Restaurant,Dim Sum Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Falafel Restaurant
55,M5M,North York,"Bedford Park, Lawrence Manor East",43.733283,-79.41975,9.323103,1.0,Italian Restaurant,Indian Restaurant,Sushi Restaurant,Greek Restaurant,Restaurant,Comfort Food Restaurant,Thai Restaurant,American Restaurant,Belgian Restaurant,Dumpling Restaurant
59,M2N,North York,"Willowdale, Willowdale East",43.77012,-79.408493,13.110679,1.0,Ramen Restaurant,Japanese Restaurant,Sushi Restaurant,Vietnamese Restaurant,Fast Food Restaurant,Restaurant,Middle Eastern Restaurant,Dim Sum Restaurant,French Restaurant,Filipino Restaurant


In [44]:
mer_sus.shape

(17, 17)
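
The split above can also be built from a single boolean mask that is computed once and then negated. The following is a minimal sketch on a made-up toy frame, not the notebook's data: the names `toy`, `rank_cols`, `has_sushi`, `with_sushi` and `without_sushi` are hypothetical. It assumes that only the "Most Common Restaurant" columns need to be searched, which is a narrower check than the row-wise search over every column used in the cells above.

```python
import pandas as pd

# Toy stand-in for toronto_merged (hypothetical values, two ranking columns only).
toy = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D"],
    "Cluster Labels": [1.0, 1.0, 0.0, 2.0],
    "1st Most Common Restaurant": ["Sushi Restaurant", "Italian Restaurant",
                                   "Japanese Restaurant", "Indian Restaurant"],
    "2nd Most Common Restaurant": ["Restaurant", "Sushi Restaurant",
                                   "Vietnamese Restaurant", "Doner Restaurant"],
})

# Search only the ranking columns so numeric columns are never touched.
rank_cols = [c for c in toy.columns if c.endswith("Most Common Restaurant")]

# True for every row in which any ranking column mentions sushi.
has_sushi = toy[rank_cols].apply(
    lambda col: col.str.contains("Sushi", case=False, na=False)
).any(axis=1)

with_sushi = toy[has_sushi]
without_sushi = toy[~has_sushi]

print(with_sushi["Neighborhood"].tolist())
print(without_sushi["Neighborhood"].tolist())
```

Because both halves come from the same mask, their row counts always add up to the row count of the input, which is a quick sanity check on the split.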

In [45]:
mer_res = mer_res.sort_values(by=['Distance from Center (km)'])
mer_res.head(10)

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Distance from Center (km),Cluster Labels,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
24,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,0.568913,1.0,Italian Restaurant,Thai Restaurant,Japanese Restaurant,Indian Restaurant,Portuguese Restaurant,French Restaurant,Vegetarian / Vegan Restaurant,Korean Restaurant,Modern European Restaurant,New American Restaurant
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,0.573314,1.0,Japanese Restaurant,Fast Food Restaurant,Ramen Restaurant,Italian Restaurant,Middle Eastern Restaurant,Ethiopian Restaurant,Mexican Restaurant,Chinese Restaurant,New American Restaurant,Vietnamese Restaurant
48,M5L,Downtown Toronto,"Commerce Court, Victoria Hotel",43.648198,-79.379817,0.674022,1.0,Restaurant,American Restaurant,Japanese Restaurant,Seafood Restaurant,Asian Restaurant,Thai Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,New American Restaurant,Molecular Gastronomy Restaurant
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,0.719526,1.0,Restaurant,American Restaurant,Japanese Restaurant,Seafood Restaurant,Moroccan Restaurant,Comfort Food Restaurant,New American Restaurant,German Restaurant,Vegetarian / Vegan Restaurant,Italian Restaurant
42,M5K,Downtown Toronto,"Toronto Dominion Centre, Design Exchange",43.647177,-79.381576,0.725839,1.0,Restaurant,American Restaurant,Japanese Restaurant,Seafood Restaurant,Italian Restaurant,Asian Restaurant,Fast Food Restaurant,French Restaurant,New American Restaurant,Chinese Restaurant
92,M5W,Downtown Toronto,Stn A PO Boxes,43.646435,-79.374846,1.071082,1.0,Italian Restaurant,Restaurant,Japanese Restaurant,Seafood Restaurant,Molecular Gastronomy Restaurant,Fast Food Restaurant,Vegetarian / Vegan Restaurant,Eastern European Restaurant,Comfort Food Restaurant,French Restaurant
84,M5T,Downtown Toronto,"Kensington Market, Chinatown, Grange Park",43.653206,-79.400049,1.296014,1.0,Vegetarian / Vegan Restaurant,Mexican Restaurant,Vietnamese Restaurant,Dumpling Restaurant,Comfort Food Restaurant,Japanese Restaurant,Filipino Restaurant,Doner Restaurant,Dim Sum Restaurant,Caribbean Restaurant
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1.875256,1.0,French Restaurant,Restaurant,Mexican Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
96,M4X,Downtown Toronto,"St. James Town, Cabbagetown",43.667967,-79.367675,2.073564,1.0,Restaurant,Italian Restaurant,Chinese Restaurant,Indian Restaurant,Thai Restaurant,Taiwanese Restaurant,Japanese Restaurant,Caribbean Restaurant,Doner Restaurant,French Restaurant
74,M5R,Central Toronto,"The Annex, North Midtown, Yorkville",43.67271,-79.405678,2.760599,2.0,Indian Restaurant,Middle Eastern Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant


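Every neighborhood with a sushi restaurant among its most common types already falls in cluster 1, so we exclude that cluster to focus on markets where sushi is not already common. Because mer_res was sorted by distance in the previous cell, the remaining neighborhoods are listed from closest to farthest from the center of the city.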

In [46]:
# mer_res is already sorted by distance (In [45]); drop every neighborhood in cluster 1
mer_res.loc[mer_res['Cluster Labels'].astype(int) != 1]

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Distance from Center (km),Cluster Labels,1st Most Common Restaurant,2nd Most Common Restaurant,3rd Most Common Restaurant,4th Most Common Restaurant,5th Most Common Restaurant,6th Most Common Restaurant,7th Most Common Restaurant,8th Most Common Restaurant,9th Most Common Restaurant,10th Most Common Restaurant
74,M5R,Central Toronto,"The Annex, North Midtown, Yorkville",43.67271,-79.405678,2.760599,2.0,Indian Restaurant,Middle Eastern Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
31,M6H,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259,4.996073,2.0,Middle Eastern Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
10,M6B,North York,Glencairn,43.709577,-79.445073,7.937155,0.0,Japanese Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
13,M3C,North York,Don Mills,43.7259,-79.340923,8.758226,0.0,Japanese Restaurant,Dim Sum Restaurant,Italian Restaurant,Asian Restaurant,Caribbean Restaurant,Chinese Restaurant,Restaurant,Dumpling Restaurant,German Restaurant,French Restaurant
1,M4A,North York,Victoria Village,43.725882,-79.315572,9.741969,3.0,Portuguese Restaurant,Vietnamese Restaurant,Dim Sum Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
7,M3B,North York,Don Mills,43.745906,-79.352188,10.582592,0.0,Japanese Restaurant,Dim Sum Restaurant,Italian Restaurant,Asian Restaurant,Caribbean Restaurant,Chinese Restaurant,Restaurant,Dumpling Restaurant,German Restaurant,French Restaurant
102,M8Z,Etobicoke,"Mimico NW, The Queensway West, South of Bloor,...",43.628841,-79.520999,11.357574,4.0,Fast Food Restaurant,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Falafel Restaurant,Ethiopian Restaurant
70,M9P,Etobicoke,Westmount,43.696319,-79.532242,12.835371,0.0,Chinese Restaurant,Vietnamese Restaurant,Doner Restaurant,Greek Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
71,M1R,Scarborough,"Wexford, Maryvale",43.750072,-79.295849,12.856397,2.0,Middle Eastern Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant,Ethiopian Restaurant
39,M2K,North York,Bayview Village,43.786947,-79.385975,14.832285,0.0,Japanese Restaurant,Chinese Restaurant,Vietnamese Restaurant,Doner Restaurant,Gluten-free Restaurant,German Restaurant,French Restaurant,Filipino Restaurant,Fast Food Restaurant,Falafel Restaurant
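
The exclude-then-rank step can be written as one chain. The following is a minimal sketch on a made-up toy frame, not the notebook's data: the names `toy`, `saturated_cluster` and `shortlist` are hypothetical. It compares the float labels to the integer directly instead of casting with astype(int), which assumes that any rows with a missing cluster label should be kept rather than raise an error.

```python
import pandas as pd

# Toy stand-in for mer_res (hypothetical values, reduced columns).
toy = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D"],
    "Distance from Center (km)": [0.6, 12.8, 2.8, 7.9],
    "Cluster Labels": [1.0, 2.0, 2.0, 0.0],
})

saturated_cluster = 1  # the cluster that already contains the sushi neighborhoods

# Drop the saturated cluster, then rank what is left by distance to the center.
shortlist = (
    toy.loc[toy["Cluster Labels"] != saturated_cluster]
       .sort_values(by="Distance from Center (km)")
       .reset_index(drop=True)
)
print(shortlist["Neighborhood"].tolist())
```

Sorting inside the chain means the result does not depend on an earlier cell having sorted the frame.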


Conclusions and results are included in the final written report.