# The Battle of Neighborhoods Toronto

This notebook contains Questions 1, 2, and 3 of the assignment, separated by section headers.

## Question 1

In [1]:
import pandas as pd
import requests

### 1.1 A quick check of the data frame loaded from Wikipedia

In [2]:
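from io import StringIO

url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
wiki_page = requests.get(url)

# wrap the HTML in StringIO; recent pandas versions deprecate passing a literal HTML string
wiki_data = pd.read_html(StringIO(wiki_page.text))[0]
wiki_data.head(3)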

Unnamed: 0,0,1,2,3,4,5,6,7,8
0,M1ANot assigned,M2ANot assigned,M3ANorth York(Parkwoods),M4ANorth York(Victoria Village),M5ADowntown Toronto(Regent Park / Harbourfront),M6ANorth York(Lawrence Manor / Lawrence Heights),M7AQueen's Park(Ontario Provincial Government),M8ANot assigned,M9AEtobicoke(Islington Avenue)
1,M1BScarborough(Malvern / Rouge),M2BNot assigned,M3BNorth York(Don Mills)North,M4BEast York(Parkview Hill / Woodbine Gardens),"M5BDowntown Toronto(Garden District, Ryerson)",M6BNorth York(Glencairn),M7BNot assigned,M8BNot assigned,M9BEtobicoke(West Deane Park / Princess Garden...
2,M1CScarborough(Rouge Hill / Port Union / Highl...,M2CNot assigned,M3CNorth York(Don Mills)South(Flemingdon Park),M4CEast York(Woodbine Heights),M5CDowntown Toronto(St. James Town),M6CYork(Humewood-Cedarvale),M7CNot assigned,M8CNot assigned,M9CEtobicoke(Eringate / Bloordale Gardens / Ol...


Each cell of the table above fuses the postal code, borough, and neighborhoods into a single string, and the column labels are only positional integers, so the data frame needs to be restructured.

### 1.2 Reformat the data frame structure

In [3]:
wiki_data = wiki_data.stack().reset_index()[[0]]
wiki_data.columns=['value']

post_code = [v[:3] for v in wiki_data['value']]
borough = [v[3:].split('(')[0] for v in wiki_data['value']]
neighborhood = [v[3:].split('(')[1].split(')')[0] if 'Not assigned' not in v else 'Not assigned' for v in wiki_data['value']]

When multiple neighborhoods share a common borough, separate them with a comma.

In [4]:
neighborhood = [(', ').join(v.split(' / ')) if '/' in v else v for v in neighborhood]

If a cell has a borough but a Not assigned neighborhood, the neighborhood is set to the borough name.

In [5]:
for i in range(len(borough)):
  if borough[i] != 'Not assigned' and neighborhood[i] == 'Not assigned':
    neighborhood[i] = borough[i]
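
The three parsing steps above can also be folded into one helper, which makes them easier to try on single cells. This is a minimal sketch rather than the notebook's own code: `parse_cell` is a hypothetical name, and the fallback for a cell that has a borough but no parentheses is an assumption.

```python
def parse_cell(cell):
    """Split one fused table cell into (postal code, borough, neighborhood)."""
    code, rest = cell[:3], cell[3:]
    if rest == 'Not assigned':
        return code, 'Not assigned', 'Not assigned'
    # everything before the first '(' is treated as the borough
    borough, paren, tail = rest.partition('(')
    if not paren:
        # assumption: no parentheses means no neighborhood, so reuse the borough
        return code, borough, borough
    # the text inside the first pair of parentheses lists the neighborhoods
    neighborhood = ', '.join(tail.split(')')[0].split(' / '))
    return code, borough, neighborhood


print(parse_cell('M5ADowntown Toronto(Regent Park / Harbourfront)'))
print(parse_cell('M1ANot assigned'))
```

Applied to the stacked column, `[parse_cell(v) for v in wiki_data['value']]` would give one tuple per cell.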

The dataframe will consist of three columns: PostalCode, Borough, and Neighborhood

In [6]:
df = pd.DataFrame({'PostalCode':post_code, 'Borough':borough, 'Neighborhood':neighborhood})

Only process the cells that have an assigned borough. Ignore cells with a borough that is Not assigned

In [7]:
df = df[df['Borough'] != 'Not assigned'].reset_index(drop=True)

A few abnormal cases (e.g., M5W, M7R, and M7Y) need to be fixed manually.

In [8]:
df.loc[df['PostalCode']=='M5W', ['Borough', 'Neighborhood']] = ['Downtown Toronto', 'Stn A PO Boxes']
df.loc[df['PostalCode']=='M7R', ['Borough', 'Neighborhood']] = ['Mississauga', 'Canada Post Gateway Processing Centre']
df.loc[df['PostalCode']=='M7Y', ['Borough', 'Neighborhood']] = ['East Toronto', 'Business reply mail Processing Centre']

display(df)

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Queen's Park,Ontario Provincial Government
...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North"
99,M4Y,Downtown Toronto,Church and Wellesley
100,M7Y,East Toronto,Business reply mail Processing Centre
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu..."


Sanity check of the data frame

In [9]:
print('num. of non-identical postal codes: ', len(df.groupby(['PostalCode'])))
print('num. of not assigned boroughs: ', df['Borough'].str.count("Not assigned").sum())
print('num. of not assigned neighborhoods: ', df['Neighborhood'].str.count("Not assigned").sum())

num. of non-identical postal codes:  103
num. of not assigned boroughs:  0
num. of not assigned neighborhoods:  0
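
The same checks can be written with exact matching instead of substring counting. The sketch below runs on a small hand-made frame (the three rows are copied from the table above); the variable names are illustrative only.

```python
import pandas as pd

toy = pd.DataFrame({
    'PostalCode': ['M3A', 'M4A', 'M5A'],
    'Borough': ['North York', 'North York', 'Downtown Toronto'],
    'Neighborhood': ['Parkwoods', 'Victoria Village', 'Regent Park, Harbourfront'],
})

# number of distinct postal codes
unique_codes = toy['PostalCode'].nunique()
# rows whose value is exactly 'Not assigned'
unassigned_boroughs = int(toy['Borough'].eq('Not assigned').sum())
unassigned_neighborhoods = int(toy['Neighborhood'].eq('Not assigned').sum())
# True if any postal code appears on more than one row
has_duplicates = bool(toy['PostalCode'].duplicated().any())

print(unique_codes, unassigned_boroughs, unassigned_neighborhoods, has_duplicates)
```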


Everything looks good, so let's print the shape of the data frame.

In [10]:
df.shape

(103, 3)

## Question 2

Importing the CSV file from the URL

In [11]:
data = pd.read_csv("https://cocl.us/Geospatial_data")
data.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


Combine with the neighborhood data frame

In [12]:
combined_data = df.join(data.set_index('Postal Code'), on='PostalCode', how='inner')
display(combined_data)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494
...,...,...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944
99,M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160
100,M7Y,East Toronto,Business reply mail Processing Centre,43.662744,-79.321558
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.636258,-79.498509


In [13]:
combined_data.shape

(103, 5)

No errors occur, and the combined data frame has the same number of entries as the original one.
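
Matching row counts are a good sign, but an inner join silently drops any postal code that has no coordinates. One way to make that visible is a left merge with `indicator=True`. This is a sketch on toy data: the code `M9Z` is made up for illustration, and `validate='one_to_one'` assumes each postal code appears once on each side.

```python
import pandas as pd

left = pd.DataFrame({'PostalCode': ['M3A', 'M4A', 'M9Z'],  # 'M9Z' is a made-up code
                     'Borough': ['North York', 'North York', 'Example']})
right = pd.DataFrame({'Postal Code': ['M3A', 'M4A'],
                      'Latitude': [43.753259, 43.725882],
                      'Longitude': [-79.329656, -79.315572]})

# keep every row of `left`; the `_merge` column records where each row was found
checked = left.merge(right, left_on='PostalCode', right_on='Postal Code',
                     how='left', indicator=True, validate='one_to_one')

# postal codes that have no coordinates in `right`
unmatched = checked.loc[checked['_merge'] == 'left_only', 'PostalCode'].tolist()
print(unmatched)
```

An empty `unmatched` list would confirm that every postal code found its coordinates.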

## Question 3

Install the modules needed

In [14]:
!pip install geopy folium



In [15]:
from geopy.geocoders import Nominatim
import folium
import numpy as np

In [16]:

address = 'Toronto, Ontario'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The coordinates of Toronto are 43.6534817, -79.3839347.


Borrow the map function from the previous lab

In [17]:
map_Toronto = folium.Map(location=[latitude, longitude], zoom_start=11)

# add one marker per postal code; the loop variables are named differently from
# `latitude`/`longitude` so the Toronto center coordinates are not overwritten
for lat, lng, borough_name, neighborhood_name in zip(combined_data['Latitude'], combined_data['Longitude'], combined_data['Borough'], combined_data['Neighborhood']):
    label = '{}, {}'.format(neighborhood_name, borough_name)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True
        ).add_to(map_Toronto)
    
map_Toronto

Borrow the `getNearbyVenues` function from the previous lab

In [18]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius
            )
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Category']
    
    return(nearby_venues)
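
`getNearbyVenues` indexes straight into the response, so a neighborhood with an empty or unexpected payload, or a venue without categories, raises an error partway through the loop. The sketch below separates the parsing step so it can be tested without a network call. `extract_venues` and `fake_payload` are hypothetical; the payload layout is only the part the function above already relies on.

```python
def extract_venues(name, lat, lng, payload):
    """Turn one explore-style payload into rows of (name, lat, lng, venue, category)."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # a venue without categories gets None instead of raising IndexError
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'], category))
    return rows


# hand-made payload for illustration; not real API output
fake_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Brookbanks Park', 'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'Unlabelled Place', 'categories': []}},
]}]}}

rows = extract_venues('Parkwoods', 43.753259, -79.329656, fake_payload)
print(rows)
```

Inside the loop, `venues_list.append(extract_venues(name, lat, lng, requests.get(url).json()))` would then replace the direct indexing.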

Define Foursquare Credentials and Version

In [19]:
CLIENT_ID = 'KHVGJRDS3CA55VPPXNKR5EZMXUBRXNG4LTYOQVHCJF14CT44'
CLIENT_SECRET = 'ODQKN3RDTPKT0P0YWZYRENVRNQBNWCW3CFIVTPRDDQDU5QXF'
VERSION = '20180605' # Foursquare API version
LIMIT = 100

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: KHVGJRDS3CA55VPPXNKR5EZMXUBRXNG4LTYOQVHCJF14CT44
CLIENT_SECRET:ODQKN3RDTPKT0P0YWZYRENVRNQBNWCW3CFIVTPRDDQDU5QXF
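
Hard-coding and printing credentials exposes them to anyone who can read the notebook. A common alternative is to read them from environment variables and mask them when printing. This is a sketch only: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions, not part of the original notebook or of any Foursquare tooling.

```python
import os


def load_credentials(env=os.environ):
    """Read the two credentials from a mapping (the process environment by default)."""
    client_id = env.get('FOURSQUARE_CLIENT_ID')          # hypothetical variable name
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')  # hypothetical variable name
    if not client_id or not client_secret:
        raise RuntimeError('Foursquare credentials are not set in the environment')
    return client_id, client_secret


def mask(secret, visible=4):
    """Show only the first few characters of a secret."""
    return secret[:visible] + '*' * (len(secret) - visible)


# demonstration with a plain dict standing in for os.environ
demo_env = {'FOURSQUARE_CLIENT_ID': 'example-id-1234',
            'FOURSQUARE_CLIENT_SECRET': 'example-secret-5678'}
demo_id, demo_secret = load_credentials(demo_env)
print('CLIENT_ID: ' + mask(demo_id))
print('CLIENT_SECRET: ' + mask(demo_secret))
```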


Collect the venues in Toronto for each neighborhood

In [20]:
toronto_venues = getNearbyVenues(combined_data['Neighborhood'], combined_data['Latitude'], combined_data['Longitude'])

Parkwoods
Victoria Village
Regent Park, Harbourfront
Lawrence Manor, Lawrence Heights
Ontario Provincial Government
Islington Avenue
Malvern, Rouge
Don Mills
Parkview Hill, Woodbine Gardens
Garden District, Ryerson
Glencairn
West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale
Rouge Hill, Port Union, Highland Creek
Don Mills
Woodbine Heights
St. James Town
Humewood-Cedarvale
Eringate, Bloordale Gardens, Old Burnhamthorpe, Markland Wood
Guildwood, Morningside, West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor, Wilson Heights, Downsview North
Thorncliffe Park
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Scarborough Village
Fairview, Henry Farm, Oriole
Northwood Park, York University
The Danforth East
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
Kennedy Park, Ionview, East Birchmount Park
Bayview Village
Downsview
The Danforth West, Riverdale
T

Check the shape of the data

In [21]:
toronto_venues.shape

(1342, 5)

Check sample data

In [22]:
toronto_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,Park
1,Parkwoods,43.753259,-79.329656,KFC,Fast Food Restaurant
2,Parkwoods,43.753259,-79.329656,Variety Store,Food & Drink Shop
3,Victoria Village,43.725882,-79.315572,Victoria Village Arena,Hockey Arena
4,Victoria Village,43.725882,-79.315572,Tim Hortons,Coffee Shop


Check how many venues were returned for each neighborhood

In [23]:
toronto_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Category
Agincourt,3,3,3,3
"Alderwood, Long Branch",9,9,9,9
"Bathurst Manor, Wilson Heights, Downsview North",24,24,24,24
Bayview Village,4,4,4,4
"Bedford Park, Lawrence Manor East",25,25,25,25
...,...,...,...,...
Willowdale,35,35,35,35
"Willowdale, Newtonbrook",1,1,1,1
Woburn,3,3,3,3
Woodbine Heights,7,7,7,7


In [24]:
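# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# 'Neighborhood' also occurs as a venue category, so drop that dummy column first
# (axis is passed by keyword; recent pandas versions reject the positional form)
toronto_onehot.drop(columns='Neighborhood', inplace=True)
# add the real neighborhood column back to the dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 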

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Victoria Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Victoria Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [25]:
toronto_onehot.shape

(1342, 244)

Group the rows by neighborhood and take the mean of each one-hot column, which gives the relative frequency of each venue category in that neighborhood

In [26]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped.head()

Unnamed: 0,Neighborhood,Accessories Store,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Alderwood, Long Branch",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [27]:
toronto_grouped.shape

(95, 244)
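
To see why the mean of the one-hot columns is a frequency, it helps to run the same two steps on a tiny table. The sketch below uses made-up neighborhoods `A` and `B`; `dtype=float` is added here so the dummy columns are numeric regardless of the pandas version.

```python
import pandas as pd

toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Park', 'Cafe', 'Bank', 'Pub', 'Park'],
})

# one column per category, 1.0 where the row's venue has that category
onehot = pd.get_dummies(toy_venues[['Venue Category']], prefix="", prefix_sep="", dtype=float)
onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

# the mean of a 0/1 column within a group is the share of venues in that category
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

Neighborhood `A` has four venues, two of them parks, so its `Park` value is 0.5, and each row of frequencies sums to 1.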

Get the top 10 venues for each neighborhood

In [28]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [29]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the third fall back to the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Breakfast Spot,Lounge,Latin American Restaurant,Accessories Store,Music Venue,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop
1,"Alderwood, Long Branch",Pizza Place,Playground,Sandwich Place,Gym,Pub,Pharmacy,Coffee Shop,Athletics & Sports,Modern European Restaurant,Movie Theater
2,"Bathurst Manor, Wilson Heights, Downsview North",Coffee Shop,Bank,Pizza Place,Bridal Shop,Mobile Phone Shop,Shopping Mall,Supermarket,Sushi Restaurant,Fried Chicken Joint,Frozen Yogurt Shop
3,Bayview Village,Café,Japanese Restaurant,Chinese Restaurant,Bank,Accessories Store,Movie Theater,Music Venue,Museum,Motel,Nightclub
4,"Bedford Park, Lawrence Manor East",Coffee Shop,Italian Restaurant,Sandwich Place,Breakfast Spot,Butcher,Café,Liquor Store,Sushi Restaurant,Thai Restaurant,Fast Food Restaurant


Build the model to cluster the neighborhoods

In [30]:
# import k-means for clustering
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
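
The first ten labels are all 1, which suggests that one cluster may hold most of the neighborhoods. Counting the labels shows how balanced the clusters are. This is a sketch with made-up labels; `cluster_sizes` is a hypothetical helper.

```python
from collections import Counter


def cluster_sizes(labels):
    """Return {cluster label: number of members}, ordered by label."""
    counts = Counter(int(label) for label in labels)
    return dict(sorted(counts.items()))


# made-up labels for illustration, not the notebook's actual output
example_labels = [1, 1, 1, 2, 1, 0, 1, 4, 1, 3, 2, 1]
sizes = cluster_sizes(example_labels)
print(sizes)
```

In the notebook, `cluster_sizes(kmeans.labels_)` would report the real sizes; a single dominant cluster is a hint to try a different `kclusters` or different features.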

Merge the top 10 venues and the cluster label into the data frame for each neighborhood

In [31]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = combined_data

# join the cluster labels and top venues onto combined_data, which holds latitude/longitude for each neighborhood
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

# drop neighborhoods that received no cluster label because no venues were returned for them
toronto_merged.dropna(subset=['Cluster Labels'], inplace=True)

toronto_merged.head() # check the last columns!

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.753259,-79.329656,2.0,Food & Drink Shop,Park,Fast Food Restaurant,Accessories Store,Museum,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant
1,M4A,North York,Victoria Village,43.725882,-79.315572,1.0,Portuguese Restaurant,French Restaurant,Hockey Arena,Pizza Place,Coffee Shop,Accessories Store,Modern European Restaurant,Museum,Movie Theater,Motel
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1.0,Coffee Shop,Park,Bakery,Breakfast Spot,Theater,Yoga Studio,Mexican Restaurant,Farmers Market,French Restaurant,Café
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,1.0,Clothing Store,Furniture / Home Store,Accessories Store,Vietnamese Restaurant,Event Space,Coffee Shop,Boutique,Pharmacy,Pet Store,Metro Station
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494,1.0,Coffee Shop,Sushi Restaurant,Yoga Studio,Creperie,Spa,Burrito Place,Smoothie Shop,Café,Sandwich Place,College Auditorium


Plot the clusters on the map

In [32]:
import matplotlib.cm as cm
import matplotlib.colors as colors

In [33]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster-1)],
        fill=True,
        fill_color=rainbow[int(cluster-1)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
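
The labels produced by k-means run from 0 to `kclusters - 1`, so `rainbow[int(cluster-1)]` sends label 0 to the last color through negative indexing. That still yields one color per cluster, but the mapping is shifted by one. The sketch below shows a direct label-to-color mapping; it builds the palette with the standard-library `colorsys` module instead of matplotlib, and both helper names are hypothetical.

```python
import colorsys


def cluster_palette(k):
    """Return k hex colors spread evenly around the hue wheel."""
    palette = []
    for i in range(k):
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.8, 0.9)
        palette.append('#{:02x}{:02x}{:02x}'.format(int(r * 255), int(g * 255), int(b * 255)))
    return palette


def color_for(label, palette):
    """Map a 0-based cluster label (int or float such as 2.0) to its color."""
    return palette[int(label)]


palette = cluster_palette(5)
print([color_for(label, palette) for label in (0.0, 1.0, 4.0)])
```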

Examine the clusters

In [34]:
def verify_cluster(df, cluster_label):
  return df.loc[df['Cluster Labels'] == cluster_label, df.columns[[1] + list(range(5, df.shape[1]))]]

Cluster 1 (label 0)

In [35]:
verify_cluster(toronto_merged, 0)

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
32,Scarborough,0.0,Playground,Accessories Store,Music Venue,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop
85,Scarborough,0.0,Playground,Park,Intersection,Accessories Store,Music Venue,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop


Cluster 2 (label 1)

In [36]:
verify_cluster(toronto_merged, 1)

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,North York,1.0,Portuguese Restaurant,French Restaurant,Hockey Arena,Pizza Place,Coffee Shop,Accessories Store,Modern European Restaurant,Museum,Movie Theater,Motel
2,Downtown Toronto,1.0,Coffee Shop,Park,Bakery,Breakfast Spot,Theater,Yoga Studio,Mexican Restaurant,Farmers Market,French Restaurant,Café
3,North York,1.0,Clothing Store,Furniture / Home Store,Accessories Store,Vietnamese Restaurant,Event Space,Coffee Shop,Boutique,Pharmacy,Pet Store,Metro Station
4,Queen's Park,1.0,Coffee Shop,Sushi Restaurant,Yoga Studio,Creperie,Spa,Burrito Place,Smoothie Shop,Café,Sandwich Place,College Auditorium
7,North York,1.0,Gym,Coffee Shop,Restaurant,Athletics & Sports,Asian Restaurant,Bike Shop,Clothing Store,Chinese Restaurant,Sandwich Place,Sushi Restaurant
...,...,...,...,...,...,...,...,...,...,...,...,...
98,Etobicoke,1.0,Pool,River,New American Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant
99,Downtown Toronto,1.0,Pub,Beer Bar,Martial Arts School,Men's Store,Bookstore,Mexican Restaurant,Bubble Tea Shop,Burger Joint,Café,Ethiopian Restaurant
100,East Toronto,1.0,Light Rail Station,Spa,Park,Garden,Garden Center,Brewery,Auto Workshop,Skate Park,Restaurant,Recording Studio
101,Etobicoke,1.0,Pool,Construction & Landscaping,Baseball Field,New American Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop


Cluster 3 (label 2)

In [37]:
verify_cluster(toronto_merged, 2)

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,North York,2.0,Food & Drink Shop,Park,Fast Food Restaurant,Accessories Store,Museum,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant
21,York,2.0,Park,Women's Store,Bar,Accessories Store,Mobile Phone Shop,Movie Theater,Motel,Monument / Landmark,Modern European Restaurant,Miscellaneous Shop
35,East YorkEast Toronto,2.0,Convenience Store,Park,Accessories Store,New American Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop
49,North York,2.0,Basketball Court,Construction & Landscaping,Bakery,Park,Accessories Store,Movie Theater,Motel,Monument / Landmark,Modern European Restaurant,Mobile Phone Shop
61,Central Toronto,2.0,Photography Studio,Swim School,Park,Bus Line,Accessories Store,Museum,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant
66,North York,2.0,Convenience Store,Electronics Store,Park,Accessories Store,New American Restaurant,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop
68,Central Toronto,2.0,Jewelry Store,Park,Trail,Sushi Restaurant,Movie Theater,Motel,Monument / Landmark,Modern European Restaurant,Mobile Phone Shop,Accessories Store
77,Etobicoke,2.0,Mobile Phone Shop,Park,Sandwich Place,Accessories Store,New American Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop
83,Central Toronto,2.0,Lawyer,Park,Tennis Court,Accessories Store,Music Venue,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop
91,Downtown Toronto,2.0,Park,Playground,Trail,Accessories Store,Mobile Phone Shop,Movie Theater,Motel,Monument / Landmark,Modern European Restaurant,Miscellaneous Shop


Cluster 4 (label 3)

In [38]:
verify_cluster(toronto_merged, 3)

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Scarborough,3.0,Fast Food Restaurant,Accessories Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant


Cluster 5 (label 4)

In [39]:
verify_cluster(toronto_merged, 4)

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
52,North York,4.0,Park,Accessories Store,Music Venue,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop
64,York,4.0,Park,Accessories Store,Music Venue,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop
