In [117]:
import pandas as pd

In [118]:
# !conda install geocoder -y 
# there's an issue using conda to install geocoder library.  Official documentation for Geocoder doesn't list Conda, nor does Conda seem to have Geocoder.


!pip install geocoder

print("Geocoder library successfully loaded.")

Geocoder library successfully loaded.


# Goals

There are three objectives I hope to accomplish:
1. Create a Pandas DataFrame which lists the various neighborhoods of Toronto, Canada along with their respective latitude and longitude coordinates.
2. **Cluster**, through K-Means Clustering, the neighborhoods of Toronto, CA by geographic proximity.
3. Examine venues within select neighborhood(s) of Toronto.


Background:

The Canadian postal system uses a six-character alphanumeric code for the delivery of mail throughout Canada, _down to the specific neighborhood or rural area_, which will aid in my stated goals.  The first three characters of a code form its Forward Sortation Area (FSA), and because of the size of the city, Canada Post gives Toronto its own block of FSAs, all beginning with the letter M.  From an FSA within Toronto the neighborhood can be determined.  This is the basis of my exploration into Toronto's neighborhoods. 
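
As a small illustration of the FSA idea, the sketch below splits a postal code into its first three characters (the FSA) and its last three (the Local Delivery Unit). The regular expression is a simplification that I am assuming is good enough for illustration, the function name `split_postal_code` is hypothetical, and the example code "M5A 1A1" is made up.

```python
import re

# Simplified pattern: letter-digit-letter, optional space, digit-letter-digit.
# It does not enforce Canada Post's restrictions on which letters may appear.
POSTAL_CODE_RE = re.compile(r"^([A-Z]\d[A-Z]) ?(\d[A-Z]\d)$")

def split_postal_code(code):
    """Return (fsa, ldu) for a Canadian postal code, or None if it does not match."""
    match = POSTAL_CODE_RE.match(code.strip().upper())
    if match is None:
        return None
    return match.group(1), match.group(2)

# "m5a 1a1" is a made-up example code.
example = split_postal_code("m5a 1a1")

# Toronto FSAs all start with the letter M.
is_toronto = example is not None and example[0].startswith("M")
```

Only the FSA part is used in the rest of this notebook, since the Wikipedia table is keyed on it.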

Within the URL referenced below there is an embedded table which contains the Canadian postal code categorical designations (PostalCode, Borough, and Neighborhood) for Toronto, specifically.  This table was saved as an Excel Spreadsheet and then converted into a Pandas dataframe in order to further analyze the data.  
https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M



In [119]:
wiki = "M9WK3-Wiki-Canadian Postal Codes (RAW).xlsx"
df = pd.read_excel(wiki)

In [120]:
df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M1B,Scarborough,"Malvern, Rouge"
2,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
3,M1E,Scarborough,"Guildwood, Morningside, West Hill"
4,M1G,Scarborough,Woburn


In [121]:
# prior to cleaning the raw data:
df.shape

(180, 3)

### In the raw data, all rows which have a 'Borough' listed as "Not assigned" also have a respective 'Neighbourhood' listed as "Not assigned".  These rows, specifically, are dropped for analysis purposes, as we're interested in currently established neighborhoods.

In [122]:
df.drop(df[df['Borough'] == "Not assigned"].index, inplace=True)
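
An equivalent way to express the same cleaning step is a boolean mask, which avoids modifying the frame in place. The sketch below uses a small toy frame (not the real spreadsheet) and also checks the claim in the heading, namely that no assigned borough is left with an unassigned neighbourhood.

```python
import pandas as pd

# Toy stand-in for the raw table; the real data comes from the Excel file.
raw = pd.DataFrame({
    "Postal Code": ["M1A", "M1B", "M2A", "M3A"],
    "Borough": ["Not assigned", "Scarborough", "Not assigned", "North York"],
    "Neighbourhood": ["Not assigned", "Malvern, Rouge", "Not assigned", "Parkwoods"],
})

# Keep only rows with an assigned borough, then renumber the index from 0.
assigned = raw[raw["Borough"] != "Not assigned"].reset_index(drop=True)

# Count assigned boroughs that still have an unassigned neighbourhood.
leftover = int((assigned["Neighbourhood"] == "Not assigned").sum())
```

If `leftover` were greater than zero on the real data, those rows would need separate handling before any grouping by neighbourhood.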

In [123]:
# pd.set_option('display.max_rows', None)
df
# for index, row in df.iterrows():
#     print(row['Postal Code'], row['Neighbourhood'])

Unnamed: 0,Postal Code,Borough,Neighbourhood
1,M1B,Scarborough,"Malvern, Rouge"
2,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
3,M1E,Scarborough,"Guildwood, Morningside, West Hill"
4,M1G,Scarborough,Woburn
5,M1H,Scarborough,Cedarbrae
6,M1J,Scarborough,Scarborough Village
7,M1K,Scarborough,"Kennedy Park, Ionview, East Birchmount Park"
8,M1L,Scarborough,"Golden Mile, Clairlea, Oakridge"
9,M1M,Scarborough,"Cliffside, Cliffcrest, Scarborough Village West"
10,M1N,Scarborough,"Birch Cliff, Cliffside West"


In [124]:
# after removing "Not assigned" rows from the data.
df.shape

(103, 3)

### How many unique boroughs are there within the Greater Toronto Metropolitan area?

In [125]:
print("Number of Boroughs: ", df['Borough'].nunique())
df['Borough'].unique()
for index, row in df.iterrows():
    print(row['Postal Code'], row['Borough'])

Number of Boroughs:  10
M1B Scarborough
M1C Scarborough
M1E Scarborough
M1G Scarborough
M1H Scarborough
M1J Scarborough
M1K Scarborough
M1L Scarborough
M1M Scarborough
M1N Scarborough
M1P Scarborough
M1R Scarborough
M1S Scarborough
M1T Scarborough
M1V Scarborough
M1W Scarborough
M1X Scarborough
M2H North York
M2J North York
M2K North York
M2L North York
M2M North York
M2N North York
M2P North York
M2R North York
M3A North York
M3B North York
M3C North York
M3H North York
M3J North York
M3K North York
M3L North York
M3M North York
M3N North York
M4A North York
M4B East York
M4C East York
M4E East Toronto
M4G East York
M4H East York
M4J East York
M4K East Toronto
M4L East Toronto
M4M East Toronto
M4N Central Toronto
M4P Central Toronto
M4R Central Toronto
M4S Central Toronto
M4T Central Toronto
M4V Central Toronto
M4W Downtown Toronto
M4X Downtown Toronto
M4Y Downtown Toronto
M5A Downtown Toronto
M5B Downtown Toronto
M5C Downtown Toronto
M5E Downtown Toronto
M5G Downtown Toronto
M5H Down

In [126]:
df['Postal Code']

1      M1B
2      M1C
3      M1E
4      M1G
5      M1H
6      M1J
7      M1K
8      M1L
9      M1M
10     M1N
11     M1P
12     M1R
13     M1S
14     M1T
15     M1V
16     M1W
17     M1X
25     M2H
26     M2J
27     M2K
28     M2L
29     M2M
30     M2N
31     M2P
32     M2R
40     M3A
41     M3B
42     M3C
45     M3H
46     M3J
47     M3K
48     M3L
49     M3M
50     M3N
60     M4A
61     M4B
62     M4C
63     M4E
64     M4G
65     M4H
66     M4J
67     M4K
68     M4L
69     M4M
70     M4N
71     M4P
72     M4R
73     M4S
74     M4T
75     M4V
76     M4W
77     M4X
78     M4Y
80     M5A
81     M5B
82     M5C
83     M5E
84     M5G
85     M5H
86     M5J
87     M5K
88     M5L
89     M5M
90     M5N
91     M5P
92     M5R
93     M5S
94     M5T
95     M5V
96     M5W
97     M5X
100    M6A
101    M6B
102    M6C
103    M6E
104    M6G
105    M6H
106    M6J
107    M6K
108    M6L
109    M6M
110    M6N
111    M6P
112    M6R
113    M6S
120    M7A
132    M7R
138    M7Y
155    M8V
156    M8W
157    M8X

# Obtaining Lat/Long Coordinates
I attempted to use Geocoder and then subsequently Nominatim to obtain Latitude and Longitude coordinates for the different neighborhoods of Toronto. 

1. Geocoder was inconsistent, where a significant number of neighborhoods could not be located.  
2. Nominatim was not specific enough with regard to lat/long coordinates, where numerous distinct neighborhoods were simply given the same blanket geographical coordinates by Nominatim.  

Coordinates were ultimately taken from "Geospatial_Coordinates.csv", which I believe is derived from the [pay for service] Google API.
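
Because a geocoder may return nothing for a given query, an unbounded `while` loop can spin forever. The sketch below shows one possible bounded-retry wrapper. The helper name `geocode_with_retry` and its parameters are hypothetical, and the lookup is passed in as a plain callable so the example runs without network access; in this notebook one could pass something like `lambda q: geocoder.google(q).latlng`.

```python
import time

def geocode_with_retry(query, lookup, max_attempts=3, pause_seconds=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out.

    `lookup` is any callable returning a [lat, lng] pair or None.
    """
    for attempt in range(1, max_attempts + 1):
        coords = lookup(query)
        if coords is not None:
            return coords[0], coords[1]
        if attempt < max_attempts:
            time.sleep(pause_seconds)  # back off before the next attempt
    return None

# Stand-in for a real geocoder: fails twice, then succeeds.
calls = {"n": 0}

def flaky_lookup(query):
    calls["n"] += 1
    return [43.806686, -79.194353] if calls["n"] >= 3 else None

result = geocode_with_retry("M1B, Toronto, Ontario", flaky_lookup)
missing = geocode_with_retry("M1B, Toronto, Ontario", lambda q: None, max_attempts=2)
```

Returning `None` after the last attempt lets the caller collect the failed postal codes instead of hanging.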

In [None]:
# import geocoder

# postal_code = list(df['Postal Code'])

# # initialize your variable to None
# lat_lng_coords = None

# # loop until you get the coordinates
# while(lat_lng_coords is None):
#   g = geocoder.google(f'{postal_code}, Toronto, Ontario')
#   lat_lng_coords = g.latlng

# latitude = lat_lng_coords[0]
# longitude = lat_lng_coords[1]

In [None]:
# import geocoder

# postal_code = list(df['Postal Code'])

# # loop until you get the coordinates
# while lat_lng_coords:
#   g = geocoder.google(f'{postal_code}, Toronto, Ontario')
#   lat_lng_coords = g.latlng

# latitude = lat_lng_coords[0]
# longitude = lat_lng_coords[1]

In [46]:
# from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

# postal_code = df['Postal Code']
# post_list = list(postal_code)
# geolocator = Nominatim(user_agent="Toronto_explorer")
# issues = []

# for i in post_list:
#     try:
#         address = f"Toronto ON {i}"
#         location = geolocator.geocode(address)
#         # type(location)
#         latitude = location.latitude
#         longitude = location.longitude
#         print(f"{i} are:  {latitude}, {longitude}")
#     except:
#         issues.append(i)
        
# print("Finished running code!")

M1B are:  43.6534817, -79.3839347
M1C are:  43.6534817, -79.3839347
M1G are:  43.76571676956549, -79.22189842824983
M1W are:  43.7170226, -79.41978303501344
M2J are:  43.7170226, -79.41978303501344
M2M are:  43.7859621, -79.4160307769213
M2N are:  43.7170226, -79.41978303501344
M3A are:  43.6534817, -79.3839347
M3C are:  43.7328216, -79.3469614
M3K are:  43.735823249999996, -79.47870883340411
M4L are:  43.6727601, -79.30405834999999
M4X are:  43.6680266, -79.3692816
M5E are:  43.6534817, -79.3839347
M5H are:  43.6523873, -79.3835641
M5J are:  43.6534817, -79.3839347
M5V are:  43.6456336, -79.39298744692186
M6J are:  43.6522219, -79.40753862886237
M6K are:  43.6534817, -79.3839347
M6N are:  43.7170226, -79.41978303501344
M6P are:  43.7170226, -79.41978303501344
M6S are:  43.7170226, -79.41978303501344
M7A are:  43.6534817, -79.3839347
M9B are:  43.64074125, -79.5419018239487
M9C are:  43.7170226, -79.41978303501344
M9R are:  43.7170226, -79.41978303501344
Finished running code!
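
The "blanket coordinates" problem is visible in the output above: several postal codes share exactly the same pair, and one of those pairs (43.6534817, -79.3839347) is the same as the city-centre result Nominatim returns for Toronto later in this notebook. A minimal sketch of how such repeats could be flagged, using a few rows copied from that output:

```python
import pandas as pd

# A few rows copied from the Nominatim output above.
nominatim = pd.DataFrame({
    "Postal Code": ["M1B", "M1C", "M1G", "M1W", "M2J", "M2M"],
    "Latitude": [43.6534817, 43.6534817, 43.76571676956549,
                 43.7170226, 43.7170226, 43.7859621],
    "Longitude": [-79.3839347, -79.3839347, -79.22189842824983,
                  -79.41978303501344, -79.41978303501344, -79.4160307769213],
})

# Flag every row whose (Latitude, Longitude) pair occurs more than once.
shared = nominatim.duplicated(subset=["Latitude", "Longitude"], keep=False)
suspect_codes = nominatim.loc[shared, "Postal Code"].tolist()

# Count how many postal codes share each coordinate pair.
pair_counts = nominatim.groupby(["Latitude", "Longitude"]).size()
```

Any postal code in `suspect_codes` would deserve a second look before being used for distance-based analysis.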


In [234]:
# print(f"Issues: {len(issues)}")

NameError: name 'issues' is not defined

In [127]:
csvfile = 'Geospatial_Coordinates.csv'
ll = pd.read_csv(csvfile)

In [128]:
ll.head(103)

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
5,M1J,43.744734,-79.239476
6,M1K,43.727929,-79.262029
7,M1L,43.711112,-79.284577
8,M1M,43.716316,-79.239476
9,M1N,43.692657,-79.264848


### Combining the lat/long ("Geospatial_Coordinates.csv") with the Postal Code ("M9WK3-Wiki-Canadian Postal Codes (RAW).xlsx").  
Here we have 2 separate dataframes and need to merge them together by the point of commonality: 'Postal Code'.
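
Before relying on a left merge, it can help to confirm that every postal code actually found a match. The sketch below uses toy frames and the `validate` and `indicator` options of `pd.merge`; the code "M9Z" and the borough "Hypothetical" are made up to show what an unmatched row looks like.

```python
import pandas as pd

codes = pd.DataFrame({
    "Postal Code": ["M1B", "M1C", "M9Z"],  # "M9Z" is a made-up code with no coordinates
    "Borough": ["Scarborough", "Scarborough", "Hypothetical"],
})
coords = pd.DataFrame({
    "Postal Code": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

# validate raises if either side has duplicate keys; indicator adds a "_merge" column.
merged = pd.merge(codes, coords, how="left", on="Postal Code",
                  validate="one_to_one", indicator=True)

# Rows present only on the left side have no coordinates.
unmatched = merged.loc[merged["_merge"] == "left_only", "Postal Code"].tolist()
```

An empty `unmatched` list on the real data would confirm that all remaining postal codes received coordinates.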

In [129]:

# pll (Postal Code, Latitude, Longitude).

plldata = pd.merge(df, ll, how='left', on=['Postal Code'])
plldata

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"Kennedy Park, Ionview, East Birchmount Park",43.727929,-79.262029
7,M1L,Scarborough,"Golden Mile, Clairlea, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffside, Cliffcrest, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


In [130]:
plldata_bn = plldata.groupby(['Borough','Neighbourhood'])
plldata_bn.first()

Unnamed: 0_level_0,Unnamed: 1_level_0,Postal Code,Latitude,Longitude
Borough,Neighbourhood,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Central Toronto,Davisville,M4S,43.704324,-79.38879
Central Toronto,Davisville North,M4P,43.712751,-79.390197
Central Toronto,"Forest Hill North & West, Forest Hill Road Park",M5P,43.696948,-79.411307
Central Toronto,Lawrence Park,M4N,43.72802,-79.38879
Central Toronto,"Moore Park, Summerhill East",M4T,43.689574,-79.38316
Central Toronto,"North Toronto West, Lawrence Park",M4R,43.715383,-79.405678
Central Toronto,Roselawn,M5N,43.711695,-79.416936
Central Toronto,"Summerhill West, Rathnelly, South Hill, Forest Hill SE, Deer Park",M4V,43.686412,-79.400049
Central Toronto,"The Annex, North Midtown, Yorkville",M5R,43.67271,-79.405678
Downtown Toronto,Berczy Park,M5E,43.644771,-79.373306


# Data Visualization and Exploration
How do the various neighborhoods of Toronto cluster together? Which neighborhoods are closest in proximity to one another?

In [131]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)  # override the pandas default that truncates the number of columns displayed
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [132]:
address = 'Toronto, CA'

geolocator = Nominatim(user_agent="ca_explorer")
location = geolocator.geocode(address)
type(location)
latitude = location.latitude
longitude = location.longitude
print(f'The geographical coordinates of Toronto, Ontario, Canada are {latitude}, {longitude}.')

The geographical coordinates of Toronto, Ontario, Canada are 43.6534817, -79.3839347.


In [133]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, post, borough, neighborhood in zip(plldata['Latitude'], plldata['Longitude'], plldata['Postal Code'], plldata['Borough'], plldata['Neighbourhood']):
    label = f'{post}; N: {neighborhood}; B: {borough}'
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

In [134]:
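# average only the numeric coordinate columns; newer pandas versions raise an error when mean() meets text columns
toronto_grouped = plldata.groupby('Neighbourhood')[['Latitude', 'Longitude']].mean().reset_index()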
toronto_grouped

Unnamed: 0,Neighbourhood,Latitude,Longitude
0,Agincourt,43.7942,-79.262029
1,"Alderwood, Long Branch",43.602414,-79.543484
2,"Bathurst Manor, Wilson Heights, Downsview North",43.754328,-79.442259
3,Bayview Village,43.786947,-79.385975
4,"Bedford Park, Lawrence Manor East",43.733283,-79.41975
5,Berczy Park,43.644771,-79.373306
6,"Birch Cliff, Cliffside West",43.692657,-79.264848
7,"Brockton, Parkdale Village, Exhibition Place",43.636847,-79.428191
8,"Business reply mail Processing Centre, South C...",43.662744,-79.321558
9,"CN Tower, King and Spadina, Railway Lands, Har...",43.628947,-79.39442


## I want to explore this data to see if geographical proximity of 'Neighborhoods' helps to determine 'Boroughs'; or if any correlation exists.

In [135]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_grouped[['Neighbourhood']], prefix="", prefix_sep="")

toronto_onehot.head()

Unnamed: 0,Agincourt,"Alderwood, Long Branch","Bathurst Manor, Wilson Heights, Downsview North",Bayview Village,"Bedford Park, Lawrence Manor East",Berczy Park,"Birch Cliff, Cliffside West","Brockton, Parkdale Village, Exhibition Place","Business reply mail Processing Centre, South Central Letter Processing Plant Toronto","CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport",Caledonia-Fairbanks,Canada Post Gateway Processing Centre,Cedarbrae,Central Bay Street,Christie,Church and Wellesley,"Clarks Corners, Tam O'Shanter, Sullivan","Cliffside, Cliffcrest, Scarborough Village West","Commerce Court, Victoria Hotel",Davisville,Davisville North,"Del Ray, Mount Dennis, Keelsdale and Silverthorn",Don Mills,"Dorset Park, Wexford Heights, Scarborough Town Centre",Downsview,"Dufferin, Dovercourt Village","East Toronto, Broadview North (Old East York)","Eringate, Bloordale Gardens, Old Burnhamthorpe, Markland Wood","Fairview, Henry Farm, Oriole","First Canadian Place, Underground city","Forest Hill North & West, Forest Hill Road Park","Garden District, Ryerson",Glencairn,"Golden Mile, Clairlea, Oakridge","Guildwood, Morningside, West Hill","Harbourfront East, Union Station, Toronto Islands","High Park, The Junction South",Hillcrest Village,Humber Summit,"Humberlea, Emery",Humewood-Cedarvale,"India Bazaar, The Beaches West","Islington Avenue, Humber Valley Village","Kennedy Park, Ionview, East Birchmount Park","Kensington Market, Chinatown, Grange Park","Kingsview Village, St. 
Phillips, Martin Grove Gardens, Richview Gardens","Lawrence Manor, Lawrence Heights",Lawrence Park,Leaside,"Little Portugal, Trinity","Malvern, Rouge","Milliken, Agincourt North, Steeles East, L'Amoreaux East","Mimico NW, The Queensway West, South of Bloor, Kingsway Park South West, Royal York South West","Moore Park, Summerhill East","New Toronto, Mimico South, Humber Bay Shores","North Park, Maple Leaf Park, Upwood Park","North Toronto West, Lawrence Park","Northwest, West Humber - Clairville","Northwood Park, York University","Old Mill South, King's Mill Park, Sunnylea, Humber Bay, Mimico NE, The Queensway East, Royal York South East, Kingsway Park South East","Parkdale, Roncesvalles","Parkview Hill, Woodbine Gardens",Parkwoods,"Queen's Park, Ontario Provincial Government","Regent Park, Harbourfront","Richmond, Adelaide, King",Rosedale,Roselawn,"Rouge Hill, Port Union, Highland Creek","Runnymede, Swansea","Runnymede, The Junction North",Scarborough Village,"South Steeles, Silverstone, Humbergate, Jamestown, Mount Olive, Beaumond Heights, Thistletown, Albion Gardens",St. James Town,"St. James Town, Cabbagetown","Steeles West, L'Amoreaux West",Stn A PO Boxes,Studio District,"Summerhill West, Rathnelly, South Hill, Forest Hill SE, Deer Park","The Annex, North Midtown, Yorkville",The Beaches,"The Danforth West, Riverdale","The Kingsway, Montgomery Road, Old Mill North",Thorncliffe Park,"Toronto Dominion Centre, Design Exchange","University of Toronto, Harbord",Upper Rouge,Victoria Village,"West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale",Westmount,Weston,"Wexford, Maryvale","Willowdale, Newtonbrook","Willowdale, Willowdale East","Willowdale, Willowdale West",Woburn,Woodbine Heights,York Mills West,"York Mills, Silver Hills"
0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [136]:
# set number of clusters
kclusters=10

# precompute_distances is omitted: it was deprecated and later removed from scikit-learn
k_means = KMeans(init="random", n_clusters=kclusters, n_init=100)
k_means.fit(toronto_onehot)
k_means_labels = k_means.labels_
k_means_labels

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 6, 0, 0, 0,
       0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 4, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0])

In [137]:
k_means_cluster_centers = k_means.cluster_centers_
k_means_cluster_centers

array([[1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.90819582e-17,
        1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.90819582e-17, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.90819582e-17, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.90819582e-17, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.90819582e-17,
        1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.11111111e-02,
        1.11111111e-02, 1.11111111e-02, 1.11111111e-02, 1.111111

In [138]:
# run k-means clustering
k_means.fit(toronto_onehot)

# check cluster labels generated for each row in the dataframe
k = k_means.labels_
# count the distinct labels (counting non-zero entries only matches by coincidence)
print("Number of clusters: ", len(np.unique(k)))
k

Number of clusters:  10


array([0, 0, 0, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0,
       0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 6, 0, 2, 0, 0, 5, 0, 0, 0, 0, 7, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [139]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, nei, cluster in zip(toronto_grouped['Latitude'], toronto_grouped['Longitude'], toronto_grouped['Neighbourhood'], k_means_labels):
    label = folium.Popup("N:" + str(nei) + '; Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

From the K-Means plot of Neighborhoods (n_init=100, i.e. 100 random initializations), we can see that the cluster groups have not formed the Toronto Boroughs, or anything close: nearly every Neighborhood falls into cluster 0.  Note that the model was fit on the one-hot encoded Neighbourhood names, in which every row is equally distant from every other row, rather than on the Latitude and Longitude columns, so the result says little about geographical proximity itself.  As run here, k-Means (Euclidean distance calculation) does not recover Boroughs from Neighborhoods.  Even with coordinates as features a clean match would not be guaranteed, as Toronto has dynamically expanded over time, and Boroughs may have been:

1. Organized according to communities (socioeconomic, etc.) where some have a lower population density.
2. Limited according to geography, such as lake front, etc.
3. Created at different points in time, meaning some are older than others, and subject to different population constraints.
4. Other factors
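
The cells above fit k-means to `toronto_onehot`. For comparison, here is a minimal sketch of clustering directly on coordinates and then cross-tabulating the cluster labels against boroughs. It uses six rows copied from the tables earlier in this notebook rather than the full dataset, fixes `random_state` for repeatability, and treats raw degrees as planar distances, which is only a rough approximation.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Three Scarborough rows and three Central Toronto rows copied from the tables above.
toy = pd.DataFrame({
    "Borough": ["Scarborough"] * 3 + ["Central Toronto"] * 3,
    "Latitude": [43.806686, 43.784535, 43.763573, 43.704324, 43.712751, 43.72802],
    "Longitude": [-79.194353, -79.160497, -79.188711, -79.38879, -79.390197, -79.38879],
})

# Cluster on the coordinate columns only.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
toy["Cluster"] = model.fit_predict(toy[["Latitude", "Longitude"]])

# Rows = boroughs, columns = cluster labels; a clean match has one non-zero cell per row.
agreement = pd.crosstab(toy["Borough"], toy["Cluster"])
```

On the full dataset the same cross-tabulation would show how far geographic clusters and borough boundaries actually diverge.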

### We can instead examine current venues (restaurants, museums, gyms, etc.) within select neighborhoods (M5) of Toronto for an understanding of each Neighborhood's underlying dynamic.
Foursquare collects information related to venues and offers a free Developer account with API access, and is therefore a natural choice.

Let's examine the downtown area of Toronto, limiting our selection to the postal codes starting with "M5".

In [140]:
plldata.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476


In [141]:
# rows 53 through 70 (inclusive) hold the 18 postal codes that start with 'M5'
plldata_5M = plldata.loc[53:70]
plldata_5M.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
53,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
54,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
55,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
56,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
57,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383
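
Slicing by row position works only as long as the row order never changes. A sketch of selecting the same kind of subset by postal-code prefix instead, shown on a toy frame with a handful of codes from this notebook:

```python
import pandas as pd

sample = pd.DataFrame({
    "Postal Code": ["M4Y", "M5A", "M5B", "M5X", "M6A"],
})

# Select by prefix instead of by row position, then copy so that later
# in-place edits do not touch the source frame.
m5 = sample[sample["Postal Code"].str.startswith("M5")].copy()
m5 = m5.reset_index(drop=True)
```

Taking an explicit `.copy()` is also one way to avoid a SettingWithCopyWarning when the subset is modified afterwards.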


In [142]:
# reset the index for our sliced 5M data.
plldata_5M.reset_index(drop=True, inplace=True)

In [143]:
plldata_5M.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
2,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
3,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
4,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383


In [144]:
# drop() returns a new DataFrame; reassigning it avoids modifying a slice of plldata in place (which triggers a SettingWithCopyWarning)
plldata_5M = plldata_5M.drop('Postal Code', axis=1)


In [145]:
plldata_5M

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude
0,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
2,Downtown Toronto,St. James Town,43.651494,-79.375418
3,Downtown Toronto,Berczy Park,43.644771,-79.373306
4,Downtown Toronto,Central Bay Street,43.657952,-79.387383
5,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568
6,Downtown Toronto,"Harbourfront East, Union Station, Toronto Islands",43.640816,-79.381752
7,Downtown Toronto,"Toronto Dominion Centre, Design Exchange",43.647177,-79.381576
8,Downtown Toronto,"Commerce Court, Victoria Hotel",43.648198,-79.379817
9,North York,"Bedford Park, Lawrence Manor East",43.733283,-79.41975


## Define Foursquare credentials and version

In [146]:
import os

# Read the credentials from environment variables instead of hard-coding them in the notebook.
# The variable names FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are my own choice; set them before launching Jupyter.
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
VERSION = '20180605' # Foursquare API version

print('Foursquare credentials loaded.')

Foursquare credentials loaded.



In [147]:
plldata_5M.loc[:0]

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude
0,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636


In [93]:
# i = 17
# postn = plldata_5M.loc[i, 'Postal Code']
# longn = plldata_5M.loc[i, 'Longitude']
# latn = plldata_5M.loc[i, 'Latitude']

    
# print("Postal Code: (lat, long)")
# print(f"{postn}: ({latn}, {longn})")

### Create the API URL

In [148]:
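radius = 500 # define radius (meters, 500m = 1640.42 ft)
LIMIT = 50 # this number limits how many venues Foursquare returns

# latn/longn are only defined in the commented-out cell above, so take them from the last M5 row
# (index 17: "First Canadian Place, Underground city", postal code M5X)
latn = plldata_5M.loc[17, 'Latitude']
longn = plldata_5M.loc[17, 'Longitude']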

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    latn, 
    longn, 
    radius, 
    LIMIT)

url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=<CLIENT_ID>&client_secret=<CLIENT_SECRET>&v=20180605&ll=43.6484292,-79.3822802&radius=500&limit=50'
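
An alternative to formatting the query string by hand is to build it from a dictionary, which keeps each parameter named and handles escaping. The sketch below uses only the standard library; the helper name `build_explore_url` and the placeholder credentials "DEMO_ID" and "DEMO_SECRET" are hypothetical, while the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=50):
    """Assemble the explore URL from named parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in the ll value readable instead of encoding it as %2C
    return f"{EXPLORE_ENDPOINT}?{urlencode(params, safe=',')}"

# Placeholder credentials; the coordinates are the ones shown in the URL above.
demo_url = build_explore_url("DEMO_ID", "DEMO_SECRET", "20180605", 43.6484292, -79.3822802)
```

The same dictionary could also be handed to `requests.get(..., params=...)`, which performs the encoding itself.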

In [149]:
results = requests.get(url).json()

In [62]:
#results

In [150]:
# Here we save the JSON response to a file, so that we can open it up within JupyterLab in order to conveniently examine the underlying dictionary structure.

import json

with open('torontoM5X_500.json', 'w') as json_file:
    json.dump(results, json_file)

### Get all nearby venues w/in Foursquare for specified neighborhoods.

In [151]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(name, lat, lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])   # list comprehension used here in order to avoid results['venue']['name'] etc.

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',                            
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    # df.columns is the list representation of all columns in the DF.
    
    return(nearby_venues)
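
The function above indexes straight into `["response"]['groups'][0]['items']` and `['categories'][0]`, which raises an exception if a response has no groups or a venue has no category. Below is a minimal sketch of a more forgiving parser. The helper name `extract_venues` is hypothetical, the payload shape is inferred from the keys used above, and the second venue in the fake payload is made up.

```python
def extract_venues(payload):
    """Flatten an explore response into (name, lat, lng, category) tuples."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Fall back to None when a venue carries no category.
        category = categories[0]["name"] if categories else None
        rows.append((venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows

# Fake payload mirroring the keys accessed above; "Unnamed Kiosk" is invented.
fake_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Roselle Desserts",
               "location": {"lat": 43.653447, "lng": -79.362017},
               "categories": [{"name": "Bakery"}]}},
    {"venue": {"name": "Unnamed Kiosk",
               "location": {"lat": 43.65, "lng": -79.36},
               "categories": []}},
]}]}}

parsed = extract_venues(fake_payload)
empty = extract_venues({"response": {}})
```

Rows with a `None` category can then be dropped or labelled explicitly before one-hot encoding.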

#### Now we run the above function getNearbyVenues on each M5-prefixed postal code neighborhood and create a new dataframe called **toronto_venues**.

In [152]:
toronto_venues = getNearbyVenues(names=plldata_5M['Neighbourhood'],
                                   latitudes=plldata_5M['Latitude'],
                                   longitudes=plldata_5M['Longitude']
                                  )

Regent Park, Harbourfront
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Richmond, Adelaide, King
Harbourfront East, Union Station, Toronto Islands
Toronto Dominion Centre, Design Exchange
Commerce Court, Victoria Hotel
Bedford Park, Lawrence Manor East
Roselawn
Forest Hill North & West, Forest Hill Road Park
The Annex, North Midtown, Yorkville
University of Toronto, Harbord
Kensington Market, Chinatown, Grange Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
Stn A PO Boxes
First Canadian Place, Underground city


In [153]:
print(toronto_venues.shape)
toronto_venues.head()

(695, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Regent Park, Harbourfront",43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,"Regent Park, Harbourfront",43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,"Regent Park, Harbourfront",43.65426,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center
3,"Regent Park, Harbourfront",43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
4,"Regent Park, Harbourfront",43.65426,-79.360636,Impact Kitchen,43.656369,-79.35698,Restaurant


In [154]:
toronto_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Bedford Park, Lawrence Manor East",23,23,23,23,23,23
Berczy Park,50,50,50,50,50,50
"CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport",17,17,17,17,17,17
Central Bay Street,50,50,50,50,50,50
"Commerce Court, Victoria Hotel",50,50,50,50,50,50
"First Canadian Place, Underground city",50,50,50,50,50,50
"Forest Hill North & West, Forest Hill Road Park",4,4,4,4,4,4
"Garden District, Ryerson",50,50,50,50,50,50
"Harbourfront East, Union Station, Toronto Islands",50,50,50,50,50,50
"Kensington Market, Chinatown, Grange Park",50,50,50,50,50,50


In [155]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 160 unique categories.


### Analyze each venue type by using pandas 'get_dummies' to break down venue types.
`pd.get_dummies` performs the same one-hot encoding as scikit-learn's OneHotEncoder, whose documentation (<https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html>) describes it as follows:
"...This creates a binary column for each category and returns a sparse matrix or dense array"

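One pitfall with this encoding is a category whose name equals the label column you want to add afterwards: a plain assignment then overwrites the dummy column instead of appending a new one. The sketch below reproduces the situation on toy data and sidesteps it with `DataFrame.insert`; the fallback column name "Neighborhood Name" is a hypothetical choice.

```python
import pandas as pd

# Toy venues; one of them has the category "Neighborhood".
venues = pd.DataFrame({
    "Neighborhood": ["Berczy Park", "Berczy Park", "Roselawn"],
    "Venue Category": ["Bakery", "Neighborhood", "Bakery"],
})

onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="")

# A category sharing its name with the label column would be overwritten by a plain assignment.
clash = "Neighborhood" in onehot.columns

# insert() places the labels first; a name that cannot collide keeps the dummy column intact.
label_column = "Neighborhood Name" if clash else "Neighborhood"
onehot.insert(0, label_column, venues["Neighborhood"])
```

With the label inserted at position 0 there is no need to reorder columns afterwards.
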
In [156]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

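# add neighborhood column back to dataframe
# caution: the venue categories include one literally named 'Neighborhood', so this assignment
# overwrites that dummy column in place instead of appending a new last column
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 

# move the last column to the first position (because of the name clash above, this is 'Yoga Studio' rather than 'Neighborhood', as the output shows)
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])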
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Yoga Studio,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Butcher,Café,Camera Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Cosmetics Shop,Creperie,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Doner Restaurant,Donut Shop,Dumpling Restaurant,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish Market,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Garden,Gastropub,General Travel,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Historic Site,History Museum,Home Service,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Lake,Liquor Store,Lounge,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Movie Theater,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Organic Grocery,Park,Performing Arts Venue,Pharmacy,Pizza Place,Plane,Plaza,Poke Place,Portuguese Restaurant,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Restaurant,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Smoke Shop,Snack Place,Spa,Speakeasy,Sporting Goods Shop,Steakhouse,Supermarket,Sushi Restaurant,Tailor Shop,Tanning Salon,Tea Room,Thai Restaurant,Theater,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
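
The one-hot step above can be illustrated on a tiny example.  This is only a sketch: the `venues` frame below is made-up data standing in for `toronto_venues`, and `DataFrame.insert` is used here instead of the column re-ordering in the cell above.  One reason to prefer it is that `insert` raises a `ValueError` if a column named 'Neighborhood' already exists, so a venue category that happens to be called "Neighborhood" is not silently overwritten.

```python
import pandas as pd

# Hypothetical toy data standing in for toronto_venues
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B"],
    "Venue Category": ["Coffee Shop", "Park", "Coffee Shop"],
})

# One binary column per category; every row has exactly one 1
onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="", dtype=int)

# Put the neighborhood name in front; fails loudly if the name is already taken
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

print(onehot)
```

Each venue becomes one row, and the single 1 in that row marks its category.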


In [194]:
# number of venues (rows) and number of columns in the one-hot table.
toronto_onehot.shape

(695, 160)

In [158]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Butcher,Café,Camera Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Cosmetics Shop,Creperie,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Doner Restaurant,Donut Shop,Dumpling Restaurant,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish Market,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Garden,Gastropub,General Travel,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Historic Site,History Museum,Home Service,Hotel,IT Services,Ice Cream Shop,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Lake,Liquor Store,Lounge,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Movie Theater,Museum,Music Venue,New American Restaurant,Nightclub,Noodle House,Office,Organic Grocery,Park,Performing Arts Venue,Pharmacy,Pizza Place,Plane,Plaza,Poke Place,Portuguese Restaurant,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Restaurant,Roof Deck,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Smoke Shop,Snack Place,Spa,Speakeasy,Sporting Goods Shop,Steakhouse,Supermarket,Sushi Restaurant,Tailor Shop,Tanning Salon,Tea Room,Thai Restaurant,Theater,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar
0,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.086957,0.0,0.0,0.0,0.043478,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.02,0.02,0.04,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.04,0.0,0.0,0.02,0.04,0.08,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.02,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.02,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0
2,"CN Tower, King and Spadina, Railway Lands, Har...",0.0,0.058824,0.058824,0.058824,0.117647,0.176471,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Central Bay Street,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.16,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.02,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.04,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02
4,"Commerce Court, Victoria Hotel",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.04,0.0,0.02,0.0,0.0,0.0,0.06,0.02,0.0,0.0,0.0,0.0,0.08,0.0,0.02,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.02,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0
5,"First Canadian Place, Underground city",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.02,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.02,0.0,0.0,0.0,0.04,0.02,0.0,0.0,0.0,0.0,0.06,0.0,0.02,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.06,0.0,0.02,0.0,0.02,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.04,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0
6,"Forest Hill North & West, Forest Hill Road Park",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0
7,"Garden District, Ryerson",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.06,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.04,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.02,0.02,0.0,0.0,0.02,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.02,0.02,0.0,0.02,0.0,0.02,0.02,0.02,0.04,0.0,0.0,0.0,0.0,0.0,0.0
8,"Harbourfront East, Union Station, Toronto Islands",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.04,0.02,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.04,0.02,0.02,0.0,0.02,0.02,0.0,0.0,0.02,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.04,0.02,0.0,0.02,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.02,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0
9,"Kensington Market, Chinatown, Grange Park",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.04,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.04,0.0,0.0,0.06,0.0,0.04,0.02,0.0,0.0,0.0,0.02,0.06,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.02,0.02,0.02,0.0,0.0,0.0,0.02,0.0,0.02,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.06,0.0,0.04,0.02
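
Taking the mean of the one-hot columns per neighborhood turns the 0/1 indicators into the share of that neighborhood's venues in each category.  That is why "Bedford Park, Lawrence Manor East", with 23 venues, shows values such as 0.043478 (1/23) and 0.086957 (2/23).  A minimal sketch with made-up data (the `onehot` frame below is hypothetical, not the notebook's `toronto_onehot`):

```python
import pandas as pd

# Hypothetical one-hot table: 4 venues in neighborhood A, 2 in neighborhood B
onehot = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Coffee Shop":  [1, 1, 0, 0, 0, 1],
    "Park":         [0, 0, 1, 0, 1, 0],
    "Pub":          [0, 0, 0, 1, 0, 0],
})

# Mean of a 0/1 column = fraction of the group's rows that are 1
grouped = onehot.groupby("Neighborhood").mean().reset_index()
print(grouped)

# Each neighborhood's category shares add up to 1
row_totals = grouped.drop(columns="Neighborhood").sum(axis=1)
print(row_totals)
```

Because every row of the result sums to 1, neighborhoods with very different venue counts (4 versus 50 in the table above) are compared on the same scale.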


In [159]:
toronto_grouped.shape

(18, 160)

### Let's examine the 5 most common venues of each neighborhood.

In [160]:
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bedford Park, Lawrence Manor East----
                venue  freq
0  Italian Restaurant  0.09
1         Coffee Shop  0.09
2     Thai Restaurant  0.09
3          Restaurant  0.09
4      Sandwich Place  0.09


----Berczy Park----
                venue  freq
0         Coffee Shop  0.08
1         Cheese Shop  0.04
2              Bakery  0.04
3  Seafood Restaurant  0.04
4            Beer Bar  0.04


----CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport----
             venue  freq
0  Airport Service  0.18
1   Airport Lounge  0.12
2  Harbor / Marina  0.06
3          Airport  0.06
4      Coffee Shop  0.06


----Central Bay Street----
                venue  freq
0         Coffee Shop  0.16
1      Sandwich Place  0.06
2  Italian Restaurant  0.04
3                Café  0.04
4     Bubble Tea Shop  0.04


----Commerce Court, Victoria Hotel----
         venue  freq
0  Coffee Shop  0.10
1         Café  0.10
2        Hotel  0.08
3   Restaur

### Create a Pandas DataFrame of the 10 most common venues in each neighborhood

In [168]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
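
To see what the function above returns, here is a small usage sketch.  The function body is repeated unchanged so the block runs on its own; the `row` Series is made-up data shaped like one row of `toronto_grouped` (neighborhood name first, then category frequencies).

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighborhood name), keep only the frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Hypothetical row: name followed by venue-category frequencies (no ties)
row = pd.Series({"Neighborhood": "A", "Coffee Shop": 0.5, "Park": 0.3, "Pub": 0.2})

top2 = return_most_common_venues(row, 2)
print(list(top2))
```

One caveat: when several categories share the same frequency, their relative order comes from the sort algorithm, so tied categories can appear in a different order from one listing to another.  Passing `kind="stable"` to `sort_values` is one way to make tie order reproducible.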

In [201]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for i in np.arange(num_top_venues):
    try:
        # 1st, 2nd, 3rd take their suffix from `indicators`
        columns.append('{}{} Most Common Venue'.format(i+1, indicators[i]))
    except IndexError:
        # 4th and above fall back to 'th'
        columns.append('{}th Most Common Venue'.format(i+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for i in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[i, 1:] = return_most_common_venues(toronto_grouped.iloc[i, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Bedford Park, Lawrence Manor East",Italian Restaurant,Sandwich Place,Coffee Shop,Restaurant,Thai Restaurant,Pizza Place,Indian Restaurant,Juice Bar,Pub,Liquor Store
1,Berczy Park,Coffee Shop,Café,Beer Bar,Cheese Shop,Seafood Restaurant,Bakery,Cocktail Bar,Restaurant,Farmers Market,Gym
2,"CN Tower, King and Spadina, Railway Lands, Har...",Airport Service,Airport Lounge,Sculpture Garden,Harbor / Marina,Boutique,Rental Car Location,Coffee Shop,Bar,Plane,Boat or Ferry
3,Central Bay Street,Coffee Shop,Sandwich Place,Italian Restaurant,Bubble Tea Shop,Burger Joint,Café,Wine Bar,Japanese Restaurant,Juice Bar,Middle Eastern Restaurant
4,"Commerce Court, Victoria Hotel",Café,Coffee Shop,Restaurant,Hotel,Gym,Beer Bar,American Restaurant,Gastropub,Deli / Bodega,Seafood Restaurant


### Create Clusters with all specified Neighborhoods

In [202]:
# set number of clusters
kclusters = 5

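# keep only the numeric frequency columns (the positional axis argument is not accepted by recent Pandas versions)
toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')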

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0, n_init=100).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 1, 4, 1, 1, 1, 3, 1, 1, 1])
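
Note that the features passed to K-Means here are the venue-category frequencies, so neighborhoods are grouped by how similar their venue mix is.  The sketch below shows the same `KMeans` call on made-up two-dimensional points (the array `X` is hypothetical), along with a quick way to check how many rows landed in each cluster.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data: two tight, well-separated groups of three points each
X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])

km = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)
labels = km.labels_

# Number of rows assigned to each cluster label
sizes = np.bincount(labels, minlength=2)
print(labels, sizes)
```

The label numbers themselves are arbitrary; only which rows share a label is meaningful.  Checking cluster sizes is a quick sanity test, since a result with one large cluster and several single-row clusters may suggest trying a different `n_clusters`.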

In [203]:
neighborhoods_venues_sorted.reset_index(drop=True)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Bedford Park, Lawrence Manor East",Italian Restaurant,Sandwich Place,Coffee Shop,Restaurant,Thai Restaurant,Pizza Place,Indian Restaurant,Juice Bar,Pub,Liquor Store
1,Berczy Park,Coffee Shop,Café,Beer Bar,Cheese Shop,Seafood Restaurant,Bakery,Cocktail Bar,Restaurant,Farmers Market,Gym
2,"CN Tower, King and Spadina, Railway Lands, Har...",Airport Service,Airport Lounge,Sculpture Garden,Harbor / Marina,Boutique,Rental Car Location,Coffee Shop,Bar,Plane,Boat or Ferry
3,Central Bay Street,Coffee Shop,Sandwich Place,Italian Restaurant,Bubble Tea Shop,Burger Joint,Café,Wine Bar,Japanese Restaurant,Juice Bar,Middle Eastern Restaurant
4,"Commerce Court, Victoria Hotel",Café,Coffee Shop,Restaurant,Hotel,Gym,Beer Bar,American Restaurant,Gastropub,Deli / Bodega,Seafood Restaurant
5,"First Canadian Place, Underground city",Café,Coffee Shop,Restaurant,Hotel,Deli / Bodega,Bar,Gym,Steakhouse,American Restaurant,Japanese Restaurant
6,"Forest Hill North & West, Forest Hill Road Park",Park,Trail,Jewelry Store,Sushi Restaurant,Clothing Store,Chocolate Shop,Department Store,Deli / Bodega,Dance Studio,Creperie
7,"Garden District, Ryerson",Coffee Shop,Café,Ramen Restaurant,Bookstore,Cosmetics Shop,Theater,Clothing Store,Italian Restaurant,Miscellaneous Shop,Plaza
8,"Harbourfront East, Union Station, Toronto Islands",Coffee Shop,Aquarium,Plaza,Park,Hotel,Brewery,Café,New American Restaurant,Lounge,Bistro
9,"Kensington Market, Chinatown, Grange Park",Mexican Restaurant,Vegetarian / Vegan Restaurant,Café,Coffee Shop,Pizza Place,Burger Joint,Vietnamese Restaurant,Caribbean Restaurant,Bakery,Dessert Shop


In [204]:
# add clustering labels

neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = plldata_5M

# join the cluster labels and top venues onto plldata_5M, which holds latitude/longitude for each neighborhood
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighbourhood')

toronto_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1,Coffee Shop,Park,Pub,Café,Bakery,Restaurant,Breakfast Spot,Theater,Performing Arts Venue,French Restaurant
1,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1,Coffee Shop,Café,Ramen Restaurant,Bookstore,Cosmetics Shop,Theater,Clothing Store,Italian Restaurant,Miscellaneous Shop,Plaza
2,Downtown Toronto,St. James Town,43.651494,-79.375418,1,Café,Cosmetics Shop,Coffee Shop,Restaurant,Creperie,Hotel,Gastropub,Seafood Restaurant,Park,Farmers Market
3,Downtown Toronto,Berczy Park,43.644771,-79.373306,1,Coffee Shop,Café,Beer Bar,Cheese Shop,Seafood Restaurant,Bakery,Cocktail Bar,Restaurant,Farmers Market,Gym
4,Downtown Toronto,Central Bay Street,43.657952,-79.387383,1,Coffee Shop,Sandwich Place,Italian Restaurant,Bubble Tea Shop,Burger Joint,Café,Wine Bar,Japanese Restaurant,Juice Bar,Middle Eastern Restaurant
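
The join above keeps every row of the left-hand frame, so a neighborhood with no venue results would come through with missing values, and the integer cluster labels would be converted to floats.  A minimal sketch of that behaviour, using made-up frames (`coords` and `sorted_venues` are hypothetical stand-ins for `plldata_5M` and `neighborhoods_venues_sorted`):

```python
import pandas as pd

# Hypothetical coordinates table; neighborhood "C" has no venue data
coords = pd.DataFrame({
    "Neighbourhood": ["A", "B", "C"],
    "Latitude": [43.65, 43.66, 43.67],
    "Longitude": [-79.36, -79.38, -79.40],
})

# Hypothetical cluster/venue table keyed by 'Neighborhood' (different spelling)
sorted_venues = pd.DataFrame({
    "Cluster Labels": [1, 0],
    "Neighborhood": ["A", "B"],
    "1st Most Common Venue": ["Coffee Shop", "Park"],
})

# Left join: match coords['Neighbourhood'] against the other frame's index
merged = coords.join(sorted_venues.set_index("Neighborhood"), on="Neighbourhood")

# Rows without a match carry NaN; drop them and restore integer labels
unmatched = merged[merged["Cluster Labels"].isna()]
clean = merged.dropna(subset=["Cluster Labels"]).astype({"Cluster Labels": int})
print(clean)
```

Integer labels matter later because they are used to index the list of marker colors.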


In [205]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighbourhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
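
The color setup in the cell above boils down to sampling one color per cluster from Matplotlib's rainbow colormap.  A simplified sketch of just that part follows; the helper `color_for` is a hypothetical name, and it indexes the palette by the label itself rather than by `cluster-1`, which is an alternative choice and not what the cell above does.

```python
import numpy as np
from matplotlib import cm, colors

kclusters = 5

# Sample kclusters evenly spaced RGBA colors and convert each to a hex string
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

def color_for(cluster):
    """Return the hex color for a cluster label in 0..kclusters-1."""
    return palette[int(cluster)]

for label in range(kclusters):
    print(label, color_for(label))
```

With `cluster-1`, label 0 wraps around to the last color in the list; every cluster still gets its own color, but the direct index may be easier to read back from the map.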

### Let's examine the characteristics of all clusters

In [206]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,"Bedford Park, Lawrence Manor East",Italian Restaurant,Sandwich Place,Coffee Shop,Restaurant,Thai Restaurant,Pizza Place,Indian Restaurant,Juice Bar,Pub,Liquor Store
12,"The Annex, North Midtown, Yorkville",Café,Sandwich Place,Coffee Shop,Middle Eastern Restaurant,Burger Joint,Liquor Store,Indian Restaurant,Pub,BBQ Joint,History Museum


In [207]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Regent Park, Harbourfront",Coffee Shop,Park,Pub,Café,Bakery,Restaurant,Breakfast Spot,Theater,Performing Arts Venue,French Restaurant
1,"Garden District, Ryerson",Coffee Shop,Café,Ramen Restaurant,Bookstore,Cosmetics Shop,Theater,Clothing Store,Italian Restaurant,Miscellaneous Shop,Plaza
2,St. James Town,Café,Cosmetics Shop,Coffee Shop,Restaurant,Creperie,Hotel,Gastropub,Seafood Restaurant,Park,Farmers Market
3,Berczy Park,Coffee Shop,Café,Beer Bar,Cheese Shop,Seafood Restaurant,Bakery,Cocktail Bar,Restaurant,Farmers Market,Gym
4,Central Bay Street,Coffee Shop,Sandwich Place,Italian Restaurant,Bubble Tea Shop,Burger Joint,Café,Wine Bar,Japanese Restaurant,Juice Bar,Middle Eastern Restaurant
5,"Richmond, Adelaide, King",Coffee Shop,Café,Steakhouse,Restaurant,Concert Hall,Hotel,American Restaurant,General Travel,Pizza Place,Department Store
6,"Harbourfront East, Union Station, Toronto Islands",Coffee Shop,Aquarium,Plaza,Park,Hotel,Brewery,Café,New American Restaurant,Lounge,Bistro
7,"Toronto Dominion Centre, Design Exchange",Coffee Shop,Café,Seafood Restaurant,Restaurant,Beer Bar,Japanese Restaurant,Hotel,Bar,Sandwich Place,Basketball Stadium
8,"Commerce Court, Victoria Hotel",Café,Coffee Shop,Restaurant,Hotel,Gym,Beer Bar,American Restaurant,Gastropub,Deli / Bodega,Seafood Restaurant
13,"University of Toronto, Harbord",Café,Bookstore,Bar,Sandwich Place,Restaurant,Bakery,Japanese Restaurant,College Gym,College Arts Building,Chinese Restaurant


In [208]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Roselawn,Garden,Home Service,Discount Store,Dessert Shop,Department Store,Deli / Bodega,Dance Studio,Creperie,Cosmetics Shop,Concert Hall


In [209]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,"Forest Hill North & West, Forest Hill Road Park",Park,Trail,Jewelry Store,Sushi Restaurant,Clothing Store,Chocolate Shop,Department Store,Deli / Bodega,Dance Studio,Creperie


### Discussion:
The largest cluster within 5M# (cluster 1), which represents downtown Toronto closest to the lakefront, has a high density of coffee shops: "Coffee Shop" is the 1st most common venue in most of its neighborhoods and the 2nd or 3rd most common in some of the others.  If you're interested in a cup of coffee, it appears you can't go wrong in 5M# near the lakefront.
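
A claim like this can be checked with a short tally instead of reading the tables by eye.  The sketch below uses a made-up frame (`merged` is hypothetical and only mimics two columns of `toronto_merged`); the numbers it prints are not results from this notebook.

```python
import pandas as pd

# Hypothetical stand-in for two columns of toronto_merged
merged = pd.DataFrame({
    "Cluster Labels": [1, 1, 1, 0],
    "1st Most Common Venue": ["Coffee Shop", "Coffee Shop", "Café", "Italian Restaurant"],
})

# How often each category is ranked first, per cluster
counts = merged.groupby("Cluster Labels")["1st Most Common Venue"].value_counts()
print(counts)

# Share of neighborhoods in each cluster whose top venue is a coffee shop
is_coffee = merged["1st Most Common Venue"] == "Coffee Shop"
share = is_coffee.groupby(merged["Cluster Labels"]).mean()
print(share)
```

Running the same two statements against `toronto_merged` would give the exact count behind the statement about the largest cluster.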