## This notebook is developed as part of the Peer-graded Assignment: Segmenting and Clustering Neighborhoods in Toronto (Module 9 Week 3)

#### Data about 'List of postal codes of Canada: M' is taken from https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M

In [44]:
# Import the libraries required
import pandas as pd
import numpy as np

# I tried the geocoder library (the code is kept below, commented out), but unfortunately it did not work for me. More details below.
#!pip install geocoder
#import geocoder # geocoder for the second part of the assignment

!pip install geopy
from geopy.geocoders import Nominatim
import requests # library to handle requests
from pandas import json_normalize # transform a JSON file into a pandas DataFrame (top-level import, pandas 1.0+)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors


!pip install folium
import folium

# import k-means from clustering stage
from sklearn.cluster import KMeans



In [2]:
# Read data from the URL provided

postcodeM = pd.read_html("https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M")
postcodeM[0]

Unnamed: 0,Postal code,Borough,Neighborhood
0,M1A,Not assigned,
1,M2A,Not assigned,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Regent Park / Harbourfront
...,...,...,...
175,M5Z,Not assigned,
176,M6Z,Not assigned,
177,M7Z,Not assigned,
178,M8Z,Etobicoke,Mimico NW / The Queensway West / South of Bloo...


In [3]:
# Setup the final result data frame in the right format
column_names = ['Postcode', 'Borough','Neighborhood']

df_postcodeM = pd.DataFrame(columns=column_names)
df_postcodeM

Unnamed: 0,Postcode,Borough,Neighborhood


In [4]:
# Populate the database
df_postcodeM['Postcode'] = postcodeM[0]['Postal code']
df_postcodeM['Borough'] = postcodeM[0]['Borough']
df_postcodeM['Neighborhood'] = postcodeM[0]['Neighborhood']
df_postcodeM.head(5)

Unnamed: 0,Postcode,Borough,Neighborhood
0,M1A,Not assigned,
1,M2A,Not assigned,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Regent Park / Harbourfront


In [5]:
# Drop rows where Borough is 'Not assigned'

indexName = df_postcodeM[df_postcodeM['Borough'] == 'Not assigned'].index
df_postcodeM.drop(indexName, inplace = True)

# replace '/' with ',' in the Neighborhood column
df_postcodeM['Neighborhood'] = df_postcodeM['Neighborhood'].str.replace("/",",", regex=True)

df_postcodeM.head(5)


Unnamed: 0,Postcode,Borough,Neighborhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park , Harbourfront"
5,M6A,North York,"Lawrence Manor , Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park , Ontario Provincial Government"


In [6]:
# Check whether any postcode appears more than once and, if so, merge those rows
# Note: Visual inspection has confirmed that this is not the case. I am not sure why the assignment description mentions it.
#       It's possible that this was the case earlier and has since been corrected

df_postcodeM = df_postcodeM.groupby('Postcode').agg({
                                                    'Borough' : 'first',
                                                    'Neighborhood':','.join}).reset_index()

In [7]:
df_postcodeM.head(5)

Unnamed: 0,Postcode,Borough,Neighborhood
0,M1B,Scarborough,"Malvern , Rouge"
1,M1C,Scarborough,"Rouge Hill , Port Union , Highland Creek"
2,M1E,Scarborough,"Guildwood , Morningside , West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
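
Since the scraped table no longer contains duplicate postcodes, the effect of the `groupby`/`agg` step is not visible in the output above. The sketch below shows what it would do, using a small hypothetical frame in which `M5A` is deliberately listed twice; the toy rows are an assumption for illustration, not data from the page.

```python
import pandas as pd

# Hypothetical toy data: postcode "M5A" appears twice with different neighborhoods.
toy = pd.DataFrame({
    "Postcode": ["M5A", "M5A", "M7A"],
    "Borough": ["Downtown Toronto", "Downtown Toronto", "Downtown Toronto"],
    "Neighborhood": ["Regent Park", "Harbourfront", "Queen's Park"],
})

# Same pattern as the cell above: keep the first borough per postcode
# and join the neighborhood names into one comma-separated string.
merged = (
    toy.groupby("Postcode")
       .agg({"Borough": "first", "Neighborhood": ", ".join})
       .reset_index()
)
print(merged)
```

Rows with a unique postcode pass through unchanged, which is why running the step on already-clean data is harmless.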


In [8]:
print("Having processed the data as required, the shape of the DataFrame is ", df_postcodeM.shape)

Having processed the data as required, the shape of the DataFrame is  (103, 3)


### This brings us to the end of Q1 of the Week 3 Assignment on "Segmenting and Clustering Neighborhoods in Toronto" 

In [9]:
# Part 2 of the assignment requires reading latitude and longitude from geocoder and adding them to the data frame
# Add (initially empty) columns to hold the latitude and longitude
df_postcodeM.insert(3 , 'Latitute', np.nan)
df_postcodeM.insert(4 , 'Longitute', np.nan)
df_postcodeM.head(5)

Unnamed: 0,Postcode,Borough,Neighborhood,Latitute,Longitute
0,M1B,Scarborough,"Malvern , Rouge",,
1,M1C,Scarborough,"Rouge Hill , Port Union , Highland Creek",,
2,M1E,Scarborough,"Guildwood , Morningside , West Hill",,
3,M1G,Scarborough,Woburn,,
4,M1H,Scarborough,Cedarbrae,,


#### I tried using the geocoder library with the code below, but it didn't work. It raised no error; it simply never produced any output. Interrupting the kernel had no effect either, so I eventually had to close the browser window and shut down the kernel instance on my local laptop.

##### I eventually used the link provided in the assignment.

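for index in df_postcodeM.index:
    lat_lng_coords = None
    postal_code = df_postcodeM['Postcode'][index]
    # loop until you get the coordinates
    # (this never terminates if the service keeps returning None, which may explain the hang)
    while(lat_lng_coords is None):
      g = geocoder.google('{}, Toronto, Ontario'.format(postal_code))
      lat_lng_coords = g.latlng

    # write the coordinates to this row only, not to the whole column
    df_postcodeM.loc[index, 'Latitute'] = lat_lng_coords[0]
    df_postcodeM.loc[index, 'Longitute'] = lat_lng_coords[1]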
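
The `while` loop above retries forever when the geocoding call keeps returning `None`. A bounded retry is one way to avoid that; the following is a minimal sketch under stated assumptions. The helper `geocode_with_retry` and the stand-in `always_fails` are hypothetical names, and the real geocoder call is deliberately replaced by a plain function so that the example runs without network access.

```python
import time

def geocode_with_retry(lookup, query, max_attempts=3, delay_seconds=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out."""
    for _ in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords
        time.sleep(delay_seconds)  # pause before the next attempt
    return None  # give up instead of looping forever

# Hypothetical stand-in for a geocoding call that never succeeds.
def always_fails(query):
    return None

result = geocode_with_retry(always_fails, "M1B, Toronto, Ontario")
print(result)
```

With a real service, `lookup` could wrap the geocoder call and return its `latlng` value; a `None` result would then be handled by the caller rather than hanging the kernel.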

In [10]:
lat_long = pd.read_csv('http://cocl.us/Geospatial_data')
lat_long.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [11]:
# check the shape of the lat_long data frame just read, to make sure it's as expected
lat_long.shape

(103, 3)

In [12]:
# look up each postcode's coordinates in lat_long and copy them into df_postcodeM
# (matching on the postcode value rather than on row position)

coords = lat_long.set_index('Postal Code')
df_postcodeM['Latitute'] = df_postcodeM['Postcode'].map(coords['Latitude'])
df_postcodeM['Longitute'] = df_postcodeM['Postcode'].map(coords['Longitude'])

df_postcodeM.head(5)      



Unnamed: 0,Postcode,Borough,Neighborhood,Latitute,Longitute
0,M1B,Scarborough,"Malvern , Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill , Port Union , Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood , Morningside , West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
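
Attaching coordinates by postcode can also be written as a key-based join, which does not depend on the two frames being in the same row order. The sketch below is an illustration on two toy frames (the left one is deliberately out of order); the coordinate values are the first two rows of `lat_long` shown above, and the frame names `left`, `right`, and `joined` are placeholders.

```python
import pandas as pd

# Toy frames: note that the row order differs between the two.
left = pd.DataFrame({
    "Postcode": ["M1C", "M1B"],
    "Borough": ["Scarborough", "Scarborough"],
})
right = pd.DataFrame({
    "Postal Code": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

# Left join on the postcode; validate raises if a key is duplicated on either side.
joined = (
    left.merge(right, left_on="Postcode", right_on="Postal Code",
               how="left", validate="one_to_one")
        .drop(columns="Postal Code")
)
print(joined)
```

A postcode missing from the coordinate file would show up as `NaN` in the joined frame, which is easy to check with `joined["Latitude"].isna().sum()`.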


### This brings us to the end of Q2 of the Week 3 Assignment on "Segmenting and Clustering Neighborhoods in Toronto"

In [14]:
# let's identify the boroughs whose name contains the word Toronto

ContainsToronto = df_postcodeM[df_postcodeM['Borough'].str.contains("Toronto", na=False)].reset_index(drop=True)
ContainsToronto.head(5)

Unnamed: 0,Postcode,Borough,Neighborhood,Latitute,Longitute
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West , Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"India Bazaar , The Beaches West",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879


In [15]:
ContainsToronto.shape

(39, 5)

In [17]:
address = 'Toronto, CA'

geolocator = Nominatim(user_agent="Toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


In [22]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, label in zip(ContainsToronto['Latitute'], ContainsToronto['Longitute'], ContainsToronto['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)  
    
map_toronto

In [31]:
# Define Foursquare credentials and version

CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID (redacted)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret (redacted)
VERSION = '20180605' # Foursquare API version
LIMIT = 100

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET: YOUR_FOURSQUARE_CLIENT_SECRET
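
API credentials are usually better kept out of a shared notebook. One possible approach, sketched here under assumptions, is to read them from environment variables and print only a masked form. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helpers `load_credentials` and `mask` are hypothetical, not something the assignment prescribes.

```python
import os

def load_credentials(env=os.environ):
    """Read the Foursquare credentials from a mapping (the environment by default)."""
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Foursquare credentials are not set")
    return client_id, client_secret

def mask(secret, visible=4):
    """Show only the first few characters of a secret."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)

# Demonstration with a plain dict standing in for the real environment.
demo_env = {"FOURSQUARE_CLIENT_ID": "ABCDEFGH", "FOURSQUARE_CLIENT_SECRET": "12345678"}
demo_id, demo_secret = load_credentials(demo_env)
print("CLIENT_ID: " + mask(demo_id))
print("CLIENT_SECRET: " + mask(demo_secret))
```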


In [32]:
# Define the function to get data of all neighborhoods

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
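
`getNearbyVenues` indexes straight into the JSON response, so a missing key (for example on an error response, or a venue without categories) would raise an exception partway through the loop. The sketch below shows a more defensive way to pull out the same four fields. The helper name `extract_venues` is hypothetical, and the sample payload is hand-written to mirror only the keys the function above reads; it is not a captured API response.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples, tolerating missing keys."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        # Fall back to an empty dict when a venue has no categories.
        categories = venue.get("categories") or [{}]
        rows.append((
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            categories[0].get("name"),
        ))
    return rows

# Hand-written sample mirroring the keys used by getNearbyVenues.
sample = {"response": {"groups": [{"items": [{"venue": {
    "name": "Glen Manor Ravine",
    "location": {"lat": 43.676821, "lng": -79.293942},
    "categories": [{"name": "Trail"}],
}}]}]}}

print(extract_venues(sample))
print(extract_venues({"meta": {"code": 429}}))  # no "response" key: empty list
```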

In [33]:
# Fetch nearby venues for every neighborhood in the Toronto boroughs

ContainsToronto_venues = getNearbyVenues(names=ContainsToronto['Neighborhood'],
                                   latitudes=ContainsToronto['Latitute'],
                                   longitudes=ContainsToronto['Longitute']
                                  )

The Beaches
The Danforth West , Riverdale
India Bazaar , The Beaches West
Studio District
Lawrence Park
Davisville North
North Toronto West
Davisville
Moore Park , Summerhill East
Summerhill West , Rathnelly , South Hill , Forest Hill SE , Deer Park
Rosedale
St. James Town , Cabbagetown
Church and Wellesley
Regent Park , Harbourfront
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Richmond , Adelaide , King
Harbourfront East , Union Station , Toronto Islands
Toronto Dominion Centre , Design Exchange
Commerce Court , Victoria Hotel
Roselawn
Forest Hill North & West
The Annex , North Midtown , Yorkville
University of Toronto , Harbord
Kensington Market , Chinatown , Grange Park
CN Tower , King and Spadina , Railway Lands , Harbourfront West , Bathurst  Quay , South Niagara , Island airport
Stn A PO Boxes
First Canadian Place , Underground city
Christie
Dufferin , Dovercourt Village
Little Portugal , Trinity
Brockton , Parkdale Village , Exhibition Place
High Park ,

In [34]:
print(ContainsToronto_venues.shape)
ContainsToronto_venues.head()

(1692, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Beaches,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
1,The Beaches,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
2,The Beaches,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
3,The Beaches,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood
4,"The Danforth West , Riverdale",43.679557,-79.352188,Pantheon,43.677621,-79.351434,Greek Restaurant


In [35]:
# Let's check how many venues were returned for each neighborhood
ContainsToronto_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Berczy Park,56,56,56,56,56,56
"Brockton , Parkdale Village , Exhibition Place",23,23,23,23,23,23
Business reply mail Processing CentrE,16,16,16,16,16,16
"CN Tower , King and Spadina , Railway Lands , Harbourfront West , Bathurst Quay , South Niagara , Island airport",16,16,16,16,16,16
Central Bay Street,77,77,77,77,77,77
Christie,17,17,17,17,17,17
Church and Wellesley,83,83,83,83,83,83
"Commerce Court , Victoria Hotel",100,100,100,100,100,100
Davisville,34,34,34,34,34,34
Davisville North,7,7,7,7,7,7


In [37]:
# Let's find out how many unique categories can be curated from all the returned venues
print('There are {} unique categories.'.format(len(ContainsToronto_venues['Venue Category'].unique())))

There are 233 unique categories.


In [38]:
# Analyse each neighborhood
# one hot encoding
ContainsToronto_onehot = pd.get_dummies(ContainsToronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
ContainsToronto_onehot['Neighborhood'] = ContainsToronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [ContainsToronto_onehot.columns[-1]] + list(ContainsToronto_onehot.columns[:-1])
ContainsToronto_onehot = ContainsToronto_onehot[fixed_columns]

ContainsToronto_onehot.head()

Unnamed: 0,Yoga Studio,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
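
One thing worth checking in this step: the venue listing earlier includes a venue whose category is itself named `Neighborhood` ("Upper Beaches"). If that category survives one-hot encoding, assigning `onehot['Neighborhood'] = ...` overwrites the dummy column instead of adding a new one, which may be why `Yoga Studio` rather than `Neighborhood` ends up first in the output above. The sketch below reproduces the collision on two toy rows and sidesteps it by inserting the names under a different, hypothetical column name (`Neighborhood Name`).

```python
import pandas as pd

# Two toy venues; the second one's category is literally "Neighborhood".
venues = pd.DataFrame({
    "Neighborhood": ["The Beaches", "The Beaches"],
    "Venue Category": ["Trail", "Neighborhood"],
})

onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="")

# True here means a dummy column already uses the name "Neighborhood".
collision = "Neighborhood" in onehot.columns
print("collision:", collision)

# Insert the neighborhood names at position 0 under a distinct name,
# so the "Neighborhood" category column is preserved.
onehot.insert(0, "Neighborhood Name", venues["Neighborhood"])
print(list(onehot.columns))
```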


In [39]:
# let's group rows by neighborhood, taking the mean of the frequency of occurrence of each category

ContainsToronto_grouped = ContainsToronto_onehot.groupby('Neighborhood').mean().reset_index()
ContainsToronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0
1,"Brockton , Parkdale Village , Exhibition Place",0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Business reply mail Processing CentrE,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"CN Tower , King and Spadina , Railway Lands , ...",0.0,0.0,0.0625,0.0625,0.0625,0.125,0.0625,0.125,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Bay Street,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,...,0.0,0.0,0.0,0.012987,0.0,0.0,0.012987,0.0,0.0,0.0
5,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Church and Wellesley,0.024096,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,...,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.012048,0.012048,0.0
7,"Commerce Court , Victoria Hotel",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,...,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0
8,Davisville,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Davisville North,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [40]:
ContainsToronto_grouped.shape

(39, 233)

In [41]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
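
To make the behavior of `return_most_common_venues` concrete, here is a small sketch on a single hand-made row. The function `top_categories` is a hypothetical equivalent that casts the frequencies to float before sorting, and the frequency values are invented for illustration.

```python
import pandas as pd

def top_categories(row, num_top_venues):
    """Return the category names with the highest frequencies in a grouped row."""
    # Skip the first entry (the neighborhood name) and sort the rest.
    frequencies = row.iloc[1:].astype(float)
    return list(frequencies.sort_values(ascending=False).index[:num_top_venues])

# Hypothetical grouped row: neighborhood name followed by category frequencies.
row = pd.Series({
    "Neighborhood": "Berczy Park",
    "Bakery": 0.1,
    "Café": 0.2,
    "Coffee Shop": 0.5,
})

top = top_categories(row, 2)
print(top)
```

Note that the slice `row.iloc[1:]` assumes the neighborhood name sits in the first position of the row, as it does after `groupby(...).mean().reset_index()`.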

In [42]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = ContainsToronto_grouped['Neighborhood']

for ind in np.arange(ContainsToronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(ContainsToronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Coffee Shop,Cocktail Bar,Beer Bar,Seafood Restaurant,Restaurant,Farmers Market,Café,Cheese Shop,Bakery,Cosmetics Shop
1,"Brockton , Parkdale Village , Exhibition Place",Café,Breakfast Spot,Coffee Shop,Yoga Studio,Bakery,Stadium,Burrito Place,Restaurant,Climbing Gym,Performing Arts Venue
2,Business reply mail Processing CentrE,Light Rail Station,Fast Food Restaurant,Auto Workshop,Brewery,Spa,Burrito Place,Recording Studio,Pizza Place,Garden,Gym / Fitness Center
3,"CN Tower , King and Spadina , Railway Lands , ...",Airport Lounge,Airport Terminal,Boutique,Boat or Ferry,Bar,Rental Car Location,Sculpture Garden,Plane,Coffee Shop,Harbor / Marina
4,Central Bay Street,Coffee Shop,Italian Restaurant,Sandwich Place,Japanese Restaurant,Thai Restaurant,Burger Joint,Ice Cream Shop,Café,Middle Eastern Restaurant,Spa
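
The column labels above rely on a `try`/`except` around a three-element suffix list, which works for the first ten ranks but would label rank 21 as "21th". If more columns were ever needed, a small helper could generate the suffixes directly; `ordinal` below is a hypothetical helper, shown only as a sketch.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the other teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

num_top_venues = 10
columns = ["Neighborhood"] + [
    f"{ordinal(rank)} Most Common Venue" for rank in range(1, num_top_venues + 1)
]
print(columns)
```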


In [45]:
# Run k-means to cluster the neighborhood into 5 clusters.

# set number of clusters
kclusters = 5

ContainsToronto_grouped_clustering = ContainsToronto_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ContainsToronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
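
The first ten labels are all 2, which hints that one cluster may dominate. Counting how many rows fall into each label makes this easy to check. The sketch below uses a short, illustrative label array rather than the real `kmeans.labels_`, and the name `sizes` is a placeholder.

```python
import numpy as np

kclusters = 5
# Illustrative labels only; in the notebook this would be kmeans.labels_.
labels = np.array([2, 2, 2, 2, 0, 2, 2, 2, 1, 2])

# bincount returns the number of occurrences of each label 0..kclusters-1.
sizes = np.bincount(labels, minlength=kclusters)
for label, size in enumerate(sizes):
    print(f"label {label}: {size} neighborhood(s)")
```

A very uneven split like this one suggests trying a different number of clusters or inspecting the small clusters individually.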

In [46]:
# Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

ContainsToronto_merged = ContainsToronto

# join neighborhoods_venues_sorted onto ContainsToronto_merged to attach the cluster label and top venues to each neighborhood
ContainsToronto_merged = ContainsToronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

ContainsToronto_merged.head() # check the last columns!

Unnamed: 0,Postcode,Borough,Neighborhood,Latitute,Longitute,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,2,Health Food Store,Trail,Pub,Women's Store,Department Store,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop
1,M4K,East Toronto,"The Danforth West , Riverdale",43.679557,-79.352188,2,Greek Restaurant,Coffee Shop,Italian Restaurant,Bookstore,Ice Cream Shop,Furniture / Home Store,Frozen Yogurt Shop,Pub,Pizza Place,Liquor Store
2,M4L,East Toronto,"India Bazaar , The Beaches West",43.668999,-79.315572,2,Sandwich Place,Coffee Shop,Food & Drink Shop,Liquor Store,Burrito Place,Restaurant,Italian Restaurant,Intersection,Fast Food Restaurant,Ice Cream Shop
3,M4M,East Toronto,Studio District,43.659526,-79.340923,2,Café,Coffee Shop,American Restaurant,Brewery,Bakery,Gastropub,Yoga Studio,Clothing Store,Latin American Restaurant,Wine Bar
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,0,Photography Studio,Park,Swim School,Bus Line,Dessert Shop,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop


In [49]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(ContainsToronto_merged['Latitute'], ContainsToronto_merged['Longitute'], ContainsToronto_merged['Neighborhood'], ContainsToronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)],
        fill=True,
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [50]:
# Cluster 1

ContainsToronto_merged.loc[ContainsToronto_merged['Cluster Labels'] == 0, ContainsToronto_merged.columns[[1] + list(range(5, ContainsToronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Central Toronto,0,Photography Studio,Park,Swim School,Bus Line,Dessert Shop,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop
10,Downtown Toronto,0,Park,Playground,Trail,Women's Store,Department Store,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop


In [52]:
# Cluster 2
ContainsToronto_merged.loc[ContainsToronto_merged['Cluster Labels'] == 1, ContainsToronto_merged.columns[[1] + list(range(5, ContainsToronto_merged.shape[1]))]]


Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Central Toronto,1,Tennis Court,Department Store,Ethiopian Restaurant,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run


In [53]:
# Cluster 3
ContainsToronto_merged.loc[ContainsToronto_merged['Cluster Labels'] == 2, ContainsToronto_merged.columns[[1] + list(range(5, ContainsToronto_merged.shape[1]))]]


Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,East Toronto,2,Health Food Store,Trail,Pub,Women's Store,Department Store,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop
1,East Toronto,2,Greek Restaurant,Coffee Shop,Italian Restaurant,Bookstore,Ice Cream Shop,Furniture / Home Store,Frozen Yogurt Shop,Pub,Pizza Place,Liquor Store
2,East Toronto,2,Sandwich Place,Coffee Shop,Food & Drink Shop,Liquor Store,Burrito Place,Restaurant,Italian Restaurant,Intersection,Fast Food Restaurant,Ice Cream Shop
3,East Toronto,2,Café,Coffee Shop,American Restaurant,Brewery,Bakery,Gastropub,Yoga Studio,Clothing Store,Latin American Restaurant,Wine Bar
5,Central Toronto,2,Department Store,Park,Breakfast Spot,Gym,Hotel,Food & Drink Shop,Sandwich Place,Dog Run,Discount Store,Distribution Center
6,Central Toronto,2,Clothing Store,Coffee Shop,Sporting Goods Shop,Fast Food Restaurant,Diner,Dessert Shop,Mexican Restaurant,Park,Chinese Restaurant,Café
7,Central Toronto,2,Sandwich Place,Dessert Shop,Pizza Place,Gym,Italian Restaurant,Café,Sushi Restaurant,Coffee Shop,Asian Restaurant,Pub
9,Central Toronto,2,Coffee Shop,Pub,Supermarket,Sports Bar,Light Rail Station,Sushi Restaurant,Vietnamese Restaurant,Liquor Store,Bank,Pizza Place
11,Downtown Toronto,2,Coffee Shop,Restaurant,Pub,Bakery,Pizza Place,Park,Italian Restaurant,Café,Chinese Restaurant,Pet Store
12,Downtown Toronto,2,Coffee Shop,Japanese Restaurant,Gay Bar,Restaurant,Sushi Restaurant,Gastropub,Pub,Pizza Place,Hotel,Café


In [54]:
# Cluster 4
ContainsToronto_merged.loc[ContainsToronto_merged['Cluster Labels'] == 3, ContainsToronto_merged.columns[[1] + list(range(5, ContainsToronto_merged.shape[1]))]]


Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
23,Central Toronto,3,Jewelry Store,Trail,Mexican Restaurant,Sushi Restaurant,Women's Store,Dessert Shop,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant


In [55]:
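# Cluster 5 (labels run from 0 to kclusters - 1, so the fifth cluster has label 4;
# the empty output recorded below came from filtering on label 5, which matches no rows)
ContainsToronto_merged.loc[ContainsToronto_merged['Cluster Labels'] == 4, ContainsToronto_merged.columns[[1] + list(range(5, ContainsToronto_merged.shape[1]))]]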


Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
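
Because k-means labels run from 0 to `kclusters - 1`, the per-cluster cells can also be written as a single loop, which avoids mistyping a label by hand. The following is a minimal sketch on a toy frame; the frame contents and the reduced `kclusters = 3` are assumptions for illustration.

```python
import pandas as pd

kclusters = 3  # hypothetical; the notebook uses 5

# Toy stand-in for ContainsToronto_merged with only two columns.
toy = pd.DataFrame({
    "Borough": ["East Toronto", "Central Toronto", "Downtown Toronto", "Central Toronto"],
    "Cluster Labels": [2, 0, 2, 1],
})

# One sub-frame per label, keyed by the label itself.
by_cluster = {
    label: toy.loc[toy["Cluster Labels"] == label]
    for label in range(kclusters)
}

for label, rows in by_cluster.items():
    # Display name is label + 1, matching the "Cluster 1..5" headings.
    print(f"Cluster {label + 1}: {len(rows)} row(s)")
```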
