# Toronto vs Montreal: How similar or dissimilar are Toronto and Montreal neighborhoods?

### Introduction

This project is for new Canadians who are deciding which metropolitan city to live in. Moving to a new country is challenging in its own right. Hopefully this project will capture the essence of the two cities and help newcomers choose their new home wisely based on their lifestyle. The project focuses on three types of newcomers:

1. Couple with a young family or who want to start a family 
2. Young professional with a demanding job 
3. Newcomer with an active lifestyle  

### The data

The data comes from Wikipedia (postal codes and neighborhoods), the Geocoder library (latitude and longitude), and the Foursquare API (location data from before COVID-19). The Foursquare data concentrates on nearby venues such as restaurants, cafés, parks, and museums, to name a few.

### Data Analysis and Visualization

The common venues in each neighborhood of both cities, Toronto and Montreal, will be grouped into clusters using k-means clustering, an unsupervised machine learning technique. Folium maps will be used to visualize the neighborhoods and the resulting clusters.
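To make the clustering input concrete, here is a minimal sketch in plain Python of how venue categories could be turned into per-neighborhood frequency vectors, which is the kind of table k-means is fit on. The venue records are made up for illustration, and `category_frequencies` is a hypothetical helper rather than part of the notebook, which does the same job with pandas.

```python
from collections import Counter, defaultdict

# Made-up (neighborhood, venue category) pairs, for illustration only
venues = [
    ("A", "Park"), ("A", "Café"), ("A", "Park"), ("A", "Gym"),
    ("B", "Café"), ("B", "Café"),
]

def category_frequencies(pairs):
    """Return {neighborhood: {category: share of that neighborhood's venues}}."""
    counts = defaultdict(Counter)
    for neighborhood, category in pairs:
        counts[neighborhood][category] += 1
    # Every neighborhood gets the same, sorted set of category keys
    categories = sorted({category for _, category in pairs})
    return {
        neighborhood: {c: counter[c] / sum(counter.values()) for c in categories}
        for neighborhood, counter in counts.items()
    }

freqs = category_frequencies(venues)
print(freqs["A"])
print(freqs["B"])
```

Each neighborhood ends up with one value per category, and the values sum to 1, so neighborhoods with very different venue counts remain comparable.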

# Notebook 

## Toronto Data 

In [1]:
import pandas as pd
import numpy as np 

print('Pandas and NumPy imported')

Pandas and NumPy imported


In [2]:
#Create first Toronto dataframe based on Wiki data 
to_df = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')
to_df = to_df[0]

#Remove Not assigned and reset index
to_df.replace('Not assigned', np.nan, inplace = True)
to_df.dropna(subset = ['Borough'], axis = 0, inplace = True)
to_df.drop(columns = ['Borough'], inplace = True)
to_df.reset_index(drop = True, inplace = True)
to_df.head()

Unnamed: 0,Postal Code,Neighbourhood
0,M3A,Parkwoods
1,M4A,Victoria Village
2,M5A,"Regent Park, Harbourfront"
3,M6A,"Lawrence Manor, Lawrence Heights"
4,M7A,"Queen's Park, Ontario Provincial Government"


In [3]:
# Create new dataframe with postal code, latitude, longitude for toronto
geo_to = pd.read_csv('http://cocl.us/Geospatial_data')

# Merge both dataframe together to create the foundational dataframe toronto_df
toronto_df = pd.merge(to_df, geo_to, how = 'outer', on='Postal Code')
toronto_df.head()

Unnamed: 0,Postal Code,Neighbourhood,Latitude,Longitude
0,M3A,Parkwoods,43.753259,-79.329656
1,M4A,Victoria Village,43.725882,-79.315572
2,M5A,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494


## Montreal Data

In [4]:
#!conda install -c conda-forge geocoder --yes
import geocoder 

print('Geocoder imported!')

Geocoder imported!


In [5]:
#https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_H

mtl_data = {'Postal Code': ['H1A', 'H1B', 'H1C', 'H1E', 'H1G', 'H1H', 'H1J', 'H1K', 'H1L', 'H1M', 'H1N', 'H1P', 'H1R', 
                    'H1S', 'H1T', 'H1V', 'H1W', 'H1X', 'H1Y', 'H1Z', 'H2A', 'H2B', 'H2C', 'H2E', 'H2G', 'H2H', 'H2J', 
                    'H2K', 'H2L', 'H2M', 'H2N', 'H2P', 'H2R', 'H2S', 'H2T', 'H2V', 'H2W', 'H2X', 'H2Y', 'H2Z', 'H3A', 
                    'H3B', 'H3C', 'H3E', 'H3G', 'H3H', 'H3J', 'H3K', 'H3L', 'H3M', 'H3N', 'H3P', 'H3R', 'H3S', 'H3T', 
                    'H3V', 'H3W', 'H3X', 'H3X', 'H3Y', 'H3Z', 'H4A', 'H4B', 'H4C', 'H4E', 'H4G', 'H4H', 'H4J', 'H4K', 
                    'H4L', 'H4M', 'H4N', 'H4P', 'H4R', 'H4S', 'H4T', 'H4V', 'H4W', 'H4X', 'H4Y', 'H4Z', 'H5A', 'H5B', 
                    'H8N', 'H8P', 'H8R', 'H8R', 'H8S', 'H8T', 'H8Y', 'H8Z', 'H9A', 'H9B', 'H9C', 'H9E', 'H9G', 'H9H', 
                    'H9H', 'H9J', 'H9K', 'H9P', 'H9R', 'H9S', 'H9S', 'H9W', 'H9X'],
            'Neighborhood': ['Pointe-Aux-Trembles', 'Montreal-East', 'Rivière-des-Prairies', 'Rivière-des-Prairies', 
                     'Montreal-Nord', 'Montreal-Nord', 'Anjou', 'Anjou', 'Mercier', 'Mercier', 'Mercier', 'Saint-Leonard', 'Saint-Leonard', 
                     'Saint-Leonard', 'Rosemont', 'Maisonneuve', 'Hochlelaga', 'Rosemount', 'Rosemount', 'Saint-Michel',
                     'Saint-Michel', 'Ahunstic', 'Ahunstic', 'Villeray', 'Petite-Patrie', 'Plateau Mount-Royal', 'Plateau Mount-Royal', 
                     'Centre-Sud', 'Centre-Sud', 'Ahunstic', 'Ahunstic', 'Villeray', 'Villeray', 'Petite-Patrie', 'Plateau Mount-Royal', 
                     'Outremount', 'Plateau Mount-Royal', 'Plateau Mount-Royal', 'Old Montreal', 'Downtown Montreal', 
                     'Downtown Montreal (McGill University)', 'Downtown Montreal', 'Griffin Town (Université de Montréal)', "Nun's Island", 
                     'Downtown Montreal (Concordia University)', 'Downtown Montreal', 'Petite Bourgogne', 'Pointe-Saint-Charles', 
                     'Ahunstic', 'Catierville', 'Parc Extension', 'Mount Royal', 'Mount Royal', 'Côte-des-Neiges', 'Côte-des-Neiges', 
                     'Côte-des-Neiges', 'Côte-des-Neiges', 'Hampstead', 'Côte-Saint-Luc', 'Westmount', 'Westmount', 'Notre-Dame-de-Grâce', 
                     'Notre-Dame-de-Grâce', 'Saint-Henri', 'Ville-Émard', 'Verdun', 'Verdun', 'Cartierville', 'Cartierville', 
                     'Saint-Laurent', 'Saint-Laurent', 'Saint-Laurent', 'Mount Royal', 'Saint-Laurent', 'Saint-Laurent', 'Saint-Laurent', 
                     'Côte-Saint-Luc', 'Côte-Saint-Luc', 'Montreal West', 'Dorval', 'Tour de la Bourse', 'Place Bonaventure', 'Place Desjardins', 
                     'LaSalle', 'LaSalle', 'LaSalle', 'Ville Saint-Pierre', 'Lachine', 'Lachine', 
                     'Pierrefonds-Roxboro', 'Pierrefonds', 'Dollard-des-Ormeaux', 'Dollard-des-Ormeaux', 'Île-Bizard', 'Île-Bizard', 
                     'Dollard-des-Ormeaux', 'Pierrefonds', 'Sainte-Geneviève', 'Kirkland', 'Senneville', 'Dorval', 'Pointe-Claire', 'Dorval', 
                     "L'Île-Dorval", 'Beaconsfield', 'Sainte-Anne-de-Bellevue']}

In [6]:
mtl_df = pd.DataFrame(mtl_data)

print(mtl_df)
print('\n''source: https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_H')

    Postal Code             Neighborhood
0           H1A      Pointe-Aux-Trembles
1           H1B            Montreal-East
2           H1C     Rivière-des-Prairies
3           H1E     Rivière-des-Prairies
4           H1G            Montreal-Nord
..          ...                      ...
101         H9R            Pointe-Claire
102         H9S                   Dorval
103         H9S             L'Île-Dorval
104         H9W             Beaconsfield
105         H9X  Sainte-Anne-de-Bellevue

[106 rows x 2 columns]

source: https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_H


In [7]:
#latitude = lat_lng_coords[0]
#longitude = lat_lng_coords[1]

def get_lat(postal_code):
    lat_lng_coords = None
    # keep querying until the geocoder returns coordinates
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, Montreal, Quebec'.format(postal_code))
        lat_lng_coords = g.latlng
    return lat_lng_coords[0]

def get_lon(postal_code):
    lat_lng_coords = None
    # keep querying until the geocoder returns coordinates
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, Montreal, Quebec'.format(postal_code))
        lat_lng_coords = g.latlng
    return lat_lng_coords[1]

print('latitude retrieved by geocoder for H9R:', get_lat('H9R'))  #test function  
print('longitude retrieved by geocoder for H9R:', get_lon('H9R')) # test function 

latitude retrieved by geocoder for H9R: 45.46047000000004
longitude retrieved by geocoder for H9R: -73.81335999999999
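The two lookup functions above query the geocoder once for latitude and once for longitude, and they retry without an upper bound. A possible alternative, sketched here under stated assumptions, is a single helper that retries a limited number of times and returns both values. `geocode_with_retry` and its parameters are hypothetical names, and the lookup is passed in as a callable so that the example runs without network access.

```python
import time

def geocode_with_retry(lookup, query, max_attempts=5, delay_seconds=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out.

    `lookup` is any callable that returns [lat, lng] on success and a falsy
    value (None or an empty list) on failure. Wiring it to the geocoder, for
    example as lambda q: geocoder.arcgis(q).latlng, is an assumption.
    """
    for _ in range(max_attempts):
        coords = lookup(query)
        if coords:
            return coords[0], coords[1]
        time.sleep(delay_seconds)
    return None, None

# Stand-in lookup that fails twice before succeeding (illustration only)
_responses = iter([None, [], [45.46047, -73.81336]])

def fake_lookup(query):
    return next(_responses)

lat, lng = geocode_with_retry(fake_lookup, "H9R, Montreal, Quebec")
print(lat, lng)
```

Returning `(None, None)` after the last attempt leaves missing coordinates visible in the dataframe instead of blocking the notebook on one postal code.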


In [8]:
mtl_df['Latitude'] = mtl_df['Postal Code'].apply(get_lat)
mtl_df['Longitude'] = mtl_df['Postal Code'].apply(get_lon)
mtl_df.head()

Unnamed: 0,Postal Code,Neighborhood,Latitude,Longitude
0,H1A,Pointe-Aux-Trembles,45.67415,-73.50059
1,H1B,Montreal-East,45.62939,-73.52003
2,H1C,Rivière-des-Prairies,45.66019,-73.54076
3,H1E,Rivière-des-Prairies,45.63678,-73.58602
4,H1G,Montreal-Nord,45.61155,-73.62116


## Data Analysis: Toronto

In [9]:
#import libraries 
#!conda install -c conda-forge folium=0.5.0 --yes
import folium 

import matplotlib.pyplot as plt
%matplotlib inline 

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

import json 
from pandas import json_normalize  # top-level import; the pandas.io.json path is deprecated

import requests 

#!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim

print('Libraries imported!')

Libraries imported!


In [10]:
address_toronto = 'Toronto, ON'

geolocator = Nominatim(user_agent="to_explorer")
location_to = geolocator.geocode(address_toronto)
latitude_to = location_to.latitude
longitude_to = location_to.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude_to, longitude_to))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


In [11]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude_to, longitude_to], zoom_start=11)

# add markers to map
for lat, lng, neighborhood in zip(toronto_df['Latitude'], toronto_df['Longitude'], toronto_df['Neighbourhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

In [12]:
CLIENT_ID = 'COUBGACZXHHSPKZBVRX4TFY0X1HD5QERNPHXZDTESKXQP1V1' # Foursquare ID
CLIENT_SECRET = '5XIW5Y1Z5LJTZYWMK1C3KXEDPYT10EUJFPEVWMTA3IUBP0YR' # Foursquare Secret
VERSION = '20200206' # Foursquare API version - I opted for a date before COVID-19

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: COUBGACZXHHSPKZBVRX4TFY0X1HD5QERNPHXZDTESKXQP1V1
CLIENT_SECRET:5XIW5Y1Z5LJTZYWMK1C3KXEDPYT10EUJFPEVWMTA3IUBP0YR
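Credentials written into a notebook travel with every shared copy of it. One common alternative, sketched here, is to read them from environment variables at run time. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are assumptions made for this example, not names defined by Foursquare or by this notebook.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the client ID and secret from a mapping of environment variables.

    The variable names used here are assumptions for this sketch.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(
            "Set {} before running the notebook".format(missing)
        ) from None

# Demonstration with a stand-in mapping instead of the real environment
demo_env = {
    "FOURSQUARE_CLIENT_ID": "my-id",
    "FOURSQUARE_CLIENT_SECRET": "my-secret",
}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID loaded:", bool(client_id))
```

Printing only whether a value was loaded, rather than the value itself, keeps the secret out of the saved cell output as well.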


In [13]:
LIMIT = 100 #top 100 venues 
radius = 1000 #1000 meters 
neighborhood_latitude = toronto_df.loc[0, 'Latitude']
neighborhood_longitude = toronto_df.loc[0, 'Longitude'] 

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)

url #let's see what it looks like

'https://api.foursquare.com/v2/venues/explore?&client_id=COUBGACZXHHSPKZBVRX4TFY0X1HD5QERNPHXZDTESKXQP1V1&client_secret=5XIW5Y1Z5LJTZYWMK1C3KXEDPYT10EUJFPEVWMTA3IUBP0YR&v=20200206&ll=43.7532586,-79.3296565&radius=1000&limit=100'
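The request URL is assembled with one long format string. As a sketch of an alternative, the query string can be built from a dictionary with the standard library's `urlencode`, which handles separators and escaping. `build_explore_url` is a hypothetical helper, and the ID and secret below are placeholders.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the notebook's own request URL
EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Return the explore URL with its query string built from a dictionary."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return EXPLORE_ENDPOINT + "?" + urlencode(params)

demo_url = build_explore_url("ID", "SECRET", "20200206", 43.7532586, -79.3296565, 1000, 100)
print(demo_url)

# Round-trip check: parse the query string back into its parameters
parsed = parse_qs(urlsplit(demo_url).query)
print(parsed["ll"], parsed["radius"], parsed["limit"])
```

With `requests`, the same dictionary can instead be passed as `requests.get(EXPLORE_ENDPOINT, params=params)`, which avoids building the string by hand.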

In [14]:
results = requests.get(url).json()

In [15]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [16]:
venues = results['response']['groups'][0]['items'] #first neighborhood in toronto_df
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

print('{} venues were returned by Foursquare in Parkwoods.'.format(nearby_venues.shape[0]))
nearby_venues.head()


28 venues were returned by Foursquare in Parkwoods.


Unnamed: 0,name,categories,lat,lng
0,Allwyn's Bakery,Caribbean Restaurant,43.75984,-79.324719
1,Brookbanks Park,Park,43.751976,-79.33214
2,Tim Hortons,Café,43.760668,-79.326368
3,A&W,Fast Food Restaurant,43.760643,-79.326865
4,Bruno's valu-mart,Grocery Store,43.746143,-79.32463


In [17]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
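`getNearbyVenues` indexes `categories[0]` for every venue, which would raise an `IndexError` for a venue that has an empty category list. The sketch below shows one way the row extraction could tolerate that case. `venue_rows` is a hypothetical helper, and the sample items are hand-written in the shape the notebook indexes into, not real API output.

```python
def venue_rows(name, lat, lng, items):
    """Flatten explore-style items into rows, tolerating venues without categories."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((
            name, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows

# Hand-written items for illustration only
sample_items = [
    {"venue": {"name": "Brookbanks Park",
               "location": {"lat": 43.751976, "lng": -79.33214},
               "categories": [{"name": "Park"}]}},
    {"venue": {"name": "Unlabelled venue",
               "location": {"lat": 43.75, "lng": -79.33},
               "categories": []}},
]

rows = venue_rows("Parkwoods", 43.753259, -79.329656, sample_items)
for row in rows:
    print(row)
```

Rows with a `None` category can then be dropped or kept explicitly before one-hot encoding, instead of stopping the whole loop.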

In [18]:
toronto_venues = getNearbyVenues(names=toronto_df['Neighbourhood'],
                                 latitudes= toronto_df['Latitude'],
                                 longitudes= toronto_df['Longitude']
                                 )
toronto_venues = pd.DataFrame(toronto_venues)

Parkwoods
Victoria Village
Regent Park, Harbourfront
Lawrence Manor, Lawrence Heights
Queen's Park, Ontario Provincial Government
Islington Avenue, Humber Valley Village
Malvern, Rouge
Don Mills
Parkview Hill, Woodbine Gardens
Garden District, Ryerson
Glencairn
West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale
Rouge Hill, Port Union, Highland Creek
Don Mills
Woodbine Heights
St. James Town
Humewood-Cedarvale
Eringate, Bloordale Gardens, Old Burnhamthorpe, Markland Wood
Guildwood, Morningside, West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor, Wilson Heights, Downsview North
Thorncliffe Park
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Scarborough Village
Fairview, Henry Farm, Oriole
Northwood Park, York University
East Toronto, Broadview North (Old East York)
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
Kennedy Park, Ionview, East Birchmo

In [19]:
toronto_grouped = toronto_venues.groupby('Neighborhood').count()
toronto_grouped.head()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Agincourt,5,5,5,5,5,5
"Alderwood, Long Branch",9,9,9,9,9,9
"Bathurst Manor, Wilson Heights, Downsview North",22,22,22,22,22,22
Bayview Village,4,4,4,4,4,4
"Bedford Park, Lawrence Manor East",25,25,25,25,25,25


In [20]:
# one hot encoding
toronto_dummies = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_dummies['Neighborhood'] = toronto_venues['Neighborhood']

# move neighborhood column to the first column
fixed_columns = [toronto_dummies.columns[-1]] + list(toronto_dummies.columns[:-1])
toronto_dummies = toronto_dummies[fixed_columns]
toronto_dummies.head()

Unnamed: 0,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [21]:
#group by neighborhood
toronto_grouped = toronto_dummies.groupby('Neighborhood').mean().reset_index()
print(toronto_grouped.shape)

(96, 270)


In [22]:
toronto_grouped.head()

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Alderwood, Long Branch",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [23]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [24]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
toronto_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
toronto_neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    toronto_neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

toronto_neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Lounge,Skating Rink,Latin American Restaurant,Breakfast Spot,Clothing Store,Drugstore,Discount Store,Distribution Center,Dog Run,Doner Restaurant
1,"Alderwood, Long Branch",Pizza Place,Coffee Shop,Sandwich Place,Athletics & Sports,Pub,Pool,Skating Rink,Gym,Concert Hall,Department Store
2,"Bathurst Manor, Wilson Heights, Downsview North",Bank,Coffee Shop,Park,Deli / Bodega,Supermarket,Middle Eastern Restaurant,Sushi Restaurant,Ice Cream Shop,Shopping Mall,Mobile Phone Shop
3,Bayview Village,Café,Bank,Chinese Restaurant,Japanese Restaurant,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant
4,"Bedford Park, Lawrence Manor East",Italian Restaurant,Coffee Shop,Thai Restaurant,Sandwich Place,Restaurant,Juice Bar,Butcher,Café,Indian Restaurant,Pub
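The column labels above rely on a `try`/`except` around a three-item suffix list, which works for ranks 1 to 10 but would label rank 21 as "21th". A small ordinal helper, sketched here with the hypothetical name `ordinal`, produces the same labels and also handles larger ranks.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)

num_top_venues = 10
columns = ["Neighborhood"] + [
    "{} Most Common Venue".format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```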


### 5 Clusters

In [25]:
toronto_cluster_5 = toronto_neighborhoods_venues_sorted.copy(deep = True)
toronto_cluster_5.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Lounge,Skating Rink,Latin American Restaurant,Breakfast Spot,Clothing Store,Drugstore,Discount Store,Distribution Center,Dog Run,Doner Restaurant
1,"Alderwood, Long Branch",Pizza Place,Coffee Shop,Sandwich Place,Athletics & Sports,Pub,Pool,Skating Rink,Gym,Concert Hall,Department Store
2,"Bathurst Manor, Wilson Heights, Downsview North",Bank,Coffee Shop,Park,Deli / Bodega,Supermarket,Middle Eastern Restaurant,Sushi Restaurant,Ice Cream Shop,Shopping Mall,Mobile Phone Shop
3,Bayview Village,Café,Bank,Chinese Restaurant,Japanese Restaurant,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant
4,"Bedford Park, Lawrence Manor East",Italian Restaurant,Coffee Shop,Thai Restaurant,Sandwich Place,Restaurant,Juice Bar,Butcher,Café,Indian Restaurant,Pub


In [26]:
# set number of clusters
kclusters5 = 5

toronto_grouped_clustering_5 = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters5, random_state=0).fit(toronto_grouped_clustering_5)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
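The first ten labels all fall in cluster 1, so this slice alone does not show how the neighborhoods are spread across clusters. A quick count of members per cluster can do that. The sketch below uses made-up labels and a hypothetical helper `cluster_sizes`; in the notebook, the input would be something like `kmeans.labels_.tolist()`.

```python
from collections import Counter

def cluster_sizes(labels):
    """Return {cluster label: number of members}, ordered by label."""
    return dict(sorted(Counter(labels).items()))

# Made-up labels for illustration only
example_labels = [1, 1, 1, 0, 1, 2, 1, 1, 3, 1, 4, 1]
sizes = cluster_sizes(example_labels)
print(sizes)
```

If one cluster holds most neighborhoods and the others hold one or two each, that pattern is worth keeping in mind when reading the maps.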

In [27]:
# add clustering labels
toronto_cluster_5.insert(0,'Cluster Labels', kmeans.labels_)

In [28]:
toronto_merged_5 = toronto_df

# join the cluster labels and top venues onto toronto_df, which holds latitude/longitude for each neighborhood
toronto_merged_5 = toronto_merged_5.join(toronto_cluster_5.set_index('Neighborhood'), on='Neighbourhood')

toronto_merged_5.head() 

Unnamed: 0,Postal Code,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,Parkwoods,43.753259,-79.329656,0.0,Food & Drink Shop,Park,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore
1,M4A,Victoria Village,43.725882,-79.315572,1.0,French Restaurant,Coffee Shop,Hockey Arena,Portuguese Restaurant,Women's Store,Doner Restaurant,Dim Sum Restaurant,Diner,Discount Store,Distribution Center
2,M5A,"Regent Park, Harbourfront",43.65426,-79.360636,1.0,Coffee Shop,Park,Café,Pub,Bakery,Breakfast Spot,Theater,Ice Cream Shop,Chocolate Shop,Spa
3,M6A,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,1.0,Furniture / Home Store,Clothing Store,Vietnamese Restaurant,Coffee Shop,Boutique,Event Space,Accessories Store,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
4,M7A,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1.0,Coffee Shop,College Cafeteria,Sushi Restaurant,Bar,Beer Bar,Smoothie Shop,Sandwich Place,Café,Portuguese Restaurant,Persian Restaurant


In [29]:
toronto_merged_5['Cluster Labels']=toronto_merged_5['Cluster Labels'].fillna(0).astype('int')

In [30]:
# create map
to_clusters_5 = folium.Map(location=[latitude_to, longitude_to], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters5)
ys = [i + x + (i*x)**2 for i in range(kclusters5)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged_5['Latitude'], toronto_merged_5['Longitude'], toronto_merged_5['Neighbourhood'], toronto_merged_5['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(to_clusters_5)
       
to_clusters_5

### 10 clusters 

In [31]:
toronto_cluster_10 = toronto_neighborhoods_venues_sorted.copy(deep = True)
toronto_cluster_10.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Lounge,Skating Rink,Latin American Restaurant,Breakfast Spot,Clothing Store,Drugstore,Discount Store,Distribution Center,Dog Run,Doner Restaurant
1,"Alderwood, Long Branch",Pizza Place,Coffee Shop,Sandwich Place,Athletics & Sports,Pub,Pool,Skating Rink,Gym,Concert Hall,Department Store
2,"Bathurst Manor, Wilson Heights, Downsview North",Bank,Coffee Shop,Park,Deli / Bodega,Supermarket,Middle Eastern Restaurant,Sushi Restaurant,Ice Cream Shop,Shopping Mall,Mobile Phone Shop
3,Bayview Village,Café,Bank,Chinese Restaurant,Japanese Restaurant,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant
4,"Bedford Park, Lawrence Manor East",Italian Restaurant,Coffee Shop,Thai Restaurant,Sandwich Place,Restaurant,Juice Bar,Butcher,Café,Indian Restaurant,Pub


In [32]:
# set number of clusters
kclusters10 = 10

toronto_grouped_clustering10 = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans10 = KMeans(n_clusters=kclusters10, random_state=0).fit(toronto_grouped_clustering10)

# check cluster labels generated for each row in the dataframe
kmeans10.labels_[0:10]

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)

In [33]:
# add clustering labels
toronto_cluster_10.insert(0,'Cluster Labels', kmeans10.labels_)

In [34]:
toronto_merged10 = toronto_df

# join the cluster labels and top venues onto toronto_df, which holds latitude/longitude for each neighborhood
toronto_merged10 = toronto_merged10.join(toronto_cluster_10.set_index('Neighborhood'), on='Neighbourhood')

toronto_merged10.head() 

Unnamed: 0,Postal Code,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,Parkwoods,43.753259,-79.329656,4.0,Food & Drink Shop,Park,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore
1,M4A,Victoria Village,43.725882,-79.315572,1.0,French Restaurant,Coffee Shop,Hockey Arena,Portuguese Restaurant,Women's Store,Doner Restaurant,Dim Sum Restaurant,Diner,Discount Store,Distribution Center
2,M5A,"Regent Park, Harbourfront",43.65426,-79.360636,1.0,Coffee Shop,Park,Café,Pub,Bakery,Breakfast Spot,Theater,Ice Cream Shop,Chocolate Shop,Spa
3,M6A,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,1.0,Furniture / Home Store,Clothing Store,Vietnamese Restaurant,Coffee Shop,Boutique,Event Space,Accessories Store,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
4,M7A,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1.0,Coffee Shop,College Cafeteria,Sushi Restaurant,Bar,Beer Bar,Smoothie Shop,Sandwich Place,Café,Portuguese Restaurant,Persian Restaurant


In [35]:
toronto_merged10['Cluster Labels']=toronto_merged10['Cluster Labels'].fillna(0).astype('int')

In [36]:
# create map
to_clusters_10 = folium.Map(location=[latitude_to, longitude_to], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters10)
ys = [i + x + (i*x)**2 for i in range(kclusters10)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged10['Latitude'], toronto_merged10['Longitude'], toronto_merged10['Neighbourhood'], toronto_merged10['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(to_clusters_10)
       
to_clusters_10

## Data Analysis: Montreal 

In [37]:
address_mtl = 'Montreal, QC'

geolocator = Nominatim(user_agent="mtl_explorer")
location_mtl = geolocator.geocode(address_mtl)
latitude_mtl = location_mtl.latitude
longitude_mtl = location_mtl.longitude
print('The geographical coordinates of Montreal are {}, {}.'.format(latitude_mtl, longitude_mtl))

The geographical coordinates of Montreal are 45.4972159, -73.6103642.


In [38]:
# create map of Montreal using latitude and longitude values
map_montreal = folium.Map(location=[latitude_mtl, longitude_mtl], zoom_start=11)

# add markers to map
for lat, lng, neighborhood in zip(mtl_df['Latitude'], mtl_df['Longitude'], mtl_df['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_montreal)  
    
map_montreal

In [39]:
LIMIT = 100 #top 100 venues 
radius = 1000 #1000 meters 
neighborhood_latitude = mtl_df.loc[0, 'Latitude']
neighborhood_longitude = mtl_df.loc[0, 'Longitude'] 

url_mtl = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)

url_mtl #let's see what it looks like

'https://api.foursquare.com/v2/venues/explore?&client_id=COUBGACZXHHSPKZBVRX4TFY0X1HD5QERNPHXZDTESKXQP1V1&client_secret=5XIW5Y1Z5LJTZYWMK1C3KXEDPYT10EUJFPEVWMTA3IUBP0YR&v=20200206&ll=45.674150000000054,-73.50058999999999&radius=1000&limit=100'

In [40]:
results_mtl = requests.get(url_mtl).json()

In [95]:
venues = results_mtl['response']['groups'][0]['items'] #first neighborhood in mtl_df
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

print('{} venues were returned by Foursquare in Pointe-Aux-Trembles.'.format(nearby_venues.shape[0]))
nearby_venues.head()

5 venues were returned by Foursquare in Pointe-Aux-Trembles.



Unnamed: 0,name,categories,lat,lng
0,Parc-nature de la Pointe-aux-Prairies,Park,45.678834,-73.501162
1,Dépanneur Mario,Convenience Store,45.671119,-73.496001
2,Proxim,Pharmacy,45.668749,-73.499512
3,Metro Plus De La Rousselière,Supermarket,45.669364,-73.506622
4,Parc Clémentine De La Rousselière,Baseball Field,45.667863,-73.494016


In [42]:
montreal_venues = getNearbyVenues(names=mtl_df['Neighborhood'],
                                 latitudes= mtl_df['Latitude'],
                                 longitudes= mtl_df['Longitude']
                                 )
montreal_venues = pd.DataFrame(montreal_venues)

Pointe-Aux-Trembles
Montreal-East
Rivière-des-Prairies
Rivière-des-Prairies
Montreal-Nord
Montreal-Nord
Anjou
Anjou
Mercier
Mercier
Mercier
Saint-Leonard
Saint-Leonard
Saint-Leonard
Rosemont
Maisonneuve
Hochlelaga
Rosemount
Rosemount
Saint-Michel
Saint-Michel
Ahunstic
Ahunstic
Villeray
Petite-Patrie
Plateau Mount-Royal
Plateau Mount-Royal
Centre-Sud
Centre-Sud
Ahunstic
Ahunstic
Villeray
Villeray
Petite-Patrie
Plateau Mount-Royal
Outremount
Plateau Mount-Royal
Plateau Mount-Royal
Old Montreal
Downtown Montreal
Downtown Montreal (McGill University)
Downtown Montreal
Griffin Town (Université de Montréal)
Nun's Island
Downtown Montreal (Concordia University)
Downtown Montreal
Petite Bourgogne
Pointe-Saint-Charles
Ahunstic
Catierville
Parc Extension
Mount Royal
Mount Royal
Côte-des-Neiges
Côte-des-Neiges
Côte-des-Neiges
Côte-des-Neiges
Hampstead
Côte-Saint-Luc
Westmount
Westmount
Notre-Dame-de-Grâce
Notre-Dame-de-Grâce
Saint-Henri
Ville-Émard
Verdun
Verdun
Cartierville
Cartierville
Saint-La

In [43]:
montreal_venues = pd.DataFrame(montreal_venues)
print(montreal_venues.shape)

(1755, 7)


In [44]:
montreal_grouped = montreal_venues.groupby('Neighborhood').count()
montreal_grouped.head()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Ahunstic,47,47,47,47,47,47
Anjou,6,6,6,6,6,6
Beaconsfield,1,1,1,1,1,1
Cartierville,10,10,10,10,10,10
Catierville,9,9,9,9,9,9
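
The grouped counts above list `Cartierville` and `Catierville` as separate neighborhoods, and the fetch log earlier shows both `Rosemont` and `Rosemount`. If these are spelling variants of the same place (an assumption, since the source table is not shown here), they could be merged before grouping. A minimal sketch on toy data, where `NAME_FIXES` and `normalize_neighborhoods` are hypothetical names introduced only for illustration:

```python
import pandas as pd

# Hypothetical correction map: keys are spellings seen in the output above,
# values are the names they are ASSUMED to stand for.
NAME_FIXES = {
    "Catierville": "Cartierville",
    "Rosemount": "Rosemont",
}

def normalize_neighborhoods(df, column="Neighborhood", fixes=NAME_FIXES):
    """Return a copy of df with known spelling variants mapped to one name."""
    out = df.copy()
    # strip stray whitespace first, then replace exact matches only
    out[column] = out[column].str.strip().replace(fixes)
    return out

# Toy data standing in for montreal_venues
toy = pd.DataFrame({
    "Neighborhood": ["Cartierville", "Catierville", "Rosemont", "Rosemount "],
    "Venue": ["Park", "Bank", "Café", "Bakery"],
})
counts = normalize_neighborhoods(toy).groupby("Neighborhood")["Venue"].count()
print(counts)
```

Merging names pools the venues of every postal code that shares the name, so this is only appropriate when the variants really refer to one neighborhood.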


In [45]:
montreal_dummies = pd.get_dummies(montreal_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
montreal_dummies['Neighborhood'] = montreal_venues['Neighborhood']

fixed_columns = [montreal_dummies.columns[-1]] + list(montreal_dummies.columns[:-1])
montreal_dummies = montreal_dummies[fixed_columns]
montreal_dummies.head()

Unnamed: 0,Zoo,ATM,Accessories Store,Afghan Restaurant,Airport,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Arepa Restaurant,...,Tunnel,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Warehouse Store,Water Park,Whisky Bar,Wine Bar,Women's Store,Yoga Studio
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [46]:
montreal_grouped = montreal_dummies.groupby('Neighborhood').mean().reset_index()
print(montreal_grouped.shape)

(57, 270)
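
Each value in `montreal_grouped` is the share of a neighborhood's venues that fall in a given category, because the mean of a 0/1 indicator column is a frequency. A small sketch of the same two steps on made-up data (the neighborhoods `A` and `B` and their venues are illustrative only):

```python
import pandas as pd

# Toy venue list: neighborhood A has 4 venues, neighborhood B has 2
toy_venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Park", "Park", "Café", "Pharmacy", "Café", "Café"],
})

# One indicator column per category (dtype=float keeps the arithmetic explicit)
dummies = pd.get_dummies(
    toy_venues[["Venue Category"]], prefix="", prefix_sep="", dtype=float
)
dummies.insert(0, "Neighborhood", toy_venues["Neighborhood"])

# The mean of the indicators is the frequency of each category per neighborhood
freq = dummies.groupby("Neighborhood").mean()
print(freq)
```

Every row of `freq` sums to 1, so a neighborhood with a single venue (such as Beaconsfield in the counts above) ends up with one category at 1.0 and all others at 0.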


In [47]:
montreal_grouped.head()

Unnamed: 0,Neighborhood,Zoo,ATM,Accessories Store,Afghan Restaurant,Airport,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Tunnel,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Warehouse Store,Water Park,Whisky Bar,Wine Bar,Women's Store,Yoga Studio
0,Ahunstic,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0
1,Anjou,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Beaconsfield,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Cartierville,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Catierville,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [48]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']

for ind in np.arange(num_top_venues):
    try:
        # 1st, 2nd and 3rd use the explicit suffixes
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # 4th onward falls back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
mtl_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
mtl_neighborhoods_venues_sorted['Neighborhood'] = montreal_grouped['Neighborhood']

for ind in np.arange(montreal_grouped.shape[0]):
    mtl_neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(montreal_grouped.iloc[ind, :], num_top_venues)

mtl_neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Ahunstic,Pharmacy,Café,Breakfast Spot,Clothing Store,Restaurant,Italian Restaurant,Ice Cream Shop,Pizza Place,Grocery Store,Bakery
1,Anjou,Hockey Arena,Pizza Place,Gym,American Restaurant,BBQ Joint,Convenience Store,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space
2,Beaconsfield,Soccer Field,Yoga Studio,English Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant,Escape Room
3,Cartierville,Park,Playground,Pool,Soccer Field,Scenic Lookout,Coffee Shop,Convenience Store,Restaurant,Supermarket,Dog Run
4,Catierville,Discount Store,Shopping Mall,Bank,Asian Restaurant,Grocery Store,Gym,Gas Station,Pharmacy,Bookstore,Escape Room


### 5 Clusters 

In [49]:
mtl_cluster_5 = mtl_neighborhoods_venues_sorted.copy(deep = True)
mtl_cluster_5.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Ahunstic,Pharmacy,Café,Breakfast Spot,Clothing Store,Restaurant,Italian Restaurant,Ice Cream Shop,Pizza Place,Grocery Store,Bakery
1,Anjou,Hockey Arena,Pizza Place,Gym,American Restaurant,BBQ Joint,Convenience Store,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space
2,Beaconsfield,Soccer Field,Yoga Studio,English Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant,Escape Room
3,Cartierville,Park,Playground,Pool,Soccer Field,Scenic Lookout,Coffee Shop,Convenience Store,Restaurant,Supermarket,Dog Run
4,Catierville,Discount Store,Shopping Mall,Bank,Asian Restaurant,Grocery Store,Gym,Gas Station,Pharmacy,Bookstore,Escape Room


In [50]:
# set number of clusters
kclusters5 = 5

montreal_grouped_clustering_5 = montreal_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans_mtl_5 = KMeans(n_clusters=kclusters5, random_state=0).fit(montreal_grouped_clustering_5)

# check cluster labels generated for each row in the dataframe
kmeans_mtl_5.labels_[0:10]

array([0, 0, 2, 0, 0, 0, 4, 0, 0, 0], dtype=int32)

In [51]:
# add clustering labels
mtl_cluster_5.insert(0,'Cluster Labels', kmeans_mtl_5.labels_)

In [52]:
montreal_merged5 = mtl_df

# join the cluster labels and top venues onto mtl_df to add latitude/longitude for each neighborhood
montreal_merged5 = montreal_merged5.join(mtl_cluster_5.set_index('Neighborhood'), on='Neighborhood')

montreal_merged5.head() 

Unnamed: 0,Postal Code,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,H1A,Pointe-Aux-Trembles,45.67415,-73.50059,0.0,Convenience Store,Train Station,Yoga Studio,Empanada Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,English Restaurant
1,H1B,Montreal-East,45.62939,-73.52003,1.0,Business Service,Yoga Studio,Dog Run,Fish & Chips Shop,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space,Escape Room,English Restaurant
2,H1C,Rivière-des-Prairies,45.66019,-73.54076,0.0,Pharmacy,Train Station,Convenience Store,Grocery Store,Italian Restaurant,Yoga Studio,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop
3,H1E,Rivière-des-Prairies,45.63678,-73.58602,0.0,Pharmacy,Train Station,Convenience Store,Grocery Store,Italian Restaurant,Yoga Studio,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop
4,H1G,Montreal-Nord,45.61155,-73.62116,0.0,Fast Food Restaurant,Ice Cream Shop,Sandwich Place,Bank,Gas Station,Paper / Office Supplies Store,Restaurant,Gym,Hockey Arena,Gastropub


In [53]:
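# neighborhoods with no venue data get no label from the join; fillna(0) places them in cluster 0
montreal_merged5['Cluster Labels'] = montreal_merged5['Cluster Labels'].fillna(0).astype('int')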
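
Filling missing labels with 0 makes neighborhoods without any venue data indistinguishable from genuine members of cluster 0. One possible alternative, sketched here on toy data, is to use a sentinel value instead. The name `NO_DATA` and the value -1 are assumptions chosen for illustration, not part of the original notebook:

```python
import pandas as pd

# Toy stand-in for the merged frame: N2 had no venues, so its label is missing
merged = pd.DataFrame({
    "Neighborhood": ["N1", "N2", "N3"],
    "Cluster Labels": [0.0, None, 2.0],
})

NO_DATA = -1  # hypothetical sentinel for "no venues returned"
merged["Cluster Labels"] = merged["Cluster Labels"].fillna(NO_DATA).astype(int)

# Keep only neighborhoods that were actually clustered
has_data = merged[merged["Cluster Labels"] != NO_DATA]
print(has_data)
```

If this route is taken, the sentinel rows would need to be filtered out (or given their own color) before plotting, since the map code indexes the color list by cluster label.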

In [54]:
# create map
mtl_clusters_5 = folium.Map(location=[latitude_mtl, longitude_mtl], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters5)
ys = [i + x + (i*x)**2 for i in range(kclusters5)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(montreal_merged5['Latitude'], montreal_merged5['Longitude'], montreal_merged5['Neighborhood'], montreal_merged5['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(mtl_clusters_5)
       
mtl_clusters_5
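
The markers are colored with `rainbow[cluster-1]`, so cluster 0 looks up index -1, which in Python is the last entry of the list. A minimal sketch of that lookup, using color names in place of hex codes. The order of `palette` is an assumption inferred from the cluster headings used later in this notebook for k = 5, and `marker_color` is a hypothetical helper:

```python
# Color names standing in for the hex codes in `rainbow` (assumed order for k = 5)
palette = ["purple", "blue", "green", "orange", "red"]

def marker_color(cluster, palette):
    """Mimic the notebook's lookup: index cluster-1, so cluster 0 wraps to the end."""
    return palette[cluster - 1]

mapping = {c: marker_color(c, palette) for c in range(len(palette))}
print(mapping)
```

Under that assumption, cluster 0 is drawn in the last color and every other cluster is shifted down by one position, which is consistent with the first cluster being described as red and the second as purple.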

### 10 Clusters 

In [55]:
mtl_cluster_10 = mtl_neighborhoods_venues_sorted.copy(deep = True)
mtl_cluster_10.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Ahunstic,Pharmacy,Café,Breakfast Spot,Clothing Store,Restaurant,Italian Restaurant,Ice Cream Shop,Pizza Place,Grocery Store,Bakery
1,Anjou,Hockey Arena,Pizza Place,Gym,American Restaurant,BBQ Joint,Convenience Store,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space
2,Beaconsfield,Soccer Field,Yoga Studio,English Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant,Escape Room
3,Cartierville,Park,Playground,Pool,Soccer Field,Scenic Lookout,Coffee Shop,Convenience Store,Restaurant,Supermarket,Dog Run
4,Catierville,Discount Store,Shopping Mall,Bank,Asian Restaurant,Grocery Store,Gym,Gas Station,Pharmacy,Bookstore,Escape Room


In [56]:
# set number of clusters
kclusters10 = 10

montreal_grouped_clustering_10 = montreal_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans_mtl_10 = KMeans(n_clusters=kclusters10, random_state=0).fit(montreal_grouped_clustering_10)

# check cluster labels generated for each row in the dataframe
kmeans_mtl_10.labels_[0:10]

array([1, 9, 4, 9, 9, 1, 0, 1, 9, 9], dtype=int32)

In [57]:
# add clustering labels
mtl_cluster_10.insert(0,'Cluster Labels', kmeans_mtl_10.labels_)

In [58]:
montreal_merged10 = mtl_df

# join the cluster labels and top venues onto mtl_df to add latitude/longitude for each neighborhood
montreal_merged10 = montreal_merged10.join(mtl_cluster_10.set_index('Neighborhood'), on='Neighborhood')

montreal_merged10.head() 

Unnamed: 0,Postal Code,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,H1A,Pointe-Aux-Trembles,45.67415,-73.50059,2.0,Convenience Store,Train Station,Yoga Studio,Empanada Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,English Restaurant
1,H1B,Montreal-East,45.62939,-73.52003,3.0,Business Service,Yoga Studio,Dog Run,Fish & Chips Shop,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space,Escape Room,English Restaurant
2,H1C,Rivière-des-Prairies,45.66019,-73.54076,1.0,Pharmacy,Train Station,Convenience Store,Grocery Store,Italian Restaurant,Yoga Studio,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop
3,H1E,Rivière-des-Prairies,45.63678,-73.58602,1.0,Pharmacy,Train Station,Convenience Store,Grocery Store,Italian Restaurant,Yoga Studio,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop
4,H1G,Montreal-Nord,45.61155,-73.62116,9.0,Fast Food Restaurant,Ice Cream Shop,Sandwich Place,Bank,Gas Station,Paper / Office Supplies Store,Restaurant,Gym,Hockey Arena,Gastropub


In [59]:
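# neighborhoods with no venue data get no label from the join; fillna(0) places them in cluster 0
montreal_merged10['Cluster Labels'] = montreal_merged10['Cluster Labels'].fillna(0).astype('int')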

In [60]:
# create map
mtl_clusters_10 = folium.Map(location=[latitude_mtl, longitude_mtl], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters10)
ys = [i + x + (i*x)**2 for i in range(kclusters10)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(montreal_merged10['Latitude'], montreal_merged10['Longitude'], montreal_merged10['Neighborhood'], montreal_merged10['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(mtl_clusters_10)
       
mtl_clusters_10
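
With both labelings available, a cross-tabulation is one way to see how the k = 5 clusters split when k = 10. The sketch below uses invented labels. In the notebook, the two inputs would presumably be the `Cluster Labels` columns of `montreal_merged5` and `montreal_merged10`, assuming both frames share the same row index:

```python
import pandas as pd

# Invented labels for six neighborhoods under the two settings
labels = pd.DataFrame({
    "Neighborhood": list("ABCDEF"),
    "k5": [0, 0, 0, 1, 2, 2],
    "k10": [1, 9, 9, 3, 4, 4],
})

# Rows: k = 5 cluster, columns: k = 10 cluster, cells: number of neighborhoods
table = pd.crosstab(labels["k5"], labels["k10"])
print(table)
```

A row with several non-zero cells indicates a k = 5 cluster that is broken into smaller groups at k = 10.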

## Examining Clusters: Toronto

Recall the maps with markers:

In [61]:
print('Toronto Map with k = 5', '\n')
to_clusters_5

Toronto Map with k = 5 



In [62]:
print('Toronto Map with k = 10', '\n')
to_clusters_10

Toronto Map with k = 10 



### First Cluster: Red Markers

#### When k = 5

In [63]:
to_cluster_k5_0 = pd.DataFrame(toronto_merged_5.loc[toronto_merged_5['Cluster Labels'] == 0, toronto_merged_5.columns[[1] + list(range(5, toronto_merged_5.shape[1]))]])
to_cluster_k5_0

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Parkwoods,Food & Drink Shop,Park,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore
5,"Islington Avenue, Humber Valley Village",,,,,,,,,,
14,Woodbine Heights,Park,Beer Store,Skating Rink,Curling Ice,Doner Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Drugstore
16,Humewood-Cedarvale,Park,Hockey Arena,Trail,Field,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Donut Shop
21,Caledonia-Fairbanks,Park,Women's Store,Pool,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run
35,"East Toronto, Broadview North (Old East York)",Intersection,Park,Convenience Store,Women's Store,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant
45,"York Mills, Silver Hills",,,,,,,,,,
49,"North Park, Maple Leaf Park, Upwood Park",Park,Construction & Landscaping,Bakery,Trail,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Ethiopian Restaurant,Drugstore,Dessert Shop
52,"Willowdale, Newtonbrook",Park,Piano Bar,Women's Store,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run
61,Lawrence Park,Park,Swim School,Bus Line,Dog Run,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Doner Restaurant,Falafel Restaurant
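
The selection used for each cluster table follows the same pattern: filter on `Cluster Labels`, then keep the name column (position 1) and the venue columns (position 5 onward). A sketch of how that could be wrapped in a helper. `cluster_members` is a hypothetical name and the toy frame only mimics the column layout of the merged frames:

```python
import pandas as pd

def cluster_members(merged, label, name_col=1, first_venue_col=5):
    """Rows of `merged` in cluster `label`, keeping the name and venue columns."""
    cols = [name_col] + list(range(first_venue_col, merged.shape[1]))
    return merged.loc[merged["Cluster Labels"] == label, merged.columns[cols]]

# Toy frame with the same column order as the merged frames
toy = pd.DataFrame({
    "Postal Code": ["X1", "X2", "X3"],
    "Neighbourhood": ["A", "B", "C"],
    "Latitude": [0.0, 0.1, 0.2],
    "Longitude": [1.0, 1.1, 1.2],
    "Cluster Labels": [0, 1, 0],
    "1st Most Common Venue": ["Park", "Café", "Park"],
})

# One table per cluster label, keyed by plain Python ints
tables = {int(k): cluster_members(toy, k) for k in sorted(toy["Cluster Labels"].unique())}
print(tables[0])
```

The same helper would apply to either value of k, since only the frame passed in changes.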


#### When k = 10

In [64]:
to_cluster_k10_0 = pd.DataFrame(toronto_merged10.loc[toronto_merged10['Cluster Labels'] == 0, toronto_merged10.columns[[1] + list(range(5, toronto_merged10.shape[1]))]])
to_cluster_k10_0

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,"Islington Avenue, Humber Valley Village",,,,,,,,,,
26,Cedarbrae,Hakka Restaurant,Lounge,Bakery,Athletics & Sports,Bank,Caribbean Restaurant,Gas Station,Thai Restaurant,Fried Chicken Joint,Dog Run
45,"York Mills, Silver Hills",,,,,,,,,,
95,Upper Rouge,,,,,,,,,,


### Second Cluster: Purple Markers

#### When k = 5

In [65]:
to_cluster_k5_1 = pd.DataFrame(toronto_merged_5.loc[toronto_merged_5['Cluster Labels'] == 1, toronto_merged_5.columns[[1] + list(range(5, toronto_merged_5.shape[1]))]])
pd.set_option('display.max_rows', None)
to_cluster_k5_1


Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Victoria Village,French Restaurant,Coffee Shop,Hockey Arena,Portuguese Restaurant,Women's Store,Doner Restaurant,Dim Sum Restaurant,Diner,Discount Store,Distribution Center
2,"Regent Park, Harbourfront",Coffee Shop,Park,Café,Pub,Bakery,Breakfast Spot,Theater,Ice Cream Shop,Chocolate Shop,Spa
3,"Lawrence Manor, Lawrence Heights",Furniture / Home Store,Clothing Store,Vietnamese Restaurant,Coffee Shop,Boutique,Event Space,Accessories Store,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
4,"Queen's Park, Ontario Provincial Government",Coffee Shop,College Cafeteria,Sushi Restaurant,Bar,Beer Bar,Smoothie Shop,Sandwich Place,Café,Portuguese Restaurant,Persian Restaurant
7,Don Mills,Gym,Coffee Shop,Clothing Store,Japanese Restaurant,Beer Store,Restaurant,Chinese Restaurant,Supermarket,Discount Store,Café
8,"Parkview Hill, Woodbine Gardens",Pizza Place,Breakfast Spot,Bank,Athletics & Sports,Gastropub,Intersection,Pharmacy,Gym / Fitness Center,Eastern European Restaurant,Dumpling Restaurant
9,"Garden District, Ryerson",Clothing Store,Coffee Shop,Café,Cosmetics Shop,Japanese Restaurant,Bubble Tea Shop,Ramen Restaurant,Middle Eastern Restaurant,Lingerie Store,Electronics Store
10,Glencairn,Sushi Restaurant,Bakery,Pub,Japanese Restaurant,Women's Store,Dim Sum Restaurant,Discount Store,Distribution Center,Dog Run,Doner Restaurant
11,"West Deane Park, Princess Gardens, Martin Grov...",Home Service,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dessert Shop
12,"Rouge Hill, Port Union, Highland Creek",Bar,Women's Store,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Fast Food Restaurant


#### When k = 10

In [66]:
to_cluster_k10_1 = pd.DataFrame(toronto_merged10.loc[toronto_merged10['Cluster Labels'] == 1, toronto_merged10.columns[[1] + list(range(5, toronto_merged10.shape[1]))]])
to_cluster_k10_1

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Victoria Village,French Restaurant,Coffee Shop,Hockey Arena,Portuguese Restaurant,Women's Store,Doner Restaurant,Dim Sum Restaurant,Diner,Discount Store,Distribution Center
2,"Regent Park, Harbourfront",Coffee Shop,Park,Café,Pub,Bakery,Breakfast Spot,Theater,Ice Cream Shop,Chocolate Shop,Spa
3,"Lawrence Manor, Lawrence Heights",Furniture / Home Store,Clothing Store,Vietnamese Restaurant,Coffee Shop,Boutique,Event Space,Accessories Store,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
4,"Queen's Park, Ontario Provincial Government",Coffee Shop,College Cafeteria,Sushi Restaurant,Bar,Beer Bar,Smoothie Shop,Sandwich Place,Café,Portuguese Restaurant,Persian Restaurant
7,Don Mills,Gym,Coffee Shop,Clothing Store,Japanese Restaurant,Beer Store,Restaurant,Chinese Restaurant,Supermarket,Discount Store,Café
8,"Parkview Hill, Woodbine Gardens",Pizza Place,Breakfast Spot,Bank,Athletics & Sports,Gastropub,Intersection,Pharmacy,Gym / Fitness Center,Eastern European Restaurant,Dumpling Restaurant
9,"Garden District, Ryerson",Clothing Store,Coffee Shop,Café,Cosmetics Shop,Japanese Restaurant,Bubble Tea Shop,Ramen Restaurant,Middle Eastern Restaurant,Lingerie Store,Electronics Store
10,Glencairn,Sushi Restaurant,Bakery,Pub,Japanese Restaurant,Women's Store,Dim Sum Restaurant,Discount Store,Distribution Center,Dog Run,Doner Restaurant
13,Don Mills,Gym,Coffee Shop,Clothing Store,Japanese Restaurant,Beer Store,Restaurant,Chinese Restaurant,Supermarket,Discount Store,Café
14,Woodbine Heights,Park,Beer Store,Skating Rink,Curling Ice,Doner Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Drugstore


### Third Cluster:

#### Blue Markers When k = 5

In [67]:
to_cluster_k5_2 = pd.DataFrame(toronto_merged_5.loc[toronto_merged_5['Cluster Labels'] == 2, toronto_merged_5.columns[[1] + list(range(5, toronto_merged_5.shape[1]))]])
to_cluster_k5_2

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
57,"Humberlea, Emery",Baseball Field,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Fast Food Restaurant
101,"Old Mill South, King's Mill Park, Sunnylea, Hu...",Baseball Field,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Fast Food Restaurant


#### Dark Blue Markers When k = 10

In [68]:
to_cluster_k10_2 = pd.DataFrame(toronto_merged10.loc[toronto_merged10['Cluster Labels'] == 2, toronto_merged10.columns[[1] + list(range(5, toronto_merged10.shape[1]))]])
to_cluster_k10_2

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,"West Deane Park, Princess Gardens, Martin Grov...",Home Service,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dessert Shop


### Fourth Cluster:

#### Green Markers When k = 5

In [69]:
to_cluster_k5_3 = pd.DataFrame(toronto_merged_5.loc[toronto_merged_5['Cluster Labels'] == 3, toronto_merged_5.columns[[1] + list(range(5, toronto_merged_5.shape[1]))]])
to_cluster_k5_3


Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,"Malvern, Rouge",Fast Food Restaurant,Department Store,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Donut Shop,Doner Restaurant


#### Medium Blue Markers When k = 10

In [70]:
to_cluster_k10_3 = pd.DataFrame(toronto_merged10.loc[toronto_merged10['Cluster Labels'] == 3, toronto_merged10.columns[[1] + list(range(5, toronto_merged10.shape[1]))]])
to_cluster_k10_3

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
63,"Runnymede, The Junction North",Grocery Store,Breakfast Spot,Convenience Store,Women's Store,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant
64,Weston,Convenience Store,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Women's Store,Department Store


### Fifth Cluster: 

#### Orange Markers When k = 5

In [71]:
to_cluster_k5_4 = pd.DataFrame(toronto_merged_5.loc[toronto_merged_5['Cluster Labels'] == 4, toronto_merged_5.columns[[1] + list(range(5, toronto_merged_5.shape[1]))]])
to_cluster_k5_4


Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
64,Weston,Convenience Store,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Women's Store,Department Store


#### Light Blue Markers When k = 10

In [72]:
to_cluster_k10_4 = pd.DataFrame(toronto_merged10.loc[toronto_merged10['Cluster Labels'] == 4, toronto_merged10.columns[[1] + list(range(5, toronto_merged10.shape[1]))]])
to_cluster_k10_4

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Parkwoods,Food & Drink Shop,Park,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore
21,Caledonia-Fairbanks,Park,Women's Store,Pool,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run
52,"Willowdale, Newtonbrook",Park,Piano Bar,Women's Store,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run
66,York Mills West,Flower Shop,Park,Convenience Store,Drugstore,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop
85,"Milliken, Agincourt North, Steeles East, L'Amo...",Park,Playground,Women's Store,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run
91,Rosedale,Park,Playground,Trail,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Ethiopian Restaurant,Drugstore,Donut Shop,Deli / Bodega
98,"The Kingsway, Montgomery Road, Old Mill North",Park,River,Women's Store,Dog Run,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Doner Restaurant


### Cluster 6 or Bright Green Markers when k = 10 

In [73]:
to_cluster_k10_5 = pd.DataFrame(toronto_merged10.loc[toronto_merged10['Cluster Labels'] == 5, toronto_merged10.columns[[1] + list(range(5, toronto_merged10.shape[1]))]])
to_cluster_k10_5

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,"Rouge Hill, Port Union, Highland Creek",Bar,Women's Store,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Fast Food Restaurant


### Cluster 7 or Pale Green Markers when k = 10

In [74]:
to_cluster_k10_6 = pd.DataFrame(toronto_merged10.loc[toronto_merged10['Cluster Labels'] == 6, toronto_merged10.columns[[1] + list(range(5, toronto_merged10.shape[1]))]])
to_cluster_k10_6

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
57,"Humberlea, Emery",Baseball Field,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Fast Food Restaurant
101,"Old Mill South, King's Mill Park, Sunnylea, Hu...",Baseball Field,Women's Store,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Fast Food Restaurant


### Cluster 8 or Yellow Markers when k = 10

In [75]:
to_cluster_k10_7 = pd.DataFrame(toronto_merged10.loc[toronto_merged10['Cluster Labels'] == 7, toronto_merged10.columns[[1] + list(range(5, toronto_merged10.shape[1]))]])
to_cluster_k10_7

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
62,Roselawn,Garden,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Women's Store,Department Store


### Cluster 9 or Orange Markers when k = 10 

In [76]:
to_cluster_k10_8 = pd.DataFrame(toronto_merged10.loc[toronto_merged10['Cluster Labels'] == 8, toronto_merged10.columns[[1] + list(range(5, toronto_merged10.shape[1]))]])
to_cluster_k10_8

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
16,Humewood-Cedarvale,Park,Hockey Arena,Trail,Field,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Donut Shop
49,"North Park, Maple Leaf Park, Upwood Park",Park,Construction & Landscaping,Bakery,Trail,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Ethiopian Restaurant,Drugstore,Dessert Shop
61,Lawrence Park,Park,Swim School,Bus Line,Dog Run,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Doner Restaurant,Falafel Restaurant
68,"Forest Hill North & West, Forest Hill Road Park",Trail,Park,Sushi Restaurant,Bus Line,Jewelry Store,Doner Restaurant,Diner,Discount Store,Distribution Center,Dog Run
77,"Kingsview Village, St. Phillips, Martin Grove ...",Park,Mobile Phone Shop,Sandwich Place,Bus Line,Distribution Center,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Event Space
83,"Moore Park, Summerhill East",Park,Lawyer,Trail,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Department Store


### Cluster 10 or Light Red Markers when k = 10 

In [77]:
to_cluster_k10_9 = pd.DataFrame(toronto_merged10.loc[toronto_merged10['Cluster Labels'] == 9, toronto_merged10.columns[[1] + list(range(5, toronto_merged10.shape[1]))]])
to_cluster_k10_9

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,"Malvern, Rouge",Fast Food Restaurant,Department Store,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Donut Shop,Doner Restaurant


## Examining Clusters: Montreal 

Recall the maps with markers:

In [78]:
print('Montreal Map when k = 5')
mtl_clusters_5

Montreal Map when k = 5


In [79]:
print('Montreal Map when k = 10')
mtl_clusters_10

Montreal Map when k = 10


### First Cluster: Red Markers 

#### When k = 5

In [80]:
mtl_cluster_k5_0 = pd.DataFrame(montreal_merged5.loc[montreal_merged5['Cluster Labels'] == 0, montreal_merged5.columns[[1] + list(range(5, montreal_merged5.shape[1]))]])
#pd.set_option('display.max_rows', None)
#mtl_cluster_0.drop_duplicates(inplace = True)
mtl_cluster_k5_0

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Pointe-Aux-Trembles,Convenience Store,Train Station,Yoga Studio,Empanada Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,English Restaurant
2,Rivière-des-Prairies,Pharmacy,Train Station,Convenience Store,Grocery Store,Italian Restaurant,Yoga Studio,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop
3,Rivière-des-Prairies,Pharmacy,Train Station,Convenience Store,Grocery Store,Italian Restaurant,Yoga Studio,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop
4,Montreal-Nord,Fast Food Restaurant,Ice Cream Shop,Sandwich Place,Bank,Gas Station,Paper / Office Supplies Store,Restaurant,Gym,Hockey Arena,Gastropub
5,Montreal-Nord,Fast Food Restaurant,Ice Cream Shop,Sandwich Place,Bank,Gas Station,Paper / Office Supplies Store,Restaurant,Gym,Hockey Arena,Gastropub
6,Anjou,Hockey Arena,Pizza Place,Gym,American Restaurant,BBQ Joint,Convenience Store,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space
7,Anjou,Hockey Arena,Pizza Place,Gym,American Restaurant,BBQ Joint,Convenience Store,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space
8,Mercier,Restaurant,Skating Rink,Water Park,Pharmacy,Vietnamese Restaurant,Hardware Store,Convenience Store,Auto Workshop,Gym,Supermarket
9,Mercier,Restaurant,Skating Rink,Water Park,Pharmacy,Vietnamese Restaurant,Hardware Store,Convenience Store,Auto Workshop,Gym,Supermarket
10,Mercier,Restaurant,Skating Rink,Water Park,Pharmacy,Vietnamese Restaurant,Hardware Store,Convenience Store,Auto Workshop,Gym,Supermarket
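
Several neighborhoods appear more than once in this table because multiple postal codes share a neighborhood name and therefore the same venue profile. The commented-out `drop_duplicates` call in the cell above hints at collapsing them. A minimal sketch of that step on toy rows copied from the table (only the first venue column is kept for brevity):

```python
import pandas as pd

# A few rows from the table above, keeping the original row index
cluster = pd.DataFrame(
    {
        "Neighborhood": ["Mercier", "Mercier", "Mercier", "Anjou", "Anjou"],
        "1st Most Common Venue": ["Restaurant"] * 3 + ["Hockey Arena"] * 2,
    },
    index=[8, 9, 10, 6, 7],
)

# Keep the first row for each neighborhood name and renumber the rows
unique_rows = cluster.drop_duplicates(subset="Neighborhood").reset_index(drop=True)
print(unique_rows)
```

This only tidies the display. The map still needs one row per postal code, so the de-duplicated frame should be a separate copy instead of replacing the merged one.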


#### When k = 10

In [81]:
mtl_cluster_k10_0 = pd.DataFrame(montreal_merged10.loc[montreal_merged10['Cluster Labels'] == 0, montreal_merged10.columns[[1] + list(range(5, montreal_merged10.shape[1]))]])
mtl_cluster_k10_0

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
43,Nun's Island,Park,Tennis Court,Mediterranean Restaurant,Convenience Store,English Restaurant,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant
57,Hampstead,Park,Big Box Store,Food Service,Yoga Studio,English Restaurant,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant
58,Côte-Saint-Luc,Pool,Park,Gym,Pizza Place,Skating Rink,Big Box Store,Intersection,Food Service,Empanada Restaurant,Dry Cleaner
59,Westmount,Athletics & Sports,Park,Pool,Dog Run,Pizza Place,Skating Rink,Tennis Court,Sushi Restaurant,Dumpling Restaurant,Dive Bar
60,Westmount,Athletics & Sports,Park,Pool,Dog Run,Pizza Place,Skating Rink,Tennis Court,Sushi Restaurant,Dumpling Restaurant,Dive Bar
76,Côte-Saint-Luc,Pool,Park,Gym,Pizza Place,Skating Rink,Big Box Store,Intersection,Food Service,Empanada Restaurant,Dry Cleaner
77,Côte-Saint-Luc,Pool,Park,Gym,Pizza Place,Skating Rink,Big Box Store,Intersection,Food Service,Empanada Restaurant,Dry Cleaner
86,Ville Saint-Pierre,,,,,,,,,,
93,Île-Bizard,Hockey Arena,Athletics & Sports,Park,Dive Bar,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant
94,Île-Bizard,Hockey Arena,Athletics & Sports,Park,Dive Bar,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant
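
The same selection is repeated for every cluster label and for both values of k. As a minimal sketch, it could be wrapped in a small helper. The function name `cluster_members` and the toy dataframe below are hypothetical; the sketch assumes the merged dataframes keep the neighborhood name in column 1 and the venue rankings from column 5 onward, as the cells in this notebook imply.

```python
import pandas as pd


def cluster_members(merged, label, first_venue_col=5):
    """Return the neighborhood name and venue-ranking columns for one cluster.

    Assumes column 1 holds the neighborhood name and the venue rankings
    start at position `first_venue_col` (hypothetical helper, not part of
    the original notebook).
    """
    keep = [1] + list(range(first_venue_col, merged.shape[1]))
    return merged.loc[merged['Cluster Labels'] == label, merged.columns[keep]]


# Toy stand-in for montreal_merged10. Postal codes and coordinates are
# placeholders; labels and venues are copied from the k = 10 tables.
toy = pd.DataFrame({
    'Postal Code': ['n/a', 'n/a', 'n/a'],
    'Neighborhood': ['Pointe-Aux-Trembles', 'Montreal-East', "Nun's Island"],
    'Latitude': [0.0, 0.0, 0.0],
    'Longitude': [0.0, 0.0, 0.0],
    'Cluster Labels': [2, 3, 0],
    '1st Most Common Venue': ['Convenience Store', 'Business Service', 'Park'],
    '2nd Most Common Venue': ['Train Station', 'Yoga Studio', 'Tennis Court'],
})

first_cluster = cluster_members(toy, 0)
print(first_cluster)
```

With a helper like this, each cluster cell would reduce to a call such as `cluster_members(montreal_merged10, 0)`, provided the column layout matches the assumption above.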


### Second Cluster: Purple Markers 

#### When k = 5

In [82]:
mtl_cluster_k5_1 = pd.DataFrame(montreal_merged5.loc[montreal_merged5['Cluster Labels'] == 1, montreal_merged5.columns[[1] + list(range(5, montreal_merged5.shape[1]))]])
mtl_cluster_k5_1

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Montreal-East,Business Service,Yoga Studio,Dog Run,Fish & Chips Shop,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space,Escape Room,English Restaurant


#### When k = 10 

In [83]:
mtl_cluster_k10_1 = pd.DataFrame(montreal_merged10.loc[montreal_merged10['Cluster Labels'] == 1, montreal_merged10.columns[[1] + list(range(5, montreal_merged10.shape[1]))]])
mtl_cluster_k10_1

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Rivière-des-Prairies,Pharmacy,Train Station,Convenience Store,Grocery Store,Italian Restaurant,Yoga Studio,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop
3,Rivière-des-Prairies,Pharmacy,Train Station,Convenience Store,Grocery Store,Italian Restaurant,Yoga Studio,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop
21,Ahunstic,Pharmacy,Café,Breakfast Spot,Clothing Store,Restaurant,Italian Restaurant,Ice Cream Shop,Pizza Place,Grocery Store,Bakery
22,Ahunstic,Pharmacy,Café,Breakfast Spot,Clothing Store,Restaurant,Italian Restaurant,Ice Cream Shop,Pizza Place,Grocery Store,Bakery
23,Villeray,Café,Bakery,Pharmacy,Breakfast Spot,Sushi Restaurant,Liquor Store,Bar,French Restaurant,Sporting Goods Shop,Toy / Game Store
24,Petite-Patrie,Café,Bar,Restaurant,Sushi Restaurant,Vietnamese Restaurant,Bakery,Coffee Shop,Grocery Store,Breakfast Spot,Beer Store
25,Plateau Mount-Royal,Café,Bar,Bakery,Breakfast Spot,Pizza Place,Brewery,Dessert Shop,Restaurant,Portuguese Restaurant,Cocktail Bar
26,Plateau Mount-Royal,Café,Bar,Bakery,Breakfast Spot,Pizza Place,Brewery,Dessert Shop,Restaurant,Portuguese Restaurant,Cocktail Bar
27,Centre-Sud,Park,French Restaurant,Gym,Café,Fast Food Restaurant,Restaurant,Caribbean Restaurant,Pharmacy,Convenience Store,Breakfast Spot
28,Centre-Sud,Park,French Restaurant,Gym,Café,Fast Food Restaurant,Restaurant,Caribbean Restaurant,Pharmacy,Convenience Store,Breakfast Spot


### Third Cluster: Blue Markers 

#### When k = 5

In [84]:
mtl_cluster_k5_2 = pd.DataFrame(montreal_merged5.loc[montreal_merged5['Cluster Labels'] == 2, montreal_merged5.columns[[1] + list(range(5, montreal_merged5.shape[1]))]])
mtl_cluster_k5_2

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
104,Beaconsfield,Soccer Field,Yoga Studio,English Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant,Escape Room


#### When k = 10 

In [85]:
mtl_cluster_k10_2 = pd.DataFrame(montreal_merged10.loc[montreal_merged10['Cluster Labels'] == 2, montreal_merged10.columns[[1] + list(range(5, montreal_merged10.shape[1]))]])
mtl_cluster_k10_2

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Pointe-Aux-Trembles,Convenience Store,Train Station,Yoga Studio,Empanada Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,English Restaurant
89,Pierrefonds-Roxboro,Train Station,Dry Cleaner,Convenience Store,Hardware Store,Yoga Studio,Donut Shop,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant


### Fourth Cluster: Green Markers 

#### When k = 5

In [86]:
mtl_cluster_k5_3 = pd.DataFrame(montreal_merged5.loc[montreal_merged5['Cluster Labels'] == 3, montreal_merged5.columns[[1] + list(range(5, montreal_merged5.shape[1]))]])
mtl_cluster_k5_3


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
87,Lachine,Construction & Landscaping,Storage Facility,Yoga Studio,English Restaurant,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant,Event Space
88,Lachine,Construction & Landscaping,Storage Facility,Yoga Studio,English Restaurant,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant,Event Space


#### When k = 10

In [87]:
mtl_cluster_k10_3 = pd.DataFrame(montreal_merged10.loc[montreal_merged10['Cluster Labels'] == 3, montreal_merged10.columns[[1] + list(range(5, montreal_merged10.shape[1]))]])
mtl_cluster_k10_3

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Montreal-East,Business Service,Yoga Studio,Dog Run,Fish & Chips Shop,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space,Escape Room,English Restaurant


### Fifth Cluster: Orange Markers 

#### When k = 5

In [88]:
mtl_cluster_k5_4 = pd.DataFrame(montreal_merged5.loc[montreal_merged5['Cluster Labels'] == 4, montreal_merged5.columns[[1] + list(range(5, montreal_merged5.shape[1]))]])
mtl_cluster_k5_4

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
43,Nun's Island,Park,Tennis Court,Mediterranean Restaurant,Convenience Store,English Restaurant,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant
57,Hampstead,Park,Big Box Store,Food Service,Yoga Studio,English Restaurant,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant
58,Côte-Saint-Luc,Pool,Park,Gym,Pizza Place,Skating Rink,Big Box Store,Intersection,Food Service,Empanada Restaurant,Dry Cleaner
59,Westmount,Athletics & Sports,Park,Pool,Dog Run,Pizza Place,Skating Rink,Tennis Court,Sushi Restaurant,Dumpling Restaurant,Dive Bar
60,Westmount,Athletics & Sports,Park,Pool,Dog Run,Pizza Place,Skating Rink,Tennis Court,Sushi Restaurant,Dumpling Restaurant,Dive Bar
76,Côte-Saint-Luc,Pool,Park,Gym,Pizza Place,Skating Rink,Big Box Store,Intersection,Food Service,Empanada Restaurant,Dry Cleaner
77,Côte-Saint-Luc,Pool,Park,Gym,Pizza Place,Skating Rink,Big Box Store,Intersection,Food Service,Empanada Restaurant,Dry Cleaner
93,Île-Bizard,Hockey Arena,Athletics & Sports,Park,Dive Bar,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant
94,Île-Bizard,Hockey Arena,Athletics & Sports,Park,Dive Bar,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant


#### When k = 10

In [89]:
mtl_cluster_k10_4 = pd.DataFrame(montreal_merged10.loc[montreal_merged10['Cluster Labels'] == 4, montreal_merged10.columns[[1] + list(range(5, montreal_merged10.shape[1]))]])
mtl_cluster_k10_4

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
104,Beaconsfield,Soccer Field,Yoga Studio,English Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant,Escape Room


### Sixth Cluster

#### When k = 10

In [90]:
mtl_cluster_k10_5 = pd.DataFrame(montreal_merged10.loc[montreal_merged10['Cluster Labels'] == 5, montreal_merged10.columns[[1] + list(range(5, montreal_merged10.shape[1]))]])
mtl_cluster_k10_5

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
105,Sainte-Anne-de-Bellevue,Zoo,Afghan Restaurant,Escape Room,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant,English Restaurant,Event Space


### Seventh Cluster

#### When k = 10

In [91]:
mtl_cluster_k10_6 = pd.DataFrame(montreal_merged10.loc[montreal_merged10['Cluster Labels'] == 6, montreal_merged10.columns[[1] + list(range(5, montreal_merged10.shape[1]))]])
mtl_cluster_k10_6

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
87,Lachine,Construction & Landscaping,Storage Facility,Yoga Studio,English Restaurant,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant,Event Space
88,Lachine,Construction & Landscaping,Storage Facility,Yoga Studio,English Restaurant,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Empanada Restaurant,Event Space


### Eighth Cluster

#### When k = 10

In [92]:
mtl_cluster_k10_7 = pd.DataFrame(montreal_merged10.loc[montreal_merged10['Cluster Labels'] == 7, montreal_merged10.columns[[1] + list(range(5, montreal_merged10.shape[1]))]])
mtl_cluster_k10_7

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
98,Kirkland,Discount Store,Thrift / Vintage Store,Empanada Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,English Restaurant,Dive Bar


### Ninth Cluster

#### When k = 10

In [93]:
mtl_cluster_k10_8 = pd.DataFrame(montreal_merged10.loc[montreal_merged10['Cluster Labels'] == 8, montreal_merged10.columns[[1] + list(range(5, montreal_merged10.shape[1]))]])
mtl_cluster_k10_8

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
99,Senneville,Health & Beauty Service,Electronics Store,Yoga Studio,English Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Empanada Restaurant,Escape Room


### Tenth Cluster

#### When k = 10

In [94]:
mtl_cluster_k10_9 = pd.DataFrame(montreal_merged10.loc[montreal_merged10['Cluster Labels'] == 9, montreal_merged10.columns[[1] + list(range(5, montreal_merged10.shape[1]))]])
mtl_cluster_k10_9

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Montreal-Nord,Fast Food Restaurant,Ice Cream Shop,Sandwich Place,Bank,Gas Station,Paper / Office Supplies Store,Restaurant,Gym,Hockey Arena,Gastropub
5,Montreal-Nord,Fast Food Restaurant,Ice Cream Shop,Sandwich Place,Bank,Gas Station,Paper / Office Supplies Store,Restaurant,Gym,Hockey Arena,Gastropub
6,Anjou,Hockey Arena,Pizza Place,Gym,American Restaurant,BBQ Joint,Convenience Store,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space
7,Anjou,Hockey Arena,Pizza Place,Gym,American Restaurant,BBQ Joint,Convenience Store,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Event Space
8,Mercier,Restaurant,Skating Rink,Water Park,Pharmacy,Vietnamese Restaurant,Hardware Store,Convenience Store,Auto Workshop,Gym,Supermarket
9,Mercier,Restaurant,Skating Rink,Water Park,Pharmacy,Vietnamese Restaurant,Hardware Store,Convenience Store,Auto Workshop,Gym,Supermarket
10,Mercier,Restaurant,Skating Rink,Water Park,Pharmacy,Vietnamese Restaurant,Hardware Store,Convenience Store,Auto Workshop,Gym,Supermarket
11,Saint-Leonard,Restaurant,Fast Food Restaurant,Grocery Store,Bakery,Bank,Dessert Shop,Gas Station,Park,Sandwich Place,Greek Restaurant
12,Saint-Leonard,Restaurant,Fast Food Restaurant,Grocery Store,Bakery,Bank,Dessert Shop,Gas Station,Park,Sandwich Place,Greek Restaurant
13,Saint-Leonard,Restaurant,Fast Food Restaurant,Grocery Store,Bakery,Bank,Dessert Shop,Gas Station,Park,Sandwich Place,Greek Restaurant
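
Cluster labels are arbitrary integers, so the same group of neighborhoods can carry a different label under k = 5 and k = 10. In the tables above, for instance, the parks-and-sports group is label 4 when k = 5 and label 0 when k = 10. One way to see this mapping at a glance is a cross-tabulation of the two label columns. The sketch below is illustrative only: the helper name `compare_clusterings` and the column names `k5` and `k10` are hypothetical, and the sample rows copy a few label pairs from the tables above.

```python
import pandas as pd


def compare_clusterings(labels_a, labels_b):
    """Count how many neighborhoods fall into each (k = 5, k = 10) label pair."""
    return pd.crosstab(labels_a, labels_b, rownames=['k = 5'], colnames=['k = 10'])


# A few label pairs read off the cluster tables in this notebook
sample = pd.DataFrame({
    'Neighborhood': ['Montreal-East', 'Beaconsfield', 'Lachine',
                     "Nun's Island", 'Hampstead', 'Westmount'],
    'k5': [1, 2, 3, 4, 4, 4],
    'k10': [3, 4, 6, 0, 0, 0],
})

table = compare_clusterings(sample['k5'], sample['k10'])
print(table)
```

On the full data, the two inputs would be the `Cluster Labels` columns of `montreal_merged5` and `montreal_merged10`, assuming both dataframes list the neighborhoods in the same row order.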


## Conclusions

Montreal or Toronto? Well, that depends on your lifestyle.


