# Segmenting and Clustering Neighborhoods in Toronto

This notebook is part of an assignment for the _Applied Data Science Capstone_ course on [Coursera](https://www.coursera.org).

## 1. Scraping the data

### Import libraries

In [296]:
# for HTTP requests
import requests  

# for HTML scraping
from bs4 import BeautifulSoup 

# for table analysis
import pandas as pd
import numpy as np

# for transforming addresses into latitude/longitude locations
!pip install geocoder
import geocoder



### URL for Wikipedia article

In [297]:
# URL of the Wikipedia page from which to scrape tabular data.
wiki_url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"

### Request & Response

In [298]:
# If the request was successful, the response status code is 200.
response = requests.get(wiki_url)
response

<Response [200]>

### Wrangling HTML With BeautifulSoup

In [299]:
# Parse the response content as HTML
soup = BeautifulSoup(response.content, 'html.parser')
#soup

### Viewing HTML content

In [300]:
# Title of Wikipedia page
soup.title.string

'List of postal codes of Canada: M - Wikipedia'

In [301]:
# Find all the tables in the HTML
all_tables=soup.find_all('table')

In [302]:
# Find the right table to scrape
right_table=soup.find('table', {"class":'wikitable sortable'})

In [303]:
# Get the first row of the table, i.e. the header
row0 = right_table.find_all("tr")[0]

# Show the column names
header = [th.text.rstrip() for th in row0.find_all('th')]
header

['Postal Code', 'Borough', 'Neighbourhood']

### Scraping the table contents

In [304]:
# Scrape the data and append to the respective lists
c0=[]
c1=[]
c2=[]

# Iterate through the rows of the table
for row in right_table.find_all("tr"):
    cells = row.find_all('td')
    # Only extract rows whose borough is assigned
    if len(cells) == 3 and 'Not assigned' not in cells[1].find(string=True):
        c0.append(cells[0].find(string=True).replace('\n', ''))
        c1.append(cells[1].find(string=True).replace('\n', ''))
        c2.append(cells[2].find(string=True).replace('\n', ''))
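
The row filter above can be checked without a network call. The following is a minimal sketch that applies the same rule (keep rows with exactly three cells whose borough is not 'Not assigned') to a small inline HTML fragment. It uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra packages; the `TableRowParser` class and the sample fragment are illustrative assumptions, not part of the notebook.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag == 'td' and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == 'td' and self._cell is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up fragment shaped like the Wikipedia table
sample_html = """
<table>
  <tr><th>Postal Code</th><th>Borough</th><th>Neighbourhood</th></tr>
  <tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
  <tr><td>M3A</td><td>North York</td><td>Parkwoods</td></tr>
  <tr><td>M4A</td><td>North York</td><td>Victoria Village</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(sample_html)

# Keep only rows with three cells and an assigned borough
assigned = [row for row in parser.rows
            if len(row) == 3 and 'Not assigned' not in row[1]]
print(assigned)
```

The header row contains only `<th>` cells, so it produces an empty row and is dropped by the `len(row) == 3` check, just as in the loop above.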

In [305]:
# Create a dictionary
dict_toronto = dict([(x,0) for x in header])
dict_toronto

{'Postal Code': 0, 'Borough': 0, 'Neighbourhood': 0}

In [306]:
# Fill the dictionary with the corresponding data lists.
dict_toronto['Postal Code'] = c0
dict_toronto['Borough'] = c1
dict_toronto['Neighbourhood'] = c2
#dict_toronto

### Creating dataframe

In [307]:
# Convert the dictionary to a DataFrame
df_toronto = pd.DataFrame(dict_toronto)

# Size of dataframe
print(f'Shape: {df_toronto.shape}')

# Top 5 records
df_toronto.head(5)

Shape: (103, 3)


Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


## 2. Geographical coordinates

In [308]:
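# Function that retrieves the geographical coordinates for the postal code in a given row
def get_coordinates(row, max_attempts=10):
    # initialize the result to None
    lat_lng_coords = None
    attempts = 0

    # retry until the geocoder returns coordinates, but give up after max_attempts
    while lat_lng_coords is None and attempts < max_attempts:
        g = geocoder.arcgis('{}, Toronto, Ontario'.format(row['Postal Code']))
        lat_lng_coords = g.latlng
        attempts += 1

    # fail loudly instead of looping forever if no coordinates were found
    if lat_lng_coords is None:
        raise RuntimeError('No coordinates found for {}'.format(row['Postal Code']))

    # return the (latitude, longitude) pair
    return pd.Series([lat_lng_coords[0], lat_lng_coords[1]])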

In [309]:
# Fill coordinates for each row
df_toronto[['Latitude','Longitude']] = df_toronto.apply(get_coordinates, axis=1)
df_toronto.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.75245,-79.32991
1,M4A,North York,Victoria Village,43.73057,-79.31306
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65512,-79.36264
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.72327,-79.45042
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.66253,-79.39188


## 3. Clustering the data

### Import libraries

In [310]:
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

### Visualizing Toronto

In [312]:
df_toronto.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.75245,-79.32991
1,M4A,North York,Victoria Village,43.73057,-79.31306
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65512,-79.36264
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.72327,-79.45042
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.66253,-79.39188


Let's get the geographical coordinates of Toronto.


In [313]:
address = 'Toronto, Ontario'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(f'The geographical coordinates of Toronto are {latitude}, {longitude}.')

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


Let's visualize Toronto.


In [314]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(df_toronto['Latitude'], df_toronto['Longitude'], df_toronto['Neighbourhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

### Exploring neighbourhoods of Toronto

#### Foursquare credentials

In [315]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (placeholder; never publish real credentials)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret (placeholder)
ACCESS_TOKEN = 'YOUR_ACCESS_TOKEN' # your Foursquare Access Token (placeholder)
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)
print('ACCESS_TOKEN: ' + ACCESS_TOKEN)

Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET: YOUR_CLIENT_SECRET
ACCESS_TOKEN: YOUR_ACCESS_TOKEN


#### Define a function to apply to all the neighbourhoods.

In [316]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        #print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
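
The expression `v['venue']['categories'][0]['name']` in the function above assumes that every venue has at least one category and that the response always contains a `groups` entry. The following is a minimal sketch of a more defensive extraction, run on a hand-written dictionary shaped like the payload the function indexes into. The `extract_venues` helper and the sample payload are hypothetical and only illustrate the idea.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style payload.

    Hypothetical helper: missing groups yield an empty list, and a venue
    without categories gets the category None instead of raising IndexError.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []

    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None
        rows.append((venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows


# Hand-written sample in the shape the function above indexes into
sample_payload = {
    'response': {
        'groups': [{
            'items': [
                {'venue': {'name': 'Brookbanks Park',
                           'location': {'lat': 43.751976, 'lng': -79.33214},
                           'categories': [{'name': 'Park'}]}},
                {'venue': {'name': 'Uncategorised Place',
                           'location': {'lat': 43.75, 'lng': -79.33},
                           'categories': []}},
            ]
        }]
    }
}

venues = extract_venues(sample_payload)
empty = extract_venues({'response': {}})
print(venues)
print(empty)
```

Whether to keep venues without a category or drop them is a design choice; keeping them with `None` makes the gap visible in the resulting dataframe.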

#### Code to run the above function on each neighbourhood.

In [317]:
# collect nearby venues for every neighbourhood in df_toronto
toronto_venues = getNearbyVenues(names=df_toronto['Neighbourhood'],
                                   latitudes=df_toronto['Latitude'],
                                   longitudes=df_toronto['Longitude'])

print(toronto_venues.shape)
toronto_venues.head()

(2406, 7)


Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.75245,-79.32991,Brookbanks Park,43.751976,-79.33214,Park
1,Parkwoods,43.75245,-79.32991,Variety Store,43.751974,-79.333114,Food & Drink Shop
2,Victoria Village,43.73057,-79.31306,Wigmore Park,43.731023,-79.310771,Park
3,Victoria Village,43.73057,-79.31306,Memories of Africa,43.726602,-79.312427,Grocery Store
4,Victoria Village,43.73057,-79.31306,Guardian Drug,43.730584,-79.307432,Pharmacy
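
As a quick sanity check on the `radius=500` default, and assuming the radius is interpreted in metres, the distance between a neighbourhood centre and one of its returned venues can be estimated with the haversine formula. This is a minimal sketch; `haversine_m` is a hypothetical helper, and the Earth radius used is the usual mean-radius approximation.

```python
import math


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points."""
    earth_radius_m = 6371000.0  # mean Earth radius; an approximation
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


# Coordinates copied from the first row of the table above
distance = haversine_m(43.75245, -79.32991, 43.751976, -79.33214)
print(f'Parkwoods -> Brookbanks Park: {distance:.0f} m')
```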


### Analysing each neighbourhood

In [318]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighbourhood column back to dataframe
toronto_onehot['Neighbourhood'] = toronto_venues['Neighbourhood'] 

# move neighbourhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

print(f'Size: {toronto_onehot.shape}')
toronto_onehot.head()

Size: (2406, 271)


Unnamed: 0,Neighbourhood,Accessories Store,Afghan Restaurant,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo Exhibit
0,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Victoria Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Victoria Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Victoria Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### Next, let's group rows by neighbourhood and take the mean of the frequency of occurrence of each category


In [319]:
toronto_grouped = toronto_onehot.groupby('Neighbourhood').mean().reset_index()
print(f'Size: {toronto_grouped.shape}')
toronto_grouped

Size: (96, 271)


Unnamed: 0,Neighbourhood,Accessories Store,Afghan Restaurant,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo Exhibit
0,Agincourt,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.000000,0.0
1,"Alderwood, Long Branch",0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.000000,0.0
2,Bayview Village,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.000000,0.0
3,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.000000,0.0
4,Berczy Park,0.0,0.0,0.0,0.016667,0.0,0.016667,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.016667,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
91,"Willowdale, Willowdale West",0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.000000,0.0
92,Woburn,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.000000,0.0
93,Woodbine Heights,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.055556,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.000000,0.0
94,York Mills West,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.000000,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.000000,0.0
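
One-hot encoding followed by `groupby('Neighbourhood').mean()` amounts to computing, for each neighbourhood, the share of its venues that fall into each category. The sketch below reproduces that calculation in plain Python with `collections.Counter`. It is restricted to the five venue rows shown in the earlier `toronto_venues.head()` output, so the resulting numbers are illustrative only and will not match the full table.

```python
from collections import Counter, defaultdict

# (neighbourhood, venue category) pairs copied from the first five venue rows
venue_rows = [
    ('Parkwoods', 'Park'),
    ('Parkwoods', 'Food & Drink Shop'),
    ('Victoria Village', 'Park'),
    ('Victoria Village', 'Grocery Store'),
    ('Victoria Village', 'Pharmacy'),
]

# Count categories per neighbourhood
counts = defaultdict(Counter)
for hood, category in venue_rows:
    counts[hood][category] += 1

# Divide each count by the neighbourhood's venue total to get a frequency
frequencies = {
    hood: {cat: n / sum(cat_counts.values()) for cat, n in cat_counts.items()}
    for hood, cat_counts in counts.items()
}

for hood, freqs in frequencies.items():
    print(hood, freqs)
```

Categories that never occur in a neighbourhood are simply absent here, whereas the dataframe stores them as explicit zeros so that every row has the same columns for clustering.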


#### Let's print each neighbourhood along with the top 5 most common venues


In [320]:
num_top_venues = 5

for hood in toronto_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Agincourt----
                 venue  freq
0   Chinese Restaurant  0.12
1        Grocery Store  0.06
2      Badminton Court  0.06
3  Shanghai Restaurant  0.06
4            Newsagent  0.06


----Alderwood, Long Branch----
                   venue  freq
0      Convenience Store  0.25
1                    Pub  0.25
2  Performing Arts Venue  0.25
3                    Gym  0.25
4      Accessories Store  0.00


----Bayview Village----
                        venue  freq
0          Golf Driving Range  0.25
1  Construction & Landscaping  0.25
2                       Trail  0.25
3                        Park  0.25
4                Neighborhood  0.00


----Bedford Park, Lawrence Manor East----
                venue  freq
0      Sandwich Place  0.11
1         Coffee Shop  0.11
2  Italian Restaurant  0.11
3           Juice Bar  0.05
4    Sushi Restaurant  0.05


----Berczy Park----
            venue  freq
0     Coffee Shop  0.08
1      Restaurant  0.03
2          Bakery  0.03
3  Breakfast Spot

#### Let's put that into a _pandas_ dataframe


First, let's write a function to sort the venues in descending order.


In [321]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
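
Note that when a neighbourhood has fewer distinct categories than `num_top_venues`, the sort above fills the remaining slots with zero-frequency categories. The following is a small plain-Python sketch of a variant that keeps only categories that actually occur. The `top_categories` helper is hypothetical; the frequencies are the ones printed for "Alderwood, Long Branch" above.

```python
def top_categories(freqs, n):
    """Return up to n category names with non-zero frequency, most frequent first."""
    nonzero = [(cat, f) for cat, f in freqs.items() if f > 0]
    # sort by descending frequency, then by name so that ties are deterministic
    nonzero.sort(key=lambda pair: (-pair[1], pair[0]))
    return [cat for cat, _ in nonzero[:n]]


# Frequencies taken from the printed output for "Alderwood, Long Branch"
alderwood = {
    'Convenience Store': 0.25,
    'Pub': 0.25,
    'Performing Arts Venue': 0.25,
    'Gym': 0.25,
    'Accessories Store': 0.0,
}

top = top_categories(alderwood, 10)
print(top)
```

With this variant the result can be shorter than `n`, so a dataframe built from it would need padding (for example with `None`) to keep a fixed number of columns.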

Now let's create the new dataframe and display the top 10 venues for each neighbourhood.


In [322]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted['Neighbourhood'] = toronto_grouped['Neighbourhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighbourhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Chinese Restaurant,Hong Kong Restaurant,Discount Store,Skating Rink,Shopping Mall,Shanghai Restaurant,Supermarket,Sushi Restaurant,Bubble Tea Shop,Badminton Court
1,"Alderwood, Long Branch",Convenience Store,Gym,Pub,Performing Arts Venue,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant
2,Bayview Village,Construction & Landscaping,Park,Golf Driving Range,Trail,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant
3,"Bedford Park, Lawrence Manor East",Coffee Shop,Sandwich Place,Italian Restaurant,Comfort Food Restaurant,Pharmacy,Butcher,Café,Sports Club,Liquor Store,Sushi Restaurant
4,Berczy Park,Coffee Shop,Cheese Shop,Cocktail Bar,Breakfast Spot,Beer Bar,Bakery,Restaurant,Seafood Restaurant,Farmers Market,Sandwich Place
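
The `indicators` lookup used for the column names is correct for the ten columns built here, but it would label a 21st column "21th". If more columns were ever needed, a general ordinal helper could be used instead. The `ordinal` function below is a hypothetical sketch, not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st' (hypothetical helper)."""
    if 11 <= n % 100 <= 13:
        suffix = 'th'  # 11th, 12th, 13th are exceptions to the last-digit rule
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return f'{n}{suffix}'


num_top_venues = 10
columns = ['Neighbourhood'] + [f'{ordinal(i)} Most Common Venue'
                               for i in range(1, num_top_venues + 1)]
print(columns[:4])
```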


### Clustering neighbourhoods

Run _k_-means to cluster the neighbourhoods into 5 clusters.


In [323]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighbourhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 1, 0, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
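
To illustrate what the clustering step does conceptually, here is a minimal plain-Python sketch of Lloyd's algorithm, the assign-then-update loop at the core of _k_-means. It is not scikit-learn's implementation (which, among other things, defaults to k-means++ initialisation); the `kmeans_lloyd` function, the starting centroids and the toy frequency vectors are all made up for illustration.

```python
def kmeans_lloyd(points, centroids, iterations=10):
    """Plain Lloyd's algorithm on lists of floats; returns (labels, centroids)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point takes the label of its nearest centroid
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))
            for point in points
        ]
        # Update step: move each centroid to the mean of its assigned points
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append([sum(dim) / len(members) for dim in zip(*members)])
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Toy frequency vectors (made up): two "park-heavy" and two "coffee-heavy" rows
points = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels, centroids = kmeans_lloyd(points, centroids=[[1.0, 0.0], [0.0, 1.0]])
print(labels)
```

Rows with similar category frequencies end up sharing a label, which is the same grouping idea applied to the 270 category columns above.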

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighbourhood.


In [324]:
# add clustering labels
neighbourhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = df_toronto

# merge toronto_grouped with df_toronto to add latitude/longitude for each neighbourhood
toronto_merged = toronto_merged.join(neighbourhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')

toronto_merged.head() # check the last columns!

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.75245,-79.32991,2.0,Food & Drink Shop,Park,Falafel Restaurant,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Farm,Dumpling Restaurant
1,M4A,North York,Victoria Village,43.73057,-79.31306,0.0,Park,Pharmacy,Grocery Store,Zoo Exhibit,Event Space,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Falafel Restaurant
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65512,-79.36264,1.0,Coffee Shop,Breakfast Spot,Pub,Electronics Store,Distribution Center,Bakery,Restaurant,Thai Restaurant,Theater,Italian Restaurant
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.72327,-79.45042,1.0,Clothing Store,Furniture / Home Store,Coffee Shop,Toy / Game Store,Women's Store,Cosmetics Shop,American Restaurant,Bookstore,Food Court,Men's Store
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.66253,-79.39188,1.0,Coffee Shop,Sandwich Place,Burrito Place,Café,Theater,Fried Chicken Joint,Italian Restaurant,Gastropub,Falafel Restaurant,Park


Drop any neighbourhoods for which no venues were returned, since they cannot be assigned to a cluster.

In [325]:
print(f'Number of NaNs: {toronto_merged["Cluster Labels"].isnull().sum()}')
toronto_merged.dropna(subset=['Cluster Labels'], inplace=True)
print(f'Number of NaNs: {toronto_merged["Cluster Labels"].isnull().sum()}')

Number of NaNs: 3
Number of NaNs: 0


Finally, let's visualize the resulting clusters.


In [326]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighbourhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    #print(f'lat:{lat} lon:{lon} poi:{poi} cluster:{cluster}')
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
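
In the cell above, `ys` is only used for its length (which equals `kclusters`), and `rainbow[int(cluster)-1]` makes label 0 wrap around to the last colour. The following sketch shows a more direct mapping from cluster label to colour. It uses the standard library's `colorsys` instead of matplotlib's colormap so that it runs on its own; `cluster_palette` is a hypothetical helper.

```python
import colorsys


def cluster_palette(k):
    """Return k hex colours evenly spaced around the hue wheel (hypothetical helper)."""
    palette = []
    for i in range(k):
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.8, 0.9)
        palette.append('#{:02x}{:02x}{:02x}'.format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette


palette = cluster_palette(5)

# Labels in the merged dataframe are floats such as 2.0, so cast before indexing
colour_for_label_2 = palette[int(2.0)]
print(palette)
```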

### Examining clusters

Let's examine each cluster and determine the venue categories that distinguish it from the others.


#### Cluster 1


In [327]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,North York,0.0,Park,Pharmacy,Grocery Store,Zoo Exhibit,Event Space,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Falafel Restaurant
6,Scarborough,0.0,Zoo Exhibit,Furniture / Home Store,Fast Food Restaurant,Flea Market,Fish Market,Fish & Chips Shop,Field,Flower Shop,Food & Drink Shop,Farmers Market
12,Scarborough,0.0,Construction & Landscaping,Bar,Farm,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Food & Drink Shop
16,York,0.0,Hockey Arena,Park,Field,Trail,Grocery Store,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space
17,Etobicoke,0.0,Grocery Store,Electronics Store,Fish & Chips Shop,Shopping Mall,College Rec Center,Park,Zoo Exhibit,Elementary School,Escape Room,Ethiopian Restaurant
18,Scarborough,0.0,Construction & Landscaping,Park,Gym / Fitness Center,Fish Market,Fish & Chips Shop,Field,Fast Food Restaurant,Farmers Market,Farm,Dumpling Restaurant
22,Scarborough,0.0,Construction & Landscaping,Coffee Shop,Park,Business Service,Fish & Chips Shop,Field,Fast Food Restaurant,Farmers Market,Farm,Eastern European Restaurant
32,Scarborough,0.0,Spa,Park,Restaurant,Indian Restaurant,Grocery Store,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space
35,East York,0.0,Convenience Store,Park,Intersection,Fish Market,Fish & Chips Shop,Field,Fast Food Restaurant,Farmers Market,Farm,Dumpling Restaurant
39,North York,0.0,Construction & Landscaping,Park,Golf Driving Range,Trail,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant


#### Cluster 2


In [328]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Downtown Toronto,1.0,Coffee Shop,Breakfast Spot,Pub,Electronics Store,Distribution Center,Bakery,Restaurant,Thai Restaurant,Theater,Italian Restaurant
3,North York,1.0,Clothing Store,Furniture / Home Store,Coffee Shop,Toy / Game Store,Women's Store,Cosmetics Shop,American Restaurant,Bookstore,Food Court,Men's Store
4,Downtown Toronto,1.0,Coffee Shop,Sandwich Place,Burrito Place,Café,Theater,Fried Chicken Joint,Italian Restaurant,Gastropub,Falafel Restaurant,Park
5,Etobicoke,1.0,Pharmacy,Skating Rink,Café,Shopping Mall,Park,Bank,Grocery Store,Zoo Exhibit,Elementary School,Escape Room
7,North York,1.0,Coffee Shop,Intersection,Burger Joint,Gym,Gas Station,Soccer Field,Supermarket,Grocery Store,Beer Store,Smoke Shop
...,...,...,...,...,...,...,...,...,...,...,...,...
97,Downtown Toronto,1.0,Coffee Shop,Hotel,Café,Restaurant,Gym,Asian Restaurant,Seafood Restaurant,American Restaurant,Japanese Restaurant,Gastropub
99,Downtown Toronto,1.0,Coffee Shop,Gay Bar,Japanese Restaurant,Restaurant,Sushi Restaurant,Café,Hotel,Bubble Tea Shop,Men's Store,Mediterranean Restaurant
100,East Toronto,1.0,Coffee Shop,Hotel,Restaurant,Café,Bar,Italian Restaurant,Asian Restaurant,Seafood Restaurant,Movie Theater,Steakhouse
101,Etobicoke,1.0,Bank,Coffee Shop,Chinese Restaurant,Flower Shop,Fast Food Restaurant,Park,Sushi Restaurant,Italian Restaurant,Electronics Store,Elementary School


#### Cluster 3


In [329]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,North York,2.0,Food & Drink Shop,Park,Falafel Restaurant,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Farm,Dumpling Restaurant
27,North York,2.0,Park,Residential Building (Apartment / Condo),Zoo Exhibit,Falafel Restaurant,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Farm
45,North York,2.0,Park,Zoo Exhibit,Falafel Restaurant,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Farm,Dumpling Restaurant
49,North York,2.0,Bakery,Park,Falafel Restaurant,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Zoo Exhibit,Eastern European Restaurant
68,Central Toronto,2.0,French Restaurant,Park,Zoo Exhibit,Farm,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market


#### Cluster 4

In [330]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
26,Scarborough,3.0,Trail,Zoo Exhibit,Falafel Restaurant,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Farm,Dumpling Restaurant


#### Cluster 5


In [331]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
71,Scarborough,4.0,Auto Garage,Zoo Exhibit,Farm,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Eastern European Restaurant
