# Segmenting and Clustering Neighborhoods in Toronto

This notebook segments and clusters the neighborhoods of Toronto, Canada, based on the distribution of nearby venue categories.

## Part 1 - Collecting and Processing Neighborhoods 

### Web scraping for neighborhood data

The neighborhood names are obtained by **web scraping** a Wikipedia page: 
https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M. This can be done with the pandas function **read_html()**, which returns every table on the page as a list of dataframes.

In [1]:
# Importing pandas library
import pandas as pd
print("Pandas library successfully imported.")

Pandas library successfully imported.


In [2]:
# URL with postal codes, boroughs and neighbourhoods in Toronto
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

# Extracting url's tables as a list of dataframes
df_list = pd.read_html(url)

In [3]:
# The neighbourhood table is the first dataframe in this list
df = df_list[0]
print("This dataframe has {} rows.".format(df.shape[0]))
df

This dataframe has 180 rows.


Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
...,...,...,...
175,M5Z,Not assigned,Not assigned
176,M6Z,Not assigned,Not assigned
177,M7Z,Not assigned,Not assigned
178,M8Z,Etobicoke,"Mimico NW, The Queensway West, South of Bloor,..."


### Processing neighborhood table

Some requirements were previously specified for the dataframe containing the neighborhoods data:
* The dataframe will consist of **three columns**: PostalCode, Borough, and Neighborhood

In [4]:
# Renaming columns in df
df.columns = ['PostalCode', 'Borough','Neighborhood']

* Only process the cells that have an assigned borough. Ignore cells with a borough that is **Not assigned**.

In [5]:
# Masking the boroughs that are not 'Not assigned'
mask = (df['Borough'] != 'Not assigned')
print("There are {} 'not assigned' boroughs.".format(df.shape[0] - sum(mask)))

There are 77 'not assigned' boroughs.


In [6]:
# Keeping only the rows that have an assigned borough
df = df[mask].reset_index(drop=True)

In [7]:
# Counting duplicated postal codes (0 means every postal code appears once)
sum(df[['PostalCode']].duplicated())

0

* If a cell has a borough but a **Not assigned** neighborhood, then the neighborhood will be the same as the borough.

In [8]:
# Not assigned neighborhoods
mask = df['Neighborhood'] == 'Not assigned'
print("There are {} 'not assigned' neighborhoods.".format(sum(mask)))

There are 0 'not assigned' neighborhoods.


In [9]:
# Replacing 'Not assigned' neighborhoods with their borough
# (.loc is used because chained indexing would assign to a temporary copy)
df.loc[mask, 'Neighborhood'] = df.loc[mask, 'Borough']

* Finally, use the **.shape** attribute to print the number of rows of the dataframe.

In [10]:
# Final dataframe
print("The final dataframe has {} rows.".format(df.shape[0]))
df

The final dataframe has 103 rows.


Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North"
99,M4Y,Downtown Toronto,Church and Wellesley
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C..."
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu..."


## Part 2 - Collecting and Processing Locations

To use the Foursquare location data, we need the latitude and longitude coordinates of each neighborhood. The neighborhood locations are obtained through the **Geocoder Python package:** https://geocoder.readthedocs.io/index.html

The drawback of this package is that it sometimes takes persistence to get the geographical coordinates of a given postal code: a call can return **None**, and the same call repeated a moment later can return the coordinates. To make sure coordinates are obtained for every neighborhood, the call can be wrapped in a while loop that retries each postal code until a result comes back. The cells below query the Bing provider once per postal code:

In [11]:
# Installing and importing geocoder library
#!pip install geocoder
import geocoder
print("Geocoder library successfully imported.")

Geocoder library successfully imported.


In [12]:
# Searching for location coordinates using Bing geocoder
for index, row in df.iterrows():
    g = geocoder.bing(row['PostalCode'] + ', Toronto', key = 'Ap2Ed0Z779lp2UHLSYFfPUhFeNXewJGj6ny9LKItYZUX6mDndgex92W5LvZujhky')
    df.at[index,'Latitude'] = g.latlng[0]
    df.at[index,'Longitude'] = g.latlng[1]

print('Location coordinates successfully added!')
df

Location coordinates successfully added!


Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.756123,-79.329636
1,M4A,North York,Victoria Village,43.726780,-79.310738
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.655354,-79.365044
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.721996,-79.445915
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.663910,-79.388733
...,...,...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.652699,-79.511276
99,M4Y,Downtown Toronto,Church and Wellesley,43.666286,-79.382446
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.663506,-79.317429
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.633709,-79.496521
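
The retry idea described at the start of this part can be sketched as a small helper. This is a minimal sketch, not the code used in this notebook: `geocode_with_retry`, `max_attempts`, and `pause` are hypothetical names, and `lookup` stands for any callable that returns a `(latitude, longitude)` pair or `None`. A bounded number of attempts is used instead of an open-ended while loop so that a postal code that never resolves cannot hang the notebook.

```python
import time


def geocode_with_retry(lookup, query, max_attempts=5, pause=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        latlng = lookup(query)
        if latlng is not None:
            return latlng
        if attempt < max_attempts:
            time.sleep(pause)  # wait before asking again
    return None  # the caller decides how to handle a postal code that never resolves


# Stand-in for a flaky geocoder (an assumption for illustration):
# it returns None twice, then succeeds on the third call.
calls = {"count": 0}


def flaky_lookup(query):
    calls["count"] += 1
    if calls["count"] < 3:
        return None
    return (43.655354, -79.365044)  # M5A coordinates from the table above


coords = geocode_with_retry(flaky_lookup, "M5A, Toronto")
print(coords, "after", calls["count"], "calls")
```

With the geocoder package, `lookup` could be a thin wrapper such as `lambda q: geocoder.bing(q, key=my_key).latlng`, where `my_key` is a placeholder for your own key.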


## Part 3 - Exploring and Clustering Neighborhoods

### Visualizing Toronto's Neighborhoods

In order to start exploring the dataset, let's plot the positions of the Postal Code Areas in Toronto. The **folium** library will be useful for map rendering.

In [13]:
# Installing and importing folium library
#!pip install folium
import folium
print("Folium library successfully imported.")

Folium library successfully imported.


In [14]:
# Getting Toronto's geographical coordinates
g = geocoder.bing('Toronto, ON', key = 'Ap2Ed0Z779lp2UHLSYFfPUhFeNXewJGj6ny9LKItYZUX6mDndgex92W5LvZujhky')
latitude = g.latlng[0]
longitude = g.latlng[1]
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.651893615722656, -79.3817138671875.


In [15]:
# Creating a map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# Add markers to map
for lat, lng, postal_code, borough, neighborhood in zip(df['Latitude'], df['Longitude'], df['PostalCode'], df['Borough'], df['Neighborhood']):
    label = '{}, {}, {}'.format(postal_code, neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker([lat, lng], radius=5, popup=label, color='blue',
                        fill=True, fill_color='#3186cc', fill_opacity=0.7,
                        parse_html=False).add_to(map_toronto)
map_toronto

### Define Foursquare Credentials and Version

In [39]:
CLIENT_ID = 'KPNDYDTNYHMQ0VG5BS3BRHL23STHS2T104N3OY3BR1ZXMLDR' # your Foursquare ID
CLIENT_SECRET = 'ETCLI4ZATQ3VM1BSOT4THVGJXEU00VRNY0ZB21FPLN1DMOOL' # your Foursquare Secret
VERSION = '20201127' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: KPNDYDTNYHMQ0VG5BS3BRHL23STHS2T104N3OY3BR1ZXMLDR
CLIENT_SECRET:ETCLI4ZATQ3VM1BSOT4THVGJXEU00VRNY0ZB21FPLN1DMOOL


### Creating a function to get nearby venues

In [40]:
import requests # library to handle requests
from pandas import json_normalize # transform a JSON file into a pandas dataframe

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['PostalCode', 
                  'Latitude', 
                  'Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
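
As a complement to the function above, the request URL can also be assembled from a parameter dictionary, and the venue rows can be read defensively from the response. The sketch below is an illustration only: `build_explore_url` and `extract_venue_rows` are hypothetical helpers, the credentials are placeholders, and the sample payload is a hand-written stand-in shaped like the fields the function above reads, not a real API response.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Assemble the explore URL; urlencode escapes each parameter value."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return EXPLORE_ENDPOINT + "?" + urlencode(params)


def extract_venue_rows(payload):
    """Return (name, lat, lng, category) tuples, tolerating missing groups or categories."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((venue["name"], venue["location"]["lat"],
                     venue["location"]["lng"], category))
    return rows


url = build_explore_url("YOUR_ID", "YOUR_SECRET", "20201127", 43.726780, -79.310738)

# Hand-written stand-in payload (assumption), not an actual API response
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Tim Hortons",
               "location": {"lat": 43.725517, "lng": -79.313103},
               "categories": [{"name": "Coffee Shop"}]}},
]}]}}

print(url)
print(extract_venue_rows(sample_payload))
print(extract_venue_rows({"response": {}}))  # an empty response yields no rows
```

Reading the response with `.get` means a postal code with no results produces an empty list instead of a `KeyError` or `IndexError`.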

Generating the venues in Toronto

In [41]:
toronto_venues = getNearbyVenues(df['PostalCode'],
                                   df['Latitude'],
                                   df['Longitude'])
toronto_venues

M3A
M4A
M5A
M6A
M7A
M9A
M1B
M3B
M4B
M5B
M6B
M9B
M1C
M3C
M4C
M5C
M6C
M9C
M1E
M4E
M5E
M6E
M1G
M4G
M5G
M6G
M1H
M2H
M3H
M4H
M5H
M6H
M1J
M2J
M3J
M4J
M5J
M6J
M1K
M2K
M3K
M4K
M5K
M6K
M1L
M2L
M3L
M4L
M5L
M6L
M9L
M1M
M2M
M3M
M4M
M5M
M6M
M9M
M1N
M2N
M3N
M4N
M5N
M6N
M9N
M1P
M2P
M4P
M5P
M6P
M9P
M1R
M2R
M4R
M5R
M6R
M7R
M9R
M1S
M4S
M5S
M6S
M1T
M4T
M5T
M1V
M4V
M5V
M8V
M9V
M1W
M4W
M5W
M8W
M9W
M1X
M4X
M5X
M8X
M4Y
M7Y
M8Y
M8Z


Unnamed: 0,PostalCode,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,M3A,43.756123,-79.329636,TTC Stop #09083,43.759655,-79.332223,Bus Stop
1,M3A,43.756123,-79.329636,DVP at York Mills,43.758899,-79.334099,Intersection
2,M3A,43.756123,-79.329636,Chick-N-Joy,43.759900,-79.326520,Fried Chicken Joint
3,M3A,43.756123,-79.329636,TTC Stop 9083,43.759251,-79.334000,Bus Stop
4,M4A,43.726780,-79.310738,Tim Hortons,43.725517,-79.313103,Coffee Shop
...,...,...,...,...,...,...,...
2539,M8Z,43.629711,-79.517479,RONA,43.629393,-79.518320,Hardware Store
2540,M8Z,43.629711,-79.517479,Value Village,43.631269,-79.518238,Thrift / Vintage Store
2541,M8Z,43.629711,-79.517479,Once Upon A Child,43.631075,-79.518290,Kids Store
2542,M8Z,43.629711,-79.517479,Royal Canadian Legion #210,43.628855,-79.518903,Social Club


### Grouping venues per category

In [42]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add the postal code column back to the dataframe
toronto_onehot['PostalCode'] = toronto_venues['PostalCode'] 

# move the postal code column to the first position
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_grouped = toronto_onehot.groupby('PostalCode').mean().reset_index()
toronto_grouped

Unnamed: 0,PostalCode,Accessories Store,Afghan Restaurant,American Restaurant,Antique Shop,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,...,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,M1B,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.0
1,M1C,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.0
2,M1E,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.0
3,M1G,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.0
4,M1H,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
98,M9N,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.0
99,M9P,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.0
100,M9R,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,...,0.0,0.0,0.0,0.0000,0.0,0.0,0.0,0.0,0.0,0.0
101,M9V,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0,0.0,...,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0
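
Taking the mean of the one-hot columns per postal code gives, for each postal code, the share of its venues that fall in each category. A minimal pure-Python sketch of the same computation, using the four M3A venues from the venues table above, is shown below; `category_shares` is a hypothetical helper, not part of the notebook.

```python
from collections import Counter, defaultdict

# (PostalCode, Venue Category) pairs for M3A, copied from the venues table above
venues = [
    ("M3A", "Bus Stop"),
    ("M3A", "Intersection"),
    ("M3A", "Fried Chicken Joint"),
    ("M3A", "Bus Stop"),
]


def category_shares(pairs):
    """Map each postal code to {category: fraction of its venues}."""
    counts = defaultdict(Counter)
    for postal_code, category in pairs:
        counts[postal_code][category] += 1
    shares = {}
    for postal_code, counter in counts.items():
        total = sum(counter.values())
        shares[postal_code] = {cat: n / total for cat, n in counter.items()}
    return shares


shares = category_shares(venues)
print(shares["M3A"])
```

Each row of `toronto_grouped` therefore sums to 1, and postal codes with few venues have coarse values such as 0.25 or 0.5.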


### Top 10 venues per postal area

Function to sort a row's venue categories by frequency in descending order and return the most common ones.

In [43]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Generating a dataframe with the top 10 venue categories for each postal area.

In [44]:
import numpy as np

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['PostalCode']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # past 'st', 'nd', 'rd': fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['PostalCode'] = toronto_grouped['PostalCode']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,PostalCode,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Hobby Shop,Furniture / Home Store,Electronics Store,Fast Food Restaurant,Fish Market,Falafel Restaurant,Farmers Market,Field,Fish & Chips Shop,Yoga Studio
1,M1C,Bar,Park,Yoga Studio,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Fish Market
2,M1E,Pizza Place,Bank,Fast Food Restaurant,Coffee Shop,Convenience Store,Fried Chicken Joint,Sports Bar,Breakfast Spot,Supermarket,Beer Store
3,M1G,Coffee Shop,Park,Mexican Restaurant,Korean BBQ Restaurant,Business Service,Diner,Falafel Restaurant,Department Store,French Restaurant,Fountain
4,M1H,Construction & Landscaping,Trail,Yoga Studio,Fish Market,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop
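
The column labels above rely on a three-element suffix list with a fallback to `th`, which is correct for 1 through 10. If `num_top_venues` were raised past 20, labels such as 21st would need a more general rule. A small sketch of one, with `ordinal` as a hypothetical helper name:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


columns = ["PostalCode"] + ["{} Most Common Venue".format(ordinal(i)) for i in range(1, 11)]
print(columns[1], "|", columns[4], "|", columns[10])
```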


### Clustering Postal Areas

Run k-means to group the postal code areas into 4 clusters.

In [45]:
# Importing library
from sklearn.cluster import KMeans

In [67]:
# Set number of clusters
kclusters = 4

# Independent variables
toronto_grouped_clustering = toronto_grouped.drop(columns='PostalCode')

# Run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# Check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 0, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
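
For readers unfamiliar with what `KMeans` does internally, the following is a bare-bones sketch of Lloyd's algorithm, the standard k-means iteration: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until the assignments stop changing. It is a teaching sketch under simplifying assumptions (initial centroids are passed in by hand, points are plain tuples, and the toy data is made up for illustration); scikit-learn's implementation adds smarter initialization and other refinements.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point
        new_labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments unchanged: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Toy 2-D data (made up for illustration): two well-separated groups
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

In the notebook each "point" is a row of `toronto_grouped_clustering`, so the distance is computed across all venue-category columns rather than two coordinates.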

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each postal code area.

In [68]:
# Add clustering labels
#neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
neighborhoods_venues_sorted['Cluster Labels'] = kmeans.labels_
toronto_merged = df

# Join the top venues and cluster labels onto df, which holds latitude/longitude for each postal code
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('PostalCode'), on='PostalCode')

toronto_merged # check the last columns!

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
0,M3A,North York,Parkwoods,43.756123,-79.329636,Bus Stop,Fried Chicken Joint,Intersection,Flea Market,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Fish Market,1
1,M4A,North York,Victoria Village,43.726780,-79.310738,Park,Portuguese Restaurant,Intersection,Pizza Place,Coffee Shop,Department Store,Escape Room,Food Truck,Food Court,Food & Drink Shop,0
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.655354,-79.365044,Coffee Shop,Italian Restaurant,Breakfast Spot,Yoga Studio,Grocery Store,German Restaurant,Liquor Store,Café,Food Truck,Skating Rink,1
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.721996,-79.445915,Coffee Shop,Platform,Metro Station,Convenience Store,Video Game Store,Bakery,Park,Yoga Studio,Fish & Chips Shop,Falafel Restaurant,1
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.663910,-79.388733,Coffee Shop,Gym,Chinese Restaurant,Ethiopian Restaurant,College Cafeteria,Sushi Restaurant,Escape Room,Bubble Tea Shop,Restaurant,Beer Bar,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.652699,-79.511276,Pool,Fish Market,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Flea Market,Escape Room,2
99,M4Y,Downtown Toronto,Church and Wellesley,43.666286,-79.382446,Coffee Shop,Restaurant,Sushi Restaurant,Gay Bar,Japanese Restaurant,Café,Pub,Bubble Tea Shop,Hotel,Men's Store,1
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.663506,-79.317429,Harbor / Marina,Park,Fast Food Restaurant,Brewery,Sushi Restaurant,Liquor Store,Farmers Market,Garden,Restaurant,Movie Theater,1
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.633709,-79.496521,Baseball Field,Construction & Landscaping,Park,Yoga Studio,Fish Market,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,0


Visualizing the resulting clusters

In [69]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['PostalCode'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examining Cluster Distribution

Now, we can examine each cluster and determine the discriminating venue categories that distinguish each cluster.

#### Cluster 1

In [77]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[0] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
1,M4A,Park,Portuguese Restaurant,Intersection,Pizza Place,Coffee Shop,Department Store,Escape Room,Food Truck,Food Court,Food & Drink Shop,0
5,M9A,Park,Pharmacy,Skating Rink,Baseball Field,Bank,Shopping Mall,Grocery Store,Café,Flea Market,Fish Market,0
12,M1C,Bar,Park,Yoga Studio,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Fish Market,0
28,M3H,Middle Eastern Restaurant,Mediterranean Restaurant,Pizza Place,Park,Fish Market,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,0
35,M4J,Convenience Store,Intersection,Park,Yoga Studio,Fish & Chips Shop,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,0
46,M3L,Grocery Store,Park,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Yoga Studio,0
64,M9N,Grocery Store,Park,Diner,Pharmacy,Field,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,0
66,M2P,Speakeasy,Convenience Store,Park,Yoga Studio,Fish & Chips Shop,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,0
95,M1X,Playground,Park,Yoga Studio,Fish & Chips Shop,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish Market,0
101,M8Y,Baseball Field,Construction & Landscaping,Park,Yoga Studio,Fish Market,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,0


#### Cluster 2

In [78]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[0] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
0,M3A,Bus Stop,Fried Chicken Joint,Intersection,Flea Market,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Fish Market,1
2,M5A,Coffee Shop,Italian Restaurant,Breakfast Spot,Yoga Studio,Grocery Store,German Restaurant,Liquor Store,Café,Food Truck,Skating Rink,1
3,M6A,Coffee Shop,Platform,Metro Station,Convenience Store,Video Game Store,Bakery,Park,Yoga Studio,Fish & Chips Shop,Falafel Restaurant,1
4,M7A,Coffee Shop,Gym,Chinese Restaurant,Ethiopian Restaurant,College Cafeteria,Sushi Restaurant,Escape Room,Bubble Tea Shop,Restaurant,Beer Bar,1
6,M1B,Hobby Shop,Furniture / Home Store,Electronics Store,Fast Food Restaurant,Fish Market,Falafel Restaurant,Farmers Market,Field,Fish & Chips Shop,Yoga Studio,1
...,...,...,...,...,...,...,...,...,...,...,...,...
96,M4X,Coffee Shop,Café,Pizza Place,Restaurant,Grocery Store,Market,Pub,Japanese Restaurant,Bakery,Chinese Restaurant,1
97,M5X,Coffee Shop,Hotel,Café,Restaurant,Gym,Japanese Restaurant,Deli / Bodega,American Restaurant,Salad Place,Seafood Restaurant,1
99,M4Y,Coffee Shop,Restaurant,Sushi Restaurant,Gay Bar,Japanese Restaurant,Café,Pub,Bubble Tea Shop,Hotel,Men's Store,1
100,M7Y,Harbor / Marina,Park,Fast Food Restaurant,Brewery,Sushi Restaurant,Liquor Store,Farmers Market,Garden,Restaurant,Movie Theater,1


#### Cluster 3

In [79]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[0] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
7,M3B,Pool,Fish Market,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Flea Market,Escape Room,2
45,M2L,Pool,Concert Hall,Fish Market,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Yoga Studio,2
98,M8X,Pool,Fish Market,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Flea Market,Escape Room,2


#### Cluster 4

In [80]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[0] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
56,M6M,Fast Food Restaurant,Playground,Lawyer,Yoga Studio,Fish & Chips Shop,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Field,3
68,M5P,Lawyer,Yoga Studio,Fish Market,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fish & Chips Shop,Flea Market,3
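
One way to summarize the tables above is to count, for each cluster, how often each category appears as the most common venue. The sketch below does this for the two smallest clusters, using the rows shown above; `top_venue_counts` is a hypothetical helper, and the input is hand-copied from those tables rather than read from `toronto_merged`.

```python
from collections import Counter, defaultdict

# (Cluster Labels, 1st Most Common Venue) pairs copied from the tables above
rows = [
    (2, "Pool"),                  # M3B
    (2, "Pool"),                  # M2L
    (2, "Pool"),                  # M8X
    (3, "Fast Food Restaurant"),  # M6M
    (3, "Lawyer"),                # M5P
]


def top_venue_counts(pairs):
    """Map each cluster label to a Counter of its first-ranked venue categories."""
    counts = defaultdict(Counter)
    for cluster, venue in pairs:
        counts[cluster][venue] += 1
    return counts


summary = top_venue_counts(rows)
for cluster in sorted(summary):
    print(cluster, summary[cluster].most_common())
```

With the dataframe at hand, a similar tally could likely be obtained with `toronto_merged.groupby('Cluster Labels')['1st Most Common Venue'].value_counts()`.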
