## Segmenting and Clustering Neighborhoods in Toronto
As part of the Applied Data Science Capstone (week 3) we will explore and build a cluster of Toronto neighborhoods by scraping data from a wikipedia page.

In [1]:
import numpy as np
import pandas as pd
from urllib.request import urlopen
from bs4 import BeautifulSoup as bs

### Section 1 - Get and parse the data from the wikipedia page into a dataframe

In [2]:
postalWiki = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
page = urlopen(postalWiki)
soup = bs(page)

By looking at the html and rendered site we assume that the first table is the one we are looking for.

In [3]:
# find the first table
htmlTable = soup.find('table')

# convert the html table to a dataframe
neighDf = pd.read_html(str(htmlTable))[0] # first table
neighDf.head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
5,M5A,Downtown Toronto,Regent Park
6,M6A,North York,Lawrence Heights
7,M6A,North York,Lawrence Manor
8,M7A,Queen's Park,Not assigned
9,M8A,Not assigned,Not assigned


Cleanup and structure df:

In [4]:
# remove empty (Not assigned) Borough named rows
neighDf = neighDf[neighDf.Borough != 'Not assigned']

# join the Neighbourhoods which belong to the same Borough
neighDf = neighDf.groupby(['Postcode', 'Borough'], sort=False).agg( ','.join).reset_index()

# for no Neighbourhodd use the Borough name
neighDf.loc[neighDf.Neighbourhood == 'Not assigned', 'Neighbourhood'] = neighDf.Borough

# rename columns to better names
neighDf.rename(columns={'Postcode': 'PostalCode', 'Neighbourhood': 'Neighborhood'}, inplace=True)

neighDf.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Harbourfront,Regent Park"
3,M6A,North York,"Lawrence Heights,Lawrence Manor"
4,M7A,Queen's Park,Queen's Park
5,M9A,Etobicoke,Islington Avenue
6,M1B,Scarborough,"Rouge,Malvern"
7,M3B,North York,Don Mills North
8,M4B,East York,"Woodbine Gardens,Parkview Hill"
9,M5B,Downtown Toronto,"Ryerson,Garden District"


In [5]:
neighDf.shape

(103, 3)

### Section 2 - Add geodata (lat and long) to the dataframe

Note: The geocoder package didn't work so great so we are using the included geospatial Toronto csv file.

In [6]:
geoDf = pd.read_csv('Geospatial_Coordinates.csv')
nGeoDf = neighDf.join(geoDf.set_index('Postal Code'), on='PostalCode')

nGeoDf.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636
3,M6A,North York,"Lawrence Heights,Lawrence Manor",43.718518,-79.464763
4,M7A,Queen's Park,Queen's Park,43.662301,-79.389494
5,M9A,Etobicoke,Islington Avenue,43.667856,-79.532242
6,M1B,Scarborough,"Rouge,Malvern",43.806686,-79.194353
7,M3B,North York,Don Mills North,43.745906,-79.352188
8,M4B,East York,"Woodbine Gardens,Parkview Hill",43.706397,-79.309937
9,M5B,Downtown Toronto,"Ryerson,Garden District",43.657162,-79.378937


### Section 3 - Explore and cluster the neighborhoods in Toronto

In this section we'll:
* Visualize the Toronto neighborhoods.
* Grab venue data from the Foursquare API.
* Explore and Analyze the neighborhoods with the venue data.
* Cluster the Toronto neighborhoods using K-Means and examine the clusters.

Note: This will be a similar procedure to the New York City example in the course.

In [7]:
import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

from geopy.geocoders import Nominatim
import folium

import json
import requests

First lets only include the neighborhoods including the word Toronto cause 103 boroughs is a little much:

In [8]:
boroughNames = list(nGeoDf.Borough.unique())

torontoBoroughs = []

for x in boroughNames:
    if "toronto" in x.lower():
        torontoBoroughs.append(x)
        
torontoDf = nGeoDf[nGeoDf['Borough'].isin(torontoBoroughs)].reset_index(drop=True)
torontoDf.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636
1,M5B,Downtown Toronto,"Ryerson,Garden District",43.657162,-79.378937
2,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
3,M4E,East Toronto,The Beaches,43.676357,-79.293031
4,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
5,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383
6,M6G,Downtown Toronto,Christie,43.669542,-79.422564
7,M5H,Downtown Toronto,"Adelaide,King,Richmond",43.650571,-79.384568
8,M6H,West Toronto,"Dovercourt Village,Dufferin",43.669005,-79.442259
9,M5J,Downtown Toronto,"Harbourfront East,Toronto Islands,Union Station",43.640816,-79.381752


#### Let's map the Toronto neighborhoods.

In [9]:
# get the center of Toronto
geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode('Toronto')
latitude = location.latitude
longitude = location.longitude
print('Coordinate of Toronto are {}, {}.'.format(latitude, longitude))

Coordinate of Toronto are 43.653963, -79.387207.


In [10]:
# map of Toronto using coordinates above
torontoMap = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, borough, neighborhood in zip(torontoDf['Latitude'], torontoDf['Longitude'], torontoDf['Borough'], torontoDf['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(torontoMap)  
    
torontoMap

#### Foursquare Venue Data:

In [11]:
# Foursquare Credentials and Version, hidden before upload (add your own to try)
CLIENT_ID = '######'
CLIENT_SECRET = '######'
VERSION = '20180605'

In [12]:
# function to pull upto 100 venues from radius of 500 meters
LIMIT = 100
radius = 500

venues = []

for lat, long, post, borough, neighborhood in zip(torontoDf['Latitude'], torontoDf['Longitude'], torontoDf['PostalCode'], torontoDf['Borough'], torontoDf['Neighborhood']):
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    for venue in results:
        venues.append((
            post, 
            borough,
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))

In [13]:
# convert the venues list into a new DataFrame
venuesDf = pd.DataFrame(venues)

# define the column names
venuesDf.columns = ['PostalCode', 'Borough', 'Neighborhood', 'BoroughLatitude', 'BoroughLongitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venuesDf.shape)
venuesDf.head(10)

(1704, 9)


Unnamed: 0,PostalCode,Borough,Neighborhood,BoroughLatitude,BoroughLongitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,Toronto Cooper Koo Family Cherry St YMCA Centre,43.653191,-79.357947,Gym / Fitness Center
3,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
4,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,Impact Kitchen,43.656369,-79.35698,Restaurant
5,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,Morning Glory Cafe,43.653947,-79.361149,Breakfast Spot
6,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,Figs Breakfast & Lunch,43.655675,-79.364503,Breakfast Spot
7,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,The Extension Room,43.653313,-79.359725,Gym / Fitness Center
8,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,Corktown Common,43.655618,-79.356211,Park
9,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,Dominion Pub and Kitchen,43.656919,-79.358967,Pub


Count the venues for neigborhoods:

In [14]:
venuesDf.groupby(['PostalCode', 'Borough', 'Neighborhood']).count()

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,BoroughLatitude,BoroughLongitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
PostalCode,Borough,Neighborhood,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
M4E,East Toronto,The Beaches,4,4,4,4,4,4
M4K,East Toronto,"The Danforth West,Riverdale",44,44,44,44,44,44
M4L,East Toronto,"The Beaches West,India Bazaar",20,20,20,20,20,20
M4M,East Toronto,Studio District,38,38,38,38,38,38
M4N,Central Toronto,Lawrence Park,3,3,3,3,3,3
M4P,Central Toronto,Davisville North,7,7,7,7,7,7
M4R,Central Toronto,North Toronto West,21,21,21,21,21,21
M4S,Central Toronto,Davisville,33,33,33,33,33,33
M4T,Central Toronto,"Moore Park,Summerhill East",1,1,1,1,1,1
M4V,Central Toronto,"Deer Park,Forest Hill SE,Rathnelly,South Hill,Summerhill West",15,15,15,15,15,15


In [15]:
print('There are {} uniques categories.'.format(len(venuesDf['VenueCategory'].unique())))

There are 233 uniques categories.


### Analyze the neighborhoods:

In [16]:
# one hot encoding
torontoOnehot = pd.get_dummies(venuesDf[['VenueCategory']], prefix="", prefix_sep="")

# add postal, borough, and neighborhood column back to dataframe
torontoOnehot['PostalCode'] = venuesDf['PostalCode'] 
torontoOnehot['Borough'] = venuesDf['Borough'] 
torontoOnehot['Neighborhoods'] = venuesDf['Neighborhood'] 

fixedColumns = list(torontoOnehot.columns[-3:]) + list(torontoOnehot.columns[:-3])
torontoOnehot = torontoOnehot[fixedColumns]

torontoOnehot.head()

Unnamed: 0,PostalCode,Borough,Neighborhoods,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,M5A,Downtown Toronto,"Harbourfront,Regent Park",0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,M5A,Downtown Toronto,"Harbourfront,Regent Park",0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,M5A,Downtown Toronto,"Harbourfront,Regent Park",0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,M5A,Downtown Toronto,"Harbourfront,Regent Park",0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,M5A,Downtown Toronto,"Harbourfront,Regent Park",0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [17]:
torontoOnehot.shape

(1704, 236)

Group rows by PostalCode and by taking the mean of the frequency of occurrence of each category:

In [18]:
torontoGrouped = torontoOnehot.groupby(['PostalCode', 'Borough', 'Neighborhoods']).mean().reset_index()
torontoGrouped.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhoods,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,M4E,East Toronto,The Beaches,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M4K,East Toronto,"The Danforth West,Riverdale",0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.022727
2,M4L,East Toronto,"The Beaches West,India Bazaar",0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,M4M,East Toronto,Studio District,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316
4,M4N,Central Toronto,Lawrence Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,M4P,Central Toronto,Davisville North,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,M4R,Central Toronto,North Toronto West,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619
7,M4S,Central Toronto,Davisville,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,M4T,Central Toronto,"Moore Park,Summerhill East",0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,M4V,Central Toronto,"Deer Park,Forest Hill SE,Rathnelly,South Hill,...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0


Now let's create the new dataframe and display the top 10 venues for each postalcode.

In [19]:
numTopVenues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
areaColumns = ['PostalCode', 'Borough', 'Neighborhoods']
freqColumns = []
for ind in np.arange(numTopVenues):
    try:
        freqColumns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        freqColumns.append('{}th Most Common Venue'.format(ind+1))
columns = areaColumns+freqColumns

# create a new dataframe
venuesSorted = pd.DataFrame(columns=columns)
venuesSorted['PostalCode'] = torontoGrouped['PostalCode']
venuesSorted['Borough'] = torontoGrouped['Borough']
venuesSorted['Neighborhoods'] = torontoGrouped['Neighborhoods']

for ind in np.arange(torontoGrouped.shape[0]):
    rowCategories = torontoGrouped.iloc[ind, :].iloc[3:]
    rowCategoriesSorted = rowCategories.sort_values(ascending=False)
    venuesSorted.iloc[ind, 3:] = rowCategoriesSorted.index.values[0:numTopVenues]

print(venuesSorted.shape)
venuesSorted

(38, 13)


Unnamed: 0,PostalCode,Borough,Neighborhoods,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,East Toronto,The Beaches,Health Food Store,Neighborhood,Trail,Pub,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
1,M4K,East Toronto,"The Danforth West,Riverdale",Greek Restaurant,Coffee Shop,Italian Restaurant,Furniture / Home Store,Bookstore,Ice Cream Shop,Yoga Studio,Brewery,Bubble Tea Shop,Café
2,M4L,East Toronto,"The Beaches West,India Bazaar",Park,Sandwich Place,Sushi Restaurant,Pet Store,Pizza Place,Pub,Movie Theater,Burrito Place,Burger Joint,Brewery
3,M4M,East Toronto,Studio District,Café,Coffee Shop,Italian Restaurant,Bakery,American Restaurant,Yoga Studio,Park,Seafood Restaurant,Sandwich Place,Cheese Shop
4,M4N,Central Toronto,Lawrence Park,Park,Bus Line,Swim School,Yoga Studio,Dog Run,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant
5,M4P,Central Toronto,Davisville North,Breakfast Spot,Park,Clothing Store,Food & Drink Shop,Hotel,Gym,Sandwich Place,Falafel Restaurant,Event Space,Ethiopian Restaurant
6,M4R,Central Toronto,North Toronto West,Clothing Store,Sporting Goods Shop,Coffee Shop,Yoga Studio,Gym / Fitness Center,Dessert Shop,Chinese Restaurant,Rental Car Location,Restaurant,Diner
7,M4S,Central Toronto,Davisville,Dessert Shop,Sandwich Place,Gym,Café,Coffee Shop,Pizza Place,Sushi Restaurant,Italian Restaurant,Diner,Farmers Market
8,M4T,Central Toronto,"Moore Park,Summerhill East",Restaurant,Yoga Studio,Diner,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
9,M4V,Central Toronto,"Deer Park,Forest Hill SE,Rathnelly,South Hill,...",Pub,Coffee Shop,Pizza Place,Light Rail Station,Sports Bar,Restaurant,Supermarket,Sushi Restaurant,Fried Chicken Joint,Bagel Shop


### Cluster PostalCodes 

In [20]:
# set number of clusters
kclusters = 5

torontoGroupedClustering = torontoGrouped.drop(['PostalCode', 'Borough', 'Neighborhoods'], 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(torontoGroupedClustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([4, 4, 4, 4, 2, 4, 4, 4, 0, 4])

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each postalcode.

In [21]:
# create a new dataframe that includes the cluster as well as the top 10 venues for each postalcode and neighborhoods.
torontoMerged = torontoDf.copy()

# add clustering labels
torontoMerged["Cluster Labels"] = kmeans.labels_

# merge torontoGrouped with torontoData to add latitude/longitude for each neighborhood
torontoMerged = torontoMerged.join(venuesSorted.drop(['Borough', 'Neighborhoods'], 1).set_index('PostalCode'), on='PostalCode')

print(torontoMerged.shape)
torontoMerged.head() # check the last columns!

(38, 16)


Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,4,Coffee Shop,Pub,Bakery,Park,Café,Mexican Restaurant,Restaurant,Gym / Fitness Center,Breakfast Spot,Theater
1,M5B,Downtown Toronto,"Ryerson,Garden District",43.657162,-79.378937,4,Coffee Shop,Clothing Store,Café,Middle Eastern Restaurant,Cosmetics Shop,Tea Room,Plaza,Japanese Restaurant,Ramen Restaurant,Restaurant
2,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,4,Coffee Shop,Café,Restaurant,Hotel,Italian Restaurant,Breakfast Spot,Cosmetics Shop,Clothing Store,Beer Bar,Bakery
3,M4E,East Toronto,The Beaches,43.676357,-79.293031,4,Health Food Store,Neighborhood,Trail,Pub,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
4,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,2,Coffee Shop,Cocktail Bar,Café,Cheese Shop,Steakhouse,Bakery,Italian Restaurant,Seafood Restaurant,Farmers Market,Beer Bar


Visualize the clustering:

In [22]:
# create map
clustersMap = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colorsArray = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colorsArray]

# add markers to the map
for lat, lon, post, bor, poi, cluster in zip(torontoMerged['Latitude'], torontoMerged['Longitude'], torontoMerged['PostalCode'], torontoMerged['Borough'], torontoMerged['Neighborhood'], torontoMerged['Cluster Labels']):
    label = folium.Popup('{} ({}): {} - Cluster {}'.format(bor, post, poi, cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(clustersMap)
       
clustersMap

### Examine the Cluster

#### Cluster 1

In [23]:
torontoMerged.loc[torontoMerged['Cluster Labels'] == 0, torontoMerged.columns[[1] + list(range(5, torontoMerged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,West Toronto,0,Supermarket,Pharmacy,Bakery,Café,Bar,Bank,Brewery,Middle Eastern Restaurant,Music Venue,Art Gallery


#### Cluster 2

In [24]:
torontoMerged.loc[torontoMerged['Cluster Labels'] == 1, torontoMerged.columns[[1] + list(range(5, torontoMerged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
22,Central Toronto,1,Clothing Store,Sporting Goods Shop,Coffee Shop,Yoga Studio,Gym / Fitness Center,Dessert Shop,Chinese Restaurant,Rental Car Location,Restaurant,Diner


#### Cluster 3

In [25]:
torontoMerged.loc[torontoMerged['Cluster Labels'] == 2, torontoMerged.columns[[1] + list(range(5, torontoMerged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Downtown Toronto,2,Coffee Shop,Cocktail Bar,Café,Cheese Shop,Steakhouse,Bakery,Italian Restaurant,Seafood Restaurant,Farmers Market,Beer Bar


#### Cluster 4

In [26]:
torontoMerged.loc[torontoMerged['Cluster Labels'] == 3, torontoMerged.columns[[1] + list(range(5, torontoMerged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,West Toronto,3,Bar,Asian Restaurant,Coffee Shop,Café,Boutique,Vietnamese Restaurant,French Restaurant,Pizza Place,Men's Store,Restaurant
23,Central Toronto,3,Coffee Shop,Café,Sandwich Place,Pizza Place,Liquor Store,Park,Jewish Restaurant,BBQ Joint,Pub,Indian Restaurant


#### Cluster 5

In [27]:
torontoMerged.loc[torontoMerged['Cluster Labels'] == 4, torontoMerged.columns[[1] + list(range(5, torontoMerged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,4,Coffee Shop,Pub,Bakery,Park,Café,Mexican Restaurant,Restaurant,Gym / Fitness Center,Breakfast Spot,Theater
1,Downtown Toronto,4,Coffee Shop,Clothing Store,Café,Middle Eastern Restaurant,Cosmetics Shop,Tea Room,Plaza,Japanese Restaurant,Ramen Restaurant,Restaurant
2,Downtown Toronto,4,Coffee Shop,Café,Restaurant,Hotel,Italian Restaurant,Breakfast Spot,Cosmetics Shop,Clothing Store,Beer Bar,Bakery
3,East Toronto,4,Health Food Store,Neighborhood,Trail,Pub,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
5,Downtown Toronto,4,Coffee Shop,Café,Italian Restaurant,Sandwich Place,Middle Eastern Restaurant,Burger Joint,Ice Cream Shop,Salad Place,Bubble Tea Shop,Spa
6,Downtown Toronto,4,Grocery Store,Café,Park,Baby Store,Diner,Italian Restaurant,Restaurant,Nightclub,Coffee Shop,Convenience Store
7,Downtown Toronto,4,Coffee Shop,Café,Steakhouse,Bar,Restaurant,Burger Joint,American Restaurant,Hotel,Cosmetics Shop,Thai Restaurant
9,Downtown Toronto,4,Coffee Shop,Aquarium,Hotel,Café,Brewery,Fried Chicken Joint,Scenic Lookout,Italian Restaurant,History Museum,Restaurant
11,East Toronto,4,Greek Restaurant,Coffee Shop,Italian Restaurant,Furniture / Home Store,Bookstore,Ice Cream Shop,Yoga Studio,Brewery,Bubble Tea Shop,Café
12,Downtown Toronto,4,Coffee Shop,Café,Hotel,Restaurant,Gastropub,Deli / Bodega,Italian Restaurant,Bakery,Bar,American Restaurant


***fin***