# Segmenting and Clustering Neighborhoods in Toronto

In this assignment, we will explore, segment, and cluster the neighborhoods in the city of Toronto based on the postal code and borough information.


## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1.  <a href="#item1">Webscraping and Data Wrangling</a>
    
2.  <a href="#item2">Fetching Coordinates</a>
    
3.  <a href="#item3">Explore Neighborhoods in Toronto</a>    
    </font>
    </div>


In [1]:
import pandas as pd 
import numpy as np
import requests  # to download a web page
from bs4 import BeautifulSoup  # parses HTML for web scraping
import pgeocode  # for querying of GPS coordinates from postal codes
from geopy.geocoders import Nominatim  # convert an address into latitude and longitude values
import folium  # map rendering library
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
import json

## 1. <a id="item1">Webscraping and Data Wrangling</a>

The required information on the Toronto neighborhoods can be found on the respective [Wikipedia webpage](https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M):
> This is a list of postal codes in Canada where the first letter is M. Postal codes beginning with M are located within the city of Toronto in the province of Ontario. Only the first three characters are listed, corresponding to the Forward Sortation Area.

We will first scrape this Wikipedia page for the relevant data, clean it, and read it into a pandas dataframe so that it is in a structured format.

In [31]:
# Get the webpage and store it in text format
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
data = requests.get(url).text
data[:1000]

'<!DOCTYPE html>\n<html class="client-nojs" lang="en" dir="ltr">\n<head>\n<meta charset="UTF-8"/>\n<title>List of postal codes of Canada: M - Wikipedia</title>\n<script>document.documentElement.className="client-js";RLCONF={"wgBreakFrames":!1,"wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgRequestId":"89e20f76-9f5d-42b4-b21e-26fd700031ab","wgCSPNonce":!1,"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":!1,"wgNamespaceNumber":0,"wgPageName":"List_of_postal_codes_of_Canada:_M","wgTitle":"List of postal codes of Canada: M","wgCurRevisionId":1013111980,"wgRevisionId":1013111980,"wgArticleId":539066,"wgIsArticle":!0,"wgIsRedirect":!1,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["Articles with short description","Short description is different from Wikidata","Wikipedia semi-protec

A Beautiful Soup object can be used to parse the contents of the downloaded HTML.

In [32]:
soup = BeautifulSoup(data, 'html.parser')
#print(soup.prettify())

In [4]:
# See how many tables overall
tables = soup.find_all('table')
print('Number of tables in the page: ', len(tables))

# Find the correct table index
for index, table in enumerate(tables):
    if ('Downtown Toronto' in str(table)):
        table_index, table_html = index, table
        print('Table with neighborhoods data: ', table_index)

Number of tables in the page:  3
Table with neighborhoods data:  0


The dataframe will consist of three columns: Postal Code, Borough, and Neighborhood.

In [5]:
# First extract the contents of table cells, and store them in a list
table_contents = []

for row in table_html('td'):
    cell = {}
    # Only process the cells that have an assigned borough
    if row.span.text == 'Not assigned':
        pass
    else:
        cell['Postal Code'] = row.p.text[:3]
        cell['Borough'] = (row.span.text).split('(')[0]
        cell['Neighborhood'] = (((((row.span.text).split('(')[1]).strip(')')).replace(' /',',')).replace(')',' ')).strip(' ')
        table_contents.append(cell)

df = pd.DataFrame(table_contents)
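
The chained string operations in the loop are easier to follow when applied to a single cell. The sketch below runs the same steps on a sample string; the helper name `split_cell` and the sample text are illustrative assumptions and are not part of the notebook.

```python
def split_cell(span_text):
    """Split 'Borough(Neighborhood / Neighborhood)' into (borough, neighborhoods)."""
    parts = span_text.split('(')
    borough = parts[0]
    # Same cleanup chain as in the loop: drop the closing parenthesis,
    # turn ' /' separators into commas, and trim surrounding spaces.
    neighborhood = parts[1].strip(')').replace(' /', ',').replace(')', ' ').strip(' ')
    return borough, neighborhood


# Hypothetical cell text, shaped like the entries in the Wikipedia table
sample = 'Downtown Toronto(Regent Park / Harbourfront)'
print(split_cell(sample))
```

For this sample the helper returns the borough `Downtown Toronto` and the neighborhood string `Regent Park, Harbourfront`, the same form as the M5A row in the dataframe.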

Some of the entries contain garbage or incorrect values. These can be spotted, for example, by listing the distinct values with the `unique()` function.

In [6]:
print(df['Borough'].unique())

['North York' 'Downtown Toronto' "Queen's Park" 'Etobicoke' 'Scarborough'
 'East York' 'York' 'East Toronto' 'West Toronto' 'East YorkEast Toronto'
 'Central Toronto' 'MississaugaCanada Post Gateway Processing Centre'
 'Downtown TorontoStn A PO Boxes25 The Esplanade' 'EtobicokeNorthwest'
 'East TorontoBusiness reply mail Processing Centre969 Eastern']
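
Most of the malformed values in this output are two strings fused together without a space, so a lowercase letter sits directly before an uppercase one. As a rough heuristic (an assumption made here, not a general rule), that pattern can be used to flag suspicious entries automatically. The sample values below are copied from the output; the variable names are illustrative.

```python
import pandas as pd

boroughs = pd.Series([
    'North York',
    "Queen's Park",
    'East YorkEast Toronto',
    'EtobicokeNorthwest',
    'MississaugaCanada Post Gateway Processing Centre',
])

# A lowercase letter immediately followed by an uppercase letter suggests
# that two separate strings were concatenated during scraping.
suspicious = boroughs[boroughs.str.contains(r'[a-z][A-Z]', regex=True)]
print(suspicious.tolist())
```

The heuristic would also flag legitimate names with internal capitals, so the flagged values still need a manual look before they are replaced.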


In [7]:
# Manually clean the data
# Assign the result back instead of using inplace=True on a column selection,
# which newer pandas versions warn about and may not apply to df
df['Borough'] = df['Borough'].replace({
    'Downtown TorontoStn A PO Boxes25 The Esplanade': 'Downtown Toronto', 
    'East TorontoBusiness reply mail Processing Centre969 Eastern': 'East Toronto', 
    'EtobicokeNorthwest': 'Etobicoke Northwest', 
    'East YorkEast Toronto': 'East York/East Toronto',
    'MississaugaCanada Post Gateway Processing Centre': 'Mississauga'
    })

df.head()

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Queen's Park,Ontario Provincial Government


In [8]:
print('The dimension of the dataframe: ', df.shape)

The dimension of the dataframe:  (103, 3)


## 2. <a id="item2"> Fetching Coordinates</a>

Now that we have built a dataframe of neighborhoods along with their postal codes and borough names, we want to use the [Foursquare](https://foursquare.com/) API for location data. To do this, we first need the latitude and longitude coordinates of each neighborhood.

The method suggested for this exercise, the Geocoder Python package, does not work and returns null results (a known issue). Rather than relying only on the prepared 'Geospatial_Coordinates.csv' file, we will try the [pgeocode module](https://pypi.org/project/pgeocode/) as an alternative.

In [9]:
postal_code = df.loc[0, 'Postal Code']
nomi = pgeocode.Nominatim('ca')
nomi.query_postal_code(postal_code)

postal_code                                                     M3A
country_code                                                     CA
place_name        North York (York Heights / Victoria Village / ...
state_name                                                  Ontario
state_code                                                       ON
county_name                                             North York 
county_code                                                     NaN
community_name                                                  NaN
community_code                                                  NaN
latitude                                                    43.7545
longitude                                                    -79.33
accuracy                                                        1.0
Name: 0, dtype: object

Passing a list of postal codes returns a dataframe. Besides the coordinates, this method also gives the borough and neighborhood names in the `place_name` column.

In [10]:
postal_codes = list(df['Postal Code'])
df_pgc = nomi.query_postal_code(postal_codes)[['postal_code', 'place_name', 'latitude', 'longitude']]
df_pgc.head()

Unnamed: 0,postal_code,place_name,latitude,longitude
0,M3A,North York (York Heights / Victoria Village / ...,43.7545,-79.33
1,M4A,North York (Sweeney Park / Wigmore Park),43.7276,-79.3148
2,M5A,Downtown Toronto (Regent Park / Port of Toronto),43.6555,-79.3626
3,M6A,North York (Lawrence Manor / Lawrence Heights),43.7223,-79.4504
4,M7A,Queen's Park Ontario Provincial Government,43.6641,-79.3889


Let's compare this with the data contained in the prepared CSV file. We combine that file with the original `df` by merging on Postal Code (`pd.merge` performs an inner join by default).

In [11]:
df_coord = pd.read_csv('Geospatial_Coordinates.csv')
df = pd.merge(df, df_coord, on = 'Postal Code')
df.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494


The coordinates from the two sources are very similar, but the neighborhood names differ slightly between the two methods. We continue with the `df` dataframe in what follows.
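
To put a number on how close the two coordinate sources are, one option is the great-circle (haversine) distance between matching rows. The sketch below uses the five rows shown in the outputs above; the helper `haversine_km` and the mean Earth radius of 6371 km are assumptions of this example, not part of the notebook.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0  # mean Earth radius, an approximation
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    h = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(h))


# (pgeocode lat, lon) and (CSV lat, lon), copied from the first five rows of each output
pairs = {
    'M3A': ((43.7545, -79.33), (43.753259, -79.329656)),
    'M4A': ((43.7276, -79.3148), (43.725882, -79.315572)),
    'M5A': ((43.6555, -79.3626), (43.65426, -79.360636)),
    'M6A': ((43.7223, -79.4504), (43.718518, -79.464763)),
    'M7A': ((43.6641, -79.3889), (43.662301, -79.389494)),
}

distances = {code: haversine_km(*p, *q) for code, (p, q) in pairs.items()}
for code, km in distances.items():
    print(f'{code}: {km:.2f} km')
```

If a distance turns out to be large compared with the venue search radius used later, the choice of coordinate source could change which venues are returned for that postal code.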

#### Create a map of Toronto with neighborhoods superimposed on top

Use the geopy library to get the latitude and longitude values of Toronto. 
To create an instance of the geocoder, we need to define a user_agent. We will name our agent <em>toronto_explorer</em>, as shown below.

In [12]:
address = 'Toronto, Ontario'

geolocator = Nominatim(user_agent = "toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


Use the latitude and longitude values to generate a map of Toronto with folium, and add a marker for each neighborhood.

In [13]:
city_map = folium.Map(location=[latitude, longitude], zoom_start=10)

for lat, lng, borough, neighborhood in zip(df['Latitude'], df['Longitude'], df['Borough'], df['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(city_map)  
    
city_map

## 3. <a id="item3">Explore Neighborhoods in Toronto</a>

Next, we are going to start utilizing the Foursquare API to explore and cluster the neighborhoods in the city of Toronto. We will mostly replicate the analysis for the New York City dataset from the course lab.

In [14]:
# Define Foursquare credentials and version
with open('foursquare.json') as f:
    foursquare = json.load(f)
    CLIENT_ID = foursquare['CLIENT_ID']
    CLIENT_SECRET = foursquare['CLIENT_SECRET']

VERSION = '20180605'  # Foursquare API version
LIMIT = 100  # A default Foursquare API limit value

Let's create a function to repeatedly get nearby venues for each neighborhood.


In [15]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        #print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                             'Neighborhood Latitude', 
                             'Neighborhood Longitude', 
                             'Venue', 
                             'Venue Latitude', 
                             'Venue Longitude', 
                             'Venue Category']
    
    return(nearby_venues)
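
The extraction step inside `getNearbyVenues` can be tried without calling the API. The sketch below uses a hypothetical response whose structure is inferred only from the keys the function reads (`response`, `groups`, `items`, `venue`, `location`, `categories`); it is not taken from the Foursquare documentation. The venue values are copied from the first two rows of the results shown later in this notebook, and `extract_venues` is an illustrative helper name.

```python
# Hypothetical payload, shaped like the fields getNearbyVenues accesses
mock_json = {
    'response': {
        'groups': [{
            'items': [
                {'venue': {'name': 'Brookbanks Park',
                           'location': {'lat': 43.751976, 'lng': -79.332140},
                           'categories': [{'name': 'Park'}]}},
                {'venue': {'name': 'Variety Store',
                           'location': {'lat': 43.751974, 'lng': -79.333114},
                           'categories': [{'name': 'Food & Drink Shop'}]}},
            ]
        }]
    }
}


def extract_venues(name, lat, lng, payload):
    """Flatten one payload into (neighborhood, coords, venue, coords, category) tuples."""
    items = payload['response']['groups'][0]['items']
    return [(name, lat, lng,
             v['venue']['name'],
             v['venue']['location']['lat'],
             v['venue']['location']['lng'],
             v['venue']['categories'][0]['name']) for v in items]


rows = extract_venues('Parkwoods', 43.753259, -79.329656, mock_json)
for row in rows:
    print(row)
```

Separating the parsing from the HTTP request in this way also makes it easier to add checks, for example for a venue with an empty `categories` list, which the indexing `[0]` would not handle.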

Now run the above function on each neighborhood and create a new dataframe for the venues.


In [16]:
venues = getNearbyVenues(names = df['Neighborhood'],
                         latitudes = df['Latitude'],
                         longitudes = df['Longitude']
                        )

venues

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.332140,Park
1,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
2,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena
3,Victoria Village,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant
4,Victoria Village,43.725882,-79.315572,Tim Hortons,43.725517,-79.313103,Coffee Shop
...,...,...,...,...,...,...,...
2101,"Mimico NW, The Queensway West, South of Bloor,...",43.628841,-79.520999,Jim & Maria's No Frills,43.631152,-79.518617,Grocery Store
2102,"Mimico NW, The Queensway West, South of Bloor,...",43.628841,-79.520999,McDonald's,43.630007,-79.518041,Fast Food Restaurant
2103,"Mimico NW, The Queensway West, South of Bloor,...",43.628841,-79.520999,Koala Tan Tanning Salon & Sunless Spa,43.631370,-79.519006,Tanning Salon
2104,"Mimico NW, The Queensway West, South of Bloor,...",43.628841,-79.520999,Once Upon A Child,43.631075,-79.518290,Kids Store


Let's check how many venues were returned for each neighborhood.


In [17]:
venues.groupby('Neighborhood')[['Venue']].count().transpose()

Neighborhood,Agincourt,"Alderwood, Long Branch","Bathurst Manor, Wilson Heights, Downsview North",Bayview Village,"Bedford Park, Lawrence Manor East",Berczy Park,"Birch Cliff, Cliffside West","Brockton, Parkdale Village, Exhibition Place","CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport",Caledonia-Fairbanks,...,"University of Toronto, Harbord",Victoria Village,Westmount,Weston,"Wexford, Maryvale",Willowdale South,Willowdale West,Woburn,Woodbine Heights,York Mills West
Venue,4,7,23,4,25,58,5,23,13,4,...,32,4,8,1,7,35,6,3,7,3


Let's find out how many unique categories can be curated from all the returned venues.

In [18]:
print('There are {} unique categories.'.format(len(venues['Venue Category'].unique())))

There are 270 unique categories.


In [33]:
# Save it in a file
venues.to_csv('Toronto venues.csv', index = False)

### Analyze Each Neighborhood, Find Most Popular Venues


In [19]:
# one hot encoding
venues_onehot = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe at the first position (but call it differently)
print('NOTE There is a Venue Category called Neighborhood as well: ', 'Neighborhood' in list(venues['Venue Category']))
venues_onehot.insert(0, 'neighborhood', venues['Neighborhood'])

venues_onehot.head()

NOTE There is a Venue Category called Neighborhood as well:  True


Unnamed: 0,neighborhood,Accessories Store,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,...,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Victoria Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Victoria Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Victoria Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
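
A toy example may make the naming issue clearer. In the sketch below (made-up data, illustrative variable names), one venue category is itself called `Neighborhood`, so the one-hot table already has a column with that name and the neighborhood labels have to go into a differently named column.

```python
import pandas as pd

toy = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'B'],
    'Venue Category': ['Park', 'Neighborhood', 'Park'],
})

# One indicator column per category; dtype=int gives 0/1 instead of booleans
onehot = pd.get_dummies(toy[['Venue Category']], prefix='', prefix_sep='', dtype=int)

# The category 'Neighborhood' already occupies that column name, so the
# neighborhood labels are stored in a lowercase 'neighborhood' column.
onehot.insert(0, 'neighborhood', toy['Neighborhood'])
print(onehot)
```

Trying to insert a second column named `Neighborhood` would raise a `ValueError`, since `DataFrame.insert` refuses duplicate column names by default.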


Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category.


In [20]:
venues_grouped = venues_onehot.groupby('neighborhood').mean().reset_index()
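
Because each venue row contains a single 1, the mean of a category column within a neighborhood is the share of that neighborhood's venues in the category. A minimal sketch with made-up data:

```python
import pandas as pd

onehot = pd.DataFrame({
    'neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Coffee Shop':  [1, 1, 1, 0, 0, 0],
    'Park':         [0, 0, 0, 1, 1, 1],
})

# Mean of the 0/1 indicators = fraction of venues per category
grouped = onehot.groupby('neighborhood').mean().reset_index()
print(grouped)
```

Neighborhood A has three coffee shops out of four venues, so its `Coffee Shop` value is 0.75, and the values in each row sum to 1.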

Let's write a function that sorts the venue categories of a neighborhood by frequency in descending order and returns the most common ones.

In [21]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
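
As a quick check, the function can be applied to a single hand-made row. The data below is made up and deliberately contains no ties.

```python
import pandas as pd


def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]  # skip the neighborhood name in position 0
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


row = pd.Series({'neighborhood': 'A', 'Bar': 0.2, 'Coffee Shop': 0.5, 'Park': 0.3})
top = return_most_common_venues(row, 2)
print(list(top))  # ['Coffee Shop', 'Park']
```

In the real data most categories have a frequency of 0 for any given neighborhood. When a neighborhood has fewer distinct categories than `num_top_venues`, the remaining slots are filled with zero-frequency categories in whatever order the sort leaves them, so the lower-ranked entries should be read with caution.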

Now let's create the new dataframe and display the top 10 venues for each neighborhood.


In [22]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # no 'st'/'nd'/'rd' suffix beyond the third position
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['neighborhood'] = venues_grouped['neighborhood']

for ind in np.arange(venues_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(venues_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Breakfast Spot,Lounge,Latin American Restaurant,Skating Rink,Mexican Restaurant,Middle Eastern Restaurant,Metro Station,Miscellaneous Shop,Museum,Mobile Phone Shop
1,"Alderwood, Long Branch",Pizza Place,Gym,Sandwich Place,Coffee Shop,Pub,Pharmacy,Other Great Outdoors,Pet Store,Men's Store,Metro Station
2,"Bathurst Manor, Wilson Heights, Downsview North",Coffee Shop,Bank,Gift Shop,Shopping Mall,Bridal Shop,Mobile Phone Shop,Fried Chicken Joint,Frozen Yogurt Shop,Sandwich Place,Supermarket
3,Bayview Village,Bank,Chinese Restaurant,Japanese Restaurant,Café,Nail Salon,Music Venue,Museum,Movie Theater,Motel,Accessories Store
4,"Bedford Park, Lawrence Manor East",Coffee Shop,Sandwich Place,Italian Restaurant,Pizza Place,Greek Restaurant,Sushi Restaurant,Restaurant,Juice Bar,Thai Restaurant,Pub


### Cluster Neighborhoods

Run _k_-means to group the neighborhoods into 5 clusters.

In [23]:
# set number of clusters
kclusters = 5

venues_grouped_clustering = venues_grouped.drop(columns='neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(venues_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 3])
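
The number of clusters is fixed at 5 here. One common way to compare alternatives, not used in this notebook, is the silhouette score. The sketch below is only an illustration under stated assumptions: it runs on synthetic two-dimensional data with three well-separated groups rather than on the venue frequencies, and the range of candidate values is arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data: three well-separated blobs of 30 points each
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=1.0, size=(30, 2)) for c in centers])

# Fit k-means for several k and record the mean silhouette score of each fit
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(best_k)
```

Applied to `venues_grouped_clustering`, the same loop would give one way to judge whether 5 is a reasonable choice, although sparse frequency data may not produce a clear maximum.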

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.


In [24]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

# join neighborhoods_venues_sorted onto the original df to add latitude/longitude for each neighborhood;
# neighborhoods without any returned venues get NaN and are dropped
venues_merged = df.join(neighborhoods_venues_sorted.set_index('neighborhood'), on='Neighborhood').dropna(axis=0)

venues_merged.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.753259,-79.329656,4.0,Food & Drink Shop,Park,Accessories Store,Mobile Phone Shop,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Modern European Restaurant,Middle Eastern Restaurant
1,M4A,North York,Victoria Village,43.725882,-79.315572,1.0,Pizza Place,Hockey Arena,Portuguese Restaurant,Coffee Shop,Molecular Gastronomy Restaurant,Museum,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1.0,Coffee Shop,Bakery,Park,Café,Breakfast Spot,Pub,Restaurant,Theater,Yoga Studio,Chocolate Shop
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,1.0,Clothing Store,Furniture / Home Store,Accessories Store,Boutique,Vietnamese Restaurant,Miscellaneous Shop,Coffee Shop,Monument / Landmark,Movie Theater,Motel
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494,1.0,Coffee Shop,Sushi Restaurant,Yoga Studio,Theater,Smoothie Shop,Burrito Place,Café,Mexican Restaurant,Fried Chicken Joint,Sandwich Place
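
The cluster labels appear as floats (`4.0`, `1.0`) because the join introduces missing values for neighborhoods that have no entry in the venue table, and an integer column with missing values is converted to float. A minimal sketch with made-up data shows the effect and one way to restore integer labels after dropping the incomplete rows:

```python
import pandas as pd

left = pd.DataFrame({'Neighborhood': ['A', 'B', 'C']})
right = pd.DataFrame({'neighborhood': ['A', 'C'], 'Cluster Labels': [1, 0]})

# 'B' has no match on the right, so its label becomes NaN
merged = left.join(right.set_index('neighborhood'), on='Neighborhood')
print(merged['Cluster Labels'].dtype)  # float64

# Drop unmatched rows, then convert the labels back to integers
cleaned = merged.dropna(axis=0).astype({'Cluster Labels': int})
print(cleaned)
```

With integer labels, later code that indexes a color list by cluster no longer needs the `int(...)` conversion.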


Finally, let's visualize the resulting clusters.


In [25]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in np.array(venues_merged[['Latitude', 'Longitude', 'Neighborhood', 'Cluster Labels']]):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examine Clusters

Now we can examine each cluster and determine the venue categories that distinguish it from the others. Based on these defining categories, we can then assign a name to each cluster.
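
Besides reading the tables row by row, the clusters can be summarized numerically. The sketch below assumes a table of per-neighborhood category frequencies with a cluster label attached; the data and the variable names are made up for illustration.

```python
import pandas as pd

freq = pd.DataFrame({
    'Cluster Labels': [0, 0, 1, 1, 1],
    'Coffee Shop':    [0.5, 0.4, 0.0, 0.1, 0.0],
    'Park':           [0.1, 0.0, 0.8, 0.6, 1.0],
    'Pizza Place':    [0.4, 0.6, 0.2, 0.3, 0.0],
})

# Average category frequency within each cluster
profile = freq.groupby('Cluster Labels').mean()

# Category with the highest average frequency per cluster
top_category = profile.idxmax(axis=1)

# Number of neighborhoods in each cluster
sizes = freq['Cluster Labels'].value_counts().sort_index()

print(profile)
print(top_category)
print(sizes)
```

Cluster sizes are worth checking first: a cluster containing only one or two neighborhoods may reflect sparse venue data rather than a distinct type of neighborhood.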

#### Cluster 1


In [26]:
venues_merged.loc[venues_merged['Cluster Labels'] == 0, venues_merged.columns[[1] + list(range(5, venues_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Scarborough,0.0,Fast Food Restaurant,Accessories Store,Music Venue,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant
27,North York,0.0,Dog Run,Pool,Golf Course,Fast Food Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Music Venue


#### Cluster 2


In [27]:
venues_merged.loc[venues_merged['Cluster Labels'] == 1, venues_merged.columns[[1] + list(range(5, venues_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,North York,1.0,Pizza Place,Hockey Arena,Portuguese Restaurant,Coffee Shop,Molecular Gastronomy Restaurant,Museum,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark
2,Downtown Toronto,1.0,Coffee Shop,Bakery,Park,Café,Breakfast Spot,Pub,Restaurant,Theater,Yoga Studio,Chocolate Shop
3,North York,1.0,Clothing Store,Furniture / Home Store,Accessories Store,Boutique,Vietnamese Restaurant,Miscellaneous Shop,Coffee Shop,Monument / Landmark,Movie Theater,Motel
4,Queen's Park,1.0,Coffee Shop,Sushi Restaurant,Yoga Studio,Theater,Smoothie Shop,Burrito Place,Café,Mexican Restaurant,Fried Chicken Joint,Sandwich Place
7,North York,1.0,Construction & Landscaping,Japanese Restaurant,Caribbean Restaurant,Café,Gym,Moroccan Restaurant,Museum,Movie Theater,Motel,Monument / Landmark
...,...,...,...,...,...,...,...,...,...,...,...,...
97,Downtown Toronto,1.0,Coffee Shop,Café,Hotel,Japanese Restaurant,Gym,Restaurant,Salad Place,American Restaurant,Seafood Restaurant,Steakhouse
99,Downtown Toronto,1.0,Sushi Restaurant,Coffee Shop,Japanese Restaurant,Gay Bar,Restaurant,Yoga Studio,Pub,Men's Store,Grocery Store,Mediterranean Restaurant
100,East Toronto,1.0,Light Rail Station,Yoga Studio,Garden Center,Park,Comic Shop,Recording Studio,Restaurant,Farmers Market,Fast Food Restaurant,Skate Park
101,Etobicoke,1.0,Baseball Field,Locksmith,Accessories Store,Modern European Restaurant,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Mobile Phone Shop


#### Cluster 3


In [28]:
venues_merged.loc[venues_merged['Cluster Labels'] == 2, venues_merged.columns[[1] + list(range(5, venues_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
62,Central Toronto,2.0,Garden,Accessories Store,Modern European Restaurant,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Mobile Phone Shop,Massage Studio


#### Cluster 4

In [29]:
venues_merged.loc[venues_merged['Cluster Labels'] == 3, venues_merged.columns[[1] + list(range(5, venues_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,York,3.0,Park,Women's Store,Pool,Accessories Store,Mobile Phone Shop,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Modern European Restaurant
64,York,3.0,Park,Accessories Store,Mobile Phone Shop,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Modern European Restaurant,Miscellaneous Shop
66,North York,3.0,Park,Convenience Store,Modern European Restaurant,Museum,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Mobile Phone Shop
83,Central Toronto,3.0,Park,Accessories Store,Mobile Phone Shop,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Modern European Restaurant,Miscellaneous Shop


#### Cluster 5


In [30]:
venues_merged.loc[venues_merged['Cluster Labels'] == 4, venues_merged.columns[[1] + list(range(5, venues_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,North York,4.0,Food & Drink Shop,Park,Accessories Store,Mobile Phone Shop,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Modern European Restaurant,Middle Eastern Restaurant
10,North York,4.0,Bakery,Pizza Place,Japanese Restaurant,Park,Molecular Gastronomy Restaurant,Museum,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark
32,Scarborough,4.0,Playground,Jewelry Store,Modern European Restaurant,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Mobile Phone Shop,Massage Studio
35,East York/East Toronto,4.0,Convenience Store,Park,Modern European Restaurant,Museum,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Mobile Phone Shop
40,North York,4.0,Construction & Landscaping,Park,Other Repair Shop,Airport,Wine Bar,Museum,Medical Center,Mediterranean Restaurant,Men's Store,Metro Station
46,North York,4.0,Shopping Mall,Park,Grocery Store,Bank,Accessories Store,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant
49,North York,4.0,Construction & Landscaping,Basketball Court,Park,Bakery,Monument / Landmark,Music Venue,Museum,Movie Theater,Motel,Moroccan Restaurant
61,Central Toronto,4.0,Bus Line,Park,Swim School,Business Service,Modern European Restaurant,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Accessories Store
77,Etobicoke,4.0,Sandwich Place,Mobile Phone Shop,Park,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Modern European Restaurant,Accessories Store
85,Scarborough,4.0,Park,Playground,Intersection,Modern European Restaurant,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Mobile Phone Shop
