# Segmenting and Clustering Neighborhoods in the City of Toronto, Canada

#### In this assignment, you will explore, segment, and cluster the neighborhoods in the city of Toronto. Unlike the New York data, the Toronto neighborhood data is not readily available on the internet. Each data science project is challenging in its own way, so you need to be agile and able to learn new libraries and tools quickly, depending on the project.

### All three parts are included in this notebook.

## Requirement 1: Create a dataframe by web scraping the Wikipedia page https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M

In [1]:
!pip install folium

Collecting folium
  Downloading https://files.pythonhosted.org/packages/fd/a0/ccb3094026649cda4acd55bf2c3822bb8c277eb11446d13d384e5be35257/folium-0.10.1-py2.py3-none-any.whl (91kB)
Collecting branca>=0.3.0 (from folium)
  Downloading https://files.pythonhosted.org/packages/81/6d/31c83485189a2521a75b4130f1fee5364f772a0375f81afff619004e5237/branca-0.4.0-py3-none-any.whl
Installing collected packages: branca, folium
Successfully installed branca-0.4.0 folium-0.10.1


In [2]:
import requests
import bs4
import numpy as np
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json 
from geopy.geocoders import Nominatim 
from pandas.io.json import json_normalize
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [3]:
web_link = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

In [4]:
try:
    response = requests.get(web_link)
    response.raise_for_status()
except requests.RequestException as exc:
    print('Error while downloading the webpage: %s' % exc)
    raise  # the cells below need soup_obj, so do not continue

soup_obj = bs4.BeautifulSoup(response.text, 'html.parser')

In [5]:
content = soup_obj.find('table', attrs={'class':'wikitable sortable'})
#content

#### Create the dataframe with the columns 'PostalCode', 'Borough', and 'Neighborhood', then filter out the rows whose borough is 'Not assigned'

In [6]:
column_name = ['PostalCode', 'Borough', 'Neighborhood']

# Collect one dict per table row, then build the dataframe once
# (DataFrame.append was removed in pandas 2.0 and is slow inside a loop)
rows = []
for row in content.find_all('tr'):
    line_item = row.find_all('td')
    if len(line_item) == 3:
        rows.append({'PostalCode': line_item[0].text.strip(),
                     'Borough': line_item[1].text.strip(),
                     'Neighborhood': line_item[2].text.strip()})

data_frame = pd.DataFrame(rows, columns=column_name)

# Drop the postal codes that have no borough assigned
df = data_frame[data_frame['Borough'] != 'Not assigned']
#df.head(12)

#### If a row has a borough but a 'Not assigned' neighborhood, the neighborhood is set to the borough name.

In [7]:
def set_neighborhood_as_borough(neighborhood):
    if neighborhood['Neighborhood'] == 'Not assigned':
        neighborhood['Neighborhood'] = neighborhood['Borough']
    return neighborhood
    
df = df.apply(set_neighborhood_as_borough, axis=1)
#df.head(12)
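
The same rule can also be written without a row-wise `apply`. The sketch below is one possible alternative, not the notebook's method: the three-row `sample` dataframe is made up for illustration, and `Series.mask` replaces the values where the condition is true.

```python
import pandas as pd

# Made-up sample rows for illustration only (not the scraped data)
sample = pd.DataFrame({
    'PostalCode': ['M7A', 'M3A', 'M9A'],
    'Borough': ['Downtown Toronto', 'North York', 'Etobicoke'],
    'Neighborhood': ['Not assigned', 'Parkwoods', 'Not assigned'],
})

# Where the neighborhood is 'Not assigned', take the borough name instead
missing = sample['Neighborhood'] == 'Not assigned'
sample['Neighborhood'] = sample['Neighborhood'].mask(missing, sample['Borough'])

print(sample)
```

A column-wise operation like this usually scales better than `apply(..., axis=1)`, although for about a hundred rows the difference is negligible.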

#### More than one neighborhood can exist in one postal code area. Those rows are combined into one row, with the neighborhoods separated by commas.

In [8]:
df = df.groupby(['PostalCode', 'Borough'], sort=False)['Neighborhood'].apply(', '.join).reset_index()
df.head(12)

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M6A,North York,"Lawrence Heights, Lawrence Manor"
4,M7A,Downtown Toronto,Queen's Park
5,M9A,Etobicoke,Islington Avenue
6,M1B,Scarborough,"Rouge, Malvern"
7,M3B,North York,Don Mills North
8,M4B,East York,"Woodbine Gardens, Parkview Hill"
9,M5B,Downtown Toronto,"Ryerson, Garden District"
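
To see what the `groupby(...).apply(', '.join)` step does in isolation, here is a small sketch on made-up input rows. The `rows` dataframe is an assumption for illustration; it only mimics a postal code that appears twice before the rows are combined.

```python
import pandas as pd

# Made-up rows: one postal code listed twice with different neighborhoods
rows = pd.DataFrame({
    'PostalCode': ['M6A', 'M6A', 'M7A'],
    'Borough': ['North York', 'North York', 'Downtown Toronto'],
    'Neighborhood': ['Lawrence Heights', 'Lawrence Manor', "Queen's Park"],
})

# sort=False keeps the groups in order of first appearance
combined = (rows.groupby(['PostalCode', 'Borough'], sort=False)['Neighborhood']
                .apply(', '.join)
                .reset_index())
print(combined)
```

Each group's neighborhood strings are passed to `', '.join`, so a postal code with a single neighborhood is left unchanged.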


In [9]:
print('The dataframe has exactly {} rows.'.format(df.shape[0]))

The dataframe has exactly 103 rows.


## Requirement 2: Get the latitude and longitude coordinates of each neighborhood.

#### Reading the csv file and getting the coordinates of the postal codes

In [10]:
lat_lon_file = pd.read_csv('http://cocl.us/Geospatial_data')
lat_lon_file.columns = ['PostalCode', 'Latitude', 'Longitude']
lat_lon_file.head(12)

Unnamed: 0,PostalCode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
5,M1J,43.744734,-79.239476
6,M1K,43.727929,-79.262029
7,M1L,43.711112,-79.284577
8,M1M,43.716316,-79.239476
9,M1N,43.692657,-79.264848


#### Merging the coordinates with the neighborhood dataframe created in Requirement 1 to create the final dataframe

In [11]:
df_toronto = pd.merge(df, lat_lon_file, on='PostalCode')
df_toronto.head(12)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.718518,-79.464763
4,M7A,Downtown Toronto,Queen's Park,43.662301,-79.389494
5,M9A,Etobicoke,Islington Avenue,43.667856,-79.532242
6,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
7,M3B,North York,Don Mills North,43.745906,-79.352188
8,M4B,East York,"Woodbine Gardens, Parkview Hill",43.706397,-79.309937
9,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937
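
`pd.merge` performs an inner join by default, so a postal code missing from the coordinates file would be dropped without warning. The sketch below shows one way to check for that, using made-up miniature dataframes (`codes`, `coords`, and the postal code `M0X` are hypothetical).

```python
import pandas as pd

# Made-up miniature versions of the two dataframes
codes = pd.DataFrame({'PostalCode': ['M3A', 'M4A', 'M0X'],
                      'Borough': ['North York', 'North York', 'Hypothetical']})
coords = pd.DataFrame({'PostalCode': ['M3A', 'M4A'],
                       'Latitude': [43.753259, 43.725882],
                       'Longitude': [-79.329656, -79.315572]})

# how='left' keeps every postal code; indicator=True adds a '_merge' column
# telling whether a row was found in both frames or only in the left one
checked = pd.merge(codes, coords, on='PostalCode', how='left', indicator=True)
unmatched = checked[checked['_merge'] == 'left_only']
print(unmatched[['PostalCode', 'Borough']])
```

If `unmatched` is empty, the inner join used in the notebook loses nothing.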


## Requirement 3: Replicate the NYC analysis and generate a map of the neighborhoods and clusters in Toronto.

Keep only the boroughs whose names contain 'Toronto'.

In [12]:
toronto_borough = df_toronto[df_toronto['Borough'].str.contains('Toronto')].reset_index(drop=True)
toronto_borough.head(12)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636
1,M7A,Downtown Toronto,Queen's Park,43.662301,-79.389494
2,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
4,M4E,East Toronto,The Beaches,43.676357,-79.293031
5,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
6,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383
7,M6G,Downtown Toronto,Christie,43.669542,-79.422564
8,M5H,Downtown Toronto,"Adelaide, King, Richmond",43.650571,-79.384568
9,M6H,West Toronto,"Dovercourt Village, Dufferin",43.669005,-79.442259


In [13]:
print('{} postal codes are in boroughs whose names contain "Toronto".'.format(toronto_borough.shape[0]))

39 postal codes are in boroughs whose names contain "Toronto".


### Explore the neighborhoods in Toronto using Foursquare

In [69]:
# The code was removed by Watson Studio for sharing.
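
The hidden cell presumably defines `CLIENT_ID`, `CLIENT_SECRET`, and `VERSION`, since the function below uses them; its actual contents are not shown. One way to keep such credentials out of a shared notebook is to read them from environment variables. This is a minimal sketch under that assumption: the variable names and the helper `load_foursquare_credentials` are hypothetical, and the default version string is only a placeholder.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the API credentials from environment variables (hypothetical names)."""
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first.')
    # The version is a date string in YYYYMMDD form; this default is a placeholder
    version = env.get('FOURSQUARE_VERSION', '20180605')
    return client_id, client_secret, version

# Example with a stand-in mapping instead of the real environment
CLIENT_ID, CLIENT_SECRET, VERSION = load_foursquare_credentials(
    {'FOURSQUARE_CLIENT_ID': 'my-id', 'FOURSQUARE_CLIENT_SECRET': 'my-secret'})
```

Calling the helper without arguments reads from the real process environment.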

In [15]:
radius = 500
limit = 100

In [16]:
def getNearbyVenuesbyVenues(names, latitudes, longitudes, radius=500, limit=100):
    """Return a dataframe with the Foursquare venues found around each neighborhood."""

    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):

        # API URL (CLIENT_ID, CLIENT_SECRET and VERSION must be defined beforehand)
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            limit)

        # GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']

        # keep one tuple per venue, tagged with the neighborhood it was found near
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-neighborhood lists into a single dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                  'Neighborhood Latitude',
                  'Neighborhood Longitude',
                  'Venue',
                  'Venue Latitude',
                  'Venue Longitude',
                  'Venue Category']

    return nearby_venues
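
The parsing step inside the function can be tried without calling the API. The sketch below is an illustration only: `extract_venue_rows` is a hypothetical helper, and `fake_payload` is made up to mimic just the keys that the function above reads. It also shows one way to handle a venue with an empty category list, which would otherwise raise an `IndexError`.

```python
def extract_venue_rows(name, lat, lng, payload):
    """Flatten one explore response into (neighborhood, ..., category) tuples."""
    items = payload['response']['groups'][0]['items']
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None instead of raising IndexError when no category is listed
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows

# Made-up payload that only mimics the keys read by the notebook's function
fake_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Example Cafe',
               'location': {'lat': 43.6535, 'lng': -79.3620},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Uncategorized Place',
               'location': {'lat': 43.6540, 'lng': -79.3610},
               'categories': []}},
]}]}}

rows = extract_venue_rows('Harbourfront', 43.65426, -79.360636, fake_payload)
for row in rows:
    print(row)
```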

In [17]:
toronto_venues = getNearbyVenuesbyVenues(names=toronto_borough['Neighborhood'],
                                         latitudes=toronto_borough['Latitude'],
                                         longitudes=toronto_borough['Longitude'])

In [18]:
print(toronto_venues.shape)
toronto_venues.head(12)

(1715, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Harbourfront,43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,Harbourfront,43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,Harbourfront,43.65426,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center
3,Harbourfront,43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
4,Harbourfront,43.65426,-79.360636,Morning Glory Cafe,43.653947,-79.361149,Breakfast Spot
5,Harbourfront,43.65426,-79.360636,Impact Kitchen,43.656369,-79.35698,Restaurant
6,Harbourfront,43.65426,-79.360636,Corktown Common,43.655618,-79.356211,Park
7,Harbourfront,43.65426,-79.360636,Figs Breakfast & Lunch,43.655675,-79.364503,Breakfast Spot
8,Harbourfront,43.65426,-79.360636,The Distillery Historic District,43.650244,-79.359323,Historic Site
9,Harbourfront,43.65426,-79.360636,Dominion Pub and Kitchen,43.656919,-79.358967,Pub


In [19]:
toronto_venues[['Neighborhood','Venue']].groupby('Neighborhood').count()

Neighborhood,Venue
"Adelaide, King, Richmond",100
Berczy Park,56
"Brockton, Exhibition Place, Parkdale Village",23
Business Reply Mail Processing Centre 969 Eastern,16
"CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara",17
"Cabbagetown, St. James Town",45
Central Bay Street,80
"Chinatown, Grange Park, Kensington Market",89
Christie,18
Church and Wellesley,87


In [20]:
neighborhood_venue_category = toronto_venues[['Neighborhood','Venue Category','Venue']].groupby(['Neighborhood','Venue Category']).count()
top_neighborhood_venue_category = neighborhood_venue_category[neighborhood_venue_category['Venue'] > 4]
top_neighborhood_venue_category                                                              

Neighborhood,Venue Category,Venue
"Adelaide, King, Richmond",Coffee Shop,7
"Adelaide, King, Richmond",Restaurant,5
Berczy Park,Coffee Shop,5
Central Bay Street,Coffee Shop,13
"Chinatown, Grange Park, Kensington Market",Bar,6
"Chinatown, Grange Park, Kensington Market",Café,5
"Chinatown, Grange Park, Kensington Market",Vietnamese Restaurant,5
Church and Wellesley,Coffee Shop,7
Church and Wellesley,Japanese Restaurant,6
"Commerce Court, Victoria Hotel",Café,7


### Analyze neighborhoods

In [32]:
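# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")
# 'Neighborhood' is also a venue category; rename that column (to the deliberately
# misspelled 'Neighborhod') so it does not clash with the neighborhood name column added next
toronto_onehot = toronto_onehot.rename(columns={"Neighborhood": "Neighborhod"})
toronto_onehot.insert(0, 'Neighborhood', toronto_venues['Neighborhood'])
# group by neighborhood; the mean of each 0/1 column is that category's share of the venues
toronto_grouped = toronto_onehot.groupby("Neighborhood").mean().reset_index()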
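
The following sketch shows on made-up data why the mean of one-hot columns gives the frequency of each venue category per neighborhood. The `venues` dataframe and the neighborhood names `A` and `B` are assumptions for illustration.

```python
import pandas as pd

# Made-up venues for two neighborhoods
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Park', 'Bakery', 'Park', 'Park'],
})

# One column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(venues['Venue Category'], dtype=float)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# The mean of a 0/1 column is the fraction of venues in that category
freq = onehot.groupby('Neighborhood').mean()
print(freq)
```

Neighborhood `A` has two coffee shops out of four venues, so its `Coffee Shop` frequency is 0.5, and every row sums to 1.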

#### Print each neighborhood along with its top 5 most common venues

In [34]:
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide, King, Richmond----
             venue  freq
0      Coffee Shop  0.07
1       Restaurant  0.05
2  Thai Restaurant  0.04
3             Café  0.04
4       Steakhouse  0.03


----Berczy Park----
            venue  freq
0     Coffee Shop  0.09
1    Cocktail Bar  0.05
2     Cheese Shop  0.04
3  Farmers Market  0.04
4          Bakery  0.04


----Brockton, Exhibition Place, Parkdale Village----
                 venue  freq
0                 Café  0.13
1          Coffee Shop  0.09
2       Breakfast Spot  0.09
3        Grocery Store  0.04
4  Japanese Restaurant  0.04


----Business Reply Mail Processing Centre 969 Eastern----
                venue  freq
0         Yoga Studio  0.06
1                 Spa  0.06
2       Garden Center  0.06
3              Garden  0.06
4  Light Rail Station  0.06


----CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara----
              venue  freq
0   Airport Service  0.18
1    Airport Lounge  0.12

Let's write a function to sort the venues in descending order of frequency.

In [35]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [51]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the 3rd use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Restaurant,Thai Restaurant,Café,Bar,Steakhouse,Sushi Restaurant,Hotel,Bakery,Gastropub
1,Berczy Park,Coffee Shop,Cocktail Bar,Café,Farmers Market,Bakery,Restaurant,Beer Bar,Cheese Shop,Seafood Restaurant,Beach
2,"Brockton, Exhibition Place, Parkdale Village",Café,Breakfast Spot,Coffee Shop,Climbing Gym,Burrito Place,Japanese Restaurant,Italian Restaurant,Restaurant,Stadium,Intersection
3,Business Reply Mail Processing Centre 969 Eastern,Yoga Studio,Auto Workshop,Garden Center,Garden,Fast Food Restaurant,Farmers Market,Light Rail Station,Comic Shop,Pizza Place,Restaurant
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",Airport Service,Airport Lounge,Airport Terminal,Harbor / Marina,Airport,Airport Food Court,Airport Gate,Bar,Boutique,Rental Car Location
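
The `try`/`except` approach for the column names works for ten venues, but it would produce '21th' if `num_top_venues` were raised above 20. A small helper can generate the suffix for any position. This is a sketch of an alternative, and `ordinal` is a hypothetical name, not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # 11, 12 and 13 (and 111, 112, ...) take 'th' despite ending in 1, 2, 3
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i))
                              for i in range(1, 11)]
print(columns)
```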


### Clustering the neighborhoods

In [52]:
# number of clusters
n = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# kmeans clustering
kmeans = KMeans(n_clusters=n, random_state=0).fit(toronto_grouped_clustering)
kmeans.labels_[:12] 

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)
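
The first twelve labels are all 0, which suggests that one cluster dominates. Counting how many neighborhoods fall into each cluster is a quick way to judge whether the chosen number of clusters is useful. The sketch below uses a made-up `labels` array as a stand-in for `kmeans.labels_`; the values are not the notebook's results.

```python
import numpy as np

# Made-up label array standing in for kmeans.labels_
labels = np.array([0, 0, 0, 4, 0, 1, 0, 3, 0, 2, 3, 0])

# bincount gives the number of neighborhoods assigned to each cluster id
sizes = np.bincount(labels, minlength=5)
for cluster_id, size in enumerate(sizes):
    print('cluster {}: {} neighborhoods'.format(cluster_id, size))
```

In the notebook the same check would be `np.bincount(kmeans.labels_, minlength=n)`.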

In [53]:
neighbors = toronto_borough[['Neighborhood','Latitude','Longitude']]
neighbors.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Harbourfront,43.65426,-79.360636
1,Queen's Park,43.662301,-79.389494
2,"Ryerson, Garden District",43.657162,-79.378937
3,St. James Town,43.651494,-79.375418
4,The Beaches,43.676357,-79.293031


In [54]:
neighborhoods = toronto_grouped[['Neighborhood']].copy()  # copy so that insert() does not act on a slice
neighborhoods.insert(0, 'Cluster Labels', kmeans.labels_)
neighborhoods.head()

Unnamed: 0,Cluster Labels,Neighborhood
0,0,"Adelaide, King, Richmond"
1,0,Berczy Park
2,0,"Brockton, Exhibition Place, Parkdale Village"
3,0,Business Reply Mail Processing Centre 969 Eastern
4,0,"CN Tower, Bathurst Quay, Island airport, Harbo..."


In [55]:
locations = toronto_borough[['Neighborhood','Latitude','Longitude']]
locations = locations.join(neighborhoods.set_index('Neighborhood'),on='Neighborhood')
locations.head()

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels
0,Harbourfront,43.65426,-79.360636,0
1,Queen's Park,43.662301,-79.389494,0
2,"Ryerson, Garden District",43.657162,-79.378937,0
3,St. James Town,43.651494,-79.375418,0
4,The Beaches,43.676357,-79.293031,4


Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [58]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster_Labels', kmeans.labels_)

toronto_merged = df_toronto

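# join the cluster labels and top venues onto df_toronto, which already holds the latitude/longitude
# of each neighborhood; rows outside the Toronto boroughs have no match and are filled with NaN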
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head() 

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.753259,-79.329656,,,,,,,,,,,
1,M4A,North York,Victoria Village,43.725882,-79.315572,,,,,,,,,,,
2,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636,0.0,Coffee Shop,Park,Bakery,Pub,Café,Theater,Mexican Restaurant,Breakfast Spot,Restaurant,Shoe Store
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.718518,-79.464763,,,,,,,,,,,
4,M7A,Downtown Toronto,Queen's Park,43.662301,-79.389494,0.0,Coffee Shop,Park,Yoga Studio,Distribution Center,Beer Bar,Seafood Restaurant,Japanese Restaurant,Sandwich Place,Juice Bar,Restaurant
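
The `NaN` rows and the `0.0` labels above come from the join itself: neighborhoods that were not clustered have no match, and a column containing `NaN` cannot keep an integer dtype. The sketch below reproduces this on made-up miniature dataframes (`all_codes` and `labelled` are hypothetical) and shows one way to clean it up.

```python
import pandas as pd

# Made-up miniature frames: every postal code vs. labels for Toronto boroughs only
all_codes = pd.DataFrame({'Neighborhood': ['Parkwoods', 'Harbourfront', "Queen's Park"]})
labelled = pd.DataFrame({'Neighborhood': ['Harbourfront', "Queen's Park"],
                         'Cluster_Labels': [0, 0]})

joined = all_codes.join(labelled.set_index('Neighborhood'), on='Neighborhood')
print(joined['Cluster_Labels'].dtype)   # float64, because NaN cannot be stored as int

# Drop the unlabelled rows, then restore the integer dtype
cleaned = joined.dropna(subset=['Cluster_Labels']).copy()
cleaned['Cluster_Labels'] = cleaned['Cluster_Labels'].astype(int)
print(cleaned)
```

Joining onto `toronto_borough` instead of `df_toronto` would avoid the unmatched rows in the first place, at the cost of changing the row index shown in the outputs.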


### Generate the map

In [59]:
# get location of Toronto
address = 'Toronto, CA'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


In [63]:
# drop the rows without a cluster label (postal codes outside the Toronto boroughs)
toronto_merged = toronto_merged.dropna().copy()
toronto_merged['Cluster_Labels'] = toronto_merged['Cluster_Labels'].astype(int)

# generate map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(n)
ys = [i + x + (i*x)**2 for i in range(n)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
# note: rainbow[cluster-1] shifts the palette by one, so cluster 0 uses the last color
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster_Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters
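
The marker colors are looked up with `rainbow[cluster-1]`, so the palette is shifted by one position and cluster 0 wraps around to the last entry through Python's negative indexing. The sketch below illustrates this with a hypothetical five-color `palette`; the hex values are placeholders, not the colors computed in the notebook.

```python
# Hypothetical five-color palette standing in for the `rainbow` list
palette = ['#8000ff', '#00b5eb', '#80ffb4', '#ffb360', '#ff0000']

for cluster in range(len(palette)):
    shifted = palette[cluster - 1]   # indexing used in the map cell
    direct = palette[cluster]        # unshifted alternative
    print(cluster, shifted, direct)
```

Either indexing gives every cluster a distinct color; the shift only matters when matching a color on the map back to a position in the palette.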

In [28]:
# show a static screenshot of the map, since the interactive folium map may not render when shared
from IPython.display import Image
Image(url="https://github.com/henryxji/learn/blob/master/segmenting%20and%20clustering%20neighborhoods%20in%20Toronto%20map.PNG?raw=true")

## Examine Clusters

### Cluster 1

In [64]:
toronto_merged.loc[toronto_merged['Cluster_Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Downtown Toronto,0,Coffee Shop,Park,Bakery,Pub,Café,Theater,Mexican Restaurant,Breakfast Spot,Restaurant,Shoe Store
4,Downtown Toronto,0,Coffee Shop,Park,Yoga Studio,Distribution Center,Beer Bar,Seafood Restaurant,Japanese Restaurant,Sandwich Place,Juice Bar,Restaurant
9,Downtown Toronto,0,Coffee Shop,Clothing Store,Café,Japanese Restaurant,Middle Eastern Restaurant,Bubble Tea Shop,Plaza,Cosmetics Shop,Bookstore,Diner
15,Downtown Toronto,0,Coffee Shop,Café,Restaurant,Hotel,Italian Restaurant,Cosmetics Shop,Clothing Store,Bakery,Breakfast Spot,Beer Bar
20,Downtown Toronto,0,Coffee Shop,Cocktail Bar,Café,Farmers Market,Bakery,Restaurant,Beer Bar,Cheese Shop,Seafood Restaurant,Beach
24,Downtown Toronto,0,Coffee Shop,Italian Restaurant,Sandwich Place,Burger Joint,Juice Bar,Ice Cream Shop,Japanese Restaurant,Middle Eastern Restaurant,Salad Place,Bubble Tea Shop
25,Downtown Toronto,0,Grocery Store,Café,Park,Italian Restaurant,Candy Store,Baby Store,Coffee Shop,Gas Station,Nightclub,Diner
30,Downtown Toronto,0,Coffee Shop,Restaurant,Thai Restaurant,Café,Bar,Steakhouse,Sushi Restaurant,Hotel,Bakery,Gastropub
31,West Toronto,0,Bakery,Pharmacy,Grocery Store,Gym / Fitness Center,Middle Eastern Restaurant,Music Venue,Pool,Portuguese Restaurant,Café,Brewery
36,Downtown Toronto,0,Coffee Shop,Aquarium,Café,Italian Restaurant,Hotel,Sporting Goods Shop,Scenic Lookout,Restaurant,Brewery,Fried Chicken Joint


### Cluster 2

In [65]:
toronto_merged.loc[toronto_merged['Cluster_Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
62,Central Toronto,1,Garden,Yoga Studio,Department Store,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run


### Cluster 3

In [66]:
toronto_merged.loc[toronto_merged['Cluster_Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
83,Central Toronto,2,Restaurant,Playground,Summer Camp,Department Store,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run


### Cluster 4

In [67]:
toronto_merged.loc[toronto_merged['Cluster_Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
61,Central Toronto,3,Park,Swim School,Bus Line,Ethiopian Restaurant,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant
91,Downtown Toronto,3,Park,Playground,Trail,Department Store,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant


### Cluster 5

In [68]:
toronto_merged.loc[toronto_merged['Cluster_Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,East Toronto,4,Trail,Pub,Neighborhod,Health Food Store,Yoga Studio,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run
