# <center>Segmenting and Clustering Neighborhoods in Toronto City</center>

## Introduction

In this lab, I use the geopy library to convert addresses into latitude and longitude values, and the Foursquare API to explore neighborhoods in Toronto. The <b>explore</b> endpoint returns the venues near each neighborhood; I use the most common venue categories as features and group the neighborhoods into clusters with the k-means algorithm. Finally, I use the Folium library to visualize the neighborhoods and the resulting clusters on a map.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Scrape Wikipedia Page for Toronto Neighborhoods</a><br>
<br>
2. <a href="#item2">Explore Neighborhoods in Toronto City</a><br>
<br>
3. <a href="#item3">Analyze Each Neighborhood</a><br>
<br>
4. <a href="#item4">Cluster Neighborhoods</a><br>
<br>
5. <a href="#item5">Examine Clusters</a>    <br>
</font>
</div>

In [482]:
from urllib.request import urlopen # open and read HTTP responses
from bs4 import BeautifulSoup # library for scraping web pages
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

print('Libraries imported.')

Libraries imported.


## 1. Scrape Wikipedia page for Toronto Neighborhoods

The page we scrape is 'https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Toronto'. It lists Toronto's neighborhoods, grouped into 6 boroughs and 210 neighborhood entries. Each entry links to a neighborhood page, which provides the coordinates (latitude, longitude) where available. The list also contains a few duplicate neighborhoods.

In [2]:
#open the wikipedia url using urllib.urlopen method
html = urlopen('https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Toronto')
html

<http.client.HTTPResponse at 0x54e5048>

In [3]:
#Create an object of BeautifulSoup to read the html object returned in the above step
res = BeautifulSoup(html.read(), 'html5lib')

In [4]:
#print the title of Wikipedia page
print(res.title)

<title>List of neighbourhoods in Toronto - Wikipedia</title>


In [139]:
#Use the BeautifulSoup object 'res' to scrape the boroughs of Toronto
headings = res.findAll('li', {'class':'toclevel-2'})

boroughs = []
for heading in headings:
    boroughs.append(heading.find('span', attrs = {'class':'toctext'}).text)
boroughs

['Old Toronto', 'East York', 'Etobicoke', 'Scarborough', 'North York', 'York']

In [386]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

In [388]:
#Scrape Neighborhood, longitude and latitude values from the Wikipedia page
# Nominatim was imported above; recent geopy releases require a user_agent (this value is an arbitrary placeholder)
geolocator = Nominatim(user_agent='toronto_neighborhoods')

table = res.findAll('table', attrs = {'class':'multicol'}) 

for i in range(len(table)):
    
    rs = table[i].findAll('li')
    
    for result in rs:
        
        borough = boroughs[i] #Get the i'th borough from boroughs list
        neighborhood_name = result.a.text.strip() #Get the name of neighborhood
        
        #Get the latitude & longitude values
        url = 'https://en.wikipedia.org' + result.a['href']
        html = urlopen(url)
        response = BeautifulSoup(html.read(), 'html5lib')
        location = response.find('span', {'class':'geo'})
        
        # Use the scraped coordinates when the page has a 'geo' span
        if location is not None:
            neighborhood_lat = location.text.split(';')[0].strip()
            neighborhood_lon = location.text.split(';')[1].strip()
        else:
            # Otherwise fall back to geocoding the neighborhood name with Nominatim
            neighborhood_lat, neighborhood_lon = None, None
            try:
                address = neighborhood_name + ', Toronto, CA'
                geocoded = geolocator.geocode(address)
                if geocoded is not None:
                    neighborhood_lat = geocoded.latitude
                    neighborhood_lon = geocoded.longitude
            except Exception:  # e.g. a geocoder timeout; leave the coordinates as None
                pass
        
                
        #Append to the neighborhoods dataframe (pd.concat, since DataFrame.append was removed in pandas 2.0)
        new_row = pd.DataFrame([{'Borough': borough,
                                 'Neighborhood': neighborhood_name,
                                 'Latitude': neighborhood_lat,
                                 'Longitude': neighborhood_lon}])
        neighborhoods = pd.concat([neighborhoods, new_row], ignore_index=True)
neighborhoods.head()



Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Old Toronto,Alexandra Park,43.65,-79.4
1,Old Toronto,The Annex,43.67,-79.404
2,Old Toronto,Baldwin Village,43.656,-79.3934
3,Old Toronto,Cabbagetown,43.6664,-79.3629
4,Old Toronto,CityPlace,43.640044,-79.395179
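
The scraping cell reads the text of the `geo` span and splits it on `;`, which leaves latitude and longitude as strings. A minimal sketch of that parsing step with an explicit conversion to floats is shown here; `parse_geo_text` is a hypothetical helper, and the `'lat; lon'` format is assumed from the `split(';')` call in the cell above.

```python
def parse_geo_text(text):
    """Split a 'lat; lon' string into a (latitude, longitude) pair of floats."""
    lat_text, lon_text = text.split(';')
    return float(lat_text.strip()), float(lon_text.strip())


# Sample value shaped like the coordinates in the table above
print(parse_geo_text('43.65; -79.4'))  # (43.65, -79.4)
```

Converting at this point would make the later `astype(float)` step unnecessary, at the cost of a `ValueError` if a page uses a different coordinate format.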


In [389]:
neighborhoods.tail()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
205,York,Old Mill,43.6521,-79.4893
206,York,Rockcliffe–Smythe,43.67528,-79.48861
207,York,Silverthorn,43.69,-79.476
208,York,Tichester,43.689829,-79.478066
209,York,Weston,43.7009889,-79.5197


In [431]:
#Check for duplicated values
neighborhoods[neighborhoods.Neighborhood.duplicated()]

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
82,Old Toronto,Queen Street West,43.649584,-79.39241
91,East York,East Danforth,43.68806,-79.30194
164,North York,Bermondsey,43.69556,-79.45


In [434]:
neighborhoods[neighborhoods.Neighborhood.isin(['Queen Street West','East Danforth','Bermondsey'])]

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
23,Old Toronto,Queen Street West,43.649584,-79.39241
33,Old Toronto,East Danforth,43.68806,-79.30194
82,Old Toronto,Queen Street West,43.649584,-79.39241
91,East York,East Danforth,43.68806,-79.30194
94,East York,Bermondsey,43.69556,-79.45
164,North York,Bermondsey,43.69556,-79.45


In [443]:
#Remove duplicate values based on column Neighborhood
neighborhoods.drop_duplicates('Neighborhood', inplace=True)

In [442]:
neighborhoods.shape

(207, 4)
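
By default `drop_duplicates` keeps the first row for each value (`keep='first'`), so a neighborhood listed under two boroughs stays under the borough where it appears first. The same rule can be sketched in plain Python; `drop_duplicate_names` is a hypothetical helper used only for illustration.

```python
def drop_duplicate_names(rows):
    """Keep the first row seen for each neighborhood name."""
    seen = set()
    unique_rows = []
    for row in rows:
        if row['Neighborhood'] not in seen:
            seen.add(row['Neighborhood'])
            unique_rows.append(row)
    return unique_rows


# Two of the duplicated rows listed above
rows = [
    {'Borough': 'East York', 'Neighborhood': 'Bermondsey'},
    {'Borough': 'North York', 'Neighborhood': 'Bermondsey'},
]
kept = drop_duplicate_names(rows)
print(kept)  # only the East York row remains
```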

In [444]:
#Check for null values
neighborhoods.isnull().sum()

Borough         0
Neighborhood    0
Latitude        0
Longitude       0
dtype: int64

In [483]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 6 boroughs and 207 neighborhoods.


In [446]:
#Export the dataframe to a csv file
neighborhoods.to_csv('TorontoNeighborhoods.csv', index=False)

In [395]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:  # the key is 'venue.categories' when the JSON has been flattened
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [396]:
#Define FourSquare credentials and version
CLIENT_ID = 'YEP0IXWCN4ZGNGNO21BL2WUSF3F2ZPOXRZ13YOTBYLL4XJGN' # your Foursquare ID
CLIENT_SECRET = 'KUMERWHCHRO1LOC3FNTYK5RKGGU1WVWN1YP3KQ1Y0FXRRPOS' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YEP0IXWCN4ZGNGNO21BL2WUSF3F2ZPOXRZ13YOTBYLL4XJGN
CLIENT_SECRET:KUMERWHCHRO1LOC3FNTYK5RKGGU1WVWN1YP3KQ1Y0FXRRPOS
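
Hard-coding and printing the client ID and secret exposes them to anyone who reads the notebook. One possible alternative, sketched under the assumption that the values are stored in environment variables named `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` (both names are made up for this example), is:

```python
import os


def load_credentials(environ=os.environ):
    """Return (client_id, client_secret) read from environment variables."""
    # The variable names are assumptions for this sketch, not a Foursquare convention
    client_id = environ.get('FOURSQUARE_CLIENT_ID')
    client_secret = environ.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Foursquare credentials are not set in the environment')
    return client_id, client_secret


# Demonstration with a plain dict standing in for the real environment
demo_env = {'FOURSQUARE_CLIENT_ID': 'my-id', 'FOURSQUARE_CLIENT_SECRET': 'my-secret'}
client_id, client_secret = load_credentials(demo_env)
print('CLIENT_ID loaded:', bool(client_id))
```

In the notebook itself the call would be `load_credentials()` with no argument, so the values come from the real environment.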


## 2. Explore Neighborhoods in Toronto

#### Let's create a function to extract venue details for all neighborhoods in Toronto

In [400]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [448]:
# Run the function above on each neighborhood and store the result in a new dataframe called toronto_venues.
toronto_venues = getNearbyVenues(names=neighborhoods['Neighborhood'],
                                   latitudes=neighborhoods['Latitude'],
                                   longitudes=neighborhoods['Longitude']
                                  )
toronto_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Alexandra Park,43.65,-79.4,Maker Pizza,43.650401,-79.39804,Pizza Place
1,Alexandra Park,43.65,-79.4,Come As You Are Co-operative,43.648307,-79.398315,Adult Boutique
2,Alexandra Park,43.65,-79.4,Core Studio Yoga & Pilates,43.64792,-79.400196,Yoga Studio
3,Alexandra Park,43.65,-79.4,Saku Sushi,43.648038,-79.400268,Sushi Restaurant
4,Alexandra Park,43.65,-79.4,Sonic Boom,43.650859,-79.396985,Record Shop


In [449]:
#Let's check the size of the resulting dataframe
toronto_venues.shape

(5205, 7)

In [450]:
#Let's check how many venues were returned for each neighborhood
toronto_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Agincourt,3,3,3,3,3,3
Alderwood,4,4,4,4,4,4
Alexandra Park,100,100,100,100,100,100
Amesbury,3,3,3,3,3,3
Armadale,12,12,12,12,12,12
Armour Heights,1,1,1,1,1,1
Baby Point,13,13,13,13,13,13
Baldwin Village,73,73,73,73,73,73
Bathurst Manor,4,4,4,4,4,4
Bayview Village,3,3,3,3,3,3


In [451]:
#Let's find out how many unique categories can be curated from all the returned venues
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 322 unique categories.


## 3. Analyze Each Neighborhood

In [453]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Animal Shelter,Antique Shop,Aquarium,Arepa Restaurant,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [454]:
#And let's examine the new dataframe size.
toronto_onehot.shape

(5205, 322)

In [456]:
#Next, let's group rows by neighborhood and take the mean of each one-hot column, i.e. the share of the neighborhood's venues in each category
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,American Restaurant,Animal Shelter,Antique Shop,Aquarium,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,Agincourt,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000
1,Alderwood,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000
2,Alexandra Park,0.020000,0.000000,0.010000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,...,0.050000,0.010000,0.000000,0.010000,0.000000,0.0,0.010000,0.000000,0.000000,0.000000
3,Amesbury,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000
4,Armadale,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.083333,0.000000
5,Armour Heights,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000
6,Baby Point,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000
7,Baldwin Village,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,...,0.013699,0.000000,0.000000,0.041096,0.000000,0.0,0.000000,0.000000,0.000000,0.000000
8,Bathurst Manor,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000
9,Bayview Village,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.00000,0.0,...,0.000000,0.000000,0.000000,0.000000,0.000000,0.0,0.000000,0.000000,0.000000,0.000000


In [484]:
#Let's confirm the new size
toronto_grouped.shape

(200, 322)
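
Taking the mean of the one-hot columns per neighborhood gives the share of that neighborhood's venues in each category. For example, Alexandra Park has 100 venues and a Yoga Studio value of 0.02, which corresponds to two yoga studios. The arithmetic can be sketched in plain Python; `category_frequencies` and the sample data are made up for this illustration.

```python
from collections import Counter, defaultdict


def category_frequencies(venues):
    """Map each neighborhood to {category: share of its venues}."""
    counts = defaultdict(Counter)
    for neighborhood, category in venues:
        counts[neighborhood][category] += 1
    return {
        neighborhood: {cat: n / sum(counter.values()) for cat, n in counter.items()}
        for neighborhood, counter in counts.items()
    }


# Invented (neighborhood, category) pairs
sample = [('A', 'Park'), ('A', 'Park'), ('A', 'Café'), ('A', 'Bar'), ('B', 'Pool')]
freqs = category_frequencies(sample)
print(freqs['A']['Park'])  # 0.5
print(freqs['B']['Pool'])  # 1.0
```

A neighborhood with a single venue, such as `B` here, ends up with a frequency of 1.0 for one category, which is worth keeping in mind when reading the clusters.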

In [410]:
#Let's write a function that sorts a neighborhood's venue categories by frequency, highest first, and returns the top ones.
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [458]:
#Now let's create the new dataframe and display the top 10 venues for each neighborhood.
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # positions beyond the 3rd use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Park,BBQ Joint,Women's Store,Falafel Restaurant,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant,Event Space,Exhibit
1,Alderwood,Playground,Athletics & Sports,Market,Park,Women's Store,Falafel Restaurant,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant
2,Alexandra Park,Vegetarian / Vegan Restaurant,Café,Bar,Coffee Shop,Dessert Shop,French Restaurant,Ice Cream Shop,Indian Restaurant,Filipino Restaurant,Park
3,Amesbury,Grocery Store,Lighting Store,Supermarket,Women's Store,Falafel Restaurant,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant,Event Space
4,Armadale,Spa,Fast Food Restaurant,Juice Bar,Greek Restaurant,Liquor Store,Big Box Store,Hakka Restaurant,Coffee Shop,Burger Joint,Pizza Place
5,Armour Heights,Pool,Women's Store,Farm,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Falafel Restaurant
6,Baby Point,Playground,Coffee Shop,Pizza Place,Burger Joint,Café,Taco Place,Latin American Restaurant,South American Restaurant,Bakery,Grocery Store
7,Baldwin Village,Coffee Shop,Café,Sandwich Place,Chinese Restaurant,Vietnamese Restaurant,Art Gallery,Ramen Restaurant,Dim Sum Restaurant,Sushi Restaurant,Japanese Restaurant
8,Bathurst Manor,Convenience Store,Baseball Field,Playground,Park,Falafel Restaurant,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant,Event Space
9,Bayview Village,Construction & Landscaping,Dog Run,Trail,Falafel Restaurant,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant,Event Space,Exhibit
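
Agincourt has only three venues, so it has at most three distinct categories, and the remaining columns in its row are filled with categories whose frequency is zero. A possible variation, sketched here with a hypothetical `top_categories` helper and invented frequencies, keeps only the categories that actually occur:

```python
def top_categories(frequencies, n):
    """Return up to n category names with a non-zero frequency, most frequent first."""
    present = [(cat, freq) for cat, freq in frequencies.items() if freq > 0]
    # Sort by descending frequency, then by name so that ties are deterministic
    present.sort(key=lambda pair: (-pair[1], pair[0]))
    return [cat for cat, _ in present[:n]]


# Invented frequencies for a neighborhood with very few venues
sample = {'Park': 0.5, 'BBQ Joint': 0.5, "Women's Store": 0.0, 'Exhibit': 0.0}
print(top_categories(sample, 10))  # ['BBQ Joint', 'Park']
```

The result can be shorter than `n`, so a table built from it would need empty cells for the missing ranks.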


## 4. Cluster Neighborhoods

Run k-means to group the neighborhoods into 5 clusters.

In [459]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 2, 2, 1, 1, 2, 2, 2, 2, 2])
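
To make the clustering step less of a black box, here is a minimal sketch of the two alternating steps of k-means (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is an illustration only, not scikit-learn's implementation, which by default also uses k-means++ initialization. The function names and the toy points are made up.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def simple_kmeans(points, centroids, iterations=10):
    """Run Lloyd's algorithm from the given starting centroids."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if the cluster is empty
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Toy frequency vectors: two neighborhoods per venue profile
points = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
labels, centroids = simple_kmeans(points, centroids=[points[0], points[2]])
print(labels)  # [0, 0, 1, 1]
```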

In [464]:
#Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.
toronto_merged = neighborhoods

# join neighborhoods_venues_sorted onto neighborhoods so each row keeps its borough and latitude/longitude
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood', how='right')

# add clustering labels
toronto_merged['Cluster Labels'] = kmeans.labels_

toronto_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
0,Old Toronto,Alexandra Park,43.65,-79.4,Vegetarian / Vegan Restaurant,Café,Bar,Coffee Shop,Dessert Shop,French Restaurant,Ice Cream Shop,Indian Restaurant,Filipino Restaurant,Park,0
1,Old Toronto,The Annex,43.67,-79.404,Coffee Shop,Pharmacy,Park,Greek Restaurant,Gym,Thai Restaurant,Pub,Pizza Place,Bookstore,Café,2
2,Old Toronto,Baldwin Village,43.656,-79.3934,Coffee Shop,Café,Sandwich Place,Chinese Restaurant,Vietnamese Restaurant,Art Gallery,Ramen Restaurant,Dim Sum Restaurant,Sushi Restaurant,Japanese Restaurant,2
3,Old Toronto,Cabbagetown,43.6664,-79.3629,Coffee Shop,Park,Breakfast Spot,Snack Place,Baseball Field,Jewelry Store,Taiwanese Restaurant,General Entertainment,Sushi Restaurant,Beer Store,1
4,Old Toronto,CityPlace,43.640044,-79.395179,Coffee Shop,Café,Gym,Park,Pub,Japanese Restaurant,Grocery Store,Caribbean Restaurant,French Restaurant,Sushi Restaurant,1
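
`kmeans.labels_` follows the row order of `toronto_grouped`, which starts with Agincourt, while the merged table above starts with Alexandra Park. Assigning the array as a column pairs labels with rows by position, so the labels may land on the wrong neighborhoods if the two orders differ. One way to avoid this, sketched with a hypothetical `labels_by_name` helper, is to look each label up by neighborhood name:

```python
def labels_by_name(grouped_names, labels):
    """Pair each neighborhood name with the label at the same position."""
    if len(grouped_names) != len(labels):
        raise ValueError('names and labels must have the same length')
    return dict(zip(grouped_names, labels))


# First three names of toronto_grouped and the first three labels printed earlier
grouped_names = ['Agincourt', 'Alderwood', 'Alexandra Park']
labels = [0, 2, 2]

lookup = labels_by_name(grouped_names, labels)

# Rows in a different order still receive the label that belongs to them
other_order = ['Alexandra Park', 'Agincourt']
print([lookup[name] for name in other_order])  # [2, 0]
```

In pandas, the equivalent idea is to add the labels to `neighborhoods_venues_sorted` before the join, so that they travel with the neighborhood name.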


In [468]:
#Get geographical coordinates for Toronto City
address = 'Toronto, CA'

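geolocator = Nominatim(user_agent='toronto_neighborhoods')  # arbitrary placeholder; recent geopy releases require a user_agent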
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


In [474]:
#Convert datatypes of latitude and longitude to float in toronto_merged dataframe
toronto_merged.Latitude = toronto_merged.Latitude.astype(float)
toronto_merged.Longitude = toronto_merged.Longitude.astype(float)
toronto_merged.dtypes

Borough                    object
Neighborhood               object
Latitude                  float64
Longitude                 float64
1st Most Common Venue      object
2nd Most Common Venue      object
3rd Most Common Venue      object
4th Most Common Venue      object
5th Most Common Venue      object
6th Most Common Venue      object
7th Most Common Venue      object
8th Most Common Venue      object
9th Most Common Venue      object
10th Most Common Venue     object
Cluster Labels              int32
dtype: object

In [475]:
# Finally, let's visualize the resulting clusters on a map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
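
The marker color is looked up with `rainbow[cluster-1]`. Because the labels start at 0, label 0 maps to index -1, which Python resolves to the last element of the list. A small illustration with a hypothetical five-color palette (the names stand in for the hex codes built from `cm.rainbow`):

```python
# Hypothetical palette; the notebook builds hex codes from matplotlib's rainbow colormap instead
palette = ['red', 'orange', 'green', 'blue', 'purple']

# Same lookup as the map cell: index cluster-1, so label 0 wraps around to the last color
color_for_label = {cluster: palette[cluster - 1] for cluster in range(len(palette))}
print(color_for_label)
# {0: 'purple', 1: 'red', 2: 'orange', 3: 'green', 4: 'blue'}
```

Every label still gets a distinct color, so the map is unaffected; indexing with `rainbow[cluster]` would simply make the mapping easier to read.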

## 5. Examine Clusters

#### Cluster 1

In [476]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(4, toronto_merged.shape[1]-1))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Alexandra Park,Vegetarian / Vegan Restaurant,Café,Bar,Coffee Shop,Dessert Shop,French Restaurant,Ice Cream Shop,Indian Restaurant,Filipino Restaurant,Park
10,The Entertainment District,Coffee Shop,Café,Bar,Italian Restaurant,Japanese Restaurant,Bubble Tea Shop,Sandwich Place,Salad Place,Ice Cream Shop,Spa
66,Dovercourt Park,Café,Bar,Bakery,Coffee Shop,Pizza Place,Ice Cream Shop,Bus Line,Sushi Restaurant,Japanese Restaurant,Restaurant
72,Junction Triangle,Bar,Bakery,Café,Caribbean Restaurant,Breakfast Spot,Other Great Outdoors,Gastropub,Music Store,Clothing Store,Coffee Shop
77,Little Tibet,Coffee Shop,Café,Bar,Grocery Store,Vietnamese Restaurant,Bakery,Restaurant,Middle Eastern Restaurant,Pizza Place,Gastropub
94,Bermondsey,Bakery,Park,Thai Restaurant,Bank,Grocery Store,Pharmacy,Discount Store,Coffee Shop,Eastern European Restaurant,Electronics Store
119,Stonegate-Queensway,Coffee Shop,Electronics Store,Liquor Store,Grocery Store,Fried Chicken Joint,Chinese Restaurant,Sandwich Place,Bank,Smoothie Shop,Light Rail Station
129,Birch Cliff,Thai Restaurant,Discount Store,Diner,Bank,Falafel Restaurant,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant,Event Space
155,West Hill,Pizza Place,Fast Food Restaurant,Breakfast Spot,Coffee Shop,Burger Joint,Greek Restaurant,Beer Store,Supermarket,Sandwich Place,Fried Chicken Joint
166,The Bridle Path,Music Venue,Convenience Store,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm


#### Cluster 2

In [477]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(4, toronto_merged.shape[1]-1))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Cabbagetown,Coffee Shop,Park,Breakfast Spot,Snack Place,Baseball Field,Jewelry Store,Taiwanese Restaurant,General Entertainment,Sushi Restaurant,Beer Store
4,CityPlace,Coffee Shop,Café,Gym,Park,Pub,Japanese Restaurant,Grocery Store,Caribbean Restaurant,French Restaurant,Sushi Restaurant
12,Fashion District,Restaurant,Coffee Shop,Italian Restaurant,Bar,Café,French Restaurant,Sandwich Place,Dessert Shop,Gym,Yoga Studio
13,Financial District,Coffee Shop,Café,Hotel,American Restaurant,Restaurant,Steakhouse,Deli / Bodega,Gastropub,Gym,Seafood Restaurant
17,Harbourfront,Boat or Ferry,Coffee Shop,Aquarium,Café,Pizza Place,Brewery,Food Court,Gym,Sports Bar,Music Venue
32,The Beaches,Beach,Nail Salon,Tea Room,Bar,Thai Restaurant,Bakery,Coffee Shop,Japanese Restaurant,Park,Sandwich Place
33,East Danforth,Pharmacy,Grocery Store,Skating Rink,Café,Bakery,Chinese Restaurant,Sandwich Place,Fried Chicken Joint,Light Rail Station,Moving Target
34,Gerrard Street East,Chinese Restaurant,Vietnamese Restaurant,Bar,Bakery,Fast Food Restaurant,Fish Market,Trail,Baseball Field,Thai Restaurant,Grocery Store
44,Chaplin Estates,Italian Restaurant,Sushi Restaurant,Restaurant,Coffee Shop,Indian Restaurant,Sandwich Place,French Restaurant,Japanese Restaurant,Bank,Café
53,Rosedale,Park,Playground,Trail,Women's Store,Exhibit,Eastern European Restaurant,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant


#### Cluster 3

In [478]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(4, toronto_merged.shape[1]-1))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,The Annex,Coffee Shop,Pharmacy,Park,Greek Restaurant,Gym,Thai Restaurant,Pub,Pizza Place,Bookstore,Café
2,Baldwin Village,Coffee Shop,Café,Sandwich Place,Chinese Restaurant,Vietnamese Restaurant,Art Gallery,Ramen Restaurant,Dim Sum Restaurant,Sushi Restaurant,Japanese Restaurant
5,Chinatown,Café,Bar,Vegetarian / Vegan Restaurant,Chinese Restaurant,Vietnamese Restaurant,Mexican Restaurant,Coffee Shop,Dumpling Restaurant,Art Gallery,Noodle House
6,Church and Wellesley,Coffee Shop,Gay Bar,Japanese Restaurant,Burger Joint,Dance Studio,Diner,Fast Food Restaurant,Bubble Tea Shop,Restaurant,Ramen Restaurant
7,Corktown,Coffee Shop,Restaurant,Bakery,Park,Breakfast Spot,Hotel,Greek Restaurant,Pub,Mexican Restaurant,Mediterranean Restaurant
8,Discovery District,Coffee Shop,Café,Bar,Italian Restaurant,Japanese Restaurant,Bubble Tea Shop,Sandwich Place,Salad Place,Ice Cream Shop,Spa
9,Distillery District,Coffee Shop,Café,Bar,Italian Restaurant,Japanese Restaurant,Bubble Tea Shop,Sandwich Place,Salad Place,Ice Cream Shop,Spa
11,East Bayfront,Coffee Shop,Park,Lounge,Beach,Grocery Store,Tourist Information Center,Clothing Store,Bar,Playground,Movie Theater
14,Garden District,Coffee Shop,Café,Ramen Restaurant,Clothing Store,Hotel,Sandwich Place,Japanese Restaurant,Diner,Bar,Lounge
15,Grange Park,Café,Coffee Shop,Chinese Restaurant,Sandwich Place,Japanese Restaurant,Ramen Restaurant,Art Gallery,French Restaurant,Arts & Crafts Store,Clothing Store


#### Cluster 4

In [479]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(4, toronto_merged.shape[1]-1))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
198,Briar Hill–Belgravia,Bakery,Park,Thai Restaurant,Bank,Grocery Store,Pharmacy,Discount Store,Coffee Shop,Eastern European Restaurant,Electronics Store


#### Cluster 5

In [480]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(4, toronto_merged.shape[1]-1))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
177,Humbermede,Tea Room,Hockey Field,Hockey Arena,Restaurant,Women's Store,Exhibit,Eastern European Restaurant,Electronics Store,Elementary School,Empanada Restaurant
