# Segmenting and Clustering Neighborhoods in Toronto


## Importing Numpy and Pandas, and Extracting the Data

In [7]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis

**Step 1: Importing Requests and Beautiful Soup to extract data from Wikipage into a list of dataframes**

In [8]:


import requests
from bs4 import BeautifulSoup

res = requests.get("https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M")
soup = BeautifulSoup(res.content,'lxml')
table = soup.find_all('table')[0] 
df = pd.read_html(str(table))

toronto_area_codes = df[0]

**Step 2: See that the dataframe only has 3 columns - Postal Code, Borough and Neighborhood**

In [5]:
toronto_area_codes

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M1A,Not assigned,
1,M2A,Not assigned,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
7,M8A,Not assigned,
8,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
9,M1B,Scarborough,"Malvern, Rouge"


**Step 3: Create a new dataframe that filters out codes that do not have an assigned Borough**

In [9]:
resultdf = toronto_area_codes[toronto_area_codes['Borough'] != 'Not assigned']

*Resetting the index after  filtering*

In [10]:
resultdf.reset_index()

Unnamed: 0,index,Postal Code,Borough,Neighborhood
0,2,M3A,North York,Parkwoods
1,3,M4A,North York,Victoria Village
2,4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,5,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
5,8,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
6,9,M1B,Scarborough,"Malvern, Rouge"
7,11,M3B,North York,Don Mills
8,12,M4B,East York,"Parkview Hill, Woodbine Gardens"
9,13,M5B,Downtown Toronto,"Garden District, Ryerson"


*Rows no longer need to be combined via group by, the Wikipedia page already has the neighbourhoods grouped, separated by comma*

**Step 4: If neighbourhood is not assigned, replace it with the value of the Borough**

In [11]:
resultdf['Neighborhood'].replace('Not assigned', resultdf['Borough'])

2                                              Parkwoods
3                                       Victoria Village
4                              Regent Park, Harbourfront
5                       Lawrence Manor, Lawrence Heights
6            Queen's Park, Ontario Provincial Government
8                Islington Avenue, Humber Valley Village
9                                         Malvern, Rouge
11                                             Don Mills
12                       Parkview Hill, Woodbine Gardens
13                              Garden District, Ryerson
14                                             Glencairn
17     West Deane Park, Princess Gardens, Martin Grov...
18                Rouge Hill, Port Union, Highland Creek
20                                             Don Mills
21                                      Woodbine Heights
22                                        St. James Town
23                                    Humewood-Cedarvale
26     Eringate, Bloordale Gard

In [12]:
resultdf.reset_index()

Unnamed: 0,index,Postal Code,Borough,Neighborhood
0,2,M3A,North York,Parkwoods
1,3,M4A,North York,Victoria Village
2,4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,5,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
5,8,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
6,9,M1B,Scarborough,"Malvern, Rouge"
7,11,M3B,North York,Don Mills
8,12,M4B,East York,"Parkview Hill, Woodbine Gardens"
9,13,M5B,Downtown Toronto,"Garden District, Ryerson"


**Step 5: Print the shape of the dataframe**

In [13]:
resultdf.shape

(103, 3)

**Step 6: Share on Github**

*Done*

## Adding Latitude and Longitude for the Neighborhoods

In [14]:
!pip install geocoder
import geocoder


Collecting geocoder
[?25l  Downloading https://files.pythonhosted.org/packages/4f/6b/13166c909ad2f2d76b929a4227c952630ebaf0d729f6317eb09cbceccbab/geocoder-1.38.1-py2.py3-none-any.whl (98kB)
[K     |████████████████████████████████| 102kB 7.0MB/s ta 0:00:011
[?25hCollecting ratelim (from geocoder)
  Downloading https://files.pythonhosted.org/packages/f2/98/7e6d147fd16a10a5f821db6e25f192265d6ecca3d82957a4fdd592cad49c/ratelim-0.1.6-py2.py3-none-any.whl
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6


In [15]:
postal_codes = resultdf['Postal Code'].values

#### GeoCode API

In [16]:
API_KEY = 'd81586d5707f448e89f23450fe404003'

In [17]:
import json

latitudes = [] # Initializing the latitude array
longitudes = [] # Initializing the longitude array

for postal_code in postal_codes : 
    place_name = postal_code + " Toronto, Canada" # Formats the place name
    url = 'https://api.opencagedata.com/geocode/v1/json?q={}&key={}'.format(place_name, API_KEY) # Gets the proper url to make the API call
    obj = json.loads(requests.get(url).text) # Loads the JSON file in the form of a python dictionary
    
    results = obj['results'] # Extracts the results information out of the JSON file
    lat = results[0]['geometry']['lat'] # Extracts the latitude value
    lng = results[0]['geometry']['lng'] # Extracts the longitude value
    
    latitudes.append(lat) # Appending to the list of latitudes
    longitudes.append(lng) # Appending to the list of longitudes
    
resultdf['Latitude'] = latitudes
resultdf['Longitude'] = longitudes

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy


In [18]:
resultdf.head(10)

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
2,M3A,North York,Parkwoods,43.653482,-79.383935
3,M4A,North York,Victoria Village,43.7276,-79.3148
4,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.6555,-79.3626
5,M6A,North York,"Lawrence Manor, Lawrence Heights",43.7223,-79.4504
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.653482,-79.383935
8,M9A,Etobicoke,"Islington Avenue, Humber Valley Village",43.6662,-79.5282
9,M1B,Scarborough,"Malvern, Rouge",43.653482,-79.383935
11,M3B,North York,Don Mills,43.745,-79.359
12,M4B,East York,"Parkview Hill, Woodbine Gardens",43.7063,-79.3094
13,M5B,Downtown Toronto,"Garden District, Ryerson",43.6572,-79.3783


## Importing Libraries for clustering

In [61]:


# Downloading folium, if not installed
!pip install folium
import folium # Map plotting library
import numpy as np
from pandas.io.json import json_normalize # Tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# Import k-means from clustering stage
from sklearn.cluster import KMeans



### Mapping Toronto Neighborhoods in Folium

In [22]:
# Toronto latitude and longitude using Google search
tor_lat = 43.6532
tor_lng = -79.3832

# Creates map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[tor_lat, tor_lng], zoom_start=10)

# Add markers to map
for lat, lng, borough, neighbourhood in zip(resultdf['Latitude'], resultdf['Longitude'], resultdf['Borough'], resultdf['Neighborhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

### Foursquare Creds

In [32]:
CLIENT_ID = 'XRQC2X3B1LEQ5OQT5JIPB2YP2QHEQ53O0XSQMCZL1OMOFAMQ'
CLIENT_SECRET = 'NQMWFNYPZKSX2D1PAZUWYSOFXP24BCHO33FKK250TBZW50DK'

In [33]:
VERSION = '20180604'

### Function to get name of category

In [38]:
# Gets the name of the category

def get_category_type(row):
    try:
        categories_list = row['Category']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

### Using Foursquare API for all Neighborhoods

In [39]:
explore_df_list = []

for i, nbd_name in enumerate(resultdf['Neighborhood']):  
    
    try :         
        ### Getting the data of neighbourhood
        nbd_name = resultdf.loc[i, 'Neighborhood']
        nbd_lat = resultdf.loc[i, 'Latitude']
        nbd_lng = resultdf.loc[i, 'Longitude']

        radius = 500 # Setting the radius as 500 metres
        LIMIT = 100 # Getting the top 100 venues

        url = 'https://api.foursquare.com/v2/venues/explore?client_id={} \
        &client_secret={}&ll={},{}&v={}&radius={}&limit={}'\
        .format(CLIENT_ID, CLIENT_SECRET, nbd_lat, nbd_lng, VERSION, radius, LIMIT)

        results = json.loads(requests.get(url).text)
        results = results['response']['groups'][0]['items']

        nearby = json_normalize(results) # Flattens JSON

        # Filtering the columns
        filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
        nearby = nearby.loc[:, filtered_columns]

        # Renaming the columns
        columns = ['Name', 'Category', 'Latitude', 'Longitude']
        nearby.columns = columns

        # Gets the categories
        nearby['Category'] = nearby.apply(get_category_type, axis=1)

        # Gets the data required
        for i, name in enumerate(nearby['Name']):
            explore_df_list.append([nbd_name, nbd_lat, nbd_lng] + nearby.loc[i, :].values.tolist())
    
    except Exception as e:
        pass

### Creating a dataframe for performing clustering operations

In [40]:

explore_df = pd.DataFrame([item for item in explore_df_list])
explore_df.columns = ['Neighborhood', 'Neighborhood Latitude', 'Neighborhood Longitude', 'Venue Name', 'Venue Category', 'Venue Latitude', 'Venue Longitude']
explore_df.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue Name,Venue Category,Venue Latitude,Venue Longitude
0,Parkwoods,43.653482,-79.383935,Downtown Toronto,Neighborhood,43.653232,-79.385296
1,Parkwoods,43.653482,-79.383935,Nathan Phillips Square,Plaza,43.65227,-79.383516
2,Parkwoods,43.653482,-79.383935,Indigo,Bookstore,43.653515,-79.380696
3,Parkwoods,43.653482,-79.383935,Chatime 日出茶太,Bubble Tea Shop,43.655542,-79.384684
4,Parkwoods,43.653482,-79.383935,Textile Museum of Canada,Art Museum,43.654396,-79.3865


**Total number of unique categories**

In [41]:
len(explore_df['Venue Category'].unique())

236

**Performing one-hot encoding to analyze the neighborhoods**

In [53]:
# One hot encoding
toronto_onehot = pd.get_dummies(explore_df[['Venue Category']], prefix="", prefix_sep="")

# Add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = explore_df['Neighborhood'] 

# Move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,American Restaurant,Aquarium,Arcade,Art Gallery,Art Museum,Asian Restaurant,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,0


**Aggregating venues by neighborhoods**

In [54]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped.head()

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,American Restaurant,Aquarium,Arcade,Art Gallery,Art Museum,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478
3,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.013889,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Birch Cliff, Cliffside West",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### Function to return the most common venues

In [55]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

### Creating a Dataframe for the top 10 venues

In [57]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']

# Create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# Create a new dataframe
neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighbourhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Bathurst Manor, Wilson Heights, Downsview North",Coffee Shop,Bank,Bridal Shop,Pizza Place,Sushi Restaurant,Middle Eastern Restaurant,Diner,Deli / Bodega,Shopping Mall,Restaurant
1,Bayview Village,Locksmith,Park,Flower Shop,Trail,Women's Store,Diner,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant
2,"Bedford Park, Lawrence Manor East",Sushi Restaurant,Sandwich Place,Italian Restaurant,Coffee Shop,Pizza Place,Restaurant,Butcher,Café,Pub,Women's Store
3,Berczy Park,Coffee Shop,Boat or Ferry,Restaurant,Hotel,Sporting Goods Shop,Japanese Restaurant,Liquor Store,Bar,Plaza,Fried Chicken Joint
4,"Birch Cliff, Cliffside West",Café,College Stadium,Skating Rink,General Entertainment,Women's Store,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store


### Using K-Means Clustering 

In [58]:
# Set number of clusters
kclusters = 5
toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', 1)

# Run k-means clustering
kmeans = KMeans(n_clusters = kclusters, random_state = 0).fit(toronto_grouped_clustering)

# Check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

# Add clustering labels
neighbourhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

### Creating a new daataframe with the most common venues

In [59]:
toronto_merged = resultdf
toronto_merged = toronto_merged.join(neighbourhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
toronto_merged.dropna(inplace = True)
toronto_merged['Cluster Labels'] = toronto_merged['Cluster Labels'].astype(int)
toronto_merged.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,M3A,North York,Parkwoods,43.653482,-79.383935,4,Clothing Store,Coffee Shop,Restaurant,Thai Restaurant,Seafood Restaurant,Diner,Hotel,American Restaurant,Theater,Plaza
3,M4A,North York,Victoria Village,43.7276,-79.3148,4,French Restaurant,Hockey Arena,Park,Pizza Place,Coffee Shop,Financial or Legal Service,Portuguese Restaurant,Intersection,Donut Shop,Diner
4,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.6555,-79.3626,4,Coffee Shop,Breakfast Spot,Theater,Health Food Store,Italian Restaurant,Food Truck,Event Space,Electronics Store,Distribution Center,Playground
5,M6A,North York,"Lawrence Manor, Lawrence Heights",43.7223,-79.4504,4,Clothing Store,Coffee Shop,Cosmetics Shop,Restaurant,Women's Store,Men's Store,Jewelry Store,Toy / Game Store,Electronics Store,Sushi Restaurant
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.653482,-79.383935,4,Clothing Store,Coffee Shop,Restaurant,Thai Restaurant,Seafood Restaurant,Diner,Hotel,American Restaurant,Theater,Plaza


### Visualizing the clusters

In [63]:
# Create map
map_clusters = folium.Map(location=[tor_lat, tor_lng], zoom_start=11)

# Set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' (Cluster ' + str(cluster) + ')', parse_html=True)
    map_clusters.add_child(
        folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7))
       
map_clusters

### Cluster 0

In [66]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1, 2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
46,North York,Hillcrest Village,0,Park,Residential Building (Apartment / Condo),Women's Store,Dessert Shop,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Dumpling Restaurant
54,Scarborough,Scarborough Village,0,Grocery Store,Park,Women's Store,Dessert Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store
55,North York,"Fairview, Henry Farm, Oriole",0,Park,Pharmacy,Coffee Shop,Middle Eastern Restaurant,Convenience Store,Thai Restaurant,Pizza Place,Wine Shop,Ethiopian Restaurant,Electronics Store
57,East York,"East Toronto, Broadview North (Old East York)",0,Park,Greek Restaurant,Convenience Store,Women's Store,Diner,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant
93,Central Toronto,Lawrence Park,0,Photography Studio,Park,Department Store,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Dumpling Restaurant,Donut Shop
100,North York,York Mills West,0,Park,Convenience Store,Bank,Women's Store,Discount Store,Field,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space


### Cluster 1

In [68]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1, 2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
77,North York,"North Park, Maple Leaf Park, Upwood Park",1,Massage Studio,Bakery,Financial or Legal Service,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant
99,Scarborough,"Dorset Park, Wexford Heights, Scarborough Town...",1,Bakery,Asian Restaurant,Women's Store,Discount Store,Field,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant


### Cluster 2

In [69]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1, 2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
26,Etobicoke,"Eringate, Bloordale Gardens, Old Burnhamthorpe...",2,Home Service,Diner,Field,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Dumpling Restaurant
94,Central Toronto,Roselawn,2,Home Service,Diner,Field,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Dumpling Restaurant
98,York,Weston,2,Home Service,Diner,Field,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Dumpling Restaurant


### Cluster 3 

In [70]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1, 2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
73,North York,"York Mills, Silver Hills",3,Pool,Women's Store,Dessert Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Dumpling Restaurant


### Cluster 4

In [71]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1, 2] + list(range(5, toronto_merged.shape[1]))]]


Unnamed: 0,Borough,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,North York,Parkwoods,4,Clothing Store,Coffee Shop,Restaurant,Thai Restaurant,Seafood Restaurant,Diner,Hotel,American Restaurant,Theater,Plaza
3,North York,Victoria Village,4,French Restaurant,Hockey Arena,Park,Pizza Place,Coffee Shop,Financial or Legal Service,Portuguese Restaurant,Intersection,Donut Shop,Diner
4,Downtown Toronto,"Regent Park, Harbourfront",4,Coffee Shop,Breakfast Spot,Theater,Health Food Store,Italian Restaurant,Food Truck,Event Space,Electronics Store,Distribution Center,Playground
5,North York,"Lawrence Manor, Lawrence Heights",4,Clothing Store,Coffee Shop,Cosmetics Shop,Restaurant,Women's Store,Men's Store,Jewelry Store,Toy / Game Store,Electronics Store,Sushi Restaurant
6,Downtown Toronto,"Queen's Park, Ontario Provincial Government",4,Clothing Store,Coffee Shop,Restaurant,Thai Restaurant,Seafood Restaurant,Diner,Hotel,American Restaurant,Theater,Plaza
8,Etobicoke,"Islington Avenue, Humber Valley Village",4,Pharmacy,Grocery Store,Park,Skating Rink,Bank,Dessert Shop,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store
9,Scarborough,"Malvern, Rouge",4,Clothing Store,Coffee Shop,Restaurant,Thai Restaurant,Seafood Restaurant,Diner,Hotel,American Restaurant,Theater,Plaza
11,North York,Don Mills,4,Restaurant,Coffee Shop,Bank,Pizza Place,Clothing Store,Shoe Store,Furniture / Home Store,Gourmet Shop,Bubble Tea Shop,Sandwich Place
12,East York,"Parkview Hill, Woodbine Gardens",4,Pizza Place,Gym / Fitness Center,Intersection,Pharmacy,Fast Food Restaurant,Breakfast Spot,Bank,Gastropub,Pet Store,General Entertainment
13,Downtown Toronto,"Garden District, Ryerson",4,Coffee Shop,Clothing Store,Café,Middle Eastern Restaurant,Japanese Restaurant,Cosmetics Shop,Italian Restaurant,Restaurant,Diner,Hotel
