# Segmenting and Clustering Neighborhoods in Toronto

First, we import the file prepared in Section 2 and load it into a dataframe.

In [44]:
import pandas as pd

In [45]:
# tnc - dataframe for Toronto's neighborhood data

tnc=pd.read_csv('torontoc.csv')
tnc.drop('Unnamed: 0',axis=1,inplace=True)
tnc.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494


___________________
## 3. Neighborhoods in Toronto: Clustering

In this section, we collect data on the venues in the city of Toronto and divide the city into clusters according to how frequently each venue category occurs. Within each cluster, the mix of venues should therefore be more similar than it is between clusters.

We start by drawing a folium map showing how the city is organized into Boroughs.

In [46]:
!pip install folium

Defaulting to user installation because normal site-packages is not writeable


In [47]:
import folium 

In [48]:
!pip install geopy

Defaulting to user installation because normal site-packages is not writeable


In [49]:
from geopy.geocoders import Nominatim

Getting the geographical coordinates of Toronto city:

In [50]:
address = 'Toronto, CN'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6425637, -79.38708718320467.


In the map below, clicking on a circle shows the Neighborhood names, the Borough name, and the Postal Code.

In [51]:
toronto_map=folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood, postalcode in zip(
        tnc['Latitude'], tnc['Longitude'],
        tnc['Borough'], tnc['Neighborhood'], tnc['Postal Code']):
    label = '{}, {}, {}'.format(neighborhood, borough, postalcode)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='green',
        fill_opacity=0.7).add_to(toronto_map)
    
toronto_map

Now, let us explore the city of Toronto. We start by accessing a Foursquare account.

Let us imagine we live in the borough Downtown Toronto, so we wish to make a study of the nearby venues in our area.

In [10]:
tnc.loc[24,'Borough']

'Downtown Toronto'

We get the coordinates stored in row 24, which belongs to Downtown Toronto:

In [11]:
myn_latitude=tnc.loc[24,'Latitude'] 
myn_longitude=tnc.loc[24,'Longitude']

myn_name = tnc.loc[24, 'Borough'] # borough name

print('Latitude and longitude values of {} are {}, {}.'.format(myn_name, myn_latitude, myn_longitude))

Latitude and longitude values of Downtown Toronto are 43.6579524, -79.3873826.


Now, we call the Foursquare API to retrieve up to 100 venues within a radius of 500 meters of this location in Downtown Toronto.

In [12]:
radius = 500
LIMIT=100 

url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, VERSION,myn_latitude,myn_longitude,radius, LIMIT)
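The request above relies on `CLIENT_ID`, `CLIENT_SECRET`, and `VERSION`, which are not defined in the cells shown here. The following is a minimal sketch of one way to supply them and to build the same URL with encoded query parameters. The environment variable names, the fallback values, and the helper `build_explore_url` are assumptions made for illustration and are not part of the original notebook.

```python
import os
from urllib.parse import urlencode

# Same endpoint as in the cell above
EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(lat, lng, radius, limit, client_id, client_secret, version):
    """Return the explore URL with all query parameters URL-encoded (hypothetical helper)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return EXPLORE_ENDPOINT + '?' + urlencode(params)

# Hypothetical environment variable names; the fallbacks are placeholders, not real credentials.
CLIENT_ID = os.environ.get('FOURSQUARE_CLIENT_ID', 'your-client-id')
CLIENT_SECRET = os.environ.get('FOURSQUARE_CLIENT_SECRET', 'your-client-secret')
VERSION = os.environ.get('FOURSQUARE_VERSION', '20180605')  # placeholder date in YYYYMMDD form

example_url = build_explore_url(43.6579524, -79.3873826, 500, 100,
                                CLIENT_ID, CLIENT_SECRET, VERSION)
print(example_url)
```

Reading the credentials from the environment keeps them out of the notebook itself, which matters if the notebook is shared.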

In [13]:
import json # library to handle JSON files
import requests # library to handle requests
from pandas import json_normalize # transform JSON into a pandas dataframe (pandas 1.0 or later)

In [14]:
results=requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5fecb42f1a514c11a726d0bb'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Bay Street Corridor',
  'headerFullLocation': 'Bay Street Corridor, Toronto',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 62,
  'suggestedBounds': {'ne': {'lat': 43.6624524045, 'lng': -79.38117421839567},
   'sw': {'lat': 43.6534523955, 'lng': -79.39359098160432}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '537d4d6d498ec171ba22e7fe',
       'name': "Jimmy's Coffee",
       'location': {'address': '82 Gerrard Street W',
        'crossStreet': 'Gerrard & LaPlante',
        'lat': 43.65842123574496,
        'lng': -79.38561319551111,
        'label

Then we organize the results of our query in categories and build a dataframe with the nearby venues.

In [15]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [16]:

venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()


Unnamed: 0,name,categories,lat,lng
0,Jimmy's Coffee,Coffee Shop,43.658421,-79.385613
1,Tim Hortons,Coffee Shop,43.65857,-79.385123
2,Somethin' 2 Talk About,Middle Eastern Restaurant,43.658395,-79.385338
3,Hailed Coffee,Coffee Shop,43.658833,-79.383684
4,NEO COFFEE BAR,Coffee Shop,43.66013,-79.38583


In [17]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

62 venues were returned by Foursquare.


To divide the city into clusters by venues, and therefore to find environments similar to ours, we now make one Foursquare call per Postal Code and label each returned venue with its Borough. This lets us find the most frequent venue categories in every Borough of Toronto.

In [18]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Borough', 
                  'Borough Latitude', 
                  'Borough Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
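The function above indexes `v['venue']['categories'][0]` directly, so a venue with an empty category list, or a response without any groups, would raise an exception partway through the loop. Below is a hedged sketch of a more defensive extraction step. The helper `extract_venue_rows` and the sample payload are hypothetical. Only the field names are taken from the response and code shown earlier.

```python
def extract_venue_rows(payload, name, lat, lng):
    """Flatten one explore response into tuples matching the columns used above.

    Venues without a category get None instead of raising IndexError.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows

# Hypothetical payload shaped like the response printed earlier (made-up venues).
sample_payload = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Venue A', 'location': {'lat': 43.0, 'lng': -79.0},
                   'categories': [{'name': 'Coffee Shop'}]}},
        {'venue': {'name': 'Venue B', 'location': {'lat': 43.1, 'lng': -79.1},
                   'categories': []}},
    ]}]}
}

rows = extract_venue_rows(sample_payload, 'Downtown Toronto', 43.65, -79.38)
empty_rows = extract_venue_rows({'meta': {'code': 400}}, 'Nowhere', 0.0, 0.0)
print(rows)
print(empty_rows)
```

Rows with a `None` category would then need to be dropped or given a placeholder before one-hot encoding.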

Calling the function and building the dataframe *toronto_venues*.

In [19]:
# one Foursquare call per row of tnc, labeled by Borough
name=tnc['Borough']
lat=tnc['Latitude']
long=tnc['Longitude']
radius=500

toronto_venues=getNearbyVenues(name, lat, long, radius)

North York
North York
Downtown Toronto
North York
Downtown Toronto
Etobicoke
Scarborough
North York
East York
Downtown Toronto
North York
Etobicoke
Scarborough
North York
East York
Downtown Toronto
York
Etobicoke
Scarborough
East Toronto
Downtown Toronto
York
Scarborough
East York
Downtown Toronto
Downtown Toronto
Scarborough
North York
North York
East York
Downtown Toronto
West Toronto
Scarborough
North York
North York
East York
Downtown Toronto
West Toronto
Scarborough
North York
North York
East Toronto
Downtown Toronto
West Toronto
Scarborough
North York
North York
East Toronto
Downtown Toronto
North York
North York
Scarborough
North York
North York
East Toronto
North York
York
North York
Scarborough
North York
North York
Central Toronto
Central Toronto
York
York
Scarborough
North York
Central Toronto
Central Toronto
West Toronto
Etobicoke
Scarborough
North York
Central Toronto
Central Toronto
West Toronto
Mississauga
Etobicoke
Scarborough
Central Toronto
Downtown Toronto
West Toron

In [20]:
print('Number of venues in Toronto:', toronto_venues.shape[0],'\n')
toronto_venues.head()

Number of venues in Toronto: 2128 



Unnamed: 0,Borough,Borough Latitude,Borough Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,North York,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
1,North York,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
2,North York,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena
3,North York,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant
4,North York,43.725882,-79.315572,Tim Hortons,43.725517,-79.313103,Coffee Shop


Number of venues per Borough:

In [21]:
toronto_venues.groupby('Borough')[['Venue']].count()

Borough,Venue
Central Toronto,113
Downtown Toronto,1229
East Toronto,120
East York,73
Etobicoke,72
Mississauga,14
North York,245
Scarborough,92
West Toronto,151
York,19


Number of different venue categories:

In [22]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 270 unique categories.


Building a dataframe, *toronto_onehot*, that organizes each venue by Borough and by category.

In [23]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add Borough column back to dataframe
toronto_onehot['Borough'] = toronto_venues['Borough'] 

# move Borough column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Borough,Accessories Store,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,North York,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,North York,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,North York,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,North York,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,North York,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [24]:
toronto_onehot.shape

(2128, 271)

Grouping the venues by Borough, and calculating the average of each venue category per Borough. This is the data that will be directly used for clustering.

In [25]:
toronto_grouped = toronto_onehot.groupby('Borough').mean().reset_index()
toronto_grouped

Unnamed: 0,Borough,Accessories Store,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Central Toronto,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017699,...,0.0,0.0,0.0,0.00885,0.0,0.0,0.0,0.0,0.0,0.00885
1,Downtown Toronto,0.0,0.000814,0.000814,0.000814,0.000814,0.001627,0.001627,0.001627,0.013832,...,0.0,0.011391,0.001627,0.004068,0.0,0.006509,0.0,0.0,0.0,0.005696
2,East Toronto,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667
3,East York,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.013699,0.0,0.0,0.0,0.0,0.013699
4,Etobicoke,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013889,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013889,0.0,0.0
5,Mississauga,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,North York,0.008163,0.0,0.004082,0.0,0.0,0.0,0.0,0.0,0.008163,...,0.0,0.0,0.004082,0.008163,0.0,0.0,0.0,0.0,0.008163,0.0
7,Scarborough,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01087,...,0.0,0.0,0.0,0.01087,0.0,0.0,0.0,0.0,0.0,0.0
8,West Toronto,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.019868,0.0,0.013245,0.0,0.006623,0.006623,0.0,0.0,0.013245
9,York,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0


In [26]:
toronto_grouped.shape

(10, 271)
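Taking the mean of one-hot columns per Borough is the same as computing each category's share of that Borough's venues. The sketch below reproduces that calculation in plain Python on a small made-up list of (borough, category) pairs. The toy data and variable names are illustrative assumptions only.

```python
from collections import Counter, defaultdict

# Hypothetical (borough, category) pairs standing in for toronto_venues
toy_venues = [
    ('A', 'Coffee Shop'), ('A', 'Coffee Shop'), ('A', 'Park'), ('A', 'Bank'),
    ('B', 'Park'), ('B', 'Pizza Place'),
]

# Count venues per category within each borough
counts = defaultdict(Counter)
for borough, category in toy_venues:
    counts[borough][category] += 1

# Relative frequency = category count / total venues in the borough,
# which is what the mean of a 0/1 column gives after groupby
frequencies = {
    borough: {cat: n / sum(counter.values()) for cat, n in counter.items()}
    for borough, counter in counts.items()
}

print(frequencies['A']['Coffee Shop'])  # 2 of 4 venues, i.e. 0.5
print(frequencies['B']['Park'])         # 1 of 2 venues, i.e. 0.5
```

The same reading applies to the grouped table above: Downtown Toronto has 1229 venues, so a value of 0.000814 corresponds to a single venue of that category (1/1229).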

Listing the top 5 venues per Borough:

In [27]:
num_top_venues = 5

for hood in toronto_grouped['Borough']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Borough'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Central Toronto----
            venue  freq
0     Coffee Shop  0.09
1  Sandwich Place  0.06
2     Pizza Place  0.05
3            Café  0.05
4            Park  0.05


----Downtown Toronto----
                 venue  freq
0          Coffee Shop  0.10
1                 Café  0.05
2  Japanese Restaurant  0.03
3                Hotel  0.03
4           Restaurant  0.03


----East Toronto----
                venue  freq
0    Greek Restaurant  0.07
1         Coffee Shop  0.06
2  Italian Restaurant  0.04
3             Brewery  0.04
4      Ice Cream Shop  0.03


----East York----
            venue  freq
0            Bank  0.05
1     Coffee Shop  0.05
2     Pizza Place  0.04
3    Burger Joint  0.04
4  Sandwich Place  0.04


----Etobicoke----
                  venue  freq
0           Pizza Place  0.10
1        Sandwich Place  0.07
2              Pharmacy  0.06
3           Coffee Shop  0.06
4  Fast Food Restaurant  0.04


----Mississauga----
                      venue  freq
0               Coff

Building a dataframe with the top 10 most common venues in each Borough:

In [28]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [29]:
import numpy as np

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Borough']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
borough_venues_sorted = pd.DataFrame(columns=columns)
borough_venues_sorted['Borough'] = toronto_grouped['Borough']

for ind in np.arange(toronto_grouped.shape[0]):
    borough_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

borough_venues_sorted.head()

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Central Toronto,Coffee Shop,Sandwich Place,Pizza Place,Park,Café,Sushi Restaurant,Gym,Restaurant,Clothing Store,Dessert Shop
1,Downtown Toronto,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Park,Seafood Restaurant,Beer Bar
2,East Toronto,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Park,Ice Cream Shop,Bakery,Fast Food Restaurant,Pub,Bookstore
3,East York,Bank,Coffee Shop,Burger Joint,Sporting Goods Shop,Sandwich Place,Park,Intersection,Pizza Place,Restaurant,Pet Store
4,Etobicoke,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store


In [30]:
!pip install scikit-learn

Defaulting to user installation because normal site-packages is not writeable


In [31]:
from sklearn.cluster import KMeans

Clustering the data using *toronto_grouped* dataframe. We choose seven clusters within the K-Means algorithm, i.e. k=7.

In [32]:
# set number of clusters
kclusters = 7

toronto_grouped_clustering = toronto_grouped.drop('Borough', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 4, 6, 5, 1, 0, 3, 4, 2], dtype=int32)
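The labels follow the row order of *toronto_grouped*, so they can be read back against the Borough names. The sketch below does this in plain Python, with the Borough order and the label values copied from the outputs above. The variable names are illustrative.

```python
from collections import defaultdict

# Borough order as listed in toronto_grouped, labels as printed by kmeans.labels_
boroughs = ['Central Toronto', 'Downtown Toronto', 'East Toronto', 'East York', 'Etobicoke',
            'Mississauga', 'North York', 'Scarborough', 'West Toronto', 'York']
labels = [0, 0, 4, 6, 5, 1, 0, 3, 4, 2]

# Collect the boroughs that share each cluster label
members = defaultdict(list)
for borough, label in zip(boroughs, labels):
    members[label].append(borough)

for label in sorted(members):
    print(label, members[label])
```

With ten Boroughs and k=7, only two clusters hold more than one Borough (clusters 0 and 4), and the remaining five are singletons.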

Attributing a Cluster label, an integer running from 0 to 6, to each Postal Code through its Borough.

In [33]:
# add clustering labels
borough_venues_sorted.insert(0,'Cluster Labels', kmeans.labels_)

toronto_merged = tnc

# join the per-Borough cluster labels and top venues onto each postal code row of tnc
toronto_merged = toronto_merged.join(borough_venues_sorted.set_index('Borough'), on='Borough')

toronto_merged.head() # check the last columns!

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.753259,-79.329656,0,Coffee Shop,Clothing Store,Restaurant,Park,Japanese Restaurant,Sandwich Place,Bank,Pizza Place,Grocery Store,Shopping Mall
1,M4A,North York,Victoria Village,43.725882,-79.315572,0,Coffee Shop,Clothing Store,Restaurant,Park,Japanese Restaurant,Sandwich Place,Bank,Pizza Place,Grocery Store,Shopping Mall
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,0,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Park,Seafood Restaurant,Beer Bar
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,0,Coffee Shop,Clothing Store,Restaurant,Park,Japanese Restaurant,Sandwich Place,Bank,Pizza Place,Grocery Store,Shopping Mall
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,0,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Park,Seafood Restaurant,Beer Bar


Finally, we display a map with the clustering results!

In [34]:
import matplotlib.cm as cm
import matplotlib.colors as colors
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

In [35]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'],toronto_merged['Longitude'],toronto_merged['Neighborhood'],toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # labels run from 0 to kclusters-1
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

Furthermore, we can look at the distinguishing features of each cluster by inspecting its top 10 venues:

In [36]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,North York,0,Coffee Shop,Clothing Store,Restaurant,Park,Japanese Restaurant,Sandwich Place,Bank,Pizza Place,Grocery Store,Shopping Mall
1,North York,0,Coffee Shop,Clothing Store,Restaurant,Park,Japanese Restaurant,Sandwich Place,Bank,Pizza Place,Grocery Store,Shopping Mall
2,Downtown Toronto,0,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Park,Seafood Restaurant,Beer Bar
3,North York,0,Coffee Shop,Clothing Store,Restaurant,Park,Japanese Restaurant,Sandwich Place,Bank,Pizza Place,Grocery Store,Shopping Mall
4,Downtown Toronto,0,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Park,Seafood Restaurant,Beer Bar
7,North York,0,Coffee Shop,Clothing Store,Restaurant,Park,Japanese Restaurant,Sandwich Place,Bank,Pizza Place,Grocery Store,Shopping Mall
9,Downtown Toronto,0,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Park,Seafood Restaurant,Beer Bar
10,North York,0,Coffee Shop,Clothing Store,Restaurant,Park,Japanese Restaurant,Sandwich Place,Bank,Pizza Place,Grocery Store,Shopping Mall
13,North York,0,Coffee Shop,Clothing Store,Restaurant,Park,Japanese Restaurant,Sandwich Place,Bank,Pizza Place,Grocery Store,Shopping Mall
15,Downtown Toronto,0,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Park,Seafood Restaurant,Beer Bar


In [37]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
76,Mississauga,1,Coffee Shop,Hotel,Intersection,Gym,Middle Eastern Restaurant,Mediterranean Restaurant,American Restaurant,Sandwich Place,Fried Chicken Joint,Burrito Place


In [38]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
16,York,2,Park,Coffee Shop,Bus Line,Discount Store,Skating Rink,Caribbean Restaurant,Jewelry Store,Pool,Sandwich Place,Hockey Arena
21,York,2,Park,Coffee Shop,Bus Line,Discount Store,Skating Rink,Caribbean Restaurant,Jewelry Store,Pool,Sandwich Place,Hockey Arena
56,York,2,Park,Coffee Shop,Bus Line,Discount Store,Skating Rink,Caribbean Restaurant,Jewelry Store,Pool,Sandwich Place,Hockey Arena
63,York,2,Park,Coffee Shop,Bus Line,Discount Store,Skating Rink,Caribbean Restaurant,Jewelry Store,Pool,Sandwich Place,Hockey Arena
64,York,2,Park,Coffee Shop,Bus Line,Discount Store,Skating Rink,Caribbean Restaurant,Jewelry Store,Pool,Sandwich Place,Hockey Arena


In [39]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Scarborough,3,Intersection,Bank,Coffee Shop,Bakery,Fast Food Restaurant,Chinese Restaurant,Pharmacy,Pizza Place,Indian Restaurant,Breakfast Spot
12,Scarborough,3,Intersection,Bank,Coffee Shop,Bakery,Fast Food Restaurant,Chinese Restaurant,Pharmacy,Pizza Place,Indian Restaurant,Breakfast Spot
18,Scarborough,3,Intersection,Bank,Coffee Shop,Bakery,Fast Food Restaurant,Chinese Restaurant,Pharmacy,Pizza Place,Indian Restaurant,Breakfast Spot
22,Scarborough,3,Intersection,Bank,Coffee Shop,Bakery,Fast Food Restaurant,Chinese Restaurant,Pharmacy,Pizza Place,Indian Restaurant,Breakfast Spot
26,Scarborough,3,Intersection,Bank,Coffee Shop,Bakery,Fast Food Restaurant,Chinese Restaurant,Pharmacy,Pizza Place,Indian Restaurant,Breakfast Spot
32,Scarborough,3,Intersection,Bank,Coffee Shop,Bakery,Fast Food Restaurant,Chinese Restaurant,Pharmacy,Pizza Place,Indian Restaurant,Breakfast Spot
38,Scarborough,3,Intersection,Bank,Coffee Shop,Bakery,Fast Food Restaurant,Chinese Restaurant,Pharmacy,Pizza Place,Indian Restaurant,Breakfast Spot
44,Scarborough,3,Intersection,Bank,Coffee Shop,Bakery,Fast Food Restaurant,Chinese Restaurant,Pharmacy,Pizza Place,Indian Restaurant,Breakfast Spot
51,Scarborough,3,Intersection,Bank,Coffee Shop,Bakery,Fast Food Restaurant,Chinese Restaurant,Pharmacy,Pizza Place,Indian Restaurant,Breakfast Spot
58,Scarborough,3,Intersection,Bank,Coffee Shop,Bakery,Fast Food Restaurant,Chinese Restaurant,Pharmacy,Pizza Place,Indian Restaurant,Breakfast Spot


In [40]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,East Toronto,4,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Park,Ice Cream Shop,Bakery,Fast Food Restaurant,Pub,Bookstore
31,West Toronto,4,Café,Bar,Coffee Shop,Italian Restaurant,Bakery,Restaurant,Breakfast Spot,Diner,Bookstore,Pizza Place
37,West Toronto,4,Café,Bar,Coffee Shop,Italian Restaurant,Bakery,Restaurant,Breakfast Spot,Diner,Bookstore,Pizza Place
41,East Toronto,4,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Park,Ice Cream Shop,Bakery,Fast Food Restaurant,Pub,Bookstore
43,West Toronto,4,Café,Bar,Coffee Shop,Italian Restaurant,Bakery,Restaurant,Breakfast Spot,Diner,Bookstore,Pizza Place
47,East Toronto,4,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Park,Ice Cream Shop,Bakery,Fast Food Restaurant,Pub,Bookstore
54,East Toronto,4,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Park,Ice Cream Shop,Bakery,Fast Food Restaurant,Pub,Bookstore
69,West Toronto,4,Café,Bar,Coffee Shop,Italian Restaurant,Bakery,Restaurant,Breakfast Spot,Diner,Bookstore,Pizza Place
75,West Toronto,4,Café,Bar,Coffee Shop,Italian Restaurant,Bakery,Restaurant,Breakfast Spot,Diner,Bookstore,Pizza Place
81,West Toronto,4,Café,Bar,Coffee Shop,Italian Restaurant,Bakery,Restaurant,Breakfast Spot,Diner,Bookstore,Pizza Place


In [41]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 5, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Etobicoke,5,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store
11,Etobicoke,5,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store
17,Etobicoke,5,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store
70,Etobicoke,5,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store
77,Etobicoke,5,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store
88,Etobicoke,5,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store
89,Etobicoke,5,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store
93,Etobicoke,5,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store
94,Etobicoke,5,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store
98,Etobicoke,5,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Fried Chicken Joint,Bakery,Discount Store


In [42]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 6, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,East York,6,Bank,Coffee Shop,Burger Joint,Sporting Goods Shop,Sandwich Place,Park,Intersection,Pizza Place,Restaurant,Pet Store
14,East York,6,Bank,Coffee Shop,Burger Joint,Sporting Goods Shop,Sandwich Place,Park,Intersection,Pizza Place,Restaurant,Pet Store
23,East York,6,Bank,Coffee Shop,Burger Joint,Sporting Goods Shop,Sandwich Place,Park,Intersection,Pizza Place,Restaurant,Pet Store
29,East York,6,Bank,Coffee Shop,Burger Joint,Sporting Goods Shop,Sandwich Place,Park,Intersection,Pizza Place,Restaurant,Pet Store
35,East York,6,Bank,Coffee Shop,Burger Joint,Sporting Goods Shop,Sandwich Place,Park,Intersection,Pizza Place,Restaurant,Pet Store


We can easily verify the similarity between elements of the same cluster, as their top 10 venues largely coincide!

Moreover, we can check that our Borough, Downtown Toronto, lies in Cluster 0, where coffee shops and restaurants dominate, especially Japanese restaurants and pizza places. Since this cluster covers a large area, it is easier to find similar environments within the city!

The end.