# Capstone Project: Segmenting and Clustering Neighborhoods in Toronto

## Part 3 - Data Analysis

**NOTE**
I will use only the central Toronto areas for my analysis, together with the dataset of coordinates I retrieved from LocationIQ.

---
Import Modules

In [1]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis

import requests # library to handle requests
import json # library to handle JSON files
from pandas import json_normalize # transform JSON file into a pandas dataframe
from geopy.geocoders import Nominatim

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

---
Define the helper functions that we are going to use

In [2]:
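def get_category_type(row):
    # the category list is stored under 'categories' or, before the
    # column names are shortened, under 'venue.categories'
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']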

In [3]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
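
To illustrate what `return_most_common_venues` does, here is a minimal sketch that applies it to a single made-up row. The neighbourhood name and the frequencies in `toy_row` are hypothetical and are not taken from the Toronto data.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighbourhood name), then rank the rest
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)

    return row_categories_sorted.index.values[0:num_top_venues]

# hypothetical frequencies for one neighbourhood (illustration only)
toy_row = pd.Series({'Neighbourhood': 'Example',
                     'Café': 0.5, 'Park': 0.3, 'Gym': 0.2, 'ATM': 0.0})

top_two = list(return_most_common_venues(toy_row, 2))
print(top_two)
```

Categories with equal frequency (typically the many zero-frequency ones) come back in no particular order, which is worth keeping in mind when reading the lower-ranked columns later on.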

In [4]:
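def getNearbyVenues(names, latitudes, longitudes, radius=500):
    # relies on the globals CLIENT_ID, CLIENT_SECRET, VERSION and LIMIT,
    # which must be defined before this function is called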
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        # print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
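
`getNearbyVenues` reads `v['venue']['categories'][0]['name']` directly, which would raise an `IndexError` for a venue that has an empty category list. The sketch below shows one possible way to guard against that. The helper name `extract_venue_rows` and the sample payload are hypothetical; the payload only mimics the shape of the `items` list used above and makes no network call.

```python
def extract_venue_rows(name, lat, lng, items):
    """Turn a list of explore 'items' into flat tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        # fall back to None when the venue carries no category
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# made-up payload in the same shape as results['response']['groups'][0]['items']
sample_items = [
    {'venue': {'name': 'Venue A',
               'location': {'lat': 43.66, 'lng': -79.39},
               'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'Venue B',
               'location': {'lat': 43.67, 'lng': -79.40},
               'categories': []}},
]

rows = extract_venue_rows('Example', 43.6635, -79.3978, sample_items)
for row in rows:
    print(row)
```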

---
Declare any static data

In [5]:
CLIENT_ID = 'SN0VIJAY21QJWFCYLHZ40KWI3KNJGBQGKYKMWALYK4UFXM2C' # your Foursquare ID
CLIENT_SECRET = 'R4O5CJFTGFYSOXFXIUYMZODZLJLPKDQWGLKUL2EY4XA3P40I' # your Foursquare Secret
VERSION = '20180604'
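
API credentials are usually better kept out of a shared notebook. As a hedged sketch, they could be read from environment variables instead. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are assumptions made for this example, not something Foursquare prescribes.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the credentials from a mapping (the process environment by default)."""
    client_id = env.get('FOURSQUARE_CLIENT_ID')          # hypothetical variable name
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')  # hypothetical variable name
    if not client_id or not client_secret:
        raise RuntimeError('Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first')
    return client_id, client_secret

# demonstration with a stand-in mapping rather than the real environment
demo_env = {'FOURSQUARE_CLIENT_ID': 'demo-id',
            'FOURSQUARE_CLIENT_SECRET': 'demo-secret'}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print(demo_id)
```

In the notebook, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would then replace the hard-coded strings.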

---
Now we map the data and narrow the field we want to look at

In [6]:
df_T = pd.read_csv('TNC_myLatLon.csv').drop('Unnamed: 0', axis=1)
df_T.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.7588,-79.320197
1,M4A,North York,Victoria Village,43.732658,-79.311189
2,M5A,Downtown Toronto,Regent Park,43.659738,-79.361559
3,M5A,Downtown Toronto,Harbourfront,43.654652,-79.381164
4,M6A,North York,Lawrence Manor,43.722079,-79.437507


In [7]:
address = 'Toronto, Ontario, Canada'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


In [8]:
map_toronto= folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, postal_code, neighbourhood in zip(df_T['Latitude'], df_T['Longitude'],
                                                df_T['Postal Code'], df_T['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, postal_code)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto) 

map_toronto

---
Testing the geolocator: let's say we are new University of Toronto students and want to see what is around the university

In [9]:
address = 'University of Toronto, Toronto, Ontario, CA'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of UoT are {}, {}.'.format(latitude, longitude))

The geographical coordinates of UoT are 43.663461999999996, -79.39775965337452.


In [10]:
radius = 500
LIMIT = 10000
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID,
                                                                                                                           CLIENT_SECRET, 
                                                                                                                           latitude, 
                                                                                                                           longitude,
                                                                                                                           VERSION, 
                                                                                                                           radius, 
                                                                                                                           LIMIT)
results = requests.get(url).json()

venues = results['response']['groups'][0]['items']
nearby_venues = json_normalize(venues) # flatten JSON

filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
nearby_venues.head()


Unnamed: 0,name,categories,lat,lng
0,Philosopher's Walk,Park,43.666894,-79.395597
1,Hart House Theatre,Theater,43.663571,-79.394616
2,Yasu,Japanese Restaurant,43.662837,-79.403217
3,Queen's Park,Park,43.663946,-79.39218
4,The Dessert Kitchen,Dessert Shop,43.662823,-79.402746


In [11]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

30 venues were returned by Foursquare.


In [12]:
neighbourhood_latitude = df_T.loc[:, 'Latitude']
neighbourhood_longitude = df_T.loc[:, 'Longitude']
neighbourhood_name = df_T.loc[:, 'Neighbourhood']

In [13]:
toronto_venues = getNearbyVenues(names=neighbourhood_name,
                                   latitudes=neighbourhood_latitude,
                                   longitudes=neighbourhood_longitude
                                  )

In [14]:
print(toronto_venues.shape)
toronto_venues.head()

(5806, 7)


Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.7588,-79.320197,Allwyn's Bakery,43.75984,-79.324719,Caribbean Restaurant
1,Parkwoods,43.7588,-79.320197,LCBO,43.757774,-79.314257,Liquor Store
2,Parkwoods,43.7588,-79.320197,Shoppers Drug Mart,43.760857,-79.324961,Pharmacy
3,Parkwoods,43.7588,-79.320197,Petro-Canada,43.75795,-79.315187,Gas Station
4,Parkwoods,43.7588,-79.320197,Pizza Pizza,43.760231,-79.325666,Pizza Place


In [15]:
toronto_venues.groupby('Neighbourhood').count()

Unnamed: 0_level_0,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighbourhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Adelaide,100,100,100,100,100,100
Agincourt,14,14,14,14,14,14
Agincourt North,28,28,28,28,28,28
Albion Gardens,4,4,4,4,4,4
Alderwood,7,7,7,7,7,7
...,...,...,...,...,...,...
Woodbine Heights,10,10,10,10,10,10
York Mills,15,15,15,15,15,15
York Mills West,15,15,15,15,15,15
York University,2,2,2,2,2,2


In [16]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 328 unique categories.


In [17]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighbourhood'] = toronto_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Neighbourhood,ATM,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,...,University,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [18]:
print(toronto_onehot.shape)
toronto_grouped = toronto_onehot.groupby('Neighbourhood').mean().reset_index()
print(toronto_grouped.shape)
toronto_grouped.head(10)

(5806, 329)
(204, 329)


Unnamed: 0,Neighbourhood,ATM,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,...,University,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wings Joint,Women's Store,Yoga Studio
0,Adelaide,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0
1,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0
2,Agincourt North,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.035714,0.0,0.0
3,Albion Gardens,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Alderwood,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Bathurst Manor,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Bathurst Quay,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0
7,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Beaumond Heights,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Bedford Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
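
The one-hot encoding followed by `groupby(...).mean()` turns venue rows into per-neighbourhood category frequencies. The following toy example, with two invented neighbourhoods `A` and `B`, is a small sketch of that idea rather than a rerun of the Toronto data.

```python
import pandas as pd

# invented venues: four in neighbourhood A, two in neighbourhood B
toy_venues = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Café', 'Café', 'Park', 'Gym', 'Park', 'Park'],
})

# one column per category, 1.0 where the venue belongs to it
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']],
                            prefix="", prefix_sep="", dtype=float)
toy_onehot.insert(0, 'Neighbourhood', toy_venues['Neighbourhood'])

# the mean of a 0/1 column is the share of venues in that category
toy_grouped = toy_onehot.groupby('Neighbourhood').mean().reset_index()
print(toy_grouped)
```

Each row of the grouped frame sums to 1, so a neighbourhood with very few venues (for example a single playground) ends up with one category at frequency 1.0.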


In [19]:
num_top_venues = 5

for hood in toronto_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide----
         venue  freq
0         Café  0.07
1  Coffee Shop  0.07
2        Hotel  0.04
3   Restaurant  0.04
4          Gym  0.04


----Agincourt----
                   venue  freq
0     Chinese Restaurant  0.29
1   Cantonese Restaurant  0.14
2             Food Court  0.07
3  Vietnamese Restaurant  0.07
4            Coffee Shop  0.07


----Agincourt North----
                 venue  freq
0                 Bank  0.07
1               Bakery  0.07
2       Discount Store  0.04
3           Beer Store  0.04
4  Sporting Goods Shop  0.04


----Albion Gardens----
         venue  freq
0   Playground  0.25
1       Garden  0.25
2     Pharmacy  0.25
3  Coffee Shop  0.25
4          ATM  0.00


----Alderwood----
            venue  freq
0     Pizza Place  0.29
1  Sandwich Place  0.14
2             Gym  0.14
3             Pub  0.14
4        Pharmacy  0.14


----Bathurst Manor----
               venue  freq
0  Convenience Store  0.17
1         Playground  0.17
2        Men's Store  0.17
...

              venue  freq
0     Train Station  0.50
1  Storage Facility  0.25
2    Baseball Field  0.25
3               ATM  0.00
4       Music Venue  0.00


----Harbord----
               venue  freq
0  Korean Restaurant  0.22
1        Coffee Shop  0.04
2               Café  0.04
3      Grocery Store  0.04
4     Ice Cream Shop  0.03


----Harbourfront----
            venue  freq
0     Coffee Shop  0.11
1  Clothing Store  0.06
2           Hotel  0.04
3             Gym  0.02
4      Restaurant  0.02


----Harbourfront East----
         venue  freq
0  Coffee Shop  0.09
1        Hotel  0.07
2   Restaurant  0.06
3         Café  0.06
4          Gym  0.04


----Harbourfront West----
         venue  freq
0  Coffee Shop  0.09
1        Hotel  0.07
2   Restaurant  0.06
3         Café  0.06
4          Gym  0.04


----Henry Farm----
          venue  freq
0  Tennis Court   0.2
1        Lawyer   0.2
2  Intersection   0.2
3    Restaurant   0.2
4          Park   0.2


----High Park----
...

          venue  freq
0  Skating Rink  0.25
1        Bakery  0.25
2          Café  0.25
3           Bar  0.25
4           ATM  0.00


----Mimico South----
          venue  freq
0  Skating Rink  0.25
1        Bakery  0.25
2          Café  0.25
3           Bar  0.25
4           ATM  0.00


----Montgomery Road----
                 venue  freq
0          Coffee Shop  0.16
1                  Pub  0.11
2           Steakhouse  0.05
3           Restaurant  0.05
4  Arts & Crafts Store  0.05


----Moore Park----
         venue  freq
0   Playground  0.33
1        Trail  0.33
2          Gym  0.33
3          ATM  0.00
4  Music Venue  0.00


----Morningside----
               venue  freq
0        Coffee Shop  0.15
1               Park  0.15
2  Mobile Phone Shop  0.08
3     Discount Store  0.08
4         Beer Store  0.08


----Mount Dennis----
                    venue  freq
0  Furniture / Home Store  0.18
1             Coffee Shop  0.18
2    Caribbean Restaurant  0.09
3                    Park  0.09
...
4             Noodle House   0.0


----St. Phillips----
                     venue  freq
0              Coffee Shop  0.50
1            Grocery Store  0.25
2              Pizza Place  0.25
3                   Office  0.00
4  North Indian Restaurant  0.00


----Steeles East----
                     venue  freq
0               Playground   1.0
1                      ATM   0.0
2                   Museum   0.0
3  North Indian Restaurant   0.0
4             Noodle House   0.0


----Steeles West----
                     venue  freq
0               Playground   1.0
1                      ATM   0.0
2                   Museum   0.0
3  North Indian Restaurant   0.0
4             Noodle House   0.0


----Stn A PO Boxes----
                     venue  freq
0                    Trail   0.5
1         Business Service   0.5
2                      ATM   0.0
3               Nail Salon   0.0
4  North Indian Restaurant   0.0


----Studio District----
                           venue  freq
...

4          ATM  0.00


----Woodbine Heights----
                venue  freq
0        Skating Rink   0.2
1                 ATM   0.1
2                Park   0.1
3        Intersection   0.1
4  Athletics & Sports   0.1


----York Mills----
          venue  freq
0   Coffee Shop  0.27
1    Restaurant  0.13
2          Park  0.07
3           Gym  0.07
4  Tennis Court  0.07


----York Mills West----
          venue  freq
0   Coffee Shop  0.27
1    Restaurant  0.13
2          Park  0.07
3           Gym  0.07
4  Tennis Court  0.07


----York University----
                     venue  freq
0                   Bakery   0.5
1              Gas Station   0.5
2                      ATM   0.0
3              Music Venue   0.0
4  North Indian Restaurant   0.0


----Yorkville----
                 venue  freq
0             Boutique  0.05
1                 Café  0.05
2                  Spa  0.04
3           Restaurant  0.04
4  Japanese Restaurant  0.04




In [20]:

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions past the third use the generic 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighbourhood'] = toronto_grouped['Neighbourhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Adelaide,Café,Coffee Shop,Hotel,Gym,Restaurant,Steakhouse,Salad Place,American Restaurant,Asian Restaurant,Thai Restaurant
1,Agincourt,Chinese Restaurant,Cantonese Restaurant,Food Court,Korean Restaurant,Asian Restaurant,Rental Car Location,Train Station,Coffee Shop,Hong Kong Restaurant,Vietnamese Restaurant
2,Agincourt North,Bakery,Bank,Convenience Store,Japanese Restaurant,Discount Store,Juice Bar,Gift Shop,Chinese Restaurant,Restaurant,Movie Theater
3,Albion Gardens,Playground,Pharmacy,Garden,Coffee Shop,Fish Market,Fish & Chips Shop,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant
4,Alderwood,Pizza Place,Pharmacy,Gym,Pub,Coffee Shop,Sandwich Place,Fast Food Restaurant,Doctor's Office,Dog Run,Donut Shop
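
The column names above are built by falling back to a `th` suffix once the `indicators` list runs out. A possible alternative, sketched here with a hypothetical helper called `ordinal`, computes the suffix directly and would also cope with values such as 11 or 21 if `num_top_venues` were ever raised.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighbourhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```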


### Clustering Analysis

In [21]:
df_T.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.7588,-79.320197
1,M4A,North York,Victoria Village,43.732658,-79.311189
2,M5A,Downtown Toronto,Regent Park,43.659738,-79.361559
3,M5A,Downtown Toronto,Harbourfront,43.654652,-79.381164
4,M6A,North York,Lawrence Manor,43.722079,-79.437507


In [22]:
kclusters = 5
toronto_grouped_clustering = toronto_grouped.drop(columns='Neighbourhood')
# no random_state is set, so cluster labels can differ between runs
kmeans = KMeans(n_clusters=kclusters).fit(toronto_grouped_clustering)
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
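
The choice of `kclusters = 5` is not justified in this notebook. One common way to sanity-check such a choice is to compare the k-means inertia for several values of k, with a fixed `random_state` so that runs are repeatable. The sketch below does this on synthetic points, not on the Toronto features; the blob centres and noise level are assumptions chosen only to make the example clear.

```python
import numpy as np
from sklearn.cluster import KMeans

# synthetic data: three tight, well-separated blobs of 20 points each
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
toy_features = np.vstack([c + 0.1 * rng.standard_normal((20, 2)) for c in centers])

# inertia (within-cluster sum of squares) for a range of k
inertias = {}
for k in range(1, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(toy_features)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 2))

# labels for the k that matches the synthetic data
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit(toy_features).labels_
```

On real data the drop in inertia is rarely this sharp, so the "elbow" is a judgment call rather than a rule.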

In [23]:
toronto_merged = df_T.drop(columns=["Postal Code", "Borough"])
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')
toronto_merged.dropna(inplace=True)
toronto_merged.reset_index(drop=True)  # result is not assigned, so the original index is kept
toronto_merged.head()

Unnamed: 0,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Parkwoods,43.7588,-79.320197,2.0,Pizza Place,Coffee Shop,Shopping Mall,Bus Line,Liquor Store,Gas Station,Electronics Store,Bar,Bank,Laundry Service
1,Victoria Village,43.732658,-79.311189,4.0,Middle Eastern Restaurant,Bus Line,Thai Restaurant,Mediterranean Restaurant,Food & Drink Shop,Flower Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store
2,Regent Park,43.659738,-79.361559,2.0,Coffee Shop,Thai Restaurant,Animal Shelter,Restaurant,Auto Dealership,Electronics Store,Grocery Store,Beer Store,Pub,Sushi Restaurant
3,Harbourfront,43.654652,-79.381164,2.0,Coffee Shop,Clothing Store,Hotel,Cosmetics Shop,Seafood Restaurant,Breakfast Spot,Bubble Tea Shop,Electronics Store,Restaurant,Sushi Restaurant
4,Lawrence Manor,43.722079,-79.437507,4.0,Doctor's Office,Kids Store,Electronics Store,Bank,Park,Dog Run,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Ethiopian Restaurant


In [24]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighbourhood'], toronto_merged['Cluster Labels'].astype(int)):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

#### Cluster 1

In [25]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[0] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,Rouge,Park,Yoga Studio,Event Space,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant
13,Parkview Hill,Yoga Studio,Event Space,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
24,Port Union,Yoga Studio,Event Space,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
30,Eringate,Park,Yoga Studio,Falafel Restaurant,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
59,Northwood Park,Baseball Field,Yoga Studio,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service,Event Space
139,Kingsview Village,Yoga Studio,Event Space,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
148,Swansea,Pilates Studio,Dance Studio,Yoga Studio,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
149,Clarks Corners,Gas Station,Caribbean Restaurant,Event Space,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
192,Upper Rouge,Park,Yoga Studio,Event Space,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant


#### Cluster 2

In [26]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Latitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
32,43.639373,Park,Dog Run,Flower Shop,Flea Market,Fish Market,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant
60,43.779242,Gas Station,Yoga Studio,Farmers Market,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service,Event Space
70,43.714167,Convenience Store,Beer Store,Gas Station,Farmers Market,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
92,43.708655,Pizza Place,Vietnamese Restaurant,Convenience Store,Gas Station,Yoga Studio,Event Space,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant
106,43.721317,Convenience Store,Bakery,Baseball Field,Italian Restaurant,Farmers Market,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
127,43.69364,Pizza Place,Park,Yoga Studio,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
176,43.75981,Bank,Residential Building (Apartment / Condo),Gas Station,Yoga Studio,Event Space,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant
210,43.634045,Gym / Fitness Center,Gas Station,Dog Run,Bridge,Light Rail Station,Deli / Bodega,Basketball Court,Event Space,Egyptian Restaurant


#### Cluster 3

In [27]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Latitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,43.758800,Coffee Shop,Shopping Mall,Bus Line,Liquor Store,Gas Station,Electronics Store,Bar,Bank,Laundry Service
2,43.659738,Thai Restaurant,Animal Shelter,Restaurant,Auto Dealership,Electronics Store,Grocery Store,Beer Store,Pub,Sushi Restaurant
3,43.654652,Clothing Store,Hotel,Cosmetics Shop,Seafood Restaurant,Breakfast Spot,Bubble Tea Shop,Electronics Store,Restaurant,Sushi Restaurant
5,43.722778,Restaurant,Coffee Shop,Women's Store,Jewelry Store,Furniture / Home Store,Toy / Game Store,American Restaurant,Cosmetics Shop,Sandwich Place
6,43.659897,Café,Sandwich Place,Italian Restaurant,Bubble Tea Shop,Restaurant,Spa,Juice Bar,Thai Restaurant,Japanese Restaurant
...,...,...,...,...,...,...,...,...,...,...
200,43.661480,Japanese Restaurant,Gay Bar,Spa,Sandwich Place,Italian Restaurant,Diner,Pizza Place,Gastropub,Yoga Studio
206,43.609309,Gym,Eastern European Restaurant,Sushi Restaurant,Coffee Shop,Event Space,Dumpling Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant
207,43.648183,Sushi Restaurant,Pub,Bank,Breakfast Spot,Dessert Shop,Italian Restaurant,Grocery Store,Tapas Restaurant,Liquor Store
211,43.648010,Dessert Shop,Italian Restaurant,Sushi Restaurant,Breakfast Spot,Bank,Pub,Gastropub,Bar,Bakery


#### Cluster 4

In [28]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Latitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,43.67061,Pharmacy,Garden,Coffee Shop,Fish Market,Fish & Chips Shop,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant
152,43.690388,Gym,Trail,Dive Bar,Dog Run,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store
159,43.816178,Yoga Studio,Falafel Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
179,43.671459,Supermarket,Pharmacy,Coffee Shop,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant
180,43.671459,Supermarket,Pharmacy,Coffee Shop,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant
181,43.67061,Pharmacy,Garden,Coffee Shop,Fish Market,Fish & Chips Shop,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant
183,43.67061,Pharmacy,Garden,Coffee Shop,Fish Market,Fish & Chips Shop,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant
184,43.816178,Yoga Studio,Falafel Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
190,43.671459,Supermarket,Pharmacy,Coffee Shop,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant


#### Cluster 5

In [29]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Latitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,43.732658,Bus Line,Thai Restaurant,Mediterranean Restaurant,Food & Drink Shop,Flower Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store
4,43.722079,Kids Store,Electronics Store,Bank,Park,Dog Run,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Ethiopian Restaurant
7,43.631212,Park,Sculpture Garden,Restaurant,Night Market,Theater,Theme Park,Theme Park Ride / Attraction,Arts & Crafts Store,Nightclub
8,43.678207,Bus Stop,Intersection,Yoga Studio,Falafel Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
9,43.666472,Convenience Store,Bakery,Bus Stop,Baseball Field,Skating Rink,Yoga Studio,Falafel Restaurant,Electronics Store,Ethiopian Restaurant
14,43.712078,Bakery,Bar,Park,Donut Shop,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
23,43.780271,Bus Line,Yoga Studio,Event Space,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
25,43.790117,Construction & Landscaping,Neighborhood,Falafel Restaurant,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service
27,43.69992,ATM,Athletics & Sports,Bus Stop,Park,Intersection,Dance Studio,Pharmacy,Snack Place,Fast Food Restaurant
28,34.012293,Yoga Studio,Falafel Restaurant,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Service,Event Space


### Results

Cluster 1: yoga studios and Middle Eastern food - seems health related

Cluster 2: supermarkets and pubs with mixed restaurants - possibly mid-level shopping centres

Cluster 3: mixed shopping and retail - likely medium/high-density residential with lots of foot traffic

Cluster 4: doughnut shops, event spaces and sports fields - likely low population density

Cluster 5: higher density of parks alongside other shops - likely high-end residential (more green space)
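
Interpretations like these can be cross-checked by counting how often each venue category appears among the top-venue columns of a cluster. The sketch below shows the idea on a small invented frame; `toy_merged` and its contents are hypothetical and only mimic the column layout of `toronto_merged`.

```python
import pandas as pd

# invented stand-in for toronto_merged with two top-venue columns
toy_merged = pd.DataFrame({
    'Neighbourhood': ['A', 'B', 'C', 'D'],
    'Cluster Labels': [0, 0, 1, 1],
    '1st Most Common Venue': ['Park', 'Park', 'Coffee Shop', 'Coffee Shop'],
    '2nd Most Common Venue': ['Yoga Studio', 'Gym', 'Café', 'Bakery'],
})

# stack all "Most Common Venue" columns into one long column
venue_cols = [c for c in toy_merged.columns if c.endswith('Most Common Venue')]
stacked = toy_merged.melt(id_vars='Cluster Labels', value_vars=venue_cols,
                          value_name='Venue Category')

# number of appearances of each category within each cluster
counts = stacked.groupby('Cluster Labels')['Venue Category'].value_counts()
print(counts)
```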