# Segmenting and Clustering Neighborhoods in Toronto

## Scraping Wikipedia page and building postcode dataframe
Let us scrape the Wikipedia page into a list of dataframes and choose the right one for further processing. Note that the right dataframe is at the first position of the list, and its header is in the first row.

In [1]:
# Import libraries and objects
import pandas as pd

# Scrape the Wikipedia page into a list of dataframes
pc_dfs = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M', header=0)
# Relevant dataframe is at the first position of the list
pc_df = pc_dfs[0]

# Display the first few rows of the dataframe
pc_df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


Many rows have _Not assigned_ in the _Borough_ column. Let us drop such rows and reset the index.

In [2]:
# Drop all rows where 'Borough' is 'Not assigned'
pc_df = pc_df.drop(pc_df[pc_df['Borough'] == 'Not assigned'].index).reset_index(drop=True)

# Display the first few rows of the dataframe
pc_df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M6A,North York,Lawrence Heights
4,M6A,North York,Lawrence Manor


Many rows have _Not assigned_ in the _Neighbourhood_ column. Let us replace that value with the corresponding _Borough_ value. Note that the replacement must take place before concatenation, which makes it more difficult.

In [3]:
# Use 'Borough' as 'Neighbourhood' when 'Neighbourhood' is 'Not assigned'
# (assign the result back instead of relying on inplace=True on a column selection)
pc_df['Neighbourhood'] = pc_df['Neighbourhood'].mask(pc_df['Neighbourhood'] == 'Not assigned', pc_df['Borough'])

# Display the first few rows of the dataframe
pc_df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M6A,North York,Lawrence Heights
4,M6A,North York,Lawrence Manor
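
The first rows shown above contain no _Not assigned_ neighbourhood, so the effect of the replacement is not visible there. A minimal sketch on toy data may help; the rows below are made up for illustration and are not taken from the Wikipedia table.

```python
import pandas as pd

# Toy data (hypothetical rows, for illustration only)
toy = pd.DataFrame({
    'Borough': ['North York', "Queen's Park"],
    'Neighbourhood': ['Parkwoods', 'Not assigned'],
})

# Where the condition is True, take the value from 'Borough' instead
toy['Neighbourhood'] = toy['Neighbourhood'].mask(
    toy['Neighbourhood'] == 'Not assigned', toy['Borough'])

print(toy['Neighbourhood'].tolist())
```

Rows where the condition is False keep their original value, so only the second toy row changes.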


Many rows share the same _Postcode_ and _Borough_ values. Let us group the data by those columns and concatenate the _Neighbourhood_ values, using a comma followed by a space as the separator. Note that the index is reset at the end.

In [4]:
# Group by 'Postcode' and 'Borough', joining 'Neighbourhood' values with ', '
pc_df = pc_df.groupby(['Postcode', 'Borough'])['Neighbourhood'].apply(', '.join).reset_index()

# Display the first few rows of the dataframe
pc_df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
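
To see the grouping step in isolation, here is a small sketch on toy data. The postcodes and names are invented; only the groupby-and-join pattern mirrors the cell above.

```python
import pandas as pd

# Toy data (hypothetical values): two rows share the same postcode and borough
toy = pd.DataFrame({
    'Postcode': ['M1A', 'M1A', 'M2B'],
    'Borough': ['East', 'East', 'West'],
    'Neighbourhood': ['Alpha', 'Beta', 'Gamma'],
})

# Each group's neighbourhood names are joined into one comma-separated string
merged = (toy.groupby(['Postcode', 'Borough'])['Neighbourhood']
             .apply(', '.join)
             .reset_index())

print(merged)
```

The two `M1A` rows collapse into a single row whose _Neighbourhood_ value is `Alpha, Beta`.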


Let us check the shape of the dataframe that has just been built.

In [5]:
# Display shape of dataframe
print('Shape:', pc_df.shape)

Shape: (103, 3)


## Retrieving location coordinates
Let us retrieve the location coordinates for each postal code and add them to the dataframe. The coordinates will be obtained using Google's Geocoding API.

In [6]:
# Import libraries and objects
import geocoder

# Google API key, the key is restricted to Geocode API and IP addresses
API_KEY = 'AIzaSyAmIcwix4zGGCWAzqQ4FA7OClA4OtYy4lE'

# Retrieve location coordinates using Google's Geocoding API
lats = []
lngs = []
for index, row in pc_df.iterrows():
    ll = None
    # Repeat the request until coordinates are returned
    while ll is None:
        g = geocoder.google('{}, Toronto, Ontario'.format(row['Postcode']), key=API_KEY)
        ll = g.latlng
    lats.append(ll[0])
    lngs.append(ll[1])

# Add location coordinates to the dataframe
pc_df['Latitude'] = lats
pc_df['Longitude'] = lngs

# Display the first few rows of the dataframe
pc_df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
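
The loop above repeats the request until coordinates come back, which would never finish if a postal code could not be resolved at all. As a hedged alternative, the sketch below bounds the number of attempts. The helper `geocode_with_retry` and the stand-in `flaky_lookup` are hypothetical names introduced here; no real geocoding service is called.

```python
import time

def geocode_with_retry(lookup, address, max_attempts=5, delay_seconds=0.0):
    """Call lookup(address) until it returns coordinates or attempts run out."""
    for _ in range(max_attempts):
        latlng = lookup(address)
        if latlng is not None:
            return latlng
        # Wait a little before asking again
        time.sleep(delay_seconds)
    return None

# Stand-in for a geocoder call: fails twice, then succeeds (made-up coordinates)
calls = {'n': 0}

def flaky_lookup(address):
    calls['n'] += 1
    return None if calls['n'] < 3 else [43.0, -79.0]

result = geocode_with_retry(flaky_lookup, 'M1B, Toronto, Ontario')
print(result, calls['n'])
```

With the real library, `lookup` could be a small function that wraps the `geocoder.google(...)` call and returns its `latlng` attribute; a `None` result after all attempts would then have to be handled explicitly.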


## Segmenting and clustering neighbourhoods
### Exploring neighbourhoods
Let us see how all neighbourhoods are located on a map of Toronto.

In [7]:
# Import libraries and object
import folium

# Use the coordinates of Toronto as the map center
ll = None
while (ll is None):
    g = geocoder.google('Toronto, Ontario', key=API_KEY)
    ll = g.latlng

# Create map of all neighbourhoods using latitude and longitude values
pc_map = folium.Map(location=[ll[0], ll[1]], zoom_start=10)

# Add markers to map
for lat, lng, borough, neighborhood in zip(pc_df['Latitude'], pc_df['Longitude'], pc_df['Borough'], pc_df['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(pc_map)

# Display map
pc_map

The official map of the city of Toronto looks similar, so let us take all neighbourhoods into consideration.

![Map of Toronto city](https://upload.wikimedia.org/wikipedia/commons/thumb/9/9a/Toronto_map.png/1280px-Toronto_map.png "Map of Toronto city")

Let us explore the neighbourhoods of Toronto using Foursquare.

In [8]:
# Import libraries and objects
import requests

# Foursquare API constants
CLIENT_ID = 'NENWEIPR0RT1XBNZL3BBOWFS2WDL3PZWJECGBEMYMYFIG3SM'
CLIENT_SECRET = 'LRWIIROC1AXIPUUWDZTCVT3RJVITNXSOEPNUQMDAFSSUUPKR'
VERSION = '20180605'
LIMIT = 100

# Return nearby venues
def getNearbyVenues(postcodes, boroughs, neighbourhoods, latitudes, longitudes, radius=500):
    venues_list=[]
    for postcode, borough, neighbourhood, lat, lng in zip(postcodes, boroughs, neighbourhoods, latitudes, longitudes):
        # Create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
        # Make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # Return only relevant information for each nearby venue
        venues_list.append([(
            postcode,
            borough,
            neighbourhood,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])
    # Create returned dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Postcode',
                             'Borough',
                             'Neighbourhood', 
                             'Neighbourhood Latitude', 
                             'Neighbourhood Longitude', 
                             'Venue', 
                             'Venue Latitude', 
                             'Venue Longitude', 
                             'Venue Category']
    # Return dataframe
    return nearby_venues

In [9]:
# Explore neighbourhoods
pc_venues = getNearbyVenues(postcodes=pc_df['Postcode'],
                            boroughs=pc_df['Borough'],
                            neighbourhoods=pc_df['Neighbourhood'],
                            latitudes=pc_df['Latitude'],
                            longitudes=pc_df['Longitude']
                           )
# Display the first few rows of the dataframe
pc_venues.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353,Wendy's,43.807448,-79.199056,Fast Food Restaurant
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497,Royal Canadian Legion,43.782533,-79.163085,Bar
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,Swiss Chalet Rotisserie & Grill,43.767697,-79.189914,Pizza Place
3,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,G & G Electronics,43.765309,-79.191537,Electronics Store
4,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,Marina Spa,43.766,-79.191,Spa
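
The function above indexes straight into the JSON response, so an unexpected payload (for example one without `groups`, or a venue without categories) would raise an error. The sketch below shows a more defensive way to flatten such a payload. The helper name `extract_venues` and the sample payload are hypothetical; the nested keys are simply the ones `getNearbyVenues` reads, not a statement about the full Foursquare response format.

```python
def extract_venues(payload):
    """Flatten an explore-style payload into (name, lat, lng, category) tuples."""
    groups = payload.get('response', {}).get('groups') or [{}]
    items = groups[0].get('items', [])
    rows = []
    for item in items:
        venue = item['venue']
        # A venue may come without categories; fall back to None in that case
        categories = venue.get('categories') or [{}]
        rows.append((
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            categories[0].get('name'),
        ))
    return rows

# Hand-made payload with the same nesting (invented venue, for illustration)
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Example Cafe',
               'location': {'lat': 43.1, 'lng': -79.2},
               'categories': [{'name': 'Coffee Shop'}]}},
]}]}}

print(extract_venues(sample))
print(extract_venues({}))
```

An empty or malformed payload yields an empty list instead of an exception, so a postcode with no venues would simply contribute no rows.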


Let us check the shape of the dataframe that has just been built.

In [10]:
# Display shape of dataframe
print("Shape:", pc_venues.shape)

Shape: (2242, 9)


### Analyzing neighbourhoods
Let us see how many venues were found for each neighbourhood.

In [11]:
# Count venues per neighbourhood and display the first rows
pc_venues[['Postcode', 'Borough', 'Neighbourhood', 'Venue']].groupby(by=['Postcode', 'Borough', 'Neighbourhood']).count().head(15)

Postcode,Borough,Neighbourhood,Venue
M1B,Scarborough,"Rouge, Malvern",1
M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",1
M1E,Scarborough,"Guildwood, Morningside, West Hill",8
M1G,Scarborough,Woburn,3
M1H,Scarborough,Cedarbrae,9
M1J,Scarborough,Scarborough Village,1
M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",5
M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",10
M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",2
M1N,Scarborough,"Birch Cliff, Cliffside West",4


Let us see how many unique venue categories were found.

In [12]:
# Display the number of unique venue categories
print('There are {} unique categories.'.format(len(pc_venues['Venue Category'].unique())))

There are 270 unique categories.


Let us apply one-hot encoding to the venue categories and build the corresponding dataframe.

In [13]:
# One hot encoding
pc_onehot = pd.get_dummies(pc_venues[['Venue Category']], prefix="", prefix_sep="")

# Add additional columns back to dataframe
pc_onehot['Postcode'] = pc_venues['Postcode']
pc_onehot['Borough'] = pc_venues['Borough']
pc_onehot['Neighbourhood'] = pc_venues['Neighbourhood'] 

# Make additional columns the first ones
fixed_columns = list(pc_onehot.columns[-3:]) + list(pc_onehot.columns[:-3])
pc_onehot = pc_onehot[fixed_columns]

# Display the first few rows of the dataframe
pc_onehot.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,M1B,Scarborough,"Rouge, Malvern",0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,M1E,Scarborough,"Guildwood, Morningside, West Hill",0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,M1E,Scarborough,"Guildwood, Morningside, West Hill",0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Let us check the shape of the dataframe that has just been built.

In [14]:
# Display shape of dataframe
print("Shape:", pc_onehot.shape)

Shape: (2242, 273)


Let us group rows by neighbourhood, taking the mean of the frequency of occurrence of each venue category.

In [15]:
# Group by neighbourhood and take the mean of the frequency of occurrence of each venue category
pc_grouped = pc_onehot.groupby(['Postcode', 'Borough', 'Neighbourhood']).mean().reset_index()

# Display the first few rows of the dataframe
pc_grouped.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,M1B,Scarborough,"Rouge, Malvern",0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,M1G,Scarborough,Woburn,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,M1H,Scarborough,Cedarbrae,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
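
Because the head of the table shows only zeros, it may not be obvious what the grouped means represent. The toy sketch below (invented postcodes and categories) shows that the mean of one-hot columns within a group is the share of that group's venues falling into each category. Passing `dtype=float` is a choice made here so the means are computed on plain numbers.

```python
import pandas as pd

# Toy venue list (hypothetical): three venues in M1A, one in M2B
venues = pd.DataFrame({
    'Postcode': ['M1A', 'M1A', 'M1A', 'M2B'],
    'Venue Category': ['Park', 'Park', 'Bar', 'Bar'],
})

# One column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(venues[['Venue Category']], prefix='', prefix_sep='', dtype=float)
onehot['Postcode'] = venues['Postcode']

# Mean of the indicator columns = relative frequency of each category
freq = onehot.groupby('Postcode').mean()
print(freq)
```

For `M1A`, two of three venues are parks, so the _Park_ column holds about 0.67 and the _Bar_ column about 0.33; each row sums to 1.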


Let us check the shape of the dataframe that has just been built.

In [16]:
# Display shape of dataframe
print("Shape:", pc_grouped.shape)

Shape: (100, 273)


Let us find out the 10 most common venue categories for each neighbourhood.

In [17]:
# Import libraries and objects
import numpy as np

# Return most common venues
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[3:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

In [18]:
# Number of top venues
num_top_venues = 10
# Suffixes for ordinal numbers
indicators = ['st', 'nd', 'rd']

# Create columns according to number of top venues
columns = ['Postcode', 'Borough', 'Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # No special suffix beyond the third position, so fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# Create a new dataframe
pc_venues_sorted = pd.DataFrame(columns=columns)
pc_venues_sorted['Postcode'] = pc_grouped['Postcode']
pc_venues_sorted['Borough'] = pc_grouped['Borough']
pc_venues_sorted['Neighbourhood'] = pc_grouped['Neighbourhood']
for ind in np.arange(pc_grouped.shape[0]):
    pc_venues_sorted.iloc[ind, 3:] = return_most_common_venues(pc_grouped.iloc[ind, :], num_top_venues)

# Display the first few rows of the dataframe
pc_venues_sorted.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",Fast Food Restaurant,Yoga Studio,Dim Sum Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",Bar,Yoga Studio,Dumpling Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Eastern European Restaurant,Filipino Restaurant
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",Medical Center,Spa,Pizza Place,Electronics Store,Breakfast Spot,Rental Car Location,Intersection,Mexican Restaurant,Doner Restaurant,Discount Store
3,M1G,Scarborough,Woburn,Coffee Shop,Korean Restaurant,Eastern European Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Yoga Studio
4,M1H,Scarborough,Cedarbrae,Hakka Restaurant,Fried Chicken Joint,Lounge,Caribbean Restaurant,Athletics & Sports,Thai Restaurant,Gas Station,Bank,Bakery,Dumpling Restaurant
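
Note that a postcode such as M1B, for which a single venue was found, still gets ten "most common" categories in the table above. The extra entries appear to be categories with zero frequency that the sort places after the real ones. As a hedged alternative (not the method used in this notebook), the sketch below keeps only categories that actually occur; `top_categories` is a hypothetical helper and the frequencies are made up.

```python
import pandas as pd

def top_categories(freq, n):
    """Return up to n category names with the highest positive frequency."""
    positive = freq[freq > 0]
    return positive.sort_values(ascending=False).index[:n].tolist()

# Toy frequency row (invented values): one category never occurs
freq_row = pd.Series({'Park': 0.5, 'Bar': 0.3, 'Cafe': 0.2, 'Gym': 0.0})

print(top_categories(freq_row, 2))
print(top_categories(freq_row, 10))
```

Asking for ten categories returns only the three that occur, so the result can be shorter than `n` and would need padding before being written into fixed columns.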


### Clustering neighbourhoods
Let us run k-means to cluster neighbourhoods into 5 clusters.

In [19]:
# Import libraries and objects
from sklearn.cluster import KMeans

# Number of clusters
kclusters = 5

# Drop unneeded columns
pc_grouped_clustering = pc_grouped.drop(columns=['Postcode', 'Borough', 'Neighbourhood'])

# Run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(pc_grouped_clustering)

# Check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 2, 2, 2, 2, 2, 2, 2, 0, 2], dtype=int32)
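
To illustrate what the labels mean, here is a minimal sketch of k-means on four toy frequency vectors. The data are invented, and `n_init=10` is set explicitly here as an assumption to keep the behaviour stable across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy frequency vectors (hypothetical): two rows dominated by the first
# category, two rows dominated by the second
X = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = model.labels_
print(labels)
```

Rows with similar category frequencies receive the same label. The label numbers themselves carry no meaning beyond identifying the group.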

Let us attach the cluster labels to all neighbourhoods. Note that some neighbourhoods may have no venue data, so they have no cluster label and need to be dropped.

In [20]:
# Add clustering labels
pc_venues_sorted.insert(0, 'Cluster Label', kmeans.labels_)

# Merge dataframes
pc_merged = pc_df
pc_merged = pc_merged.join(pc_venues_sorted.drop(columns=['Borough', 'Neighbourhood']).set_index('Postcode'), on='Postcode')

# Drop rows that have no cluster label
pc_merged = pc_merged.loc[pc_merged['Cluster Label'].notnull()]
# Cluster label must be int32; it becomes float when one or more rows have no cluster label
pc_merged = pc_merged.astype({'Cluster Label':'int32'})

# Display the first few rows of the dataframe
pc_merged.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Label,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353,0,Fast Food Restaurant,Yoga Studio,Dim Sum Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497,2,Bar,Yoga Studio,Dumpling Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Eastern European Restaurant,Filipino Restaurant
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,2,Medical Center,Spa,Pizza Place,Electronics Store,Breakfast Spot,Rental Car Location,Intersection,Mexican Restaurant,Doner Restaurant,Discount Store
3,M1G,Scarborough,Woburn,43.770992,-79.216917,2,Coffee Shop,Korean Restaurant,Eastern European Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Yoga Studio
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476,2,Hakka Restaurant,Fried Chicken Joint,Lounge,Caribbean Restaurant,Athletics & Sports,Thai Restaurant,Gas Station,Bank,Bakery,Dumpling Restaurant
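
The reason for the drop and the type cast can be seen on a small example. In the sketch below (toy postcodes and labels), one postcode has no match on the right-hand side, so the join fills its label with `NaN` and the column turns into floats.

```python
import pandas as pd

# Toy data (hypothetical): 'M2B' has no cluster label
left = pd.DataFrame({'Postcode': ['M1A', 'M2B', 'M3C']})
right = pd.DataFrame({'Postcode': ['M1A', 'M3C'], 'Cluster Label': [0, 2]})

# Left join on 'Postcode'; the unmatched row gets NaN as its label
joined = left.join(right.set_index('Postcode'), on='Postcode')
print(joined['Cluster Label'].dtype)

# Drop unlabelled rows, then restore an integer type
cleaned = joined.dropna(subset=['Cluster Label']).astype({'Cluster Label': 'int32'})
print(cleaned)
```

After cleaning, only the two labelled postcodes remain and the label column is an integer again, which matters later when the label is used to index the colour list.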


Let us see how clustered neighbourhoods are located on a map of Toronto.

In [21]:
# Import libraries and objects
import matplotlib.cm as cm
import matplotlib.colors as colors

# Create map
pc_map_clusters = folium.Map(location=[ll[0], ll[1]], zoom_start=10)

# Set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i * x) ** 2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add markers to the map
markers_colors = []
for lat, lng, borough, neighbourhood, cluster in zip(pc_merged['Latitude'], pc_merged['Longitude'], pc_merged['Borough'], pc_merged['Neighbourhood'], pc_merged['Cluster Label']):
    label = '{}, {}: Cluster {}'.format(neighbourhood, borough, str(cluster))
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rainbow[cluster - 1],
        fill=True,
        fill_color=rainbow[cluster - 1],
        fill_opacity=0.7).add_to(pc_map_clusters)

# Display map
pc_map_clusters

### Examining clusters
Cluster 1 (label 0)  
It seems to be characterized by the presence of: Construction & Landscaping, Yoga Studio.

In [22]:
pc_merged.loc[pc_merged['Cluster Label'] == 0].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Label,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353,0,Fast Food Restaurant,Yoga Studio,Dim Sum Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476,0,American Restaurant,Motel,Yoga Studio,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant
91,M8Y,Etobicoke,"Humber Bay, King's Mill Park, Kingsway Park So...",43.636258,-79.498509,0,Construction & Landscaping,Baseball Field,Yoga Studio,Eastern European Restaurant,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Electronics Store
96,M9L,North York,Humber Summit,43.756303,-79.565963,0,Pizza Place,Construction & Landscaping,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Eastern European Restaurant


Cluster 2 (label 1)  
It seems to be characterized by the presence of: Convenience Store, Yoga Studio, Discount Store, Dog Run...

In [23]:
pc_merged.loc[pc_merged['Cluster Label'] == 1].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Label,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
40,M4J,East York,East Toronto,43.685347,-79.338106,1,Park,Convenience Store,Yoga Studio,Dumpling Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Electronics Store
98,M9N,York,Weston,43.706876,-79.518188,1,Convenience Store,Yoga Studio,Eastern European Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Electronics Store


Cluster 3 (label 2)  
Its defining venue categories are unclear; this is the biggest cluster.

In [24]:
pc_merged.loc[pc_merged['Cluster Label'] == 2].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Label,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497,2,Bar,Yoga Studio,Dumpling Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Eastern European Restaurant,Filipino Restaurant
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,2,Medical Center,Spa,Pizza Place,Electronics Store,Breakfast Spot,Rental Car Location,Intersection,Mexican Restaurant,Doner Restaurant,Discount Store
3,M1G,Scarborough,Woburn,43.770992,-79.216917,2,Coffee Shop,Korean Restaurant,Eastern European Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Yoga Studio
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476,2,Hakka Restaurant,Fried Chicken Joint,Lounge,Caribbean Restaurant,Athletics & Sports,Thai Restaurant,Gas Station,Bank,Bakery,Dumpling Restaurant
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476,2,Playground,Yoga Studio,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Eastern European Restaurant
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029,2,Discount Store,Department Store,Hobby Shop,Coffee Shop,Dumpling Restaurant,Diner,Dog Run,Doner Restaurant,Donut Shop,Drugstore
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577,2,Bus Line,Bakery,Park,Fast Food Restaurant,Metro Station,Bus Station,Intersection,Soccer Field,Creperie,Cuban Restaurant
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848,2,Café,General Entertainment,Skating Rink,College Stadium,Concert Hall,Diner,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant
10,M1P,Scarborough,"Dorset Park, Scarborough Town Centre, Wexford ...",43.75741,-79.273304,2,Indian Restaurant,Pet Store,Vietnamese Restaurant,Chinese Restaurant,Yoga Studio,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant
11,M1R,Scarborough,"Maryvale, Wexford",43.750071,-79.295849,2,Auto Garage,Sandwich Place,Middle Eastern Restaurant,Smoke Shop,Breakfast Spot,Shopping Mall,Bakery,Dog Run,Doner Restaurant,Donut Shop


Cluster 4 (label 3)  
It seems to be characterized by the presence of: Park.

In [25]:
pc_merged.loc[pc_merged['Cluster Label'] == 3].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Label,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,M1V,Scarborough,"Agincourt North, L'Amoreaux East, Milliken, St...",43.815252,-79.284577,3,Playground,Park,Yoga Studio,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
23,M2P,North York,York Mills West,43.752758,-79.400049,3,Park,Bank,Convenience Store,Yoga Studio,Dumpling Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
25,M3A,North York,Parkwoods,43.753259,-79.329656,3,Food & Drink Shop,Park,Eastern European Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Yoga Studio
30,M3K,North York,"CFB Toronto, Downsview East",43.737473,-79.464763,3,Airport,Park,Yoga Studio,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
44,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,3,Park,Bus Line,Swim School,Yoga Studio,Drugstore,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant
50,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529,3,Park,Trail,Playground,Yoga Studio,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
74,M6E,York,Caledonia-Fairbanks,43.689026,-79.453512,3,Park,Women's Store,Market,Fast Food Restaurant,Comic Shop,Concert Hall,Farmers Market,Falafel Restaurant,Comfort Food Restaurant,Event Space
79,M6L,North York,"Downsview, North Park, Upwood Park",43.713756,-79.490074,3,Basketball Court,Park,Bakery,Construction & Landscaping,Yoga Studio,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Eastern European Restaurant
90,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944,3,Park,River,Yoga Studio,Drugstore,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant
100,M9R,Etobicoke,"Kingsview Village, Martin Grove Gardens, Richv...",43.688905,-79.554724,3,Pizza Place,Park,Mobile Phone Shop,Bus Line,Yoga Studio,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant


Cluster 5 (label 4)  
It seems to be characterized by the presence of: Baseball Field, Dog Run.

In [26]:
pc_merged.loc[pc_merged['Cluster Label'] == 4].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Label,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
32,M3M,North York,Downsview Central,43.728496,-79.495697,4,Food Truck,Baseball Field,Eastern European Restaurant,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Yoga Studio,Diner
97,M9M,North York,"Emery, Humberlea",43.724766,-79.532242,4,Baseball Field,Yoga Studio,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Eastern European Restaurant,Filipino Restaurant
