# Segmenting and Clustering Neighborhoods in Toronto
## Website scraping exercise

### Installing and importing all the needed libraries

In [1]:
#Install the Beautiful Soup library for pulling data out of HTML and XML files
!conda install -c conda-forge beautifulsoup4 --yes

#Install lxml, which provides the HTML parser used by Beautiful Soup
!conda install -c conda-forge lxml --yes

#Install geopy
!conda install -c conda-forge geopy --yes 

#Install folium
!conda install -c conda-forge folium=0.5.0 --yes

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/DSX-Python35

  added / updated specs: 
    - beautifulsoup4


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    beautifulsoup4-4.6.3       |           py35_0         139 KB  conda-forge
    ca-certificates-2019.3.9   |       hecc5488_0         146 KB  conda-forge
    certifi-2018.8.24          |        py35_1001         139 KB  conda-forge
    openssl-1.0.2r             |       h14c3975_0         3.1 MB  conda-forge
    ------------------------------------------------------------
                                           Total:         3.6 MB

The following packages will be UPDATED:

    beautifulsoup4:  4.6.0-py35h442a8c9_1 --> 4.6.3-py35_0        conda-forge
    ca-certificates: 2019.1.23-0          --> 2019.3.9-hecc5488_0 conda-forge
    certifi:         2018.8.24-py35_1     --> 2018.8.24-py35_1001 

In [2]:
from bs4 import BeautifulSoup #import the Beautiful Soup library
import requests #import the requests library
import pandas as pd #import the pandas library
import numpy as np #import the numpy library
from geopy.geocoders import Photon
from geopy.geocoders import Nominatim
from sklearn.cluster import KMeans #import KMeans

import folium # map rendering library
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

### Scrape the information from the website using the BeautifulSoup library

In [3]:
#Specify the URL of the website that will be explored
wiki_url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

#Get the webpage's source code
wiki_source = requests.get(wiki_url).text

#Convert the webpage's source code into a BeautifulSoup object
wiki_soup = BeautifulSoup(wiki_source, 'lxml')

#Extract the table with the list of postal codes in Canada
wiki_table = wiki_soup.find('table', class_='wikitable sortable')

### Organize the information retrieved from the web in a DataFrame

In [4]:
#Put the information of the table in a list
table = []
for row in wiki_table.find_all("tr"):
    row = row.text.split("\n")
    row = [elem for elem in row if elem != ''] #Eliminate the '' elements in the list
    row = [np.nan if elem == "Not assigned" else elem for elem in row] #replace the 'Not assigned' elements with NaNs
    table.append(row)

#Turn the list into a Pandas Dataframe
PC_Canada_df = pd.DataFrame(table[1:],columns=table[0]) #The first row contains the column names
PC_Canada_df.head()

| | Postcode | Borough | Neighbourhood |
|---|---|---|---|
| 0 | M1A | NaN | NaN |
| 1 | M2A | NaN | NaN |
| 2 | M3A | North York | Parkwoods |
| 3 | M4A | North York | Victoria Village |
| 4 | M5A | Downtown Toronto | Harbourfront |
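
Splitting `row.text` on newlines and dropping the empty strings works for this table, but a row with a genuinely empty cell would end up with fewer columns than the header. The following is a minimal, dependency-free sketch of reading the table cell by cell instead. It uses the standard library `HTMLParser` as a stand-in for Beautiful Soup, and the `TableRows` class and the `sample_html` snippet are hypothetical, made up for illustration only.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical HTML with the same three columns as the Wikipedia table
sample_html = """
<table>
<tr><th>Postcode</th><th>Borough</th><th>Neighbourhood</th></tr>
<tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
<tr><td>M7A</td><td>Queen's Park</td><td></td></tr>
</table>
"""

parser = TableRows()
parser.feed(sample_html)
header, *body = parser.rows

# Empty cells and 'Not assigned' both become missing values, and every row keeps three columns
records = [[None if cell in ("", "Not assigned") else cell for cell in row] for row in body]
print(header)
print(records)
```

Because each cell is kept in its own position, an empty Neighbourhood stays in the third column instead of shifting the row.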


### DataFrame conditioning (1/3) - Remove the "Not assigned" Boroughs

In [5]:
#Drop the rows that don't have a Borough assigned
PC_Canada_df.dropna(subset=["Borough"],inplace=True)
PC_Canada_df.reset_index(drop=True,inplace=True)
PC_Canada_df.head(10)

| | Postcode | Borough | Neighbourhood |
|---|---|---|---|
| 0 | M3A | North York | Parkwoods |
| 1 | M4A | North York | Victoria Village |
| 2 | M5A | Downtown Toronto | Harbourfront |
| 3 | M5A | Downtown Toronto | Regent Park |
| 4 | M6A | North York | Lawrence Heights |
| 5 | M6A | North York | Lawrence Manor |
| 6 | M7A | Queen's Park | NaN |
| 7 | M9A | Etobicoke | Islington Avenue |
| 8 | M1B | Scarborough | Rouge |
| 9 | M1B | Scarborough | Malvern |


### DataFrame conditioning (2/3) - Replace the "Not assigned" Neighbourhoods with their Borough's name

In [6]:
#Find the rows whose Neighbourhood is missing
#and use the name of their Borough as the Neighbourhood name
missing_neighbourhood = PC_Canada_df["Neighbourhood"].isna()
PC_Canada_df.loc[missing_neighbourhood,"Neighbourhood"] = PC_Canada_df.loc[missing_neighbourhood,"Borough"]
PC_Canada_df.head(10)

| | Postcode | Borough | Neighbourhood |
|---|---|---|---|
| 0 | M3A | North York | Parkwoods |
| 1 | M4A | North York | Victoria Village |
| 2 | M5A | Downtown Toronto | Harbourfront |
| 3 | M5A | Downtown Toronto | Regent Park |
| 4 | M6A | North York | Lawrence Heights |
| 5 | M6A | North York | Lawrence Manor |
| 6 | M7A | Queen's Park | Queen's Park |
| 7 | M9A | Etobicoke | Islington Avenue |
| 8 | M1B | Scarborough | Rouge |
| 9 | M1B | Scarborough | Malvern |


### DataFrame conditioning (3/3) - Group the dataframe by Postcode

In [7]:
#Keep one row per Postcode and join the names of its Neighbourhoods with commas
PC_Canada_grouped_df = PC_Canada_df[['Postcode','Borough']].drop_duplicates().reset_index(drop=True)
PC_Canada_grouped_df['Neighbourhood'] = PC_Canada_df.groupby(by='Postcode',as_index=False,sort=False)['Neighbourhood'].apply(lambda x: ', '.join(x))
PC_Canada_grouped_df.head(10)

| | Postcode | Borough | Neighbourhood |
|---|---|---|---|
| 0 | M3A | North York | Parkwoods |
| 1 | M4A | North York | Victoria Village |
| 2 | M5A | Downtown Toronto | Harbourfront, Regent Park |
| 3 | M6A | North York | Lawrence Heights, Lawrence Manor |
| 4 | M7A | Queen's Park | Queen's Park |
| 5 | M9A | Etobicoke | Islington Avenue |
| 6 | M1B | Scarborough | Rouge, Malvern |
| 7 | M3B | North York | Don Mills North |
| 8 | M4B | East York | Woodbine Gardens, Parkview Hill |
| 9 | M5B | Downtown Toronto | Ryerson, Garden District |
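
The cell above builds the grouped table in two steps, and the second step relies on both results coming out in the same row order. As a hedged alternative, the sketch below does the grouping in a single `groupby` on toy data copied from the rows shown earlier. It assumes that every Postcode belongs to exactly one Borough, and the names `df` and `grouped` are illustrative only.

```python
import pandas as pd

# Toy data: a few rows taken from the conditioned DataFrame shown above
df = pd.DataFrame({
    "Postcode": ["M5A", "M5A", "M7A", "M1B", "M1B"],
    "Borough": ["Downtown Toronto", "Downtown Toronto", "Queen's Park",
                "Scarborough", "Scarborough"],
    "Neighbourhood": ["Harbourfront", "Regent Park", "Queen's Park",
                      "Rouge", "Malvern"],
})

# Group by Postcode and Borough together, keep the original order (sort=False),
# and join the Neighbourhood names of each group with commas
grouped = (
    df.groupby(["Postcode", "Borough"], sort=False)["Neighbourhood"]
      .agg(", ".join)
      .reset_index()
)
print(grouped)
```

Keeping Postcode and Borough as grouping keys means the Neighbourhood strings cannot end up attached to the wrong row.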


### Number of entries (rows) in the final DataFrame

In [8]:
#Print the number of rows of the final DataFrame
print("Number of entries: "+str(PC_Canada_grouped_df.shape[0])+" datapoints")

Number of entries: 103 datapoints


### Getting the geographical data

In [9]:
#geolocator = Nominatim(user_agent="toronto_explorer")
geolocator = Photon()

lat_list = []
long_list = []
for index,row in PC_Canada_grouped_df.iterrows():
    postal_code = row["Postcode"]
    #Build a free-form query string from the postal code
    query = postal_code+", Toronto, Canada"
    location = None
    while(location is None):
        location = geolocator.geocode(query)
    lat_list.append(location.latitude)
    long_list.append(location.longitude)
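
The `while` loop above keeps querying until the geocoder returns a location, so it never ends if a postal code cannot be resolved. Below is a minimal sketch of a bounded retry, not the notebook's actual code. The helper `geocode_with_retry` and the stand-in `flaky_geocode` are hypothetical, and in the notebook the callable passed in would be `geolocator.geocode`.

```python
import time


def geocode_with_retry(geocode, query, max_attempts=3, delay_s=0.0):
    """Call geocode(query) up to max_attempts times; return None if every attempt fails."""
    for attempt in range(1, max_attempts + 1):
        location = geocode(query)
        if location is not None:
            return location
        if attempt < max_attempts:
            time.sleep(delay_s)  # pause before the next attempt
    return None


# Hypothetical stand-in for a geocoder: fails on the first call, succeeds on the second
calls = {"n": 0}

def flaky_geocode(query):
    calls["n"] += 1
    return None if calls["n"] < 2 else (43.65, -79.38)


location = geocode_with_retry(flaky_geocode, "M5A, Toronto, Canada")
unresolved = geocode_with_retry(lambda q: None, "M0Z, Toronto, Canada", max_attempts=2)
print(location, unresolved)
```

A caller can then decide what to do with a `None` result, for example storing a missing value instead of blocking the whole loop.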
    

In [10]:
PC_Canada_grouped_df['Latitude'] = lat_list
PC_Canada_grouped_df['Longitude'] = long_list
PC_Canada_grouped_df.head(10)

| | Postcode | Borough | Neighbourhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M3A | North York | Parkwoods | 43.740375 | -79.321746 |
| 1 | M4A | North York | Victoria Village | 43.732658 | -79.311189 |
| 2 | M5A | Downtown Toronto | Harbourfront, Regent Park | 43.6514 | -79.365837 |
| 3 | M6A | North York | Lawrence Heights, Lawrence Manor | 43.722778 | -79.450933 |
| 4 | M7A | Queen's Park | Queen's Park | 43.774391 | -79.504811 |
| 5 | M9A | Etobicoke | Islington Avenue | 43.677966 | -79.540909 |
| 6 | M1B | Scarborough | Rouge, Malvern | 43.819623 | -79.184498 |
| 7 | M3B | North York | Don Mills North | 43.761171 | -79.351054 |
| 8 | M4B | East York | Woodbine Gardens, Parkview Hill | 43.708823 | -79.295986 |
| 9 | M5B | Downtown Toronto | Ryerson, Garden District | 43.654174 | -79.380812 |


### Visualization of the Postcode Areas

In [11]:
#Get the coordinates of the city of Toronto
geolocator = Photon()
address = 'Toronto, Canada'
toronto_location = geolocator.geocode(address)
toronto_latitude = toronto_location.latitude
toronto_longitude = toronto_location.longitude

In [12]:
#Show the map of Toronto with one marker per Postcode area, labelled with its Borough and Neighbourhoods
map_toronto = folium.Map(location=[toronto_latitude, toronto_longitude], zoom_start=11)

# add markers to map
for lat, lng, borough, neighborhood in zip(PC_Canada_grouped_df['Latitude'], PC_Canada_grouped_df['Longitude'], PC_Canada_grouped_df['Borough'], PC_Canada_grouped_df['Neighbourhood']):
    label = '{} - {}'.format(borough , neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

### Explore the Boroughs with the Foursquare API

In [13]:
# @hidden_cell
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' #Foursquare ID (placeholder, use your own)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' #Foursquare Secret (placeholder, use your own)
VERSION = '20180605' #Foursquare API version

In [14]:
#Taken from the course lab and modified
#Get the categories of up to LIMIT venues within `radius` metres of each Postcode area
def getNearbyVenuesCategories(postcodes, latitudes, longitudes, radius=500, LIMIT=100):
    
    venues_list=[]
    for postcode, lat, lng in zip(postcodes, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            postcode,
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Postcode', 'Venue Category']
    
    return(nearby_venues)
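
The function above formats the request URL by hand and indexes straight into the JSON response, so a venue without categories or an unexpected payload raises an exception. The sketch below shows a more defensive variant of both steps, under stated assumptions. The helper names `build_explore_url` and `extract_categories` are hypothetical, the credentials are placeholders, and `sample_payload` only mimics the nesting that the original code indexes (`response`, `groups`, `items`, `venue`, `categories`, `name`).

```python
from urllib.parse import urlencode


def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Assemble the explore URL, letting urlencode escape the parameter values."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)


def extract_categories(payload):
    """Return the first category name of each venue, skipping venues that have none."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    names = []
    for item in items:
        categories = item.get("venue", {}).get("categories", [])
        if categories:
            names.append(categories[0]["name"])
    return names


# Made-up payload with one categorized venue and one venue without categories
sample_payload = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "A", "categories": [{"name": "Coffee Shop"}]}},
        {"venue": {"name": "B", "categories": []}},
    ]}]}
}

url = build_explore_url("ID_PLACEHOLDER", "SECRET_PLACEHOLDER", "20180605", 43.65, -79.38)
categories = extract_categories(sample_payload)
print(url)
print(categories)
```

With this shape, an empty or malformed response yields an empty list for that Postcode instead of stopping the loop.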

In [16]:
trending_venues_categories_df = getNearbyVenuesCategories(postcodes=PC_Canada_grouped_df['Postcode'], latitudes=PC_Canada_grouped_df['Latitude'], longitudes=PC_Canada_grouped_df['Longitude'])
trending_venues_categories_df.shape

(2931, 2)

In [17]:
#Get the one-hot encoding for the top venues' categories
trending_venues_categories_onehot = pd.get_dummies(trending_venues_categories_df["Venue Category"])
#Add the Postcode column. pandas aligns this assignment on the row index, so only the first
#len(PC_Canada_grouped_df) rows receive a Postcode and the remaining rows are left as NaN
trending_venues_categories_onehot['Postcode'] = PC_Canada_grouped_df['Postcode']
#Move the Postcode column to the beginning
fixed_columns = [trending_venues_categories_onehot.columns[-1]] + list(trending_venues_categories_onehot.columns[:-1])
trending_venues_categories_onehot = trending_venues_categories_onehot[fixed_columns]
trending_venues_categories_onehot.head()

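| | Postcode | Accessories Store | Adult Boutique | Afghan Restaurant | Airport Lounge | American Restaurant | Antique Shop | Arcade | Art Gallery | Art Museum | ... | Vietnamese Restaurant | Volleyball Court | Warehouse Store | Wine Bar | Wine Shop | Wings Joint | Women's Store | Yoga Studio | Zoo | Zoo Exhibit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | M3A | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | M4A | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | M5A | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | M6A | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | M7A | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |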
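
Assigning a column in pandas aligns on the row index, so the source of the `Postcode` column matters: the one-hot table has one row per venue, while `PC_Canada_grouped_df` has one row per Postcode area. The sketch below contrasts the two choices on toy data. It is an illustration under assumptions rather than the notebook's code, and the frames `venues` and `areas` are made-up stand-ins for `trending_venues_categories_df` and `PC_Canada_grouped_df`.

```python
import pandas as pd

# Toy venue-level data: one row per venue, as returned by the venues function
venues = pd.DataFrame({
    "Postcode": ["M1B", "M1B", "M1C", "M1C", "M1C"],
    "Venue Category": ["Zoo", "Cafe", "Cafe", "Bar", "Cafe"],
})
# Toy area-level data: one row per Postcode area
areas = pd.DataFrame({"Postcode": ["M1B", "M1C"]})

onehot = pd.get_dummies(venues["Venue Category"], dtype=int)

# Taking Postcode from the area-level frame labels only rows 0 and 1; rows 2 to 4 get NaN
misaligned = onehot.copy()
misaligned["Postcode"] = areas["Postcode"]
n_missing = int(misaligned["Postcode"].isna().sum())

# Taking Postcode from the venue-level frame keeps each venue with its own area
aligned = onehot.copy()
aligned["Postcode"] = venues["Postcode"]

# The mean of the one-hot rows is then the share of each category within a Postcode
freq = aligned.groupby("Postcode").mean()
print(n_missing)
print(freq)
```

With the venue-level Postcode, the per-area averages become frequencies over all venues of the area rather than a single 0/1 row.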


### Cluster analysis (K-Means)

In [18]:
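#Average the one-hot rows of each Postcode area to get the frequency of every venue category
X = trending_venues_categories_onehot.groupby(by='Postcode').mean()
X = X.sort_index() #sort_index returns a new DataFrame, so the result has to be assigned
X.head(10)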

| Postcode | Accessories Store | Adult Boutique | Afghan Restaurant | Airport Lounge | American Restaurant | Antique Shop | Arcade | Art Gallery | Art Museum | Arts & Crafts Store | ... | Vietnamese Restaurant | Volleyball Court | Warehouse Store | Wine Bar | Wine Shop | Wings Joint | Women's Store | Yoga Studio | Zoo | Zoo Exhibit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| M1B | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M1C | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M1E | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M1G | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M1H | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M1J | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M1K | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M1L | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M1M | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| M1N | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |


In [19]:
#Run the K-Means algorithm
num_clusters = 6
kmeans_classifier = KMeans(n_clusters=num_clusters)
cluster_labels = kmeans_classifier.fit_predict(X)
cluster_labels.shape


(103,)
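
To make the clustering step less of a black box, here is a small pure-Python sketch of the two alternating steps behind K-Means (assign each point to its nearest centroid, then move each centroid to the mean of its members). It is a simplified illustration with fixed starting centroids, not scikit-learn's implementation, and the function name `kmeans_sketch` and the toy frequency vectors are made up.

```python
def kmeans_sketch(points, centroids, n_iter=10):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))
            for point in points
        ]
        # Update step: each centroid moves to the mean of its members
        for k in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy category-frequency vectors for four areas (three venue categories each)
points = [(1.0, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 1.0, 0.0), (0.0, 0.8, 0.2)]
labels, centroids = kmeans_sketch(points, centroids=[points[0], points[2]])
print(labels)
print(centroids)
```

In scikit-learn the starting centroids are chosen randomly by default, so the cluster numbers can change between runs; passing the `random_state` argument to `KMeans` makes a run repeatable.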

In [20]:
#Copy the original dataset, sort it by Postcode so that its rows line up with X, and add the cluster label of each datapoint
PC_Canada_clustered_df = PC_Canada_grouped_df.copy()
PC_Canada_clustered_df.sort_values(by='Postcode',inplace=True)
PC_Canada_clustered_df.reset_index(inplace=True,drop=True)
PC_Canada_clustered_df['Cluster'] = cluster_labels
PC_Canada_clustered_df.head(10)

| | Postcode | Borough | Neighbourhood | Latitude | Longitude | Cluster |
|---|---|---|---|---|---|---|
| 0 | M1B | Scarborough | Rouge, Malvern | 43.819623 | -79.184498 | 2 |
| 1 | M1C | Scarborough | Highland Creek, Rouge Hill, Port Union | 43.784369 | -79.187075 | 0 |
| 2 | M1E | Scarborough | Guildwood, Morningside, West Hill | 43.75237 | -79.192489 | 0 |
| 3 | M1G | Scarborough | Woburn | 43.782601 | -79.204958 | 2 |
| 4 | M1H | Scarborough | Cedarbrae | 43.785792 | -79.22781 | 0 |
| 5 | M1J | Scarborough | Scarborough Village | 43.748624 | -79.226122 | 0 |
| 6 | M1K | Scarborough | East Birchmount Park, Ionview, Kennedy Park | 43.73599 | -79.276515 | 0 |
| 7 | M1L | Scarborough | Clairlea, Golden Mile, Oakridge | 43.727342 | -79.281597 | 0 |
| 8 | M1M | Scarborough | Cliffcrest, Cliffside, Scarborough Village West | 43.721939 | -79.236232 | 0 |
| 9 | M1N | Scarborough | Birch Cliff, Cliffside West | 43.696999 | -79.263981 | 0 |


### Visualization of the Clusters

In [59]:
#Show the map of Toronto with one marker per Postcode area, coloured by cluster
map_toronto_clusters = folium.Map(location=[toronto_latitude, toronto_longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(num_clusters)
ys = [i + x + (i*x)**2 for i in range(num_clusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to map
for lat, lng, borough, neighborhood, cluster in zip(PC_Canada_clustered_df['Latitude'], PC_Canada_clustered_df['Longitude'], PC_Canada_clustered_df['Borough'], PC_Canada_clustered_df['Neighbourhood'], PC_Canada_clustered_df['Cluster']):
    label = '{} - {}'.format(borough , neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto_clusters)  
    
map_toronto_clusters
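
The colour scheme above only needs one distinct colour per cluster label. As a dependency-free sketch of the same idea, the hypothetical helper `cluster_palette` below spreads the hues evenly with the standard library `colorsys` module and returns hex strings indexed directly by the cluster label. It is an alternative illustration, not the Matplotlib-based code used in the notebook.

```python
import colorsys


def cluster_palette(n_clusters):
    """Return n_clusters hex colours with evenly spaced hues."""
    palette = []
    for k in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(k / n_clusters, 0.8, 0.9)
        palette.append("#{:02x}{:02x}{:02x}".format(int(r * 255), int(g * 255), int(b * 255)))
    return palette


palette = cluster_palette(6)
# The colour of a marker is looked up with its cluster label, e.g. palette[2] for Cluster #2
print(palette)
```

Indexing with the label itself keeps the mapping between cluster numbers and colours easy to read in a legend.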

### Analyzing each cluster in detail

In [22]:
#Taken from the course lab and modified
def return_most_common_venues(row, num_top_venues):
    #Postcode is the index of X, so every entry of the row is a venue category and none is skipped
    row_categories_sorted = row.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [38]:
#Taken from the course lab
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Postcode','Borough','Neighbourhood','Cluster']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
top_venues_sorted = pd.DataFrame(columns=columns)
top_venues_sorted['Postcode'] = PC_Canada_clustered_df['Postcode']
top_venues_sorted['Borough'] = PC_Canada_clustered_df['Borough']
top_venues_sorted['Neighbourhood'] = PC_Canada_clustered_df['Neighbourhood']
top_venues_sorted['Cluster'] = PC_Canada_clustered_df['Cluster']

for ind in np.arange(PC_Canada_clustered_df.shape[0]):
    top_venues_sorted.iloc[ind, 4:] = return_most_common_venues(X.iloc[ind, :], num_top_venues)

top_venues_sorted.head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",0,Thai Restaurant,Zoo Exhibit,Food Service,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",0,Liquor Store,Zoo Exhibit,Food Service,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
3,M1G,Scarborough,Woburn,2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
4,M1H,Scarborough,Cedarbrae,0,Farmers Market,Zoo Exhibit,Fast Food Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Filipino Restaurant,Empanada Restaurant
5,M1J,Scarborough,Scarborough Village,0,Karaoke Bar,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",0,Gym Pool,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",0,Diner,Zoo Exhibit,Fast Food Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Filipino Restaurant
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",0,Art Gallery,Zoo Exhibit,Filipino Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop
9,M1N,Scarborough,"Birch Cliff, Cliffside West",0,Fast Food Restaurant,Zoo Exhibit,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Filipino Restaurant,Empanada Restaurant
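
When most frequencies in a row are zero, sorting still fills the ranking with categories that were never observed, and the order among those ties is arbitrary. The sketch below is one hedged way to rank only the observed categories and to build the column labels without a `try`/`except`. The helpers `ordinal` and `top_categories` and the toy `frequencies` dictionary are hypothetical.

```python
def ordinal(n):
    """Return 1st, 2nd, 3rd, 4th, ... and handle 11th, 12th, 13th and 21st correctly."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


def top_categories(frequencies, n):
    """Return up to n categories with non-zero frequency, most frequent first."""
    observed = [(name, freq) for name, freq in frequencies.items() if freq > 0]
    # Sort by descending frequency and break ties alphabetically so the order is stable
    observed.sort(key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in observed[:n]]


# Toy frequencies for one Postcode area
frequencies = {"Coffee Shop": 0.5, "Zoo Exhibit": 0.0, "Bakery": 0.25, "Park": 0.25, "Farm": 0.0}
top = top_categories(frequencies, 10)
labels = ["{} Most Common Venue".format(ordinal(i + 1)) for i in range(len(top))]
print(list(zip(labels, top)))
```

An area with fewer observed categories than `n` then simply gets a shorter ranking instead of padded entries.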


### Cluster \#0

In [53]:
top_venues_sorted[top_venues_sorted['Cluster']==0].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",0,Thai Restaurant,Zoo Exhibit,Food Service,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",0,Liquor Store,Zoo Exhibit,Food Service,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
4,M1H,Scarborough,Cedarbrae,0,Farmers Market,Zoo Exhibit,Fast Food Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Filipino Restaurant,Empanada Restaurant
5,M1J,Scarborough,Scarborough Village,0,Karaoke Bar,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",0,Gym Pool,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",0,Diner,Zoo Exhibit,Fast Food Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Filipino Restaurant
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",0,Art Gallery,Zoo Exhibit,Filipino Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop
9,M1N,Scarborough,"Birch Cliff, Cliffside West",0,Fast Food Restaurant,Zoo Exhibit,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Filipino Restaurant,Empanada Restaurant
11,M1R,Scarborough,"Maryvale, Wexford",0,American Restaurant,Zoo Exhibit,Filipino Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop
13,M1T,Scarborough,"Clarks Corners, Sullivan, Tam O'Shanter",0,Bakery,Zoo Exhibit,Ethiopian Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant


### Cluster \#1

In [54]:
top_venues_sorted[top_venues_sorted['Cluster']==1].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
40,M4J,East York,East Toronto,1,Italian Restaurant,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
58,M5H,Downtown Toronto,"Adelaide, King, Richmond",1,Italian Restaurant,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
95,M9C,Etobicoke,"Bloordale Gardens, Eringate, Markland Wood, Ol...",1,Italian Restaurant,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant


### Cluster \#2

In [55]:
top_venues_sorted[top_venues_sorted['Cluster']==2].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
3,M1G,Scarborough,Woburn,2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
18,M2J,North York,"Fairview, Henry Farm, Oriole",2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
28,M3H,North York,"Bathurst Manor, Downsview North, Wilson Heights",2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
33,M3N,North York,Downsview Northwest,2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
34,M4A,North York,Victoria Village,2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
37,M4E,East Toronto,The Beaches,2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
38,M4G,East York,Leaside,2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
53,M5A,Downtown Toronto,"Harbourfront, Regent Park",2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
73,M6C,York,Humewood-Cedarvale,2,Coffee Shop,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant


### Cluster \#3

In [56]:
top_venues_sorted[top_venues_sorted['Cluster']==3].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
56,M5E,Downtown Toronto,Berczy Park,3,Café,Zoo Exhibit,Fast Food Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Filipino Restaurant,Ethiopian Restaurant
78,M6K,West Toronto,"Brockton, Exhibition Place, Parkdale Village",3,Café,Zoo Exhibit,Fast Food Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Filipino Restaurant,Ethiopian Restaurant


### Cluster \#4

In [57]:
top_venues_sorted[top_venues_sorted['Cluster']==4].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,M1P,Scarborough,"Dorset Park, Scarborough Town Centre, Wexford ...",4,Clothing Store,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
12,M1S,Scarborough,Agincourt,4,Clothing Store,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
16,M1X,Scarborough,Upper Rouge,4,Clothing Store,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
24,M2R,North York,Willowdale West,4,Clothing Store,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
65,M5R,Central Toronto,"The Annex, North Midtown, Yorkville",4,Clothing Store,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
83,M6R,West Toronto,"Parkdale, Roncesvalles",4,Clothing Store,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
90,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",4,Clothing Store,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
91,M8Y,Etobicoke,"Humber Bay, King's Mill Park, Kingsway Park So...",4,Clothing Store,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
92,M8Z,Etobicoke,"Kingsway Park South West, Mimico NW, The Queen...",4,Clothing Store,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
102,M9W,Etobicoke,Northwest,4,Clothing Store,Zoo Exhibit,Empanada Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant


### Cluster \#5

In [58]:
top_venues_sorted[top_venues_sorted['Cluster']==5].head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
32,M3M,North York,Downsview Central,5,Cosmetics Shop,Zoo Exhibit,Fast Food Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Filipino Restaurant
45,M4P,Central Toronto,Davisville North,5,Cosmetics Shop,Zoo Exhibit,Fast Food Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Filipino Restaurant
89,M8W,Etobicoke,"Alderwood, Long Branch",5,Cosmetics Shop,Zoo Exhibit,Fast Food Restaurant,Event Service,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Filipino Restaurant


As can be seen, most of the listed venue categories are shared across clusters. However, ***each cluster's 1st most common venue is very specific and unique***: Cluster \#1 concentrates the Italian restaurants, Cluster \#2 groups the Neighbourhoods where the coffee shops are located, and so on. This holds for every cluster except Cluster \#0, which shows more variety. This suggests that the clustering process could be further refined, ending up ***clustering the Neighbourhoods according to the category of the most popular businesses in the area***.
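
The observation about each cluster's 1st most common venue can be quantified by counting, per cluster, how often the leading category appears. The sketch below does this with the standard library on a handful of (Cluster, 1st Most Common Venue) pairs copied from the tables above. It covers only those sample rows, so the shares are illustrative rather than results for the full dataset, and the variable names are made up.

```python
from collections import Counter, defaultdict

# (Cluster, 1st Most Common Venue) pairs copied from a few rows of the cluster tables
rows = [
    (0, "Thai Restaurant"), (0, "Liquor Store"), (0, "Farmers Market"),
    (1, "Italian Restaurant"), (1, "Italian Restaurant"), (1, "Italian Restaurant"),
    (2, "Coffee Shop"), (2, "Coffee Shop"),
    (3, "Café"), (3, "Café"),
]

# Count the 1st most common venue categories within each cluster
first_venue_counts = defaultdict(Counter)
for cluster, venue in rows:
    first_venue_counts[cluster][venue] += 1

# For each cluster, keep its leading category and the share of rows that have it
dominant_share = {}
for cluster, counts in sorted(first_venue_counts.items()):
    venue, count = counts.most_common(1)[0]
    dominant_share[cluster] = (venue, count / sum(counts.values()))

for cluster, (venue, share) in dominant_share.items():
    print("Cluster #{}: {} ({:.0%})".format(cluster, venue, share))
```

A share close to 100% marks a cluster defined by a single category, while a low share, as for Cluster \#0 in this sample, marks a mixed cluster.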