# Segmenting and Clustering Neighborhoods in Toronto

### 1. Assignment description

In this assignment, you will explore, segment, and cluster the neighborhoods in the city of Toronto based on postal code and borough information. However, unlike New York, the neighborhood data is not readily available on the internet. What is interesting about the field of data science is that each project can be challenging in its own way, so you need to be agile and able to pick up new libraries and tools quickly as the project demands.

For the Toronto neighborhood data, a Wikipedia page exists that has all the information we need to explore and cluster the neighborhoods in Toronto. You will scrape the Wikipedia page, wrangle and clean the data, and then read it into a pandas dataframe so that it is in a structured format like the New York dataset.

Once the data is in a structured format, you can replicate the analysis that we performed on the New York City dataset to explore and cluster the neighborhoods in the city of Toronto.

### 2. Scraping content from the Wikipedia page

In [114]:
!pip install folium

import re
import requests
import numpy as np
import pandas as pd
from bs4 import BeautifulSoup
from geopy.geocoders import Nominatim

import folium # map rendering library

# import k-means from clustering stage
from sklearn.cluster import KMeans

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

print('Folium installed')
print('Libraries imported.')

Folium installed
Libraries imported.


In [115]:
source = requests.get("https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M").text
canada_list = BeautifulSoup(source, 'lxml')

table = canada_list.find("table")
table_rows = table.tbody.find_all("tr")

def add_row(td, result):
    postal_code = ''
    borough = ''
    neighborhood = ''
    
    for tr in td:
        postal_code = tr.prettify().split("<br/>")[0].split("<p>")[1].replace("\n  ","")
        
        borough_text = tr.p.span.text
        if borough_text != 'Not assigned':
            borough_text_transformed = re.search(r"(.*)\(([^)]+)\)", borough_text)
            if borough_text_transformed:
                borough = borough_text_transformed.groups()[0]
        
        neighborhood_text = tr.p.span.text
        neighborhood_text_transformed = re.search(r"\(([^)]+)\)", neighborhood_text)
        if neighborhood_text_transformed:
            neighborhood = neighborhood_text_transformed.groups()[0].replace(" / ", ",")
        
        # If the neighborhood is not assigned, fall back to the borough name
        if neighborhood == 'Not assigned':
            neighborhood = borough
        
        # Skip cells whose borough is not assigned
        if borough_text != 'Not assigned':
            row = [postal_code, borough, neighborhood]
            result.append(row)
    
result = []
for tr in table_rows:
    td = tr.find_all("td")
    add_row(td, result)

# Define the dataframe columns
columns_names = ['PostalCode', 'Borough', 'Neighborhood']

# Instantiate and populate the dataframe
df = pd.DataFrame(result, columns=columns_names)

# Examine the dataframe
df.head()


Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park,Harbourfront"
3,M6A,North York,"Lawrence Manor,Lawrence Heights"
4,M7A,Queen's Park,Ontario Provincial Government
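
Each table cell on the Wikipedia page packs the borough and its neighborhoods into one string, such as `Downtown Toronto(Regent Park / Harbourfront)`. As a rough sketch of that parsing step in isolation, the split can be written with a single anchored pattern. The helper name `split_cell` and the fallback for text without parentheses are assumptions made for illustration; they are not part of the notebook.

```python
import re

# Borough is everything before the first "(", neighborhoods are inside the parentheses.
CELL_PATTERN = re.compile(r"^(?P<borough>[^(]+)\((?P<names>[^)]+)\)")


def split_cell(text):
    """Return (borough, neighborhoods) for one cell, or None if it is not assigned."""
    text = text.strip()
    if text == "Not assigned":
        return None
    match = CELL_PATTERN.match(text)
    if match is None:
        # Assumption: without parentheses, reuse the borough as the neighborhood.
        return text, text
    borough = match.group("borough").strip()
    # The page separates neighborhoods with " / "; the notebook stores them comma-separated.
    names = ",".join(part.strip() for part in match.group("names").split("/"))
    return borough, names


print(split_cell("Downtown Toronto(Regent Park / Harbourfront)"))
```

Anchoring the pattern at the start of the string keeps the borough and neighborhood captures from overlapping, which is harder to guarantee with two separate searches.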


Group all neighborhoods with the same postal code

In [116]:
df = df.groupby(["PostalCode", "Borough"])["Neighborhood"].apply(", ".join).reset_index()
df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Malvern,Rouge"
1,M1C,Scarborough,"Rouge Hill,Port Union,Highland Creek"
2,M1E,Scarborough,"Guildwood,Morningside,West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
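
For readers less familiar with `groupby(...).apply(", ".join)`, the following plain-Python sketch shows the same idea: collect the neighborhood names that share a postal code and borough, then join them into one string. The function name `group_neighborhoods` and the sample rows are hypothetical and only illustrate the shape of the operation.

```python
from collections import defaultdict


def group_neighborhoods(rows, separator=", "):
    """Merge rows that share (postal_code, borough) into one row per key."""
    grouped = defaultdict(list)
    for postal_code, borough, neighborhood in rows:
        grouped[(postal_code, borough)].append(neighborhood)
    # Sort by key, mirroring the sorted group keys that pandas groupby returns by default.
    return [
        (postal_code, borough, separator.join(names))
        for (postal_code, borough), names in sorted(grouped.items())
    ]


# Hypothetical input where one postal code appears on two rows.
sample_rows = [
    ("M1C", "Scarborough", "Rouge Hill"),
    ("M1B", "Scarborough", "Malvern"),
    ("M1B", "Scarborough", "Rouge"),
]
print(group_neighborhoods(sample_rows))
```

In this notebook the scraped table already holds one row per postal code, so the grouping mainly sorts the rows by postal code.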


In [117]:
print("The number of rows in dataframe:", df.shape[0])
df.shape

The number of rows in dataframe: 103


(103, 3)

### 3. Get the latitude and the longitude coordinates of each neighborhood.

In [118]:
df_toronto = df.copy()

In [119]:
!wget -q -O 'toronto_latitude_longitude_geospatial_data.csv'  http://cocl.us/Geospatial_data
geodata = pd.read_csv('toronto_latitude_longitude_geospatial_data.csv')
geodata.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [120]:
geodata = geodata.rename(columns={"Postal Code": "PostalCode"})
geodata.head()

Unnamed: 0,PostalCode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [121]:
df_toronto = df_toronto.merge(geodata, how='inner', on='PostalCode')
df_toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern,Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill,Port Union,Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood,Morningside,West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
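
An inner merge keeps only the postal codes that appear in both tables, so a code missing from the geospatial file would silently disappear. The sketch below illustrates that behavior in plain Python; `inner_join` and the `M9Z` row are hypothetical and exist only for the example.

```python
def inner_join(neighborhood_rows, coordinates):
    """Attach (latitude, longitude) to each row whose postal code has coordinates."""
    joined = []
    for postal_code, borough, neighborhood in neighborhood_rows:
        if postal_code in coordinates:
            latitude, longitude = coordinates[postal_code]
            joined.append((postal_code, borough, neighborhood, latitude, longitude))
        # Rows without a match are dropped, exactly as how='inner' would drop them.
    return joined


rows = [
    ("M1B", "Scarborough", "Malvern,Rouge"),
    ("M9Z", "Hypothetical Borough", "Hypothetical Neighborhood"),  # no coordinates on purpose
]
coords = {"M1B": (43.806686, -79.194353)}
print(inner_join(rows, coords))
```

In the notebook the dataframe has 103 rows both before and after the merge, which indicates that no postal code was dropped here.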


In [122]:
# Check how many boroughs and postal code areas there are
# (each row is one postal code, which may group several neighborhoods)
print('The dataframe has {} boroughs and {} postal code areas.'.format(
        len(df_toronto['Borough'].unique()),
        df_toronto.shape[0]
    )
)

The dataframe has 17 boroughs and 103 postal code areas.


### 4. Explore and cluster the neighborhoods in Toronto


#### 4.1. Get the latitude and longitude values of Toronto.


In [155]:
address = "Toronto, ON"

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

print(f'The geographical coordinates of Toronto are {latitude}, {longitude}.')

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


#### 4.2. Create a map of the whole Toronto City with neighborhoods superimposed on top.

Create map and add markers to the map.

In [124]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# Loop variables use their own names so the Toronto center coordinates are not overwritten
for nb_latitude, nb_longitude, borough, neighborhood in zip(
        df_toronto['Latitude'], 
        df_toronto['Longitude'], 
        df_toronto['Borough'], 
        df_toronto['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [nb_latitude, nb_longitude],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.5).add_to(map_toronto)

map_toronto

#### 4.3. Map of a part of Toronto City

I will keep only the boroughs whose name contains the word "Toronto".

In [125]:
# Keep boroughs whose name contains the word "Toronto"
df_toronto_city = df_toronto[df_toronto['Borough'].str.contains("Toronto")].reset_index(drop=True)
df_toronto_city.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4J,East YorkEast Toronto,The Danforth East,43.685347,-79.338106
2,M4K,East Toronto,"The Danforth West,Riverdale",43.679557,-79.352188
3,M4L,East Toronto,"India Bazaar,The Beaches West",43.668999,-79.315572
4,M4M,East Toronto,Studio District,43.659526,-79.340923


In [126]:
map_toronto_city = folium.Map(location=[latitude, longitude], zoom_start=12)
# Loop variables use their own names so the Toronto center coordinates are not overwritten
for nb_latitude, nb_longitude, borough, neighborhood in zip(
        df_toronto_city['Latitude'], 
        df_toronto_city['Longitude'], 
        df_toronto_city['Borough'], 
        df_toronto_city['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [nb_latitude, nb_longitude],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.5).add_to(map_toronto_city)

map_toronto_city

#### 4.4. Define Foursquare Credentials and Version

In [127]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # replace with your own Foursquare client ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # replace with your own secret; do not publish real credentials
VERSION = '20180323'

#### 4.5. Explore the first neighborhood in our data frame "df_toronto_city"

In [128]:
neighborhood_name = df_toronto_city.loc[0, 'Neighborhood']
print(f"The first neighborhood's name is '{neighborhood_name}'.")

The first neighborhood's name is 'The Beaches'.


Get the neighborhood's latitude and longitude values.

In [129]:
neighborhood_latitude = df_toronto_city.loc[0, 'Latitude'] # Get neighborhood latitude value
neighborhood_longitude = df_toronto_city.loc[0, 'Longitude'] # Get neighborhood longitude value

print(f'Latitude and longitude values of {neighborhood_name} are {neighborhood_latitude}, {neighborhood_longitude}.')

Latitude and longitude values of The Beaches are 43.67635739999999, -79.2930312.


Now, let's get the top 100 venues that are in The Beaches within a radius of 500 meters.

In [130]:
limit = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius
api_url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    limit) # FourSquare API Rest

# The URL embeds the client ID and secret, so avoid printing it in a shared notebook

# get the result as JSON
results = requests.get(api_url).json()
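
The request URL above is assembled with string formatting. A slightly more robust sketch, shown here as one possible alternative rather than the notebook's method, builds the query string with `urllib.parse.urlencode`, which escapes each value. The function name `build_explore_url` and the placeholder credentials are assumptions; the endpoint and parameter names are the ones used in the notebook.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Build the venues/explore URL with properly escaped query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",  # latitude and longitude as one comma-separated value
        "radius": radius,
        "limit": limit,
    }
    return EXPLORE_ENDPOINT + "?" + urlencode(params)


# Placeholder credentials: substitute your own, and keep them out of shared output.
url = build_explore_url("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET", "20180323", 43.5, -79.25)
query = parse_qs(urlsplit(url).query)
print(query["ll"], query["radius"], query["limit"])
```

`requests.get` also accepts a `params` dictionary directly, which achieves the same escaping without building the URL by hand.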


Create a function that gets the category name of a venue.

In [131]:
def get_category_name_venue(row):
    categories = row['venue.categories']
    return None if len(categories) == 0 else categories[0]['name']

Flatten the JSON response into a pandas dataframe and keep only the relevant columns.

In [132]:
venues = results['response']['groups'][0]['items']
venues_normalized = pd.json_normalize(venues)

# Columns
venues_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
venues_normalized = venues_normalized.loc[:, venues_columns]

# Filter the category for each row
venues_normalized['venue.categories'] = venues_normalized.apply(get_category_name_venue, axis=1)

# Clean columns
venues_normalized.columns = [column.split(".")[-1] for column in venues_normalized.columns]

venues_normalized


Unnamed: 0,name,categories,lat,lng
0,Glen Manor Ravine,Trail,43.676821,-79.293942
1,The Big Carrot Natural Food Market,Health Food Store,43.678879,-79.297734
2,Grover Pub and Grub,Pub,43.679181,-79.297215
3,Upper Beaches,Neighborhood,43.680563,-79.292869


#### 4.6. Explore neighborhoods in a part of Toronto City

In [133]:
def getVenues(names, latitudes, longitudes, radius=500):
    venues_data=[]
    
    limit = 100 # Maximum number of venues returned per request by the Foursquare API
    # The search radius comes from the `radius` argument (500 meters by default)
    
    for name, latitude, longitude in zip(names, latitudes, longitudes):
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            latitude, 
            longitude, 
            radius, 
            limit)
            
        # Make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # Return only relevant information
        # Use this loop's own coordinates rather than unrelated global variables
        venues_data.append([(
            name, 
            latitude, 
            longitude, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    df_venues = pd.DataFrame([venue for venues_list in venues_data for venue in venues_list])
    df_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return df_venues

Now run the function on each neighborhood in the dataframe named df_toronto_city.

In [134]:
toronto_venues = getVenues(
      names=df_toronto_city['Neighborhood'],
      latitudes=df_toronto_city['Latitude'],
      longitudes=df_toronto_city['Longitude']
)
toronto_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Beaches,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
1,The Beaches,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
2,The Beaches,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
3,The Beaches,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood
4,The Danforth East,43.685347,-79.338106,Danforth & Jones,43.684352,-79.334792,Intersection


Let's check how many venues were returned for each neighborhood.

In [135]:
toronto_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Berczy Park,58,58,58,58,58,58
"Brockton,Parkdale Village,Exhibition Place",22,22,22,22,22,22
"CN Tower,King and Spadina,Railway Lands,Harbourfront West,Bathurst Quay,South Niagara,Island airport",17,17,17,17,17,17
Central Bay Street,68,68,68,68,68,68
Christie,15,15,15,15,15,15
Church and Wellesley,75,75,75,75,75,75
"Commerce Court,Victoria Hotel",100,100,100,100,100,100
Davisville,37,37,37,37,37,37
Davisville North,7,7,7,7,7,7
"Dufferin,Dovercourt Village",14,14,14,14,14,14


Let's find out how many unique categories can be curated from all the returned venues

In [136]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 231 unique categories.


#### 4.7. Analyze Each Neighborhood

In [137]:
# Create dummy dataframe
toronto_dummy_df = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# Add neighborhood column back to dataframe
toronto_dummy_df['Neighborhood'] = toronto_venues['Neighborhood'] 

# Move neighborhood column to the first column
move_first_column = [toronto_dummy_df.columns[-1]] + list(toronto_dummy_df.columns[:-1])
toronto_dummy_df = toronto_dummy_df[move_first_column]

toronto_dummy_df.head()

Unnamed: 0,Yoga Studio,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,...,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category.

In [138]:
toronto_means_df = toronto_dummy_df.groupby('Neighborhood').mean().reset_index()
toronto_means_df.head()

Unnamed: 0,Neighborhood,Yoga Studio,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop
0,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0
1,"Brockton,Parkdale Village,Exhibition Place",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"CN Tower,King and Spadina,Railway Lands,Harbou...",0.0,0.058824,0.058824,0.058824,0.117647,0.117647,0.117647,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Central Bay Street,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0
4,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
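
Taking the mean of one-hot columns per neighborhood is equivalent to computing, for each neighborhood, the share of its venues that fall into each category. The plain-Python sketch below shows that equivalence on the four venues returned for The Beaches. The function name `category_frequencies` is an assumption, and unlike the dataframe it stores only the non-zero categories.

```python
from collections import Counter, defaultdict


def category_frequencies(venues):
    """Map each neighborhood to {category: share of that neighborhood's venues}."""
    counts = defaultdict(Counter)
    for neighborhood, category in venues:
        counts[neighborhood][category] += 1
    return {
        neighborhood: {
            category: n / sum(counter.values()) for category, n in counter.items()
        }
        for neighborhood, counter in counts.items()
    }


# (neighborhood, venue category) pairs taken from the venue table above.
beaches_venues = [
    ("The Beaches", "Trail"),
    ("The Beaches", "Health Food Store"),
    ("The Beaches", "Pub"),
    ("The Beaches", "Neighborhood"),
]
print(category_frequencies(beaches_venues))
```

With four venues in four different categories, each category gets a frequency of 0.25, and the frequencies of a neighborhood always sum to 1.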


Check the 10 most common venues in each neighborhood.

In [153]:
def return_most_top_ten_common_venues(row, num_venues):
    categories = row.iloc[1:]
    categories_sorted = categories.sort_values(ascending=False)
    return categories_sorted.index.values[0:num_venues]

In [154]:
indicators = ['st', 'nd', 'rd']
num_top_ten_venues = 10

# Create columns according to number of top venues
columns = ['Neighborhood']
for index in np.arange(num_top_ten_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(index+1, indicators[index]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(index+1))

# Create a new dataframe
neighborhoods_venues_df = pd.DataFrame(columns=columns)
neighborhoods_venues_df['Neighborhood'] = toronto_means_df['Neighborhood']

for index in np.arange(toronto_means_df.shape[0]):
    neighborhoods_venues_df.iloc[index, 1:] = return_most_top_ten_common_venues(toronto_means_df.iloc[index, :], num_top_ten_venues)

neighborhoods_venues_df.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Coffee Shop,Bakery,Cocktail Bar,Seafood Restaurant,Farmers Market,Restaurant,Pharmacy,Cheese Shop,Beer Bar,Japanese Restaurant
1,"Brockton,Parkdale Village,Exhibition Place",Café,Coffee Shop,Breakfast Spot,Gym,Stadium,Burrito Place,Restaurant,Climbing Gym,Performing Arts Venue,Bakery
2,"CN Tower,King and Spadina,Railway Lands,Harbou...",Airport Lounge,Airport Service,Airport Terminal,Coffee Shop,Harbor / Marina,Bar,Rental Car Location,Plane,Boat or Ferry,Boutique
3,Central Bay Street,Coffee Shop,Café,Sandwich Place,Italian Restaurant,Restaurant,Salad Place,Bubble Tea Shop,Japanese Restaurant,Burger Joint,Spa
4,Christie,Grocery Store,Café,Park,Nightclub,Restaurant,Candy Store,Italian Restaurant,Baby Store,Coffee Shop,Donut Shop
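
The column labels and the ranking can also be expressed without a `try`/`except` block. The following is a minimal sketch under a few assumptions: the helper names `ordinal` and `top_categories` are invented for illustration, and ties are broken alphabetically, which may differ from the order pandas produces for categories with equal frequency.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 11 <= n % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def top_categories(frequencies, n):
    """Return the n most frequent categories, highest frequency first."""
    ranked = sorted(frequencies.items(), key=lambda item: (-item[1], item[0]))
    return [category for category, _ in ranked[:n]]


columns = ["Neighborhood"] + [f"{ordinal(i)} Most Common Venue" for i in range(1, 4)]
print(columns)
print(top_categories({"Pub": 0.25, "Trail": 0.5, "Bakery": 0.25}, 2))
```

Many of the lower-ranked entries in the table above have a frequency of zero, so their order reflects the sort's tie handling rather than any real difference between categories.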


#### 4.8. Cluster neighborhoods

Run k-means to cluster the neighborhoods into 5 clusters.

In [144]:
# Set number of clusters
kmeans_clusters = 5

toronto_means_clustering_df = toronto_means_df.drop(columns='Neighborhood')

# Run k-means clustering
kmeans = KMeans(n_clusters=kmeans_clusters, random_state=0).fit(toronto_means_clustering_df)

# Check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 2, 2, 2, 2, 2, 2, 2, 0, 2], dtype=int32)
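
To make the clustering step less of a black box, here is a minimal sketch of the assign-and-update loop (Lloyd's algorithm) that k-means is built on. It is not scikit-learn's implementation: the function name `simple_kmeans` is hypothetical, and it initializes centroids with the first `k` points, whereas `KMeans` uses k-means++ initialization by default.

```python
def simple_kmeans(points, k, iterations=20):
    """Cluster points (sequences of floats) into k groups; returns (labels, centroids)."""
    # Naive initialization (assumption for this sketch): the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(column) / len(members) for column in zip(*members)]
    return labels, centroids


toy_points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
labels, centroids = simple_kmeans(toy_points, k=2)
print(labels, centroids)
```

In the notebook, each point is a neighborhood's vector of category frequencies, so neighborhoods with similar venue mixes end up in the same cluster.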

Create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood



In [145]:
# Add clustering labels
neighborhoods_venues_df.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged_df = df_toronto_city

# Merge df_toronto_city with neighborhoods_venues_df to add latitude/longitude for each neighborhood
toronto_merged_df = toronto_merged_df.join(neighborhoods_venues_df.set_index('Neighborhood'), on='Neighborhood')

toronto_merged_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Health Food Store,Trail,Pub,Wine Shop,Department Store,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop
1,M4J,East YorkEast Toronto,The Danforth East,43.685347,-79.338106,2,Intersection,Park,Coffee Shop,Convenience Store,Wine Shop,Diner,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
2,M4K,East Toronto,"The Danforth West,Riverdale",43.679557,-79.352188,2,Greek Restaurant,Italian Restaurant,Coffee Shop,Ice Cream Shop,Café,Furniture / Home Store,Caribbean Restaurant,Indian Restaurant,Spa,Pub
3,M4L,East Toronto,"India Bazaar,The Beaches West",43.668999,-79.315572,2,Park,Pizza Place,Gym,Food & Drink Shop,Liquor Store,Sandwich Place,Board Shop,Burrito Place,Italian Restaurant,Restaurant
4,M4M,East Toronto,Studio District,43.659526,-79.340923,2,Coffee Shop,Café,Bakery,Gastropub,American Restaurant,Brewery,Yoga Studio,Fish Market,Italian Restaurant,Convenience Store


Visualize the resulting clusters.

In [147]:
# Create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# Set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kmeans_clusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add markers to the map
# (loop variables use their own names so the map center coordinates are not overwritten)
for nb_latitude, nb_longitude, neighborhood, cluster_index in zip(
        toronto_merged_df['Latitude'], 
        toronto_merged_df['Longitude'], 
        toronto_merged_df['Neighborhood'], 
        toronto_merged_df['Cluster Labels']):
    label = folium.Popup(str(neighborhood) + ' Cluster ' + str(cluster_index), parse_html=True)
    folium.CircleMarker(
        [nb_latitude, nb_longitude],
        radius=5,
        popup=label,
        color=rainbow[int(cluster_index)],
        fill=True,
        fill_color=rainbow[int(cluster_index)],
        fill_opacity=0.5).add_to(map_clusters)
       
map_clusters
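
The map needs one visually distinct color per cluster label. The notebook takes these from matplotlib's rainbow colormap; as an illustrative alternative that relies only on the standard library, evenly spaced hues can be generated with `colorsys`. The function name `cluster_colors` is an assumption, and the resulting palette differs from matplotlib's.

```python
import colorsys


def cluster_colors(k):
    """Return k hex colors with evenly spaced hues, indexed by cluster label."""
    palette = []
    for i in range(k):
        # Full saturation and brightness; only the hue changes between clusters.
        r, g, b = colorsys.hsv_to_rgb(i / k, 1.0, 1.0)
        palette.append("#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255)))
    return palette


palette = cluster_colors(5)
print(palette)
```

Indexing the palette directly by the cluster label (`palette[label]`) keeps the mapping between labels and colors easy to read back from the map.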

#### 4.9. Examine Clusters

Now examine each cluster to see which venue categories distinguish it.

##### Cluster 1

In [148]:
toronto_merged_df.loc[toronto_merged_df['Cluster Labels'] == 0, toronto_merged_df.columns[[1] + list(range(5, toronto_merged_df.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,East Toronto,0,Health Food Store,Trail,Pub,Wine Shop,Department Store,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop
6,Central Toronto,0,Park,Hotel,Breakfast Spot,Food & Drink Shop,Sandwich Place,Department Store,Gym / Fitness Center,Convenience Store,Distribution Center,Escape Room
11,Downtown Toronto,0,Park,Playground,Trail,Wine Shop,Department Store,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop
24,Central Toronto,0,Park,Jewelry Store,Trail,Sushi Restaurant,Dessert Shop,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant


##### Cluster 2

In [149]:
toronto_merged_df.loc[toronto_merged_df['Cluster Labels'] == 1, toronto_merged_df.columns[[1] + list(range(5, toronto_merged_df.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,Central Toronto,1,Restaurant,Wine Shop,Department Store,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant


##### Cluster 3

In [150]:
toronto_merged_df.loc[toronto_merged_df['Cluster Labels'] == 2, toronto_merged_df.columns[[1] + list(range(5, toronto_merged_df.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,East YorkEast Toronto,2,Intersection,Park,Coffee Shop,Convenience Store,Wine Shop,Diner,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
2,East Toronto,2,Greek Restaurant,Italian Restaurant,Coffee Shop,Ice Cream Shop,Café,Furniture / Home Store,Caribbean Restaurant,Indian Restaurant,Spa,Pub
3,East Toronto,2,Park,Pizza Place,Gym,Food & Drink Shop,Liquor Store,Sandwich Place,Board Shop,Burrito Place,Italian Restaurant,Restaurant
4,East Toronto,2,Coffee Shop,Café,Bakery,Gastropub,American Restaurant,Brewery,Yoga Studio,Fish Market,Italian Restaurant,Convenience Store
7,Central Toronto,2,Coffee Shop,Clothing Store,Yoga Studio,Chinese Restaurant,Rental Car Location,Café,Fast Food Restaurant,Italian Restaurant,Diner,Sporting Goods Shop
8,Central Toronto,2,Pizza Place,Sandwich Place,Dessert Shop,Gym,Italian Restaurant,Café,Sushi Restaurant,Coffee Shop,Toy / Game Store,Seafood Restaurant
10,Central Toronto,2,Coffee Shop,Pub,American Restaurant,Sushi Restaurant,Bank,Fried Chicken Joint,Restaurant,Bagel Shop,Supermarket,Pizza Place
12,Downtown Toronto,2,Café,Coffee Shop,Restaurant,Bakery,Chinese Restaurant,Pet Store,Italian Restaurant,Pub,Pizza Place,Park
13,Downtown Toronto,2,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Restaurant,Gay Bar,Pub,Men's Store,Mediterranean Restaurant,Hotel,Yoga Studio
14,Downtown Toronto,2,Coffee Shop,Pub,Bakery,Park,Restaurant,Breakfast Spot,Café,Theater,Wine Shop,Farmers Market


##### Cluster 4

In [151]:
toronto_merged_df.loc[toronto_merged_df['Cluster Labels'] == 3, toronto_merged_df.columns[[1] + list(range(5, toronto_merged_df.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Central Toronto,3,Park,Swim School,Bus Line,Wine Shop,Diner,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant


##### Cluster 5

In [152]:
toronto_merged_df.loc[toronto_merged_df['Cluster Labels'] == 4, toronto_merged_df.columns[[1] + list(range(5, toronto_merged_df.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
23,Central Toronto,4,Home Service,Garden,Wine Shop,Dessert Shop,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop
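
The tables above show that the clusters are uneven: labels 1, 3, and 4 each contain a single neighborhood, while label 2 holds most of them. A quick way to see this at a glance is to count the members of each cluster. The sketch below does so with `collections.Counter`; the helper name `cluster_sizes` is hypothetical, and the input is the first ten labels printed earlier by `kmeans.labels_[0:10]`.

```python
from collections import Counter


def cluster_sizes(labels):
    """Return (label, member count) pairs sorted by cluster label."""
    return sorted(Counter(labels).items())


# First ten cluster labels as printed in the notebook output.
first_ten_labels = [2, 2, 2, 2, 2, 2, 2, 2, 0, 2]
print(cluster_sizes(first_ten_labels))
```

Single-member clusters usually point to outlier neighborhoods with very few venues, so they are worth inspecting before drawing conclusions from the cluster profiles.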
