<h1>Segmenting and Clustering Neighborhoods in Toronto</h1>

<h2>Applied Data Science Capstone Course - Week 3 Peer Graded Assignment</h2>

In [306]:
import pandas as pd
import numpy as np

<h3>1. Scrape the list of postal codes of Toronto from Wikipedia</h3>

<h4>Read Toronto Postal Code Data from Wikipedia</h4>

In [307]:
#Read the first table from wikipedia html page
toronto_df=pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')[0]
toronto_df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


<h4>Clean And Transform the Data into desired format</h4>

In [308]:
#Rename the column Postal Code to PostalCode
toronto_df.rename(columns={'Postal Code':'PostalCode'},inplace=True)

#Remove Rows with Not assigned Borough
toronto_df=toronto_df[toronto_df.Borough !='Not assigned']

In [309]:
#Check whether any postal code appears in more than one row
toronto_df.groupby(['PostalCode','Borough']).count().sort_values(['Neighbourhood'], ascending=False)

Unnamed: 0_level_0,Unnamed: 1_level_0,Neighbourhood
PostalCode,Borough,Unnamed: 2_level_1
M1B,Scarborough,1
M5R,Central Toronto,1
M6G,Downtown Toronto,1
M6E,York,1
M6C,York,1
...,...,...
M3L,North York,1
M3K,North York,1
M3J,North York,1
M3H,North York,1


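No postal code appears in more than one row. Where a postal code covers several neighbourhoods (for example M5A: "Regent Park, Harbourfront"), the source table already lists them in a single comma-separated value.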
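Had the check found repeated postal codes, the rows would need to be collapsed before going further. The cell below is a minimal sketch of how that might look, using made-up toy rows (the codes and names are hypothetical, not taken from the Wikipedia table).

```python
import pandas as pd

# Toy data (hypothetical rows, not taken from the Wikipedia table).
toy_df = pd.DataFrame({
    'PostalCode': ['M1A', 'M1A', 'M2B'],
    'Borough': ['East', 'East', 'West'],
    'Neighbourhood': ['Alpha', 'Beta', 'Gamma'],
})

# True only when every postal code occupies exactly one row.
codes_are_unique = toy_df['PostalCode'].is_unique

# Collapse repeated postal codes into one row with a comma-separated list.
combined_df = (
    toy_df.groupby(['PostalCode', 'Borough'], as_index=False)['Neighbourhood']
          .agg(', '.join)
)
print(codes_are_unique)
print(combined_df)
```

`is_unique` answers the yes/no question directly, which is easier to read than scanning a sorted count table.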

In [310]:
#Check if there are any Neighbourhoods with Not assigned value
toronto_df[toronto_df.Neighbourhood=='Not assigned'].count()

PostalCode       0
Borough          0
Neighbourhood    0
dtype: int64

No neighbourhood has the value "Not assigned".
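This notebook does not need a fallback, because the count above is zero. If such rows did exist, one possible approach (an assumption, not something the data here requires) would be to reuse the borough name as the neighbourhood. A sketch on hypothetical toy rows:

```python
import pandas as pd

# Toy data (hypothetical rows used only to illustrate the fallback).
toy_df = pd.DataFrame({
    'PostalCode': ['X1X', 'Y2Y'],
    'Borough': ['Borough One', 'Borough Two'],
    'Neighbourhood': ['Not assigned', 'Some Neighbourhood'],
})

# Boolean mask of the rows whose neighbourhood is missing.
missing = toy_df['Neighbourhood'] == 'Not assigned'

# Copy the borough name into the neighbourhood column for those rows only.
toy_df.loc[missing, 'Neighbourhood'] = toy_df.loc[missing, 'Borough']
print(toy_df)
```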

<h4>Print the number of rows of the final data set</h4>

In [311]:
toronto_df.shape

(103, 3)

<h3>2. Find the latitude and longitude coordinates of each neighborhood</h3>

<h4>Get Geospatial Data File</h4>

In [312]:
!pip install wget
import wget
coord_file=wget.download('http://cocl.us/Geospatial_data')



<h4>Load Geospatial Data File into dataframe</h4>

In [313]:
coord_df=pd.read_csv(coord_file)
coord_df.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


<h4>Create the final dataframe containing both neighbourhood and coordinate information</h4>

In [314]:
#Get Latitude and Longitude by merging the Toronto postal codes dataframe with the Geospatial_data dataframe
toronto_coord_df=toronto_df.merge(coord_df,left_on='PostalCode', right_on='Postal Code')[['PostalCode', 'Borough', 'Neighbourhood','Latitude', 'Longitude']]
toronto_coord_df.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
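The default inner merge silently drops any postal code that has no coordinates. As a hedged sketch, a left merge with the `indicator` and `validate` options of `DataFrame.merge` makes such gaps visible. The toy frames below are hypothetical and only mirror the column names used above.

```python
import pandas as pd

# Toy frames (hypothetical values) with the same key columns as above.
left_df = pd.DataFrame({'PostalCode': ['A1A', 'B2B', 'C3C'],
                        'Borough': ['X', 'Y', 'Z']})
right_df = pd.DataFrame({'Postal Code': ['A1A', 'B2B'],
                         'Latitude': [1.0, 2.0],
                         'Longitude': [-1.0, -2.0]})

merged_df = left_df.merge(
    right_df,
    left_on='PostalCode',
    right_on='Postal Code',
    how='left',             # keep every postal code from the left frame
    validate='one_to_one',  # raise if a key is duplicated on either side
    indicator=True,         # add a '_merge' column describing each row's origin
)

# Postal codes that found no coordinates in the right frame.
unmatched_codes = merged_df.loc[merged_df['_merge'] == 'left_only', 'PostalCode'].tolist()
print(unmatched_codes)
```

An empty `unmatched_codes` list would confirm that the inner merge used in this notebook loses no rows.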


<h3>3. Explore and cluster the neighborhoods in Toronto</h3>

In [315]:
!pip install folium
import json
import folium
import requests
import matplotlib.cm as cm
import matplotlib.colors as colors



#### Keep only neighbourhoods in boroughs whose name contains "Toronto"

In [316]:
toronto_bor_df=toronto_coord_df[toronto_coord_df.Borough.str.contains("Toronto")].reset_index(drop=True)
toronto_bor_df

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
4,M4E,East Toronto,The Beaches,43.676357,-79.293031
5,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
6,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383
7,M6G,Downtown Toronto,Christie,43.669542,-79.422564
8,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568
9,M6H,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259


#### Take a look at the map

In [317]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[43.651070 ,-79.347015], zoom_start=11)

# add markers to map
for lat, lng, borough, neighborhood in zip(toronto_bor_df['Latitude'], toronto_bor_df['Longitude'], toronto_bor_df['Borough'], toronto_bor_df['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

<h4>Define Foursquare Credentials</h4>

In [318]:
CLIENT_ID = 'ZWLQVDCUED4OJT423RUMHPFZGZWFOUDMA4M1DDFECN1OAA1C' 
CLIENT_SECRET = 'S4BS2TJ3YSXEAK22PI0GGEGA5OP3CX5VHUPRY0BOTBZ21HK5' 
VERSION = '20180605' 
LIMIT = 100 
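Hard-coding credentials in a shared notebook exposes them to every reader. A minimal sketch of one alternative is to read them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper name are assumptions made for this illustration.

```python
import os

def load_foursquare_credentials(environ=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; use whatever names you export.
    """
    try:
        return environ['FOURSQUARE_CLIENT_ID'], environ['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        # Fail early with a readable message instead of sending an empty key.
        raise RuntimeError('Missing environment variable: {}'.format(missing)) from None

# Example with an explicit mapping instead of the real environment.
client_id, client_secret = load_foursquare_credentials(
    {'FOURSQUARE_CLIENT_ID': 'my-id', 'FOURSQUARE_CLIENT_SECRET': 'my-secret'}
)
```

In the notebook, the result would be assigned to `CLIENT_ID` and `CLIENT_SECRET` in place of the literal strings.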

#### Create Function to get the venues for all Neighbourhoods of the Toronto Dataframe from Foursquare

In [319]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
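The function above indexes straight into the JSON response, so a response with no groups or a venue with no categories would raise an exception. The sketch below shows one way to flatten a response defensively. The response shape is inferred only from the indexing in `getNearbyVenues`; the helper name and the idea that fields may be absent are assumptions.

```python
def extract_venue_rows(payload, name, lat, lng):
    """Flatten one 'explore' response into rows, tolerating missing pieces."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        # Fall back to a single empty category when the list is missing or empty.
        categories = venue.get('categories') or [{}]
        rows.append((
            name, lat, lng,
            venue.get('name'),
            location.get('lat'),
            location.get('lng'),
            categories[0].get('name'),
        ))
    return rows

# Hypothetical payload with one complete venue and one without categories.
sample_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Cafe A', 'location': {'lat': 1.0, 'lng': 2.0},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Place B', 'location': {'lat': 3.0, 'lng': 4.0},
               'categories': []}},
]}]}}
sample_rows = extract_venue_rows(sample_payload, 'Toy Neighbourhood', 0.0, 0.0)
```

Rows with a missing category come back with `None` in the last position, so they can be dropped or inspected later instead of aborting the whole download.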

#### Execute the getNearbyVenues function to get the venues for each neighborhood and create a new dataframe called toronto_venues_df


In [320]:

toronto_venues_df = getNearbyVenues(names=toronto_bor_df['Neighbourhood'],
                                   latitudes=toronto_bor_df['Latitude'],
                                   longitudes=toronto_bor_df['Longitude']
                                  )

Regent Park, Harbourfront
Queen's Park, Ontario Provincial Government
Garden District, Ryerson
St. James Town
The Beaches
Berczy Park
Central Bay Street
Christie
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
The Danforth West, Riverdale
Toronto Dominion Centre, Design Exchange
Brockton, Parkdale Village, Exhibition Place
India Bazaar, The Beaches West
Commerce Court, Victoria Hotel
Studio District
Lawrence Park
Roselawn
Davisville North
Forest Hill North & West, Forest Hill Road Park
High Park, The Junction South
North Toronto West, Lawrence Park
The Annex, North Midtown, Yorkville
Parkdale, Roncesvalles
Davisville
University of Toronto, Harbord
Runnymede, Swansea
Moore Park, Summerhill East
Kensington Market, Chinatown, Grange Park
Summerhill West, Rathnelly, South Hill, Forest Hill SE, Deer Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
R

In [321]:
print('There are {} unique categories.'.format(len(toronto_venues_df['Venue Category'].unique())))
toronto_venues_df.head(50)

There are 233 unique categories.


Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Regent Park, Harbourfront",43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,"Regent Park, Harbourfront",43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,"Regent Park, Harbourfront",43.65426,-79.360636,Morning Glory Cafe,43.653947,-79.361149,Breakfast Spot
3,"Regent Park, Harbourfront",43.65426,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center
4,"Regent Park, Harbourfront",43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
5,"Regent Park, Harbourfront",43.65426,-79.360636,Impact Kitchen,43.656369,-79.35698,Restaurant
6,"Regent Park, Harbourfront",43.65426,-79.360636,Corktown Common,43.655618,-79.356211,Park
7,"Regent Park, Harbourfront",43.65426,-79.360636,The Extension Room,43.653313,-79.359725,Gym / Fitness Center
8,"Regent Park, Harbourfront",43.65426,-79.360636,The Distillery Historic District,43.650244,-79.359323,Historic Site
9,"Regent Park, Harbourfront",43.65426,-79.360636,SOMA chocolatemaker,43.650622,-79.358127,Chocolate Shop


Each neighbourhood now spans multiple rows, one per venue, and the venues fall into many different categories.

#### Convert venue categories to columns (one-hot encoding) so that the categorical variable becomes numeric and ready for K-Means clustering

In [322]:
#Create dataframe toronto_clust_process_df 
toronto_clust_process_df=pd.get_dummies(toronto_venues_df[['Venue Category']],prefix='', prefix_sep='')
toronto_clust_process_df['Neighbourhood'] =toronto_venues_df['Neighbourhood']

#Move the 'Neighbourhood' column to the front without hard-coding the number of categories
ordered_columns = ['Neighbourhood'] + [c for c in toronto_clust_process_df.columns if c != 'Neighbourhood']
toronto_clust_process_df = toronto_clust_process_df[ordered_columns]
toronto_clust_process_df.head()

Unnamed: 0,Neighbourhood,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,...,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Yoga Studio
0,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Next, we collapse these rows back into one row per neighbourhood.

#### Create a dataframe with one row per neighbourhood. Each venue category is aggregated as the mean of its one-hot column, i.e. the share of the neighbourhood's venues that belong to that category

In [323]:
toronto_clust_ready_df = toronto_clust_process_df.groupby('Neighbourhood').mean().reset_index()
toronto_clust_ready_df.head()

Unnamed: 0,Neighbourhood,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,...,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Yoga Studio
0,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0
1,"Brockton, Parkdale Village, Exhibition Place",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Business reply mail Processing Centre, South C...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"CN Tower, King and Spadina, Railway Lands, Har...",0.058824,0.058824,0.058824,0.117647,0.117647,0.117647,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.015625,0.0,0.0,0.015625,0.015625
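To make the two steps above concrete, here is a small worked example on hypothetical toy data: one-hot encoding followed by a per-neighbourhood mean turns raw venue rows into category frequencies.

```python
import pandas as pd

# Toy venue list (hypothetical): neighbourhood A has 4 venues, B has 2.
toy_venues_df = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Coffee Shop', 'Coffee Shop', 'Bakery', 'Park', 'Park'],
})

# One column per category, with 1.0 where the venue belongs to it.
one_hot_df = pd.get_dummies(toy_venues_df['Venue Category'], dtype=float)
one_hot_df.insert(0, 'Neighbourhood', toy_venues_df['Neighbourhood'])

# The mean of a 0/1 column is the fraction of venues in that category.
frequency_df = one_hot_df.groupby('Neighbourhood').mean().reset_index()
print(frequency_df)
```

For neighbourhood A, two of four venues are coffee shops, so its `Coffee Shop` value is 0.5. Each row sums to 1, which keeps neighbourhoods with many venues comparable to those with few.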


#### Execute K-Means clustering algorithm with a K=4 in order to split the Neighbourhoods into 4 clusters

In [324]:
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 4

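# pass the column by keyword; a positional axis argument is rejected by recent pandas versions
toronto_clustering_df = toronto_clust_ready_df.drop(columns='Neighbourhood')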

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=1).fit(toronto_clustering_df)
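The notebook fixes K=4 without stating how the value was chosen. One common way to compare candidate values is the silhouette score. The sketch below runs on synthetic points, not on the Toronto data, and is only an illustration of the technique, not the method used here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data: three well-separated blobs of 30 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Fit k-means for several k and record the mean silhouette score of each.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=1).fit_predict(points)
    scores[k] = silhouette_score(points, labels)

# The k with the highest score separates the data most cleanly.
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

Applied to `toronto_clustering_df`, the same loop would show whether 4 clusters score better or worse than neighbouring values of K.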

#### Function to sort the venues in descending order

In [325]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
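The cluster tables later in this notebook end in the same run of categories for several neighbourhoods (Ethiopian Restaurant, Electronics Store, and so on), which suggests that categories with a frequency of zero may be padding the top-10 lists. The variant below is a hypothetical sketch that keeps only categories that actually occur; the function name and toy row are assumptions.

```python
import pandas as pd

def top_nonzero_categories(row, num_top_venues):
    """Hypothetical variant of return_most_common_venues that skips zero frequencies."""
    # Skip the first entry (the neighbourhood name), as the original function does.
    frequencies = pd.to_numeric(row.iloc[1:])
    # Keep only categories that actually occur in this neighbourhood.
    observed = frequencies[frequencies > 0]
    return observed.nlargest(num_top_venues).index.tolist()

# Toy row (hypothetical frequencies) in the same layout as toronto_clust_ready_df.
toy_row = pd.Series({'Neighbourhood': 'A', 'Bakery': 0.0,
                     'Coffee Shop': 0.5, 'Park': 0.3, 'Pub': 0.2})
top_three = top_nonzero_categories(toy_row, 3)
top_ten = top_nonzero_categories(toy_row, 10)
print(top_three, top_ten)
```

Because the result can be shorter than `num_top_venues`, it would need padding (for example with `None`) before being written into a fixed-width dataframe row.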

#### Create a new dataframe to display the top 10 venues for each neighborhood as columns

In [326]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighbourhoods_venues_sorted_df = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted_df['Neighbourhood'] = toronto_clust_ready_df['Neighbourhood']

for ind in np.arange(toronto_clust_ready_df.shape[0]):
    neighbourhoods_venues_sorted_df.iloc[ind, 1:] = return_most_common_venues(toronto_clust_ready_df.iloc[ind, :], num_top_venues)

neighbourhoods_venues_sorted_df.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Coffee Shop,Cocktail Bar,Seafood Restaurant,Restaurant,Cheese Shop,Farmers Market,Bakery,Beer Bar,Hotel,Pharmacy
1,"Brockton, Parkdale Village, Exhibition Place",Café,Breakfast Spot,Coffee Shop,Climbing Gym,Burrito Place,Restaurant,Italian Restaurant,Intersection,Stadium,Bar
2,"Business reply mail Processing Centre, South C...",Brewery,Garden Center,Butcher,Restaurant,Fast Food Restaurant,Farmers Market,Auto Workshop,Burrito Place,Smoke Shop,Garden
3,"CN Tower, King and Spadina, Railway Lands, Har...",Airport Lounge,Airport Service,Airport Terminal,Airport,Airport Food Court,Airport Gate,Bar,Harbor / Marina,Boutique,Rental Car Location
4,Central Bay Street,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Salad Place,Burger Joint,Bubble Tea Shop,Portuguese Restaurant,Poke Place,Pizza Place
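The column names above are built with a `try`/`except` around a three-element suffix list, which works for 1 to 10 but would produce "11st" style errors only if the list were extended carelessly. As an alternative sketch, a small helper (the name `ordinal` is an assumption) can compute the suffix for any positive integer:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    # 11, 12 and 13 (and 111, 112, ...) take 'th' despite ending in 1, 2, 3.
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighbourhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```

This yields the same eleven column names as the loop in the cell above and stays correct if `num_top_venues` is raised beyond 10.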


This layout will help when evaluating the clusters.

#### Gather everything together into final_df dataframe (Neighbourhoods, Coordinates, Cluster, Top 10 Venue Categories)

In [327]:
# add clustering labels to top 10 venue categories dataframe
neighbourhoods_venues_sorted_df.insert(0, 'Cluster Labels', kmeans.labels_)

#Create final dataframe based on the initial dataframe with the Neighbourhoods and locations
final_df = toronto_bor_df

#gather everything in final_df (join location data with top 10 venue categories data)
final_df = final_df.join(neighbourhoods_venues_sorted_df.set_index('Neighbourhood'), on='Neighbourhood')

final_df.head() 

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1,Coffee Shop,Bakery,Pub,Park,Breakfast Spot,Café,Theater,Yoga Studio,Shoe Store,Brewery
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1,Coffee Shop,Sushi Restaurant,Yoga Studio,Creperie,Smoothie Shop,Sandwich Place,Burrito Place,Café,Portuguese Restaurant,College Auditorium
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1,Coffee Shop,Clothing Store,Cosmetics Shop,Hotel,Bubble Tea Shop,Café,Middle Eastern Restaurant,Japanese Restaurant,Fast Food Restaurant,Ramen Restaurant
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1,Coffee Shop,Café,Cocktail Bar,American Restaurant,Gastropub,Hotel,Gym,Cosmetics Shop,Department Store,Moroccan Restaurant
4,M4E,East Toronto,The Beaches,43.676357,-79.293031,1,Pub,Health Food Store,Trail,Neighborhood,Yoga Studio,Dog Run,Diner,Discount Store,Distribution Center,Donut Shop


#### Display clusters on the map

In [328]:
# create map
map_clusters =folium.Map(location=[43.651070 ,-79.347015], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(final_df['Latitude'], final_df['Longitude'], final_df['Neighbourhood'], final_df['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    # cluster labels run from 0 to kclusters-1, so they index the colour list directly
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
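The colour list in the cell above is built through the intermediate arrays `x` and `ys`, but only the length of `ys` (equal to `kclusters`) affects the result. A shorter sketch that should give one hex colour per cluster, assuming the same `rainbow` colormap:

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 4

# Sample the colormap at kclusters evenly spaced positions and convert to hex.
cluster_colors = [colors.to_hex(rgba) for rgba in cm.rainbow(np.linspace(0, 1, kclusters))]
print(cluster_colors)
```

The cluster label can then be used directly as the index into `cluster_colors`.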

#### Evaluate the clusters. Find common characteristics based on Top Venue Categories of each Cluster

##### Cluster 0 - "Relax and Play"

In [329]:
final_df.loc[final_df['Cluster Labels'] == 0]

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
18,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,0,Park,Bus Line,Swim School,Farmers Market,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop
21,M5P,Central Toronto,"Forest Hill North & West, Forest Hill Road Park",43.696948,-79.411307,0,Park,Trail,Jewelry Store,Sushi Restaurant,Department Store,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop
33,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529,0,Park,Playground,Trail,Deli / Bodega,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant


Every neighbourhood in this cluster has Park as its most common venue category. The second and third places include trails, a swim school and a playground. These neighbourhoods are mainly suited to recreation, sports and relaxation.

##### Cluster 1 - "The busy part of the town"

In [330]:
final_df.loc[final_df['Cluster Labels'] == 1]

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1,Coffee Shop,Bakery,Pub,Park,Breakfast Spot,Café,Theater,Yoga Studio,Shoe Store,Brewery
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1,Coffee Shop,Sushi Restaurant,Yoga Studio,Creperie,Smoothie Shop,Sandwich Place,Burrito Place,Café,Portuguese Restaurant,College Auditorium
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1,Coffee Shop,Clothing Store,Cosmetics Shop,Hotel,Bubble Tea Shop,Café,Middle Eastern Restaurant,Japanese Restaurant,Fast Food Restaurant,Ramen Restaurant
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1,Coffee Shop,Café,Cocktail Bar,American Restaurant,Gastropub,Hotel,Gym,Cosmetics Shop,Department Store,Moroccan Restaurant
4,M4E,East Toronto,The Beaches,43.676357,-79.293031,1,Pub,Health Food Store,Trail,Neighborhood,Yoga Studio,Dog Run,Diner,Discount Store,Distribution Center,Donut Shop
5,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,1,Coffee Shop,Cocktail Bar,Seafood Restaurant,Restaurant,Cheese Shop,Farmers Market,Bakery,Beer Bar,Hotel,Pharmacy
6,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,1,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Salad Place,Burger Joint,Bubble Tea Shop,Portuguese Restaurant,Poke Place,Pizza Place
7,M6G,Downtown Toronto,Christie,43.669542,-79.422564,1,Grocery Store,Café,Park,Restaurant,Candy Store,Baby Store,Italian Restaurant,Coffee Shop,Nightclub,Athletics & Sports
8,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568,1,Coffee Shop,Café,Restaurant,Gym,Bakery,Deli / Bodega,Clothing Store,Thai Restaurant,Sushi Restaurant,Hotel
9,M6H,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259,1,Bakery,Pharmacy,Brewery,Pizza Place,Bank,Supermarket,Pool,Café,Playground,Middle Eastern Restaurant


This cluster is full of neighbourhoods dominated by cafés and all kinds of restaurants. These are the busiest and liveliest areas: the places to hang out, eat, drink and generally have a good time.

##### Cluster 2 - "The devil's advocate"

In [331]:
final_df.loc[final_df['Cluster Labels'] == 2]

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
29,M4T,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316,2,Lawyer,Trail,Yoga Studio,Dessert Shop,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop


This cluster contains a single neighbourhood whose most common venue category is Lawyer, which suggests a business-oriented area.

##### Cluster 3 - "Where your home is"

In [332]:
final_df.loc[final_df['Cluster Labels'] == 3]

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,M5N,Central Toronto,Roselawn,43.711695,-79.416936,3,Garden,Home Service,Dessert Shop,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop


The single neighbourhood in this cluster does not have cafés or restaurants at the top of its list. Garden and Home Service in the first two places suggest a quiet, residential neighbourhood.
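Reading each cluster table by eye works for four clusters, but a compact summary can support the interpretation. The sketch below counts the neighbourhoods in each cluster and reports the most frequent top venue category. It runs on hypothetical toy rows that only mirror the column names of `final_df`.

```python
import pandas as pd

# Toy rows (hypothetical) with the two columns of final_df used here.
toy_final_df = pd.DataFrame({
    'Cluster Labels': [0, 0, 1, 1, 1, 2],
    '1st Most Common Venue': ['Park', 'Park', 'Coffee Shop',
                              'Coffee Shop', 'Pub', 'Garden'],
})

# Per cluster: number of neighbourhoods and the most frequent top category.
summary_df = (
    toy_final_df.groupby('Cluster Labels')['1st Most Common Venue']
                .agg(neighbourhoods='count', top_venue=lambda s: s.mode().iat[0])
)
print(summary_df)
```

A cluster with a single neighbourhood stands out immediately in such a table, which is a useful caution before drawing conclusions about it.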