# Segmenting and Clustering Neighborhoods in Toronto

## Part 1: Creating a dataframe containing the postal codes, boroughs and neighborhoods of the city of Toronto

Installing / Importing the necessary modules

In [72]:
import numpy as np
import pandas as pd

In [86]:
!conda install -c conda-forge beautifulsoup4 --yes

In [None]:
!conda install -c conda-forge lxml --yes

In [None]:
from bs4 import BeautifulSoup
import requests

Scraping the Wikipedia page containing the necessary data:

In [5]:
source= requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
soup= BeautifulSoup(source, 'lxml')

Scraping the postal codes:

In [6]:
post_codes=[]
for line in soup.tbody.find_all('tr'):
    code= line.text.strip().split('\n')
    post_codes.append(code[0])

In [7]:
post_codes= post_codes[1:]

In [8]:
df_toronto= pd.DataFrame()

In [9]:
df_toronto['PostalCode']= post_codes

In [10]:
df_toronto.head()

Unnamed: 0,PostalCode
0,M1A
1,M2A
2,M3A
3,M4A
4,M5A


Scraping the boroughs:

In [11]:
borough=[]
for line in soup.tbody.find_all('tr'):
    code= line.text.strip().split('\n')
    borough.append(code[1])

In [13]:
borough= borough[1:]
df_toronto['Borough']= borough

Scraping the neighborhoods:

In [14]:
neighborhood=[]
for line in soup.tbody.find_all('tr'):
    code= line.text.strip().split('\n')
    neighborhood.append(code[2])

In [15]:
neighborhood= neighborhood[1:]
df_toronto['Neighborhood']= neighborhood
df_toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
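The three loops above walk the same table rows once per column. As a minimal sketch, the same split can be done in a single pass. The `rows_text` list is a hypothetical stand-in for the `line.text` of each `tr` element, so the example runs without fetching the page; the header names in its first entry are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for the text of each <tr> element (header row first),
# shaped like the value of `line.text` in the loops above.
rows_text = [
    "\nPostcode\nBorough\nNeighbourhood\n",
    "\nM1A\nNot assigned\nNot assigned\n",
    "\nM3A\nNorth York\nParkwoods\n",
    "\nM5A\nDowntown Toronto\nHarbourfront\n",
]

# Split every row once and keep the first three cells
records = [text.strip().split("\n")[:3] for text in rows_text]

# Skip the header row, as the notebook does with [1:]
df_sketch = pd.DataFrame(records[1:], columns=["PostalCode", "Borough", "Neighborhood"])
print(df_sketch)
```

Building the dataframe from complete rows also keeps the three columns aligned by construction.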


In [16]:
df_toronto.shape

(288, 3)

In [74]:
# Transforming the "Not assigned" values to NaN
df= df_toronto[df_toronto!='Not assigned']

In [18]:
df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,,
1,M2A,,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


In [19]:
# Dropping rows whose borough is NaN:
df= df.dropna(subset=['Borough'])

In [20]:
df.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
5,M5A,Downtown Toronto,Regent Park
6,M6A,North York,Lawrence Heights
7,M6A,North York,Lawrence Manor
8,M7A,Queen's Park,
10,M9A,Etobicoke,Islington Avenue
11,M1B,Scarborough,Rouge
12,M1B,Scarborough,Malvern


In [21]:
# Replacing unassigned neighborhoods with the name of the corresponding borough
# (plain assignment instead of an in-place fillna on a column selection)
df['Neighborhood']= df['Neighborhood'].fillna(df['Borough'])

In [22]:
df.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
5,M5A,Downtown Toronto,Regent Park
6,M6A,North York,Lawrence Heights
7,M6A,North York,Lawrence Manor
8,M7A,Queen's Park,Queen's Park
10,M9A,Etobicoke,Islington Avenue
11,M1B,Scarborough,Rouge
12,M1B,Scarborough,Malvern
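The cleaning steps above (mark "Not assigned" as missing, drop rows without a borough, fall back to the borough name) can be sketched as one short chain. The toy rows below are illustrative and only modeled on the tables above.

```python
import pandas as pd

# Toy rows modeled on the scraped table (illustrative only)
raw = pd.DataFrame({
    "PostalCode": ["M1A", "M3A", "M7A"],
    "Borough": ["Not assigned", "North York", "Queen's Park"],
    "Neighborhood": ["Not assigned", "Parkwoods", "Not assigned"],
})

clean = (
    raw.mask(raw == "Not assigned")      # placeholders become NaN
       .dropna(subset=["Borough"])       # drop rows without a borough
       .reset_index(drop=True)
)

# Fall back to the borough name where the neighborhood is missing
clean["Neighborhood"] = clean["Neighborhood"].fillna(clean["Borough"])
print(clean)
```

Each step returns a new dataframe, so nothing depends on `inplace=True` modifying a selection.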


In [23]:
# grouping the neighborhoods by postal code:
df2= df.groupby(['PostalCode', 'Borough']).agg(lambda x: ', '.join(set(x.astype(str))))

In [24]:
df2.head()

PostalCode,Borough,Neighborhood
M1B,Scarborough,"Malvern, Rouge"
M1C,Scarborough,"Port Union, Rouge Hill, Highland Creek"
M1E,Scarborough,"West Hill, Morningside, Guildwood"
M1G,Scarborough,Woburn
M1H,Scarborough,Cedarbrae


In [25]:
df2.reset_index(inplace= True)

In [26]:
df2.head(113)

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Port Union, Rouge Hill, Highland Creek"
2,M1E,Scarborough,"West Hill, Morningside, Guildwood"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park"
7,M1L,Scarborough,"Golden Mile, Oakridge, Clairlea"
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"
9,M1N,Scarborough,"Birch Cliff, Cliffside West"


In [27]:
df2.shape

(103, 3)
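Because a Python `set` does not preserve insertion order, the joined neighborhood names above can come out in a different order from one run to the next. A minimal sketch of an order-preserving alternative, on toy rows modeled on the table above:

```python
import pandas as pd

rows = pd.DataFrame({
    "PostalCode": ["M1B", "M1B", "M5A", "M5A"],
    "Borough": ["Scarborough", "Scarborough", "Downtown Toronto", "Downtown Toronto"],
    "Neighborhood": ["Rouge", "Malvern", "Harbourfront", "Regent Park"],
})

grouped = (
    rows.groupby(["PostalCode", "Borough"])["Neighborhood"]
        # dict.fromkeys removes duplicates while keeping first-seen order
        .agg(lambda names: ", ".join(dict.fromkeys(names)))
        .reset_index()
)
print(grouped)
```

With this variant the names appear in the order in which they occur in the source table.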

## Part 2: Add neighborhood coordinates to the created dataframe

Importing the latitude and longitude of each postal code from a CSV file

In [28]:
coord= pd.read_csv('https://cocl.us/Geospatial_data')
coord.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [29]:
df2[['Latitude', 'Longitude']]= coord[['Latitude', 'Longitude']]

In [30]:
df2.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Port Union, Rouge Hill, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"West Hill, Morningside, Guildwood",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
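The assignment above aligns the two tables on their row index rather than on the postal code, so it gives correct results only when both are sorted identically and contain the same postal codes. A hedged alternative is to join on the postal code itself; the toy frames below are illustrative and deliberately listed in different orders.

```python
import pandas as pd

areas = pd.DataFrame({
    "PostalCode": ["M1C", "M1B"],
    "Borough": ["Scarborough", "Scarborough"],
})
coords = pd.DataFrame({
    "Postal Code": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

# Rename the key so both frames share it, then join on the postal code
merged = areas.merge(
    coords.rename(columns={"Postal Code": "PostalCode"}),
    on="PostalCode",
    how="left",
)
print(merged)
```

A left join keeps every row of the first table and leaves NaN where no coordinates are found, which makes missing postal codes easy to spot.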


### Get the latitude and longitude values of Toronto

In [31]:
from geopy.geocoders import Nominatim

address= "Toronto, ON"

geolocator= Nominatim(user_agent= "toronto_explorer")
location= geolocator.geocode(address)
latitude= location.latitude
longitude= location.longitude

print("The coordinates of Toronto are: {}, {}".format(latitude, longitude))

The coordinates of Toronto are: 43.653963, -79.387207


### Restrict the analysis to boroughs whose names contain the word 'Toronto'

In [32]:
df2['Borough'].unique()

array(['Scarborough', 'North York', 'East York', 'East Toronto',
       'Central Toronto', 'Downtown Toronto', 'York', 'West Toronto',
       "Queen's Park", 'Mississauga', 'Etobicoke'], dtype=object)

In [33]:
df3= df2[df2['Borough'].isin(['East Toronto', 'Central Toronto', 'Downtown Toronto', 'West Toronto'])]

In [34]:
df3= df3.reset_index(drop= True)

In [35]:
df3.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879
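The borough list above is written out by hand. As a small sketch, a substring test expresses the stated rule directly; the borough names below are a subset of those shown in the `unique()` output.

```python
import pandas as pd

boroughs = pd.DataFrame(
    {"Borough": ["Scarborough", "East Toronto", "York", "Downtown Toronto"]}
)

# Keep only rows whose borough name contains the word "Toronto"
toronto_only = boroughs[boroughs["Borough"].str.contains("Toronto")].reset_index(drop=True)
print(toronto_only)
```

This avoids having to update the list by hand if the set of boroughs changes.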


## Part 3: Use the Foursquare API to explore and cluster the neighborhoods

Foursquare credentials (hidden cell)

In [36]:
# The code was removed by Watson Studio for sharing.

In [37]:
VERSION= '20180605'
LIMIT= 100

Create a function to get the venues of the neighborhoods in Toronto

In [38]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
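The function above indexes straight into the JSON response, so an empty `groups` list or a venue without categories would raise an exception. The sketch below shows a more defensive extraction step under the hypothetical name `extract_venues`. The `sample` payload is made up for illustration and only mirrors the nesting that the function above relies on.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style payload."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Use None when a venue carries no category
        category = categories[0]["name"] if categories else None
        rows.append((
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows


# Hypothetical payload with the same nesting as the response indexed above
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Glen Manor Ravine",
               "location": {"lat": 43.676821, "lng": -79.293942},
               "categories": [{"name": "Trail"}]}},
    {"venue": {"name": "Uncategorized Spot",
               "location": {"lat": 43.0, "lng": -79.0},
               "categories": []}},
]}]}}

print(extract_venues(sample))
print(extract_venues({"response": {}}))
```

Returning an empty list for an empty response lets the surrounding loop continue with the next neighborhood instead of stopping.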

In [39]:
toronto_venues= getNearbyVenues(names= df3['Neighborhood'], latitudes= df3['Latitude'], longitudes= df3['Longitude'], radius= 500)

The Beaches
The Danforth West, Riverdale
The Beaches West, India Bazaar
Studio District
Lawrence Park
Davisville North
North Toronto West
Davisville
Moore Park, Summerhill East
Summerhill West, Forest Hill SE, Rathnelly, South Hill, Deer Park
Rosedale
St. James Town, Cabbagetown
Church and Wellesley
Harbourfront, Regent Park
Ryerson, Garden District
St. James Town
Berczy Park
Central Bay Street
Adelaide, King, Richmond
Union Station, Harbourfront East, Toronto Islands
Toronto Dominion Centre, Design Exchange
Commerce Court, Victoria Hotel
Roselawn
Forest Hill North, Forest Hill West
The Annex, Yorkville, North Midtown
Harbord, University of Toronto
Kensington Market, Grange Park, Chinatown
Island airport, Bathurst Quay, CN Tower, Railway Lands, Harbourfront West, South Niagara, King and Spadina
Stn A PO Boxes 25 The Esplanade
Underground city, First Canadian Place
Christie
Dufferin, Dovercourt Village
Little Portugal, Trinity
Exhibition Place, Brockton, Parkdale Village
The Junction So

In [40]:
toronto_venues.shape   

(1706, 7)

In [41]:
toronto_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Beaches,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
1,The Beaches,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
2,The Beaches,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
3,The Beaches,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood
4,"The Danforth West, Riverdale",43.679557,-79.352188,Pantheon,43.677621,-79.351434,Greek Restaurant


In [42]:
# Get the number of venues per neighborhood
toronto_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Adelaide, King, Richmond",100,100,100,100,100,100
Berczy Park,56,56,56,56,56,56
Business Reply Mail Processing Centre 969 Eastern,17,17,17,17,17,17
Central Bay Street,86,86,86,86,86,86
Christie,16,16,16,16,16,16
Church and Wellesley,86,86,86,86,86,86
"Commerce Court, Victoria Hotel",100,100,100,100,100,100
Davisville,34,34,34,34,34,34
Davisville North,9,9,9,9,9,9
"Dufferin, Dovercourt Village",17,17,17,17,17,17


Number of unique venue categories:

In [43]:
toronto_venues['Venue Category'].unique().shape[0]

236

One-hot encoding:

In [44]:
toronto_onehot= pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")
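# the spelling 'Neighbourhood' avoids overwriting the one-hot column of the 'Neighborhood' venue category
toronto_onehot['Neighbourhood']= toronto_venues['Neighborhood']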
new_columns= [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot= toronto_onehot[new_columns]
toronto_onehot.head(100)

Unnamed: 0,Neighbourhood,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
1,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
9,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Compute the mean frequency of occurrence of each category per neighborhood:

In [45]:
toronto_grouped= toronto_onehot.groupby('Neighbourhood').mean().reset_index()
toronto_grouped.head()

Unnamed: 0,Neighbourhood,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,"Adelaide, King, Richmond",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,...,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0
1,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0
2,Business Reply Mail Processing Centre 969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,...,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.011628,0.0,0.011628
4,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [46]:
toronto_grouped.shape

(38, 237)
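The one-hot encoding followed by a grouped mean turns venue counts into per-neighborhood category shares. A minimal sketch on made-up venues (the neighborhood labels "A" and "B" are placeholders):

```python
import pandas as pd

venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B"],
    "Venue Category": ["Coffee Shop", "Coffee Shop", "Park", "Pub", "Park"],
})

# One column per category, 1 where the venue belongs to it
onehot = pd.get_dummies(venues["Venue Category"], dtype=int)
onehot.insert(0, "Neighbourhood", venues["Neighborhood"])

# The mean of 0/1 indicators is the share of venues in each category
freq = onehot.groupby("Neighbourhood").mean().reset_index()
print(freq)
```

Every row of `freq` sums to 1, so neighborhoods with very different venue counts become comparable before clustering.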

Function to return the most common venue categories of a neighborhood, sorted in descending order of frequency:

In [47]:
def return_most_common_venues(row, num_top_venues):
    row_categories= row.iloc[1:]
    row_categories_sorted= row_categories.sort_values(ascending= False)
    return row_categories_sorted.index.values[: num_top_venues]
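One consequence of this function is worth illustrating: when a neighborhood has fewer distinct categories than `num_top_venues`, the remaining slots are filled with categories whose frequency is zero. The sketch below reuses the function on a made-up row and shows a hypothetical variant that keeps only non-zero categories.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]  # skip the neighborhood name
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[:num_top_venues]

# Made-up frequencies for one neighborhood
row = pd.Series({"Neighbourhood": "A", "Coffee Shop": 0.0, "Park": 0.75,
                 "Pub": 0.25, "Trail": 0.0})

# The third slot is filled by a zero-frequency category
top_three = return_most_common_venues(row, 3)
print(list(top_three))

# Hypothetical variant: keep only categories that actually occur
shares = row.iloc[1:].astype(float)
present = shares[shares > 0].sort_values(ascending=False).index.tolist()
print(present)
```

This explains why a neighborhood with only a handful of venues can still show ten "most common" categories in the tables that follow.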

Dataframe that displays the top 10 venues for each neighborhood:

In [48]:
num_top_venues= 10

indicator= ["st", "nd", "rd"]

columns= ['Neighborhood']

for i in np.arange(num_top_venues):
    try:
        columns.append("{}{} Most common venue".format(i+1, indicator[i]))
    except IndexError:
        # ranks beyond the 3rd take the "th" suffix
        columns.append("{}th Most common venue".format(i+1))
        
neighborhoods_venues_sorted= pd.DataFrame(columns= columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighbourhood']

for i in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[i, 1:]= return_most_common_venues(toronto_grouped.iloc[i, :], num_top_venues)
    
print(neighborhoods_venues_sorted.shape)
neighborhoods_venues_sorted.head(10)


(38, 11)


Unnamed: 0,Neighborhood,1st Most common venue,2nd Most common venue,3rd Most common venue,4th Most common venue,5th Most common venue,6th Most common venue,7th Most common venue,8th Most common venue,9th Most common venue,10th Most common venue
0,"Adelaide, King, Richmond",Coffee Shop,Café,Steakhouse,Bar,American Restaurant,Restaurant,Cosmetics Shop,Asian Restaurant,Burger Joint,Hotel
1,Berczy Park,Coffee Shop,Cocktail Bar,Farmers Market,Seafood Restaurant,Café,Steakhouse,Bakery,Cheese Shop,Beer Bar,Fish Market
2,Business Reply Mail Processing Centre 969 Eastern,Light Rail Station,Spa,Auto Workshop,Park,Pizza Place,Restaurant,Butcher,Burrito Place,Brewery,Skate Park
3,Central Bay Street,Coffee Shop,Italian Restaurant,Café,Middle Eastern Restaurant,Ice Cream Shop,Sandwich Place,Burger Joint,Indian Restaurant,Spa,Bar
4,Christie,Café,Grocery Store,Park,Convenience Store,Restaurant,Diner,Baby Store,Athletics & Sports,Nightclub,Italian Restaurant
5,Church and Wellesley,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Gay Bar,Restaurant,Mediterranean Restaurant,Men's Store,Gastropub,Hotel,Café
6,"Commerce Court, Victoria Hotel",Coffee Shop,Café,Hotel,Restaurant,Gym,Gastropub,American Restaurant,Steakhouse,Deli / Bodega,Italian Restaurant
7,Davisville,Coffee Shop,Pizza Place,Dessert Shop,Sandwich Place,Gym,Café,Sushi Restaurant,Italian Restaurant,Fried Chicken Joint,Pharmacy
8,Davisville North,Park,Food & Drink Shop,Dog Run,Breakfast Spot,Gym,Grocery Store,Hotel,Sandwich Place,Clothing Store,Doner Restaurant
9,"Dufferin, Dovercourt Village",Bakery,Pharmacy,Supermarket,Gym / Fitness Center,Pizza Place,Portuguese Restaurant,Music Venue,Café,Middle Eastern Restaurant,Brewery
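The column labels above are built with a three-element suffix list and an exception handler. A small helper, sketched here under the hypothetical name `ordinal`, states the rule explicitly and also covers ranks such as 21st, which the list lookup would label "21th".

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the other teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


columns = ["Neighborhood"] + [f"{ordinal(i)} Most common venue" for i in range(1, 11)]
print(columns)
```

For ten ranks the result matches the labels used in the table above.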


## Cluster neighborhoods using the k-means algorithm

In [49]:
num_k= 5

toronto_grouped_clustering= toronto_grouped.drop('Neighbourhood', axis=1)

In [50]:
from sklearn.cluster import KMeans

k_means= KMeans(n_clusters= num_k, random_state=0).fit(toronto_grouped_clustering)

k_means.labels_

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 3, 0, 2, 0, 0, 2,
       1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)
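Before interpreting the clusters, it helps to count how many rows carry each label. The sketch below copies the labels from the output above and tallies them with `np.bincount`.

```python
import numpy as np

# Labels copied from the k_means.labels_ output above
labels = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 3, 0, 2, 0, 0, 2,
                   1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

# sizes[k] is the number of rows assigned to label k
sizes = np.bincount(labels)
for cluster, size in enumerate(sizes):
    print(f"label {cluster}: {size} rows")
```

Label 0 covers 33 of the 38 rows, so the clustering is highly unbalanced and the remaining labels each describe only one or two rows.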

In [53]:
neighborhoods_venues_sorted.insert(0, 'Klabels', k_means.labels_)

toronto_merged= df3

toronto_merged= toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on= 'Neighborhood')

toronto_merged.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Klabels,1st Most common venue,2nd Most common venue,3rd Most common venue,4th Most common venue,5th Most common venue,6th Most common venue,7th Most common venue,8th Most common venue,9th Most common venue,10th Most common venue
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Pub,Trail,Neighborhood,Health Food Store,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Diner,Electronics Store
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,0,Greek Restaurant,Coffee Shop,Italian Restaurant,Ice Cream Shop,Furniture / Home Store,Pub,Bookstore,Brewery,Bubble Tea Shop,Burger Joint
2,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572,0,Board Shop,Ice Cream Shop,Park,Liquor Store,Fast Food Restaurant,Burger Joint,Burrito Place,Fish & Chips Shop,Sandwich Place,Steakhouse
3,M4M,East Toronto,Studio District,43.659526,-79.340923,0,Café,Coffee Shop,Italian Restaurant,Bakery,American Restaurant,Yoga Studio,Park,Brewery,Seafood Restaurant,Sandwich Place
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,3,Park,Lawyer,Swim School,Bus Line,Yoga Studio,Dog Run,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space


In [55]:
!conda install -c conda-forge folium=0.5.0 --yes


## Visualize the clusters on a map of Toronto

In [None]:
import folium
import matplotlib.cm as cm
import matplotlib.colors as colors

In [93]:
map_toronto= folium.Map(location= [latitude, longitude], zoom_start= 12)

# one color per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, num_k))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Klabels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        # index by the label itself; cluster-1 maps label 0 to the last color via index -1
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_toronto)

map_toronto

## Cluster analysis

### Cluster 1:

In [85]:
toronto_merged.loc[toronto_merged['Klabels']==0, toronto_merged.columns[[2]+ list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Klabels,1st Most common venue,2nd Most common venue,3rd Most common venue,4th Most common venue,5th Most common venue,6th Most common venue,7th Most common venue,8th Most common venue,9th Most common venue,10th Most common venue
0,The Beaches,0,Pub,Trail,Neighborhood,Health Food Store,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Diner,Electronics Store
1,"The Danforth West, Riverdale",0,Greek Restaurant,Coffee Shop,Italian Restaurant,Ice Cream Shop,Furniture / Home Store,Pub,Bookstore,Brewery,Bubble Tea Shop,Burger Joint
2,"The Beaches West, India Bazaar",0,Board Shop,Ice Cream Shop,Park,Liquor Store,Fast Food Restaurant,Burger Joint,Burrito Place,Fish & Chips Shop,Sandwich Place,Steakhouse
3,Studio District,0,Café,Coffee Shop,Italian Restaurant,Bakery,American Restaurant,Yoga Studio,Park,Brewery,Seafood Restaurant,Sandwich Place
5,Davisville North,0,Park,Food & Drink Shop,Dog Run,Breakfast Spot,Gym,Grocery Store,Hotel,Sandwich Place,Clothing Store,Doner Restaurant
6,North Toronto West,0,Coffee Shop,Sporting Goods Shop,Clothing Store,Mexican Restaurant,Miscellaneous Shop,Diner,Dessert Shop,Park,Gym / Fitness Center,Chinese Restaurant
7,Davisville,0,Coffee Shop,Pizza Place,Dessert Shop,Sandwich Place,Gym,Café,Sushi Restaurant,Italian Restaurant,Fried Chicken Joint,Pharmacy
9,"Summerhill West, Forest Hill SE, Rathnelly, So...",0,Coffee Shop,Pub,Liquor Store,Light Rail Station,Sports Bar,Supermarket,Sushi Restaurant,Bagel Shop,Restaurant,Fried Chicken Joint
11,"St. James Town, Cabbagetown",0,Coffee Shop,Park,Café,Bakery,Restaurant,Pub,Pizza Place,Pharmacy,Italian Restaurant,Jewelry Store
12,Church and Wellesley,0,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Gay Bar,Restaurant,Mediterranean Restaurant,Men's Store,Gastropub,Hotel,Café


Cluster 1 contains by far the largest number of neighborhoods. It seems to group neighborhoods where people go to eat, drink or have a coffee.

Cluster name: "Restaurants, bars & coffee shops"

### Cluster 2:

In [78]:
toronto_merged.loc[toronto_merged['Klabels']==1, toronto_merged.columns[[2]+ list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Klabels,1st Most common venue,2nd Most common venue,3rd Most common venue,4th Most common venue,5th Most common venue,6th Most common venue,7th Most common venue,8th Most common venue,9th Most common venue,10th Most common venue
22,Roselawn,1,Garden,Yoga Studio,Fish & Chips Shop,Festival,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store


Cluster 2 contains only one neighborhood, "Roselawn", which seems to be known for its gardens.

Cluster name: "Gardens"

### Cluster 3:

In [79]:
toronto_merged.loc[toronto_merged['Klabels']==2, toronto_merged.columns[[2]+ list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Klabels,1st Most common venue,2nd Most common venue,3rd Most common venue,4th Most common venue,5th Most common venue,6th Most common venue,7th Most common venue,8th Most common venue,9th Most common venue,10th Most common venue
8,"Moore Park, Summerhill East",2,Playground,Park,Summer Camp,Tennis Court,Diner,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store
10,Rosedale,2,Park,Playground,Trail,Building,Diner,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store


Cluster 3 contains only three neighborhoods (Moore Park, Summerhill East and Rosedale), which seem to be known for their parks and playgrounds.

Cluster name: "Parks & Playgrounds"

### Cluster 4:

In [80]:
toronto_merged.loc[toronto_merged['Klabels']==3, toronto_merged.columns[[2]+ list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Klabels,1st Most common venue,2nd Most common venue,3rd Most common venue,4th Most common venue,5th Most common venue,6th Most common venue,7th Most common venue,8th Most common venue,9th Most common venue,10th Most common venue
4,Lawrence Park,3,Park,Lawyer,Swim School,Bus Line,Yoga Studio,Dog Run,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space


Cluster 4 contains only one neighborhood, "Lawrence Park", which seems to be known for its park.

Cluster name: "Parks"

### Cluster 5:

In [81]:
toronto_merged.loc[toronto_merged['Klabels']==4, toronto_merged.columns[[2]+ list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Klabels,1st Most common venue,2nd Most common venue,3rd Most common venue,4th Most common venue,5th Most common venue,6th Most common venue,7th Most common venue,8th Most common venue,9th Most common venue,10th Most common venue
23,"Forest Hill North, Forest Hill West",4,Jewelry Store,Sushi Restaurant,Park,Trail,Yoga Studio,Eastern European Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Ethiopian Restaurant


Cluster 5 contains only two neighborhoods, "Forest Hill North" and "Forest Hill West", which seem to be known for jewelry stores.

Cluster name: "Jewelry Stores"