# Segmenting and Clustering Neighborhoods in Toronto

## 1. Gather Neighborhoods in Toronto

This following Wikipedia page contains postal codes, Borough and Neighborhood information in Toronto, scraping the website use BeautifulSoup to extract the data.

In [202]:
from bs4 import BeautifulSoup
import requests
import pandas as pd

Create soup object of our Wiki page.

In [203]:
html = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
source = requests.get(html).text
soup = BeautifulSoup(source, 'lxml')

From the returned html structure, our interested information is in tag 'tr' and 'td', let's extract the contents and make them into a list.

In [204]:
templist = []
for line in soup.find_all('tr'):
    if len(line.find_all('th')) != 0:
        templist.append([item.text.strip('\n') for item in line.find_all('th')])
    else:
        templist.append([item.text.strip('\n') for item in line.find_all('td')])

Convert the templist into a pandas dataframe, and look print out the size

In [205]:
df = pd.DataFrame([item[0:3] for item in templist[1:-5]])
df.columns = templist[0]
print(df.shape)
df.head()

(289, 3)


Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


#### Process the dataframe, first remove all records that Borough is 'Not assigned'.

In [206]:
df = df[df.Borough != 'Not assigned'] 
print(df.shape)
df.head()

(212, 3)


Unnamed: 0,Postcode,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
5,M5A,Downtown Toronto,Regent Park
6,M6A,North York,Lawrence Heights


#### Second, for records that has Borough information, but without Neighbourhood, assign Borough to Neighbourhood.

In [207]:
temp = df[df['Neighbourhood'] =='Not assigned'].Borough
df.loc[df['Neighbourhood'] == 'Not assigned', 'Neighbourhood']=temp
df = df.reset_index(drop=True)
print(df.shape)
df.head()

(212, 3)


Unnamed: 0,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M5A,Downtown Toronto,Regent Park
4,M6A,North York,Lawrence Heights


#### Third, group records that have the same postcode and Borough into one row with the neighbourhoods, comma delimited.

In [208]:
temp = df.groupby(['Postcode','Borough'])['Neighbourhood'].apply(lambda x: '%s' % ','.join(x))
df1 = pd.DataFrame(temp.index.get_level_values(0))
df2 = pd.DataFrame(temp.index.get_level_values(1))
df3 = pd.DataFrame(temp.values)
df4 = pd.concat([df1, df2, df3], axis=1)
df4.columns = ['Postcode', 'Borough', 'Neighbourhood']
pd.set_option('display.max_colwidth', -1)
# df4[df4.Postcode == 'M9V']


To exam if rows are correctly combined, check one postcode that has different neighbourhoods, M9V.

In [209]:
print(df4[df4.Postcode == 'M9V']['Neighbourhood'])
df[df.Postcode == 'M9V']

101    Albion Gardens,Beaumond Heights,Humbergate,Jamestown,Mount Olive,Silverstone,South Steeles,Thistletown
Name: Neighbourhood, dtype: object


Unnamed: 0,Postcode,Borough,Neighbourhood
174,M9V,Etobicoke,Albion Gardens
175,M9V,Etobicoke,Beaumond Heights
176,M9V,Etobicoke,Humbergate
177,M9V,Etobicoke,Jamestown
178,M9V,Etobicoke,Mount Olive
179,M9V,Etobicoke,Silverstone
180,M9V,Etobicoke,South Steeles
181,M9V,Etobicoke,Thistletown


Let's see the final data frame size

In [210]:
print(df4.shape)
df4.head()

(103, 3)


Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge,Malvern"
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union"
2,M1E,Scarborough,"Guildwood,Morningside,West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


## 2. Assign coordinates to Neighborhoods

Now we have the postal code of each neighborhood along with the borough/neighborhood name, in order to utilize the Foursquare location data, we need to get the Lat, Lon of each neighborhood.

In [211]:
# !conda install -c conda-forge geocoder --yes

In [212]:
import geocoder

def getCoord(postal_code):
    lat_lng_coords = None
    print(postal_code)
    while(lat_lng_coords is None):
        print('looping...')
        g = geocoder.google('{}, Toronto, Ontario'.format(postal_code))
        lat_lng_coords = g.latlng
        print(g, g.latlng)
        
    return lat_lng_coord[0], lat_lng_coord[1]
print("Coordinates collector is ready to use!")

Coordinates collector is ready to use!


Because get lat lon from geocoder.google is taking too long for even a single postal code, we are going to use the csv file that already contains coordinates we are looking for.

In [213]:
CoordList = 'Geospatial_Coordinates.csv'
coord_df = pd.read_csv(CoordList)
coord_df.columns = ['Postcode', 'Latitude', 'Longitude']
coord_df.head()

Unnamed: 0,Postcode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


Now combine this coord_df with previous dataframe we scraped from wikipedia.

In [214]:
toronto_df = pd.merge(df4, coord_df, on='Postcode')
toronto_df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge,Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood,Morningside,West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476


**Create a map of Toronto with neighborhoods superimposed on top.**

In [215]:
import json # library to handle JSON files

# !conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe
import numpy as np
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

# !conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

In [216]:
geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode('Toronto, Ontario')
latitude = location.latitude
longitude = location.longitude
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(toronto_df['Latitude'], toronto_df['Longitude'], toronto_df['Borough'], toronto_df['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

## 3. Explore Neighborhoods in Toronto

FourSquare API is used here to explore venues and their basic info around each Neighborhood.

First, define FourSquare credentials

In [217]:
CLIENT_ID = 'VTNJI1NBXEBGDETQPO5ITG5ZNRCR0LTLH4BON0L0FMCCQ0YD' # your Foursquare ID
CLIENT_SECRET = 'JCU14KCM31VSSPKWD4H3EFZYB32WD2YBCGCTSM045TANBYII' # your Foursquare Secret
VERSION = '201801122' # Foursquare API version
LIMIT = 500

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: VTNJI1NBXEBGDETQPO5ITG5ZNRCR0LTLH4BON0L0FMCCQ0YD
CLIENT_SECRET:JCU14KCM31VSSPKWD4H3EFZYB32WD2YBCGCTSM045TANBYII


#### Let's create a function to repeat the process of exploring venues to all the neighborhoods in Toronto, radius is set to 1500 meters to ensure each neighborhood could get at least one venue nearby

In [218]:
def getNearbyVenues(names, latitudes, longitudes, radius=1500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [220]:
toronto_venues = getNearbyVenues(names=toronto_df['Neighbourhood'],
                                   latitudes=toronto_df['Latitude'],
                                   longitudes=toronto_df['Longitude']
                                  )

Take a look at returned venues dataframe of Toronto.

In [221]:
print(toronto_venues.shape)
toronto_venues.head()

(6747, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Rouge,Malvern",43.806686,-79.194353,Images Salon & Spa,43.802283,-79.198565,Spa
1,"Rouge,Malvern",43.806686,-79.194353,Canadiana exhibit,43.817962,-79.193374,Zoo Exhibit
2,"Rouge,Malvern",43.806686,-79.194353,Caribbean Wave,43.798558,-79.195777,Caribbean Restaurant
3,"Rouge,Malvern",43.806686,-79.194353,Harvey's,43.800106,-79.198258,Fast Food Restaurant
4,"Rouge,Malvern",43.806686,-79.194353,Wendy's,43.802008,-79.19808,Fast Food Restaurant


6747 venues in total were found, let's see how many venues were returned for each neighborhood, and how many unique categories among them.

In [222]:
toronto_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Adelaide,King,Richmond",100,100,100,100,100,100
Agincourt,59,59,59,59,59,59
"Agincourt North,L'Amoreaux East,Milliken,Steeles East",73,73,73,73,73,73
"Albion Gardens,Beaumond Heights,Humbergate,Jamestown,Mount Olive,Silverstone,South Steeles,Thistletown",29,29,29,29,29,29
"Alderwood,Long Branch",47,47,47,47,47,47
"Bathurst Manor,Downsview North,Wilson Heights",42,42,42,42,42,42
Bayview Village,15,15,15,15,15,15
"Bedford Park,Lawrence Manor East",75,75,75,75,75,75
Berczy Park,100,100,100,100,100,100
"Birch Cliff,Cliffside West",14,14,14,14,14,14


In [223]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 335 unique categories.


Check if we successfully explored all neighborhoods, there should be 103 neighborhoods.

In [224]:
print('There are {} unique neighborhood.'.format(len(toronto_venues['Neighborhood'].unique())))

There are 103 unique neighborhood.


## 4. Analyze Each Neighborhood

#### Encoding categorized value 'Venue Catgory'

In [225]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")
# add neighborhood column back to dataframe
toronto_onehot['Neighbourhood'] = toronto_venues['Neighborhood']

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Neighbourhood,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,Airport,Airport Lounge,American Restaurant,Amphitheater,Animal Shelter,...,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Xinjiang Restaurant,Yoga Studio,Zoo,Zoo Exhibit
0,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
2,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Rouge,Malvern",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### Next, let's group rows by neighborhood and by taking the mean of the frequency of occurrence of each category

In [226]:
toronto_grouped = toronto_onehot.groupby('Neighbourhood').mean().reset_index()
print(toronto_grouped.shape)
toronto_grouped.head()

(103, 336)


Unnamed: 0,Neighbourhood,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,Airport,Airport Lounge,American Restaurant,Amphitheater,Animal Shelter,...,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Xinjiang Restaurant,Yoga Studio,Zoo,Zoo Exhibit
0,"Adelaide,King,Richmond",0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0
2,"Agincourt North,L'Amoreaux East,Milliken,Steeles East",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Albion Gardens,Beaumond Heights,Humbergate,Jamestown,Mount Olive,Silverstone,South Steeles,Thistletown",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Alderwood,Long Branch",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0


#### Let's get each neighborhood along with the top 10 most common venues

In [227]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


In [228]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common'.format(ind+1))

neighborhoods_venues_sorted = pd.DataFrame(columns = columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighbourhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)
    
    neighborhoods_venues_sorted

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common,5th Most Common,6th Most Common,7th Most Common,8th Most Common,9th Most Common,10th Most Common
0,"Adelaide,King,Richmond",Coffee Shop,Hotel,Café,Gastropub,Theater,Pizza Place,Thai Restaurant,Japanese Restaurant,Concert Hall,Italian Restaurant
1,Agincourt,Chinese Restaurant,Coffee Shop,Park,Gym / Fitness Center,Breakfast Spot,Caribbean Restaurant,Cantonese Restaurant,Japanese Restaurant,Hong Kong Restaurant,Bakery
2,"Agincourt North,L'Amoreaux East,Milliken,Steeles East",Chinese Restaurant,BBQ Joint,Bakery,Coffee Shop,Korean Restaurant,Pizza Place,Noodle House,Bubble Tea Shop,Pharmacy,Tea Room
3,"Albion Gardens,Beaumond Heights,Humbergate,Jamestown,Mount Olive,Silverstone,South Steeles,Thistletown",Coffee Shop,Fast Food Restaurant,Grocery Store,Pizza Place,Convenience Store,Hardware Store,Steakhouse,Bus Line,Fried Chicken Joint,Café
4,"Alderwood,Long Branch",Pizza Place,Bar,Pharmacy,Grocery Store,Restaurant,Park,Coffee Shop,Light Rail Station,Toy / Game Store,Café
5,"Bathurst Manor,Downsview North,Wilson Heights",Park,Coffee Shop,Pizza Place,Convenience Store,Bus Line,Restaurant,Diner,Chinese Restaurant,Gift Shop,Bank
6,Bayview Village,Japanese Restaurant,Bank,Trail,Convenience Store,Fast Food Restaurant,Café,Skate Park,Shopping Mall,Park,Chinese Restaurant
7,"Bedford Park,Lawrence Manor East",Bakery,Coffee Shop,Sushi Restaurant,Bagel Shop,Italian Restaurant,Café,Pizza Place,Tea Room,Grocery Store,Burger Joint
8,Berczy Park,Café,Coffee Shop,Hotel,Restaurant,Park,Japanese Restaurant,Gastropub,Italian Restaurant,Bakery,Diner
9,"Birch Cliff,Cliffside West",Park,Restaurant,Golf Course,Gym,General Entertainment,Skating Rink,Diner,Café,Thai Restaurant,Bank


## 4. Cluster Neighborhoods

Run k-means to cluster the neighborhood into 5 clusters

In [229]:
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop('Neighbourhood', 1)

kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

len(kmeans.labels_)

103

#### Let's visualize the resulting clusters.

In [230]:
toronto_merged = toronto_df
toronto_merged['Neighborhood'] = toronto_merged['Neighbourhood']
toronto_merged['Cluster Labels'] = kmeans.labels_
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_grouped_clustering.shape


(103, 335)

In [231]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start = 10)

x = np.arange(kclusters)
ys = [i+x+(i+x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon], 
        radius = 5,
        popup = label, 
        color = rainbow[cluster-1],
        fill = True,
        fill_color = rainbow[cluster-1],
        fill_opacity = 0.7).add_to(map_clusters)

map_clusters

## 6. Examine Clusters

Now, let's examine each cluster and determine the discriminating venue categories that distinguish each cluster.

In [232]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common,5th Most Common,6th Most Common,7th Most Common,8th Most Common,9th Most Common,10th Most Common
0,Scarborough,"Rouge,Malvern",0,Zoo Exhibit,Fast Food Restaurant,Pizza Place,Video Game Store,Movie Theater,Cosmetics Shop,Fruit & Vegetable Store,Liquor Store,Caribbean Restaurant,Bakery
8,Scarborough,"Cliffcrest,Cliffside,Scarborough Village West",0,Park,Harbor / Marina,Pharmacy,Fast Food Restaurant,Pizza Place,Sandwich Place,Ice Cream Shop,Beach,Grocery Store,Sporting Goods Shop
11,Scarborough,"Maryvale,Wexford",0,Middle Eastern Restaurant,Coffee Shop,Pizza Place,Grocery Store,Mediterranean Restaurant,Restaurant,Breakfast Spot,Sandwich Place,Indian Chinese Restaurant,Fish Market
12,Scarborough,Agincourt,0,Chinese Restaurant,Coffee Shop,Park,Gym / Fitness Center,Breakfast Spot,Caribbean Restaurant,Cantonese Restaurant,Japanese Restaurant,Hong Kong Restaurant,Bakery
15,Scarborough,"L'Amoreaux West,Steeles West",0,Chinese Restaurant,Coffee Shop,Fast Food Restaurant,Tennis Court,Pizza Place,Pool,Sandwich Place,Bank,Grocery Store,Shopping Mall
19,North York,Bayview Village,0,Japanese Restaurant,Bank,Trail,Convenience Store,Fast Food Restaurant,Café,Skate Park,Shopping Mall,Park,Chinese Restaurant
20,North York,"Silver Hills,York Mills",0,Coffee Shop,Furniture / Home Store,Bank,Burger Joint,Japanese Restaurant,Butcher,Park,Pharmacy,Supermarket,Liquor Store
21,North York,"Newtonbrook,Willowdale",0,Korean Restaurant,Coffee Shop,Bubble Tea Shop,Café,Japanese Restaurant,Shopping Mall,Juice Bar,Pharmacy,Bank,Fast Food Restaurant
22,North York,Willowdale South,0,Korean Restaurant,Coffee Shop,Bubble Tea Shop,Japanese Restaurant,Café,Ramen Restaurant,Pizza Place,Grocery Store,Sushi Restaurant,Dessert Shop
27,North York,"Flemingdon Park,Don Mills South",0,Coffee Shop,Japanese Restaurant,Gym,Middle Eastern Restaurant,Sandwich Place,Restaurant,Fast Food Restaurant,Asian Restaurant,Grocery Store,Italian Restaurant


#### Number of international restaurants is outstanding in category 1 neighborhood, therefore this category 1 could named as International Neighborhood. People who love the taste of exotic or crave of a bite from their home countries are most attracted here.

In [233]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common,5th Most Common,6th Most Common,7th Most Common,8th Most Common,9th Most Common,10th Most Common
4,Scarborough,Cedarbrae,1,Coffee Shop,Clothing Store,Sandwich Place,Indian Restaurant,Hotel,Restaurant,Fast Food Restaurant,Jewelry Store,Bakery,Gym
7,Scarborough,"Clairlea,Golden Mile,Oakridge",1,Coffee Shop,Pizza Place,Grocery Store,Burger Joint,Sandwich Place,Park,Bus Stop,Diner,Electronics Store,Beer Store
16,Scarborough,Upper Rouge,1,National Park,Farm,Zoo Exhibit,Fast Food Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Eye Doctor,Fabric Shop,Falafel Restaurant
17,North York,Hillcrest Village,1,Chinese Restaurant,Coffee Shop,Bank,Sandwich Place,Park,Pharmacy,Sushi Restaurant,Bakery,Pizza Place,Grocery Store
18,North York,"Fairview,Henry Farm,Oriole",1,Coffee Shop,Clothing Store,Park,Middle Eastern Restaurant,Japanese Restaurant,Sandwich Place,Fast Food Restaurant,Bakery,Pharmacy,Sushi Restaurant
23,North York,York Mills West,1,Coffee Shop,Bank,Japanese Restaurant,Sandwich Place,Fried Chicken Joint,Pharmacy,Pizza Place,Korean Restaurant,Grocery Store,Gym
29,North York,"Northwood Park,York University",1,Coffee Shop,Restaurant,Fast Food Restaurant,Pizza Place,Bar,Japanese Restaurant,Furniture / Home Store,Shopping Mall,Greek Restaurant,Middle Eastern Restaurant
31,North York,Downsview West,1,Park,Grocery Store,Moving Target,Tea Room,Coffee Shop,Plaza,Vietnamese Restaurant,Bank,Pizza Place,Fabric Shop
33,North York,Downsview Northwest,1,Tennis Stadium,Pharmacy,Fast Food Restaurant,Theater,Sandwich Place,Hotel,Pizza Place,Coffee Shop,Fried Chicken Joint,Kitchen Supply Store
41,East Toronto,"The Danforth West,Riverdale",1,Greek Restaurant,Café,Park,Coffee Shop,Pizza Place,Bakery,Pub,Vietnamese Restaurant,Grocery Store,Diner


#### Category 2 has plenty of coffee shop, spa and Bar, we named it Leisure Neighborhood, these are nice area to enjoy a nice layback morning or hangout with a couple friends.

In [234]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common,5th Most Common,6th Most Common,7th Most Common,8th Most Common,9th Most Common,10th Most Common
93,Etobicoke,Islington Avenue,2,Grocery Store,Pharmacy,Convenience Store,Japanese Restaurant,Bus Line,Café,Shopping Mall,Liquor Store,Golf Course,Bank


#### Category 3's top three venues are grocery store, parmacy and convenience store, named as 'Shopping Neighborhood', very convenient neighborhoods that everything you need is just a couple minutes away.

In [235]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]


Unnamed: 0,Borough,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common,5th Most Common,6th Most Common,7th Most Common,8th Most Common,9th Most Common,10th Most Common
1,Scarborough,"Highland Creek,Rouge Hill,Port Union",3,Gym,Breakfast Spot,Neighborhood,Burger Joint,Italian Restaurant,Grocery Store,Pizza Place,Farm,Electronics Store,Ethiopian Restaurant
2,Scarborough,"Guildwood,Morningside,West Hill",3,Pizza Place,Coffee Shop,Fast Food Restaurant,Sports Bar,Breakfast Spot,Greek Restaurant,Bank,Liquor Store,Beer Store,Supermarket
3,Scarborough,Woburn,3,Coffee Shop,Pharmacy,Indian Restaurant,Sandwich Place,Pizza Place,Fast Food Restaurant,Chinese Restaurant,Juice Bar,Thrift / Vintage Store,Supermarket
24,North York,Willowdale West,3,Coffee Shop,Pizza Place,Bagel Shop,Pharmacy,Park,Sandwich Place,Intersection,Business Service,Bakery,Bank
34,North York,Victoria Village,3,Coffee Shop,Middle Eastern Restaurant,Gym,Fast Food Restaurant,Grocery Store,Gym / Fitness Center,Chinese Restaurant,Camera Store,Beer Store,Sandwich Place
36,East York,Woodbine Heights,3,Pizza Place,Coffee Shop,Park,Thai Restaurant,Gastropub,Sandwich Place,Pharmacy,Breakfast Spot,Bar,Grocery Store
37,East Toronto,The Beaches,3,Coffee Shop,Pub,Beach,Breakfast Spot,Bakery,Ice Cream Shop,Japanese Restaurant,Bar,Sandwich Place,Grocery Store
39,East York,Thorncliffe Park,3,Coffee Shop,Sandwich Place,Grocery Store,Restaurant,Burger Joint,Indian Restaurant,Sporting Goods Shop,Electronics Store,Greek Restaurant,Bank
46,Central Toronto,North Toronto West,3,Coffee Shop,Italian Restaurant,Fast Food Restaurant,Café,Sushi Restaurant,Bakery,Japanese Restaurant,Diner,Thai Restaurant,Food & Drink Shop
47,Central Toronto,Davisville,3,Coffee Shop,Italian Restaurant,Indian Restaurant,Café,Bakery,Gym,Sushi Restaurant,Restaurant,Japanese Restaurant,Pizza Place


#### Category 4 is a "Balanced Neighborhood", in these neighborhood you likely to find varies types of venues that fit different services, such as Park, Gym, Bank, etc.

In [236]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common,5th Most Common,6th Most Common,7th Most Common,8th Most Common,9th Most Common,10th Most Common
5,Scarborough,Scarborough Village,4,Sandwich Place,Grocery Store,Pharmacy,Coffee Shop,Fast Food Restaurant,Ice Cream Shop,Wings Joint,Chinese Restaurant,Pizza Place,Liquor Store
6,Scarborough,"East Birchmount Park,Ionview,Kennedy Park",4,Coffee Shop,Fast Food Restaurant,Sandwich Place,Chinese Restaurant,Pizza Place,Grocery Store,Train Station,Pharmacy,Sporting Goods Shop,Beer Store
9,Scarborough,"Birch Cliff,Cliffside West",4,Park,Restaurant,Golf Course,Gym,General Entertainment,Skating Rink,Diner,Café,Thai Restaurant,Bank
10,Scarborough,"Dorset Park,Scarborough Town Centre,Wexford Heights",4,Coffee Shop,Fast Food Restaurant,Pizza Place,Grocery Store,Chinese Restaurant,Indian Restaurant,Wings Joint,Pet Store,Light Rail Station,Burger Joint
13,Scarborough,"Clarks Corners,Sullivan,Tam O'Shanter",4,Chinese Restaurant,Fast Food Restaurant,Park,Pharmacy,Falafel Restaurant,Pizza Place,Deli / Bodega,Sandwich Place,Cantonese Restaurant,Vietnamese Restaurant
14,Scarborough,"Agincourt North,L'Amoreaux East,Milliken,Steeles East",4,Chinese Restaurant,BBQ Joint,Bakery,Coffee Shop,Korean Restaurant,Pizza Place,Noodle House,Bubble Tea Shop,Pharmacy,Tea Room
25,North York,Parkwoods,4,Coffee Shop,Park,Bank,Supermarket,Bus Stop,Pharmacy,Paper / Office Supplies Store,Caribbean Restaurant,Beer Store,Shop & Service
26,North York,Don Mills North,4,Japanese Restaurant,Coffee Shop,Burger Joint,Park,Bank,Pizza Place,Restaurant,Middle Eastern Restaurant,Italian Restaurant,Bar
38,East York,Leaside,4,Coffee Shop,Indian Restaurant,Restaurant,Supermarket,Bakery,Burger Joint,Grocery Store,Pizza Place,Electronics Store,Sandwich Place
54,Downtown Toronto,"Ryerson,Garden District",4,Coffee Shop,Gastropub,American Restaurant,Diner,Cosmetics Shop,Tea Room,Restaurant,Ramen Restaurant,Italian Restaurant,Japanese Restaurant


#### Category 5 is 'Busy-commuter Neighborhood', as the top number venues are fast food restaurants, pizza place, Burger Joint, etc.