# Neighborhoods in Toronto

- Segmenting and clustering neighborhoods in the city of Toronto, Canada
- Coursera_Capstone week 3 project

## 1. Data Cleaning

### Import library

In [2]:
import pandas as pd
import numpy as np

### Dataframe for postal codes

In [3]:
pcode=pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')

In [4]:
pcode_df=pcode[0]

In [5]:
pcode_df.rename(columns={'Neighbourhood':'Neighborhood'},inplace=True)

### Ignore "Not assigned" in borough

In [6]:
clean_pcode_df=pcode_df[pcode_df['Borough']!='Not assigned']

### Combine neighborhoods with the same postal code

In [7]:
comb_pcode_df=clean_pcode_df.groupby('Postal Code',as_index=False).agg({'Borough':'first','Neighborhood':lambda x: ','.join(x)})
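The grouping step above can be tried on a tiny frame first. This is a minimal sketch with made-up rows (the `toy` frame and its values are assumptions for illustration, not the scraped table); it shows how `groupby(...).agg(...)` keeps the first borough and joins neighborhood names per postal code.

```python
import pandas as pd

# Toy stand-in for the scraped table; values are illustrative only.
toy = pd.DataFrame({
    "Postal Code": ["M5A", "M5A", "M6A"],
    "Borough": ["Downtown Toronto", "Downtown Toronto", "North York"],
    "Neighborhood": ["Harbourfront", "Regent Park", "Lawrence Heights"],
})

# One row per postal code: first borough, neighborhoods joined into one string.
combined = toy.groupby("Postal Code", as_index=False).agg(
    {"Borough": "first", "Neighborhood": lambda s: ", ".join(s)}
)
print(combined)
```

If the source table already lists one row per postal code, the aggregation should leave it unchanged.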

### Copy 'Borough' to 'Neighborhood' if 'Neighborhood' is 'Not assigned'

In [8]:
mask = comb_pcode_df['Neighborhood'] == 'Not assigned'
if mask.any():
    # boolean-mask assignment has to go through .loc
    comb_pcode_df.loc[mask, 'Neighborhood'] = comb_pcode_df.loc[mask, 'Borough']
else:
    print("All neighborhoods are assigned")

All neighborhoods are assigned
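Conditional cell updates in pandas are normally written with `.loc` and a boolean mask; indexing the frame directly with a `(mask, column)` pair raises an error instead of selecting cells. A minimal sketch on a hypothetical two-row frame (the rows are invented for illustration):

```python
import pandas as pd

# Hypothetical rows: the first one has a borough but no neighborhood name.
toy = pd.DataFrame({
    "Borough": ["Queen's Park", "Etobicoke"],
    "Neighborhood": ["Not assigned", "Islington Avenue"],
})

mask = toy["Neighborhood"] == "Not assigned"
# Copy the borough name into the neighborhood column only where the mask is True.
toy.loc[mask, "Neighborhood"] = toy.loc[mask, "Borough"]
print(toy)
```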


In [9]:
print(comb_pcode_df.shape)

(103, 3)


## 2. Geo information

###  Get the latitude and the longitude coordinates of each neighborhood (with geocoder python library) - error

In [10]:
#!pip install geocoder

In [11]:
# import geocoder

# # initialize your variable to None
# lat_lng_coords = None
# postal_code = 'M5G'

# # loop until you get the coordinates
# # while(lat_lng_coords is None):
# g = geocoder.google('{}, Toronto, Ontario'.format(postal_code))
# lat_lng_coords = g.latlng

# latitude = lat_lng_coords[0]
# longitude = lat_lng_coords[1]

###  Get the latitude and the longitude coordinates of each neighborhood (with a csv file)

In [12]:
geopartial_df=pd.read_csv('http://cocl.us/Geospatial_data')

###  Add the latitude and the longitude in the neighborhood table

In [13]:
loc_pcode_df=comb_pcode_df.merge(geopartial_df,how='left',on='Postal Code')
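A left merge keeps every postal code even when the coordinates file has no match, so it can be worth checking for unmatched rows. The sketch below assumes toy frames (`codes`, `coords`) and a made-up postal code `M9Z` that has no coordinates; `indicator=True` adds a `_merge` column that marks such rows.

```python
import pandas as pd

codes = pd.DataFrame({"Postal Code": ["M1B", "M1C", "M9Z"]})  # M9Z is hypothetical
coords = pd.DataFrame({
    "Postal Code": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

merged = codes.merge(coords, how="left", on="Postal Code", indicator=True)
# Rows found only in the left frame have no coordinates.
missing = merged.loc[merged["_merge"] == "left_only", "Postal Code"].tolist()
print(missing)
```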

In [14]:
loc_pcode_df

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"Kennedy Park, Ionview, East Birchmount Park",43.727929,-79.262029
7,M1L,Scarborough,"Golden Mile, Clairlea, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffside, Cliffcrest, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


## 3. Segmenting and Clustering Neighborhoods in Toronto

In [16]:
neighborhoods=loc_pcode_df[['Borough','Neighborhood','Latitude','Longitude']]

### Import library

In [83]:
#!conda install -c conda-forge geopy --yes # Error
#!pip install geopy
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values


#!conda install -c conda-forge folium=0.5.0 # Error
#!pip install folium # Successful Install

# import library
import folium # map rendering library

import requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0)

from sklearn.cluster import KMeans

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

###  The number of boroughs and neighborhoods

In [36]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 10 boroughs and 103 neighborhoods.


###  Use geopy library to get the latitude and longitude values of Toronto

In [37]:
address = 'Toronto, CA'

geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


###  Create a map of Toronto with neighborhoods superimposed on top.

In [38]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

### Define Foursquare Credentials and Version (hidden cell)

In [40]:
# The code was removed by Watson Studio for sharing.

Your credentials:
CLIENT_ID: <redacted>
CLIENT_SECRET: <redacted>


### Let's explore the first borough in our dataframe.

Get the borough's name.

In [42]:
neighborhoods.loc[0, 'Borough']

'Scarborough'

Get the neighborhood's latitude and longitude values.

In [45]:
neighborhood_latitude = neighborhoods.loc[0, 'Latitude'] # borough latitude value
neighborhood_longitude = neighborhoods.loc[0, 'Longitude'] # borough longitude value
neighborhood_name = neighborhoods.loc[0, 'Borough'] # borough name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Scarborough are 43.806686299999996, -79.19435340000001.


### Now, let's get the top 100 venues that are in Scarborough within a radius of 500 meters.

First, let's create the GET request URL and store it in `url`. `VERSION`, like `CLIENT_ID` and `CLIENT_SECRET`, is defined in the hidden credentials cell.

In [46]:
LIMIT=100
radius=500
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
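As an alternative to formatting the query string by hand, the parameters can be assembled with the standard library. This is only a sketch: `build_explore_url` is a hypothetical helper, and the credential and version values passed to it are placeholders, not real keys.

```python
from urllib.parse import urlencode

def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Assemble a venues/explore URL from named parameters (hypothetical helper)."""
    base = "https://api.foursquare.com/v2/venues/explore"
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in the ll parameter unescaped, as in the hand-built URL.
    return f"{base}?{urlencode(params, safe=',')}"

# Placeholder credentials and version string, for illustration only.
example_url = build_explore_url("ID", "SECRET", "20180605", 43.806686, -79.194353)
print(example_url)
```

Keeping the parameters in a dictionary makes it easier to change `radius` or `limit` without touching the URL template.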

Send the GET request and examine the results

In [48]:
results = requests.get(url).json()

Let's borrow the **get_category_type** function from the Foursquare lab.

In [49]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

Clean the json and structure it into a pandas dataframe.

In [51]:
venues = results['response']['groups'][0]['items']

nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Wendy’s,Fast Food Restaurant,43.807448,-79.199056
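The flattening step can be tested offline on a hand-written payload. The `items` list below is an assumption that mimics the nested `venue` structure used in this notebook (it is not real API output); it also includes a venue with no categories to show how that case comes out.

```python
import pandas as pd

# Hand-written items shaped like the 'items' list used above (illustrative only).
items = [
    {"venue": {"name": "Cafe A",
               "categories": [{"name": "Coffee Shop"}],
               "location": {"lat": 43.65, "lng": -79.38}}},
    {"venue": {"name": "Spot B",
               "categories": [],
               "location": {"lat": 43.66, "lng": -79.39}}},
]

flat = pd.json_normalize(items)  # nested keys become dotted column names
flat = flat[["venue.name", "venue.categories",
             "venue.location.lat", "venue.location.lng"]].copy()
# Keep the first category name, or None when the list is empty.
flat["venue.categories"] = flat["venue.categories"].apply(
    lambda cats: cats[0]["name"] if cats else None
)
flat.columns = [col.split(".")[-1] for col in flat.columns]
print(flat)
```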


How many venues were returned by Foursquare?

In [52]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

1 venues were returned by Foursquare.


## Explore Neighborhoods in Toronto

Let's create a function to repeat the same process for all the neighborhoods in Toronto.

In [61]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Borough', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
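The list comprehension in `getNearbyVenues` indexes `categories[0]`, which would raise an `IndexError` for a venue that has an empty category list. A possible guard is sketched below; `venue_rows` is a hypothetical helper, and the sample items are invented for illustration.

```python
def venue_rows(name, lat, lng, items):
    """Build one tuple per venue, tolerating venues without categories."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"], category))
    return rows

# Invented sample: the second venue has no category.
sample_items = [
    {"venue": {"name": "Cafe A", "categories": [{"name": "Coffee Shop"}],
               "location": {"lat": 43.65, "lng": -79.38}}},
    {"venue": {"name": "Spot B", "categories": [],
               "location": {"lat": 43.66, "lng": -79.39}}},
]
rows = venue_rows("Scarborough", 43.806686, -79.194353, sample_items)
print(rows)
```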

Now write the code to run the above function on each neighborhood and create a new dataframe called *toronto_venues*.

In [62]:
toronto_venues = getNearbyVenues(names=neighborhoods['Borough'],
                                   latitudes=neighborhoods['Latitude'],
                                   longitudes=neighborhoods['Longitude']
                                  )

Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
Scarborough
North York
North York
North York
North York
North York
North York
North York
North York
North York
North York
North York
North York
North York
North York
North York
North York
North York
North York
East York
East York
East Toronto
East York
East York
East York
East Toronto
East Toronto
East Toronto
Central Toronto
Central Toronto
Central Toronto
Central Toronto
Central Toronto
Central Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
North York
Central Toronto
Central Toronto
Central Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
Downtown Toronto
North York
North York
York
York
Downtown Toronto
Wes

Let's check the size of the resulting dataframe

In [63]:
print(toronto_venues.shape)
toronto_venues.head()

(2156, 7)


Unnamed: 0,Borough,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Scarborough,43.806686,-79.194353,Wendy’s,43.807448,-79.199056,Fast Food Restaurant
1,Scarborough,43.784535,-79.160497,Royal Canadian Legion,43.782533,-79.163085,Bar
2,Scarborough,43.763573,-79.188711,RBC Royal Bank,43.76679,-79.191151,Bank
3,Scarborough,43.763573,-79.188711,G & G Electronics,43.765309,-79.191537,Electronics Store
4,Scarborough,43.763573,-79.188711,Sail Sushi,43.765951,-79.191275,Restaurant


Let's check how many venues were returned for each borough

In [64]:
toronto_venues.groupby('Borough').count()

Borough,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Central Toronto,118,118,118,118,118,118
Downtown Toronto,1244,1244,1244,1244,1244,1244
East Toronto,125,125,125,125,125,125
East York,74,74,74,74,74,74
Etobicoke,74,74,74,74,74,74
Mississauga,12,12,12,12,12,12
North York,242,242,242,242,242,242
Scarborough,90,90,90,90,90,90
West Toronto,160,160,160,160,160,160
York,17,17,17,17,17,17


Let's find out how many unique categories can be curated from all the returned venues

In [65]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 272 unique categories.


## Analyze Each Borough

In [66]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add borough column back to dataframe
toronto_onehot['Borough'] = toronto_venues['Borough']

# move borough column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Borough,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Scarborough,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Scarborough,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Scarborough,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Scarborough,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Scarborough,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


And let's examine the new dataframe size.

In [67]:
toronto_onehot.shape

(2156, 273)

#### Next, let's group rows by borough and take the mean of the frequency of occurrence of each category

In [68]:
toronto_grouped = toronto_onehot.groupby('Borough').mean().reset_index()
toronto_grouped

Unnamed: 0,Borough,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Central Toronto,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008475,0.0,...,0.008475,0.0,0.0,0.008475,0.0,0.0,0.0,0.0,0.0,0.008475
1,Downtown Toronto,0.0,0.000804,0.000804,0.000804,0.001608,0.001608,0.001608,0.013666,0.001608,...,0.012058,0.000804,0.0,0.003215,0.0,0.006431,0.000804,0.0,0.000804,0.005627
2,East Toronto,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.024,0.0,...,0.0,0.0,0.0,0.0,0.0,0.008,0.0,0.0,0.0,0.024
3,East York,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.013514,0.0,0.013514,0.0,0.0,0.0,0.0,0.013514
4,Etobicoke,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,...,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.013514,0.0,0.0
5,Mississauga,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,North York,0.004132,0.0,0.004132,0.0,0.0,0.0,0.0,0.008264,0.0,...,0.0,0.004132,0.0,0.008264,0.0,0.0,0.0,0.0,0.004132,0.0
7,Scarborough,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,...,0.0,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.011111,0.0
8,West Toronto,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.00625,...,0.0125,0.0,0.0,0.0125,0.0,0.00625,0.0,0.0,0.0,0.025
9,York,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0


Let's confirm the new size

In [69]:
toronto_grouped.shape

(10, 273)
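To see why the mean of one-hot columns gives a frequency, here is a small sketch on invented data (the boroughs `A` and `B` and their venues are assumptions). Each row is one venue, so the per-borough mean of a category column is the share of that borough's venues in that category.

```python
import pandas as pd

# Invented venues: borough A has 4 venues, borough B has 2.
venues = pd.DataFrame({
    "Borough": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Cafe", "Cafe", "Park", "Bank", "Park", "Park"],
})

onehot = pd.get_dummies(venues["Venue Category"], dtype=int)  # one column per category
onehot.insert(0, "Borough", venues["Borough"])

# Mean of 0/1 indicators per borough = fraction of venues in each category.
freq = onehot.groupby("Borough").mean().reset_index()
print(freq)
```

Each row of `freq` sums to 1 across the category columns, which is a quick sanity check for the real table as well.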

#### Let's print each borough along with its top 5 most common venues

In [71]:
num_top_venues = 5

for hood in toronto_grouped['Borough']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Borough'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Central Toronto----
            venue  freq
0     Coffee Shop  0.07
1  Sandwich Place  0.06
2            Park  0.06
3            Café  0.05
4     Pizza Place  0.05


----Downtown Toronto----
                 venue  freq
0          Coffee Shop  0.11
1                 Café  0.06
2                Hotel  0.03
3           Restaurant  0.03
4  Japanese Restaurant  0.03


----East Toronto----
                venue  freq
0         Coffee Shop  0.06
1    Greek Restaurant  0.06
2                Park  0.04
3                Café  0.04
4  Italian Restaurant  0.04


----East York----
                 venue  freq
0          Coffee Shop  0.05
1                 Bank  0.05
2         Burger Joint  0.04
3  Sporting Goods Shop  0.04
4         Intersection  0.04


----Etobicoke----
            venue  freq
0     Pizza Place  0.11
1  Sandwich Place  0.07
2        Pharmacy  0.05
3     Coffee Shop  0.05
4            Café  0.04


----Mississauga----
                 venue  freq
0          Coffee Shop  0.17
1 

### Let's put that into a pandas dataframe

First, let's write a function to sort the venues in descending order.

In [73]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
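A quick check of the function on a single hand-made row (the row values are invented; the function is repeated so the snippet runs on its own). The first entry is the borough label, which `iloc[1:]` skips before sorting.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]  # drop the leading 'Borough' entry
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Invented frequency row for one borough.
row = pd.Series({"Borough": "A", "Bank": 0.25, "Cafe": 0.5, "Park": 0.15, "Pub": 0.10})
top2 = return_most_common_venues(row, 2)
print(top2)
```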

Now let's create the new dataframe and display the top 10 venues for each borough.

In [74]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Borough']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Borough'] = toronto_grouped['Borough']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Central Toronto,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Restaurant,Sushi Restaurant,Pub,Clothing Store,Dessert Shop
1,Downtown Toronto,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Park,Bakery,Gym,Clothing Store
2,East Toronto,Coffee Shop,Greek Restaurant,Brewery,Café,Park,Italian Restaurant,Ice Cream Shop,Restaurant,Pub,Pizza Place
3,East York,Coffee Shop,Bank,Burger Joint,Pizza Place,Sporting Goods Shop,Park,Intersection,Sandwich Place,Restaurant,Indian Restaurant
4,Etobicoke,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
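The suffix lookup used for the column names works for 1 to 20 but would label a 21st column as "21th". If more columns were ever needed, a small helper could cover the general case. `ordinal` below is a hypothetical function, not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the other teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

columns = ["Borough"] + [f"{ordinal(i)} Most Common Venue" for i in range(1, 11)]
print(columns)
```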


## Cluster Neighborhoods

Run k-means to group the boroughs into 5 clusters.

In [79]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Borough')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 0, 3, 4, 1, 0, 3, 0, 2], dtype=int32)
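To get a feel for what the labels mean, k-means can be run on a tiny invented frequency matrix where the grouping is obvious. The `freq` array below is an assumption for illustration; `n_init` is set explicitly because its default has changed between scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Four invented rows: the first two are dominated by the first category,
# the last two by the third category.
freq = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(freq)
labels = model.labels_
print(labels)
```

The label numbers themselves are arbitrary; only which rows share a label carries meaning. With 10 boroughs and 5 clusters, as in this notebook, several clusters will necessarily hold one or two boroughs.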

Let's create a new dataframe that includes the cluster label and the top 10 venues of each borough, joined onto every neighborhood row.

In [81]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = neighborhoods

# join the cluster labels and top venues onto each neighborhood row, matching on borough
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Borough'), on='Borough')

toronto_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Scarborough,"Malvern, Rouge",43.806686,-79.194353,3,Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
1,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,3,Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
2,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,3,Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
3,Scarborough,Woburn,43.770992,-79.216917,3,Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
4,Scarborough,Cedarbrae,43.773136,-79.239476,3,Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field


Finally, let's visualize the resulting clusters

In [84]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'],toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
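In the color-scheme lines of the cell above, the `x` and `ys` arrays appear to be used only for their length, so the palette can be built from `kclusters` directly. A minimal sketch of that step alone, without the map:

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5
# Sample kclusters evenly spaced colors from the rainbow colormap and convert to hex.
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]
print(palette)
```

Indexing the palette with the cluster label then gives one distinct color per cluster.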

## Examine Clusters

Now, you can examine each cluster and determine the discriminating venue categories that distinguish each cluster.

#### cluster 1

In [85]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,Hillcrest Village,Coffee Shop,Clothing Store,Restaurant,Japanese Restaurant,Pizza Place,Sandwich Place,Fast Food Restaurant,Bank,Park,Café
18,"Fairview, Henry Farm, Oriole",Coffee Shop,Clothing Store,Restaurant,Japanese Restaurant,Pizza Place,Sandwich Place,Fast Food Restaurant,Bank,Park,Café
19,Bayview Village,Coffee Shop,Clothing Store,Restaurant,Japanese Restaurant,Pizza Place,Sandwich Place,Fast Food Restaurant,Bank,Park,Café
20,"York Mills, Silver Hills",Coffee Shop,Clothing Store,Restaurant,Japanese Restaurant,Pizza Place,Sandwich Place,Fast Food Restaurant,Bank,Park,Café
21,"Willowdale, Newtonbrook",Coffee Shop,Clothing Store,Restaurant,Japanese Restaurant,Pizza Place,Sandwich Place,Fast Food Restaurant,Bank,Park,Café
22,"Willowdale, Willowdale East",Coffee Shop,Clothing Store,Restaurant,Japanese Restaurant,Pizza Place,Sandwich Place,Fast Food Restaurant,Bank,Park,Café
23,York Mills West,Coffee Shop,Clothing Store,Restaurant,Japanese Restaurant,Pizza Place,Sandwich Place,Fast Food Restaurant,Bank,Park,Café
24,"Willowdale, Willowdale West",Coffee Shop,Clothing Store,Restaurant,Japanese Restaurant,Pizza Place,Sandwich Place,Fast Food Restaurant,Bank,Park,Café
25,Parkwoods,Coffee Shop,Clothing Store,Restaurant,Japanese Restaurant,Pizza Place,Sandwich Place,Fast Food Restaurant,Bank,Park,Café
26,Don Mills,Coffee Shop,Clothing Store,Restaurant,Japanese Restaurant,Pizza Place,Sandwich Place,Fast Food Restaurant,Bank,Park,Café


#### cluster 2

In [86]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
86,Canada Post Gateway Processing Centre,Hotel,Coffee Shop,Burrito Place,American Restaurant,Intersection,Sandwich Place,Gas Station,Mediterranean Restaurant,Fried Chicken Joint,Gym


#### cluster 3

In [87]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
73,Humewood-Cedarvale,Park,Convenience Store,Field,Discount Store,Sandwich Place,Tennis Court,Pool,Hockey Arena,Trail,Grocery Store
74,Caledonia-Fairbanks,Park,Convenience Store,Field,Discount Store,Sandwich Place,Tennis Court,Pool,Hockey Arena,Trail,Grocery Store
80,"Del Ray, Mount Dennis, Keelsdale and Silverthorn",Park,Convenience Store,Field,Discount Store,Sandwich Place,Tennis Court,Pool,Hockey Arena,Trail,Grocery Store
81,"Runnymede, The Junction North",Park,Convenience Store,Field,Discount Store,Sandwich Place,Tennis Court,Pool,Hockey Arena,Trail,Grocery Store
98,Weston,Park,Convenience Store,Field,Discount Store,Sandwich Place,Tennis Court,Pool,Hockey Arena,Trail,Grocery Store


#### cluster 4

In [88]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Malvern, Rouge",Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
1,"Rouge Hill, Port Union, Highland Creek",Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
2,"Guildwood, Morningside, West Hill",Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
3,Woburn,Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
4,Cedarbrae,Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
5,Scarborough Village,Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
6,"Kennedy Park, Ionview, East Birchmount Park",Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
7,"Golden Mile, Clairlea, Oakridge",Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
8,"Cliffside, Cliffcrest, Scarborough Village West",Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field
9,"Birch Cliff, Cliffside West",Chinese Restaurant,Fast Food Restaurant,Bakery,Bank,Breakfast Spot,Coffee Shop,Pizza Place,Indian Restaurant,Thai Restaurant,Soccer Field


#### cluster 5

In [89]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
88,"New Toronto, Mimico South, Humber Bay Shores",Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
89,"Alderwood, Long Branch",Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
90,"The Kingsway, Montgomery Road, Old Mill North",Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
91,"Old Mill South, King's Mill Park, Sunnylea, Hu...",Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
92,"Mimico NW, The Queensway West, South of Bloor,...",Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
93,"Islington Avenue, Humber Valley Village",Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
94,"West Deane Park, Princess Gardens, Martin Grov...",Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
95,"Eringate, Bloordale Gardens, Old Burnhamthorpe...",Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
99,Westmount,Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
100,"Kingsview Village, St. Phillips, Martin Grove ...",Pizza Place,Sandwich Place,Coffee Shop,Pharmacy,Grocery Store,Fast Food Restaurant,Café,Discount Store,Gym,Pool
