# Coursera "Applied Datascience capstone" - Week3

## Segmenting and Clustering Neighborhoods in Toronto

This workbook is intended to answer the coursera "applied datascience" capstone, on week 3 : 

## 1. import Toronto Postal codes in a dataframe

In [1]:
import pandas as pd
import numpy as np

In [2]:
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

Using pandas read_html() fonction to read the tables of the webpage.
Select the first table, and assign to a dataframe called "zip"

In [3]:
df =pd.read_html(url)

In [4]:
tor = df[0]
tor

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
...,...,...,...
175,M5Z,Not assigned,Not assigned
176,M6Z,Not assigned,Not assigned
177,M7Z,Not assigned,Not assigned
178,M8Z,Etobicoke,"Mimico NW, The Queensway West, South of Bloor,..."


Removing the lines with a borough "Not assigned" from Zip

In [5]:
index = tor[tor['Borough'] =='Not assigned'].index


In [6]:
tor.drop(index, inplace= True)

In [7]:
tor

Unnamed: 0,Postal Code,Borough,Neighborhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
...,...,...,...
160,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North"
165,M4Y,Downtown Toronto,Church and Wellesley
168,M7Y,East Toronto,"Business reply mail Processing Centre, South C..."
169,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu..."


In [8]:
tor['Postal Code'].unique

<bound method Series.unique of 2      M3A
3      M4A
4      M5A
5      M6A
6      M7A
      ... 
160    M8X
165    M4Y
168    M7Y
169    M8Y
178    M8Z
Name: Postal Code, Length: 103, dtype: object>

In [9]:
tor = tor.reset_index()
tor.drop(columns='index', inplace=True)

In [10]:
tor

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North"
99,M4Y,Downtown Toronto,Church and Wellesley
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C..."
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu..."


## 2. Load Geo Data

load geographical data for toronto zip codes from csv file

In [11]:
geo = pd.read_csv ('Geospatial_Coordinates.csv')

In [12]:
geo

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
...,...,...,...
98,M9N,43.706876,-79.518188
99,M9P,43.696319,-79.532242
100,M9R,43.688905,-79.554724
101,M9V,43.739416,-79.588437


Join the 2 dataframes to obtain the full dataframe.

In [13]:
tor = tor.set_index('Postal Code').join(geo.set_index('Postal Code'),  how='left')

In [14]:
tor.reset_index( inplace = True)


In [15]:
tor

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
...,...,...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944
99,M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.636258,-79.498509


## 3. Explore and cluster the neighborhoods in Toronto

Visualize all neighborhoods

In [16]:
from pandas import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors


import requests
from geopy.geocoders import Nominatim 

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium 

In [17]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(tor['Borough'].unique()),
        tor.shape[0]
    )
)

The dataframe has 10 boroughs and 103 neighborhoods.


In [18]:
address = 'Toronto, Canada'

geolocator = Nominatim(user_agent="explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Toronto are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Toronto are 43.6534817, -79.3839347.


In [19]:
# create map of New York using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(tor['Latitude'], tor['Longitude'], tor['Borough'], tor['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

#### Define Foursquare Credentials and Version

In [20]:
CLIENT_ID = 'SBDQCUVHPUSYGURUVCO0BKJZJIUP42FS2BRPWZKXIWPQ0HZA' # your Foursquare ID
CLIENT_SECRET = '23IEJUSVPUF54EJ01AQQNCGNTQ5RW4KFUT1MZJW2HCLGF4BX' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: SBDQCUVHPUSYGURUVCO0BKJZJIUP42FS2BRPWZKXIWPQ0HZA
CLIENT_SECRET:23IEJUSVPUF54EJ01AQQNCGNTQ5RW4KFUT1MZJW2HCLGF4BX


Let's create a function to repeat  to all the neighborhoods the following process :   
- create the foursquare API request URL
- make the GET request
- clean the json 
- structure it in a pandas dataframe

In [21]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [22]:
tor['Neighborhood'][0]


'Parkwoods'

1er essai en recopiant le process new york
=> mis en commentaire pour ne pas etre executé à chauqe fois.

In [23]:
# torontto_venues = getNearbyVenues(names=tor['Neighborhood'],
                                   latitudes=tor['Latitude'],
                                   longitudes=tor['Longitude']
                                  )

Parkwoods
Victoria Village
Regent Park, Harbourfront
Lawrence Manor, Lawrence Heights
Queen's Park, Ontario Provincial Government
Islington Avenue, Humber Valley Village
Malvern, Rouge
Don Mills
Parkview Hill, Woodbine Gardens
Garden District, Ryerson
Glencairn
West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale
Rouge Hill, Port Union, Highland Creek
Don Mills
Woodbine Heights
St. James Town
Humewood-Cedarvale
Eringate, Bloordale Gardens, Old Burnhamthorpe, Markland Wood
Guildwood, Morningside, West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor, Wilson Heights, Downsview North
Thorncliffe Park
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Scarborough Village
Fairview, Henry Farm, Oriole
Northwood Park, York University
East Toronto, Broadview North (Old East York)
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
Kennedy Park, Ionview, East Birchmo

KeyError: 'groups'

In [None]:
torontto_venues


#### Add postal code to the dataframe

#### 2eme fonction avec conservation des codes postaux.

In [24]:
def getNearbyVenues2(names, posts, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, post, lat, lng in zip(names, posts, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name,
            post,
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                        'Postal Code',
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [25]:
venues = getNearbyVenues2(names=tor['Neighborhood'][0:],
                         posts=tor['Postal Code'][0:],
                        latitudes=tor['Latitude'][0:],
                        longitudes=tor['Longitude'][0:]
                                  )

Parkwoods
Victoria Village
Regent Park, Harbourfront
Lawrence Manor, Lawrence Heights
Queen's Park, Ontario Provincial Government
Islington Avenue, Humber Valley Village
Malvern, Rouge
Don Mills
Parkview Hill, Woodbine Gardens
Garden District, Ryerson
Glencairn
West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale
Rouge Hill, Port Union, Highland Creek
Don Mills
Woodbine Heights
St. James Town
Humewood-Cedarvale
Eringate, Bloordale Gardens, Old Burnhamthorpe, Markland Wood
Guildwood, Morningside, West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor, Wilson Heights, Downsview North
Thorncliffe Park
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Scarborough Village
Fairview, Henry Farm, Oriole
Northwood Park, York University
East Toronto, Broadview North (Old East York)
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
Kennedy Park, Ionview, East Birchmo

In [27]:
venues


Unnamed: 0,Neighborhood,Postal Code,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,M3A,43.753259,-79.329656,Brookbanks Park,43.751976,-79.332140,Park
1,Parkwoods,M3A,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
2,Victoria Village,M4A,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena
3,Victoria Village,M4A,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant
4,Victoria Village,M4A,43.725882,-79.315572,Tim Hortons,43.725517,-79.313103,Coffee Shop
...,...,...,...,...,...,...,...,...
2123,"Mimico NW, The Queensway West, South of Bloor,...",M8Z,43.628841,-79.520999,RONA,43.629393,-79.518320,Hardware Store
2124,"Mimico NW, The Queensway West, South of Bloor,...",M8Z,43.628841,-79.520999,Jim & Maria's No Frills,43.631152,-79.518617,Grocery Store
2125,"Mimico NW, The Queensway West, South of Bloor,...",M8Z,43.628841,-79.520999,Royal Canadian Legion #210,43.628855,-79.518903,Social Club
2126,"Mimico NW, The Queensway West, South of Bloor,...",M8Z,43.628841,-79.520999,Kingsway Boxing Club,43.627254,-79.526684,Gym


In [28]:
print('There are {} uniques categories.'.format(len(venues['Venue Category'].unique())))

There are 275 uniques categories.


#### Convert the category information into columns and analyse it

In [29]:
tor_neib = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="")

In [30]:
tor_neib['Postal Code']=venues['Postal Code']


In [31]:
tor_neib['Neighborhood']=venues['Neighborhood']

In [32]:
tor_neib

Unnamed: 0,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Postal Code
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,M3A
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,M3A
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,M4A
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,M4A
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,M4A
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2123,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,M8Z
2124,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,M8Z
2125,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,M8Z
2126,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,M8Z


In [34]:
tor_gr = tor_neib.groupby(['Postal Code','Neighborhood']).mean().reset_index()
tor_gr

Unnamed: 0,Postal Code,Neighborhood,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,M1B,"Malvern, Rouge",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M1C,"Rouge Hill, Port Union, Highland Creek",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,M1E,"Guildwood, Morningside, West Hill",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,M1G,Woburn,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,M1H,Cedarbrae,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,M9N,Weston,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
96,M9P,Westmount,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
97,M9R,"Kingsview Village, St. Phillips, Martin Grove ...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
98,M9V,"South Steeles, Silverstone, Humbergate, Jamest...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [40]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[2:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [41]:
return_most_common_venues(tor_gr.iloc[99, :], 10)

array(['Rental Car Location', 'Drugstore', 'Yoga Studio', 'Dog Run',
       'Dessert Shop', 'Dim Sum Restaurant', 'Diner', 'Discount Store',
       'Distribution Center', 'Doner Restaurant'], dtype=object)

In [42]:
tor_gr.shape[0]

100

In [43]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood', 'Postal Code']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = tor_gr['Neighborhood']
neighborhoods_venues_sorted['Postal Code'] = tor_gr['Postal Code']

for lig in np.arange(tor_gr.shape[0]):
    neighborhoods_venues_sorted.iloc[lig, 2:] = return_most_common_venues(tor_gr.iloc[lig, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,Postal Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Malvern, Rouge",,Fast Food Restaurant,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,College Rec Center
1,"Rouge Hill, Port Union, Highland Creek",,Bar,Yoga Studio,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Dessert Shop
2,"Guildwood, Morningside, West Hill",,Mexican Restaurant,Breakfast Spot,Electronics Store,Intersection,Medical Center,Bank,Rental Car Location,Dog Run,Discount Store,Distribution Center
3,Woburn,,Coffee Shop,Korean Restaurant,Yoga Studio,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop
4,Cedarbrae,,Bank,Lounge,Hakka Restaurant,Fried Chicken Joint,Athletics & Sports,Thai Restaurant,Caribbean Restaurant,Gas Station,Bakery,Discount Store


In [45]:
neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,Postal Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Malvern, Rouge",M1B,Fast Food Restaurant,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,College Rec Center
1,"Rouge Hill, Port Union, Highland Creek",M1C,Bar,Yoga Studio,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Dessert Shop
2,"Guildwood, Morningside, West Hill",M1E,Mexican Restaurant,Breakfast Spot,Electronics Store,Intersection,Medical Center,Bank,Rental Car Location,Dog Run,Discount Store,Distribution Center
3,Woburn,M1G,Coffee Shop,Korean Restaurant,Yoga Studio,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop
4,Cedarbrae,M1H,Bank,Lounge,Hakka Restaurant,Fried Chicken Joint,Athletics & Sports,Thai Restaurant,Caribbean Restaurant,Gas Station,Bakery,Discount Store
...,...,...,...,...,...,...,...,...,...,...,...,...
95,Weston,M9N,Park,Yoga Studio,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore
96,Westmount,M9P,Pizza Place,Coffee Shop,Sandwich Place,Discount Store,Chinese Restaurant,Intersection,Drugstore,Donut Shop,Doner Restaurant,Deli / Bodega
97,"Kingsview Village, St. Phillips, Martin Grove ...",M9R,Pizza Place,Sandwich Place,Mobile Phone Shop,Bus Line,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Yoga Studio
98,"South Steeles, Silverstone, Humbergate, Jamest...",M9V,Grocery Store,Pharmacy,Fast Food Restaurant,Sandwich Place,Beer Store,Fried Chicken Joint,Pizza Place,Drugstore,Dumpling Restaurant,Donut Shop


#### save dataframe to csv for further use

In [46]:
 # neighborhoods_venues_sorted.to_csv('TorontoNeighborhood')

#### Clustering neighborhoods

Run k-means to cluster the neighborhood into 5 clusters.

In [48]:
# preparing dataframe

tor_cluster = tor_gr.drop(['Postal Code','Neighborhood'], 1)
tor_cluster


Unnamed: 0,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
96,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
97,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
98,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [49]:
# set number of clusters
kclusters = 5

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(tor_cluster)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 0, 1, 1, 1, 2, 1, 1, 1, 1])

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [50]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)



ValueError: columns overlap but no suffix specified: Index(['Neighborhood'], dtype='object')

In [51]:
neighborhoods_venues_sorted

Unnamed: 0,Cluster Labels,Neighborhood,Postal Code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1,"Malvern, Rouge",M1B,Fast Food Restaurant,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,College Rec Center
1,0,"Rouge Hill, Port Union, Highland Creek",M1C,Bar,Yoga Studio,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Dessert Shop
2,1,"Guildwood, Morningside, West Hill",M1E,Mexican Restaurant,Breakfast Spot,Electronics Store,Intersection,Medical Center,Bank,Rental Car Location,Dog Run,Discount Store,Distribution Center
3,1,Woburn,M1G,Coffee Shop,Korean Restaurant,Yoga Studio,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop
4,1,Cedarbrae,M1H,Bank,Lounge,Hakka Restaurant,Fried Chicken Joint,Athletics & Sports,Thai Restaurant,Caribbean Restaurant,Gas Station,Bakery,Discount Store
...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,2,Weston,M9N,Park,Yoga Studio,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore
96,1,Westmount,M9P,Pizza Place,Coffee Shop,Sandwich Place,Discount Store,Chinese Restaurant,Intersection,Drugstore,Donut Shop,Doner Restaurant,Deli / Bodega
97,1,"Kingsview Village, St. Phillips, Martin Grove ...",M9R,Pizza Place,Sandwich Place,Mobile Phone Shop,Bus Line,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Yoga Studio
98,1,"South Steeles, Silverstone, Humbergate, Jamest...",M9V,Grocery Store,Pharmacy,Fast Food Restaurant,Sandwich Place,Beer Store,Fried Chicken Joint,Pizza Place,Drugstore,Dumpling Restaurant,Donut Shop


In [54]:
tor2 = tor.set_index('Postal Code')

In [56]:
tor2

Unnamed: 0_level_0,Borough,Neighborhood,Latitude,Longitude
Postal Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
M3A,North York,Parkwoods,43.753259,-79.329656
M4A,North York,Victoria Village,43.725882,-79.315572
M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636
M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
...,...,...,...,...
M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944
M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160
M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558
M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.636258,-79.498509


In [60]:
n2 = neighborhoods_venues_sorted.set_index('Postal Code')

In [63]:

# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
tor_merged = tor2.join(n2, how='left', rsuffix = '_clustered')

tor_merged.head() # check the last columns!

Unnamed: 0_level_0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,Neighborhood_clustered,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Postal Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1
M3A,North York,Parkwoods,43.753259,-79.329656,2.0,Parkwoods,Food & Drink Shop,Park,Yoga Studio,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop
M4A,North York,Victoria Village,43.725882,-79.315572,1.0,Victoria Village,French Restaurant,Coffee Shop,Portuguese Restaurant,Hockey Arena,Intersection,Yoga Studio,Distribution Center,Dessert Shop,Dim Sum Restaurant,Diner
M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1.0,"Regent Park, Harbourfront",Coffee Shop,Bakery,Park,Pub,Theater,Café,Breakfast Spot,Distribution Center,Restaurant,Spa
M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,1.0,"Lawrence Manor, Lawrence Heights",Furniture / Home Store,Clothing Store,Accessories Store,Vietnamese Restaurant,Miscellaneous Shop,Event Space,Boutique,Coffee Shop,Construction & Landscaping,Dim Sum Restaurant
M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1.0,"Queen's Park, Ontario Provincial Government",Coffee Shop,Sushi Restaurant,Yoga Studio,Bank,Beer Bar,Smoothie Shop,Burrito Place,Sandwich Place,Café,Park


In [71]:
index = tor_merged.index[tor_merged['Neighborhood']!=tor_merged['Neighborhood_clustered']]

In [74]:
tor_merged = tor_merged.drop(index, axis=0)

In [76]:
tor_merged =  tor_merged.drop(columns = 'Neighborhood_clustered')

In [77]:
tor_merged 

Unnamed: 0_level_0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Postal Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
M3A,North York,Parkwoods,43.753259,-79.329656,2.0,Food & Drink Shop,Park,Yoga Studio,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop
M4A,North York,Victoria Village,43.725882,-79.315572,1.0,French Restaurant,Coffee Shop,Portuguese Restaurant,Hockey Arena,Intersection,Yoga Studio,Distribution Center,Dessert Shop,Dim Sum Restaurant,Diner
M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636,1.0,Coffee Shop,Bakery,Park,Pub,Theater,Café,Breakfast Spot,Distribution Center,Restaurant,Spa
M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,1.0,Furniture / Home Store,Clothing Store,Accessories Store,Vietnamese Restaurant,Miscellaneous Shop,Event Space,Boutique,Coffee Shop,Construction & Landscaping,Dim Sum Restaurant
M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1.0,Coffee Shop,Sushi Restaurant,Yoga Studio,Bank,Beer Bar,Smoothie Shop,Burrito Place,Sandwich Place,Café,Park
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944,2.0,River,Park,Smoke Shop,Yoga Studio,Dog Run,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center
M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160,1.0,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Gay Bar,Restaurant,Yoga Studio,Bubble Tea Shop,Burger Joint,Pub,Café
M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558,1.0,Light Rail Station,Yoga Studio,Park,Smoke Shop,Brewery,Skate Park,Burrito Place,Restaurant,Recording Studio,Comic Shop
M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.636258,-79.498509,4.0,Baseball Field,Yoga Studio,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Dessert Shop


#### Map  the  clustered neighborhoods

In [83]:
#change type for integer
tor_merged['Cluster Labels'] = tor_merged['Cluster Labels'].astype('int')

In [84]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(tor_merged['Latitude'], tor_merged['Longitude'], tor_merged['Neighborhood'], tor_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examine CLusters

Neignborhoods by cluster

In [94]:
tor_merged['Cluster Labels'].value_counts()

1    83
2    11
4     3
3     2
0     1
Name: Cluster Labels, dtype: int64

On 100 neighborhoods, 83 are in cluster 1 => the clustering is not enough separated.  
Clusters 0,3,and 4 have less than 4 Neighborhoods.
=> there is either too many categories, or not enough neighborhoods.

#### cluster 0

In [89]:
tor_merged.loc[tor_merged['Cluster Labels'] == 0]

Unnamed: 0_level_0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Postal Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,0,Bar,Yoga Studio,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Dessert Shop


#### cluster 1

In [90]:
tor_merged.loc[tor_merged['Cluster Labels'] == 1]

Unnamed: 0_level_0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Postal Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
M4A,North York,Victoria Village,43.725882,-79.315572,1,French Restaurant,Coffee Shop,Portuguese Restaurant,Hockey Arena,Intersection,Yoga Studio,Distribution Center,Dessert Shop,Dim Sum Restaurant,Diner
M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636,1,Coffee Shop,Bakery,Park,Pub,Theater,Café,Breakfast Spot,Distribution Center,Restaurant,Spa
M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,1,Furniture / Home Store,Clothing Store,Accessories Store,Vietnamese Restaurant,Miscellaneous Shop,Event Space,Boutique,Coffee Shop,Construction & Landscaping,Dim Sum Restaurant
M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1,Coffee Shop,Sushi Restaurant,Yoga Studio,Bank,Beer Bar,Smoothie Shop,Burrito Place,Sandwich Place,Café,Park
M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353,1,Fast Food Restaurant,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,College Rec Center
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
M4X,Downtown Toronto,"St. James Town, Cabbagetown",43.667967,-79.367675,1,Coffee Shop,Pizza Place,Market,Restaurant,Pub,Bakery,Italian Restaurant,Café,Gift Shop,Deli / Bodega
M5X,Downtown Toronto,"First Canadian Place, Underground city",43.648429,-79.382280,1,Coffee Shop,Café,Hotel,Restaurant,Gym,Seafood Restaurant,American Restaurant,Steakhouse,Salad Place,Japanese Restaurant
M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160,1,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Gay Bar,Restaurant,Yoga Studio,Bubble Tea Shop,Burger Joint,Pub,Café
M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558,1,Light Rail Station,Yoga Studio,Park,Smoke Shop,Brewery,Skate Park,Burrito Place,Restaurant,Recording Studio,Comic Shop


#### cluster 2

In [91]:
tor_merged.loc[tor_merged['Cluster Labels'] == 2]

Unnamed: 0_level_0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Postal Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
M3A,North York,Parkwoods,43.753259,-79.329656,2,Food & Drink Shop,Park,Yoga Studio,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop
M6E,York,Caledonia-Fairbanks,43.689026,-79.453512,2,Park,Women's Store,Pool,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run
M1J,Scarborough,Scarborough Village,43.744734,-79.239476,2,Playground,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Donut Shop,College Gym
M4J,East York,"East Toronto, Broadview North (Old East York)",43.685347,-79.338106,2,Park,Convenience Store,Coffee Shop,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant
M2L,North York,"York Mills, Silver Hills",43.75749,-79.374714,2,Cafeteria,Park,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Donut Shop
M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,2,Park,Bus Line,Swim School,Yoga Studio,Dog Run,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Doner Restaurant
M9N,York,Weston,43.706876,-79.518188,2,Park,Yoga Studio,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore
M2P,North York,York Mills West,43.752758,-79.400049,2,Park,Bar,Convenience Store,Yoga Studio,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant
M1V,Scarborough,"Milliken, Agincourt North, Steeles East, L'Amo...",43.815252,-79.284577,2,Playground,Park,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Donut Shop
M4W,Downtown Toronto,Rosedale,43.679563,-79.377529,2,Park,Trail,Playground,Yoga Studio,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center


#### cluster 3

In [92]:
tor_merged.loc[tor_merged['Cluster Labels'] == 3]

Unnamed: 0_level_0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Postal Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
M9B,Etobicoke,"West Deane Park, Princess Gardens, Martin Grov...",43.650943,-79.554724,3,Home Service,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Yoga Studio,Department Store
M5N,Central Toronto,Roselawn,43.711695,-79.416936,3,Home Service,Music Venue,Garden,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Donut Shop,Doner Restaurant,Deli / Bodega


#### cluster 4

In [93]:
tor_merged.loc[tor_merged['Cluster Labels'] == 4]

Unnamed: 0_level_0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Postal Code,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
M3M,North York,Downsview,43.728496,-79.495697,4,Food Truck,Baseball Field,Yoga Studio,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dessert Shop
M9M,North York,"Humberlea, Emery",43.724766,-79.532242,4,Baseball Field,Yoga Studio,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Dessert Shop
M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.636258,-79.498509,4,Baseball Field,Yoga Studio,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Dessert Shop


As a conclusion, 

## Extra information

In [55]:
tor['Neighborhood'].value_counts()

Downsview                                                                                                        4
Don Mills                                                                                                        2
York Mills West                                                                                                  1
Kensington Market, Chinatown, Grange Park                                                                        1
Lawrence Park                                                                                                    1
                                                                                                                ..
Cedarbrae                                                                                                        1
Bathurst Manor, Wilson Heights, Downsview North                                                                  1
East Toronto, Broadview North (Old East York)                                   