# IBM Data Science Capstone Project

This notebook is used for the Capstone project.

Let's get started.

In [1]:
print("Hello Capstone Project Course!")

Hello Capstone Project Course!


## Toronto Neighborhood Project

---
---
## Part 1 - Create the data frame

### Scrape the postal code table from Wikipedia

In [147]:
#use beautifulsoup to scrape
import requests
from bs4 import BeautifulSoup
import re
import numpy as np

url = "https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M"
response = requests.get(url)
html_doc = response.text
soup = BeautifulSoup(response.text, "lxml")
#print(soup.prettify())

soup_tables = soup.find_all('table')

#first table is the table with postal code, borough, neighborhood details
Neighborhood_table = soup_tables[0]

tds = Neighborhood_table.find_all("td")

postal_codes = []
borughs = []
neighborhoods = []

for td in tds:
    #find the postal code, inside b tag
    postal_codes.append(td.find("b").text)
    
    #find borough: the first element inside the span tag (a link or a plain string)
    try:
        search_borugh = td.find("span").next_element.text
    except AttributeError:
        search_borugh = td.find("span").next_element
    borughs.append(search_borugh)
    
    #find neighborhood, inside the parentheses of the span tag, so search using a regular expression
    search_neighborhood = td.find("span").text
    search_neighborhood_match = re.findall(r"(?<=\()(.+)(?=\))", search_neighborhood)
    try:
        search_neighborhood = re.sub(r"\s\/\s", ", ", search_neighborhood_match[0])
        search_neighborhood = re.sub(r"\)(.+)\(", ", ", search_neighborhood)
    except IndexError:
        #no parenthesised neighborhood in this cell (e.g. "Not assigned")
        search_neighborhood = np.nan
    neighborhoods.append(search_neighborhood)

#take a look at the scraped lists
for i, (x,y,z) in enumerate(zip(postal_codes,borughs,neighborhoods)):
    if i < 10:
        print("{} |\t {} |\t {} |\t {}".format(i+1,x,y,z))

1 |	 M1A |	 Not assigned |	 nan
2 |	 M2A |	 Not assigned |	 nan
3 |	 M3A |	 North York |	 Parkwoods
4 |	 M4A |	 North York |	 Victoria Village
5 |	 M5A |	 Downtown Toronto |	 Regent Park, Harbourfront
6 |	 M6A |	 North York |	 Lawrence Manor, Lawrence Heights
7 |	 M7A |	 Queen's Park |	 Ontario Provincial Government
8 |	 M8A |	 Not assigned |	 nan
9 |	 M9A |	 Etobicoke |	 Islington Avenue
10 |	 M1B |	 Scarborough |	 Malvern, Rouge
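
The parsing step above can be isolated into a small helper that is easy to check without requesting the Wikipedia page. This is only a sketch: the helper name `split_borough_and_neighborhoods` and the sample strings are assumptions, modeled on the `Borough(Neighborhood / Neighborhood)` layout that the loop expects inside each `span`.

```python
import re

def split_borough_and_neighborhoods(span_text):
    """Split 'Borough(Hood A / Hood B)' into (borough, 'Hood A, Hood B').

    Hypothetical helper: returns (text, None) when there are no parentheses.
    """
    match = re.match(r"^([^()]+?)\s*\((.+)\)\s*$", span_text)
    if match is None:
        return span_text.strip(), None
    borough, inner = match.groups()
    # neighborhoods are separated by " / " in the assumed layout
    neighborhoods = re.sub(r"\s*/\s*", ", ", inner)
    return borough.strip(), neighborhoods

# sample strings are illustrative, not copied from the live page
example = split_borough_and_neighborhoods("Downtown Toronto(Regent Park / Harbourfront)")
```

Keeping the string handling in one function makes it possible to test the regular expressions on a few hand-written cells before running the full scrape.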


### Create the base pandas table

In [148]:
import pandas as pd

toronto_neighborhood_df = pd.DataFrame(columns=["PostalCode", "Borugh", "Neighborhood"])
toronto_neighborhood_df.PostalCode = postal_codes
toronto_neighborhood_df.Borugh = borughs
toronto_neighborhood_df.Neighborhood = neighborhoods

toronto_neighborhood_df = toronto_neighborhood_df[toronto_neighborhood_df.Borugh != "Not assigned"].reset_index(drop=True)

#toronto_neighborhood_df.to_csv("toronto_neighborhood.csv")

#check the data table
print("Shape of the Toronto Neighborhood DataFrame: {}".format(toronto_neighborhood_df.shape))
toronto_neighborhood_df.head(10)

Shape of the Toronto Neighborhood DataFrame: (103, 3)


Unnamed: 0,PostalCode,Borugh,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Queen's Park,Ontario Provincial Government
5,M9A,Etobicoke,Islington Avenue
6,M1B,Scarborough,"Malvern, Rouge"
7,M3B,North York,Don Mills
8,M4B,East York,"Parkview Hill, Woodbine Gardens"
9,M5B,Downtown Toronto,"Garden District, Ryerson"
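
As a possible alternative to creating an empty frame and assigning columns one by one, the three lists can be passed to the constructor as a dictionary. The sketch below assumes a hypothetical helper name, `build_neighborhood_frame`, and keeps the notebook's existing column names so it stays compatible with later cells.

```python
import pandas as pd

def build_neighborhood_frame(postal_codes, boroughs, neighborhoods):
    """Build the base table and drop rows whose borough is 'Not assigned'."""
    frame = pd.DataFrame({
        "PostalCode": postal_codes,
        "Borugh": boroughs,          # column name kept as in the notebook
        "Neighborhood": neighborhoods,
    })
    assigned = frame["Borugh"] != "Not assigned"
    return frame[assigned].reset_index(drop=True)

# three sample rows taken from the scraped lists shown earlier
sample_frame = build_neighborhood_frame(
    ["M1A", "M3A", "M4A"],
    ["Not assigned", "North York", "North York"],
    [None, "Parkwoods", "Victoria Village"],
)
```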


---
---
## Part 2 - Add latitude and longitude to the data frame

In [149]:
#starting from the 2nd part, work on the project with the saved csv, no need to scrape wiki page again
import pandas as pd
import numpy as np

toronto_neighborhood_df = pd.read_csv('toronto_neighborhood.csv')
toronto_neighborhood_df.drop("Unnamed: 0", axis=1, inplace=True)

#just to make sure everything is good
print(toronto_neighborhood_df.shape)
toronto_neighborhood_df.head(5)

(103, 3)


Unnamed: 0,PostalCode,Borugh,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Queen's Park,Ontario Provincial Government
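
The `Unnamed: 0` column dropped above appears because `to_csv` writes the index by default. A minimal sketch of the round trip, using an in-memory buffer instead of a file (the sample rows are a two-row excerpt, and the variable names are assumptions):

```python
import io
import pandas as pd

sample = pd.DataFrame({
    "PostalCode": ["M3A", "M4A"],
    "Borugh": ["North York", "North York"],
    "Neighborhood": ["Parkwoods", "Victoria Village"],
})

# default: the index is written as an unnamed first column
with_index = io.StringIO()
sample.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# index=False: only the real columns are written
without_index = io.StringIO()
sample.to_csv(without_index, index=False)
without_index.seek(0)
reloaded = pd.read_csv(without_index)
```

Saving with `index=False` would make the `drop("Unnamed: 0", ...)` line unnecessary when the file is read back.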


### Get the latitudes and longitudes using pgeocode

In [219]:
import pgeocode
ca_nomi = pgeocode.Nominatim('ca')

toronto_neighborhood_df = pd.read_csv('toronto_neighborhood.csv')
toronto_neighborhood_df.drop("Unnamed: 0", axis=1, inplace=True)

toronto_neighborhood_df.head(5)
#get the PostalCode from the dataframe
All_PostalCodes = toronto_neighborhood_df.loc[:,"PostalCode"]

#create loop to get all latitudes and longitudes
latitudes = []
longitudes = []
for PCode in All_PostalCodes:
    location = ca_nomi.query_postal_code(PCode)
    latitudes.append(location.latitude)
    longitudes.append(location.longitude)
    
#add the Latitudes and Longitudes as new columns (on a copy, so the original frame is left untouched)
toronto_neighborhood_df_A = toronto_neighborhood_df.copy()
toronto_neighborhood_df_A["Latitude"] = latitudes
toronto_neighborhood_df_A["Longitude"] = longitudes

#save to another csv
#toronto_neighborhood_df_A.to_csv("toronto_neighborhood_wth_lat_lnt_pgeocode.csv")

print(toronto_neighborhood_df_A.shape)
toronto_neighborhood_df_A.head(5)

(103, 5)


Unnamed: 0,PostalCode,Borugh,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.7545,-79.33
1,M4A,North York,Victoria Village,43.7276,-79.3148
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.6555,-79.3626
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.7223,-79.4504
4,M7A,Queen's Park,Ontario Provincial Government,43.6641,-79.3889
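
A lookup like the one above can leave gaps if a postal code is not present in the geocoder's dataset. The sketch below assumes that such misses show up as NaN in the coordinate columns (an assumption about how the lookup reports failures, not something verified in this notebook). The helper name and the `X0X` code are made up for illustration.

```python
import numpy as np
import pandas as pd

def postal_codes_without_coordinates(frame):
    """Return the postal codes whose latitude or longitude is missing."""
    missing = frame["Latitude"].isna() | frame["Longitude"].isna()
    return frame.loc[missing, "PostalCode"].tolist()

sample = pd.DataFrame({
    "PostalCode": ["M3A", "M4A", "X0X"],      # "X0X" is a made-up code
    "Latitude": [43.7545, 43.7276, np.nan],
    "Longitude": [-79.33, -79.3148, np.nan],
})
unresolved = postal_codes_without_coordinates(sample)
```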


### Do the same by importing the latitude/longitude csv

In [154]:
toronto_neighborhood_df = pd.read_csv('toronto_neighborhood.csv')
toronto_neighborhood_df.drop("Unnamed: 0", axis=1, inplace=True)

#import csv
lat_lnt_data = pd.read_csv("Geospatial_Coordinates.csv")

#merge the data
toronto_neighborhood_df_B = pd.merge(left=toronto_neighborhood_df, right=lat_lnt_data, how='left', left_on='PostalCode', right_on='Postal Code')

#remove unnecessary column
toronto_neighborhood_df_B.drop("Postal Code", axis=1, inplace=True)

#save to another csv
#toronto_neighborhood_df_B.to_csv("toronto_neighborhood_wth_lat_lnt_csvdata.csv")

print(toronto_neighborhood_df_B.shape)
toronto_neighborhood_df_B.head(5)

(103, 5)


Unnamed: 0,PostalCode,Borugh,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494
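
The two coordinate sources do not agree exactly, so it may be worth measuring how far apart they are. The sketch below uses the haversine formula with an assumed mean Earth radius of 6371 km; the function name is hypothetical, and the two points are the Parkwoods (M3A) rows from the pgeocode table and the csv table.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    radius_km = 6371.0  # assumed mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

# Parkwoods (M3A) as reported by each source
pgeocode_point = (43.7545, -79.33)
csv_point = (43.753259, -79.329656)
gap_km = haversine_km(*pgeocode_point, *csv_point)
```

Applying the same function row by row would show which postal codes differ most between the two sources, which matters because the venue search later uses a fixed radius around each point.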


---
---
## Part 3 - Cluster Analysis

In [155]:
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

In [156]:
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
ACCESS_TOKEN = '' # your FourSquare Access Token
VERSION = '20180604'
LIMIT = 100

### Try to analyse neighborhoods in East York, North York and York

#### Get the data related to the chosen borough

In [160]:
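#Get all the data related to the chosen boroughs first
#('York' alone would already match all three names; the alternatives are spelled out for readability)
is_york_borough = toronto_neighborhood_df_B["Borugh"].str.contains('North York|East York|York',case=False, regex=True)
toronto_neighborhood_df_analysis = toronto_neighborhood_df_B[is_york_borough]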

print(toronto_neighborhood_df_analysis.shape)
toronto_neighborhood_df_analysis.head(10)

(34, 5)


Unnamed: 0,PostalCode,Borugh,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
7,M3B,North York,Don Mills,43.745906,-79.352188
8,M4B,East York,"Parkview Hill, Woodbine Gardens",43.706397,-79.309937
10,M6B,North York,Glencairn,43.709577,-79.445073
13,M3C,North York,"Don Mills, Flemingdon Park",43.7259,-79.340923
14,M4C,East York,Woodbine Heights,43.695344,-79.318389
16,M6C,York,Humewood-Cedarvale,43.693781,-79.428191
21,M6E,York,Caledonia-Fairbanks,43.689026,-79.453512
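
The pattern in the filter above ends with a bare `York` alternative, which on its own matches every borough name containing "York". The sketch below, on a made-up list of borough names, compares that substring match with an exact match through `isin`; which one is preferable depends on whether partial matches are wanted.

```python
import pandas as pd

# illustrative borough names, not the full column
boroughs = pd.Series(["North York", "East York", "York", "Scarborough", "Downtown Toronto"])

# substring match: "York" is contained in all three York boroughs
substring_mask = boroughs.str.contains("York", case=False, regex=False)

# exact match: only the listed names are selected
exact_mask = boroughs.isin(["North York", "East York", "York"])
```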


#### Function to explore all the neighborhoods

In [158]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
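
The function above indexes straight into the response, so a neighborhood with no results or a venue without categories would raise an error. The sketch below separates the flattening step into a hypothetical helper, `venue_rows`, that tolerates both cases. The payload layout is taken from the keys used in `getNearbyVenues`; the `fake_payload` contents are invented for illustration.

```python
def venue_rows(name, lat, lng, payload):
    """Flatten one explore-style payload into rows; empty list if nothing is found."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories", [])
        # some venues may carry no category at all
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng,
                     venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows

# invented payload with the same nesting as the real response
fake_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Some Park", "location": {"lat": 43.75, "lng": -79.33},
               "categories": [{"name": "Park"}]}},
    {"venue": {"name": "Unlabelled Place", "location": {"lat": 43.76, "lng": -79.32},
               "categories": []}},
]}]}}
rows = venue_rows("Parkwoods", 43.753259, -79.329656, fake_payload)
```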

#### Run the function and create data table

In [173]:
Neighborhood_names = toronto_neighborhood_df_analysis['Neighborhood'].tolist()
Neighborhood_lat = toronto_neighborhood_df_analysis['Latitude'].tolist()
Neighborhood_lnt = toronto_neighborhood_df_analysis['Longitude'].tolist()
Neighborhood_venues = getNearbyVenues(Neighborhood_names, Neighborhood_lat, Neighborhood_lnt, radius=1000)

Parkwoods
Victoria Village
Lawrence Manor, Lawrence Heights
Don Mills
Parkview Hill, Woodbine Gardens
Glencairn
Don Mills, Flemingdon Park
Woodbine Heights
Humewood-Cedarvale
Caledonia-Fairbanks
Leaside
Hillcrest Village
Bathurst Manor, Wilson Heights, Downsview North
Thorncliffe Park
Fairview, Henry Farm, Oriole
Northwood Park, York University
The Danforth  East
Bayview Village
Downsview, CFB Toronto
York Mills, Silver Hills
Downsview
North Park, Maple Leaf Park, Upwood Park
Humber Summit
Willowdale, Newtonbrook
Downsview
Bedford Park, Lawrence Manor East
Del Ray, Mount Dennis, Keelsdale and Silverthorn
Humberlea, Emery
Willowdale
Downsview
Runnymede, The Junction North
Weston
York Mills West
Willowdale


#### Take a look at the data

In [174]:
print(Neighborhood_venues.shape)
Neighborhood_venues.head()

(997, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Allwyn's Bakery,43.75984,-79.324719,Caribbean Restaurant
1,Parkwoods,43.753259,-79.329656,Tim Hortons,43.760668,-79.326368,Café
2,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
3,Parkwoods,43.753259,-79.329656,A&W,43.760643,-79.326865,Fast Food Restaurant
4,Parkwoods,43.753259,-79.329656,Bruno's valu-mart,43.746143,-79.32463,Grocery Store


In [175]:
Neighborhood_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Bathurst Manor, Wilson Heights, Downsview North",30,30,30,30,30,30
Bayview Village,16,16,16,16,16,16
"Bedford Park, Lawrence Manor East",39,39,39,39,39,39
Caledonia-Fairbanks,25,25,25,25,25,25
"Del Ray, Mount Dennis, Keelsdale and Silverthorn",17,17,17,17,17,17
Don Mills,28,28,28,28,28,28
"Don Mills, Flemingdon Park",44,44,44,44,44,44
Downsview,43,43,43,43,43,43
"Downsview, CFB Toronto",22,22,22,22,22,22
"Fairview, Henry Farm, Oriole",44,44,44,44,44,44


#### And number of unique categories

In [176]:
print('There are {} unique categories.'.format(len(Neighborhood_venues['Venue Category'].unique())))

There are 179 unique categories.


#### Do the one hot encoding

In [183]:
# one hot encoding
Neighborhood_onehot = pd.get_dummies(Neighborhood_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Neighborhood_onehot['Neighborhood'] = Neighborhood_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Neighborhood_onehot.columns[-1]] + list(Neighborhood_onehot.columns[:-1])
Neighborhood_onehot = Neighborhood_onehot[fixed_columns]

print(Neighborhood_onehot.shape)
Neighborhood_onehot.head()

#Neighborhood_onehot[Neighborhood_onehot["Neighborhood"] == "York Mills, Silver Hills"]

(997, 180)


Unnamed: 0,Neighborhood,ATM,Accessories Store,Afghan Restaurant,Airport,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Dealership,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Beer Bar,Beer Store,Bike Shop,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Caribbean Restaurant,Carpet Store,Cheese Shop,Chinese Restaurant,Chocolate Shop,Clothing Store,Coffee Shop,Comfort Food Restaurant,Community Center,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fireworks Store,Fish & Chips Shop,Flea Market,Flower Shop,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,History Museum,Hockey Arena,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Ice Cream Shop,Indian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Juice Bar,Karaoke Bar,Kitchen Supply Store,Korean Restaurant,Latin American Restaurant,Laundry Service,Liquor Store,Locksmith,Lounge,Market,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Movie Theater,Nail Salon,New American Restaurant,Nightclub,Optical Shop,Other Repair Shop,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Pool,Portuguese Restaurant,Pub,Ramen Restaurant,Recreation Center,Rental Car Location,Residential Building 
(Apartment / Condo),Restaurant,Rock Climbing Spot,Salad Place,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shop & Service,Shopping Mall,Skating Rink,Ski Area,Ski Chalet,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Souvlaki Shop,Spa,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Sushi Restaurant,Tennis Court,Thai Restaurant,Theater,Toy / Game Store,Trail,Train Station,Turkish Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
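
The wide table above is hard to read, so here is the same one-hot step on a three-row toy frame. The venue rows are invented; `dtype=int` is passed so the indicator columns are 0/1 integers regardless of the pandas default.

```python
import pandas as pd

# toy data: two venues in one neighborhood, one in another
venues = pd.DataFrame({
    "Neighborhood": ["Parkwoods", "Parkwoods", "Victoria Village"],
    "Venue Category": ["Park", "Café", "Park"],
})

# one indicator column per category, with no prefix on the column names
onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="", dtype=int)

# put the neighborhood name back as the first column
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
```

Each row has exactly one indicator set, because each venue row carries a single category.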


#### Group rows by neighborhood, taking the mean frequency of occurrence of each category


In [178]:
Neighborhood_grouped = Neighborhood_onehot.groupby('Neighborhood').mean().reset_index()
Neighborhood_grouped.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,Afghan Restaurant,Airport,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Dealership,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Beer Bar,Beer Store,Bike Shop,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Caribbean Restaurant,Carpet Store,Cheese Shop,Chinese Restaurant,Chocolate Shop,Clothing Store,Coffee Shop,Comfort Food Restaurant,Community Center,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Field,Fireworks Store,Fish & Chips Shop,Flea Market,Flower Shop,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,Gift Shop,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,History Museum,Hockey Arena,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Ice Cream Shop,Indian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Juice Bar,Karaoke Bar,Kitchen Supply Store,Korean Restaurant,Latin American Restaurant,Laundry Service,Liquor Store,Locksmith,Lounge,Market,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Movie Theater,Nail Salon,New American Restaurant,Nightclub,Optical Shop,Other Repair Shop,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Pool,Portuguese Restaurant,Pub,Ramen Restaurant,Recreation Center,Rental Car Location,Residential Building 
(Apartment / Condo),Restaurant,Rock Climbing Spot,Salad Place,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shop & Service,Shopping Mall,Skating Rink,Ski Area,Ski Chalet,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Souvlaki Shop,Spa,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Sushi Restaurant,Tennis Court,Thai Restaurant,Theater,Toy / Game Store,Trail,Train Station,Turkish Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.025641,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.076923,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.025641,0.025641,0.025641,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.025641,0.0,0.025641,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0
3,Caledonia-Fairbanks,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.08,0.08,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0
4,"Del Ray, Mount Dennis, Keelsdale and Silverthorn",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.176471,0.058824,0.0,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.117647,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0
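
Taking the mean of 0/1 indicators per neighborhood gives the share of that neighborhood's venues falling in each category. A toy sketch with two hypothetical neighborhoods, `A` and `B`, and invented indicator rows:

```python
import pandas as pd

# one row per venue, one indicator column per category (toy data)
onehot = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Park": [1, 0, 0, 1, 0, 1],
    "Bank": [0, 1, 0, 0, 1, 0],
    "Café": [0, 0, 1, 0, 0, 0],
})

# the mean of the indicators is the relative frequency of each category
grouped = onehot.groupby("Neighborhood").mean().reset_index()

# the frequencies of one neighborhood add up to 1
row_totals = grouped[["Park", "Bank", "Café"]].sum(axis=1)
```

Because the values are shares rather than counts, a neighborhood with 16 venues and one with 44 venues end up on the same scale before clustering.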


#### Find the top 10 venues in each neighborhood

In [179]:
num_top_venues = 10

for hood in Neighborhood_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Neighborhood_grouped[Neighborhood_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]   
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bathurst Manor, Wilson Heights, Downsview North----
                       venue  freq
0  Park                       0.07
1  Coffee Shop                0.07
2  Bank                       0.07
3  Convenience Store          0.03
4  Ski Area                   0.03
5  Men's Store                0.03
6  Bridal Shop                0.03
7  Pizza Place                0.03
8  Middle Eastern Restaurant  0.03
9  Mobile Phone Shop          0.03


----Bayview Village----
                 venue  freq
0  Japanese Restaurant  0.12
1  Grocery Store        0.12
2  Gas Station          0.12
3  Bank                 0.12
4  Park                 0.06
5  Restaurant           0.06
6  Trail                0.06
7  Chinese Restaurant   0.06
8  Shopping Mall        0.06
9  Intersection         0.06


----Bedford Park, Lawrence Manor East----
                     venue  freq
0  Italian Restaurant       0.08
1  Coffee Shop              0.08
2  Sandwich Place           0.05
3  Bank                     0.05
4  Co

#### Add the info to dataframe

Create a data frame with the top 10 venue categories for each neighborhood.

Some neighborhoods have fewer than 10 venue categories; the remaining slots (frequency = 0.0) are filled with NaN.

In [180]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [208]:
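def return_most_common_venues(row, num_top_venues):
    #drop the leading 'Neighborhood' entry, then sort categories by frequency
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    #keep a category only if it actually occurs, otherwise pad with NaN
    top_venues = []
    for x in range(num_top_venues):
        if row_categories_sorted.iloc[x] > 0:
            top_venues.append(row_categories_sorted.index.values[x])
        else:
            top_venues.append(np.nan)
    return top_venues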
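
The NaN-padding idea can be checked on a small frequency row. This is a sketch under assumptions: the helper name `top_categories` is hypothetical, and it takes a Series that already contains only category frequencies (no `Neighborhood` entry).

```python
import numpy as np
import pandas as pd

def top_categories(freq_row, n):
    """Return the n most frequent category names, padding with NaN for zero frequencies."""
    ranked = freq_row.sort_values(ascending=False).head(n)
    names = [name if value > 0 else np.nan for name, value in ranked.items()]
    # pad when the row has fewer than n categories in total
    return names + [np.nan] * (n - len(names))

# toy frequencies: three categories occur, one does not
freq_row = pd.Series({"Park": 0.5, "Bank": 0.3, "Café": 0.2, "Gym": 0.0})
top = top_categories(freq_row, 5)
```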

In [211]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Neighborhood_grouped['Neighborhood']

for ind in np.arange(Neighborhood_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Neighborhood_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Bathurst Manor, Wilson Heights, Downsview North",Park,Coffee Shop,Bank,Convenience Store,Ski Area,Men's Store,Bridal Shop,Pizza Place,Middle Eastern Restaurant,Mobile Phone Shop
1,Bayview Village,Japanese Restaurant,Grocery Store,Gas Station,Bank,Park,Restaurant,Trail,Chinese Restaurant,Shopping Mall,Intersection
2,"Bedford Park, Lawrence Manor East",Italian Restaurant,Coffee Shop,Sandwich Place,Bank,Comfort Food Restaurant,Breakfast Spot,Bridal Shop,Sports Club,Butcher,Café
3,Caledonia-Fairbanks,Park,Pizza Place,Pharmacy,Sporting Goods Shop,Bus Stop,Café,Coffee Shop,Portuguese Restaurant,Discount Store,ATM
4,"Del Ray, Mount Dennis, Keelsdale and Silverthorn",Furniture / Home Store,Discount Store,Grocery Store,Intersection,Sandwich Place,Gas Station,Bar,Fast Food Restaurant,Dessert Shop,Playground
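
The column labels above rely on a three-item suffix list, which works for 10 columns but would give labels such as "21th" if `num_top_venues` were ever raised past 20. A small sketch of a general ordinal helper (the function name `ordinal` is an assumption, not part of the notebook):

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"            # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

# column labels in the same style as the notebook
labels = ["{} Most Common Venue".format(ordinal(i)) for i in range(1, 11)]
```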


### Cluster Analysis

In [212]:
# set number of clusters
kclusters = 5

Neighborhood_grouped_clustering = Neighborhood_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Neighborhood_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 1, 1, 2, 1, 1, 0, 1, 1])
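
The notebook fixes `kclusters = 5` without comparing alternatives. One common way to sanity-check such a choice is the silhouette score; the sketch below runs it on synthetic points with three obvious groups, not on the neighborhood data, so it only illustrates the procedure. The variable names and the synthetic data are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# synthetic data: three well-separated blobs of 20 points each
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
points = np.vstack([center + rng.normal(scale=0.5, size=(20, 2)) for center in centers])

# fit k-means for several k and record the silhouette score of each
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(points)
    scores[k] = silhouette_score(points, labels)

# the k with the highest score is the best-separated clustering
best_k = max(scores, key=scores.get)
```

On real frequency data the scores are usually much closer together, so the result is a hint rather than a decision rule.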

Dataframe with the cluster labels

In [227]:
#drop any labels left over from a previous run so this cell can be re-run safely
neighborhoods_venues_sorted.drop(columns='Cluster Labels', errors='ignore', inplace=True)
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
neighborhoods_venues_sorted.head()

Neighborhood_merged = toronto_neighborhood_df_analysis

# join the sorted venues onto the borough data so each neighborhood keeps its latitude/longitude
Neighborhood_merged = Neighborhood_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

print(Neighborhood_merged.shape)
Neighborhood_merged.head()

(34, 16)


Unnamed: 0,PostalCode,Borugh,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.753259,-79.329656,0,Park,Bus Stop,Pharmacy,Shopping Mall,Shop & Service,Café,Caribbean Restaurant,Chinese Restaurant,Cosmetics Shop,Pizza Place
1,M4A,North York,Victoria Village,43.725882,-79.315572,1,Coffee Shop,Hockey Arena,Golf Course,Playground,Pizza Place,Men's Store,Sporting Goods Shop,Portuguese Restaurant,Intersection,Gym / Fitness Center
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,1,Restaurant,Fast Food Restaurant,Coffee Shop,Fried Chicken Joint,Dessert Shop,Furniture / Home Store,Vietnamese Restaurant,Clothing Store,Sushi Restaurant,Accessories Store
7,M3B,North York,Don Mills,43.745906,-79.352188,1,Japanese Restaurant,Coffee Shop,Pizza Place,Burger Joint,Diner,Bank,Café,Mobile Phone Shop,Caribbean Restaurant,Paper / Office Supplies Store
8,M4B,East York,"Parkview Hill, Woodbine Gardens",43.706397,-79.309937,1,Coffee Shop,Pizza Place,Brewery,Gym / Fitness Center,Intersection,Flea Market,Fast Food Restaurant,Café,Gastropub,Bank


Visualization

In [223]:
geolocator = Nominatim(user_agent="toronto_agent")
toronto_location = geolocator.geocode("Toronto")

latitude = toronto_location.latitude
longitude = toronto_location.longitude

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Neighborhood_merged['Latitude'], Neighborhood_merged['Longitude'], Neighborhood_merged['Neighborhood'], Neighborhood_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
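
The color scheme in the cell above amounts to sampling the rainbow colormap once per cluster and converting each sample to a hex string. The sketch below isolates that step so it can be checked without drawing a map. It assumes `kclusters = 5`, which matches the labels 0 to 4 that appear in the output; `palette` and `color_by_label` are illustrative names, not part of the notebook.

```python
import numpy as np
from matplotlib import cm, colors

kclusters = 5  # assumption: five clusters, matching labels 0..4 in the output

# Sample the rainbow colormap at kclusters evenly spaced points in [0, 1].
rgba_rows = cm.rainbow(np.linspace(0, 1, kclusters))

# Convert each RGBA row to a hex string of the form '#rrggbb'.
palette = [colors.rgb2hex(rgba) for rgba in rgba_rows]

# Map each 0-based cluster label to its own color.
color_by_label = {label: palette[label] for label in range(kclusters)}
print(color_by_label)
```

Because the labels run from 0 to `kclusters - 1`, each label can index the palette directly.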

### Observations on the cluster labels

In [228]:
Neighborhood_merged.shape

(34, 16)

#### Cluster 1

In [230]:
Neighborhood_merged.loc[Neighborhood_merged['Cluster Labels'] == 0, Neighborhood_merged.columns[[1] + list(range(5, Neighborhood_merged.shape[1]))]]

Unnamed: 0,Borugh,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,North York,0,Park,Bus Stop,Pharmacy,Shopping Mall,Shop & Service,Café,Caribbean Restaurant,Chinese Restaurant,Cosmetics Shop,Pizza Place
10,North York,0,Grocery Store,Fast Food Restaurant,Pizza Place,Gas Station,Coffee Shop,Ice Cream Shop,Bakery,Metro Station,Spa,Electronics Store
27,North York,0,Coffee Shop,Park,Pharmacy,Recreation Center,Shopping Mall,Sandwich Place,Chinese Restaurant,Tennis Court,Restaurant,Fast Food Restaurant
28,North York,0,Park,Coffee Shop,Bank,Convenience Store,Ski Area,Men's Store,Bridal Shop,Pizza Place,Middle Eastern Restaurant,Mobile Phone Shop
39,North York,0,Japanese Restaurant,Grocery Store,Gas Station,Bank,Park,Restaurant,Trail,Chinese Restaurant,Shopping Mall,Intersection
46,North York,0,Grocery Store,Hotel,Vietnamese Restaurant,Pizza Place,Coffee Shop,Shopping Mall,Fast Food Restaurant,Gas Station,Pharmacy,Falafel Restaurant
49,North York,0,Coffee Shop,Bakery,Mobile Phone Shop,Mediterranean Restaurant,Chinese Restaurant,Dim Sum Restaurant,Park,Athletics & Sports,Intersection,Convenience Store
50,North York,0,Shopping Mall,Pizza Place,Pharmacy,Italian Restaurant,Electronics Store,Park,Bakery,Bank,Optical Shop,
53,North York,0,Grocery Store,Hotel,Vietnamese Restaurant,Pizza Place,Coffee Shop,Shopping Mall,Fast Food Restaurant,Gas Station,Pharmacy,Falafel Restaurant
60,North York,0,Grocery Store,Hotel,Vietnamese Restaurant,Pizza Place,Coffee Shop,Shopping Mall,Fast Food Restaurant,Gas Station,Pharmacy,Falafel Restaurant


#### Cluster 2

In [231]:
Neighborhood_merged.loc[Neighborhood_merged['Cluster Labels'] == 1, Neighborhood_merged.columns[[1] + list(range(5, Neighborhood_merged.shape[1]))]]

Unnamed: 0,Borugh,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,North York,1,Coffee Shop,Hockey Arena,Golf Course,Playground,Pizza Place,Men's Store,Sporting Goods Shop,Portuguese Restaurant,Intersection,Gym / Fitness Center
3,North York,1,Restaurant,Fast Food Restaurant,Coffee Shop,Fried Chicken Joint,Dessert Shop,Furniture / Home Store,Vietnamese Restaurant,Clothing Store,Sushi Restaurant,Accessories Store
7,North York,1,Japanese Restaurant,Coffee Shop,Pizza Place,Burger Joint,Diner,Bank,Café,Mobile Phone Shop,Caribbean Restaurant,Paper / Office Supplies Store
8,East York,1,Coffee Shop,Pizza Place,Brewery,Gym / Fitness Center,Intersection,Flea Market,Fast Food Restaurant,Café,Gastropub,Bank
13,North York,1,Restaurant,Gym,Supermarket,Café,Coffee Shop,Hockey Arena,Discount Store,Bus Line,Chinese Restaurant,Clothing Store
14,East York,1,Park,Coffee Shop,Café,Sandwich Place,Pizza Place,Athletics & Sports,Thai Restaurant,Restaurant,Skating Rink,Beer Store
16,York,1,Pizza Place,Convenience Store,Coffee Shop,Middle Eastern Restaurant,Restaurant,Dance Studio,Optical Shop,Field,Mexican Restaurant,Burger Joint
21,York,1,Park,Pizza Place,Pharmacy,Sporting Goods Shop,Bus Stop,Café,Coffee Shop,Portuguese Restaurant,Discount Store,ATM
23,East York,1,Sporting Goods Shop,Coffee Shop,Grocery Store,Burger Joint,Electronics Store,Furniture / Home Store,Restaurant,Brewery,Bank,Sports Bar
29,East York,1,Coffee Shop,Indian Restaurant,Grocery Store,Shopping Mall,Supermarket,Afghan Restaurant,Burger Joint,Pizza Place,Turkish Restaurant,Brewery


#### Cluster 3

In [232]:
Neighborhood_merged.loc[Neighborhood_merged['Cluster Labels'] == 2, Neighborhood_merged.columns[[1] + list(range(5, Neighborhood_merged.shape[1]))]]

Unnamed: 0,Borugh,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
56,York,2,Furniture / Home Store,Discount Store,Grocery Store,Intersection,Sandwich Place,Gas Station,Bar,Fast Food Restaurant,Dessert Shop,Playground


#### Cluster 4

In [233]:
Neighborhood_merged.loc[Neighborhood_merged['Cluster Labels'] == 3, Neighborhood_merged.columns[[1] + list(range(5, Neighborhood_merged.shape[1]))]]

Unnamed: 0,Borugh,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
45,North York,3,Park,Pool,,,,,,,,


#### Cluster 5

In [234]:
Neighborhood_merged.loc[Neighborhood_merged['Cluster Labels'] == 4, Neighborhood_merged.columns[[1] + list(range(5, Neighborhood_merged.shape[1]))]]

Unnamed: 0,Borugh,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
57,North York,4,Bakery,Convenience Store,Storage Facility,Park,Golf Course,Discount Store,Gas Station,,,
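
The five cells above repeat one selection: rows whose `Cluster Labels` value equals a given label, restricted to column 1 (`Borugh`) plus every column from `Cluster Labels` onward. A small helper can express that once. This is a sketch only: `show_cluster` is a hypothetical function name, and `toy` is a made-up frame that mimics the leading column layout of `Neighborhood_merged`.

```python
import pandas as pd

def show_cluster(df, label, label_col="Cluster Labels", first_col=1):
    """Return one cluster's rows: column `first_col` plus label_col onward."""
    start = df.columns.get_loc(label_col)
    keep = [first_col] + list(range(start, df.shape[1]))
    return df.loc[df[label_col] == label, df.columns[keep]]

# Made-up data for illustration; the values are not from the notebook.
toy = pd.DataFrame({
    "PostalCode": ["X1", "X2", "X3"],
    "Borugh": ["North York", "York", "North York"],
    "Neighborhood": ["A", "B", "C"],
    "Latitude": [43.70, 43.71, 43.72],
    "Longitude": [-79.30, -79.31, -79.32],
    "Cluster Labels": [0, 1, 0],
    "1st Most Common Venue": ["Park", "Bank", "Café"],
})

# Print every cluster in turn, numbering them from 1 as the headings do.
for label in sorted(toy["Cluster Labels"].unique()):
    print(f"Cluster {label + 1}")
    print(show_cluster(toy, label), end="\n\n")
```

Looking up the position of `Cluster Labels` by name avoids hard-coding the index 5, so the helper keeps working if columns are added before it.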


#### Final Words

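The cluster analysis of the 34 neighborhoods shows that three of them stand apart: Clusters 3, 4, and 5 each contain a single neighborhood (rows 56, 45, and 57), while the remaining neighborhoods fall into Clusters 1 and 2. The neighborhoods in Clusters 4 and 5 also list fewer than ten venue categories, so their separation may partly reflect sparse venue data rather than a distinctive venue mix.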
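
Single-member clusters can also be picked out programmatically instead of by reading the tables. The sketch below uses `groupby(...).filter(...)` on a made-up frame; in the notebook the same call would presumably be applied to `Neighborhood_merged`, which is an assumption about how the data is held at this point.

```python
import pandas as pd

# Hypothetical stand-in for Neighborhood_merged; values are illustrative only.
toy = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D", "E", "F"],
    "Cluster Labels": [0, 0, 1, 1, 2, 3],
})

# Keep only the rows whose cluster has exactly one member.
singletons = toy.groupby("Cluster Labels").filter(lambda g: len(g) == 1)
print(singletons)
```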