# Neighborhoods Comparison
### - The Battle of Neighborhoods

### Table of Contents


## 1. [Introduction](#1.)
### 1.1 [Background](#1.1)
### 1.2 [About Datasets](#1.2)
#### 1.2.1 [Toronto Neighborhoods](#1.2.1)
#### 1.2.2 [Metro Vancouver Neighborhoods](#1.2.2)

## 2. [Data Gathering](#2.)
### 2.1 [List of Neighborhoods](#2.1)
#### 2.1.1 [Neighborhoods in Toronto](#2.1.1)
#### 2.1.2 [Neighborhoods in Metro Vancouver](#2.1.2)

### 2.2 [Coordinates of the Neighborhoods](#2.2)
#### 2.2.1 [Coordinates of Toronto Neighborhoods](#2.2.1)
#### 2.2.2 [Coordinates of Metro Vancouver Neighborhoods](#2.2.2)

### 2.3 [Venue Information](#2.3)
#### 2.3.1 [Venues in Toronto](#2.3.1)
#### 2.3.2 [Venues in Metro Vancouver](#2.3.2)

## 3. [Exploration of Venue Information](#3.)
### 3.1 [Neighborhoods and Venues in Both Cities](#3.1)
### 3.2 [Venues in Toronto](#3.2)
### 3.3 [Venues in Metro Vancouver](#3.3)
### 3.4 [The Most Common Venues](#3.4)
#### 3.4.1 [The Most Common Venues for each Neighborhood in Toronto](#3.4.1)
#### 3.4.2 [The Most Common Venues for each Neighborhood in Metro Vancouver](#3.4.2)
#### 3.4.3 [The Most Common Categories of Venues ](#3.4.3)

## 4. [Neighborhoods Clustering](#4.)

## 5. [Cluster Examinations](#5.)
### 5.1 [Clusters of Both Cities](#5.1)
### 5.2 [Clusters of Toronto](#5.2)
### 5.3 [Clusters of Metro Vancouver](#5.3)


<a id="1."></a>
## 1. Introduction
<a id="1.1"></a>
### 1.1 Background

Toronto and Vancouver are two of Canada's best-known cities. Canada recently announced a new immigration policy under which it will accept more immigrants. Many people who are considering immigrating to Canada ask, "Which city is better, Toronto or Vancouver?" The answer depends on each person's circumstances and criteria, so any decision is subjective. Still, concrete information about the two cities can help them decide.

<a id="1.2"></a>
### 1.2 About Datasets
<a id="1.2.1"></a>
#### 1.2.1 Toronto Neighborhoods
Toronto, the provincial capital of Ontario, is the most populous city in Canada, with a population of about 2.7 million. It is also the most preferred destination for new immigrants to Canada. Toronto has about 200 neighborhoods.

<a id="1.2.2"></a>
#### 1.2.2 Metro Vancouver Neighborhoods
Metro Vancouver consists of 23 cities and villages. It is the second most preferred destination for new immigrants to Canada, and its population is about 2.4 million. This comparison between Toronto and Vancouver includes neighborhoods from 7 cities of Metro Vancouver:
- Vancouver
- West Vancouver
- North Vancouver
- Burnaby
- Surrey
- Coquitlam
- Richmond

<a id="2."></a>
## 2. Data Gathering

#### Load library and packages

In [1]:
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup
import json

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import since pandas 1.0)

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

import folium # map rendering library

<a id="2.1"></a>
### 2.1 List of Neighborhoods

<a id="2.1.1"></a>
#### 2.1.1 Neighborhoods in Toronto
- A list of neighborhoods in Toronto can be found on Wikipedia (url: https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M)
- Extract postal codes and neighborhoods using web scraping.

In [2]:
# Read postal code and neighborhood list from Wikipedia
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
wikipedia_page = requests.get(url)

# extract postal code and neighborhood from html
neighborhood = pd.read_html(wikipedia_page.content, header=0)[0]
neighborhood.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


In [3]:
# Remove rows if the value of Borough is "Not assigned."
neighborhood = neighborhood[neighborhood.Borough != 'Not assigned']

In [4]:
# Rename column names
neighborhood.rename(columns={'Postal Code': 'PostalCode', 'Neighbourhood':'Neighborhood'}, inplace = True)
neighborhood.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


A single row can list multiple neighborhoods. Split them so that each row holds exactly one neighborhood.
Create a new dataframe 'toronto' to store the neighborhood list.

In [5]:
toronto = pd.DataFrame(neighborhood.Neighborhood.str.split(',').tolist(), index = neighborhood.PostalCode).stack()
toronto = toronto.reset_index()[[0, 'PostalCode']]
toronto.columns = ['Neighborhood', 'PostalCode']

# remove leading space (' ')
toronto['Neighborhood'] = toronto['Neighborhood'].str.strip(' ')
toronto.head()

Unnamed: 0,Neighborhood,PostalCode
0,Parkwoods,M3A
1,Victoria Village,M4A
2,Regent Park,M5A
3,Harbourfront,M5A
4,Lawrence Manor,M6A
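
The same one-neighborhood-per-row reshaping can also be written with `str.split` followed by `DataFrame.explode`. The sketch below is an alternative, not the method used in this notebook; it runs on a small hand-made frame whose values are copied from the output above.

```python
import pandas as pd

# Toy frame mirroring the Wikipedia table layout (values copied from the output above)
neighborhood = pd.DataFrame({
    'PostalCode': ['M3A', 'M5A', 'M6A'],
    'Neighborhood': ['Parkwoods', 'Regent Park, Harbourfront',
                     'Lawrence Manor, Lawrence Heights'],
})

# Split on commas into lists, then give every list element its own row
exploded = (
    neighborhood
    .assign(Neighborhood=neighborhood['Neighborhood'].str.split(','))
    .explode('Neighborhood')
)

# Strip the space left behind by the ", " separator and renumber the rows
exploded['Neighborhood'] = exploded['Neighborhood'].str.strip()
exploded = exploded.reset_index(drop=True)[['Neighborhood', 'PostalCode']]
print(exploded)
```

`explode` keeps the postal code next to each neighborhood automatically, so no `stack()` or manual column renaming is needed.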


In [6]:
# check duplicated values in Neighborhood
toronto[toronto['Neighborhood'].duplicated()]

Unnamed: 0,Neighborhood,PostalCode
26,Don Mills,M3C
85,Downsview,M3L
99,Downsview,M3M
110,Willowdale,M2N
112,Downsview,M3N
130,Willowdale,M2R
133,Lawrence Park,M4R
148,Runnymede,M6S
194,St. James Town,M4X


In [7]:
# Drop duplicated rows
toronto.drop_duplicates('Neighborhood', inplace=True, ignore_index=True)

In [8]:
# check duplicated values are removed
toronto[toronto['Neighborhood'].duplicated()]

Unnamed: 0,Neighborhood,PostalCode


In [9]:
# remove PostalCode column
toronto.drop('PostalCode', axis = 1, inplace=True)

In [10]:
# add 'City' column and assign to 'Toronto'
toronto['City'] = 'Toronto'

In [11]:
toronto.head()

Unnamed: 0,Neighborhood,City
0,Parkwoods,Toronto
1,Victoria Village,Toronto
2,Regent Park,Toronto
3,Harbourfront,Toronto
4,Lawrence Manor,Toronto


<a id="2.1.2"></a>
#### 2.1.2 Neighborhoods in Metro Vancouver

#### 1) Neighborhoods of the City of Vancouver
Neighborhoods of the city of Vancouver are listed on Wikipedia (https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Vancouver)

In [12]:
response = requests.get(
    url="https://en.wikipedia.org/wiki/List_of_neighbourhoods_in_Vancouver")

In [13]:
# get list of neighborhoods from html
# use BeautifulSoup4 to parse the html
wiki_data = BeautifulSoup(response.content, 'html.parser')
wiki_data


In [14]:
li = wiki_data.find_all('li')
neighborhood_list = []

for i in range(5, 27):
    neighborhood_list.append(li[i].text.split('-', 1)[0])

neighborhoods = pd.DataFrame (neighborhood_list, columns=['Neighborhood'])
neighborhoods

Unnamed: 0,Neighborhood
0,Arbutus Ridge
1,Downtown
2,Dunbar
3,Fairview
4,Grandview
5,Hastings
6,Kensington
7,Kerrisdale
8,Killarney
9,Kitsilano


In [15]:
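# The last row has a different format: unlike the other rows, it is a full sentence.
# Replace it with the neighborhood name.
# .loc assigns on the dataframe itself; chained indexing (.iloc[21].Neighborhood = ...) may write to a temporary copy.
neighborhoods.loc[21, 'Neighborhood'] = "West Point Grey"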
neighborhoods.tail()

Unnamed: 0,Neighborhood
17,Strathcona
18,Sunset
19,Victoria
20,West End
21,West Point Grey


#### 2) Neighborhoods in the other 6 cities of Metro Vancouver
Read the neighborhoods of the other 6 Metro Vancouver cities from the 'neighborhood_metro_vancouver.csv' file.

In [16]:
neighborhoods_6_cities = pd.read_csv('neighborhood_metro_vancouver.csv')
neighborhoods_6_cities.head()

Unnamed: 0,City,Neighborhood
0,Burnaby,Burnaby Heights
1,Burnaby,Willingdon Heights
2,Burnaby,West Central Valley
3,Burnaby,Dawson-Delta
4,Burnaby,Brentwood


#### 3) Concatenate the lists of neighborhoods in Vancouver and the other 6 cities
Create a new dataframe 'vancouver' and store the combined list of neighborhoods in it.
Add a 'City' column to distinguish identical neighborhood names in different cities.

In [17]:
# add 'City' column to neighborhood dataframe that stores neighborhoods of city of vancouver
# assign 'Vancouver' to the 'City' column
neighborhoods['City'] = 'Vancouver'

In [18]:
# merge the two dataframes into a single dataframe
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0;
# ignore_index=True renumbers the rows 0..n-1, so no separate index reset is needed
vancouver = pd.concat([neighborhoods, neighborhoods_6_cities], ignore_index=True)
vancouver.head()

Unnamed: 0,Neighborhood,City
0,Arbutus Ridge,Vancouver
1,Downtown,Vancouver
2,Dunbar,Vancouver
3,Fairview,Vancouver
4,Grandview,Vancouver


In [19]:
vancouver.tail()

Unnamed: 0,Neighborhood,City
132,Central Lonsdale,North Vancouver
133,Grand Boulevard,North Vancouver
134,Cedar Village,North Vancouver
135,Moodyville,North Vancouver
136,Upper Capilano,North Vancouver
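
To see why the 'City' column matters, one can list the neighborhood names that occur in more than one city after the concatenation. This is a minimal sketch on toy frames; the rows are illustrative and are not taken from the real CSV file.

```python
import pandas as pd

# Toy stand-ins for the two frames being combined (rows are illustrative)
city_of_vancouver = pd.DataFrame({'Neighborhood': ['Downtown', 'Fairview'],
                                  'City': 'Vancouver'})
other_cities = pd.DataFrame({'City': ['Coquitlam', 'Burnaby'],
                             'Neighborhood': ['Downtown', 'Brentwood']})

# Columns are matched by name, so the differing column order is harmless
combined = pd.concat([city_of_vancouver, other_cities], ignore_index=True)

# keep=False flags every row whose name occurs more than once
shared_names = combined[combined['Neighborhood'].duplicated(keep=False)]
print(shared_names)
```

Any name that shows up here is ambiguous without the city, which is also why the geocoding step later builds the address from both columns.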


<a id="2.2"></a>
### 2.2 Coordinates of the Neighborhoods

<a id="2.2.1"></a>
#### 2.2.1 Coordinates of Toronto Neighborhoods
#### 1) Get coordinates of neighborhoods in Toronto

In [20]:
# Add 'Latitude' and 'Longitude' columns to the toronto dataframe, initialized to NaN
# (np.nan is used because the np.NAN alias was removed in NumPy 2.0)
toronto['Latitude'] = np.nan
toronto['Longitude'] = np.nan
toronto.head()

Unnamed: 0,Neighborhood,City,Latitude,Longitude
0,Parkwoods,Toronto,,
1,Victoria Village,Toronto,,
2,Regent Park,Toronto,,
3,Harbourfront,Toronto,,
4,Lawrence Manor,Toronto,,


Get coordinates (latitude and longitude) using geopy.

In [21]:
geolocator = Nominatim(user_agent='toronto_explorer')

for i in toronto.index:
    name = toronto.loc[i, 'Neighborhood']
    try:
        address = name + ' Toronto, ON'
        # geocode() returns None when the address cannot be resolved
        location = geolocator.geocode(address)
    except Exception:
        location = None

    if location is None:
        # the row keeps its initial NaN coordinates
        print('No coordinates:', name)
        continue

    # .loc writes to the dataframe itself and avoids SettingWithCopyWarning
    toronto.loc[i, 'Latitude'] = location.latitude
    toronto.loc[i, 'Longitude'] = location.longitude

No coordinates: Ontario Provincial Government
No coordinates: Caledonia-Fairbanks
No coordinates: Del Ray
No coordinates: Keelsdale and Silverthorn
No coordinates: Canada Post Gateway Processing Centre
No coordinates: Railway Lands
No coordinates: Island airport
No coordinates: Humber Bay Shores
No coordinates: Beaumond Heights
No coordinates: Stn A PO Boxes
No coordinates: Business reply mail Processing Centre
No coordinates: South Central Letter Processing Plant Toronto
No coordinates: South of Bloor
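
Because the same lookup loop is needed for both cities, it can be factored into a helper that takes the geocoding function as an argument. The sketch below is one possible shape, not the notebook's own code: `add_coordinates`, `FakeLocation` and the `known` lookup table are hypothetical names, and the fake geocoder only exists so the example runs without network access.

```python
import pandas as pd

def add_coordinates(df, geocode, suffix):
    """Return a copy of df with Latitude/Longitude filled where geocode() succeeds.

    geocode is any callable that takes an address string and returns an object
    with .latitude and .longitude attributes, or None when nothing is found.
    """
    out = df.copy()
    out['Latitude'] = float('nan')
    out['Longitude'] = float('nan')
    for idx, name in out['Neighborhood'].items():
        location = geocode(f'{name} {suffix}')
        if location is None:
            print('No coordinates:', name)
            continue
        out.loc[idx, 'Latitude'] = location.latitude
        out.loc[idx, 'Longitude'] = location.longitude
    return out

# Offline stand-in for a geocoder result (hypothetical, for demonstration only)
class FakeLocation:
    def __init__(self, latitude, longitude):
        self.latitude = latitude
        self.longitude = longitude

known = {'Parkwoods Toronto, ON': FakeLocation(43.7588, -79.320197)}
demo = pd.DataFrame({'Neighborhood': ['Parkwoods', 'Railway Lands']})

# dict.get returns None for unknown addresses, mimicking a failed lookup
result = add_coordinates(demo, known.get, 'Toronto, ON')
print(result)
```

In the notebook, `geolocator.geocode` would be passed instead of `known.get`. For long lists it may also be worth wrapping it in geopy's `RateLimiter` (from `geopy.extra.rate_limiter`) to space out requests to Nominatim.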


Save toronto dataframe to 'toronto_neighborhoods_master.csv'
- The file stores all neighborhoods even if their coordinates are not confirmed.

In [22]:
toronto.to_csv('toronto_neighborhoods_master.csv')

#### 2) Create Map of Toronto
Create a map of Toronto and mark the neighborhoods using Folium

In [23]:
# Remove Neighborhood if coordinates are not confirmed
toronto= toronto[~toronto.Latitude.isnull()]
toronto.reset_index(drop=True, inplace=True)

toronto.head()

Unnamed: 0,Neighborhood,City,Latitude,Longitude
0,Parkwoods,Toronto,43.7588,-79.320197
1,Victoria Village,Toronto,43.732658,-79.311189
2,Regent Park,Toronto,43.660706,-79.360457
3,Harbourfront,Toronto,43.64008,-79.38015
4,Lawrence Manor,Toronto,43.722079,-79.437507


In [24]:
# Get coordinate of Toronto
address = 'Toronto, ON'
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, label in zip(toronto['Latitude'], toronto['Longitude'], toronto['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

<a id="2.2.2"></a>
#### 2.2.2 Coordinates of Metro Vancouver Neighborhoods
#### 1) Get coordinates of neighborhoods in Metro Vancouver

In [25]:
# Add 'Latitude' and 'Longitude' columns to the vancouver dataframe, initialized to NaN
# (np.nan is used because the np.NAN alias was removed in NumPy 2.0)
vancouver['Latitude'] = np.nan
vancouver['Longitude'] = np.nan
vancouver.head()

Unnamed: 0,Neighborhood,City,Latitude,Longitude
0,Arbutus Ridge,Vancouver,,
1,Downtown,Vancouver,,
2,Dunbar,Vancouver,,
3,Fairview,Vancouver,,
4,Grandview,Vancouver,,


Get coordinates (latitude and longitude) using geopy.

In [26]:
geolocator = Nominatim(user_agent='vancouver_explorer')

for i in vancouver.index:
    name = vancouver.loc[i, 'Neighborhood']
    try:
        address = name + ' ' + vancouver.loc[i, 'City'] + ' , BC'
        # geocode() returns None when the address cannot be resolved
        location = geolocator.geocode(address)
    except Exception:
        location = None

    if location is None:
        # the row keeps its initial NaN coordinates
        print('No coordinates:', name)
        continue

    # .loc writes to the dataframe itself and avoids SettingWithCopyWarning
    vancouver.loc[i, 'Latitude'] = location.latitude
    vancouver.loc[i, 'Longitude'] = location.longitude

No coordinates: Dawson-Delta
No coordinates: Parkcrest-Aubrey
No coordinates: Ardingley-Sprott
No coordinates: Clinton-Glenwood
No coordinates: Cascade-Schou
No coordinates: Douglas-Gilpin
No coordinates: Kingsway-Beresford
No coordinates: Morley-Buckingham
No coordinates: Lakeview-Mayfield
No coordinates: South Westminster
No coordinates: NEW HORIZONS
No coordinates: PARK RIDGE ESTATES
No coordinates: UPPER EAGLE RIDGE
No coordinates: Boundary
No coordinates: Cleveland


Save vancouver dataframe to 'vancouver_neighborhoods_master.csv'
- The file stores all neighborhoods even if their coordinates are not confirmed.

In [27]:
vancouver.to_csv('vancouver_neighborhoods_master.csv')

#### 2) Create Map of Metro Vancouver
Create a map of Metro Vancouver and mark the neighborhoods using Folium

In [28]:
# Remove Neighborhood if coordinates are not confirmed
vancouver = vancouver[~vancouver['Latitude'].isnull()]
vancouver.reset_index(drop=True, inplace=True)
vancouver.head()

Unnamed: 0,Neighborhood,City,Latitude,Longitude
0,Arbutus Ridge,Vancouver,49.246305,-123.159636
1,Downtown,Vancouver,49.285998,-123.127358
2,Dunbar,Vancouver,49.247581,-123.18495
3,Fairview,Vancouver,49.261956,-123.130408
4,Grandview,Vancouver,49.262593,-123.068806


In [29]:
# Get coordinate of Vancouver
address = 'Vancouver, BC'
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
map_vancouver = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, label in zip(vancouver['Latitude'], vancouver['Longitude'], vancouver['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_vancouver)  
    
map_vancouver

##### Save dataframe to CSV file
- These files contain only the neighborhoods with confirmed coordinates.

In [30]:
# save toronto dataframe
toronto.to_csv('toronto_neighborhood_coordinates.csv')
# save vancouver dataframe
vancouver.to_csv('vancouver_neighborhood_coordinates.csv')

In [31]:
# Merge toronto and vancouver dataframes
vancouver['City'] = 'Metro Vancouver'
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
toronto_vancouver = pd.concat([toronto, vancouver])
toronto_vancouver.shape

(317, 4)

<a id="2.3"></a>
### 2.3 Get Venue Information
Get venue information from Foursquare for each neighborhood in Toronto and Metro Vancouver

In [32]:
# Set client id, client secret and version
CLIENT_ID = ''
CLIENT_SECRET = ''
VERSION = '20210205' # Foursquare API version
LIMIT = 100 # up to 100 venues per given location i.e. neighborhood in Toronto and Metro Vancouver

Define a function that gets nearby venues for a given location. "Nearby" means within a radius of 500 m by default.

In [33]:
def getNearbyVenues(names, city_name, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url_base = 'https://api.foursquare.com/v2/venues/'
        url = url_base +'explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name,
            city_name,
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 'City', 'Neighborhood Latitude', 'Neighborhood Longitude',
                             'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category']
    
    return(nearby_venues)
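
The list comprehension above assumes that every venue has at least one category and that the response always contains a first group. A more defensive version of the parsing step could look like the sketch below. `extract_venue_rows` is a hypothetical helper, and `sample` is a hand-written payload in the shape the function above reads, not a real API response.

```python
def extract_venue_rows(payload, neighborhood, city):
    """Flatten an explore-style payload into
    (neighborhood, city, venue, lat, lng, category) tuples."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None rather than raising IndexError on an uncategorised venue
        category = categories[0]['name'] if categories else None
        rows.append((neighborhood, city, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows

# Hand-written payload for illustration; the second venue has no category
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Tim Hortons',
               'location': {'lat': 43.760668, 'lng': -79.326368},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'name': 'Unlabelled Spot',
               'location': {'lat': 43.76, 'lng': -79.32},
               'categories': []}},
]}]}}

rows = extract_venue_rows(sample, 'Parkwoods', 'Toronto')
print(rows)

# A response without any groups yields an empty list instead of an exception
empty = extract_venue_rows({'response': {}}, 'Parkwoods', 'Toronto')
```

Separating the parsing from the HTTP request also makes it possible to test the row layout without calling the API.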

In [34]:
# Set nearby radius to 1,000m
radius = 1000

<a id="2.3.1"></a>
#### 2.3.1 Venues in Toronto

In [35]:
toronto_venues = getNearbyVenues(names=toronto['Neighborhood'], city_name = 'Toronto',
                                 latitudes=toronto['Latitude'], longitudes=toronto['Longitude'],
                                 radius = radius)

Parkwoods
Victoria Village
Regent Park
Harbourfront
Lawrence Manor
Lawrence Heights
Queen's Park
Islington Avenue
Humber Valley Village
Malvern
Rouge
Don Mills
Parkview Hill
Woodbine Gardens
Garden District
Ryerson
Glencairn
West Deane Park
Princess Gardens
Martin Grove
Islington
Cloverdale
Rouge Hill
Port Union
Highland Creek
Woodbine Heights
St. James Town
Humewood-Cedarvale
Eringate
Bloordale Gardens
Old Burnhamthorpe
Markland Wood
Guildwood
Morningside
West Hill
The Beaches
Berczy Park
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor
Wilson Heights
Downsview North
Thorncliffe Park
Richmond
Adelaide
King
Dufferin
Dovercourt Village
Scarborough Village
Fairview
Henry Farm
Oriole
Northwood Park
York University
East Toronto
Broadview North (Old East York)
Harbourfront East
Union Station
Toronto Islands
Little Portugal
Trinity
Kennedy Park
Ionview
East Birchmount Park
Bayview Village
Downsview
The Danforth West
Riverdale
Toronto Dominion Centre
Desig

In [36]:
toronto_venues.shape

(10286, 8)

In [37]:
toronto_venues.head(10)

Unnamed: 0,Neighborhood,City,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,Toronto,43.7588,-79.320197,Allwyn's Bakery,43.75984,-79.324719,Caribbean Restaurant
1,Parkwoods,Toronto,43.7588,-79.320197,Tim Hortons,43.760668,-79.326368,Café
2,Parkwoods,Toronto,43.7588,-79.320197,LCBO,43.757774,-79.314257,Liquor Store
3,Parkwoods,Toronto,43.7588,-79.320197,A&W,43.760643,-79.326865,Fast Food Restaurant
4,Parkwoods,Toronto,43.7588,-79.320197,Dollarama,43.758135,-79.310672,Discount Store
5,Parkwoods,Toronto,43.7588,-79.320197,Shoppers Drug Mart,43.760857,-79.324961,Pharmacy
6,Parkwoods,Toronto,43.7588,-79.320197,Tim Hortons,43.758295,-79.31231,Coffee Shop
7,Parkwoods,Toronto,43.7588,-79.320197,The Beer Store,43.758275,-79.313705,Beer Store
8,Parkwoods,Toronto,43.7588,-79.320197,Food Basics,43.760549,-79.326045,Supermarket
9,Parkwoods,Toronto,43.7588,-79.320197,Staples,43.758379,-79.310695,Paper / Office Supplies Store
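
As a sanity check on the 1,000 m search radius, the distance between a neighborhood's coordinates and one of its venues can be computed with the haversine formula. This is a minimal sketch; `haversine_m` is a hypothetical helper, and the coordinates are those of Parkwoods and the first venue in the table above.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points given in degrees."""
    earth_radius_m = 6_371_000  # mean Earth radius
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))

# Parkwoods centre and the first venue listed for it in the table above
distance = haversine_m(43.7588, -79.320197, 43.75984, -79.324719)
print(round(distance), 'm from the neighborhood centre')
```

The result should fall well inside the 1,000 m radius that was passed to `getNearbyVenues`.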


- About 10,000 venues are found.
- Note that the search areas of adjacent neighborhoods overlap, so venues in an overlapping area appear under multiple neighborhoods.
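
One way to gauge how much the overlap inflates the venue count is to treat venue name plus coordinates as a key and look for repeats. The sketch below uses toy rows with made-up venue names; the column names match the venue dataframe above.

```python
import pandas as pd

# Toy rows: the same venue returned for two neighboring search circles
venues_demo = pd.DataFrame({
    'Neighborhood': ['Regent Park', 'Harbourfront', 'Harbourfront'],
    'Venue': ['Corner Cafe', 'Corner Cafe', 'Pier Park'],
    'Venue Latitude': [43.650, 43.650, 43.640],
    'Venue Longitude': [-79.370, -79.370, -79.380],
})

venue_key = ['Venue', 'Venue Latitude', 'Venue Longitude']

# Rows whose venue shows up under more than one neighborhood
shared = venues_demo[venues_demo.duplicated(subset=venue_key, keep=False)]

# Number of physically distinct venues once repeats are collapsed
distinct_count = len(venues_demo.drop_duplicates(subset=venue_key))
print(len(shared), 'shared rows;', distinct_count, 'distinct venues')
```

For the clustering later on, keeping the repeated rows is reasonable, because a venue near a boundary is genuinely part of both neighborhoods' surroundings.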

In [38]:
# save venue information
toronto_venues.to_csv('toronto_venues.csv', index=False)

<a id="2.3.2"></a>
#### 2.3.2 Venues in Metro Vancouver

In [39]:
vancouver_venues = getNearbyVenues(names=vancouver['Neighborhood'], city_name = 'Metro Vanouver',
                                 latitudes=vancouver['Latitude'], longitudes=vancouver['Longitude'],
                                 radius=radius)

Arbutus Ridge 
Downtown 
Dunbar
Fairview 
Grandview
Hastings
Kensington
Kerrisdale 
Killarney 
Kitsilano 
Marpole 
Mount Pleasant 
Oakridge 
Renfrew
Riley Park 
Shaughnessy 
South Cambie 
Strathcona 
Sunset 
Victoria
West End 
West Point Grey
Burnaby Heights
Willingdon Heights
West Central Valley
Brentwood
Capitol Hill
Burnaby Lake
Government Road
Sperling-Broadway
Lochdale
Westridge
Burnaby Mountain
Lake City
Lyndhurst
Cameron
Cariboo-Armstrong
Second Street
Edmonds
Stride Avenue
Stride Hill
Big Bend
Sussex-Nelson
Suncrest
Maywood
Garden Village
Marlborough
Windsor
Richmond Park
Oakalla
Bridgeview
Cloverdale
Crescent Beach
Douglas
Fleetwood
Fraser Heights
Grandview Heights
Guildford
Newton
Ocean Park
Port Kells
Port Mann
South Surrey
Sunnyside
Whalley
Altamont
Ambleside
British Properties
Cedardale
Dundarave
Hollyburn
Horseshoe Bay
Sentinel Hill
West Bay
NORTH COQUITLAM
DOWNTOWN
BURKE MOUNTAIN
COQUITLAM WEST
BURQUITLAM
CANYON SPRINGS
CENTRAL COQUITLAM
COQUITLAM EAST
CHINESIDE
CAPE HOR

In [40]:
vancouver_venues.shape

(4171, 8)

In [41]:
vancouver_venues.head(10)

Unnamed: 0,Neighborhood,City,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Arbutus Ridge,Metro Vanouver,49.246305,-123.159636,The Patty Shop,49.25068,-123.167916,Caribbean Restaurant
1,Arbutus Ridge,Metro Vanouver,49.246305,-123.159636,The Arbutus Club,49.248507,-123.152152,Event Space
2,Arbutus Ridge,Metro Vanouver,49.246305,-123.159636,Quilchena Park,49.245194,-123.151211,Park
3,Arbutus Ridge,Metro Vanouver,49.246305,-123.159636,Butter Baked Goods,49.242209,-123.170381,Bakery
4,Arbutus Ridge,Metro Vanouver,49.246305,-123.159636,La Buca,49.250549,-123.167933,Italian Restaurant
5,Arbutus Ridge,Metro Vanouver,49.246305,-123.159636,Starbucks,49.244768,-123.153891,Coffee Shop
6,Arbutus Ridge,Metro Vanouver,49.246305,-123.159636,Subway,49.244558,-123.153975,Sandwich Place
7,Arbutus Ridge,Metro Vanouver,49.246305,-123.159636,Dollarama,49.248885,-123.154049,Discount Store
8,Arbutus Ridge,Metro Vanouver,49.246305,-123.159636,Wakwak Burger,49.25243,-123.159954,Food Truck
9,Arbutus Ridge,Metro Vanouver,49.246305,-123.159636,7-Eleven,49.23842,-123.155939,Convenience Store


- About 4,000 venues are found.
- Note that the search areas of adjacent neighborhoods overlap, so venues in an overlapping area appear under multiple neighborhoods.

In [42]:
# save venue information
vancouver_venues.to_csv('vancouver_venues.csv', index=False)

Combine the toronto_venues and vancouver_venues dataframes into a single dataframe

In [43]:
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
venues = pd.concat([toronto_venues, vancouver_venues])
venues.shape

(14457, 8)

In [44]:
venues.head()

Unnamed: 0,Neighborhood,City,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,Toronto,43.7588,-79.320197,Allwyn's Bakery,43.75984,-79.324719,Caribbean Restaurant
1,Parkwoods,Toronto,43.7588,-79.320197,Tim Hortons,43.760668,-79.326368,Café
2,Parkwoods,Toronto,43.7588,-79.320197,LCBO,43.757774,-79.314257,Liquor Store
3,Parkwoods,Toronto,43.7588,-79.320197,A&W,43.760643,-79.326865,Fast Food Restaurant
4,Parkwoods,Toronto,43.7588,-79.320197,Dollarama,43.758135,-79.310672,Discount Store


In [45]:
venues.tail()

Unnamed: 0,Neighborhood,City,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
4166,Upper Capilano,Metro Vanouver,49.320999,-123.097217,BC SPCA Thrift Store,49.322468,-123.107998,Thrift / Vintage Store
4167,Upper Capilano,Metro Vanouver,49.320999,-123.097217,Barre Fitness,49.324373,-123.106948,Gym
4168,Upper Capilano,Metro Vanouver,49.320999,-123.097217,Sam's Farm Market,49.324287,-123.108153,Farmers Market
4169,Upper Capilano,Metro Vanouver,49.320999,-123.097217,Squamish Nation Smoke Shop,49.314495,-123.088299,Smoke Shop
4170,Upper Capilano,Metro Vanouver,49.320999,-123.097217,Fitness Town (North Vancouver),49.323558,-123.110407,Sporting Goods Shop


<a id="3."></a>
## 3. Exploration of Venue Information

<a id="3.1"></a>
### 3.1 Neighborhoods and Venues in Both Cities
Group rows by neighborhood and sum the frequency of occurrence of each category.

In [46]:
# Create new dataframe with neighborhood and venue types
# one hot encoding
neighborhood_venues = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="")

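# The one-hot columns include a venue category that is itself named 'Neighborhood';
# drop it so it does not clash with the neighborhood-name column added below.
# (The positional axis argument of drop() was removed in pandas 2.0, hence columns=.)
neighborhood_venues.drop(columns='Neighborhood', inplace=True)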

# add neighborhood and city columns back to dataframe
neighborhood_venues['Neighborhood'] = venues['Neighborhood']
neighborhood_venues['City'] = venues['City']

# move neighborhood column to the first column
fixed_columns = [neighborhood_venues.columns[-2]] + [neighborhood_venues.columns[-1]] + list(neighborhood_venues.columns[:-2])
        
neighborhood_venues = neighborhood_venues[fixed_columns]

neighborhood_venues.head()

Unnamed: 0,Neighborhood,City,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,...,Water Park,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Xinjiang Restaurant,Yoga Studio,Zoo Exhibit
0,Parkwoods,Toronto,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Parkwoods,Toronto,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Parkwoods,Toronto,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Parkwoods,Toronto,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Parkwoods,Toronto,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Sum of the frequency of occurrence of each category by Neighborhood

In [47]:
# numeric_only=True keeps the text 'City' column out of the sum
venue_group = neighborhood_venues.groupby('Neighborhood').sum(numeric_only=True).reset_index()
venue_group.head()

Unnamed: 0,Neighborhood,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Water Park,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Xinjiang Restaurant,Yoga Studio,Zoo Exhibit
0,Adelaide,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Agincourt,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Agincourt North,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
3,Albion Gardens,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Alderwood,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
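
The one-hot-then-sum pattern used above produces a table of counts per neighborhood and category. On a small example it is easy to check that `pd.crosstab` gives the same table in one step. This is a sketch on toy data, not a replacement for the cells above; the real data also contains a category named 'Neighborhood', which the toy data avoids.

```python
import pandas as pd

# Toy venue rows in the same layout as the venues dataframe
venues_demo = pd.DataFrame({
    'Neighborhood': ['Parkwoods', 'Parkwoods', 'Parkwoods', 'Malvern'],
    'Venue Category': ['Café', 'Coffee Shop', 'Café', 'Park'],
})

# Route 1: one-hot encode, then sum per neighborhood (as in the cells above)
one_hot = pd.get_dummies(venues_demo['Venue Category']).astype(int)
one_hot.insert(0, 'Neighborhood', venues_demo['Neighborhood'])
by_sum = one_hot.groupby('Neighborhood').sum()

# Route 2: count (neighborhood, category) pairs directly
by_crosstab = pd.crosstab(venues_demo['Neighborhood'], venues_demo['Venue Category'])

print(by_crosstab)
```

Both routes give one row per neighborhood and one column per category, with each cell holding the number of venues of that category.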


<a id="3.2"></a>
### 3.2 Venues in Toronto
Group rows by neighborhood and sum the frequency of occurrence of each category.

In [48]:
# Create new dataframe with neighborhood and venue types
# one hot encoding
toronto_neighborhood_venues = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

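# drop the venue category that is itself named 'Neighborhood' so it does not clash
# with the neighborhood-name column added below (positional axis was removed in pandas 2.0)
toronto_neighborhood_venues.drop(columns='Neighborhood', inplace=True)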
# add neighborhood column back to dataframe
toronto_neighborhood_venues['Neighborhood'] = toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_neighborhood_venues.columns[-1]] + list(toronto_neighborhood_venues.columns[:-1])
toronto_neighborhood_venues = toronto_neighborhood_venues[fixed_columns]

toronto_neighborhood_venues.head()

Unnamed: 0,Neighborhood,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Xinjiang Restaurant,Yoga Studio,Zoo Exhibit
0,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Sum of the frequency of occurrence of each category by Neighborhood

In [49]:
toronto_venue_group = toronto_neighborhood_venues.groupby('Neighborhood').sum().reset_index()
toronto_venue_group.head()

Unnamed: 0,Neighborhood,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Xinjiang Restaurant,Yoga Studio,Zoo Exhibit
0,Adelaide,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Agincourt,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
2,Agincourt North,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,1,0,0,0,0
3,Albion Gardens,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Alderwood,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


- The number of venue categories is 259.

Sum of the frequency of occurrence of each category

In [50]:
toronto_venue_category = pd.DataFrame(toronto_venue_group.sum(axis = 0), columns=['Occurrence'])
toronto_venue_category.reset_index(inplace=True)
toronto_venue_category.rename(columns={"index":"category"}, inplace=True)
toronto_venue_category.drop(toronto_venue_category[toronto_venue_category.category=='Neighborhood'].index, inplace=True)
toronto_venue_category.sort_values(by=['Occurrence'], ascending=False, inplace=True, ignore_index=True)
toronto_venue_category.reset_index(drop=True, inplace=True)
toronto_venue_category.head()

Unnamed: 0,category,Occurrence
0,Coffee Shop,781
1,Café,420
2,Park,317
3,Restaurant,306
4,Pizza Place,277
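
The overall category ranking can also be obtained straight from the raw venue rows with `value_counts`, without going through the one-hot table. The sketch below runs on a toy Series; in the notebook the input would be the 'Venue Category' column of the venue dataframe, and any category dropped from the one-hot table would still be counted here.

```python
import pandas as pd

# Toy stand-in for the 'Venue Category' column of a venue dataframe
categories = pd.Series(
    ['Coffee Shop', 'Café', 'Coffee Shop', 'Park', 'Coffee Shop', 'Café'],
    name='Venue Category')

category_counts = (
    categories.value_counts()          # sorted from most to least frequent
    .rename_axis('category')           # name the index so it becomes a column
    .reset_index(name='Occurrence')
)
print(category_counts)
```

The result has the same two columns, `category` and `Occurrence`, as the table above.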


In [51]:
toronto_venue_category.to_csv('toronto_venue_category.csv')

<a id="3.3"></a>
### 3.3 Venues in Metro Vancouver

In [52]:
# Create new dataframe with neighborhood and venue types
# one hot encoding
vancouver_neighborhood_venues = pd.get_dummies(vancouver_venues[['Venue Category']], prefix="", prefix_sep="")

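# drop the venue category that is itself named 'Neighborhood' so it does not clash
# with the neighborhood-name column added below (positional axis was removed in pandas 2.0)
vancouver_neighborhood_venues.drop(columns='Neighborhood', inplace=True)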
# add neighborhood column back to dataframe
vancouver_neighborhood_venues['Neighborhood'] = vancouver_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [vancouver_neighborhood_venues.columns[-1]] + list(vancouver_neighborhood_venues.columns[:-1])
vancouver_neighborhood_venues = vancouver_neighborhood_venues[fixed_columns]

vancouver_neighborhood_venues.head()

Unnamed: 0,Neighborhood,Accessories Store,Airport Service,Airport Terminal,American Restaurant,Amphitheater,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Water Park,Waterfront,Wine Shop,Women's Store,Yoga Studio
0,Arbutus Ridge,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Arbutus Ridge,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Arbutus Ridge,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Arbutus Ridge,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Arbutus Ridge,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Sum of the occurrences of each category per neighborhood

In [53]:
vancouver_venue_group = vancouver_neighborhood_venues.groupby('Neighborhood').sum().reset_index()
vancouver_venue_group.head()

Unnamed: 0,Neighborhood,Accessories Store,Airport Service,Airport Terminal,American Restaurant,Amphitheater,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Water Park,Waterfront,Wine Shop,Women's Store,Yoga Studio
0,Altamont,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Ambleside,0,0,0,1,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2,Arbutus Ridge,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,BURKE MOUNTAIN,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,BURQUITLAM,0,0,0,0,0,0,0,0,2,...,0,0,0,0,0,0,0,0,0,0


- The number of venue categories is 289.

Sum of the frequency of occurrence of each category

In [54]:
vancouver_venue_category = pd.DataFrame(vancouver_venue_group.sum(axis = 0), columns=['Occurrence'])
vancouver_venue_category.reset_index(inplace=True)
vancouver_venue_category.rename(columns={"index":"category"}, inplace=True)
vancouver_venue_category.drop(vancouver_venue_category[vancouver_venue_category.category=='Neighborhood'].index, inplace=True)
vancouver_venue_category.sort_values(by=['Occurrence'], ascending=False, inplace=True, ignore_index=True)
vancouver_venue_category.reset_index(drop=True, inplace=True)
vancouver_venue_category.head()

Unnamed: 0,category,Occurrence
0,Coffee Shop,282
1,Park,170
2,Sushi Restaurant,139
3,Grocery Store,115
4,Café,110


In [55]:
vancouver_venue_category.to_csv('vancouver_venue_category.csv')

### Findings
- The number of venues differs:
    - Toronto: about 10,000 venues
    - Metro Vancouver: about 4,000 venues from 7 cities
- The number of venue categories also differs:
    - Toronto: 359 categories
    - Vancouver: 279 categories
    - Toronto has a greater diversity of venue categories than Metro Vancouver.
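
The two kinds of counts in the findings can be read directly from a long-format venue table (one row per venue). The following is a minimal sketch on made-up rows; the helper `summarize_venues` is hypothetical and not part of the notebook, and only the column names follow it.

```python
import pandas as pd

# Toy stand-in for a long-format venue table (one row per venue).
# The rows are invented for illustration; only the column names follow the notebook.
toy_venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B", "B", "B"],
    "Venue Category": ["Coffee Shop", "Park", "Coffee Shop", "Café", "Park"],
})


def summarize_venues(venues: pd.DataFrame) -> dict:
    """Return the number of venues and the number of distinct venue categories."""
    return {
        "venues": len(venues),
        "categories": venues["Venue Category"].nunique(),
    }


summary = summarize_venues(toy_venues)
print(summary)
```

Calling the same helper on `vancouver_venues` and on its Toronto counterpart would give figures that can be checked against the category counts reported in sections 3.2 and 3.3.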

<a id="3.4"></a>
### 3.4 The Most Common Venues

In [56]:
# Create a function that sorts the venue categories by frequency of occurrence in descending order

def most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Create the new dataframe and display the top 10 venues in descending order for each neighborhood.

In [57]:
# create column names according to number of top venues
# 'Neighborhood', '1st', '2nd', '3rd', '4th', ... ,'10th'
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{}'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th'.format(ind+1))
columns

['Neighborhood',
 '1st',
 '2nd',
 '3rd',
 '4th',
 '5th',
 '6th',
 '7th',
 '8th',
 '9th',
 '10th']

In [58]:
# create a new dataframe for both cities
venues_sorted = pd.DataFrame(columns=columns)
venues_sorted['Neighborhood'] = venue_group['Neighborhood']

for ind in np.arange(venues_sorted.shape[0]):
    venues_sorted.iloc[ind, 1:] = most_common_venues(venue_group.iloc[ind, :], num_top_venues)

venues_sorted.head()

Unnamed: 0,Neighborhood,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,Adelaide,Coffee Shop,Café,Hotel,Seafood Restaurant,Gastropub,Restaurant,Japanese Restaurant,Concert Hall,Plaza,Art Gallery
1,Agincourt,Chinese Restaurant,Restaurant,Cantonese Restaurant,Asian Restaurant,Bank,Liquor Store,Park,Coffee Shop,BBQ Joint,Bakery
2,Agincourt North,Indian Restaurant,Chinese Restaurant,Coffee Shop,Bank,Japanese Restaurant,Park,Discount Store,Fast Food Restaurant,Restaurant,Juice Bar
3,Albion Gardens,Pizza Place,Grocery Store,Fried Chicken Joint,Park,Auto Garage,Sandwich Place,Beer Store,Discount Store,Fast Food Restaurant,Pharmacy
4,Alderwood,Convenience Store,Discount Store,Pizza Place,Pub,Coffee Shop,Skating Rink,Shopping Mall,Grocery Store,Gym,Donut Shop
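
To make the ranking step concrete, here is a small sketch that runs `most_common_venues` on a toy grouped table. The variant `top_nonzero_venues` is a hypothetical addition, not part of the notebook: it keeps only categories that actually occur, since a neighborhood with fewer distinct categories than `num_top_venues` would otherwise have its list padded with zero-count categories in no meaningful order.

```python
import pandas as pd


def most_common_venues(row, num_top_venues):
    # Skip the first entry (the neighborhood name) and rank the counts.
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


def top_nonzero_venues(row, num_top_venues):
    # Hypothetical variant: ignore categories that never occur in this neighborhood.
    counts = pd.to_numeric(row.iloc[1:])
    counts = counts[counts > 0].sort_values(ascending=False)
    return list(counts.index[:num_top_venues])


# Toy grouped table: one row per neighborhood, one column per category.
toy_group = pd.DataFrame({
    "Neighborhood": ["A", "B"],
    "Coffee Shop": [5, 1],
    "Park": [2, 4],
    "Café": [0, 3],
})

top_a = top_nonzero_venues(toy_group.iloc[0, :], 3)
top_b = list(most_common_venues(toy_group.iloc[1, :], 2))
print(top_a)
print(top_b)
```

For neighborhood `A` the variant returns two categories even though three were requested, which makes sparse neighborhoods easy to spot.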


<a id="3.4.1"></a>
#### 3.4.1 The Most Common Venues for each Neighborhood in Toronto

In [59]:
# create a new dataframe for Toronto
toronto_venues_sorted = pd.DataFrame(columns=columns)
toronto_venues_sorted['Neighborhood'] = toronto_venue_group['Neighborhood']

for ind in np.arange(toronto_venues_sorted.shape[0]):
    toronto_venues_sorted.iloc[ind, 1:] = most_common_venues(toronto_venue_group.iloc[ind, :], num_top_venues)

toronto_venues_sorted.head()

Unnamed: 0,Neighborhood,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,Adelaide,Coffee Shop,Café,Restaurant,Gastropub,Hotel,Seafood Restaurant,Japanese Restaurant,Concert Hall,Art Gallery,Thai Restaurant
1,Agincourt,Chinese Restaurant,Asian Restaurant,Restaurant,Cantonese Restaurant,Liquor Store,BBQ Joint,Karaoke Bar,Bakery,Bank,Korean Restaurant
2,Agincourt North,Bank,Coffee Shop,Chinese Restaurant,Indian Restaurant,Sporting Goods Shop,Movie Theater,Juice Bar,Beer Store,Frozen Yogurt Shop,Fried Chicken Joint
3,Albion Gardens,Pizza Place,Grocery Store,Sandwich Place,Fried Chicken Joint,Auto Garage,Caribbean Restaurant,Discount Store,Beer Store,Park,Fast Food Restaurant
4,Alderwood,Discount Store,Pizza Place,Convenience Store,Grocery Store,Intersection,Skating Rink,Park,Moroccan Restaurant,Shopping Mall,Donut Shop


<a id="3.4.2"></a>
#### 3.4.2 The Most Common Venues for each Neighborhood in Metro Vancouver

In [60]:
# create a new dataframe for Vancouver
vancouver_venues_sorted = pd.DataFrame(columns=columns)
vancouver_venues_sorted['Neighborhood'] = vancouver_venue_group['Neighborhood']

for ind in np.arange(vancouver_venues_sorted.shape[0]):
    vancouver_venues_sorted.iloc[ind, 1:] = most_common_venues(vancouver_venue_group.iloc[ind, :], num_top_venues)

vancouver_venues_sorted.head()

Unnamed: 0,Neighborhood,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,Altamont,Beach,Park,Optical Shop,Yoga Studio,Fish & Chips Shop,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant
1,Ambleside,Coffee Shop,Park,Italian Restaurant,Bank,Café,Sushi Restaurant,Gym,Building,Pharmacy,Gas Station
2,Arbutus Ridge,Bubble Tea Shop,Bakery,Burger Joint,Caribbean Restaurant,Seafood Restaurant,Sushi Restaurant,Sandwich Place,Event Space,Liquor Store,Basketball Court
3,BURKE MOUNTAIN,Park,Trail,Sandwich Place,Convenience Store,Mountain,Sushi Restaurant,Yoga Studio,Falafel Restaurant,Farm,Farmers Market
4,BURQUITLAM,Grocery Store,Park,Liquor Store,Fast Food Restaurant,Asian Restaurant,Korean Restaurant,Sandwich Place,Coffee Shop,Gas Station,Gym / Fitness Center


<a id="3.4.3"></a>
#### 3.4.3 The Most Common Categories of Venues
Display top 30 venue categories.

In [61]:
toronto_venue_category.rename(columns={"category":"Toronto_Category", "Occurrence":"Toronto_Occurrence"}, inplace=True)
vancouver_venue_category.rename(columns={"category":"Vancouver_Category", "Occurrence":"Vancouver_Occurrence"}, inplace=True)

In [62]:
# Merge the two sorted dataframes into a single dataframe
# Index is used for rank
venue_category_rank = toronto_venue_category.merge(vancouver_venue_category, left_index=True, right_index=True)

In [63]:
venue_category_rank.head(30)

Unnamed: 0,Toronto_Category,Toronto_Occurrence,Vancouver_Category,Vancouver_Occurrence
0,Coffee Shop,781,Coffee Shop,282
1,Café,420,Park,170
2,Park,317,Sushi Restaurant,139
3,Restaurant,306,Grocery Store,115
4,Pizza Place,277,Café,110
5,Italian Restaurant,254,Chinese Restaurant,107
6,Bakery,229,Sandwich Place,103
7,Sandwich Place,221,Japanese Restaurant,103
8,Grocery Store,207,Fast Food Restaurant,92
9,Sushi Restaurant,183,Bank,86
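
Because the merge above aligns the two tables on their index, the index acts as the rank, and the default inner join keeps only ranks that exist in both tables. The sketch below illustrates this with the first few rows of each ranking; the truncation to three and two rows is an assumption made to show the effect.

```python
import pandas as pd

# First rows of each ranking, truncated to different lengths on purpose.
toronto_rank = pd.DataFrame({
    "Toronto_Category": ["Coffee Shop", "Café", "Park"],
    "Toronto_Occurrence": [781, 420, 317],
})
vancouver_rank = pd.DataFrame({
    "Vancouver_Category": ["Coffee Shop", "Park"],
    "Vancouver_Occurrence": [282, 170],
})

# Default (inner) join: only ranks 0 and 1 survive.
inner = toronto_rank.merge(vancouver_rank, left_index=True, right_index=True)

# Outer join: every rank is kept, and missing cells become NaN.
outer = toronto_rank.merge(
    vancouver_rank, left_index=True, right_index=True, how="outer"
)

print(inner)
print(outer)
```

With the inner join, the merged ranking is as long as the shorter of the two category lists, which is sufficient for a top-30 comparison.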


In [64]:
venue_category_rank.to_csv('venue_category_rank.csv')

<a id="4."></a>
## 4. Neighborhoods Clustering

### Run k-means to cluster the neighborhoods into 5 clusters.

In [65]:
# set number of clusters
kclusters = 5

venue_clustering = venue_group.drop(columns='Neighborhood')
toronto_clustering = toronto_venue_group.drop(columns='Neighborhood')
vancouver_clustering = vancouver_venue_group.drop(columns='Neighborhood')

# run k-means clustering
venue_kmeans_cluster = KMeans(n_clusters=kclusters, random_state=0).fit(venue_clustering)
toronto_kmeans_cluster = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_clustering)
vancouver_kmeans_cluster = KMeans(n_clusters=kclusters, random_state=0).fit(vancouver_clustering)

In [66]:
# check cluster labels generated for each row in the dataframe
venue_kmeans_cluster.labels_[0:10]

array([1, 0, 3, 3, 3, 3, 0, 3, 3, 3], dtype=int32)

In [67]:
toronto_kmeans_cluster.labels_[0:10]

array([4, 2, 2, 2, 2, 0, 3, 2, 1, 4], dtype=int32)

In [68]:
vancouver_kmeans_cluster.labels_[0:10]

array([0, 3, 0, 0, 0, 3, 0, 0, 3, 2], dtype=int32)
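
The label arrays above have one entry per row of the table that was fitted, in the same row order, which is what allows them to be attached to the neighborhood names. The sketch below shows this on a toy table with two clearly separated groups; the data and the choice of two clusters are assumptions for illustration, and `n_init=10` is set explicitly only to keep the behavior the same across scikit-learn versions.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy grouped table: A and B are coffee-heavy, C and D are park-heavy.
toy_group = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D"],
    "Coffee Shop": [10, 9, 0, 1],
    "Park": [0, 1, 8, 9],
})

# Fit on the numeric columns only, as in the cell above.
features = toy_group.drop(columns="Neighborhood")
model = KMeans(n_clusters=2, random_state=0, n_init=10).fit(features)

# labels_[i] belongs to row i of `features`, so it can be attached by position.
labeled = toy_group[["Neighborhood"]].copy()
labeled.insert(0, "Cluster Labels", model.labels_)
print(labeled)
```

The label numbers themselves are arbitrary identifiers. Separate runs number their clusters independently, so label 0 of the combined run need not match label 0 of the Toronto-only run; in the tables of section 5, for example, Christie appears under label 4 in the combined run and under label 0 in the Toronto run.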

Create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [69]:
# add clustering labels
venues_sorted.insert(0, 'Cluster Labels', venue_kmeans_cluster.labels_)
toronto_venues_sorted.insert(0, 'Cluster Labels', toronto_kmeans_cluster.labels_)
vancouver_venues_sorted.insert(0, 'Cluster Labels', vancouver_kmeans_cluster.labels_)

neighborhood_venues = toronto_vancouver
toronto_neighborhood_venues = toronto
vancouver_neighborhood_venues = vancouver

neighborhood_venues = neighborhood_venues.join(venues_sorted.set_index('Neighborhood'), on='Neighborhood')
toronto_neighborhood_venues = toronto_neighborhood_venues.join(toronto_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
vancouver_neighborhood_venues = vancouver_neighborhood_venues.join(vancouver_venues_sorted.set_index('Neighborhood'), on='Neighborhood')


In [70]:
neighborhood_venues.head()

Unnamed: 0,Neighborhood,City,Latitude,Longitude,Cluster Labels,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,Parkwoods,Toronto,43.7588,-79.320197,3,Coffee Shop,Supermarket,Gas Station,Fast Food Restaurant,Pharmacy,Pizza Place,Café,Paper / Office Supplies Store,Sushi Restaurant,Bowling Alley
1,Victoria Village,Toronto,43.732658,-79.311189,3,Middle Eastern Restaurant,Portuguese Restaurant,Thai Restaurant,Thrift / Vintage Store,Chinese Restaurant,Park,Asian Restaurant,Mediterranean Restaurant,Pet Store,Pizza Place
2,Regent Park,Toronto,43.660706,-79.360457,2,Coffee Shop,Restaurant,Café,Bakery,Park,Thai Restaurant,Fast Food Restaurant,Breakfast Spot,Diner,Pub
3,Harbourfront,Toronto,43.64008,-79.38015,1,Coffee Shop,Hotel,Café,Park,Brewery,Gym,Japanese Restaurant,Scenic Lookout,Art Gallery,Baseball Stadium
4,Lawrence Manor,Toronto,43.722079,-79.437507,0,Coffee Shop,Bagel Shop,Pizza Place,Middle Eastern Restaurant,Fast Food Restaurant,Department Store,Metro Station,Pet Store,Sandwich Place,Liquor Store


In [71]:
neighborhood_venues.tail()

Unnamed: 0,Neighborhood,City,Latitude,Longitude,Cluster Labels,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
117,Central Lonsdale,Metro Vancouver,49.321923,-123.07187,2,Coffee Shop,Sushi Restaurant,Café,Park,Grocery Store,Middle Eastern Restaurant,Italian Restaurant,Sandwich Place,Gym / Fitness Center,Fast Food Restaurant
118,Grand Boulevard,Metro Vancouver,49.321521,-123.059474,0,Coffee Shop,Park,Middle Eastern Restaurant,Sushi Restaurant,Café,Sandwich Place,Japanese Restaurant,Grocery Store,Italian Restaurant,Falafel Restaurant
119,Cedar Village,Metro Vancouver,49.326556,-123.046084,3,Park,Grocery Store,Gym / Fitness Center,Pub,Baseball Field,Soccer Field,Pharmacy,Coffee Shop,Farm,Elementary School
120,Moodyville,Metro Vancouver,49.310666,-123.056126,3,Breakfast Spot,Butcher,Moving Target,Ice Cream Shop,Japanese Restaurant,Park,Sandwich Place,Salon / Barbershop,Café,Escape Room
121,Upper Capilano,Metro Vancouver,49.320999,-123.097217,0,Grocery Store,Coffee Shop,Gym,Fast Food Restaurant,Brewery,Pizza Place,Sushi Restaurant,Liquor Store,Bank,Restaurant


In [72]:
toronto_neighborhood_venues.head()

Unnamed: 0,Neighborhood,City,Latitude,Longitude,Cluster Labels,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,Parkwoods,Toronto,43.7588,-79.320197,2,Coffee Shop,Pharmacy,Pizza Place,Gas Station,Fast Food Restaurant,Supermarket,Café,Bank,Laundry Service,Sandwich Place
1,Victoria Village,Toronto,43.732658,-79.311189,2,Middle Eastern Restaurant,Thai Restaurant,Indian Restaurant,Mediterranean Restaurant,Pizza Place,French Restaurant,Thrift / Vintage Store,Asian Restaurant,Chinese Restaurant,Intersection
2,Regent Park,Toronto,43.660706,-79.360457,3,Coffee Shop,Restaurant,Café,Bakery,Park,Thai Restaurant,Fast Food Restaurant,Pub,Diner,Breakfast Spot
3,Harbourfront,Toronto,43.64008,-79.38015,4,Coffee Shop,Hotel,Café,Park,Japanese Restaurant,Gym,Scenic Lookout,Brewery,Sushi Restaurant,Seafood Restaurant
4,Lawrence Manor,Toronto,43.722079,-79.437507,2,Coffee Shop,Bagel Shop,Pizza Place,Pet Store,Middle Eastern Restaurant,Fast Food Restaurant,Department Store,Metro Station,Playground,Diner


In [73]:
vancouver_neighborhood_venues.head()

Unnamed: 0,Neighborhood,City,Latitude,Longitude,Cluster Labels,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,Arbutus Ridge,Metro Vancouver,49.246305,-123.159636,0,Bubble Tea Shop,Bakery,Burger Joint,Caribbean Restaurant,Seafood Restaurant,Sushi Restaurant,Sandwich Place,Event Space,Liquor Store,Basketball Court
1,Downtown,Metro Vancouver,49.285998,-123.127358,4,Hotel,Dessert Shop,Food Truck,Clothing Store,Japanese Restaurant,Ramen Restaurant,Park,American Restaurant,Cosmetics Shop,Café
2,Dunbar,Metro Vancouver,49.247581,-123.18495,3,Sushi Restaurant,Coffee Shop,Liquor Store,Bank,Indie Movie Theater,Bakery,Pharmacy,Cosmetics Shop,Park,Pub
3,Fairview,Metro Vancouver,49.261956,-123.130408,1,Coffee Shop,Furniture / Home Store,Japanese Restaurant,Park,Restaurant,Breakfast Spot,Café,American Restaurant,Bakery,Camera Store
4,Grandview,Metro Vancouver,49.262593,-123.068806,1,Coffee Shop,Pizza Place,Sushi Restaurant,Sandwich Place,Chinese Restaurant,Park,Breakfast Spot,Vietnamese Restaurant,Ice Cream Shop,Bakery


<a id="5."></a>
## 5. Cluster Examinations
Examine each cluster and determine the discriminating venue categories that distinguish each cluster.
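
One possible way to look for discriminating categories, shown here as a sketch rather than the method used in this notebook, is to compare each cluster's average category counts with the overall average. The table, its values, and the name `lift` are assumptions for illustration.

```python
import pandas as pd

# Toy table: category counts per neighborhood plus the assigned cluster label.
toy = pd.DataFrame({
    "Cluster Labels": [0, 0, 1, 1],
    "Coffee Shop": [10, 8, 1, 0],
    "Park": [1, 0, 7, 9],
    "Bar": [2, 3, 2, 3],
})

# Average count of each category within each cluster.
cluster_means = toy.groupby("Cluster Labels").mean()

# Average count of each category over all neighborhoods.
overall_mean = toy.drop(columns="Cluster Labels").mean()

# Positive values mark categories that are over-represented in a cluster.
lift = cluster_means - overall_mean

# The most over-represented category for each cluster.
top_category = lift.idxmax(axis=1)
print(lift)
print(top_category)
```

In this toy table, "Bar" is equally common in both clusters, so its difference is zero and it does not help tell the clusters apart.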

<a id="5.1"></a>
### 5.1 Clusters of Both Cities

In [74]:
cols = neighborhood_venues.shape[1]
cols

15

In [75]:
neighborhood_venues.tail()

Unnamed: 0,Neighborhood,City,Latitude,Longitude,Cluster Labels,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
117,Central Lonsdale,Metro Vancouver,49.321923,-123.07187,2,Coffee Shop,Sushi Restaurant,Café,Park,Grocery Store,Middle Eastern Restaurant,Italian Restaurant,Sandwich Place,Gym / Fitness Center,Fast Food Restaurant
118,Grand Boulevard,Metro Vancouver,49.321521,-123.059474,0,Coffee Shop,Park,Middle Eastern Restaurant,Sushi Restaurant,Café,Sandwich Place,Japanese Restaurant,Grocery Store,Italian Restaurant,Falafel Restaurant
119,Cedar Village,Metro Vancouver,49.326556,-123.046084,3,Park,Grocery Store,Gym / Fitness Center,Pub,Baseball Field,Soccer Field,Pharmacy,Coffee Shop,Farm,Elementary School
120,Moodyville,Metro Vancouver,49.310666,-123.056126,3,Breakfast Spot,Butcher,Moving Target,Ice Cream Shop,Japanese Restaurant,Park,Sandwich Place,Salon / Barbershop,Café,Escape Room
121,Upper Capilano,Metro Vancouver,49.320999,-123.097217,0,Grocery Store,Coffee Shop,Gym,Fast Food Restaurant,Brewery,Pizza Place,Sushi Restaurant,Liquor Store,Bank,Restaurant


In [76]:
# create map for Toronto
address = 'Toronto, ON'
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

clusters_map = folium.Map(location=[latitude, longitude], zoom_start=11)

color_list = ['gray', 'orange', 'lime', 'magenta', 'cyan']

# add markers to the map
markers_colors = []

cluster_toronto = neighborhood_venues[neighborhood_venues['City']=='Toronto']
for lat, lon, poi, cluster in zip(cluster_toronto['Latitude'],
                                  cluster_toronto['Longitude'],
                                  cluster_toronto['Neighborhood'], 
                                  cluster_toronto['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color = color_list[cluster],
        fill=True,
        fill_color = 'black',
        fill_opacity=0.7).add_to(clusters_map)

clusters_map

In [77]:
# create map for Metro Vancouver
address = 'Vancouver, BC'
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

clusters_map = folium.Map(location=[latitude, longitude], zoom_start=11)

color_list = ['gray', 'orange', 'lime', 'magenta', 'cyan']

# add markers to the map
markers_colors = []

cluster_vancouver = neighborhood_venues[neighborhood_venues['City']!='Toronto']
for lat, lon, poi, cluster in zip(cluster_vancouver['Latitude'],
                                  cluster_vancouver['Longitude'],
                                  cluster_vancouver['Neighborhood'], 
                                  cluster_vancouver['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color = color_list[cluster],
        fill=True,
        fill_color = 'black',
        fill_opacity=0.7).add_to(clusters_map)

clusters_map

#### 1) Cluster 1 (Gray color)

In [78]:
neighborhood_venues.loc[neighborhood_venues['Cluster Labels'] == 0,
                                neighborhood_venues.columns[[0] + [1] + list(range(5, cols))]]

Unnamed: 0,Neighborhood,City,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
4,Lawrence Manor,Toronto,Coffee Shop,Bagel Shop,Pizza Place,Middle Eastern Restaurant,Fast Food Restaurant,Department Store,Metro Station,Pet Store,Sandwich Place,Liquor Store
5,Lawrence Heights,Toronto,Clothing Store,Restaurant,Coffee Shop,Fast Food Restaurant,Furniture / Home Store,Dessert Shop,Food Court,Toy / Game Store,Grocery Store,Greek Restaurant
7,Islington Avenue,Toronto,Coffee Shop,Bakery,Sandwich Place,Grocery Store,Pizza Place,Restaurant,Fast Food Restaurant,Pub,Gym / Fitness Center,Thrift / Vintage Store
11,Don Mills,Toronto,Clothing Store,Coffee Shop,Fast Food Restaurant,Japanese Restaurant,Intersection,Baseball Field,Bakery,Sporting Goods Shop,Juice Bar,Restaurant
20,Islington,Toronto,Coffee Shop,Pub,Pizza Place,Fast Food Restaurant,Grocery Store,Tapas Restaurant,Sandwich Place,Sushi Restaurant,Italian Restaurant,Thai Restaurant
...,...,...,...,...,...,...,...,...,...,...,...,...
112,Westview,Metro Vancouver,Coffee Shop,Grocery Store,Fast Food Restaurant,Gym / Fitness Center,Sandwich Place,Chinese Restaurant,Sushi Restaurant,Park,Korean Restaurant,Persian Restaurant
114,Mahon,Metro Vancouver,Coffee Shop,Fast Food Restaurant,Sushi Restaurant,Grocery Store,Middle Eastern Restaurant,Mediterranean Restaurant,Pizza Place,Greek Restaurant,Bagel Shop,Chinese Restaurant
115,Marine-Hamilton,Metro Vancouver,Gym,Grocery Store,Restaurant,Pizza Place,Coffee Shop,Bank,Liquor Store,Sushi Restaurant,Thrift / Vintage Store,Brewery
118,Grand Boulevard,Metro Vancouver,Coffee Shop,Park,Middle Eastern Restaurant,Sushi Restaurant,Café,Sandwich Place,Japanese Restaurant,Grocery Store,Italian Restaurant,Falafel Restaurant


#### 2) Cluster 2 (Orange color)

In [79]:
neighborhood_venues.loc[neighborhood_venues['Cluster Labels'] == 1,
                                neighborhood_venues.columns[[0] + [1] + list(range(5, cols))]]

Unnamed: 0,Neighborhood,City,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
3,Harbourfront,Toronto,Coffee Shop,Hotel,Café,Park,Brewery,Gym,Japanese Restaurant,Scenic Lookout,Art Gallery,Baseball Stadium
14,Garden District,Toronto,Coffee Shop,Café,Gastropub,Japanese Restaurant,Italian Restaurant,Farmers Market,Vegetarian / Vegan Restaurant,Restaurant,Ramen Restaurant,Clothing Store
15,Ryerson,Toronto,Coffee Shop,Gastropub,Japanese Restaurant,Italian Restaurant,Theater,Park,Diner,Gay Bar,Ramen Restaurant,Clothing Store
18,Princess Gardens,Toronto,Coffee Shop,Hotel,Café,Park,Italian Restaurant,Gym,Scenic Lookout,Restaurant,Beer Bar,Yoga Studio
19,Martin Grove,Toronto,Coffee Shop,Café,Gastropub,Theater,Restaurant,Seafood Restaurant,Pizza Place,Sushi Restaurant,Furniture / Home Store,Plaza
36,Berczy Park,Toronto,Coffee Shop,Café,Seafood Restaurant,Hotel,Gastropub,Japanese Restaurant,Restaurant,Bakery,Art Gallery,Sporting Goods Shop
48,Adelaide,Toronto,Coffee Shop,Café,Hotel,Seafood Restaurant,Gastropub,Restaurant,Japanese Restaurant,Concert Hall,Plaza,Art Gallery
49,King,Toronto,Coffee Shop,Café,Restaurant,Seafood Restaurant,Gastropub,Hotel,Bakery,Japanese Restaurant,Concert Hall,Gym
60,Harbourfront East,Toronto,Coffee Shop,Hotel,Café,Park,Brewery,Gym,Japanese Restaurant,Scenic Lookout,Art Gallery,Baseball Stadium
61,Union Station,Toronto,Hotel,Café,Coffee Shop,Japanese Restaurant,Park,Concert Hall,Seafood Restaurant,Restaurant,Vegetarian / Vegan Restaurant,Farmers Market


#### 3) Cluster 3 (Lime color)

In [80]:
neighborhood_venues.loc[neighborhood_venues['Cluster Labels'] == 2,
                                neighborhood_venues.columns[[0] + [1] + list(range(5, cols))]]

Unnamed: 0,Neighborhood,City,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
2,Regent Park,Toronto,Coffee Shop,Restaurant,Café,Bakery,Park,Thai Restaurant,Fast Food Restaurant,Breakfast Spot,Diner,Pub
6,Queen's Park,Toronto,Coffee Shop,Café,Park,Sushi Restaurant,Japanese Restaurant,Ramen Restaurant,Mexican Restaurant,Bubble Tea Shop,Theater,Italian Restaurant
26,St. James Town,Toronto,Coffee Shop,Restaurant,Japanese Restaurant,Gay Bar,Café,Diner,Juice Bar,Park,Gastropub,Thai Restaurant
27,Humewood-Cedarvale,Toronto,Pizza Place,Coffee Shop,Grocery Store,Restaurant,Sushi Restaurant,Indian Restaurant,Italian Restaurant,Café,Bakery,Bank
35,The Beaches,Toronto,Beach,Pub,Coffee Shop,Bakery,Breakfast Spot,Japanese Restaurant,Pizza Place,Health Food Store,Bar,Grocery Store
39,Central Bay Street,Toronto,Coffee Shop,Park,Italian Restaurant,Gourmet Shop,Hotel,Men's Store,French Restaurant,Bookstore,Ice Cream Shop,Grocery Store
42,Hillcrest Village,Toronto,Restaurant,Pizza Place,Italian Restaurant,Bakery,Ice Cream Shop,Coffee Shop,Indian Restaurant,Mexican Restaurant,Latin American Restaurant,BBQ Joint
59,Broadview North (Old East York),Toronto,Greek Restaurant,Coffee Shop,Café,Bakery,Pub,Ice Cream Shop,Pizza Place,Italian Restaurant,Restaurant,Intersection
83,The Beaches West,Toronto,Beach,Pub,Coffee Shop,Bakery,Breakfast Spot,Japanese Restaurant,Pizza Place,Health Food Store,Bar,Grocery Store
85,Victoria Hotel,Toronto,Coffee Shop,Restaurant,Japanese Restaurant,Gay Bar,Café,Gastropub,Thai Restaurant,Diner,Italian Restaurant,Men's Store


#### 4) Cluster 4 (Magenta color)

In [81]:
neighborhood_venues.loc[neighborhood_venues['Cluster Labels'] == 3,
                                neighborhood_venues.columns[[0] + [1] + list(range(5, cols))]]

Unnamed: 0,Neighborhood,City,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,Parkwoods,Toronto,Coffee Shop,Supermarket,Gas Station,Fast Food Restaurant,Pharmacy,Pizza Place,Café,Paper / Office Supplies Store,Sushi Restaurant,Bowling Alley
1,Victoria Village,Toronto,Middle Eastern Restaurant,Portuguese Restaurant,Thai Restaurant,Thrift / Vintage Store,Chinese Restaurant,Park,Asian Restaurant,Mediterranean Restaurant,Pet Store,Pizza Place
8,Humber Valley Village,Toronto,Park,Pharmacy,Grocery Store,Bank,Shopping Mall,Supermarket,Liquor Store,Skating Rink,Camera Store,Café
9,Malvern,Toronto,Fast Food Restaurant,Park,Grocery Store,Pharmacy,Pizza Place,Gym / Fitness Center,Sandwich Place,Salon / Barbershop,Bubble Tea Shop,Supermarket
10,Rouge,Toronto,Park,Trail,Bus Station,Gas Station,Campground,Fast Food Restaurant,Shopping Mall,Intersection,Flower Shop,Falafel Restaurant
...,...,...,...,...,...,...,...,...,...,...,...,...
109,Capilano,Metro Vancouver,Coffee Shop,Bank,Bakery,Gift Shop,Sandwich Place,Park,Candy Store,Liquor Store,Bridge,Scenic Lookout
110,Carisbrooke,Metro Vancouver,Coffee Shop,Convenience Store,Bar,Grocery Store,Park,Chinese Restaurant,Donut Shop,Trail,Sandwich Place,Café
113,Tempe,Metro Vancouver,Park,Coffee Shop,Fast Food Restaurant,Grocery Store,Bar,Theater,Shipping Store,Gym / Fitness Center,Donut Shop,Sandwich Place
119,Cedar Village,Metro Vancouver,Park,Grocery Store,Gym / Fitness Center,Pub,Baseball Field,Soccer Field,Pharmacy,Coffee Shop,Farm,Elementary School


#### 5) Cluster 5 (Cyan color)

In [82]:
neighborhood_venues.loc[neighborhood_venues['Cluster Labels'] == 4,
                                neighborhood_venues.columns[[0] + [1] + list(range(5, cols))]]

Unnamed: 0,Neighborhood,City,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
40,Christie,Toronto,Korean Restaurant,Café,Grocery Store,Coffee Shop,Ice Cream Shop,Pizza Place,Cocktail Bar,Italian Restaurant,Dessert Shop,Eastern European Restaurant
43,Bathurst Manor,Toronto,Korean Restaurant,Coffee Shop,Grocery Store,Café,Bakery,Pizza Place,Bookstore,Dessert Shop,Beer Bar,Japanese Restaurant
50,Dufferin,Toronto,Café,Bar,Coffee Shop,Bakery,Caribbean Restaurant,Park,Vietnamese Restaurant,Cocktail Bar,Restaurant,Italian Restaurant
51,Dovercourt Village,Toronto,Bar,Bakery,Café,Coffee Shop,Cocktail Bar,Grocery Store,Park,Pizza Place,Mexican Restaurant,Italian Restaurant
63,Little Portugal,Toronto,Bar,Café,Restaurant,Coffee Shop,Vegetarian / Vegan Restaurant,French Restaurant,Park,Cocktail Bar,Italian Restaurant,Japanese Restaurant
71,Riverdale,Toronto,Coffee Shop,Vietnamese Restaurant,Café,Park,Bakery,Bar,Restaurant,Fast Food Restaurant,Brewery,Pizza Place
74,Brockton,Toronto,Coffee Shop,Bar,Café,Bakery,Restaurant,Sandwich Place,Cocktail Bar,Gift Shop,Park,Thai Restaurant
75,Parkdale Village,Toronto,Coffee Shop,Restaurant,Bakery,Café,Bar,Tibetan Restaurant,Italian Restaurant,Diner,Park,Pharmacy
122,North Toronto West,Toronto,Café,Restaurant,Bar,Vegetarian / Vegan Restaurant,Italian Restaurant,Bakery,Furniture / Home Store,Yoga Studio,Cocktail Bar,Seafood Restaurant
123,The Annex,Toronto,Café,Italian Restaurant,Restaurant,Korean Restaurant,Bakery,Grocery Store,Vegetarian / Vegan Restaurant,Coffee Shop,Eastern European Restaurant,Gym


In [83]:
neighborhood_venues.groupby(['City', 'Cluster Labels']).count()['Neighborhood']

City             Cluster Labels
Metro Vancouver  0                 42
                 1                  3
                 2                  9
                 3                 68
Toronto          0                 31
                 1                 20
                 2                 43
                 3                 85
                 4                 16
Name: Neighborhood, dtype: int64
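
The same counts can also be laid out as a city-by-cluster table, which makes an empty combination visible as an explicit zero (in the output above, Metro Vancouver has no row for cluster 4). This is a minimal sketch on made-up rows using `pd.crosstab`; only the column names follow the notebook.

```python
import pandas as pd

# Toy version of the joined table: one row per neighborhood.
toy = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D", "E"],
    "City": ["Toronto", "Toronto", "Toronto", "Metro Vancouver", "Metro Vancouver"],
    "Cluster Labels": [0, 0, 4, 0, 3],
})

# Rows are cities, columns are cluster labels, cells are neighborhood counts.
counts = pd.crosstab(toy["City"], toy["Cluster Labels"])

# Share of each city's neighborhoods that falls into each cluster.
shares = pd.crosstab(toy["City"], toy["Cluster Labels"], normalize="index")

print(counts)
print(shares)
```

Row-normalized shares are easier to compare than raw counts when the two cities contribute different numbers of neighborhoods.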

<a id="5.2"></a>
### 5.2 Clusters of Toronto

In [84]:
# create map
address = 'Toronto, ON'
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

toronto_clusters_map = folium.Map(location=[latitude, longitude], zoom_start=11)

color_list = ['gray', 'orange', 'lime', 'magenta', 'cyan']

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_neighborhood_venues['Latitude'],
                                  toronto_neighborhood_venues['Longitude'],
                                  toronto_neighborhood_venues['Neighborhood'], 
                                  toronto_neighborhood_venues['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color = color_list[cluster],
        fill=True,
        fill_color = 'black',
        fill_opacity=0.7).add_to(toronto_clusters_map)

toronto_clusters_map

#### 1) Cluster 1 (Gray color)

In [85]:
cols = toronto_neighborhood_venues.shape[1]
cols

15

In [86]:
toronto_neighborhood_venues.loc[toronto_neighborhood_venues['Cluster Labels'] == 0,
                                toronto_neighborhood_venues.columns[[0] +
                                                                    list(range(5, cols))]]

Unnamed: 0,Neighborhood,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
40,Christie,Korean Restaurant,Café,Grocery Store,Coffee Shop,Italian Restaurant,Ice Cream Shop,Cocktail Bar,Pizza Place,Restaurant,Comedy Club
43,Bathurst Manor,Korean Restaurant,Grocery Store,Coffee Shop,Café,Bakery,Bookstore,Dessert Shop,Pizza Place,Japanese Restaurant,Mexican Restaurant
50,Dufferin,Café,Coffee Shop,Bar,Bakery,Park,Restaurant,Caribbean Restaurant,Vietnamese Restaurant,Cocktail Bar,African Restaurant
51,Dovercourt Village,Bar,Bakery,Café,Coffee Shop,Cocktail Bar,Park,Grocery Store,Pizza Place,Pub,Italian Restaurant
63,Little Portugal,Bar,Café,Restaurant,Coffee Shop,Vegetarian / Vegan Restaurant,Italian Restaurant,Cocktail Bar,Japanese Restaurant,French Restaurant,Gift Shop
71,Riverdale,Café,Coffee Shop,Vietnamese Restaurant,Bar,Bakery,Park,Fast Food Restaurant,Restaurant,Brewery,Pool
74,Brockton,Coffee Shop,Bar,Café,Bakery,Restaurant,Sandwich Place,Park,Cocktail Bar,Gift Shop,Bookstore
75,Parkdale Village,Coffee Shop,Restaurant,Café,Bar,Bakery,Italian Restaurant,Tibetan Restaurant,Pizza Place,Indian Restaurant,Diner
122,North Toronto West,Restaurant,Café,Bar,Vegetarian / Vegan Restaurant,Bakery,Italian Restaurant,Cocktail Bar,Yoga Studio,Furniture / Home Store,Theater
123,The Annex,Café,Italian Restaurant,Korean Restaurant,Restaurant,Bakery,Coffee Shop,Vegetarian / Vegan Restaurant,Grocery Store,Beer Bar,Bar


#### 2) Cluster 2 (Orange color)

In [87]:
toronto_neighborhood_venues.loc[toronto_neighborhood_venues['Cluster Labels'] == 1,
                                toronto_neighborhood_venues.columns[[0] +
                                                                    list(range(5, cols))]]

Unnamed: 0,Neighborhood,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
5,Lawrence Heights,Clothing Store,Restaurant,Coffee Shop,Dessert Shop,Furniture / Home Store,Fast Food Restaurant,Food Court,Playground,Cosmetics Shop,Department Store
7,Islington Avenue,Coffee Shop,Bakery,Fast Food Restaurant,Pub,Restaurant,Sandwich Place,Pizza Place,Grocery Store,Gym / Fitness Center,Cheese Shop
11,Don Mills,Clothing Store,Coffee Shop,Fast Food Restaurant,Japanese Restaurant,Baseball Field,Intersection,Restaurant,Juice Bar,Sporting Goods Shop,Bank
20,Islington,Coffee Shop,Pub,Fast Food Restaurant,Sushi Restaurant,Sandwich Place,Pizza Place,Tapas Restaurant,Grocery Store,Italian Restaurant,Gastropub
27,Humewood-Cedarvale,Pizza Place,Coffee Shop,Restaurant,Indian Restaurant,Sushi Restaurant,Grocery Store,Italian Restaurant,American Restaurant,BBQ Joint,Café
35,The Beaches,Pub,Beach,Coffee Shop,Pizza Place,Bakery,Japanese Restaurant,Breakfast Spot,Park,Nail Salon,French Restaurant
42,Hillcrest Village,Restaurant,Indian Restaurant,Bakery,Mexican Restaurant,Ice Cream Shop,Italian Restaurant,Coffee Shop,Pizza Place,Liquor Store,Bank
53,Fairview,Coffee Shop,Clothing Store,Japanese Restaurant,Restaurant,Bank,Juice Bar,Pharmacy,Liquor Store,Electronics Store,Chinese Restaurant
59,Broadview North (Old East York),Greek Restaurant,Coffee Shop,Café,Bakery,Pub,Pizza Place,Ice Cream Shop,Restaurant,Yoga Studio,Fast Food Restaurant
77,Golden Mile,Fast Food Restaurant,Clothing Store,Sporting Goods Shop,Sandwich Place,Coffee Shop,Japanese Restaurant,Furniture / Home Store,Chinese Restaurant,Health Food Store,Supermarket


#### 3) Cluster 3 (Lime color)

In [88]:
toronto_neighborhood_venues.loc[toronto_neighborhood_venues['Cluster Labels'] == 2,
                                toronto_neighborhood_venues.columns[[0] +
                                                                    list(range(5, cols))]]

Unnamed: 0,Neighborhood,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,Parkwoods,Coffee Shop,Pharmacy,Pizza Place,Gas Station,Fast Food Restaurant,Supermarket,Café,Bank,Laundry Service,Sandwich Place
1,Victoria Village,Middle Eastern Restaurant,Thai Restaurant,Indian Restaurant,Mediterranean Restaurant,Pizza Place,French Restaurant,Thrift / Vintage Store,Asian Restaurant,Chinese Restaurant,Intersection
4,Lawrence Manor,Coffee Shop,Bagel Shop,Pizza Place,Pet Store,Middle Eastern Restaurant,Fast Food Restaurant,Department Store,Metro Station,Playground,Diner
8,Humber Valley Village,Park,Pharmacy,Shopping Mall,Grocery Store,Bank,Convenience Store,Bakery,Café,Spa,Ice Cream Shop
9,Malvern,Fast Food Restaurant,Park,Pizza Place,Pharmacy,Grocery Store,Gym / Fitness Center,Sandwich Place,Salon / Barbershop,Convenience Store,Skating Rink
...,...,...,...,...,...,...,...,...,...,...,...
183,Old Mill South,Park,Sushi Restaurant,River,Bar,Italian Restaurant,Coffee Shop,Pub,Yoga Studio,Gym,Café
185,Sunnylea,Breakfast Spot,Bank,Sushi Restaurant,Park,Bar,Bakery,Liquor Store,Greek Restaurant,Restaurant,Dessert Shop
186,Humber Bay,Park,Shopping Mall,River,Zoo Exhibit,Escape Room,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store
187,Mimico NE,Park,Breakfast Spot,Convenience Store,Bank,Indian Restaurant,Grocery Store,Storage Facility,Skating Rink,Beer Store,Italian Restaurant


#### 4) Cluster 4 (Magenta color)

In [89]:
toronto_neighborhood_venues.loc[toronto_neighborhood_venues['Cluster Labels'] == 3,
                                toronto_neighborhood_venues.columns[[0] +
                                                                    list(range(5, cols))]]

Unnamed: 0,Neighborhood,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
2,Regent Park,Coffee Shop,Restaurant,Café,Bakery,Park,Thai Restaurant,Fast Food Restaurant,Pub,Diner,Breakfast Spot
6,Queen's Park,Coffee Shop,Café,Ramen Restaurant,Mexican Restaurant,Japanese Restaurant,Park,Sushi Restaurant,Dessert Shop,Theater,Gastropub
15,Ryerson,Coffee Shop,Gastropub,Japanese Restaurant,Italian Restaurant,Theater,Café,Steakhouse,Bookstore,Bubble Tea Shop,Burrito Place
26,St. James Town,Coffee Shop,Restaurant,Japanese Restaurant,Café,Gay Bar,Diner,Thai Restaurant,Gastropub,Juice Bar,Park
39,Central Bay Street,Coffee Shop,Park,Italian Restaurant,French Restaurant,Bookstore,Hotel,Gourmet Shop,Grocery Store,Juice Bar,Men's Store
85,Victoria Hotel,Coffee Shop,Japanese Restaurant,Restaurant,Café,Gay Bar,Thai Restaurant,Diner,Gastropub,Hotel,Men's Store
93,Willowdale,Coffee Shop,Grocery Store,Japanese Restaurant,Pizza Place,Sushi Restaurant,Ramen Restaurant,Fried Chicken Joint,Middle Eastern Restaurant,Sandwich Place,Fast Food Restaurant
103,Willowdale East,Coffee Shop,Grocery Store,Japanese Restaurant,Pizza Place,Sushi Restaurant,Ramen Restaurant,Fried Chicken Joint,Middle Eastern Restaurant,Sandwich Place,Fast Food Restaurant
105,Roselawn,Coffee Shop,Italian Restaurant,Fast Food Restaurant,Restaurant,Café,Pub,Sushi Restaurant,Bakery,Deli / Bodega,Japanese Restaurant
106,Runnymede,Coffee Shop,Café,Pizza Place,Sushi Restaurant,Bakery,Pub,Italian Restaurant,Falafel Restaurant,Bank,Diner


#### 5) Cluster 5 (Cyan color)

In [90]:
toronto_neighborhood_venues.loc[toronto_neighborhood_venues['Cluster Labels'] == 4,
                                toronto_neighborhood_venues.columns[[0] +
                                                                    list(range(5, cols))]]

Unnamed: 0,Neighborhood,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
3,Harbourfront,Coffee Shop,Hotel,Café,Park,Japanese Restaurant,Gym,Scenic Lookout,Brewery,Sushi Restaurant,Seafood Restaurant
14,Garden District,Coffee Shop,Café,Gastropub,Japanese Restaurant,Italian Restaurant,Creperie,Clothing Store,Farmers Market,Seafood Restaurant,Tea Room
18,Princess Gardens,Coffee Shop,Hotel,Park,Italian Restaurant,Café,Gym,Restaurant,Scenic Lookout,Yoga Studio,Beer Bar
19,Martin Grove,Coffee Shop,Café,Gastropub,Restaurant,Theater,Pizza Place,Furniture / Home Store,Steakhouse,Sandwich Place,Clothing Store
36,Berczy Park,Coffee Shop,Café,Seafood Restaurant,Restaurant,Japanese Restaurant,Gastropub,Hotel,Bakery,Creperie,Furniture / Home Store
48,Adelaide,Coffee Shop,Café,Restaurant,Gastropub,Hotel,Seafood Restaurant,Japanese Restaurant,Concert Hall,Art Gallery,Thai Restaurant
49,King,Coffee Shop,Café,Restaurant,Hotel,Gastropub,Seafood Restaurant,Concert Hall,Japanese Restaurant,Bakery,Theater
60,Harbourfront East,Coffee Shop,Hotel,Café,Park,Japanese Restaurant,Gym,Scenic Lookout,Brewery,Sushi Restaurant,Seafood Restaurant
61,Union Station,Hotel,Café,Coffee Shop,Concert Hall,Seafood Restaurant,Restaurant,Japanese Restaurant,Park,Steakhouse,Baseball Stadium
64,Trinity,Coffee Shop,Hotel,Italian Restaurant,Boutique,Café,Restaurant,Sushi Restaurant,Clothing Store,Park,French Restaurant


In [91]:
toronto_neighborhood_venues.groupby('Cluster Labels').count()['Neighborhood']

Cluster Labels
0     17
1     30
2    102
3     26
4     20
Name: Neighborhood, dtype: int64

<a id="5.3"></a>
### 5.3 Clusters of Metro Vancouver

In [92]:
# create map
address = 'Vancouver, BC'
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
vancouver_clusters_map = folium.Map(location=[latitude, longitude], zoom_start=11)

color_list = ['gray', 'orange', 'lime', 'magenta', 'cyan']

# add one circle marker per neighborhood; the outline color encodes the cluster label
for lat, lon, poi, cluster in zip(vancouver_neighborhood_venues['Latitude'],
                                  vancouver_neighborhood_venues['Longitude'],
                                  vancouver_neighborhood_venues['Neighborhood'],
                                  vancouver_neighborhood_venues['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=color_list[cluster],
        fill=True,
        fill_color='black',
        fill_opacity=0.7).add_to(vancouver_clusters_map)

vancouver_clusters_map
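
The cluster labels stored in `Cluster Labels` start at 0, while the headings that follow number the clusters from 1 and name each by its marker color. The sketch below spells out that mapping. It reuses `color_list` from the cell above; the helper `cluster_heading` is hypothetical and only illustrates the offset.

```python
# Colors as defined in the map cell above; list position == cluster label.
color_list = ['gray', 'orange', 'lime', 'magenta', 'cyan']

def cluster_heading(label):
    """Return the 1-based heading text for a 0-based cluster label (hypothetical helper)."""
    color = color_list[label]
    return f"Cluster {label + 1} ({color.capitalize()} color)"

for label in range(len(color_list)):
    print(label, '->', cluster_heading(label))
```

So a popup that reads "Cluster 0" on the map belongs to the section titled "Cluster 1 (Gray color)".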

#### 1) Cluster 1 (Gray color)

In [93]:
cols = vancouver_neighborhood_venues.shape[1]

In [94]:
vancouver_neighborhood_venues.loc[vancouver_neighborhood_venues['Cluster Labels'] == 0,
                                  vancouver_neighborhood_venues.columns[[0] +
                                                                        list(range(5, cols))]]

Unnamed: 0,Neighborhood,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
0,Arbutus Ridge,Bubble Tea Shop,Bakery,Burger Joint,Caribbean Restaurant,Seafood Restaurant,Sushi Restaurant,Sandwich Place,Event Space,Liquor Store,Basketball Court
6,Kensington,Trail,Japanese Restaurant,Grocery Store,Bank,Café,Shopping Plaza,Ski Area,Liquor Store,Pharmacy,Cosmetics Shop
7,Kerrisdale,Golf Course,Bus Stop,Grocery Store,Supermarket,Pool,Spanish Restaurant,Gift Shop,Park,Café,Farm
8,Killarney,Pizza Place,Juice Bar,Sandwich Place,Chinese Restaurant,Bar,Grocery Store,Coffee Shop,Recreation Center,Gas Station,Track
15,Shaughnessy,Park,Coffee Shop,Garden,Bank,Greek Restaurant,Bubble Tea Shop,Sandwich Place,Malay Restaurant,Grocery Store,Cantonese Restaurant
...,...,...,...,...,...,...,...,...,...,...,...
108,Braemar,Convenience Store,Bus Station,Bus Stop,Boat Rental,Yoga Studio,Food,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Flea Market
109,Capilano,Coffee Shop,Bank,Sandwich Place,Bakery,Park,Gift Shop,Liquor Store,Sushi Restaurant,Bistro,Hot Dog Joint
110,Carisbrooke,Coffee Shop,Chinese Restaurant,Park,Grocery Store,Sandwich Place,Bar,Shopping Mall,Donut Shop,Café,Trail
119,Cedar Village,Park,Grocery Store,Soccer Field,Pharmacy,Gym / Fitness Center,Coffee Shop,Baseball Field,Pub,Flower Shop,Flea Market


#### 2) Cluster 2 (Orange color)

In [95]:
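# Note: column index 1 is 'City', so these rows show the city rather than
# the neighborhood name (index 0, as used for Cluster 1).
vancouver_neighborhood_venues.loc[vancouver_neighborhood_venues['Cluster Labels'] == 1,
                                  vancouver_neighborhood_venues.columns[[1] +
                                                                        list(range(5, cols))]]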

Unnamed: 0,City,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
3,Metro Vancouver,Coffee Shop,Furniture / Home Store,Japanese Restaurant,Park,Restaurant,Breakfast Spot,Café,American Restaurant,Bakery,Camera Store
4,Metro Vancouver,Coffee Shop,Pizza Place,Sushi Restaurant,Sandwich Place,Chinese Restaurant,Park,Breakfast Spot,Vietnamese Restaurant,Ice Cream Shop,Bakery
9,Metro Vancouver,Coffee Shop,Yoga Studio,Bakery,Japanese Restaurant,Pizza Place,Board Shop,French Restaurant,Café,Vegetarian / Vegan Restaurant,Restaurant
11,Metro Vancouver,Coffee Shop,Brewery,Bakery,Sushi Restaurant,Vietnamese Restaurant,Yoga Studio,Park,Taco Place,Café,Chinese Restaurant
14,Metro Vancouver,Coffee Shop,Japanese Restaurant,Café,Park,Grocery Store,Garden,Vietnamese Restaurant,Restaurant,Chinese Restaurant,Pub
17,Metro Vancouver,Sandwich Place,Café,Brewery,Park,Coffee Shop,Chinese Restaurant,Restaurant,Noodle House,Asian Restaurant,Bar
111,Metro Vancouver,Coffee Shop,Sushi Restaurant,Café,Grocery Store,Middle Eastern Restaurant,Park,Indian Restaurant,Gym / Fitness Center,Italian Restaurant,Fast Food Restaurant
116,Metro Vancouver,Coffee Shop,Gastropub,Restaurant,Sushi Restaurant,Japanese Restaurant,Brewery,Ice Cream Shop,Korean Restaurant,Café,Breakfast Spot
117,Metro Vancouver,Coffee Shop,Sushi Restaurant,Café,Middle Eastern Restaurant,Park,Grocery Store,Gym / Fitness Center,Fast Food Restaurant,Mediterranean Restaurant,Convenience Store
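
The table above lists `City` where the Cluster 1 table lists `Neighborhood`, because the column selector starts at index 1 instead of 0. The sketch below reproduces the index arithmetic on a plain list of column names. Only the positions of `Neighborhood` (0) and `City` (1) are confirmed by the outputs; the order of the next three names is an assumption, and `pick` is a hypothetical helper.

```python
# Assumed column order: only 'Neighborhood' (index 0) and 'City' (index 1)
# are confirmed by the outputs; the order of the next three is a guess.
columns = ['Neighborhood', 'City', 'Latitude', 'Longitude', 'Cluster Labels',
           '1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th']
cols = len(columns)  # plays the role of DataFrame.shape[1]

def pick(first_index):
    """Names selected by columns[[first_index] + list(range(5, cols))]."""
    positions = [first_index] + list(range(5, cols))
    return [columns[i] for i in positions]

by_neighborhood = pick(0)
by_city = pick(1)
print(by_neighborhood[:3])  # ['Neighborhood', '1st', '2nd']
print(by_city[:3])          # ['City', '1st', '2nd']
```

Either way, the selection skips positions 2 to 4 and keeps the ten ranked venue columns; only the leading label column changes.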


#### 3) Cluster 3 (Lime color)

In [96]:
vancouver_neighborhood_venues.loc[vancouver_neighborhood_venues['Cluster Labels'] == 2,
                                  vancouver_neighborhood_venues.columns[[1] +
                                                                        list(range(5, cols))]]

Unnamed: 0,City,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
24,Metro Vancouver,Korean Restaurant,Fast Food Restaurant,Japanese Restaurant,Grocery Store,Gym,Bank,Burger Joint,Coffee Shop,Breakfast Spot,Park
46,Metro Vancouver,Bakery,Chinese Restaurant,Dessert Shop,Asian Restaurant,Coffee Shop,Korean Restaurant,Sporting Goods Shop,Breakfast Spot,Sushi Restaurant,Café
47,Metro Vancouver,Chinese Restaurant,Korean Restaurant,Vietnamese Restaurant,Japanese Restaurant,Taiwanese Restaurant,Deli / Bodega,Fast Food Restaurant,Cantonese Restaurant,Sandwich Place,Fried Chicken Joint
49,Metro Vancouver,Chinese Restaurant,Pharmacy,Sushi Restaurant,Café,Korean Restaurant,Bank,Coffee Shop,Asian Restaurant,Thai Restaurant,Gym
96,Metro Vancouver,Chinese Restaurant,Hotel,Coffee Shop,Bus Stop,Bubble Tea Shop,Seafood Restaurant,Vietnamese Restaurant,Japanese Restaurant,Italian Restaurant,Bakery
97,Metro Vancouver,Chinese Restaurant,Bank,Coffee Shop,Clothing Store,Vietnamese Restaurant,Sushi Restaurant,Fast Food Restaurant,Sandwich Place,Bakery,Bubble Tea Shop
100,Metro Vancouver,Chinese Restaurant,Food Court,Japanese Restaurant,Bubble Tea Shop,Coffee Shop,Korean Restaurant,Supermarket,Shopping Mall,Grocery Store,Dessert Shop
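
One quick way to characterize a cluster is to tally which category ranks first in each of its neighborhoods. The sketch below does this for the seven rows shown above, with the `1st` values copied by hand; it is an illustration of the tally, not part of the original analysis.

```python
from collections import Counter

# '1st' column of the seven Cluster 3 rows shown above, in row order.
first_ranked = [
    'Korean Restaurant', 'Bakery', 'Chinese Restaurant', 'Chinese Restaurant',
    'Chinese Restaurant', 'Chinese Restaurant', 'Chinese Restaurant',
]

tally = Counter(first_ranked)
top_category, top_count = tally.most_common(1)[0]
print(f"{top_category}: {top_count} of {len(first_ranked)} neighborhoods")
# Chinese Restaurant: 5 of 7 neighborhoods
```

On the DataFrame itself, calling `value_counts()` on the `1st` column of the filtered rows should give the same tally.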


#### 4) Cluster 4 (Magenta color)

In [97]:
vancouver_neighborhood_venues.loc[vancouver_neighborhood_venues['Cluster Labels'] == 3,
                                  vancouver_neighborhood_venues.columns[[1] +
                                                                        list(range(5, cols))]]

Unnamed: 0,City,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
2,Metro Vancouver,Sushi Restaurant,Coffee Shop,Liquor Store,Bank,Indie Movie Theater,Bakery,Pharmacy,Cosmetics Shop,Park,Pub
5,Metro Vancouver,Park,Vietnamese Restaurant,Coffee Shop,Theme Park Ride / Attraction,Fast Food Restaurant,Pharmacy,Convenience Store,Event Space,Theme Park,Gas Station
10,Metro Vancouver,Pizza Place,Sushi Restaurant,Chinese Restaurant,Café,Bus Stop,Vietnamese Restaurant,Bank,Liquor Store,Japanese Restaurant,Bubble Tea Shop
12,Metro Vancouver,Park,Women's Store,Coffee Shop,Sushi Restaurant,Gift Shop,Toy / Game Store,Fast Food Restaurant,Pharmacy,Sporting Goods Shop,Tea Room
13,Metro Vancouver,Bus Stop,Pet Store,Pizza Place,Convenience Store,Furniture / Home Store,Breakfast Spot,Coffee Shop,Sandwich Place,Grocery Store,Chinese Restaurant
16,Metro Vancouver,Coffee Shop,Park,Garden,Sushi Restaurant,Sandwich Place,Chinese Restaurant,Grocery Store,Bank,Seafood Restaurant,Malay Restaurant
18,Metro Vancouver,Indian Restaurant,Chinese Restaurant,Bus Stop,Park,Bank,Bakery,Sushi Restaurant,Restaurant,Market,Pharmacy
22,Metro Vancouver,Café,Park,Sushi Restaurant,Vietnamese Restaurant,Trail,Pub,Grocery Store,Thai Restaurant,Bank,French Restaurant
23,Metro Vancouver,Sushi Restaurant,Café,Vietnamese Restaurant,Pizza Place,Coffee Shop,Japanese Restaurant,Liquor Store,Fast Food Restaurant,Greek Restaurant,Grocery Store
25,Metro Vancouver,Coffee Shop,Sushi Restaurant,Restaurant,Pharmacy,Italian Restaurant,Gas Station,Liquor Store,Light Rail Station,Supermarket,Fried Chicken Joint


#### 5) Cluster 5 (Cyan color)

In [98]:
vancouver_neighborhood_venues.loc[vancouver_neighborhood_venues['Cluster Labels'] == 4,
                                  vancouver_neighborhood_venues.columns[[1] +
                                                                        list(range(5, cols))]]

Unnamed: 0,City,1st,2nd,3rd,4th,5th,6th,7th,8th,9th,10th
1,Metro Vancouver,Hotel,Dessert Shop,Food Truck,Clothing Store,Japanese Restaurant,Ramen Restaurant,Park,American Restaurant,Cosmetics Shop,Café
19,Metro Vancouver,Hotel,Italian Restaurant,Bakery,Japanese Restaurant,Coffee Shop,Concert Hall,Seafood Restaurant,Sushi Restaurant,Restaurant,Yoga Studio
20,Metro Vancouver,Hotel,Japanese Restaurant,Bakery,Sandwich Place,Ramen Restaurant,Sushi Restaurant,Dessert Shop,Café,Park,Gay Bar


In [99]:
vancouver_neighborhood_venues.groupby('Cluster Labels').count()['Neighborhood']

Cluster Labels
0    67
1     9
2     7
3    36
4     3
Name: Neighborhood, dtype: int64
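
Raw counts are hard to compare because the two cities have different numbers of neighborhoods (195 for Toronto and 122 for Metro Vancouver in these outputs). The sketch below converts the two `groupby` results into percentage shares. The counts are copied from the outputs above; the `shares` helper is hypothetical, and the comparison assumes the labels come from one joint clustering of both cities, as the shared color list suggests.

```python
# Neighborhood counts per cluster label, copied from the two groupby outputs.
toronto = {0: 17, 1: 30, 2: 102, 3: 26, 4: 20}
vancouver = {0: 67, 1: 9, 2: 7, 3: 36, 4: 3}

def shares(counts):
    """Percentage of a city's neighborhoods in each cluster, rounded to 0.1."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

toronto_shares = shares(toronto)
vancouver_shares = shares(vancouver)

for label in sorted(toronto):
    print(f"Cluster {label + 1}: Toronto {toronto_shares[label]}%, "
          f"Metro Vancouver {vancouver_shares[label]}%")
```

Under these counts, each city has one dominant cluster holding just over half of its neighborhoods: Cluster 3 (Lime) in Toronto at 52.3% and Cluster 1 (Gray) in Metro Vancouver at 54.9%.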