<h1 align=center><font size = 5>Opening a pizza restaurant in one of the districts of Toronto</font></h1>

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">
<font size = 3>

1. <a href="#item1">Import needed libraries</a>

2. <a href="#item2">Download and Explore Dataset from Wiki</a>   
    
3. <a href="#item3">Get the geographical coordinates of the given postal codes</a>
    
4. <a href="#item4">Explore and cluster the neighborhoods in Toronto</a>    
    
</font>
</div>

## 1. Import needed libraries

In [7]:
import numpy as np # library to handle data in a vectorized manner

!pip3 install lxml
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

# import libraries for scraping the Wikipedia page
import urllib.request
import time
!pip install beautifulsoup4
from bs4 import BeautifulSoup
from urllib.request import urlopen
import re

print('Libraries imported.')


## 2. Download and Explore Dataset from Wiki

In [8]:
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

html = urlopen(url) 

soup = BeautifulSoup(html, 'html.parser')

# the first table with class "wikitable" holds the postal codes
table = soup.find('table',{"class":"wikitable"})

In [9]:
#Create lists to hold the data we extract
postalCodes = []
boroughs = []
neighbourhoods = []

rows = table.find_all('tr')

for row in rows:
    cells = row.find_all('td')

    # header rows have no <td> cells; data rows need all three columns
    if len(cells) >= 3:
        postalCodes.append(cells[0].text.strip())
        boroughs.append(cells[1].text.strip())
        neighbourhoods.append(cells[2].text.strip())

data = {'PostalCode': postalCodes, 'Borough': boroughs, 'Neighborhood': neighbourhoods}

In [10]:
#Transform the data into a pandas dataframe
df1 = pd.DataFrame.from_dict(data)
df1.head()
#df1["PostalCode"].count()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


In [11]:
not_assigned = "Not assigned"

In [12]:
#Only process the cells that have an assigned borough. Ignore cells with a borough that is Not assigned.         
indexRows = df1[ df1['Borough'] == not_assigned ].index

#print(indexRows)

df1.drop(indexRows , inplace=True)
df1.head()
#df1["PostalCode"].count()

Unnamed: 0,PostalCode,Borough,Neighborhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


In [13]:
#More than one neighborhood can exist in one postal code area.
#For example, if M5A is listed twice with two neighborhoods, Harbourfront and Regent Park,
#the two rows are combined into one row with the neighborhoods separated by a comma.
df2 = df1.groupby(['PostalCode', 'Borough'], as_index = False).agg({'Neighborhood': ','.join})
df2.head()
#df2["PostalCode"].count()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
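
To see the grouping step in isolation, here is a minimal sketch on a small hand-made dataframe. The rows are illustrative, not scraped data, and the sketch assumes a postal code that appears on several rows, one per neighborhood.

```python
import pandas as pd

# Illustrative rows only: M5A appears twice, once per neighborhood
toy = pd.DataFrame({
    'PostalCode': ['M5A', 'M5A', 'M3A'],
    'Borough': ['Downtown Toronto', 'Downtown Toronto', 'North York'],
    'Neighborhood': ['Harbourfront', 'Regent Park', 'Parkwoods'],
})

# One row per (PostalCode, Borough); the neighborhoods are joined into one string
combined = toy.groupby(['PostalCode', 'Borough'], as_index=False).agg({'Neighborhood': ', '.join})
print(combined)
```

When every postal code already occupies a single row, the same call leaves the neighborhood strings unchanged.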


In [14]:
#If a cell has a borough but a "Not assigned" neighborhood, then the neighborhood will be the same as the borough.
df2['Neighborhood'] = np.where((df2.Neighborhood == not_assigned),df2.Borough,df2.Neighborhood)
df2.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
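
The fallback from neighborhood to borough can be checked on a couple of hand-made rows. This is only a sketch with illustrative values; it uses the same `np.where` pattern as the cell above.

```python
import numpy as np
import pandas as pd

not_assigned = "Not assigned"

# Illustrative rows only
toy = pd.DataFrame({
    'Borough': ["Queen's Park", 'Scarborough'],
    'Neighborhood': [not_assigned, 'Woburn'],
})

# Where the neighborhood is "Not assigned", copy the borough; otherwise keep the neighborhood
toy['Neighborhood'] = np.where(toy['Neighborhood'] == not_assigned, toy['Borough'], toy['Neighborhood'])
print(toy)
```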


In [15]:
df2.shape

(103, 3)

## 3. Get the geographical coordinates of the given postal codes

In [16]:
!pip install geocoder
import geocoder # import geocoder
print("Geocoder installed!")

Geocoder installed!


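# initialize your variable to None
lat_lng_coords = None

#postal_code = "M4V"

# try a limited number of times so the cell cannot hang if the service never answers
max_attempts = 10
for attempt in range(max_attempts):
    #g = geocoder.google('{}, Toronto, Ontario'.format(postal_code))
    g = geocoder.google('Mountain View, CA')
    lat_lng_coords = g.latlng
    if lat_lng_coords is not None:
        break

if lat_lng_coords is None:
    raise RuntimeError('Geocoder returned no coordinates after {} attempts'.format(max_attempts))

latitude = lat_lng_coords[0]
longitude = lat_lng_coords[1]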
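
Because a geocoding service may return nothing for a while, a bounded retry keeps a lookup from running forever. The following is a minimal sketch, not part of the geocoder package: `retry_geocode` and `flaky_lookup` are hypothetical helpers, and the stand-in lookup replaces a real network call so the example runs offline.

```python
import time

def retry_geocode(lookup, query, max_attempts=5, delay_seconds=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out.

    `lookup` is any callable returning a [lat, lng] pair or None;
    it stands in for a real geocoding call (hypothetical helper).
    """
    for attempt in range(1, max_attempts + 1):
        coords = lookup(query)
        if coords is not None:
            return coords
        time.sleep(delay_seconds * attempt)  # wait a little longer after each failure
    return None


# Stand-in geocoder that fails twice before answering (illustrative values only)
calls = {'count': 0}

def flaky_lookup(query):
    calls['count'] += 1
    return None if calls['count'] < 3 else [43.0, -79.0]

result = retry_geocode(flaky_lookup, 'M4V, Toronto, Ontario')
print(result, 'after', calls['count'], 'calls')
```

Returning `None` after the last attempt leaves it to the caller to decide whether to fall back to another data source.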

### The Geocoder package was very unreliable, so the coordinates are imported from a CSV file instead

In [17]:
df_geo_temp = pd.read_csv("https://cocl.us/Geospatial_data/Geospatial_Coordinates.csv")
#print(df_geo_temp.rename(columns={'Postal Code': 'PostalCode'}))
df_geo = df_geo_temp.rename(columns={'Postal Code': 'PostalCode'})
df_geo.head()

Unnamed: 0,PostalCode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [18]:
#merge 2 dataframes in order to get geo coordinates
df_complete = pd.merge(df2, df_geo, on='PostalCode')
df_complete.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
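
An inner merge silently drops postal codes that have no coordinates. One way to check for that, sketched here on illustrative rows (M9Z is a made-up code), is a left merge with the `indicator` option of `pd.merge`.

```python
import pandas as pd

# Illustrative rows only; M9Z is a made-up code with no coordinates
codes = pd.DataFrame({'PostalCode': ['M1B', 'M1C', 'M9Z']})
coords = pd.DataFrame({
    'PostalCode': ['M1B', 'M1C'],
    'Latitude': [43.806686, 43.784535],
    'Longitude': [-79.194353, -79.160497],
})

# how='left' keeps every postal code; indicator=True adds a _merge column
checked = pd.merge(codes, coords, on='PostalCode', how='left', indicator=True)
missing = checked.loc[checked['_merge'] == 'left_only', 'PostalCode'].tolist()
print(missing)
```

An empty `missing` list would mean that every postal code found its coordinates.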


## 4. Explore and cluster the neighborhoods in Toronto

In [19]:
# instantiate the dataframe
neighborhoods = df_complete
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 10 boroughs and 103 neighborhoods.


In [20]:
# Use geopy library to get the latitude and longitude values of Toronto.
address = 'Toronto'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


In [22]:
# create map of Toronto using latitude and longitude values with neighborhoods superimposed on top.
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

### Define Foursquare Credentials and Version

In [23]:
CLIENT_ID = '<YOUR_FOURSQUARE_CLIENT_ID>' # your Foursquare ID
CLIENT_SECRET = '<YOUR_FOURSQUARE_CLIENT_SECRET>' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: <YOUR_FOURSQUARE_CLIENT_ID>
CLIENT_SECRET: <YOUR_FOURSQUARE_CLIENT_SECRET>
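
Credentials written into a notebook are visible to anyone who can read the file. A minimal sketch of one alternative is to read them from environment variables and to mask them when printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions made for this sketch.

```python
import os

def load_credentials(env=os.environ):
    """Read the Foursquare credentials from environment variables.

    The variable names are assumptions chosen for this sketch.
    """
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first')
    return client_id, client_secret

def mask(value, visible=4):
    """Show only the first few characters of a secret when printing it."""
    return value[:visible] + '*' * (len(value) - visible)

# Demonstration with a stand-in mapping instead of the real environment
demo_env = {'FOURSQUARE_CLIENT_ID': 'EXAMPLEID123', 'FOURSQUARE_CLIENT_SECRET': 'EXAMPLESECRET456'}
CLIENT_ID, CLIENT_SECRET = load_credentials(demo_env)
print('CLIENT_ID:', mask(CLIENT_ID))
```

In a real session, `load_credentials()` would be called without arguments so that it reads `os.environ`.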


### North York

In [24]:
# Let's select the North York neighborhoods from our dataframe.
map_north_york_data = neighborhoods[neighborhoods['Borough'] == 'North York'].reset_index(drop=True)
map_north_york_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M2H,North York,Hillcrest Village,43.803762,-79.363452
1,M2J,North York,"Fairview, Henry Farm, Oriole",43.778517,-79.346556
2,M2K,North York,Bayview Village,43.786947,-79.385975
3,M2L,North York,"York Mills, Silver Hills",43.75749,-79.374714
4,M2M,North York,"Willowdale, Newtonbrook",43.789053,-79.408493


In [25]:
# Let's get the geographical coordinates of North York.
address = 'North York, Toronto'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of North York, Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of North York, Toronto are 43.7543263, -79.44911696639593.


In [26]:
# create map of North York using latitude and longitude values
map_north_york = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(map_north_york_data['Latitude'], map_north_york_data['Longitude'], map_north_york_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_north_york)  
    
map_north_york

### Downtown Toronto

In [27]:
# Let's select the Downtown Toronto neighborhoods from our dataframe.
map_downtown_toronto_data = neighborhoods[neighborhoods['Borough'] == 'Downtown Toronto'].reset_index(drop=True)
map_downtown_toronto_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529
1,M4X,Downtown Toronto,"St. James Town, Cabbagetown",43.667967,-79.367675
2,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316
3,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
4,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937


In [28]:
# Let's get the geographical coordinates of Downtown Toronto.
address = 'Downtown Toronto, Toronto'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Downtown Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Downtown Toronto are 43.6541737, -79.38081164513409.


In [30]:
# create map of Downtown Toronto using latitude and longitude values
map_downtown_toronto = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(map_downtown_toronto_data['Latitude'], map_downtown_toronto_data['Longitude'], map_downtown_toronto_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_downtown_toronto)  
    
map_downtown_toronto

### East York

In [31]:
# Let's select the East York neighborhoods from our dataframe.
map_east_york_data = neighborhoods[neighborhoods['Borough'] == 'East York'].reset_index(drop=True)
map_east_york_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M4B,East York,"Parkview Hill, Woodbine Gardens",43.706397,-79.309937
1,M4C,East York,Woodbine Heights,43.695344,-79.318389
2,M4G,East York,Leaside,43.70906,-79.363452
3,M4H,East York,Thorncliffe Park,43.705369,-79.349372
4,M4J,East York,"East Toronto, Broadview North (Old East York)",43.685347,-79.338106


In [33]:
# Let's get the geographical coordinates of East York.
address = 'East York, Toronto'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of East York, Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of East York, Toronto are 43.699971000000005, -79.33251996261595.


In [35]:
# create map of East York using latitude and longitude values
map_east_york = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(map_east_york_data['Latitude'], map_east_york_data['Longitude'], map_east_york_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_east_york)  
    
map_east_york

### York

In [36]:
# Let's select the York neighborhoods from our dataframe.
map_york_data = neighborhoods[neighborhoods['Borough'] == 'York'].reset_index(drop=True)
map_york_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M6C,York,Humewood-Cedarvale,43.693781,-79.428191
1,M6E,York,Caledonia-Fairbanks,43.689026,-79.453512
2,M6M,York,"Del Ray, Mount Dennis, Keelsdale and Silverthorn",43.691116,-79.476013
3,M6N,York,"Runnymede, The Junction North",43.673185,-79.487262
4,M9N,York,Weston,43.706876,-79.518188


In [37]:
# Let's get the geographical coordinates of York.
address = 'York, Toronto'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of York, Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of York, Toronto are 43.6896191, -79.479188.


In [38]:
# create map of York using latitude and longitude values
map_york = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(map_york_data['Latitude'], map_york_data['Longitude'], map_york_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_york)  
    
map_york

### Let's explore neighborhoods in our dataframes.

#### Let's create a function to repeat the same process for all the neighborhoods

In [192]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
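
The list comprehension in `getNearbyVenues` assumes every venue has at least one category and that the response always contains a group. The sketch below flattens a response more defensively. The nesting (`response`, `groups[0]`, `items`, `venue`) follows the request code above; the helper name `extract_venue_rows` and the sample payload are hypothetical and hand-written, not real API output.

```python
def extract_venue_rows(name, lat, lng, response_json):
    """Flatten one explore response into rows for a single neighborhood.

    The nesting (response -> groups[0] -> items -> venue) follows the
    request code above; everything else here is an illustrative sketch.
    """
    groups = response_json.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # a venue without categories gets None instead of raising IndexError
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows

# Hand-written sample payload (not real API output)
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Example Pizza', 'location': {'lat': 43.7, 'lng': -79.4},
               'categories': [{'name': 'Pizza Place'}]}},
    {'venue': {'name': 'Unlabelled Spot', 'location': {'lat': 43.71, 'lng': -79.41},
               'categories': []}},
]}]}}

rows = extract_venue_rows('Example Village', 43.7, -79.4, sample)
print(rows)
```

An empty or error response yields an empty list, so one failed request does not stop the loop over the remaining neighborhoods.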

#### Let's create a function to run getNearbyVenues on every neighborhood of a district and collect the results in a new dataframe

In [193]:
def getVenues(district_data, district_name):
    district_venues  = getNearbyVenues(names=district_data['Neighborhood'],
                                       latitudes=district_data['Latitude'],
                                       longitudes=district_data['Longitude']
                                      )
    print(district_venues.shape)
        
    district_venues.groupby('Neighborhood').count()
    
    district_venues.head()
    
    print('District name: ', district_name, ' There are {} uniques categories.'.format(len(district_venues['Venue Category'].unique())))
    
    return(district_venues)

#### Let's create a function to help to analyze each Neighborhood

In [194]:
def getDistrictOnehot(district_venues):
    # one hot encoding
    district_onehot = pd.get_dummies(district_venues[['Venue Category']], prefix="", prefix_sep="")

    # add neighborhood column back to dataframe
    district_onehot['Neighborhood'] = district_venues['Neighborhood'] 

    # move neighborhood column to the first column
    fixed_columns = [district_onehot.columns[-1]] + list(district_onehot.columns[:-1])
    district_onehot = district_onehot[fixed_columns]

    district_onehot.head()    
    return(district_onehot)

#### Let's create a function to group rows by neighborhood and by taking the mean of the frequency of occurrence of each category

In [195]:
def getDistrictGrouped(district_onehot):
    district_grouped = district_onehot.groupby('Neighborhood').mean().reset_index()
    district_grouped.shape
    district_grouped
    return(district_grouped)
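
The two helpers above can be traced on three hand-made venues. This is a minimal sketch with illustrative data: one-hot encoding turns each category into an indicator column, and the per-neighborhood mean of an indicator column is the share of venues in that category.

```python
import pandas as pd

# Illustrative venues only
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'B'],
    'Venue Category': ['Park', 'Pizza Place', 'Pizza Place'],
})

# One column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="", dtype=float)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# The mean of each indicator column is the share of venues in that category
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

Neighborhood A has one park among two venues, so its `Park` frequency is 0.5. If a category happened to be named `Neighborhood`, `insert` would raise an error instead of silently overwriting that column.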

#### Let's write a function to sort the venues in descending order.

In [196]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
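
A short usage sketch of this function on a single hand-made row; the frequencies are illustrative and chosen without ties.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighborhood name) and rank the rest
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Illustrative frequencies only, chosen without ties
row = pd.Series({'Neighborhood': 'A', 'Park': 0.2, 'Pizza Place': 0.5, 'Bank': 0.3})
top_two = list(return_most_common_venues(row, 2))
print(top_two)
```

When a neighborhood has fewer distinct categories than `num_top_venues`, the remaining positions are filled with categories whose frequency is zero, so the lower ranks of such a row carry no information.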

#### Let's create a function to create a new dataframe and display the top 10 venues for each neighborhood.

In [197]:
def displayTopVenues(district_grouped):
    num_top_venues = 10

    indicators = ['st', 'nd', 'rd']

    # create columns according to number of top venues
    columns = ['Neighborhood']
    for ind in np.arange(num_top_venues):
        try:
            columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
        except IndexError:
            # positions after the third take the 'th' suffix
            columns.append('{}th Most Common Venue'.format(ind+1))

    # create a new dataframe
    neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
    neighborhoods_venues_sorted['Neighborhood'] = district_grouped['Neighborhood']

    for ind in np.arange(district_grouped.shape[0]):
        neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(district_grouped.iloc[ind, :], num_top_venues)

    return(neighborhoods_venues_sorted)
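
The column names rely on the list lookup failing for every position after the third. As an alternative sketch, a small helper can compute the suffix directly; `ordinal` is a hypothetical name and not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns)
```

For the first ten positions this produces the same headers as the loop above, and it stays correct if `num_top_venues` is ever raised past 20.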

### Cluster Neighborhoods functions

#### Let's create a function to run k-means to cluster the neighborhood into 5 clusters.

In [209]:
def clusterNeighborhood(district_grouped, venues_sorted, district_data):
    # set number of clusters
    kclusters = 5

    # drop the label column so that only the category frequencies are clustered
    district_grouped_clustering = district_grouped.drop(columns='Neighborhood')

    # run k-means clustering
    kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(district_grouped_clustering)

    # check cluster labels generated for each row in the dataframe
    kmeans.labels_[0:10] 
    
    #print(venues_sorted)
    
    # add clustering labels
    venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

    # join the cluster labels and top venues onto district_data, which holds latitude/longitude for each neighborhood
    district_merged = district_data.join(venues_sorted.set_index('Neighborhood'), on='Neighborhood')

    district_merged.head() # check the last columns!
    return(district_merged)
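
`clusterNeighborhood` adds the labels with `insert`, which changes the dataframe passed in as `venues_sorted`, so running the cell a second time on the same dataframe raises an error. The sketch below, on illustrative data, contrasts this with `assign`, which returns a new dataframe and leaves its input untouched.

```python
import pandas as pd

# Illustrative data only
venues_sorted = pd.DataFrame({'Neighborhood': ['A', 'B'], '1st Most Common Venue': ['Park', 'Bank']})
labels = [1, 0]

# assign() returns a new dataframe, so the input is left unchanged
labelled = venues_sorted.assign(**{'Cluster Labels': labels})

# insert() modifies in place and raises if the column already exists
in_place = venues_sorted.copy()
in_place.insert(0, 'Cluster Labels', labels)
try:
    in_place.insert(0, 'Cluster Labels', labels)
    second_insert_failed = False
except ValueError:
    second_insert_failed = True

print(list(venues_sorted.columns), second_insert_failed)
```

Note that `assign` appends the new column at the end, whereas `insert(0, ...)` places it first.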

#### Create Map Function

In [211]:
def createMap(district_merged):
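    # create map centered on the district's own neighborhoods
    # (the global latitude/longitude hold whichever address was geocoded last)
    map_clusters = folium.Map(location=[district_merged['Latitude'].mean(), district_merged['Longitude'].mean()], zoom_start=11)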

    # set color scheme for the clusters
    kclusters = 5
    x = np.arange(kclusters)
    ys = [i + x + (i*x)**2 for i in range(kclusters)]
    colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
    rainbow = [colors.rgb2hex(i) for i in colors_array]

    # add markers to the map
    markers_colors = []
    for lat, lon, poi, cluster in zip(district_merged['Latitude'], district_merged['Longitude'], district_merged['Neighborhood'], district_merged['Cluster Labels']):
        label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
        folium.CircleMarker(
            [lat, lon],
            radius=5,
            popup=label,
            color=rainbow[cluster-1],
            fill=True,
            fill_color=rainbow[cluster-1],
            fill_opacity=0.7).add_to(map_clusters)
       
    return(map_clusters)

### North York analysis

In [199]:
north_york_venues = getVenues(map_north_york_data, 'North York')
north_york_venues.head()

Hillcrest Village
Fairview, Henry Farm, Oriole
Bayview Village
York Mills, Silver Hills
Willowdale, Newtonbrook
Willowdale, Willowdale East
York Mills West
Willowdale, Willowdale West
Parkwoods
Don Mills
Don Mills
Bathurst Manor, Wilson Heights, Downsview North
Northwood Park, York University
Downsview
Downsview
Downsview
Downsview
Victoria Village
Bedford Park, Lawrence Manor East
Lawrence Manor, Lawrence Heights
Glencairn
North Park, Maple Leaf Park, Upwood Park
Humber Summit
Humberlea, Emery
(240, 7)
District name:  North York  There are 97 uniques categories.


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Hillcrest Village,43.803762,-79.363452,New York Fries,43.803664,-79.363905,Fast Food Restaurant
1,Hillcrest Village,43.803762,-79.363452,Eagle's Nest Golf Club,43.805455,-79.364186,Golf Course
2,Hillcrest Village,43.803762,-79.363452,AY Jackson Pool,43.804515,-79.366138,Pool
3,Hillcrest Village,43.803762,-79.363452,Villa Madina,43.801685,-79.363938,Mediterranean Restaurant
4,Hillcrest Village,43.803762,-79.363452,Duncan Creek Park,43.805539,-79.360695,Dog Run


In [200]:
north_york_onehot = getDistrictOnehot(north_york_venues)

In [201]:
north_york_grouped = getDistrictGrouped(north_york_onehot)
north_york_grouped.shape

(18, 98)

In [202]:
north_york_sorted = displayTopVenues(north_york_grouped)
north_york_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Bathurst Manor, Wilson Heights, Downsview North",Coffee Shop,Bank,Middle Eastern Restaurant,Sandwich Place,Pharmacy,Pizza Place,Bridal Shop,Deli / Bodega,Ice Cream Shop,Intersection
1,Bayview Village,Chinese Restaurant,Café,Bank,Japanese Restaurant,Diner,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Deli / Bodega
2,"Bedford Park, Lawrence Manor East",Juice Bar,Coffee Shop,Italian Restaurant,Sandwich Place,Greek Restaurant,Restaurant,Café,Butcher,Comfort Food Restaurant,Indian Restaurant
3,Don Mills,Gym,Coffee Shop,Restaurant,Japanese Restaurant,Beer Store,Clothing Store,Chinese Restaurant,Caribbean Restaurant,Italian Restaurant,Dim Sum Restaurant
4,Downsview,Park,Grocery Store,Food Truck,Bank,Construction & Landscaping,Liquor Store,Discount Store,Baseball Field,Shopping Mall,Home Service


In [203]:
north_york_sorted.shape
north_york_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Bathurst Manor, Wilson Heights, Downsview North",Coffee Shop,Bank,Middle Eastern Restaurant,Sandwich Place,Pharmacy,Pizza Place,Bridal Shop,Deli / Bodega,Ice Cream Shop,Intersection
1,Bayview Village,Chinese Restaurant,Café,Bank,Japanese Restaurant,Diner,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Deli / Bodega
2,"Bedford Park, Lawrence Manor East",Juice Bar,Coffee Shop,Italian Restaurant,Sandwich Place,Greek Restaurant,Restaurant,Café,Butcher,Comfort Food Restaurant,Indian Restaurant
3,Don Mills,Gym,Coffee Shop,Restaurant,Japanese Restaurant,Beer Store,Clothing Store,Chinese Restaurant,Caribbean Restaurant,Italian Restaurant,Dim Sum Restaurant
4,Downsview,Park,Grocery Store,Food Truck,Bank,Construction & Landscaping,Liquor Store,Discount Store,Baseball Field,Shopping Mall,Home Service


In [204]:
north_york_grouped.shape
north_york_grouped.head()

Unnamed: 0,Neighborhood,Accessories Store,Airport,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Bakery,Bank,Bar,Baseball Field,Basketball Court,Beer Store,Bike Shop,Boutique,Bridal Shop,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Butcher,Café,Caribbean Restaurant,Chinese Restaurant,Chocolate Shop,Clothing Store,Coffee Shop,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Electronics Store,Event Space,Fast Food Restaurant,Food & Drink Shop,Food Court,Food Truck,Fried Chicken Joint,Furniture / Home Store,Gas Station,Golf Course,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hobby Shop,Hockey Arena,Home Service,Hotel,Ice Cream Shop,Indian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Lingerie Store,Liquor Store,Lounge,Massage Studio,Mediterranean Restaurant,Metro Station,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Movie Theater,Park,Pet Store,Pharmacy,Pizza Place,Plaza,Pool,Portuguese Restaurant,Pub,Ramen Restaurant,Restaurant,Salon / Barbershop,Sandwich Place,Shoe Store,Shopping Mall,Sporting Goods Shop,Steakhouse,Supermarket,Sushi Restaurant,Thai Restaurant,Theater,Toy / Game Store,Video Game Store,Vietnamese Restaurant,Women's Store
0,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.045455,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.0,0.045455,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0
1,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bedford Park, Lawrence Manor East",0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.086957,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.043478,0.0,0.086957,0.0,0.0,0.086957,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.043478,0.0,0.043478,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0
3,Don Mills,0.0,0.0,0.0,0.041667,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.041667,0.041667,0.0,0.041667,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.041667,0.0,0.0,0.041667,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Downsview,0.0,0.066667,0.0,0.0,0.0,0.0,0.066667,0.0,0.066667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.066667,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [205]:
north_york_merged = clusterNeighborhood(north_york_grouped, north_york_sorted, map_north_york_data)

In [207]:
north_york_merged.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M2H,North York,Hillcrest Village,43.803762,-79.363452,3.0,Golf Course,Mediterranean Restaurant,Fast Food Restaurant,Pool,Dog Run,Dessert Shop,Coffee Shop,Comfort Food Restaurant,Construction & Landscaping,Convenience Store
1,M2J,North York,"Fairview, Henry Farm, Oriole",43.778517,-79.346556,3.0,Clothing Store,Coffee Shop,Fast Food Restaurant,Restaurant,Juice Bar,Jewelry Store,Bakery,Bank,Japanese Restaurant,Food Court
2,M2K,North York,Bayview Village,43.786947,-79.385975,3.0,Chinese Restaurant,Café,Bank,Japanese Restaurant,Diner,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Deli / Bodega
3,M2L,North York,"York Mills, Silver Hills",43.75749,-79.374714,,,,,,,,,,,
4,M2M,North York,"Willowdale, Newtonbrook",43.789053,-79.408493,,,,,,,,,,,


In [208]:
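# Neighborhoods for which no venues were returned have no cluster label (NaN).
# fillna(0) places them in cluster 0, together with the neighborhoods that k-means assigned to label 0.
north_york_merged['Cluster Labels'] = north_york_merged['Cluster Labels'].fillna(0).astype(int)
north_york_merged.head(145) # check the last columns!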

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M2H,North York,Hillcrest Village,43.803762,-79.363452,3,Golf Course,Mediterranean Restaurant,Fast Food Restaurant,Pool,Dog Run,Dessert Shop,Coffee Shop,Comfort Food Restaurant,Construction & Landscaping,Convenience Store
1,M2J,North York,"Fairview, Henry Farm, Oriole",43.778517,-79.346556,3,Clothing Store,Coffee Shop,Fast Food Restaurant,Restaurant,Juice Bar,Jewelry Store,Bakery,Bank,Japanese Restaurant,Food Court
2,M2K,North York,Bayview Village,43.786947,-79.385975,3,Chinese Restaurant,Café,Bank,Japanese Restaurant,Diner,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Deli / Bodega
3,M2L,North York,"York Mills, Silver Hills",43.75749,-79.374714,0,,,,,,,,,,
4,M2M,North York,"Willowdale, Newtonbrook",43.789053,-79.408493,0,,,,,,,,,,
5,M2N,North York,"Willowdale, Willowdale East",43.77012,-79.408493,3,Ramen Restaurant,Sandwich Place,Café,Sushi Restaurant,Restaurant,Coffee Shop,Pizza Place,Shopping Mall,Plaza,Pet Store
6,M2P,North York,York Mills West,43.752758,-79.400049,0,Park,Convenience Store,Bar,Women's Store,Diner,Coffee Shop,Comfort Food Restaurant,Construction & Landscaping,Cosmetics Shop,Deli / Bodega
7,M2R,North York,"Willowdale, Willowdale West",43.782736,-79.442259,1,Coffee Shop,Grocery Store,Butcher,Pharmacy,Pizza Place,Women's Store,Dessert Shop,Comfort Food Restaurant,Construction & Landscaping,Convenience Store
8,M3A,North York,Parkwoods,43.753259,-79.329656,3,Park,Food & Drink Shop,Hotel,Women's Store,Dessert Shop,Clothing Store,Coffee Shop,Comfort Food Restaurant,Construction & Landscaping,Convenience Store
9,M3B,North York,Don Mills,43.745906,-79.352188,3,Gym,Coffee Shop,Restaurant,Japanese Restaurant,Beer Store,Clothing Store,Chinese Restaurant,Caribbean Restaurant,Italian Restaurant,Dim Sum Restaurant
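
Filling missing labels with 0 gives neighborhoods that have no venue data (rows 3 and 4 above) the same label as genuine members of cluster 0. One possible alternative, which is not what the notebook does, is to keep the gaps visible. The sketch below shows two ways on a toy frame; the sentinel value -1 is an arbitrary choice made for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the float labels with gaps shown in the merged table
toy = pd.DataFrame({
    "Neighborhood": ["Hillcrest Village", "York Mills, Silver Hills", "York Mills West"],
    "Cluster Labels": [3.0, np.nan, 0.0],
})

# Option 1: nullable integers keep missing labels as <NA>
nullable = toy["Cluster Labels"].astype("Int64")

# Option 2: a sentinel that cannot collide with a k-means label
sentinel = toy["Cluster Labels"].fillna(-1).astype(int)

print(nullable.tolist())
print(sentinel.tolist())
```

Downstream code that uses the label as an index, for example to pick a marker colour, would then need to handle the missing case explicitly.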


In [212]:
createMap(north_york_merged)

### Examine Clusters North York

In [236]:
# Cluster 1 (label 0); also lists neighborhoods whose missing labels were filled with 0
north_york_merged.loc[north_york_merged['Cluster Labels'] == 0, north_york_merged.columns[[2] + list(range(5, north_york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,"York Mills, Silver Hills",0,,,,,,,,,,
4,"Willowdale, Newtonbrook",0,,,,,,,,,,
6,York Mills West,0,Park,Convenience Store,Bar,Women's Store,Diner,Coffee Shop,Comfort Food Restaurant,Construction & Landscaping,Cosmetics Shop,Deli / Bodega


In [237]:
# Cluster 2
north_york_merged.loc[north_york_merged['Cluster Labels'] == 1, north_york_merged.columns[[2] + list(range(5, north_york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,"Willowdale, Willowdale West",1,Coffee Shop,Grocery Store,Butcher,Pharmacy,Pizza Place,Women's Store,Dessert Shop,Comfort Food Restaurant,Construction & Landscaping,Convenience Store
17,Victoria Village,1,Coffee Shop,Pizza Place,Hockey Arena,Portuguese Restaurant,Women's Store,Dessert Shop,Clothing Store,Comfort Food Restaurant,Construction & Landscaping,Convenience Store


In [238]:
# Cluster 3
north_york_merged.loc[north_york_merged['Cluster Labels'] == 2, north_york_merged.columns[[2] + list(range(5, north_york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
22,Humber Summit,2,Furniture / Home Store,Intersection,Women's Store,Diner,Coffee Shop,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Deli / Bodega


In [239]:
# Cluster 4
north_york_merged.loc[north_york_merged['Cluster Labels'] == 3, north_york_merged.columns[[2] + list(range(5, north_york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Hillcrest Village,3,Golf Course,Mediterranean Restaurant,Fast Food Restaurant,Pool,Dog Run,Dessert Shop,Coffee Shop,Comfort Food Restaurant,Construction & Landscaping,Convenience Store
1,"Fairview, Henry Farm, Oriole",3,Clothing Store,Coffee Shop,Fast Food Restaurant,Restaurant,Juice Bar,Jewelry Store,Bakery,Bank,Japanese Restaurant,Food Court
2,Bayview Village,3,Chinese Restaurant,Café,Bank,Japanese Restaurant,Diner,Comfort Food Restaurant,Construction & Landscaping,Convenience Store,Cosmetics Shop,Deli / Bodega
5,"Willowdale, Willowdale East",3,Ramen Restaurant,Sandwich Place,Café,Sushi Restaurant,Restaurant,Coffee Shop,Pizza Place,Shopping Mall,Plaza,Pet Store
8,Parkwoods,3,Park,Food & Drink Shop,Hotel,Women's Store,Dessert Shop,Clothing Store,Coffee Shop,Comfort Food Restaurant,Construction & Landscaping,Convenience Store
9,Don Mills,3,Gym,Coffee Shop,Restaurant,Japanese Restaurant,Beer Store,Clothing Store,Chinese Restaurant,Caribbean Restaurant,Italian Restaurant,Dim Sum Restaurant
10,Don Mills,3,Gym,Coffee Shop,Restaurant,Japanese Restaurant,Beer Store,Clothing Store,Chinese Restaurant,Caribbean Restaurant,Italian Restaurant,Dim Sum Restaurant
11,"Bathurst Manor, Wilson Heights, Downsview North",3,Coffee Shop,Bank,Middle Eastern Restaurant,Sandwich Place,Pharmacy,Pizza Place,Bridal Shop,Deli / Bodega,Ice Cream Shop,Intersection
12,"Northwood Park, York University",3,Furniture / Home Store,Bar,Metro Station,Caribbean Restaurant,Massage Studio,Coffee Shop,Vietnamese Restaurant,Food & Drink Shop,Fast Food Restaurant,Fried Chicken Joint
13,Downsview,3,Park,Grocery Store,Food Truck,Bank,Construction & Landscaping,Liquor Store,Discount Store,Baseball Field,Shopping Mall,Home Service
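
The four cells above differ only in the label value. A small helper can remove that repetition. This sketch uses a hypothetical function name `show_cluster` and a toy frame built from three rows of the table above. It selects columns by name rather than by position, so it does not depend on the column order.

```python
import pandas as pd

def show_cluster(merged, label, label_col="Cluster Labels"):
    """Return the rows of one cluster without the location columns."""
    location_cols = ("PostalCode", "Borough", "Latitude", "Longitude")
    drop = [c for c in location_cols if c in merged.columns]
    return merged.loc[merged[label_col] == label].drop(columns=drop)

# Three rows copied from the North York table (first venue column only)
toy = pd.DataFrame({
    "PostalCode": ["M2H", "M2K", "M2R"],
    "Borough": ["North York"] * 3,
    "Neighborhood": ["Hillcrest Village", "Bayview Village", "Willowdale, Willowdale West"],
    "Latitude": [43.803762, 43.786947, 43.782736],
    "Longitude": [-79.363452, -79.385975, -79.442259],
    "Cluster Labels": [3, 3, 1],
    "1st Most Common Venue": ["Golf Course", "Chinese Restaurant", "Coffee Shop"],
})

# Print every cluster present in the frame, one after another
for label in sorted(toy["Cluster Labels"].unique()):
    print(f"Label {label}")
    print(show_cluster(toy, label))
```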


#### Cluster 5

### Downtown Toronto analysis

In [235]:
downtown_toronto_venues = getVenues(map_downtown_toronto_data, 'Downtown Toronto')
downtown_toronto_onehot = getDistrictOnehot(downtown_toronto_venues)
downtown_toronto_grouped = getDistrictGrouped(downtown_toronto_onehot)
downtown_toronto_sorted = displayTopVenues(downtown_toronto_grouped)
downtown_toronto_merged = clusterNeighborhood(downtown_toronto_grouped, downtown_toronto_sorted, map_downtown_toronto_data)
downtown_toronto_merged['Cluster Labels'] = downtown_toronto_merged['Cluster Labels'].fillna(0).astype(int)
createMap(downtown_toronto_merged)

Rosedale
St. James Town, Cabbagetown
Church and Wellesley
Regent Park, Harbourfront
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Richmond, Adelaide, King
Harbourfront East, Union Station, Toronto Islands
Toronto Dominion Centre, Design Exchange
Commerce Court, Victoria Hotel
University of Toronto, Harbord
Kensington Market, Chinatown, Grange Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
Stn A PO Boxes
First Canadian Place, Underground city
Christie
Queen's Park, Ontario Provincial Government
(1225, 7)
District name:  Downtown Toronto  There are 206 uniques categories.


### Examine Clusters Downtown Toronto

In [240]:
# Cluster 1
downtown_toronto_merged.loc[downtown_toronto_merged['Cluster Labels'] == 0, downtown_toronto_merged.columns[[2] + list(range(5, downtown_toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,"St. James Town, Cabbagetown",0,Coffee Shop,Park,Italian Restaurant,Café,Pet Store,Pizza Place,Bakery,Pub,Restaurant,Japanese Restaurant
2,Church and Wellesley,0,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Fast Food Restaurant,Restaurant,Gay Bar,Yoga Studio,Men's Store,Café,Pizza Place
4,"Garden District, Ryerson",0,Clothing Store,Coffee Shop,Café,Cosmetics Shop,Middle Eastern Restaurant,Japanese Restaurant,Hotel,Bubble Tea Shop,Fast Food Restaurant,Lingerie Store
5,St. James Town,0,Coffee Shop,Café,Gastropub,Cocktail Bar,American Restaurant,Gym,Clothing Store,Beer Bar,Hotel,Seafood Restaurant
6,Berczy Park,0,Coffee Shop,Cocktail Bar,Cheese Shop,Beer Bar,Farmers Market,Seafood Restaurant,Restaurant,Bakery,Shopping Mall,Eastern European Restaurant
8,"Richmond, Adelaide, King",0,Coffee Shop,Café,Restaurant,Gym,Clothing Store,Thai Restaurant,Deli / Bodega,Sushi Restaurant,Pizza Place,Burrito Place
9,"Harbourfront East, Union Station, Toronto Islands",0,Coffee Shop,Aquarium,Hotel,Café,Restaurant,Brewery,Italian Restaurant,Scenic Lookout,Fried Chicken Joint,Park
10,"Toronto Dominion Centre, Design Exchange",0,Coffee Shop,Hotel,Café,Restaurant,Japanese Restaurant,Italian Restaurant,Salad Place,American Restaurant,Seafood Restaurant,Sporting Goods Shop
11,"Commerce Court, Victoria Hotel",0,Coffee Shop,Restaurant,Café,Hotel,Gym,American Restaurant,Seafood Restaurant,Japanese Restaurant,Italian Restaurant,Deli / Bodega
12,"University of Toronto, Harbord",0,Café,Bookstore,Bar,Japanese Restaurant,Bakery,Yoga Studio,Italian Restaurant,Beer Bar,College Gym,Sandwich Place


In [241]:
# Cluster 2
downtown_toronto_merged.loc[downtown_toronto_merged['Cluster Labels'] == 1, downtown_toronto_merged.columns[[2] + list(range(5, downtown_toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Rosedale,1,Park,Trail,Playground,Cupcake Shop,Donut Shop,Doner Restaurant,Dog Run,Distribution Center,Discount Store,Diner


In [242]:
# Cluster 3
downtown_toronto_merged.loc[downtown_toronto_merged['Cluster Labels'] == 2, downtown_toronto_merged.columns[[2] + list(range(5, downtown_toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,Christie,2,Grocery Store,Café,Park,Coffee Shop,Baby Store,Restaurant,Candy Store,Italian Restaurant,Athletics & Sports,Nightclub


In [244]:
# Cluster 4
downtown_toronto_merged.loc[downtown_toronto_merged['Cluster Labels'] == 3, downtown_toronto_merged.columns[[2] + list(range(5, downtown_toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,"CN Tower, King and Spadina, Railway Lands, Har...",3,Airport Service,Airport Terminal,Plane,Boutique,Rental Car Location,Sculpture Garden,Harbor / Marina,Airport Lounge,Airport Gate,Airport Food Court
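
In the tables above, one cluster holds most neighborhoods while each of the others contains a single neighborhood. A count per label makes that imbalance explicit. This sketch uses only the labels of the rows displayed above, so the numbers are illustrative and not the full district totals.

```python
import pandas as pd

# Labels of the Downtown Toronto rows displayed above:
# ten rows with label 0 and one row each for labels 1, 2 and 3
labels = pd.Series([0] * 10 + [1, 2, 3], name="Cluster Labels")

# Number of neighborhoods per cluster, ordered by label
sizes = labels.value_counts().sort_index()
print(sizes)

# Share of the displayed neighborhoods that fall in the largest cluster
largest_share = sizes.max() / sizes.sum()
print(round(largest_share, 2))
```

On the real data the same count would be `downtown_toronto_merged['Cluster Labels'].value_counts()`.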


### East York analysis

In [246]:
east_york_venues = getVenues(map_east_york_data, 'East York')
east_york_onehot = getDistrictOnehot(east_york_venues)
east_york_grouped = getDistrictGrouped(east_york_onehot)
east_york_sorted = displayTopVenues(east_york_grouped)
east_york_merged = clusterNeighborhood(east_york_grouped, east_york_sorted, map_east_york_data)
east_york_merged['Cluster Labels'] = east_york_merged['Cluster Labels'].fillna(0).astype(int)
createMap(east_york_merged)

Parkview Hill, Woodbine Gardens
Woodbine Heights
Leaside
Thorncliffe Park
East Toronto, Broadview North (Old East York)
(72, 7)
District name:  East York  There are 44 uniques categories.


### Examine Clusters East York

In [247]:
# Cluster 1
east_york_merged.loc[east_york_merged['Cluster Labels'] == 0, east_york_merged.columns[[2] + list(range(5, east_york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Woodbine Heights,0,Athletics & Sports,Dance Studio,Beer Store,Skating Rink,Intersection,Park,Curling Ice,Flea Market,Fish & Chips Shop,Fast Food Restaurant


In [248]:
# Cluster 2
east_york_merged.loc[east_york_merged['Cluster Labels'] == 1, east_york_merged.columns[[2] + list(range(5, east_york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Thorncliffe Park,1,Indian Restaurant,Yoga Studio,Park,Bank,Burger Joint,Coffee Shop,Discount Store,Fast Food Restaurant,Gas Station,Warehouse Store


In [249]:
# Cluster 3
east_york_merged.loc[east_york_merged['Cluster Labels'] == 2, east_york_merged.columns[[2] + list(range(5, east_york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,"East Toronto, Broadview North (Old East York)",2,Park,Convenience Store,Yoga Studio,Dance Studio,Furniture / Home Store,Flea Market,Fish & Chips Shop,Fast Food Restaurant,Discount Store,Dessert Shop


In [250]:
# Cluster 4
east_york_merged.loc[east_york_merged['Cluster Labels'] == 3, east_york_merged.columns[[2] + list(range(5, east_york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Leaside,3,Furniture / Home Store,Sporting Goods Shop,Coffee Shop,Burger Joint,Bank,Grocery Store,Pet Store,Dessert Shop,Department Store,Brewery


### York analysis

In [251]:
york_venues = getVenues(map_york_data, 'York')
york_onehot = getDistrictOnehot(york_venues)
york_grouped = getDistrictGrouped(york_onehot)
york_sorted = displayTopVenues(york_grouped)
york_merged = clusterNeighborhood(york_grouped, york_sorted, map_york_data)
york_merged['Cluster Labels'] = york_merged['Cluster Labels'].fillna(0).astype(int)
createMap(york_merged)

Humewood-Cedarvale
Caledonia-Fairbanks
Del Ray, Mount Dennis, Keelsdale and Silverthorn
Runnymede, The Junction North
Weston
(22, 7)
District name:  York  There are 16 uniques categories.


### Examine Clusters York

In [252]:
# Cluster 1
york_merged.loc[york_merged['Cluster Labels'] == 0, york_merged.columns[[2] + list(range(5, york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Weston,0,Park,Convenience Store,Women's Store,Trail,Sandwich Place,Pool,Pizza Place,Hockey Arena,Fried Chicken Joint,Field


In [253]:
# Cluster 2
york_merged.loc[york_merged['Cluster Labels'] == 1, york_merged.columns[[2] + list(range(5, york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,"Runnymede, The Junction North",1,Pizza Place,Convenience Store,Caribbean Restaurant,Brewery,Breakfast Spot,Women's Store,Trail,Sandwich Place,Pool,Park


In [254]:
# Cluster 3
york_merged.loc[york_merged['Cluster Labels'] == 2, york_merged.columns[[2] + list(range(5, york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,"Del Ray, Mount Dennis, Keelsdale and Silverthorn",2,Fast Food Restaurant,Sandwich Place,Fried Chicken Joint,Discount Store,Women's Store,Trail,Pool,Pizza Place,Park,Hockey Arena


In [255]:
# Cluster 4
york_merged.loc[york_merged['Cluster Labels'] == 3, york_merged.columns[[2] + list(range(5, york_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Caledonia-Fairbanks,3,Park,Women's Store,Pool,Trail,Sandwich Place,Pizza Place,Hockey Arena,Fried Chicken Joint,Field,Fast Food Restaurant
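
For the question of where to open a pizza restaurant, it helps to read off the rank at which `Pizza Place` appears in each neighborhood. The sketch below does this for the four York neighborhoods shown above, using only the first eight ranks copied from those tables. The helper name `venue_rank` is hypothetical.

```python
import pandas as pd

# Top venue categories per York neighborhood, copied from the cluster tables above
top_venues = {
    "Weston": ["Park", "Convenience Store", "Women's Store", "Trail",
               "Sandwich Place", "Pool", "Pizza Place", "Hockey Arena"],
    "Runnymede, The Junction North": ["Pizza Place", "Convenience Store",
                                      "Caribbean Restaurant", "Brewery", "Breakfast Spot",
                                      "Women's Store", "Trail", "Sandwich Place"],
    "Del Ray, Mount Dennis, Keelsdale and Silverthorn": ["Fast Food Restaurant",
                                                         "Sandwich Place", "Fried Chicken Joint",
                                                         "Discount Store", "Women's Store",
                                                         "Trail", "Pool", "Pizza Place"],
    "Caledonia-Fairbanks": ["Park", "Women's Store", "Pool", "Trail",
                            "Sandwich Place", "Pizza Place", "Hockey Arena",
                            "Fried Chicken Joint"],
}

def venue_rank(ranked, category):
    """1-based position of category in a ranked list, or None if absent."""
    return ranked.index(category) + 1 if category in ranked else None

pizza_rank = pd.Series(
    {name: venue_rank(ranked, "Pizza Place") for name, ranked in top_venues.items()}
).sort_values()
print(pizza_rank)
```

In districts with few venues, the lower ranks may be ties at zero frequency, so a rank is best read alongside the underlying frequency table.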
