# Segmenting and Clustering Neighborhoods in Toronto

## Part 1

Let's start by importing everything we will need.

In [44]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 

We can use the read_html function from pandas to obtain the content of the wiki site.

In [3]:
wiki = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M', header=0)
wiki

[    Postal Code           Borough  \
 0           M1A      Not assigned   
 1           M2A      Not assigned   
 2           M3A        North York   
 3           M4A        North York   
 4           M5A  Downtown Toronto   
 ..          ...               ...   
 175         M5Z      Not assigned   
 176         M6Z      Not assigned   
 177         M7Z      Not assigned   
 178         M8Z         Etobicoke   
 179         M9Z      Not assigned   
 
                                          Neighbourhood  
 0                                         Not assigned  
 1                                         Not assigned  
 2                                            Parkwoods  
 3                                     Victoria Village  
 4                            Regent Park, Harbourfront  
 ..                                                 ...  
 175                                       Not assigned  
 176                                       Not assigned  
 177                

This returns a list of all tables scraped from the wiki page. We are only interested in the first table.

In [4]:
df=wiki[0]
df

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
...,...,...,...
175,M5Z,Not assigned,Not assigned
176,M6Z,Not assigned,Not assigned
177,M7Z,Not assigned,Not assigned
178,M8Z,Etobicoke,"Mimico NW, The Queensway West, South of Bloor,..."


Let's drop all rows whose Borough is "Not assigned". We check whether the Borough field contains "Not assigned", invert the resulting boolean Series with "~", and apply the resulting mask to our original dataframe.

In [5]:
df=df[~df['Borough'].str.contains('Not assigned')]
df.head(11)

Unnamed: 0,Postal Code,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
8,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
9,M1B,Scarborough,"Malvern, Rouge"
11,M3B,North York,Don Mills
12,M4B,East York,"Parkview Hill, Woodbine Gardens"
13,M5B,Downtown Toronto,"Garden District, Ryerson"
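
The same mask can also be built with an exact comparison instead of a substring search. The following is a minimal sketch on a made-up three-row dataframe (the name `toy` and its rows are illustrative only, not the scraped table); it suggests that `!=` yields the wanted mask directly, without "~".

```python
import pandas as pd

# Toy data for illustration only; the real table comes from read_html above.
toy = pd.DataFrame({
    'Postal Code': ['M1A', 'M3A', 'M4A'],
    'Borough': ['Not assigned', 'North York', 'North York'],
})

# Variant from the text: substring search, then invert the mask with ~.
kept_contains = toy[~toy['Borough'].str.contains('Not assigned')]

# Alternative: exact comparison, which is already the mask we want.
kept_equal = toy[toy['Borough'] != 'Not assigned']

print(kept_equal)
```

Both variants keep the same rows here; the exact comparison would only differ if a borough name merely contained the text "Not assigned".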


Let's check if we have any 'Not assigned' values in the Neighbourhood column.

In [6]:
df[df['Neighbourhood'].str.contains('Not assigned')]

Unnamed: 0,Postal Code,Borough,Neighbourhood


This looks good! Now we check the shape attribute to see how many rows and columns remain.

In [7]:
df.shape

(103, 3)

## End of Part 1, Beginning of Part 2

In [8]:
!pip install geocoder
import geocoder



The geocoder does not work for me, so I will use the CSV file with the coordinates instead.

In [9]:
!wget -O coordinates.csv http://cocl.us/Geospatial_data
df_coord = pd.read_csv('coordinates.csv')
df_coord

--2020-08-08 09:50:01--  http://cocl.us/Geospatial_data
Resolving cocl.us (cocl.us)... 169.55.161.7
Connecting to cocl.us (cocl.us)|169.55.161.7|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://cocl.us/Geospatial_data [following]
--2020-08-08 09:50:02--  https://cocl.us/Geospatial_data
Connecting to cocl.us (cocl.us)|169.55.161.7|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://ibm.box.com/shared/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv [following]
--2020-08-08 09:50:03--  https://ibm.box.com/shared/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv
Resolving ibm.box.com (ibm.box.com)... 185.235.236.197
Connecting to ibm.box.com (ibm.box.com)|185.235.236.197|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /public/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv [following]
--2020-08-08 09:50:03--  https://ibm.box.com/public/static/9afzr83pps4pwf2smjjcf1y5

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
...,...,...,...
98,M9N,43.706876,-79.518188
99,M9P,43.696319,-79.532242
100,M9R,43.688905,-79.554724
101,M9V,43.739416,-79.588437


Now we need to merge the two dataframes. We use the merge function so we can join them on the Postal Code column.

In [10]:
df2=pd.merge(df,df_coord, on='Postal Code')
df2

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
...,...,...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944
99,M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.636258,-79.498509
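
`pd.merge` performs an inner join by default, so a postal code that is missing from either dataframe would be dropped silently. As a sanity check, one could run a left join with `indicator=True` and look for rows that found no partner. The sketch below uses two small made-up frames (the code `X0X` is hypothetical) rather than the real data.

```python
import pandas as pd

# Made-up frames for illustration; 'X0X' is a hypothetical code with no coordinates.
left = pd.DataFrame({'Postal Code': ['M3A', 'M4A', 'X0X'],
                     'Borough': ['North York', 'North York', 'Example Borough']})
right = pd.DataFrame({'Postal Code': ['M3A', 'M4A'],
                      'Latitude': [43.753259, 43.725882],
                      'Longitude': [-79.329656, -79.315572]})

# how='left' keeps every row of the left frame; indicator=True adds a _merge column.
checked = pd.merge(left, right, on='Postal Code', how='left', indicator=True)

# Rows marked 'left_only' had no matching coordinates.
unmatched = checked.loc[checked['_merge'] == 'left_only', 'Postal Code'].tolist()
print(unmatched)
```

An empty list on the real data would indicate that every postal code received coordinates.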


## End of Part 2, Beginning of Part 3

Let's have a look at the neighbourhoods on a map. First we install and import folium. We also import requests, which we will need later for the API calls.

In [11]:
!pip install folium
import folium
import requests



Second, we plot the neighbourhoods on the map and give the different boroughs different colors.

In [52]:
#longitude and latitude of toronto
lat=43.66135
long=-79.383087

map_toronto = folium.Map(location=[lat, long], zoom_start=10,control_scale = True)

#array with multiple colors
palette = ['blue', 'cadetblue', 'darkblue', 'darkgreen', 'darkpurple', 'darkred', 'gray', 'green', 'lightblue', 'lightgray', 'lightgreen', 'lightred', 'orange', 'pink', 'purple', 'red', 'white','beige','black']

for i,bor in enumerate(pd.unique(df2['Borough'])):
    # exact comparison, so the borough name is not interpreted as a regular expression
    df3=df2[df2['Borough']==bor]
    # v_lat/v_lng are used so that the map centre stored in lat/long is not overwritten
    for v_lat, v_lng, neigh in zip(df3['Latitude'], df3['Longitude'], df3['Neighbourhood']):
        folium.CircleMarker(
            [v_lat, v_lng],
            radius=5,
            popup="<b>"+ bor + "</b>: " + neigh,
            color=palette[i],
            fill=True,
            fill_color=palette[i],
            fill_opacity=1).add_to(map_toronto)

map_toronto
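
The cell above picks a color by position in the palette, which only works while there are at least as many colors as boroughs. A minimal sketch of a more forgiving variant, assuming a hypothetical helper `assign_colors` that reuses colors once the palette runs out:

```python
from itertools import cycle

def assign_colors(groups, palette):
    """Map each distinct group name to a color, reusing colors if needed."""
    colors = {}
    color_iter = cycle(palette)  # endless iterator over the palette
    for group in groups:
        if group not in colors:
            colors[group] = next(color_iter)
    return colors

# Toy input: three distinct boroughs but only two colors.
boroughs = ['North York', 'Downtown Toronto', 'North York', 'Etobicoke']
color_of = assign_colors(boroughs, ['blue', 'red'])
print(color_of)
```

With such a dictionary, the marker color could be looked up by borough name rather than by loop index.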

We would like to get the venues of every neighbourhood. We do this in four steps; later we will put the last three steps in a loop.

**First step:**
Set the basic parameters for Foursquare:

I use the getpass library to keep my credentials secret.

In [108]:
import getpass
CLIENT_ID = getpass.getpass('ID:')
CLIENT_SECRET = getpass.getpass('SECRET:')
VERSION = '20180605' # Foursquare API version

ID:········
SECRET:········
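
getpass needs someone at the keyboard. If the notebook should also run unattended, one option is to read the credentials from environment variables. This is only a sketch: the helper `read_credential` and the variable name `FOURSQUARE_CLIENT_ID` are assumptions made for illustration, not part of the Foursquare API.

```python
import os

def read_credential(name, environ=os.environ):
    """Return the value of an environment variable, or fail with a clear message."""
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

# Demonstration with a plain dict standing in for the real environment.
fake_env = {'FOURSQUARE_CLIENT_ID': 'demo-id'}
print(read_credential('FOURSQUARE_CLIENT_ID', fake_env))
```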


**Second step:**
Get all venues of a neighbourhood. In this example it is the neighbourhood with index 3, which is Lawrence Manor, Lawrence Heights. Foursquare returns a maximum of 100 venues per request. Since the neighbourhoods in the city centre are closer together than those in the suburbs, we need an adaptive approach to get as many venues as possible: if a request returns fewer than 50 venues, we increase the radius and repeat the request. We start at a radius of 800 metres and increase it by 300 metres per attempt.

In [109]:
j=3
neighborhood_latitude = df2.loc[j, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = df2.loc[j, 'Longitude'] # neighborhood longitude value
name=df2.loc[j, 'Neighbourhood']
radius=800
LIMIT=100
nvenues=0
while nvenues<50:
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, neighborhood_latitude, neighborhood_longitude, VERSION, radius, LIMIT)
    results = requests.get(url).json()
    nvenues=len(results['response']['groups'][0]['items'])
    if (nvenues <50):
        radius=radius+300
        print(name+ ': '+str(nvenues) + ' venues found, radius increased to '+str(radius))
    else:
        print(name+ ': '+str(nvenues) + ' venues found')
        
items = results['response']['groups'][0]['items']
items

Lawrence Manor, Lawrence Heights: 35 venues found, radius increased to 1100
Lawrence Manor, Lawrence Heights: 55 venues found


[{'reasons': {'count': 0,
   'items': [{'summary': 'This spot is popular',
     'type': 'general',
     'reasonName': 'globalInteractionReason'}]},
  'venue': {'id': '4b16e8b6f964a52051bf23e3',
   'name': 'Roots',
   'location': {'address': '71-95 Orfus Road',
    'lat': 43.71821373389962,
    'lng': -79.46389305304558,
    'labeledLatLngs': [{'label': 'display',
      'lat': 43.71821373389962,
      'lng': -79.46389305304558}],
    'distance': 77,
    'postalCode': 'M6A 1L9',
    'cc': 'CA',
    'city': 'Toronto',
    'state': 'ON',
    'country': 'Canada',
    'formattedAddress': ['71-95 Orfus Road', 'Toronto ON M6A 1L9', 'Canada']},
   'categories': [{'id': '4bf58dd8d48988d104951735',
     'name': 'Boutique',
     'pluralName': 'Boutiques',
     'shortName': 'Boutique',
     'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/apparel_boutique_',
      'suffix': '.png'},
     'primary': True}],
   'photos': {'count': 0, 'groups': []}},
  'referralId': 'e-0-4b16e8b6f964a5
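
The while loop above keeps widening the radius until 50 venues are found, so it would never finish for a location with too few venues nearby. A minimal sketch of a bounded variant follows. The function `adaptive_search`, its `max_radius` parameter and the stand-in `fake_fetch` are assumptions for illustration; in the notebook, the fetch step would be the `requests.get` call.

```python
def adaptive_search(fetch, start_radius=800, step=300, min_venues=50, max_radius=5000):
    """Widen the radius until fetch(radius) returns enough venues or max_radius is reached."""
    radius = start_radius
    while True:
        items = fetch(radius)
        # Stop when we have enough venues, or when another step would exceed the cap.
        if len(items) >= min_venues or radius + step > max_radius:
            return items, radius
        radius += step

# Stand-in for the API call: pretends the venue count grows with the radius.
def fake_fetch(radius):
    return ['venue'] * (radius // 20)

items, radius = adaptive_search(fake_fetch)
print(len(items), radius)
```

The cap trades completeness for a guaranteed end of the loop and a predictable number of requests per neighbourhood.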

**Third step:** We convert the JSON response to a pandas DataFrame. In preparation for the next step, we also convert the venue.location.formattedAddress field from a list to a string.

In [16]:
df_venue= pd.json_normalize(items)
df_venue['venue.location.formattedAddress']=df_venue['venue.location.formattedAddress'].apply(', '.join)
df_venue

Unnamed: 0,referralId,reasons.count,reasons.items,venue.id,venue.name,venue.location.address,venue.location.lat,venue.location.lng,venue.location.labeledLatLngs,venue.location.distance,...,venue.location.city,venue.location.state,venue.location.country,venue.location.formattedAddress,venue.categories,venue.photos.count,venue.photos.groups,venue.venuePage.id,venue.location.crossStreet,venue.location.neighborhood
0,e-0-4b16e8b6f964a52051bf23e3-0,0,"[{'summary': 'This spot is popular', 'type': '...",4b16e8b6f964a52051bf23e3,Roots,71-95 Orfus Road,43.718214,-79.463893,"[{'label': 'display', 'lat': 43.71821373389962...",77,...,Toronto,ON,Canada,"71-95 Orfus Road, Toronto ON M6A 1L9, Canada","[{'id': '4bf58dd8d48988d104951735', 'name': 'B...",0,[],,,
1,e-0-4ccc5aebee23a14370591ea8-1,0,"[{'summary': 'This spot is popular', 'type': '...",4ccc5aebee23a14370591ea8,Lac Vien Vietnamese Restaurant,141 Cartwright Ave,43.721259,-79.468472,"[{'label': 'display', 'lat': 43.72125878799614...",426,...,Toronto,ON,Canada,"141 Cartwright Ave, Toronto ON, Canada","[{'id': '4bf58dd8d48988d14a941735', 'name': 'V...",0,[],137238575.0,,
2,e-0-4b12e300f964a520299023e3-2,0,"[{'summary': 'This spot is popular', 'type': '...",4b12e300f964a520299023e3,Kitchen Stuff Plus (Clearance Outlet),76 Orfus Rd,43.719096,-79.462675,"[{'label': 'display', 'lat': 43.71909636823933...",179,...,Toronto,ON,Canada,"76 Orfus Rd, Toronto ON, Canada","[{'id': '4bf58dd8d48988d1f8941735', 'name': 'F...",0,[],,,
3,e-0-54c3d9ed498e50c810920ee4-3,0,"[{'summary': 'This spot is popular', 'type': '...",54c3d9ed498e50c810920ee4,BATLgrounds,"153 Bridgeland Ave, Units 15-16",43.724054,-79.463398,"[{'label': 'display', 'lat': 43.72405414025907...",625,...,Toronto,ON,Canada,"153 Bridgeland Ave, Units 15-16 (Dufferin St),...","[{'id': '4f4528bc4b90abdf24c9de85', 'name': 'A...",0,[],,Dufferin St,Downsview
4,e-0-4b5f1f11f964a52047a729e3-4,0,"[{'summary': 'This spot is popular', 'type': '...",4b5f1f11f964a52047a729e3,Yuki Japanese Restaurant,3259 Dufferin St.,43.72061,-79.456119,"[{'label': 'display', 'lat': 43.72060971477955...",733,...,Toronto,ON,Canada,"3259 Dufferin St., Toronto ON M6A 2T2, Canada","[{'id': '4bf58dd8d48988d1d2941735', 'name': 'S...",0,[],,,
5,e-0-55fc4738498ea821b8d881ae-5,0,"[{'summary': 'This spot is popular', 'type': '...",55fc4738498ea821b8d881ae,The Burger's Priest,3280 Dufferin St,43.720789,-79.456766,"[{'label': 'display', 'lat': 43.7207893797841,...",691,...,Toronto,ON,Canada,"3280 Dufferin St, Toronto ON M6A 2T5, Canada","[{'id': '4bf58dd8d48988d16e941735', 'name': 'F...",0,[],,,
6,e-0-52ae475e498e1d931502955d-6,0,"[{'summary': 'This spot is popular', 'type': '...",52ae475e498e1d931502955d,Krystos,22-3200 Dufferin Street,43.718516,-79.455855,"[{'label': 'display', 'lat': 43.71851626753137...",716,...,Toronto,ON,Canada,"22-3200 Dufferin Street, Toronto ON, Canada","[{'id': '4bf58dd8d48988d10e941735', 'name': 'G...",0,[],,,
7,e-0-5ae3d754d3cce8002c86cbe1-7,0,"[{'summary': 'This spot is popular', 'type': '...",5ae3d754d3cce8002c86cbe1,RH Courtyard Café,3401 Dufferin Street,43.724874,-79.455536,"[{'label': 'display', 'lat': 43.7248736, 'lng'...",1025,...,Toronto,ON,Canada,"3401 Dufferin Street, Toronto ON M6A 2T9, Canada","[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",0,[],,,
8,e-0-4b0ecb01f964a5201b5b23e3-8,0,"[{'summary': 'This spot is popular', 'type': '...",4b0ecb01f964a5201b5b23e3,Harvey's,3120 Dufferin St,43.715337,-79.455396,"[{'label': 'display', 'lat': 43.71533697302975...",832,...,North York,ON,Canada,"3120 Dufferin St, North York ON M6A 2S6, Canada","[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",0,[],,,
9,e-0-4b9580fef964a52044a634e3-9,0,"[{'summary': 'This spot is popular', 'type': '...",4b9580fef964a52044a634e3,Mary Brown's Famous Chicken,3199 Dufferin St.,43.718309,-79.455706,"[{'label': 'display', 'lat': 43.71830860417442...",729,...,Toronto,ON,Canada,"3199 Dufferin St. (at Samor Rd.), Toronto ON, ...","[{'id': '4d4ae6fc7a7b7dea34424761', 'name': 'F...",0,[],,at Samor Rd.,
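
To see what the third step does in isolation, here is a small sketch with two hand-written items shaped like the response above (the second venue, "Example Cafe", is made up). `pd.json_normalize` turns nested keys into dotted column names, and `', '.join` collapses each address list into one string.

```python
import pandas as pd

# Two items shaped like the Foursquare response; 'Example Cafe' is invented.
items = [
    {'venue': {'name': 'Roots',
               'location': {'formattedAddress': ['71-95 Orfus Road', 'Toronto ON M6A 1L9', 'Canada']}}},
    {'venue': {'name': 'Example Cafe',
               'location': {'formattedAddress': ['1 Example St', 'Toronto ON', 'Canada']}}},
]

# Nested dictionaries become columns such as 'venue.location.formattedAddress'.
flat = pd.json_normalize(items)

# Join each list of address parts into a single searchable string.
col = 'venue.location.formattedAddress'
flat[col] = flat[col].apply(', '.join)
print(flat[['venue.name', col]])
```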


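**Fourth step:** We drop every venue whose formatted address does not contain the postal code of the current request. Venues whose address contains no postal code at all, such as row 1 above, are dropped as well.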

In [17]:
df_venue=df_venue[df_venue['venue.location.formattedAddress'].str.contains(df2.loc[j,'Postal Code'])]
df_venue

Unnamed: 0,referralId,reasons.count,reasons.items,venue.id,venue.name,venue.location.address,venue.location.lat,venue.location.lng,venue.location.labeledLatLngs,venue.location.distance,...,venue.location.city,venue.location.state,venue.location.country,venue.location.formattedAddress,venue.categories,venue.photos.count,venue.photos.groups,venue.venuePage.id,venue.location.crossStreet,venue.location.neighborhood
0,e-0-4b16e8b6f964a52051bf23e3-0,0,"[{'summary': 'This spot is popular', 'type': '...",4b16e8b6f964a52051bf23e3,Roots,71-95 Orfus Road,43.718214,-79.463893,"[{'label': 'display', 'lat': 43.71821373389962...",77,...,Toronto,ON,Canada,"71-95 Orfus Road, Toronto ON M6A 1L9, Canada","[{'id': '4bf58dd8d48988d104951735', 'name': 'B...",0,[],,,
3,e-0-54c3d9ed498e50c810920ee4-3,0,"[{'summary': 'This spot is popular', 'type': '...",54c3d9ed498e50c810920ee4,BATLgrounds,"153 Bridgeland Ave, Units 15-16",43.724054,-79.463398,"[{'label': 'display', 'lat': 43.72405414025907...",625,...,Toronto,ON,Canada,"153 Bridgeland Ave, Units 15-16 (Dufferin St),...","[{'id': '4f4528bc4b90abdf24c9de85', 'name': 'A...",0,[],,Dufferin St,Downsview
4,e-0-4b5f1f11f964a52047a729e3-4,0,"[{'summary': 'This spot is popular', 'type': '...",4b5f1f11f964a52047a729e3,Yuki Japanese Restaurant,3259 Dufferin St.,43.72061,-79.456119,"[{'label': 'display', 'lat': 43.72060971477955...",733,...,Toronto,ON,Canada,"3259 Dufferin St., Toronto ON M6A 2T2, Canada","[{'id': '4bf58dd8d48988d1d2941735', 'name': 'S...",0,[],,,
5,e-0-55fc4738498ea821b8d881ae-5,0,"[{'summary': 'This spot is popular', 'type': '...",55fc4738498ea821b8d881ae,The Burger's Priest,3280 Dufferin St,43.720789,-79.456766,"[{'label': 'display', 'lat': 43.7207893797841,...",691,...,Toronto,ON,Canada,"3280 Dufferin St, Toronto ON M6A 2T5, Canada","[{'id': '4bf58dd8d48988d16e941735', 'name': 'F...",0,[],,,
7,e-0-5ae3d754d3cce8002c86cbe1-7,0,"[{'summary': 'This spot is popular', 'type': '...",5ae3d754d3cce8002c86cbe1,RH Courtyard Café,3401 Dufferin Street,43.724874,-79.455536,"[{'label': 'display', 'lat': 43.7248736, 'lng'...",1025,...,Toronto,ON,Canada,"3401 Dufferin Street, Toronto ON M6A 2T9, Canada","[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",0,[],,,
8,e-0-4b0ecb01f964a5201b5b23e3-8,0,"[{'summary': 'This spot is popular', 'type': '...",4b0ecb01f964a5201b5b23e3,Harvey's,3120 Dufferin St,43.715337,-79.455396,"[{'label': 'display', 'lat': 43.71533697302975...",832,...,North York,ON,Canada,"3120 Dufferin St, North York ON M6A 2S6, Canada","[{'id': '4bf58dd8d48988d1c4941735', 'name': 'R...",0,[],,,
10,e-0-4db7044fa86e8d2707b7795e-10,0,"[{'summary': 'This spot is popular', 'type': '...",4db7044fa86e8d2707b7795e,Popeyes Louisiana Kitchen,3317 Dufferin St.,43.722477,-79.456472,"[{'label': 'display', 'lat': 43.72247710118178...",799,...,Toronto,ON,Canada,"3317 Dufferin St. (at Glen Belle Cr.), Toronto...","[{'id': '4d4ae6fc7a7b7dea34424761', 'name': 'F...",0,[],,at Glen Belle Cr.,
13,e-0-4b9e6ed5f964a52048e336e3-13,0,"[{'summary': 'This spot is popular', 'type': '...",4b9e6ed5f964a52048e336e3,Shoppers Drug Mart,3401 Dufferin St,43.724775,-79.45538,"[{'label': 'display', 'lat': 43.724775, 'lng':...",1027,...,Toronto,ON,Canada,3401 Dufferin St (in Yorkdale Shopping Centre)...,"[{'id': '4bf58dd8d48988d10f951735', 'name': 'P...",0,[],,in Yorkdale Shopping Centre,
14,e-0-4b3a4021f964a520f76225e3-14,0,"[{'summary': 'This spot is popular', 'type': '...",4b3a4021f964a520f76225e3,Restoration Hardware,3401 Dufferin Street,43.724942,-79.455627,"[{'label': 'display', 'lat': 43.7249417, 'lng'...",1025,...,Toronto,ON,Canada,3401 Dufferin Street (btw Roselawn & Montgomer...,"[{'id': '4bf58dd8d48988d1f8941735', 'name': 'F...",0,[],,btw Roselawn & Montgomery,
15,e-0-507d95038055b398c78eec04-15,0,"[{'summary': 'This spot is popular', 'type': '...",507d95038055b398c78eec04,JOEY,305B - 3401 Dufferin Street,43.724131,-79.454042,"[{'label': 'display', 'lat': 43.72413142548127...",1065,...,Toronto,ON,Canada,"305B - 3401 Dufferin Street, Toronto ON M6A 3...","[{'id': '4bf58dd8d48988d14e941735', 'name': 'A...",0,[],,,
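
The filter above is a plain substring search for the three-character prefix. A stricter variant, sketched here with the standard `re` module, only accepts the prefix when it starts a full postal code. The helper name `has_fsa` is an assumption for illustration; the addresses are taken from the table above.

```python
import re

def has_fsa(address, fsa):
    """True if the address contains a full postal code starting with the given prefix."""
    # The prefix must be followed by digit-letter-digit, e.g. 'M6A 1L9'.
    pattern = r'\b' + re.escape(fsa) + r'\s?\d[A-Z]\d\b'
    return re.search(pattern, address) is not None

addresses = [
    '71-95 Orfus Road, Toronto ON M6A 1L9, Canada',
    '141 Cartwright Ave, Toronto ON, Canada',          # no postal code at all
    '3120 Dufferin St, North York ON M6A 2S6, Canada',
]
print([has_fsa(a, 'M6A') for a in addresses])
```

Either way, venues without a postal code in their address cannot be assigned and are lost at this step.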


Let's put steps 2 to 4 in a loop and use it to get the venues for all the neighbourhoods. Note that the radius is increased by 500 per attempt here.

In [31]:
venue_frames=[] # one dataframe per neighbourhood, concatenated after the loop
for j in range(len(df2)):
    neighborhood_latitude = df2.loc[j, 'Latitude'] # neighborhood latitude value
    neighborhood_longitude = df2.loc[j, 'Longitude'] # neighborhood longitude value
    name=df2.loc[j, 'Neighbourhood']
    radius=800
    LIMIT=100
    nvenues=0
    while nvenues<50:
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, neighborhood_latitude, neighborhood_longitude, VERSION, radius, LIMIT)
        results = requests.get(url).json()
        nvenues=len(results['response']['groups'][0]['items'])
        if (nvenues <50):
            radius=radius+500
            print(name+ ': '+str(nvenues) + ' venues found, radius increased to '+str(radius))
        else:
            print(name+ ': '+str(nvenues) + ' venues found')
    items = results['response']['groups'][0]['items']
    df_venue= pd.json_normalize(items)
    df_venue['venue.location.formattedAddress']=df_venue['venue.location.formattedAddress'].apply(', '.join)
    # keep only venues whose address contains the postal code of this neighbourhood
    df_venue=df_venue[df_venue['venue.location.formattedAddress'].str.contains(df2.loc[j,'Postal Code'])]
    venue_frames.append(df_venue)
# DataFrame.append was removed in pandas 2.0, so we concatenate once at the end
df_venues_neigh=pd.concat(venue_frames)
df_venues_neigh

Parkwoods: 5 venues found, radius increased to 1300
Parkwoods: 35 venues found, radius increased to 1800
Parkwoods: 96 venues found
Victoria Village: 9 venues found, radius increased to 1300
Victoria Village: 37 venues found, radius increased to 1800
Victoria Village: 71 venues found
Regent Park, Harbourfront: 82 venues found
Lawrence Manor, Lawrence Heights: 35 venues found, radius increased to 1300
Lawrence Manor, Lawrence Heights: 94 venues found
Queen's Park, Ontario Provincial Government: 100 venues found
Islington Avenue, Humber Valley Village: 10 venues found, radius increased to 1300
Islington Avenue, Humber Valley Village: 22 venues found, radius increased to 1800
Islington Avenue, Humber Valley Village: 41 venues found, radius increased to 2300
Islington Avenue, Humber Valley Village: 55 venues found
Malvern, Rouge: 12 venues found, radius increased to 1300
Malvern, Rouge: 26 venues found, radius increased to 1800
Malvern, Rouge: 32 venues found, radius increased to 2300
Malv

Downsview: 4 venues found, radius increased to 1300
Downsview: 24 venues found, radius increased to 1800
Downsview: 50 venues found
Studio District: 100 venues found
Bedford Park, Lawrence Manor East: 42 venues found, radius increased to 1300
Bedford Park, Lawrence Manor East: 89 venues found
Del Ray, Mount Dennis, Keelsdale and Silverthorn: 11 venues found, radius increased to 1300
Del Ray, Mount Dennis, Keelsdale and Silverthorn: 42 venues found, radius increased to 1800
Del Ray, Mount Dennis, Keelsdale and Silverthorn: 45 venues found, radius increased to 2300
Del Ray, Mount Dennis, Keelsdale and Silverthorn: 100 venues found
Humberlea, Emery: 5 venues found, radius increased to 1300
Humberlea, Emery: 13 venues found, radius increased to 1800
Humberlea, Emery: 41 venues found, radius increased to 2300
Humberlea, Emery: 57 venues found
Birch Cliff, Cliffside West: 7 venues found, radius increased to 1300
Birch Cliff, Cliffside West: 13 venues found, radius increased to 1800
Birch Cli

Unnamed: 0,referralId,reasons.count,reasons.items,venue.id,venue.name,venue.location.address,venue.location.lat,venue.location.lng,venue.location.labeledLatLngs,venue.location.distance,...,venue.location.state,venue.location.country,venue.location.formattedAddress,venue.categories,venue.photos.count,venue.photos.groups,venue.location.crossStreet,venue.venuePage.id,venue.events.count,venue.events.summary
0,e-0-4b8991cbf964a520814232e3-0,0,"[{'summary': 'This spot is popular', 'type': '...",4b8991cbf964a520814232e3,Allwyn's Bakery,81 Underhill drive,43.759840,-79.324719,"[{'label': 'display', 'lat': 43.75984035203157...",833,...,ON,Canada,"81 Underhill drive, Toronto ON M3A 1Z5, Canada","[{'id': '4bf58dd8d48988d144941735', 'name': 'C...",0,[],,,,
1,e-0-4bd4846a6798ef3bd0c5618d-1,0,"[{'summary': 'This spot is popular', 'type': '...",4bd4846a6798ef3bd0c5618d,Donalda Golf & Country Club,12 Bushbury Dr,43.752816,-79.342741,"[{'label': 'display', 'lat': 43.75281596740471...",1053,...,ON,Canada,"12 Bushbury Dr, Don Mills ON M3A 2Z7, Canada","[{'id': '4bf58dd8d48988d1e6941735', 'name': 'G...",0,[],,,,
5,e-0-4b8ec91af964a520053733e3-5,0,"[{'summary': 'This spot is popular', 'type': '...",4b8ec91af964a520053733e3,Graydon Hall Manor,185 Graydon Hall Drive,43.763923,-79.342961,"[{'label': 'display', 'lat': 43.76392256055678...",1597,...,ON,Canada,"185 Graydon Hall Drive, Toronto ON M3A 3B4, Ca...","[{'id': '4bf58dd8d48988d171941735', 'name': 'E...",0,[],,52496423,,
6,e-0-57e286f2498e43d84d92d34a-6,0,"[{'summary': 'This spot is popular', 'type': '...",57e286f2498e43d84d92d34a,Tim Hortons,215 Brookbanks,43.760668,-79.326368,"[{'label': 'display', 'lat': 43.76066827030228...",866,...,ON,Canada,"215 Brookbanks (York Miils Rd), Toronto ON M3A...","[{'id': '4bf58dd8d48988d16d941735', 'name': 'C...",0,[],York Miils Rd,,,
15,e-0-58a8dcaa6119f47b9a94dc05-15,0,"[{'summary': 'This spot is popular', 'type': '...",58a8dcaa6119f47b9a94dc05,A&W,1277 York Mills Road,43.760643,-79.326865,"[{'label': 'display', 'lat': 43.76064307616131...",852,...,ON,Canada,"1277 York Mills Road, Toronto ON M3A 1Z5, Canada","[{'id': '4bf58dd8d48988d16e941735', 'name': 'F...",0,[],,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
68,e-0-4c4affcfc9e4ef3b5d3fed10-68,0,"[{'summary': 'This spot is popular', 'type': '...",4c4affcfc9e4ef3b5d3fed10,Goodwill,871 Islington Ave.,43.621602,-79.513198,"[{'label': 'display', 'lat': 43.62160166310763...",1022,...,ON,Canada,"871 Islington Ave. (at Queensway), Toronto ON ...","[{'id': '4bf58dd8d48988d101951735', 'name': 'T...",0,[],at Queensway,,,
72,e-0-4d0408907d9ba35d83856323-72,0,"[{'summary': 'This spot is popular', 'type': '...",4d0408907d9ba35d83856323,King's Garden Banquet Hall,15 Canmotor Avenue,43.622929,-79.510698,"[{'label': 'display', 'lat': 43.6229291, 'lng'...",1059,...,ON,Canada,"15 Canmotor Avenue (Queensquay), Etobicoke ON ...","[{'id': '4bf58dd8d48988d171941735', 'name': 'E...",0,[],Queensquay,514633977,,
75,e-0-4b0703f6f964a52080f522e3-75,0,"[{'summary': 'This spot is popular', 'type': '...",4b0703f6f964a52080f522e3,ShaSha Organic,20 Plastics Ave,43.623370,-79.509302,"[{'label': 'display', 'lat': 43.62337, 'lng': ...",1122,...,ON,Canada,"20 Plastics Ave (Queensway), Toronto ON M8Z 4B...","[{'id': '4bf58dd8d48988d1f5941735', 'name': 'G...",0,[],Queensway,,,
78,e-0-4be5bc1d2457a593d3eaab15-78,0,"[{'summary': 'This spot is popular', 'type': '...",4be5bc1d2457a593d3eaab15,Fogh Marine,901 Oxford St,43.619122,-79.514810,"[{'label': 'display', 'lat': 43.61912207338791...",1191,...,ON,Canada,"901 Oxford St (Islington Ave and Evans Ave), T...","[{'id': '4bf58dd8d48988d1f2941735', 'name': 'S...",0,[],Islington Ave and Evans Ave,141621939,,


We used a large share of the 950 requests we can send, so let's copy the data we obtained into another dataframe to work with, and save it to a CSV file so we don't have to fetch it again.

In [35]:
df_venues2=df_venues_neigh.reset_index()
df_venues2.to_csv('Venues.csv')
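
To pick the data up again in a later session, the file can be read back with `read_csv`. The sketch below uses a tiny made-up frame and a temporary directory so that it does not touch the real Venues.csv. Note that `to_csv` writes the index as the first column by default, which `index_col=0` restores; list-valued columns such as venue.categories would come back as plain strings.

```python
import os
import tempfile
import pandas as pd

# Tiny stand-in for df_venues2, for illustration only.
toy = pd.DataFrame({'venue.name': ['Roots', 'JOEY'],
                    'venue.location.lat': [43.718214, 43.724131]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'Venues.csv')
    toy.to_csv(path)                           # the index is written as the first column
    restored = pd.read_csv(path, index_col=0)  # read that column back as the index

print(restored)
```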

We would like to display all of the venues on the map with the MarkerCluster plugin of folium.

In [38]:
import folium.plugins as fplugins
lat=43.66135
long=-79.383087

map_toronto2 = folium.Map(location=[lat, long], zoom_start=10,control_scale = True)

# instantiate a marker cluster object
venues_cluster = fplugins.MarkerCluster().add_to(map_toronto2)

# loop through the dataframe and add each data point to the marker cluster
# (v_lat/v_lng are used so that the map centre stored in lat/long is not overwritten)
for v_lat, v_lng, label in zip(df_venues2['venue.location.lat'], df_venues2['venue.location.lng'],df_venues2['venue.name']):
    folium.Marker(
        location=[v_lat, v_lng],
        icon=None,
        popup=label,
    ).add_to(venues_cluster)

# display map
map_toronto2

As we can see, the largest number of venues seems to be found downtown. Let's use a more systematic approach to cluster the spatial data of the locations. We will use the density-based clustering algorithm DBSCAN from the sklearn library.

In [99]:
from sklearn.cluster import DBSCAN

ClusterData = df_venues2[['venue.location.lat','venue.location.lng']]
db = DBSCAN(eps=0.005, min_samples=20).fit(ClusterData)
labels = db.labels_
unique_labels = set(labels)
print(unique_labels)
df_venues2['cluster_label']=labels
df_venues2

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, -1}


Unnamed: 0,index,referralId,reasons.count,reasons.items,venue.id,venue.name,venue.location.address,venue.location.lat,venue.location.lng,venue.location.labeledLatLngs,...,venue.location.country,venue.location.formattedAddress,venue.categories,venue.photos.count,venue.photos.groups,venue.location.crossStreet,venue.venuePage.id,venue.events.count,venue.events.summary,cluster_label
0,0,e-0-4b8991cbf964a520814232e3-0,0,"[{'summary': 'This spot is popular', 'type': '...",4b8991cbf964a520814232e3,Allwyn's Bakery,81 Underhill drive,43.759840,-79.324719,"[{'label': 'display', 'lat': 43.75984035203157...",...,Canada,"81 Underhill drive, Toronto ON M3A 1Z5, Canada","[{'id': '4bf58dd8d48988d144941735', 'name': 'C...",0,[],,,,,-1
1,1,e-0-4bd4846a6798ef3bd0c5618d-1,0,"[{'summary': 'This spot is popular', 'type': '...",4bd4846a6798ef3bd0c5618d,Donalda Golf & Country Club,12 Bushbury Dr,43.752816,-79.342741,"[{'label': 'display', 'lat': 43.75281596740471...",...,Canada,"12 Bushbury Dr, Don Mills ON M3A 2Z7, Canada","[{'id': '4bf58dd8d48988d1e6941735', 'name': 'G...",0,[],,,,,-1
2,5,e-0-4b8ec91af964a520053733e3-5,0,"[{'summary': 'This spot is popular', 'type': '...",4b8ec91af964a520053733e3,Graydon Hall Manor,185 Graydon Hall Drive,43.763923,-79.342961,"[{'label': 'display', 'lat': 43.76392256055678...",...,Canada,"185 Graydon Hall Drive, Toronto ON M3A 3B4, Ca...","[{'id': '4bf58dd8d48988d171941735', 'name': 'E...",0,[],,52496423,,,-1
3,6,e-0-57e286f2498e43d84d92d34a-6,0,"[{'summary': 'This spot is popular', 'type': '...",57e286f2498e43d84d92d34a,Tim Hortons,215 Brookbanks,43.760668,-79.326368,"[{'label': 'display', 'lat': 43.76066827030228...",...,Canada,"215 Brookbanks (York Miils Rd), Toronto ON M3A...","[{'id': '4bf58dd8d48988d16d941735', 'name': 'C...",0,[],York Miils Rd,,,,-1
4,15,e-0-58a8dcaa6119f47b9a94dc05-15,0,"[{'summary': 'This spot is popular', 'type': '...",58a8dcaa6119f47b9a94dc05,A&W,1277 York Mills Road,43.760643,-79.326865,"[{'label': 'display', 'lat': 43.76064307616131...",...,Canada,"1277 York Mills Road, Toronto ON M3A 1Z5, Canada","[{'id': '4bf58dd8d48988d16e941735', 'name': 'F...",0,[],,,,,-1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3050,68,e-0-4c4affcfc9e4ef3b5d3fed10-68,0,"[{'summary': 'This spot is popular', 'type': '...",4c4affcfc9e4ef3b5d3fed10,Goodwill,871 Islington Ave.,43.621602,-79.513198,"[{'label': 'display', 'lat': 43.62160166310763...",...,Canada,"871 Islington Ave. (at Queensway), Toronto ON ...","[{'id': '4bf58dd8d48988d101951735', 'name': 'T...",0,[],at Queensway,,,,-1
3051,72,e-0-4d0408907d9ba35d83856323-72,0,"[{'summary': 'This spot is popular', 'type': '...",4d0408907d9ba35d83856323,King's Garden Banquet Hall,15 Canmotor Avenue,43.622929,-79.510698,"[{'label': 'display', 'lat': 43.6229291, 'lng'...",...,Canada,"15 Canmotor Avenue (Queensquay), Etobicoke ON ...","[{'id': '4bf58dd8d48988d171941735', 'name': 'E...",0,[],Queensquay,514633977,,,-1
3052,75,e-0-4b0703f6f964a52080f522e3-75,0,"[{'summary': 'This spot is popular', 'type': '...",4b0703f6f964a52080f522e3,ShaSha Organic,20 Plastics Ave,43.623370,-79.509302,"[{'label': 'display', 'lat': 43.62337, 'lng': ...",...,Canada,"20 Plastics Ave (Queensway), Toronto ON M8Z 4B...","[{'id': '4bf58dd8d48988d1f5941735', 'name': 'G...",0,[],Queensway,,,,-1
3053,78,e-0-4be5bc1d2457a593d3eaab15-78,0,"[{'summary': 'This spot is popular', 'type': '...",4be5bc1d2457a593d3eaab15,Fogh Marine,901 Oxford St,43.619122,-79.514810,"[{'label': 'display', 'lat': 43.61912207338791...",...,Canada,"901 Oxford St (Islington Ave and Evans Ave), T...","[{'id': '4bf58dd8d48988d1f2941735', 'name': 'S...",0,[],Islington Ave and Evans Ave,141621939,,,-1
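
Because DBSCAN is run on raw latitude and longitude values, eps=0.005 is a distance in degrees, and a degree of longitude is shorter than a degree of latitude this far north. The following sketch estimates what 0.005 degrees corresponds to in metres near the map centre, assuming a spherical Earth with a mean radius of 6,371 km; the helper name `degrees_to_metres` is made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, a common approximation

def degrees_to_metres(delta_deg, latitude_deg):
    """Return the (north-south, east-west) distance in metres spanned by delta_deg."""
    metres_per_degree = math.pi * EARTH_RADIUS_M / 180
    north_south = delta_deg * metres_per_degree
    # Circles of latitude shrink with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

ns, ew = degrees_to_metres(0.005, 43.66135)
print(round(ns), round(ew))
```

Under these assumptions the neighbourhood radius is roughly 550 m north-south but only about 400 m east-west, so the clusters are slightly biased along the north-south axis.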


Let's have a look at our spatial venue clusters.

In [104]:
# 19 base colors, repeated so that every cluster label gets a color
palette = ['blue', 'cadetblue', 'darkblue', 'darkgreen', 'darkpurple', 'darkred', 'gray', 'green', 'lightblue', 'lightgray', 'lightgreen', 'lightred', 'orange', 'pink', 'purple', 'red', 'white','beige','black'] * 3

# centre the map on Toronto explicitly instead of relying on lat/long from earlier cells
map_toronto3 = folium.Map(location=[43.66135, -79.383087], zoom_start=11,control_scale = True)
# plot every cluster except the noise points, which DBSCAN labels -1
for i in [l for l in sorted(unique_labels) if l != -1]:
    df_venues3=df_venues2[df_venues2['cluster_label']==i]
    for v_lat, v_lng, name in zip(df_venues3['venue.location.lat'], df_venues3['venue.location.lng'],df_venues3['venue.name']):
        folium.CircleMarker(
            [v_lat, v_lng],
            radius=5,
            popup=name,
            color=palette[i],
            fill=True,
            fill_color=palette[i],
            fill_opacity=1).add_to(map_toronto3)

map_toronto3
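
Besides looking at the map, the cluster sizes can be compared directly. A minimal sketch with the standard library, using a short made-up label list in place of `db.labels_` (the helper `cluster_sizes` is hypothetical):

```python
from collections import Counter

NOISE = -1  # label DBSCAN gives to points that belong to no cluster

def cluster_sizes(labels):
    """Count points per cluster, ignoring noise, largest cluster first."""
    counts = Counter(label for label in labels if label != NOISE)
    return counts.most_common()

# Toy labels for illustration; in the notebook these would come from db.labels_.
toy_labels = [0, 0, 1, -1, 0, 1, -1, 2]
print(cluster_sizes(toy_labels))
```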


As expected, the largest venue cluster is in Downtown Toronto. Other clusters include main streets, e.g. Avenue Road (white markers) and St. Clair Avenue West (gray markers); malls, e.g. Bayview Village (orange markers) and Yorkdale Shopping Centre (cadetblue markers); and central places in the suburbs, e.g. the North York Centre (black markers).