# The Battle of the Neighborhoods
#### Author: Emile Strasheim
#### Date: 2021-08-10

## Introduction

This notebook contains the code used to pursue the objectives defined for The Battle of the Neighborhoods project. A report and a presentation were also created; they contain more information on each phase of the project and the methodology followed.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1.  [**Packages and functions**](#item1)
    
    1.1 [Import packages required](#item1_1)
    
    1.2 [Create functions required](#item1_2)

2.  [**Download and prepare neighborhoods data - Toronto, Canada**](#item2)
    
    2.1 [Get webpage information and create soup](#item2_1)
    
    2.2 [Extract table information and create dataframe](#item2_2)
    
    2.3 [Remove rows with unassigned neighborhoods](#item2_3)
    
3.  [**Download and prepare neighborhoods data - Amsterdam, Netherlands**](#item3)
    
    3.1 [Get webpage information and create soup](#item3_1)
    
    3.2 [Extract table information and create dataframe](#item3_2)
    
    3.3 [Geocode neighborhoods to get longitudinal and latitudinal coordinates](#item3_3)

4.  [**Use Foursquare API to retrieve venues data for neighborhoods**](#item4)
    
    4.1 [Setup Foursquare API credentials](#item4_1)
    
    4.2 [Retrieve Toronto venue data from Foursquare API](#item4_2)
    
    4.3 [Retrieve Amsterdam venue data from Foursquare API](#item4_3)
    
    4.4 [One hot encode Toronto and Amsterdam venues by category per neighborhood](#item4_4)
    
    4.5 [Summarise Toronto and Amsterdam venues by category per neighborhood](#item4_5)
    
    4.6 [View top 5 venues per neighborhood](#item4_6)
    
    4.7 [Create dataframe containing top 10 venues per neighborhood](#item4_7)
    
5.  [**Cluster and classify neighborhoods**](#item5)
    
    5.1 [Build k-Means clustering model using the Toronto neighborhoods](#item5_1)
    
    5.2 [Classify Amsterdam neighborhoods using Toronto neighborhood k-Means clustering model](#item5_2)
    
    5.3 [Examine the clusters](#item5_3)
    
    5.4 [Cluster profiles](#item5_4)

6.  [**Analyse neighborhood clusters on geographical maps**](#item6)
    
    6.1 [Set parameters](#item6_1)
    
    6.2 [Toronto neighborhoods](#item6_2)
    
    6.3 [Amsterdam neighborhoods](#item6_3)
    
    </font>
    </div>
    



## 1. Packages and functions <div id="item1"/>

#### 1.1 Import packages required <div id="item1_1"/>

In [1]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
import json # library to handle JSON files
import time # used to pause before retrying a failed geocoding request

from geopy.geocoders import Nominatim # library to convert an address into latitude and longitude values
import urllib.error # HTTP error types raised by the geocoding requests
import urllib.parse # URL-encoding of addresses
import urllib.request # HTTP requests to the geocoding service

import requests # library to handle requests
from pandas import json_normalize # function to transform JSON data into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

from bs4 import BeautifulSoup # library to scrape webpages

print('Libraries imported.')

Libraries imported.


#### 1.2 Create functions required <div id="item1_2"/>

Function to geocode addresses and retrieve latitude and longitude coordinates

In [2]:
def Bing(df, country, geo_col, Atype=None):
    '''
    This function geocodes addresses using the Bing Maps API. The dataframe that is passed must already contain the columns that will be filled: ['Latitude', 'Longitude', 'Long_Lat'], or ['Lat_<Atype>', 'Long_<Atype>'] when Atype is given.

    Parameters
    ----------
    df : Dataframe
        The dataframe with the addresses to be geocoded. It is modified in place.
    country : String
        The country of the addresses in the dataset.
    geo_col : String
        The name of the column with the address to be geocoded.
    Atype : String, optional
        String that indicates address type. When given, the coordinates are written to the 'Lat_<Atype>' and 'Long_<Atype>' columns.

    Returns
    -------
    String
        'Done!' once all rows have been processed.
    '''
    print(1)
    for i in range(df.shape[0]):
        #print(i)
        temp=str(df[geo_col].iloc[i]) 
        temp = urllib.parse.quote(temp)+'%20' + urllib.parse.quote(country)        
        APIkey = 'AjWZ6QI7SQMPESS8LiB1pVE7CqCbJ7a8ho9mGIneEjaE6wvXN3-bMha9k6d0jqm9'
        url = 'http://dev.virtualearth.net/REST/v1/Locations?query=' + temp + '&includeNeighborhood=1&maxResults=10&key=' + APIkey
        try:
            try:
                request = urllib.request.Request(url)
                response = urllib.request.urlopen(request)
                #print(response)
            except urllib.error.HTTPError as e:
                print('Received error ' + str(e.reason) +'. Will try again in 10 seconds')
                time.sleep(10)
                request = urllib.request.Request(url)
                response = urllib.request.urlopen(request)
        except urllib.error.HTTPError:
            # both attempts failed: skip this row rather than reuse a stale or undefined response
            continue
        response_string =response.read().decode('utf-8')
        #df['response'].iloc[i]=response_string
        if Atype:
            try:
                lat = 'Lat_' + Atype
                long = 'Long_' + Atype
                lat_long = response_string.split('Point')[1].split('coordinates":[')[1].split(']},"address"')[0]
                latitude = lat_long.split(',')[0]
                df[lat].iloc[i] = latitude
                #print(latitude)
                longitude = lat_long.split(',')[1]
                df[long].iloc[i] = longitude
            except Exception:
                pass
        else:
            try:
                lat_long = response_string.split('Point')[1].split('coordinates":[')[1].split(']},"address"')[0]
                latitude = lat_long.split(',')[0]
                df['Latitude'].iloc[i] = float(latitude)
                longitude = lat_long.split(',')[1]
                df['Longitude'].iloc[i] = float(longitude)
                df['Long_Lat'].iloc[i] = [float(latitude), float(longitude)]
            except Exception:
                pass
    print(i)
    return 'Done!'
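
The string splitting inside `Bing` depends on the exact character layout of the response text. As a minimal sketch, the same coordinates could be read by parsing the response as JSON, assuming the Bing Maps Locations layout of `resourceSets` → `resources` → `point` → `coordinates`. The helper name `extract_coordinates` and the trimmed sample payload are hypothetical and only serve as illustration.

```python
import json

def extract_coordinates(response_string):
    """Return (latitude, longitude) from a Bing Locations response, or None if absent."""
    payload = json.loads(response_string)
    try:
        # the first resource of the first resource set is taken as the best match
        latitude, longitude = payload['resourceSets'][0]['resources'][0]['point']['coordinates']
    except (KeyError, IndexError, TypeError, ValueError):
        return None
    return float(latitude), float(longitude)

# hypothetical, heavily trimmed response used for illustration only
sample = json.dumps({
    'resourceSets': [{
        'resources': [{
            'point': {'type': 'Point', 'coordinates': [43.755997, -79.329544]},
            'address': {'locality': 'Toronto'},
        }]
    }]
})

print(extract_coordinates(sample))
print(extract_coordinates('{"resourceSets": []}'))
```

Returning `None` for an empty result makes a failed lookup explicit, so the caller can decide whether to skip the row or retry.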

Function to retrieve venue data per neighborhood

In [3]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    print('Data extraction started...')
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(name, lat, lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 'Neighb. Latitude', 'Neighb. Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category']
    
    print('Data extraction done.')    
    return(nearby_venues)
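
`getNearbyVenues` indexes `categories[0]` directly, which would raise an error for a venue that has no category. The sketch below shows one possible way to flatten the `items` list defensively. The helper name `flatten_venue_items`, the `missing_category` placeholder and the sample items are assumptions; the field names mirror those used in the function above.

```python
def flatten_venue_items(name, lat, lng, items, missing_category='Unknown'):
    """Turn a list of Foursquare 'items' into rows with the columns used by getNearbyVenues."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # fall back to a placeholder label when the venue has no category
        category = categories[0]['name'] if categories else missing_category
        rows.append((name, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# hypothetical items shaped like the entries the function above iterates over
sample_items = [
    {'venue': {'name': 'Brookbanks Park',
               'location': {'lat': 43.751976, 'lng': -79.33214},
               'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'Uncategorised Spot',
               'location': {'lat': 43.75, 'lng': -79.33},
               'categories': []}},
]

rows = flatten_venue_items('Parkwoods', 43.755997, -79.329544, sample_items)
for row in rows:
    print(row)
```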

Function to sort venues per neighborhood by frequency of occurrence

In [4]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[2:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
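
A short usage sketch of `return_most_common_venues` on a single row. The function is repeated so the example runs on its own; the frequencies are made up, and the row follows the layout the function expects (two label columns followed by one column per venue category).

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the first two entries (City, Neighborhood) and rank the remaining categories
    row_categories = row.iloc[2:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# made-up frequencies for one neighborhood
row = pd.Series({
    'City': 'Toronto',
    'Neighborhood': 'Example',
    'Bakery': 0.1,
    'Coffee Shop': 0.5,
    'Park': 0.3,
    'Pub': 0.0,
})

top = return_most_common_venues(row, 3)
print(list(top))
```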

## 2. Download and prepare neighborhoods data - Toronto, Canada <div id='item2'/>

#### 2.1 Get webpage information and create soup <div id="item2_1"/>

In [5]:
url_t = 'https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=945633050'
page_t = requests.get(url_t)
soup_t = BeautifulSoup(page_t.text,'html.parser')

#### 2.2 Extract table information and create dataframe <div id="item2_2"/>

In [6]:
table_contents_t=[]
table_t=soup_t.find('table')

# loop through table rows to obtain values
for row in table_t.findAll('tr'):
    cell = {}
    try:
        c = row.findAll('td')
        cell['PostalCode'] = c[0].get_text()
        cell['Neighborhood'] = c[2].get_text().replace('\n', '')
        table_contents_t.append(cell)
    except IndexError:
        # header rows contain no <td> cells, so there is nothing to extract
        pass
    
# create a dataframe containing extracted text values
df_t=pd.DataFrame(table_contents_t)
print('Toronto neighborhoods (head): ')
display(df_t.head(10))
print('\n')
print('Shape of dataframe: {}'.format(df_t.shape))

Toronto neighborhoods (head): 


Unnamed: 0,PostalCode,Neighborhood
0,M1A,Not assigned
1,M2A,Not assigned
2,M3A,Parkwoods
3,M4A,Victoria Village
4,M5A,Harbourfront
5,M6A,Lawrence Heights
6,M6A,Lawrence Manor
7,M7A,Queen's Park
8,M8A,Not assigned
9,M9A,Islington Avenue




Shape of dataframe: (287, 2)
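
The row loop above can be tried on a small inline table without fetching the page. This is a sketch only: the HTML snippet is hypothetical and merely imitates a three-column layout (postal code, borough, neighborhood), and it skips header rows with an explicit length check instead of an exception handler.

```python
from bs4 import BeautifulSoup

# hypothetical snippet imitating the layout of the postal code table
html = """
<table>
  <tr><th>Postcode</th><th>Borough</th><th>Neighbourhood</th></tr>
  <tr><td>M3A</td><td>North York</td><td>Parkwoods
</td></tr>
  <tr><td>M4A</td><td>North York</td><td>Victoria Village
</td></tr>
</table>
"""

soup = BeautifulSoup(html, 'html.parser')
rows = []
for tr in soup.find('table').find_all('tr'):
    cells = tr.find_all('td')
    if len(cells) < 3:
        continue  # header row: it only has <th> cells
    rows.append({
        'PostalCode': cells[0].get_text().strip(),
        'Neighborhood': cells[2].get_text().strip(),
    })

print(rows)
```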


#### 2.3 Remove rows with unassigned neighborhoods <div id="item2_3"/>

In [7]:
# remove all rows where the neighborhood equals 'Not assigned'
print('Shape before: {}'.format(df_t.shape))
df_t = df_t[df_t['Neighborhood'] != 'Not assigned']
print('Shape after: {}'.format(df_t.shape))

Shape before: (287, 2)
Shape after: (210, 2)


#### 2.4 Geocode neighborhoods to get longitudinal and latitudinal coordinates <div id="item2_4"/>

In [8]:
df_t['Address'] = df_t['Neighborhood'] + ', ' + df_t['PostalCode'] + ', ' + 'Toronto'
df_t['Latitude'] = ''
df_t['Longitude'] = ''
df_t['Long_Lat'] = ''
Bing(df_t, 'Canada', 'Address')

1
209


'Done!'

In [9]:
df_t.head(10)

Unnamed: 0,PostalCode,Neighborhood,Address,Latitude,Longitude,Long_Lat
2,M3A,Parkwoods,"Parkwoods, M3A, Toronto",43.755997,-79.329544,"[43.75599670410156, -79.32954406738281]"
3,M4A,Victoria Village,"Victoria Village, M4A, Toronto",43.728336,-79.314789,"[43.728336334228516, -79.31478881835938]"
4,M5A,Harbourfront,"Harbourfront, M5A, Toronto",43.655376,-79.365005,"[43.65537643432617, -79.36500549316406]"
5,M6A,Lawrence Heights,"Lawrence Heights, M6A, Toronto",43.72192,-79.450676,"[43.721920013427734, -79.45067596435547]"
6,M6A,Lawrence Manor,"Lawrence Manor, M6A, Toronto",43.725235,-79.439537,"[43.72523498535156, -79.43953704833984]"
7,M7A,Queen's Park,"Queen's Park, M7A, Toronto",43.661913,-79.389937,"[43.6619131706686, -79.3899373128105]"
9,M9A,Islington Avenue,"Islington Avenue, M9A, Toronto",43.667498,-79.533481,"[43.6674979500411, -79.5334806996528]"
10,M1B,Rouge,"Rouge, M1B, Toronto",43.822937,-79.177452,"[43.82293701171875, -79.17745208740234]"
11,M1B,Malvern,"Malvern, M1B, Toronto",43.8022,-79.223869,"[43.80220031738281, -79.22386932373047]"
13,M3B,Don Mills North,"Don Mills North, M3B, Toronto",43.740825,-79.344493,"[43.7408247192986, -79.3444934347813]"


In [10]:
df_t.describe()

Unnamed: 0,PostalCode,Neighborhood,Address,Latitude,Longitude,Long_Lat
count,210,210,210,210.0,210.0,210
unique,103,208,210,198.0,197.0,198
top,M9V,Runnymede,"Sullivan, M1T, Toronto",43.643871,-79.381714,"[43.64387130737305, -79.3958511352539]"
freq,8,2,1,3.0,3.0,3
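
The summary above reports 210 addresses but only 198 unique `Long_Lat` values, so some neighborhoods were geocoded to the same point. A minimal sketch of how such rows could be listed for review, using a made-up toy dataframe with the same column names:

```python
import pandas as pd

# toy data: neighborhoods A and B share one coordinate pair
toy = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C', 'D'],
    'Latitude': [43.64, 43.64, 43.70, 43.75],
    'Longitude': [-79.38, -79.38, -79.40, -79.33],
})

# keep=False marks every member of a duplicated group, not only the repeats
shared = toy[toy.duplicated(subset=['Latitude', 'Longitude'], keep=False)]
print(shared['Neighborhood'].tolist())
```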


## 3. Download and prepare neighborhoods data - Amsterdam, Netherlands <div id='item3'/>

#### 3.1 Get webpage information and create soup <div id="item3_1"/>

In [11]:
url_a = 'https://en.wikipedia.org/wiki/Template:Neighborhoods_of_Amsterdam'
page_a = requests.get(url_a)
soup_a = BeautifulSoup(page_a.text,'html.parser')

#### 3.2 Extract table information and create dataframe <div id="item3_2"/>

In [12]:
table_contents_a=[]
table_a=soup_a.find('table')

# loop through table rows to obtain values
for row in table_a.findAll('tr'):
    try:
        head = row.find('th', {'class': 'navbox-group'}).get_text()
    except AttributeError:
        # rows without a group header cell carry no neighborhood list
        continue
    for n in row.findAll('li'):
        table_contents_a.append({'District': head, 'Neighborhood': n.get_text()})

# create a dataframe containing extracted text values
df_a=pd.DataFrame(table_contents_a)
print('Amsterdam neighborhoods (head): ')
display(df_a.head())
print('\n')
print('Shape of dataframe: {}'.format(df_a.shape))

Amsterdam neighborhoods (head): 


Unnamed: 0,District,Neighborhood
0,Centrum,Binnenstad (Oude Zijde - Nieuwe Zijde)
1,Centrum,Grachtengordel (Negen Straatjes)
2,Centrum,Haarlemmerbuurt
3,Centrum,Jodenbuurt
4,Centrum,Jordaan




Shape of dataframe: (77, 2)


#### 3.3 Geocode neighborhoods to get longitudinal and latitudinal coordinates <div id="item3_3"/>

In [13]:
df_a['Address'] = df_a['Neighborhood'] + ', ' + 'Amsterdam'
df_a['Latitude'] = ''
df_a['Longitude'] = ''
df_a['Long_Lat'] = ''
Bing(df_a, 'Netherlands', 'Address')

1
76


'Done!'

In [14]:
df_a.head(10)

Unnamed: 0,District,Neighborhood,Address,Latitude,Longitude,Long_Lat
0,Centrum,Binnenstad (Oude Zijde - Nieuwe Zijde),"Binnenstad (Oude Zijde - Nieuwe Zijde), Amsterdam",52.45822,5.03278,"[52.45822, 5.03278]"
1,Centrum,Grachtengordel (Negen Straatjes),"Grachtengordel (Negen Straatjes), Amsterdam",52.40387,4.88928,"[52.40387, 4.88928]"
2,Centrum,Haarlemmerbuurt,"Haarlemmerbuurt, Amsterdam",52.384697,4.886757,"[52.38469696044922, 4.886756896972656]"
3,Centrum,Jodenbuurt,"Jodenbuurt, Amsterdam",52.369171,4.9025,"[52.369171142578125, 4.902500152587891]"
4,Centrum,Jordaan,"Jordaan, Amsterdam",52.373295,4.879922,"[52.373294830322266, 4.879921913146973]"
5,Centrum,Kadijken,"Kadijken, Amsterdam",52.368889,4.91556,"[52.36888885498047, 4.915559768676758]"
6,Centrum,Lastage,"Lastage, Amsterdam",52.373085,4.903207,"[52.373085021972656, 4.903206825256348]"
7,Centrum,Oostelijke Eilanden (Czaar Peterbuurt),"Oostelijke Eilanden (Czaar Peterbuurt), Amsterdam",52.37003,4.92934,"[52.37003, 4.92934]"
8,Centrum,Oosterdokseiland,"Oosterdokseiland, Amsterdam",52.37648,4.906045,"[52.37648010253906, 4.906044960021973]"
9,Centrum,Plantage,"Plantage, Amsterdam",52.364498,4.910798,"[52.364498138427734, 4.910798072814941]"


In [15]:
df_a.describe()

Unnamed: 0,District,Neighborhood,Address,Latitude,Longitude,Long_Lat
count,77,77,77,77.0,77.0,77
unique,8,77,77,72.0,72.0,72
top,West,Oostelijk Havengebied (Borneo-eiland - Cruquiu...,"Postjesbuurt, Amsterdam",52.36154,5.03846,"[52.36154, 5.03846]"
freq,14,1,1,5.0,5.0,5
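
Some of the geocoded points above, for example the first Centrum row at 52.45822, 5.03278, lie noticeably further out than the other Centrum rows and may have been matched to a place outside the city. A rough sanity check, sketched below, is to flag points beyond a chosen distance from the city centre. The centre coordinates and the 10 km cut-off are assumptions chosen for illustration, not values from the project.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))

CITY_CENTRE = (52.3676, 4.9041)  # assumed approximate centre of Amsterdam
MAX_KM = 10                      # assumed cut-off for a plausible neighborhood

# two coordinate pairs taken from the geocoded dataframe above
points = {
    'Binnenstad (Oude Zijde - Nieuwe Zijde)': (52.45822, 5.03278),
    'Jordaan': (52.373295, 4.879922),
}

flagged = [name for name, (lat, lon) in points.items()
           if haversine_km(lat, lon, *CITY_CENTRE) > MAX_KM]
print(flagged)
```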


## 4. Use Foursquare API to retrieve venues data for neighborhoods <div id='item4'/>

#### 4.1 Setup Foursquare API credentials <div id="item4_1"/>

In [16]:
CLIENT_ID = 'MEV5QE34WGODCICLI5W0NF0LVZ1GDSIAORRWHW1T45ZSONS5'
CLIENT_SECRET = '02KNB3V0R0LUACDW1M1KVVL0LDX4V1Q0JXHKEQFO1L0AB4J1'
ACCESS_TOKEN = 'SSCRRE1DNQHRV1JVSWB3KACETRVIUPIEJSUGCGV1EC40JESM'
VERSION = '20180605'
LIMIT = 100
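
Credentials do not have to be written into the notebook itself. As a hedged sketch, they could be read from environment variables instead; the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are hypothetical.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the client id and secret from a mapping such as os.environ."""
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first')
    return client_id, client_secret

# demonstration with a stand-in mapping instead of the real environment
demo_env = {'FOURSQUARE_CLIENT_ID': 'demo-id', 'FOURSQUARE_CLIENT_SECRET': 'demo-secret'}
CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials(demo_env)
print(CLIENT_ID)
```

In a real session the function would be called without arguments so that it reads from `os.environ`.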

#### 4.2 Retrieve Toronto venue data from Foursquare API <div id="item4_2"/>

In [17]:
toronto_venues = getNearbyVenues(names=df_t['Neighborhood'], latitudes=df_t['Latitude'], longitudes=df_t['Longitude'])

Data extraction started...
Data extraction done.


In [18]:
toronto_venues['City'] = 'Toronto'

In [19]:
toronto_venues.head(10)

Unnamed: 0,Neighborhood,Neighb. Latitude,Neighb. Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,City
0,Parkwoods,43.755997,-79.329544,Brookbanks Park,43.751976,-79.33214,Park,Toronto
1,Parkwoods,43.755997,-79.329544,TTC Stop #09083,43.759655,-79.332223,Bus Stop,Toronto
2,Parkwoods,43.755997,-79.329544,DVP at York Mills,43.758899,-79.334099,Intersection,Toronto
3,Parkwoods,43.755997,-79.329544,Chick-N-Joy,43.7599,-79.32652,Fried Chicken Joint,Toronto
4,Victoria Village,43.728336,-79.314789,Tim Hortons,43.725517,-79.313103,Coffee Shop,Toronto
5,Victoria Village,43.728336,-79.314789,Portugril,43.725819,-79.312785,Portuguese Restaurant,Toronto
6,Victoria Village,43.728336,-79.314789,The Frig,43.727051,-79.317418,French Restaurant,Toronto
7,Victoria Village,43.728336,-79.314789,Eglinton Ave E & Sloane Ave/Bermondsey Rd,43.726086,-79.31362,Intersection,Toronto
8,Victoria Village,43.728336,-79.314789,Pizza Nova,43.725824,-79.31286,Pizza Place,Toronto
9,Victoria Village,43.728336,-79.314789,Wigmore Park,43.731023,-79.310771,Park,Toronto


In [20]:
toronto_venues.describe(include=object)

Unnamed: 0,Neighborhood,Venue,Venue Category,City
count,4817,4817,4817,4817
unique,205,2361,332,1
top,St. James Town,Starbucks,Coffee Shop,Toronto
freq,135,113,378,4817


#### 4.3 Retrieve Amsterdam venue data from Foursquare API <div id="item4_3"/>

In [21]:
amsterdam_venues = getNearbyVenues(names=df_a['Neighborhood'], latitudes=df_a['Latitude'], longitudes=df_a['Longitude'])

Data extraction started...
Data extraction done.


In [22]:
amsterdam_venues['City'] = 'Amsterdam'

In [23]:
amsterdam_venues.head(10)

Unnamed: 0,Neighborhood,Neighb. Latitude,Neighb. Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,City
0,Binnenstad (Oude Zijde - Nieuwe Zijde),52.45822,5.03278,Posthoorn,52.460915,5.035037,French Restaurant,Amsterdam
1,Binnenstad (Oude Zijde - Nieuwe Zijde),52.45822,5.03278,De Waegh,52.45924,5.036258,French Restaurant,Amsterdam
2,Binnenstad (Oude Zijde - Nieuwe Zijde),52.45822,5.03278,Beuqz,52.459148,5.036137,Café,Amsterdam
3,Binnenstad (Oude Zijde - Nieuwe Zijde),52.45822,5.03278,De Koperen Vis,52.459852,5.036748,Diner,Amsterdam
4,Binnenstad (Oude Zijde - Nieuwe Zijde),52.45822,5.03278,Cafe 1614,52.459243,5.035988,Pub,Amsterdam
5,Binnenstad (Oude Zijde - Nieuwe Zijde),52.45822,5.03278,"Coffee & Cacao Lunchroom, Chocolaterie, Patiss...",52.45911,5.036719,Café,Amsterdam
6,Binnenstad (Oude Zijde - Nieuwe Zijde),52.45822,5.03278,De Zwaan,52.459129,5.036301,Café,Amsterdam
7,Binnenstad (Oude Zijde - Nieuwe Zijde),52.45822,5.03278,Bierderij Waterland Organic Brewery & Tasting ...,52.458379,5.039865,Brewery,Amsterdam
8,Binnenstad (Oude Zijde - Nieuwe Zijde),52.45822,5.03278,SPAR express Monnickendam Oost,52.458756,5.031091,Convenience Store,Amsterdam
9,Binnenstad (Oude Zijde - Nieuwe Zijde),52.45822,5.03278,Four Seasons,52.46161,5.03641,Chinese Restaurant,Amsterdam


In [24]:
amsterdam_venues.describe(include=object)

Unnamed: 0,Neighborhood,Venue,Venue Category,City
count,2164,2164,2164,2164
unique,71,1527,254,1
top,Oostelijk Havengebied (Borneo-eiland - Cruquiu...,Albert Heijn,Bar,Amsterdam
freq,100,24,108,2164


#### 4.4 One hot encode Toronto and Amsterdam venues by category per neighborhood <div id="item4_4"/>

In [25]:
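# DataFrame.append was removed in pandas 2.0; pd.concat stacks the two dataframes and ignore_index gives unique row labels
all_venues = pd.concat([toronto_venues, amsterdam_venues], ignore_index=True)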
all_venues.shape

(6981, 8)

In [26]:
# one hot encode all venues according to venue category
all_onehot = pd.get_dummies(all_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood and city column back to dataframe
all_onehot['Neighborhood'] = all_venues['Neighborhood']
all_onehot['City'] = all_venues['City']

# move neighborhood and city column to first two columns
cols = list(all_onehot)
cols.insert(0, cols.pop(cols.index('City')))
cols.insert(1, cols.pop(cols.index('Neighborhood')))
all_onehot = all_onehot.loc[:, cols]

all_onehot.head()

Unnamed: 0,City,Neighborhood,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Amphitheater,Antique Shop,Aquarium,Arcade,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Austrian Restaurant,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Badminton Court,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Belgian Restaurant,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Board Shop,Boarding House,Boat Rental,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Brasserie,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Cajun / Creole Restaurant,Camera Store,Canal,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Casino,Cemetery,Cheese Shop,Chinese Restaurant,Chiropractor,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Gym,College Rec Center,College Stadium,Colombian Restaurant,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Coworking Space,Creperie,Cruise Ship,Cuban Restaurant,Cupcake Shop,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distillery,Distribution Center,Dive Bar,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Dutch Restaurant,Eastern European Restaurant,Electronics Store,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Financial or Legal Service,Fish 
& Chips Shop,Fish Market,Flea Market,Flower Shop,Food & Drink Shop,Food Court,Food Service,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Friterie,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Hakka Restaurant,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hockey Arena,Home Service,Hookah Bar,Hospital,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Housing Development,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Laser Tag,Latin American Restaurant,Laundromat,Lawyer,Leather Goods Store,Lebanese Restaurant,Library,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Luggage Store,Mac & Cheese Joint,Malay Restaurant,Marijuana Dispensary,Market,Martial Arts School,Massage Studio,Medical Center,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,Museum,Music School,Music Store,Music Venue,Nail Salon,New American Restaurant,Newsagent,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Event,Other Great Outdoors,Outdoor Sculpture,Outdoor Supply Store,Pakistani Restaurant,Palace,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pet Café,Pet Store,Pharmacy,Photography Lab,Pie Shop,Pier,Pilates Studio,Pizza 
Place,Planetarium,Platform,Playground,Plaza,Poke Place,Pool,Pool Hall,Pop-Up Shop,Portuguese Restaurant,Poutine Place,Print Shop,Pub,Public Art,Racetrack,Ramen Restaurant,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Restaurant,River,Rock Climbing Spot,Rock Club,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shanghai Restaurant,Shipping Store,Shoe Repair,Shoe Store,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Social Club,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Squash Court,Sri Lankan Restaurant,Stables,Stadium,Steakhouse,Storage Facility,Street Art,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Tanning Salon,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Theme Restaurant,Thrift / Vintage Store,Tibetan Restaurant,Toy / Game Store,Track,Trail,Train Station,Tram Station,Tunnel,Turkish Restaurant,Udon Restaurant,University,Vacation Rental,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Windmill,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,Toronto,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Toronto,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Toronto,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Toronto,Parkwoods,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Toronto,Victoria Village,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
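
The wide output above is easier to follow on a toy example. The sketch below applies the same `pd.get_dummies` call to three made-up venue rows, so each venue becomes one row with a single indicator set in the column of its category.

```python
import pandas as pd

# three made-up venue rows
venues = pd.DataFrame({
    'Neighborhood': ['Parkwoods', 'Parkwoods', 'Victoria Village'],
    'Venue Category': ['Park', 'Bus Stop', 'Park'],
})

# one indicator column per category; the empty prefix keeps the bare category names
onehot = pd.get_dummies(venues[['Venue Category']], prefix='', prefix_sep='')
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

print(onehot.columns.tolist())
print(onehot['Park'].astype(int).tolist())
```

Depending on the pandas version, the indicator columns are integers or booleans, which is why the example casts them with `astype(int)` before printing.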


Show the 10 most frequently occurring venue categories across Toronto neighborhoods, with the Amsterdam counts alongside for comparison

In [27]:
all_onehot.groupby('City').sum().transpose().sort_values('Toronto', ascending=False).head(10)

City,Amsterdam,Toronto
Coffee Shop,92.0,378.0
Café,91.0,194.0
Restaurant,99.0,154.0
Park,30.0,128.0
Pizza Place,30.0,121.0
Italian Restaurant,46.0,118.0
Sandwich Place,20.0,107.0
Hotel,105.0,97.0
Bakery,50.0,96.0
Grocery Store,16.0,92.0


#### 4.5 Summarise Toronto and Amsterdam venues by category per neighborhood <div id="item4_5"/>

Summarise the venues by category per neighborhood by taking the mean of each one-hot column, which gives the share of a neighborhood's venues that fall in each category.

In [29]:
all_grouped = all_onehot.groupby(['City', 'Neighborhood']).mean().reset_index()
all_grouped.head()

Unnamed: 0,City,Neighborhood,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Amphitheater,Antique Shop,Aquarium,Arcade,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Australian Restaurant,Austrian Restaurant,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Badminton Court,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Court,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Belgian Restaurant,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Board Shop,Boarding House,Boat Rental,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Bowling Alley,Brasserie,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Cajun / Creole Restaurant,Camera Store,Canal,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Casino,Cemetery,Cheese Shop,Chinese Restaurant,Chiropractor,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Gym,College Rec Center,College Stadium,Colombian Restaurant,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Coworking Space,Creperie,Cruise Ship,Cuban Restaurant,Cupcake Shop,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Distillery,Distribution Center,Dive Bar,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dry Cleaner,Dumpling Restaurant,Dutch Restaurant,Eastern European Restaurant,Electronics Store,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Financial or Legal Service,Fish 
& Chips Shop,Fish Market,Flea Market,Flower Shop,Food & Drink Shop,Food Court,Food Service,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Friterie,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Hakka Restaurant,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hockey Arena,Home Service,Hookah Bar,Hospital,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Housing Development,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Laser Tag,Latin American Restaurant,Laundromat,Lawyer,Leather Goods Store,Lebanese Restaurant,Library,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Luggage Store,Mac & Cheese Joint,Malay Restaurant,Marijuana Dispensary,Market,Martial Arts School,Massage Studio,Medical Center,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,Museum,Music School,Music Store,Music Venue,Nail Salon,New American Restaurant,Newsagent,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Event,Other Great Outdoors,Outdoor Sculpture,Outdoor Supply Store,Pakistani Restaurant,Palace,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pet Café,Pet Store,Pharmacy,Photography Lab,Pie Shop,Pier,Pilates Studio,Pizza 
Place,Planetarium,Platform,Playground,Plaza,Poke Place,Pool,Pool Hall,Pop-Up Shop,Portuguese Restaurant,Poutine Place,Print Shop,Pub,Public Art,Racetrack,Ramen Restaurant,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Restaurant,River,Rock Climbing Spot,Rock Club,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shanghai Restaurant,Shipping Store,Shoe Repair,Shoe Store,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Social Club,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Squash Court,Sri Lankan Restaurant,Stables,Stadium,Steakhouse,Storage Facility,Street Art,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Tanning Salon,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Theme Restaurant,Thrift / Vintage Store,Tibetan Restaurant,Toy / Game Store,Track,Trail,Train Station,Tram Station,Tunnel,Turkish Restaurant,Udon Restaurant,University,Vacation Rental,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Windmill,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo,Zoo Exhibit
0,Amsterdam,Admiralenbuurt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.06383,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.021277,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.021277,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.06383,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Amsterdam,Apollobuurt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0
2,Amsterdam,Banne Buiksloot,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Amsterdam,Bijlmer,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.666667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Amsterdam,Binnenstad (Oude Zijde - Nieuwe Zijde),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.148148,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [30]:
all_grouped.shape

(276, 383)

#### 4.6 View top 5 venues per neighborhood <div id="item4_6"/>

Toronto: Top 5 venues

In [31]:
num_top_venues = 5

toronto_grouped = all_grouped[all_grouped['City'] == 'Toronto']

print('Toronto: Top 5 venues \n')

for hood in toronto_grouped.head()['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[2:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

Toronto: Top 5 venues 

----Adelaide----
         venue  freq
0  Coffee Shop  0.07
1          Bar  0.04
2   Restaurant  0.03
3   Taco Place  0.03
4         Café  0.03


----Agincourt----
                venue  freq
0  Chinese Restaurant  0.11
1     Badminton Court  0.06
2      Sandwich Place  0.06
3            Pharmacy  0.06
4    Sushi Restaurant  0.06


----Agincourt North----
                venue  freq
0  Chinese Restaurant  0.15
1   Convenience Store  0.08
2  Frozen Yogurt Shop  0.08
3                Park  0.08
4        Liquor Store  0.08


----Albion Gardens----
               venue  freq
0        Pizza Place  0.25
1      Grocery Store  0.25
2        Coffee Shop  0.25
3     Sandwich Place  0.25
4  Accessories Store  0.00


----Alderwood----
                venue  freq
0         Pizza Place  0.33
1   Convenience Store  0.17
2            Pharmacy  0.17
3         Coffee Shop  0.17
4  Athletics & Sports  0.17




Amsterdam: Top 5 venues

In [32]:
num_top_venues = 5

amsterdam_grouped = all_grouped[all_grouped['City'] == 'Amsterdam']

print('Amsterdam: Top 5 venues \n')

for hood in amsterdam_grouped.head()['Neighborhood']:
    print("----"+hood+"----")
    temp = amsterdam_grouped[amsterdam_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[2:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

Amsterdam: Top 5 venues 

----Admiralenbuurt----
         venue  freq
0          Bar  0.06
1   Restaurant  0.06
2  Snack Place  0.04
3         Café  0.04
4  Supermarket  0.04


----Apollobuurt----
               venue  freq
0              Hotel  0.08
1   Basketball Court  0.06
2  Health Food Store  0.06
3         Restaurant  0.03
4             Bistro  0.03


----Banne Buiksloot----
         venue  freq
0         Park  0.17
1     Bus Stop  0.17
2  Supermarket  0.17
3   Restaurant  0.08
4       Bakery  0.08


----Bijlmer----
         venue  freq
0     Bus Stop  0.67
1      Dog Run  0.33
2    Racetrack  0.00
3  Pastry Shop  0.00
4         Park  0.00


----Binnenstad (Oude Zijde - Nieuwe Zijde)----
               venue  freq
0           Bus Stop  0.15
1               Café  0.11
2  French Restaurant  0.07
3              Diner  0.04
4        Pizza Place  0.04
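
The transpose-and-sort loops above can also be expressed with `Series.nlargest`. The following is a minimal sketch on made-up data, assuming the same layout as `all_grouped` (a `City` and a `Neighborhood` column followed by one frequency column per venue category). The `top_venues` helper and the toy values are illustrative and not part of the notebook.

```python
import pandas as pd

# Toy stand-in for all_grouped: made-up frequencies, same column layout.
toy_grouped = pd.DataFrame({
    'City': ['Toronto', 'Toronto'],
    'Neighborhood': ['A', 'B'],
    'Bar': [0.10, 0.00],
    'Café': [0.30, 0.50],
    'Park': [0.60, 0.25],
    'Zoo': [0.00, 0.25],
})

def top_venues(grouped, hood, n=5):
    """Return the n highest-frequency categories for one neighborhood (hypothetical helper)."""
    row = grouped.loc[grouped['Neighborhood'] == hood].iloc[0]
    # Drop the two label entries so only category frequencies remain.
    freqs = row.drop(['City', 'Neighborhood']).astype(float)
    return freqs.nlargest(n).round(2)

print(top_venues(toy_grouped, 'A', n=2))
```

Because the result is a Series indexed by category, it can be printed directly or collected into a summary table.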




#### 4.7 Create dataframe containing top 10 venues per neighborhood <div id="item4_7"/>

In [33]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['City', 'Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
all_venues_sorted = pd.DataFrame(columns=columns)
all_venues_sorted['City'] = all_grouped['City']
all_venues_sorted['Neighborhood'] = all_grouped['Neighborhood']

for ind in np.arange(all_grouped.shape[0]):
    all_venues_sorted.iloc[ind, 2:] = return_most_common_venues(all_grouped.iloc[ind, :], num_top_venues)
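
The suffix lookup above is correct for `num_top_venues = 10`, but it would label a 21st column `21th`, because every position past the third falls through to `th`. If longer lists were ever needed, a small ordinal helper avoids this. The sketch below is one possible version; the name `ordinal` is hypothetical.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th, 13th (and 111th, 112th, ...) always take 'th'.
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# Build the same column list as the cell above, for ten venues.
columns = ['City', 'Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)
]
print(columns[2], '|', columns[-1])
```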

In [34]:
all_venues_sorted[all_venues_sorted['City'] == 'Amsterdam'].head(10)

Unnamed: 0,City,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Amsterdam,Admiralenbuurt,Bar,Restaurant,Snack Place,Café,Supermarket,Coffee Shop,Tram Station,Deli / Bodega,Ice Cream Shop,Falafel Restaurant
1,Amsterdam,Apollobuurt,Hotel,Basketball Court,Health Food Store,Restaurant,Bistro,Supermarket,Bookstore,Steakhouse,Breakfast Spot,Bridal Shop
2,Amsterdam,Banne Buiksloot,Park,Bus Stop,Supermarket,Restaurant,Bakery,Café,Turkish Restaurant,Shopping Mall,Drugstore,Office
3,Amsterdam,Bijlmer,Bus Stop,Dog Run,Racetrack,Pastry Shop,Park,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store,Outdoor Sculpture
4,Amsterdam,Binnenstad (Oude Zijde - Nieuwe Zijde),Bus Stop,Café,French Restaurant,Diner,Pizza Place,Coffee Shop,Snack Place,Athletics & Sports,Asian Restaurant,Harbor / Marina
5,Amsterdam,Bos en Lommer (Kolenkitbuurt - Landlust),Park,Bakery,Restaurant,Bagel Shop,Pizza Place,Bar,Gym / Fitness Center,Paper / Office Supplies Store,Café,Seafood Restaurant
6,Amsterdam,Buiksloot,Park,Bus Stop,Supermarket,Restaurant,Bakery,Café,Turkish Restaurant,Shopping Mall,Drugstore,Office
7,Amsterdam,Buikslotermeer,Supermarket,Bakery,Sandwich Place,Clothing Store,Electronics Store,Convenience Store,Market,Restaurant,Drugstore,Bus Stop
8,Amsterdam,Buitenveldert,Hotel,Drugstore,Supermarket,Sandwich Place,Restaurant,Bakery,Coffee Shop,Bistro,Chocolate Shop,Market
9,Amsterdam,Bullewijk,Hotel,Coffee Shop,Café,Hostel,Cafeteria,Performing Arts Venue,Office,Restaurant,Gym,Brewery


In [35]:
all_venues_sorted[all_venues_sorted['City'] == 'Toronto'].head(10)

Unnamed: 0,City,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
71,Toronto,Adelaide,Coffee Shop,Bar,Restaurant,Taco Place,Café,Hotel,Movie Theater,Vegetarian / Vegan Restaurant,Gym,Pizza Place
72,Toronto,Agincourt,Chinese Restaurant,Badminton Court,Sandwich Place,Pharmacy,Sushi Restaurant,Shopping Mall,Coffee Shop,Supermarket,Pizza Place,Shanghai Restaurant
73,Toronto,Agincourt North,Chinese Restaurant,Convenience Store,Frozen Yogurt Shop,Park,Liquor Store,Dim Sum Restaurant,Pizza Place,Fast Food Restaurant,Clothing Store,Bank
74,Toronto,Albion Gardens,Pizza Place,Grocery Store,Coffee Shop,Sandwich Place,Accessories Store,Park,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store
75,Toronto,Alderwood,Pizza Place,Convenience Store,Pharmacy,Coffee Shop,Athletics & Sports,Accessories Store,Other Event,Park,Paper / Office Supplies Store,Palace
76,Toronto,Bathurst Manor,Playground,Park,Convenience Store,Baseball Field,Optical Shop,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store,Outdoor Sculpture
77,Toronto,Bathurst Quay,Coffee Shop,Café,Park,Grocery Store,Pizza Place,Caribbean Restaurant,Bank,Gym,Sculpture Garden,Harbor / Marina
78,Toronto,Bayview Village,Trail,Dog Run,Construction & Landscaping,Golf Driving Range,Optical Shop,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store,Outdoor Sculpture
79,Toronto,Beaumond Heights,Grocery Store,Pizza Place,Fast Food Restaurant,Caribbean Restaurant,Auto Garage,Pharmacy,Sandwich Place,Beer Store,Coffee Shop,Park
80,Toronto,Bedford Park,Coffee Shop,Italian Restaurant,Sandwich Place,Restaurant,Park,Grocery Store,Bakery,Toy / Game Store,Liquor Store,Sushi Restaurant
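
Several rows above end in the same run of categories (Paper / Office Supplies Store, Palace, Pakistani Restaurant, and so on). For neighborhoods with few venues these are zero-frequency columns that the sort falls back on, as the Bijlmer and Albion Gardens listings in section 4.6 show with `0.00` entries, and not venues that were actually found. One possible way to keep them out of the table is sketched below on toy data. `top_nonzero_venues` is a hypothetical helper and is not the notebook's `return_most_common_venues`.

```python
import numpy as np
import pandas as pd

def top_nonzero_venues(row, num_top_venues):
    """Return the top categories of one neighborhood row, with NaN where nothing is left.

    Hypothetical helper: assumes the first two entries of `row` are
    'City' and 'Neighborhood' and the rest are category frequencies.
    """
    freqs = row.iloc[2:].astype(float)
    # Keep only categories that actually occur, highest frequency first.
    ranked = freqs[freqs > 0].sort_values(ascending=False)
    top = list(ranked.index[:num_top_venues])
    # Pad with NaN so every neighborhood yields the same number of columns.
    return top + [np.nan] * (num_top_venues - len(top))

# Toy row with only two non-zero categories.
toy_row = pd.Series({'City': 'Amsterdam', 'Neighborhood': 'X',
                     'Bus Stop': 0.67, 'Dog Run': 0.33, 'Park': 0.0, 'Zoo': 0.0})
print(top_nonzero_venues(toy_row, 4))
```

Whether missing ranks should be left as NaN or filled with a label such as "None" depends on how the table is used downstream.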


## 5. Cluster and classify neighborhoods <div id='item5'/>

#### 5.1 Build k-Means clustering model using the Toronto neighborhoods <div id="item5_1"/>

In [36]:
# set number of clusters
kclusters = 10

toronto_grouped_clustering = all_grouped[all_grouped['City'] == 'Toronto'].drop(columns=['City', 'Neighborhood'])

# run k-means clustering
toronto_clusters = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
toronto_clusters.labels_[0:10] 

array([4, 4, 4, 3, 3, 1, 4, 7, 4, 4])

In [37]:
# add clustering labels
# take a copy so that inserting a column does not modify a slice of all_venues_sorted
toronto_venues_sorted = all_venues_sorted[all_venues_sorted['City'] == 'Toronto'].copy()
toronto_venues_sorted.insert(0, 'Cluster Labels', toronto_clusters.labels_)

toronto_merged = df_t

# join the cluster labels and top venues onto df_t, which holds latitude/longitude for each neighborhood
toronto_merged = toronto_merged.join(toronto_venues_sorted.set_index('Neighborhood'), on='Neighborhood').dropna(subset=['Cluster Labels'])
toronto_merged['Cluster Labels'] = toronto_merged['Cluster Labels'].astype('int', copy=True)

toronto_merged.head()

Unnamed: 0,PostalCode,Neighborhood,Address,Latitude,Longitude,Long_Lat,Cluster Labels,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,M3A,Parkwoods,"Parkwoods, M3A, Toronto",43.755997,-79.329544,"[43.75599670410156, -79.32954406738281]",1,Toronto,Park,Intersection,Bus Stop,Fried Chicken Joint,Accessories Store,Organic Grocery,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store
3,M4A,Victoria Village,"Victoria Village, M4A, Toronto",43.728336,-79.314789,"[43.728336334228516, -79.31478881835938]",3,Toronto,Park,Intersection,Portuguese Restaurant,French Restaurant,Coffee Shop,Pizza Place,Optical Shop,Paper / Office Supplies Store,Palace,Pakistani Restaurant
4,M5A,Harbourfront,"Harbourfront, M5A, Toronto",43.655376,-79.365005,"[43.65537643432617, -79.36500549316406]",4,Toronto,Coffee Shop,Italian Restaurant,Theater,Sandwich Place,Thrift / Vintage Store,Spa,Bakery,Gym / Fitness Center,Bar,Thai Restaurant
5,M6A,Lawrence Heights,"Lawrence Heights, M6A, Toronto",43.72192,-79.450676,"[43.721920013427734, -79.45067596435547]",4,Toronto,Clothing Store,Coffee Shop,Dessert Shop,Cosmetics Shop,Women's Store,Restaurant,Sushi Restaurant,Food Court,Toy / Game Store,Bakery
6,M6A,Lawrence Manor,"Lawrence Manor, M6A, Toronto",43.725235,-79.439537,"[43.72523498535156, -79.43953704833984]",6,Toronto,Park,Accessories Store,Optical Shop,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store,Outdoor Sculpture,Other Great Outdoors,Other Event


In [38]:
toronto_merged.loc[:, ['Cluster Labels', 'Neighborhood']].groupby('Cluster Labels').count()

Unnamed: 0_level_0,Neighborhood
Cluster Labels,Unnamed: 1_level_1
0,8
1,20
2,5
3,31
4,123
5,2
6,9
7,5
8,3
9,1


#### 5.2 Classify Amsterdam neighborhoods using Toronto neighborhood k-Means clustering model <div id="item5_2"/>

In [39]:
amsterdam_grouped_clustering = all_grouped[all_grouped['City'] == 'Amsterdam'].drop(columns=['City', 'Neighborhood'])

amsterdam_clusters = toronto_clusters.predict(amsterdam_grouped_clustering)

amsterdam_clusters[:10]

array([4, 4, 1, 4, 4, 4, 1, 4, 4, 4])

In [40]:
# add clustering labels
# take a copy so that inserting a column does not modify a slice of all_venues_sorted
amsterdam_venues_sorted = all_venues_sorted[all_venues_sorted['City'] == 'Amsterdam'].copy()
amsterdam_venues_sorted.insert(0, 'Cluster Labels', amsterdam_clusters)

amsterdam_merged = df_a

# join the cluster labels and top venues onto df_a, which holds latitude/longitude for each neighborhood
amsterdam_merged = amsterdam_merged.join(amsterdam_venues_sorted.set_index('Neighborhood'), on='Neighborhood').dropna(subset=['Cluster Labels'])
amsterdam_merged['Cluster Labels'] = amsterdam_merged['Cluster Labels'].astype('int', copy=True)

amsterdam_merged.head()

Unnamed: 0,District,Neighborhood,Address,Latitude,Longitude,Long_Lat,Cluster Labels,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Centrum,Binnenstad (Oude Zijde - Nieuwe Zijde),"Binnenstad (Oude Zijde - Nieuwe Zijde), Amsterdam",52.45822,5.03278,"[52.45822, 5.03278]",4,Amsterdam,Bus Stop,Café,French Restaurant,Diner,Pizza Place,Coffee Shop,Snack Place,Athletics & Sports,Asian Restaurant,Harbor / Marina
1,Centrum,Grachtengordel (Negen Straatjes),"Grachtengordel (Negen Straatjes), Amsterdam",52.40387,4.88928,"[52.40387, 4.88928]",0,Amsterdam,Restaurant,Hotel,Boat or Ferry,Steakhouse,Bagel Shop,Arcade,Bakery,General Entertainment,Harbor / Marina,Sandwich Place
2,Centrum,Haarlemmerbuurt,"Haarlemmerbuurt, Amsterdam",52.384697,4.886757,"[52.38469696044922, 4.886756896972656]",4,Amsterdam,Bar,Deli / Bodega,Plaza,Coffee Shop,Café,Restaurant,Sandwich Place,French Restaurant,Italian Restaurant,Music Venue
3,Centrum,Jodenbuurt,"Jodenbuurt, Amsterdam",52.369171,4.9025,"[52.369171142578125, 4.902500152587891]",4,Amsterdam,Hotel,Café,Coffee Shop,Bar,Cocktail Bar,History Museum,French Restaurant,Greek Restaurant,Grocery Store,Beer Bar
4,Centrum,Jordaan,"Jordaan, Amsterdam",52.373295,4.879922,"[52.373294830322266, 4.879921913146973]",4,Amsterdam,Bar,Coffee Shop,Café,Italian Restaurant,Hotel,Pizza Place,Ice Cream Shop,Cocktail Bar,Art Gallery,Chocolate Shop


In [41]:
amsterdam_merged.loc[:, ['Cluster Labels', 'Neighborhood']].groupby('Cluster Labels').count()

Unnamed: 0_level_0,Neighborhood
Cluster Labels,Unnamed: 1_level_1
0,5
1,4
2,1
4,61
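
`KMeans.predict` assigns each new row to the nearest of the centroids learned during `fit`. The Amsterdam neighborhoods are therefore matched to Toronto-derived clusters and not clustered afresh, which is why some labels can go unused, as with clusters 3 and 5 to 9 above. The feature columns must be identical and in the same order for both cities; that holds here because both frames are slices of `all_grouped`. The sketch below illustrates the nearest-centroid behaviour with random toy data (the array names and sizes are assumptions).

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Toy frequency matrices: rows are neighborhoods, columns are venue categories.
city_a = rng.random((30, 4))   # plays the role of Toronto (used to fit)
city_b = rng.random((8, 4))    # plays the role of Amsterdam (only predicted)

model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(city_a)
labels_b = model.predict(city_b)

# predict() is a nearest-centroid lookup: recompute it by hand to confirm.
dists = np.linalg.norm(
    city_b[:, None, :] - model.cluster_centers_[None, :, :], axis=2
)
nearest = dists.argmin(axis=1)
print(labels_b)
print(nearest)
```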


#### 5.3 Examine the clusters <div id="item5_3"/>


In [52]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(6, toronto_merged.shape[1]))]].head(10)

Unnamed: 0,Neighborhood,Cluster Labels,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Parkwoods,1,Toronto,Park,Intersection,Bus Stop,Fried Chicken Joint,Accessories Store,Organic Grocery,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store
25,West Deane Park,1,Toronto,Park,Convenience Store,Business Service,Skating Rink,Organic Grocery,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store,Outdoor Sculpture
32,Woodbine Heights,1,Toronto,Park,Skating Rink,Bus Stop,Athletics & Sports,Curling Ice,Beer Store,Accessories Store,Other Event,Paper / Office Supplies Store,Palace
41,Guildwood,1,Toronto,Park,Sports Bar,Hotel,Sandwich Place,Optical Shop,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store,Outdoor Sculpture
43,West Hill,1,Toronto,Park,Construction & Landscaping,Gym / Fitness Center,Dry Cleaner,Accessories Store,Organic Grocery,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store
57,Christie,1,Toronto,Park,Baby Store,American Restaurant,Café,Candy Store,Coffee Shop,Grocery Store,Japanese Restaurant,Italian Restaurant,Pakistani Restaurant
63,Bathurst Manor,1,Toronto,Playground,Park,Convenience Store,Baseball Field,Optical Shop,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store,Outdoor Sculpture
83,Toronto Islands,1,Toronto,Café,Music Venue,Park,Harbor / Marina,Other Event,Pastry Shop,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store
95,Downsview East,1,Toronto,Park,Latin American Restaurant,Photography Lab,Vietnamese Restaurant,Optical Shop,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store,Outdoor Sculpture
126,Willowdale,1,Toronto,Korean Restaurant,Playground,Park,Japanese Restaurant,Optical Shop,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store,Outdoor Sculpture


In [51]:
amsterdam_merged.loc[amsterdam_merged['Cluster Labels'] == 1, amsterdam_merged.columns[[1] + list(range(5, amsterdam_merged.shape[1]))]].head(10)

Unnamed: 0,Neighborhood,Long_Lat,Cluster Labels,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
20,Sloten,"[52.339561462402344, 4.816626071929932]",1,Amsterdam,Café,Park,Hotel,Bus Stop,Diner,Other Event,Pastry Shop,Paper / Office Supplies Store,Palace,Pakistani Restaurant
23,Banne Buiksloot,"[52.40760803222656, 4.916283130645752]",1,Amsterdam,Park,Bus Stop,Supermarket,Restaurant,Bakery,Café,Turkish Restaurant,Shopping Mall,Drugstore,Office
24,Buiksloot,"[52.406494140625, 4.9156270027160645]",1,Amsterdam,Park,Bus Stop,Supermarket,Restaurant,Bakery,Café,Turkish Restaurant,Shopping Mall,Drugstore,Office
72,Gaasperdam,"[52.31224060058594, 4.982944011688232]",1,Amsterdam,Food & Drink Shop,Park,Tunnel,Bus Station,Organic Grocery,Paper / Office Supplies Store,Palace,Pakistani Restaurant,Outdoor Supply Store,Outdoor Sculpture


In [44]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 9, toronto_merged.columns[list(range(8, toronto_merged.shape[1]))]].stack().value_counts().head(20)

Seafood Restaurant               1
Pakistani Restaurant             1
Other Great Outdoors             1
Accessories Store                1
Palace                           1
Outdoor Sculpture                1
Paper / Office Supplies Store    1
Outdoor Supply Store             1
Optical Shop                     1
Brewery                          1
dtype: int64

In [45]:
amsterdam_merged.loc[amsterdam_merged['Cluster Labels'] == 9, amsterdam_merged.columns[list(range(8, amsterdam_merged.shape[1]))]].stack().value_counts().head(20)

Series([], dtype: int64)

#### 5.4 Cluster profiles <div id="item5_4"/>

In [53]:
print('Toronto clusters:\n')

for c in toronto_merged['Cluster Labels'].sort_values().unique():
    print('Profile of cluster {}\n'.format(c))
    display(toronto_merged.loc[toronto_merged['Cluster Labels'] == c, toronto_merged.columns[list(range(8, toronto_merged.shape[1]))]].stack().value_counts().head(5))
    print('\n')

Toronto clusters:

Profile of cluster 0



Restaurant                       8
Palace                           6
Park                             6
Paper / Office Supplies Store    4
Pakistani Restaurant             4
dtype: int64



Profile of cluster 1



Park                             20
Palace                           19
Pakistani Restaurant             19
Paper / Office Supplies Store    18
Outdoor Supply Store             16
dtype: int64



Profile of cluster 2



Palace                           5
Paper / Office Supplies Store    5
Coffee Shop                      5
Park                             5
Pastry Shop                      4
dtype: int64



Profile of cluster 3



Pizza Place                      25
Paper / Office Supplies Store    21
Palace                           20
Coffee Shop                      15
Pakistani Restaurant             15
dtype: int64



Profile of cluster 4



Coffee Shop    90
Café           59
Restaurant     45
Park           40
Bakery         34
dtype: int64



Profile of cluster 5



Pakistani Restaurant    2
Park                    2
Pharmacy                2
Accessories Store       2
Palace                  2
dtype: int64



Profile of cluster 6



Other Great Outdoors    9
Outdoor Sculpture       9
Palace                  9
Pakistani Restaurant    9
Outdoor Supply Store    9
dtype: int64



Profile of cluster 7



Trail                   5
Pakistani Restaurant    5
Outdoor Sculpture       5
Palace                  5
Outdoor Supply Store    5
dtype: int64



Profile of cluster 8



Pakistani Restaurant    3
Park                    3
Other Great Outdoors    3
Accessories Store       3
Palace                  3
dtype: int64



Profile of cluster 9



Seafood Restaurant      1
Pakistani Restaurant    1
Other Great Outdoors    1
Accessories Store       1
Palace                  1
dtype: int64





In [54]:
print('Amsterdam clusters:\n')

for c in amsterdam_merged['Cluster Labels'].sort_values().unique():
    print('Profile of cluster {}\n'.format(c))
    display(amsterdam_merged.loc[amsterdam_merged['Cluster Labels'] == c, amsterdam_merged.columns[list(range(8, amsterdam_merged.shape[1]))]].stack().value_counts().head(5))
    print('\n')

Amsterdam clusters:

Profile of cluster 0



Restaurant           5
Hotel                3
Sandwich Place       2
Coffee Shop          2
Convenience Store    2
dtype: int64



Profile of cluster 1



Park        4
Café        3
Bus Stop    3
Palace      2
Bakery      2
dtype: int64



Profile of cluster 2



Pakistani Restaurant    1
Park                    1
Accessories Store       1
Organic Grocery         1
Palace                  1
dtype: int64



Profile of cluster 4



Restaurant     30
Coffee Shop    29
Café           28
Bar            26
Hotel          22
dtype: int64





## 6. Analyse neighborhood clusters on geographical maps <div id='item6'/>

#### 6.1 Set parameters <div id="item6_1"/>

In [77]:
geolocator = Nominatim(user_agent="_explorer")

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
# sample the colormap over [0, 1]; inputs above 1 are clipped to the last color
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
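
A matplotlib colormap such as `cm.rainbow` expects float inputs between 0 and 1 and clips anything outside that range to its end colors, so evenly spaced samples over [0, 1] are what give one distinct color per cluster. A minimal sketch, assuming matplotlib is installed; `n_clusters` stands in for `kclusters`, and the two list names are illustrative:

```python
import numpy as np
from matplotlib import cm, colors

n_clusters = 10

# Evenly spaced samples across the whole colormap: one hex color per cluster.
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, n_clusters))]

# Samples beyond 1 are clipped to the last color, so several clusters coincide.
clipped = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 2.5, n_clusters))]

print(len(set(palette)), 'distinct colors when sampling [0, 1]')
print(len(set(clipped)), 'distinct colors when sampling [0, 2.5]')
```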

#### 6.2 Toronto neighborhoods <div id="item6_2"/>

In [79]:
location_t = geolocator.geocode('Toronto, Canada')
latitude_t = location_t.latitude
longitude_t = location_t.longitude

# create map
map_clusters_t = folium.Map(location=[latitude_t, longitude_t], zoom_start=11)

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters_t)
       
map_clusters_t

#### 6.3 Amsterdam neighborhoods <div id="item6_3"/>

In [80]:
location_a = geolocator.geocode('Amsterdam, Netherlands')
latitude_a = location_a.latitude
longitude_a = location_a.longitude

# create map
map_clusters_a = folium.Map(location=[latitude_a, longitude_a], zoom_start=12)

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(amsterdam_merged['Latitude'], amsterdam_merged['Longitude'], amsterdam_merged['Neighborhood'], amsterdam_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters_a)
       
map_clusters_a