## Project plan

### Business problem: We are looking for a location in Toronto to open a Chinese restaurant that serves authentic Chinese food at mid-range prices, a place for people who want to enjoy both the food and the atmosphere. Opening a restaurant is a risky business these days, so we want to invest wisely by first making an informed decision about the location.

### 1. First step: We want to narrow our choices to neighbourhoods that already have a number of venues and an established customer base, since these suggest higher foot traffic. Using Foursquare, we get the number of venues within a certain radius of each neighbourhood and eliminate the neighbourhoods with low counts.

In [98]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Libraries imported.


#### Load and prepare data about Toronto neighbourhoods

In [127]:
# web scraping into dataframe

tables = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')
df_Toronto = tables[0]

# remove "Not assigned" Borough rows

df_Toronto = df_Toronto[df_Toronto['Borough']!='Not assigned']
df_Toronto.reset_index(drop = True, inplace = True)

# check dataframe

print(df_Toronto.shape)
df_Toronto.head()

(103, 3)


| | Postal Code | Borough | Neighbourhood |
|---|---|---|---|
| 0 | M3A | North York | Parkwoods |
| 1 | M4A | North York | Victoria Village |
| 2 | M5A | Downtown Toronto | Regent Park, Harbourfront |
| 3 | M6A | North York | Lawrence Manor, Lawrence Heights |
| 4 | M7A | Downtown Toronto | Queen's Park, Ontario Provincial Government |


#### Add location data to the dataframe

In [128]:
# read pre-processed location data

loc_data = pd.read_csv('https://cocl.us/Geospatial_data')

# merge the two dataframes on Postal Code

Neighbourhoods = pd.merge(df_Toronto, loc_data, on = 'Postal Code')

# check revised dataframe

print(Neighbourhoods.shape)
Neighbourhoods.head()

(103, 5)


| | Postal Code | Borough | Neighbourhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M3A | North York | Parkwoods | 43.753259 | -79.329656 |
| 1 | M4A | North York | Victoria Village | 43.725882 | -79.315572 |
| 2 | M5A | Downtown Toronto | Regent Park, Harbourfront | 43.65426 | -79.360636 |
| 3 | M6A | North York | Lawrence Manor, Lawrence Heights | 43.718518 | -79.464763 |
| 4 | M7A | Downtown Toronto | Queen's Park, Ontario Provincial Government | 43.662301 | -79.389494 |
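
The merge above is an inner join, so a postal code missing from the coordinates file would be dropped without warning. The unchanged row count (103) suggests nothing was lost here. As a minimal sketch of how one might check this, assuming made-up toy frames in place of `df_Toronto` and `loc_data`, a left join with `indicator=True` marks the rows that found no match:

```python
import pandas as pd

# Toy stand-ins for df_Toronto and loc_data (values are illustrative only)
left = pd.DataFrame({
    'Postal Code': ['M3A', 'M4A', 'M9Z'],
    'Neighbourhood': ['Parkwoods', 'Victoria Village', 'Made-up Place'],
})
right = pd.DataFrame({
    'Postal Code': ['M3A', 'M4A'],
    'Latitude': [43.753259, 43.725882],
    'Longitude': [-79.329656, -79.315572],
})

# A left join with indicator=True adds a '_merge' column that records
# whether each row found a match in the right-hand frame
checked = pd.merge(left, right, on='Postal Code', how='left', indicator=True)
unmatched = checked.loc[checked['_merge'] == 'left_only', 'Postal Code'].tolist()
print(unmatched)
```

An empty `unmatched` list would mean every postal code received coordinates.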


In [129]:
# check and remove duplicates
duplicates = Neighbourhoods.index[Neighbourhoods.duplicated(['Neighbourhood'])].tolist()
Neighbourhoods = Neighbourhoods.drop(duplicates)
Neighbourhoods.shape

(99, 5)
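
The cell above drops every row whose neighbourhood name has already appeared, so only the first postal code (and its coordinates) survives for a repeated name. A small sketch on invented data shows what `duplicated` flags and that `drop_duplicates` gives the same result in one call:

```python
import pandas as pd

# Invented rows: the same neighbourhood name under two postal codes
toy = pd.DataFrame({
    'Postal Code': ['M1A', 'M2A', 'M3A'],
    'Neighbourhood': ['Downsview', 'Don Mills', 'Downsview'],
})

# duplicated() flags every repeat after the first occurrence
flags = toy.duplicated(['Neighbourhood']).tolist()

# drop_duplicates keeps the first row for each neighbourhood name,
# which matches dropping the flagged indices
deduped = toy.drop_duplicates(subset='Neighbourhood', keep='first')
print(flags)
print(deduped['Postal Code'].tolist())
```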

#### Examine venues in the neighbourhoods

In [107]:
# define Foursquare credentials and version

CLIENT_ID = 'YOUR_CLIENT_ID'  # placeholder: use your own Foursquare client ID
CLIENT_SECRET = 'YOUR_CLIENT_SECRET'  # placeholder: never publish real credentials
VERSION = '20180605'
LIMIT = 100

In [108]:
# From Foursquare, let's define a function to loop through the neighbourhoods and get nearby venues

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION,
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['id'], 
            v['venue']['name'],
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude',
                  'Venue ID', 
                  'Venue',
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
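
The function above indexes straight into the JSON response, so a payload without `groups`, or a venue without any category, would raise an exception partway through the loop. The following is only a sketch of a more defensive parser: `extract_venues` is a hypothetical helper, the key layout is copied from the lookups in `getNearbyVenues`, and the sample payload is made up.

```python
def extract_venues(payload):
    """Pull (id, name, lat, lng, category) tuples out of an explore-style payload.

    Hypothetical helper: the key layout mirrors the lookups in
    getNearbyVenues; anything missing yields fewer rows, not an exception.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []

    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories', [])
        # fall back to None when a venue has no category
        category = categories[0].get('name') if categories else None
        rows.append((venue.get('id'), venue.get('name'),
                     location.get('lat'), location.get('lng'), category))
    return rows


# Made-up payload for illustration
sample = {'response': {'groups': [{'items': [
    {'venue': {'id': 'a1', 'name': 'Cafe X',
               'location': {'lat': 43.65, 'lng': -79.38},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'id': 'b2', 'name': 'No Category Spot',
               'location': {'lat': 43.66, 'lng': -79.39},
               'categories': []}},
]}]}}

print(extract_venues(sample))
print(extract_venues({}))
```

Returning plain tuples keeps the helper independent of pandas, so it could be called inside the loop before the rows are collected into a dataframe.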

In [130]:
# run the defined function

df_venues = getNearbyVenues(names = Neighbourhoods['Neighbourhood'],
                                latitudes = Neighbourhoods['Latitude'],
                                longitudes = Neighbourhoods['Longitude'])

Parkwoods
Victoria Village
Regent Park, Harbourfront
Lawrence Manor, Lawrence Heights
Queen's Park, Ontario Provincial Government
Islington Avenue, Humber Valley Village
Malvern, Rouge
Don Mills
Parkview Hill, Woodbine Gardens
Garden District, Ryerson
Glencairn
West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale
Rouge Hill, Port Union, Highland Creek
Woodbine Heights
St. James Town
Humewood-Cedarvale
Eringate, Bloordale Gardens, Old Burnhamthorpe, Markland Wood
Guildwood, Morningside, West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor, Wilson Heights, Downsview North
Thorncliffe Park
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Scarborough Village
Fairview, Henry Farm, Oriole
Northwood Park, York University
East Toronto, Broadview North (Old East York)
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
Kennedy Park, Ionview, East Birchmount Park
B

In [131]:
# check result

print(df_venues.shape)
df_venues.head()

(2103, 8)


| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue ID | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|---|
| 0 | Parkwoods | 43.753259 | -79.329656 | 4e8d9dcdd5fbbbb6b3003c7b | Brookbanks Park | 43.751976 | -79.33214 | Park |
| 1 | Parkwoods | 43.753259 | -79.329656 | 4cb11e2075ebb60cd1c4caad | Variety Store | 43.751974 | -79.333114 | Food & Drink Shop |
| 2 | Victoria Village | 43.725882 | -79.315572 | 4c633acb86b6be9a61268e34 | Victoria Village Arena | 43.723481 | -79.315635 | Hockey Arena |
| 3 | Victoria Village | 43.725882 | -79.315572 | 4f3ecce6e4b0587016b6f30d | Portugril | 43.725819 | -79.312785 | Portuguese Restaurant |
| 4 | Victoria Village | 43.725882 | -79.315572 | 4bbe904a85fbb713420d7167 | Tim Hortons | 43.725517 | -79.313103 | Coffee Shop |


In [135]:
# check numbers of venues for each neighbourhood

count_venues = df_venues[['Neighborhood','Venue']].groupby('Neighborhood').count()
print(count_venues.shape)
count_venues.sort_values(by = 'Venue', ascending = False)

(94, 1)


| Neighborhood | Venue |
|---|---|
| Harbourfront East, Union Station, Toronto Islands | 100 |
| First Canadian Place, Underground city | 100 |
| Toronto Dominion Centre, Design Exchange | 100 |
| Commerce Court, Victoria Hotel | 100 |
| Garden District, Ryerson | 100 |
| Stn A PO Boxes | 98 |
| Richmond, Adelaide, King | 97 |
| St. James Town | 81 |
| Church and Wellesley | 79 |
| Fairview, Henry Farm, Oriole | 68 |


#### We eliminate neighbourhoods with fewer than 20 venues, because less commercial activity can imply fewer people, less foot traffic, and lower demand

In [137]:
# convert data type of the "Venue" column 

count_venues['Venue'] = pd.to_numeric(count_venues['Venue'])

# filter the qualified neighbourhoods which should have at least 20 venues

qualified = list(count_venues[count_venues['Venue'] >= 20].index)
len(qualified)  # 31 neighbourhoods qualify

31

In [138]:
# keep neighbourhoods with at least 20 venues in dataframe "Neighbourhoods"

Popular_Neighbourhoods = Neighbourhoods[Neighbourhoods['Neighbourhood'].isin(qualified)]
Popular_Neighbourhoods.shape # check again the number of qualified neighbourhoods

(31, 5)

In [139]:
# keep neighbourhoods with at least 20 venues in df_venues

Popular_venues = df_venues[df_venues['Neighborhood'].isin(qualified)]
Popular_venues.shape

(1679, 8)
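
The counting and filtering steps above can also be written with `value_counts`. The sketch below uses invented data and a threshold of 2 instead of 20 so the effect is visible on a few rows. Note that the request sets `LIMIT = 100`, so a count of 100 in the table above is a cap and the true number of venues may be higher.

```python
import pandas as pd

# Invented venues: neighbourhood A has 3, B has 1, C has 2
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B', 'C', 'C'],
    'Venue': ['v1', 'v2', 'v3', 'v4', 'v5', 'v6'],
})
min_venues = 2  # the notebook uses 20

# value_counts gives the number of venue rows per neighbourhood
counts = toy_venues['Neighborhood'].value_counts()
qualified_toy = sorted(counts[counts >= min_venues].index)

# keep only the venues that belong to a qualifying neighbourhood
popular_toy = toy_venues[toy_venues['Neighborhood'].isin(qualified_toy)]
print(qualified_toy)
print(popular_toy.shape)
```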

### 2. Step 2: By clustering neighbourhoods by their venues' categories, we want to learn what people do in different neighbourhoods. For example, we do not want to open our restaurant next to an airport, where people mostly want fast, convenient food.

#### Analyze venues of each neighbourhood

In [140]:
# one hot encoding
df_onehot = pd.get_dummies(Popular_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighbourhood column back to dataframe
df_onehot['Neighbourhood'] = Popular_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [df_onehot.columns[-1]] + list(df_onehot.columns[:-1])
df_onehot = df_onehot[fixed_columns]

print(df_onehot.shape) # double check row number
df_onehot.head()

(1679, 225)


Unnamed: 0,Neighbourhood,Accessories Store,Adult Boutique,Afghan Restaurant,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Shop,Bistro,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Butcher,Café,Cajun / Creole Restaurant,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Cafeteria,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Eastern European Restaurant,Electronics Store,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor 
Store,Lounge,Market,Martial Arts School,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Poke Place,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Women's Store,Yoga Studio
7,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
10,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
11,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [141]:
# next, group rows by neighbourhood and take the mean of each one-hot column, i.e. the frequency of each category

df_grouped = df_onehot.groupby('Neighbourhood').mean().reset_index()
print(df_grouped.shape)
df_grouped

(31, 225)


Unnamed: 0,Neighbourhood,Accessories Store,Adult Boutique,Afghan Restaurant,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Shop,Bistro,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Butcher,Café,Cajun / Creole Restaurant,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Cafeteria,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Eastern European Restaurant,Electronics Store,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor 
Store,Lounge,Market,Martial Arts School,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Poke Place,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.035088,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.035088,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.017544,0.052632,0.087719,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.017544,0.017544,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Brockton, Parkdale Village, Exhibition Place",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04
4,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.032258,0.0,0.0,0.0,0.048387,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.177419,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.016129,0.016129,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.016129,0.0,0.0,0.016129,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.016129,0.016129,0.0,0.0,0.0,0.0,0.048387,0.016129,0.0,0.0,0.016129,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.016129,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.016129,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.032258,0.0,0.048387,0.0,0.016129,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.016129,0.0,0.0,0.016129
5,Church and Wellesley,0.0,0.012658,0.012658,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.012658,0.012658,0.0,0.0,0.025316,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.075949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.012658,0.0,0.0,0.0,0.025316,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.012658,0.037975,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.025316,0.0,0.0,0.012658,0.012658,0.0,0.0,0.0,0.0,0.012658,0.063291,0.0,0.0,0.012658,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.025316,0.025316,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.012658,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.025316,0.012658,0.0,0.037975,0.0,0.012658,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.012658,0.0,0.0,0.063291,0.0,0.0,0.0,0.0,0.0,0.012658,0.012658,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025316
6,"Commerce Court, Victoria Hotel",0.0,0.0,0.0,0.04,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.03,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.07,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0
7,Davisville,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.083333,0.027778,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.027778,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.027778,0.0,0.0,0.055556,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.027778,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,"Fairview, Henry Farm, Oriole",0.029412,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.029412,0.029412,0.014706,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.014706,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.102941,0.0,0.073529,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.014706,0.0,0.0,0.029412,0.0,0.029412,0.029412,0.0,0.0,0.0,0.0,0.014706,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044118,0.0,0.0,0.0,0.014706,0.014706,0.0,0.0,0.0,0.029412,0.014706,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014706,0.0,0.0,0.014706,0.0,0.0,0.0,0.014706,0.0,0.0,0.0,0.0,0.014706,0.0
9,"First Canadian Place, Underground city",0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.09,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.0,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.04,0.0,0.0,0.03,0.01,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0
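
Because each venue row has exactly one 1 in its one-hot columns, the group mean of a column is the share of that neighbourhood's venues falling in that category, and each row of `df_grouped` sums to 1. A minimal sketch on invented data (passing `dtype=float` is our choice here, to keep the arithmetic explicit):

```python
import pandas as pd

# Invented venues: A has 4 venues, B has 2
toy = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Café', 'Café', 'Park', 'Bank', 'Park', 'Park'],
})

# one column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(toy[['Venue Category']], prefix='', prefix_sep='', dtype=float)
onehot.insert(0, 'Neighbourhood', toy['Neighborhood'])

# the mean of a 0/1 column within a group is that category's share of the group
freq = onehot.groupby('Neighbourhood').mean()
print(freq)
```

Here half of A's venues are cafés, so `freq.loc['A', 'Café']` is 0.5, which is the same kind of number as the 0.095238 shown for Bank in the first row above.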


In [142]:
# write a function to sort the venues in descending order

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
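
To see what the function returns, here it is applied to a single made-up row laid out like one row of `df_grouped` (name first, then category frequencies). When two categories have the same frequency, their relative order depends on the sort and should not be relied on.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]  # skip the neighbourhood name
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Made-up frequency row for illustration
row = pd.Series({'Neighbourhood': 'Toyville', 'Bank': 0.1, 'Café': 0.6, 'Park': 0.3})
top_two = list(return_most_common_venues(row, 2))
print(top_two)
```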

In [274]:
# Create a dataframe and display the top 5 venues for each neighbourhood

num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighbourhood'] = df_grouped['Neighbourhood']

for ind in np.arange(df_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(df_grouped.iloc[ind, :], num_top_venues)

print(neighborhoods_venues_sorted.shape)
neighborhoods_venues_sorted

(31, 6)


| | Neighbourhood | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue |
|---|---|---|---|---|---|---|
| 0 | Bathurst Manor, Wilson Heights, Downsview North | Coffee Shop | Bank | Restaurant | Supermarket | Diner |
| 1 | Bedford Park, Lawrence Manor East | Sandwich Place | Italian Restaurant | Coffee Shop | Greek Restaurant | Thai Restaurant |
| 2 | Berczy Park | Coffee Shop | Cocktail Bar | Bakery | Cheese Shop | Seafood Restaurant |
| 3 | Brockton, Parkdale Village, Exhibition Place | Café | Performing Arts Venue | Coffee Shop | Bakery | Breakfast Spot |
| 4 | Central Bay Street | Coffee Shop | Sandwich Place | Café | Italian Restaurant | Bubble Tea Shop |
| 5 | Church and Wellesley | Coffee Shop | Sushi Restaurant | Japanese Restaurant | Gay Bar | Restaurant |
| 6 | Commerce Court, Victoria Hotel | Coffee Shop | Restaurant | Café | Hotel | Gym |
| 7 | Davisville | Pizza Place | Dessert Shop | Sandwich Place | Café | Italian Restaurant |
| 8 | Fairview, Henry Farm, Oriole | Clothing Store | Coffee Shop | Fast Food Restaurant | Restaurant | Accessories Store |
| 9 | First Canadian Place, Underground city | Coffee Shop | Café | Hotel | Japanese Restaurant | Restaurant |
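
The column labels above rely on a three-item suffix list with a fallback to "th", which is fine for five columns but would produce "21th" if `num_top_venues` were ever raised past 20. As a sketch, a hypothetical `ordinal` helper (not part of the notebook) that handles those cases:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st' (hypothetical helper)."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# build the same column labels as the cell above
columns = ['Neighbourhood'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 6)]
print(columns)
```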


#### We remove neighbourhoods that have no restaurant among their top 5 venues, as such neighbourhoods have not yet established a foundation for the restaurant business

In [312]:
# loop through each row and create a list of indices of the rows that meet our requirement
temp_list = []
for row in range(len(neighborhoods_venues_sorted)):
    for element in neighborhoods_venues_sorted.iloc[row,1:6].tolist():
        if 'Restaurant' in element:
            temp_list.append(neighborhoods_venues_sorted.iloc[row,0])
            break
temp_list

['Bathurst Manor, Wilson Heights, Downsview North',
 'Bedford Park, Lawrence Manor East',
 'Berczy Park',
 'Central Bay Street',
 'Church and Wellesley',
 'Commerce Court, Victoria Hotel',
 'Davisville',
 'Fairview, Henry Farm, Oriole',
 'First Canadian Place, Underground city',
 'Garden District, Ryerson',
 'High Park, The Junction South',
 'India Bazaar, The Beaches West',
 'Kensington Market, Chinatown, Grange Park',
 'Little Portugal, Trinity',
 "Queen's Park, Ontario Provincial Government",
 'Richmond, Adelaide, King',
 'Runnymede, Swansea',
 'St. James Town',
 'St. James Town, Cabbagetown',
 'Stn A PO Boxes',
 'Studio District',
 'The Danforth West, Riverdale',
 'Thorncliffe Park',
 'Toronto Dominion Centre, Design Exchange',
 'University of Toronto, Harbord',
 'Willowdale, Willowdale East']
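
The same filter can also be written without an explicit loop. The following is a minimal sketch, assuming a small stand-in table (`toy`, a hypothetical name) with three rows copied from the table above; it is an alternative formulation, not the code that produced the list above.

```python
import pandas as pd

# Hypothetical stand-in for neighborhoods_venues_sorted (three rows copied from the output above)
toy = pd.DataFrame({
    'Neighbourhood': ['Berczy Park', 'Brockton, Parkdale Village, Exhibition Place', 'Davisville'],
    '1st Most Common Venue': ['Coffee Shop', 'Café', 'Pizza Place'],
    '2nd Most Common Venue': ['Cocktail Bar', 'Performing Arts Venue', 'Dessert Shop'],
    '3rd Most Common Venue': ['Bakery', 'Coffee Shop', 'Sandwich Place'],
    '4th Most Common Venue': ['Cheese Shop', 'Bakery', 'Café'],
    '5th Most Common Venue': ['Seafood Restaurant', 'Breakfast Spot', 'Italian Restaurant'],
})

# Check every top-5 column for the substring 'Restaurant', then keep rows with at least one hit
venue_cols = toy.columns[1:6]
has_restaurant = toy[venue_cols].apply(lambda col: col.str.contains('Restaurant')).any(axis=1)
kept = toy.loc[has_restaurant, 'Neighbourhood'].tolist()
print(kept)
```

The boolean mask `has_restaurant` could also be used to filter the table directly, which would avoid building an intermediate list of names.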

In [411]:
# remove neighbourhoods that do not have any restaurant in their top 5 venues in 'neighborhoods_venues_sorted'
neighborhoods_restaurant_w = neighborhoods_venues_sorted[neighborhoods_venues_sorted['Neighbourhood'].isin(temp_list)].copy()  # copy, so later in-place changes do not act on a slice
print(neighborhoods_restaurant_w.shape) # check for new number of neighbourhoods
neighborhoods_restaurant_w.head()

(26, 6)


Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,"Bathurst Manor, Wilson Heights, Downsview North",Coffee Shop,Bank,Restaurant,Supermarket,Diner
1,"Bedford Park, Lawrence Manor East",Sandwich Place,Italian Restaurant,Coffee Shop,Greek Restaurant,Thai Restaurant
2,Berczy Park,Coffee Shop,Cocktail Bar,Bakery,Cheese Shop,Seafood Restaurant
4,Central Bay Street,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Bubble Tea Shop
5,Church and Wellesley,Coffee Shop,Sushi Restaurant,Japanese Restaurant,Gay Bar,Restaurant


In [412]:
# remove neighbourhoods that do not have any restaurant in their top 5 venues in df_grouped
df_grouped = df_grouped[df_grouped['Neighbourhood'].isin(temp_list)]
print(df_grouped.shape) # make sure we have the same number of neighbourhoods as the previous table
df_grouped.head() 

(26, 225)


Unnamed: 0,Neighbourhood,Accessories Store,Adult Boutique,Afghan Restaurant,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Shop,Bistro,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Station,Butcher,Café,Cajun / Creole Restaurant,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Cafeteria,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Eastern European Restaurant,Electronics Store,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Market,Martial Arts School,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Moving Target,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Poke Place,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.035088,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.035088,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.017544,0.052632,0.087719,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.017544,0.017544,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.032258,0.0,0.0,0.0,0.048387,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.177419,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.016129,0.016129,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.016129,0.0,0.0,0.016129,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.016129,0.016129,0.0,0.0,0.0,0.0,0.048387,0.016129,0.0,0.0,0.016129,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.016129,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.016129,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.032258,0.0,0.048387,0.0,0.016129,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.016129,0.0,0.0,0.016129
5,Church and Wellesley,0.0,0.012658,0.012658,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.012658,0.012658,0.0,0.0,0.025316,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.075949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.012658,0.0,0.0,0.0,0.025316,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.012658,0.037975,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.012658,0.0,0.0,0.025316,0.0,0.0,0.012658,0.012658,0.0,0.0,0.0,0.0,0.012658,0.063291,0.0,0.0,0.012658,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.025316,0.025316,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.012658,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.025316,0.012658,0.0,0.037975,0.0,0.012658,0.0,0.012658,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,0.012658,0.0,0.0,0.063291,0.0,0.0,0.0,0.0,0.0,0.012658,0.012658,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025316


#### Cluster the neighbourhoods by their venue categories

In [413]:
# set number of clusters
kclusters = 3

df_grouped_clustering = df_grouped.drop(columns='Neighbourhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=11).fit(df_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 1, 1, 1, 1, 0, 1, 1, 1])

In [414]:
# add clustering labels
neighborhoods_restaurant_w.insert(0, 'Cluster Labels', kmeans.labels_)

df_merged = Popular_Neighbourhoods[Popular_Neighbourhoods['Neighbourhood'].isin(temp_list)]

# join the cluster labels and top venues onto Popular_Neighbourhoods, which carries latitude/longitude for each neighbourhood
df_merged = df_merged.join(neighborhoods_restaurant_w.set_index('Neighbourhood'), on='Neighbourhood')
print(df_merged.shape)
df_merged.head()

(26, 11)


Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1,Coffee Shop,Sushi Restaurant,Creperie,Smoothie Shop,Burrito Place
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1,Coffee Shop,Clothing Store,Café,Japanese Restaurant,Middle Eastern Restaurant
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1,Coffee Shop,Café,American Restaurant,Clothing Store,Cocktail Bar
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,1,Coffee Shop,Cocktail Bar,Bakery,Cheese Shop,Seafood Restaurant
24,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,1,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Bubble Tea Shop


In [415]:
# Let's see our clustered groups
df_merged.groupby('Cluster Labels').count()

Unnamed: 0_level_0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
Cluster Labels,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
0,10,10,10,10,10,10,10,10,10,10
1,15,15,15,15,15,15,15,15,15,15
2,1,1,1,1,1,1,1,1,1,1


#### Examine clusters

In [416]:
# Cluster 0
df_merged.loc[df_merged['Cluster Labels'] == 0, df_merged.columns[[1] + list(range(5, df_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
28,North York,0,Coffee Shop,Bank,Restaurant,Supermarket,Diner
29,East York,0,Indian Restaurant,Sandwich Place,Yoga Studio,Supermarket,Gym
47,East Toronto,0,Sandwich Place,Fast Food Restaurant,Park,Sushi Restaurant,Pet Store
55,North York,0,Sandwich Place,Italian Restaurant,Coffee Shop,Greek Restaurant,Thai Restaurant
59,North York,0,Ramen Restaurant,Sandwich Place,Café,Shopping Mall,Restaurant
69,West Toronto,0,Thai Restaurant,Café,Mexican Restaurant,Bakery,Fried Chicken Joint
79,Central Toronto,0,Pizza Place,Dessert Shop,Sandwich Place,Café,Italian Restaurant
80,Downtown Toronto,0,Café,Italian Restaurant,Japanese Restaurant,Bar,Bookstore
81,West Toronto,0,Café,Coffee Shop,Pizza Place,Diner,Sushi Restaurant
96,Downtown Toronto,0,Coffee Shop,Pizza Place,Bakery,Restaurant,Café


In [417]:
# Cluster 1
df_merged.loc[df_merged['Cluster Labels'] == 1, df_merged.columns[[1] + list(range(5, df_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
4,Downtown Toronto,1,Coffee Shop,Sushi Restaurant,Creperie,Smoothie Shop,Burrito Place
9,Downtown Toronto,1,Coffee Shop,Clothing Store,Café,Japanese Restaurant,Middle Eastern Restaurant
15,Downtown Toronto,1,Coffee Shop,Café,American Restaurant,Clothing Store,Cocktail Bar
20,Downtown Toronto,1,Coffee Shop,Cocktail Bar,Bakery,Cheese Shop,Seafood Restaurant
24,Downtown Toronto,1,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Bubble Tea Shop
30,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Deli / Bodega
33,North York,1,Clothing Store,Coffee Shop,Fast Food Restaurant,Restaurant,Accessories Store
37,West Toronto,1,Bar,Coffee Shop,Vietnamese Restaurant,Vegetarian / Vegan Restaurant,Café
42,Downtown Toronto,1,Coffee Shop,Hotel,Café,Salad Place,Restaurant
48,Downtown Toronto,1,Coffee Shop,Restaurant,Café,Hotel,Gym


In [418]:
# Cluster 2
df_merged.loc[df_merged['Cluster Labels'] == 2, df_merged.columns[[1] + list(range(5, df_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
41,East Toronto,2,Greek Restaurant,Italian Restaurant,Coffee Shop,Furniture / Home Store,Ice Cream Shop


#### Visualization of the neighbourhood clusters

In [419]:
# first we need to convert the 'Cluster Labels' column to integer

df_merged['Cluster Labels'] = df_merged['Cluster Labels'].astype(int)
df_merged.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1,Coffee Shop,Sushi Restaurant,Creperie,Smoothie Shop,Burrito Place
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1,Coffee Shop,Clothing Store,Café,Japanese Restaurant,Middle Eastern Restaurant
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1,Coffee Shop,Café,American Restaurant,Clothing Store,Cocktail Bar
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,1,Coffee Shop,Cocktail Bar,Bakery,Cheese Shop,Seafood Restaurant
24,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,1,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Bubble Tea Shop


In [420]:
# create map

address = 'Toronto'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(df_merged['Latitude'], df_merged['Longitude'], df_merged['Neighbourhood'], df_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Decision for this phase: As the clusters and the map show, neighbourhoods in cluster 1 (marked in purple) are dominated by coffee shops, which are not where people usually go for a proper dinner. In addition, these neighbourhoods are mostly located in the very center of Toronto, which means a higher investment, especially in rent. For a new restaurant this can be very risky even though foot traffic is high. In contrast, neighbourhoods in clusters 0 and 2 (marked in red and green) are located further from the center and have a greater variety and number of restaurants. Therefore, we narrow our choice down to clusters 0 and 2.
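
The colour assignment follows from the `rainbow[cluster-1]` lookup in the map cell. Below is a small sketch of that lookup, assuming a three-entry palette that runs from purple through green to red; the colour names are approximate stand-ins for the hex values that `cm.rainbow` returns, and `palette` and `colour_of` are hypothetical names.

```python
# Approximate names for the three colours of cm.rainbow(np.linspace(0, 1, 3)) (assumption)
palette = ['purple', 'green', 'red']

# The map cell indexes the palette with cluster - 1,
# so label 0 becomes index -1 and wraps around to the last entry
colour_of = {cluster: palette[cluster - 1] for cluster in range(3)}
print(colour_of)
```

This is why cluster 0 appears in red rather than in the first palette colour.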

In [573]:
# Let's take a look at the complete info for cluster 0&2 neighbourhoods
Potential_neighbourhoods = df_merged[df_merged['Cluster Labels'] != 1 ]
print(Potential_neighbourhoods.shape)  # always double check the number of rows
Potential_neighbourhoods

(11, 11)


Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
28,M3H,North York,"Bathurst Manor, Wilson Heights, Downsview North",43.754328,-79.442259,0,Coffee Shop,Bank,Restaurant,Supermarket,Diner
29,M4H,East York,Thorncliffe Park,43.705369,-79.349372,0,Indian Restaurant,Sandwich Place,Yoga Studio,Supermarket,Gym
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,2,Greek Restaurant,Italian Restaurant,Coffee Shop,Furniture / Home Store,Ice Cream Shop
47,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,0,Sandwich Place,Fast Food Restaurant,Park,Sushi Restaurant,Pet Store
55,M5M,North York,"Bedford Park, Lawrence Manor East",43.733283,-79.41975,0,Sandwich Place,Italian Restaurant,Coffee Shop,Greek Restaurant,Thai Restaurant
59,M2N,North York,"Willowdale, Willowdale East",43.77012,-79.408493,0,Ramen Restaurant,Sandwich Place,Café,Shopping Mall,Restaurant
69,M6P,West Toronto,"High Park, The Junction South",43.661608,-79.464763,0,Thai Restaurant,Café,Mexican Restaurant,Bakery,Fried Chicken Joint
79,M4S,Central Toronto,Davisville,43.704324,-79.38879,0,Pizza Place,Dessert Shop,Sandwich Place,Café,Italian Restaurant
80,M5S,Downtown Toronto,"University of Toronto, Harbord",43.662696,-79.400049,0,Café,Italian Restaurant,Japanese Restaurant,Bar,Bookstore
81,M6S,West Toronto,"Runnymede, Swansea",43.651571,-79.48445,0,Café,Coffee Shop,Pizza Place,Diner,Sushi Restaurant


### 3. Step 3: We want to open our restaurant in a neighbourhood that already has an Asian food consumer base

#### We want to open our Chinese restaurant in a neighbourhood that already has some Asian restaurants and has built up a base of Asian food fans. Thus, we keep the neighbourhoods whose top 5 venues in the table include an Asian restaurant (Japanese, Indian, Thai, Ramen or Sushi), and remove the neighbourhoods with Postal Code ['M3H', 'M4K', 'M4S', 'M4X'].
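
The list of postal codes to remove was compiled by reading the table. As a hedged sketch, the same check could be derived programmatically; the example below assumes a four-row stand-in (`toy`, hypothetical) copied from the table above and a keyword pattern built from the cuisines named in the text.

```python
import pandas as pd

# Four rows copied from the Potential_neighbourhoods table above (postal code and top-5 venues only)
toy = pd.DataFrame({
    'Postal Code': ['M3H', 'M4H', 'M4K', 'M6S'],
    '1st Most Common Venue': ['Coffee Shop', 'Indian Restaurant', 'Greek Restaurant', 'Café'],
    '2nd Most Common Venue': ['Bank', 'Sandwich Place', 'Italian Restaurant', 'Coffee Shop'],
    '3rd Most Common Venue': ['Restaurant', 'Yoga Studio', 'Coffee Shop', 'Pizza Place'],
    '4th Most Common Venue': ['Supermarket', 'Supermarket', 'Furniture / Home Store', 'Diner'],
    '5th Most Common Venue': ['Diner', 'Gym', 'Ice Cream Shop', 'Sushi Restaurant'],
})

# Cuisine keywords named in the text; expressing them as a regex is an assumption of this sketch
asian_pattern = 'Japanese|Indian|Thai|Ramen|Sushi'

# Flag rows whose top-5 venues mention any keyword, then collect the postal codes that do not
venue_cols = [c for c in toy.columns if c.endswith('Most Common Venue')]
has_asian = toy[venue_cols].apply(lambda col: col.str.contains(asian_pattern)).any(axis=1)
remove = toy.loc[~has_asian, 'Postal Code'].tolist()
print(remove)
```

Deriving the list this way keeps the filter in step with the table if the venue data is refreshed.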

In [574]:
# remove disqualified neighbourhoods
remove = ['M3H','M4K','M4S','M4X']
Potential_neighbourhoods = Potential_neighbourhoods[~Potential_neighbourhoods['Postal Code'].isin(remove)]
Potential_neighbourhoods # now we have 7 neighbourhoods left

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
29,M4H,East York,Thorncliffe Park,43.705369,-79.349372,0,Indian Restaurant,Sandwich Place,Yoga Studio,Supermarket,Gym
47,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,0,Sandwich Place,Fast Food Restaurant,Park,Sushi Restaurant,Pet Store
55,M5M,North York,"Bedford Park, Lawrence Manor East",43.733283,-79.41975,0,Sandwich Place,Italian Restaurant,Coffee Shop,Greek Restaurant,Thai Restaurant
59,M2N,North York,"Willowdale, Willowdale East",43.77012,-79.408493,0,Ramen Restaurant,Sandwich Place,Café,Shopping Mall,Restaurant
69,M6P,West Toronto,"High Park, The Junction South",43.661608,-79.464763,0,Thai Restaurant,Café,Mexican Restaurant,Bakery,Fried Chicken Joint
80,M5S,Downtown Toronto,"University of Toronto, Harbord",43.662696,-79.400049,0,Café,Italian Restaurant,Japanese Restaurant,Bar,Bookstore
81,M6S,West Toronto,"Runnymede, Swansea",43.651571,-79.48445,0,Café,Coffee Shop,Pizza Place,Diner,Sushi Restaurant


### 4. Step 4: Now, using the Popular_venues table, we want to analyze the reputation of the restaurants in these 7 neighbourhoods. The reason: we often hear recommendations like "I heard XXX area has some really nice XX food". The joint reputation of an area's restaurants often forms consumers' first impression of it, which may influence how likely they are to visit.

In [519]:
# get venues in these 7 neighbourhoods
Venues_Reputation = Popular_venues[Popular_venues['Neighborhood'].isin(list(Potential_neighbourhoods['Neighbourhood']))]
print(Venues_Reputation.shape)                                  
Venues_Reputation

(196, 8)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category
546,Thorncliffe Park,43.705369,-79.349372,5b56812a4420d8002c58cefa,Costco,43.707051,-79.348093,Warehouse Store
547,Thorncliffe Park,43.705369,-79.349372,56c5d5e1cd106ec35067eeee,Fit4Less,43.705689,-79.346018,Gym
548,Thorncliffe Park,43.705369,-79.349372,4daf08e66e81e2dffdd4fe40,Iqbal Kebab & Sweet Centre,43.705923,-79.351521,Indian Restaurant
549,Thorncliffe Park,43.705369,-79.349372,4d8f75f5d4ec8cfa991e8c89,Bikram Yoga East York,43.70545,-79.351448,Yoga Studio
550,Thorncliffe Park,43.705369,-79.349372,51465d1ae4b0251c38008ee6,Hero Certified Burgers,43.705511,-79.347064,Burger Joint
551,Thorncliffe Park,43.705369,-79.349372,4d4c108e1ae437043a2ff060,Shoppers Drug Mart,43.70581,-79.347044,Pharmacy
552,Thorncliffe Park,43.705369,-79.349372,4b17edabf964a520c9c923e3,Subway,43.704596,-79.34967,Sandwich Place
553,Thorncliffe Park,43.705369,-79.349372,4bed9f2fbac3c9b6ad93fee9,Hakka Garden,43.704578,-79.34977,Indian Restaurant
554,Thorncliffe Park,43.705369,-79.349372,4b04708bf964a5202a5422e3,Iqbal foods,43.705751,-79.352054,Grocery Store
555,Thorncliffe Park,43.705369,-79.349372,4cf46801c9af6dcbdcf5ab7f,Petro-Canada,43.704058,-79.348094,Gas Station


In [520]:
# we only care about restaurants' reputation, so let's keep only venues that have "Restaurant" in their "Venue Category"
Venues_Reputation = Venues_Reputation[Venues_Reputation['Venue Category'].str.contains('Restaurant')]
Venues_Reputation.reset_index(drop = True, inplace = True)
print(Venues_Reputation.shape)
Venues_Reputation

(51, 8)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Thorncliffe Park,43.705369,-79.349372,4daf08e66e81e2dffdd4fe40,Iqbal Kebab & Sweet Centre,43.705923,-79.351521,Indian Restaurant
1,Thorncliffe Park,43.705369,-79.349372,4bed9f2fbac3c9b6ad93fee9,Hakka Garden,43.704578,-79.34977,Indian Restaurant
2,Thorncliffe Park,43.705369,-79.349372,5d30009ce299c90008a9820d,A&W,43.706275,-79.34467,Fast Food Restaurant
3,Thorncliffe Park,43.705369,-79.349372,4b564423f964a520520828e3,Swiss Chalet,43.707786,-79.344132,Restaurant
4,"India Bazaar, The Beaches West",43.668999,-79.315572,4c1169396e5dc9b61b10b02d,The Burger's Priest,43.666731,-79.315556,Fast Food Restaurant
5,"India Bazaar, The Beaches West",43.668999,-79.315572,4fd52f42e4b0b916eb02ab1b,O Sushi,43.666684,-79.316614,Sushi Restaurant
6,"India Bazaar, The Beaches West",43.668999,-79.315572,4ba0153bf964a520995837e3,Casa di Giorgio,43.666645,-79.315204,Italian Restaurant
7,"India Bazaar, The Beaches West",43.668999,-79.315572,4b1b0a31f964a520ebf623e3,Harvey's,43.666528,-79.315127,Restaurant
8,"India Bazaar, The Beaches West",43.668999,-79.315572,4d9620afb188721e907bef36,KFC,43.666624,-79.315916,Fast Food Restaurant
9,"Bedford Park, Lawrence Manor East",43.733283,-79.41975,588cf74e5289302f30e711e1,Darbar Persian Grill,43.735484,-79.420006,Restaurant


In [None]:
# loop over the venues and collect each one's rating from Foursquare
rating_list = []

for ID in Venues_Reputation['Venue ID']:
    print(ID)
            
    # create the API request URL
    url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(
        ID,
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION)
            
    # make the GET request; a response without a 'rating' key is recorded as 'missing'
    try:
        rating = requests.get(url).json()['response']['venue']['rating']
        rating_list.append(rating)
    
    except KeyError:
        rating_list.append('missing')
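
The lookup of the rating inside the response can be isolated in a small helper, which makes the missing-rating case easy to check without calling the API. This is a minimal sketch: `extract_rating` is a hypothetical helper, and the three payloads are made-up examples that only mimic the `['response']['venue']['rating']` path used above.

```python
def extract_rating(payload):
    """Return the rating from a decoded venue-details response, or None if it is absent."""
    try:
        return payload['response']['venue']['rating']
    except (KeyError, TypeError):
        # KeyError: a key on the path is missing; TypeError: a level is not a dict
        return None


# Made-up payloads for illustration only
rated = {'response': {'venue': {'name': 'Example venue', 'rating': 7.5}}}
unrated = {'response': {'venue': {'name': 'Example venue'}}}
empty_reply = {'response': {}}

print(extract_rating(rated), extract_rating(unrated), extract_rating(empty_reply))
```

Returning `None` instead of the string `'missing'` would be a design choice that keeps the resulting column numeric; the notebook itself uses `'missing'`.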

In [553]:
# let's check the ratings; one venue has no rating
print(len(rating_list))
rating_list

51


[7.5,
 6.3,
 6.4,
 5.9,
 7.7,
 7.4,
 7.1,
 6.2,
 5.5,
 7.3,
 7.3,
 6.9,
 6.9,
 6.9,
 6.8,
 7.0,
 6.3,
 6.3,
 8.2,
 7.6,
 6.8,
 6.8,
 6.5,
 6.2,
 6.4,
 6.2,
 6.1,
 'missing',
 9.0,
 7.7,
 7.3,
 6.9,
 6.4,
 6.4,
 5.8,
 8.9,
 9.2,
 8.2,
 8.1,
 7.6,
 7.3,
 7.6,
 6.4,
 7.3,
 8.3,
 7.2,
 7.2,
 7.1,
 7.4,
 7.3,
 6.9]

In [554]:
# insert the rating list as a new column into the venue table
# (assign returns a new DataFrame, which avoids pandas' SettingWithCopyWarning)

Venues_Reputation = Venues_Reputation.assign(Rating=rating_list)
print(Venues_Reputation.shape)
Venues_Reputation

(51, 9)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category,Rating
0,Thorncliffe Park,43.705369,-79.349372,4daf08e66e81e2dffdd4fe40,Iqbal Kebab & Sweet Centre,43.705923,-79.351521,Indian Restaurant,7.5
1,Thorncliffe Park,43.705369,-79.349372,4bed9f2fbac3c9b6ad93fee9,Hakka Garden,43.704578,-79.34977,Indian Restaurant,6.3
2,Thorncliffe Park,43.705369,-79.349372,5d30009ce299c90008a9820d,A&W,43.706275,-79.34467,Fast Food Restaurant,6.4
3,Thorncliffe Park,43.705369,-79.349372,4b564423f964a520520828e3,Swiss Chalet,43.707786,-79.344132,Restaurant,5.9
4,"India Bazaar, The Beaches West",43.668999,-79.315572,4c1169396e5dc9b61b10b02d,The Burger's Priest,43.666731,-79.315556,Fast Food Restaurant,7.7
5,"India Bazaar, The Beaches West",43.668999,-79.315572,4fd52f42e4b0b916eb02ab1b,O Sushi,43.666684,-79.316614,Sushi Restaurant,7.4
6,"India Bazaar, The Beaches West",43.668999,-79.315572,4ba0153bf964a520995837e3,Casa di Giorgio,43.666645,-79.315204,Italian Restaurant,7.1
7,"India Bazaar, The Beaches West",43.668999,-79.315572,4b1b0a31f964a520ebf623e3,Harvey's,43.666528,-79.315127,Restaurant,6.2
8,"India Bazaar, The Beaches West",43.668999,-79.315572,4d9620afb188721e907bef36,KFC,43.666624,-79.315916,Fast Food Restaurant,5.5
9,"Bedford Park, Lawrence Manor East",43.733283,-79.41975,588cf74e5289302f30e711e1,Darbar Persian Grill,43.735484,-79.420006,Restaurant,7.3


In [555]:
# filter out the venue without a rating and select the relevant columns for further analysis

Rated_Restaurants = Venues_Reputation[Venues_Reputation['Rating'] != 'missing']
Rated_Restaurants = Rated_Restaurants[['Neighborhood','Rating']]
print(Rated_Restaurants.shape)
Rated_Restaurants

(50, 2)


Unnamed: 0,Neighborhood,Rating
0,Thorncliffe Park,7.5
1,Thorncliffe Park,6.3
2,Thorncliffe Park,6.4
3,Thorncliffe Park,5.9
4,"India Bazaar, The Beaches West",7.7
5,"India Bazaar, The Beaches West",7.4
6,"India Bazaar, The Beaches West",7.1
7,"India Bazaar, The Beaches West",6.2
8,"India Bazaar, The Beaches West",5.5
9,"Bedford Park, Lawrence Manor East",7.3
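
An alternative way to drop unrated venues is to convert the column to numbers first and let pandas mark anything non-numeric as missing. The sketch below assumes a made-up three-row table (`toy`, hypothetical) with one `'missing'` entry.

```python
import pandas as pd

# Made-up ratings for illustration; 'missing' mimics the placeholder used in rating_list
toy = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'B'],
    'Rating': [7.5, 'missing', 6.3],
})

# errors='coerce' turns values that cannot be parsed as numbers into NaN
toy['Rating'] = pd.to_numeric(toy['Rating'], errors='coerce')

# drop the rows whose rating could not be parsed
rated = toy.dropna(subset=['Rating'])
print(rated.shape)
```

This does not depend on the row position of the unrated venue, so it would keep working if the venue list changes.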


In [556]:
# Getting counts of restaurants for each neighborhood

counts = Rated_Restaurants.groupby('Neighborhood').count()
counts

Unnamed: 0_level_0,Rating
Neighborhood,Unnamed: 1_level_1
"Bedford Park, Lawrence Manor East",9
"High Park, The Junction South",7
"India Bazaar, The Beaches West",5
"Runnymede, Swansea",8
Thorncliffe Park,4
"University of Toronto, Harbord",8
"Willowdale, Willowdale East",9


In [557]:
# Getting average ratings of restaurants for each neighborhood

Rated_Restaurants['Rating'] = pd.to_numeric(Rated_Restaurants['Rating'])
AvgRating = Rated_Restaurants.groupby('Neighborhood').mean()
AvgRating

Unnamed: 0_level_0,Rating
Neighborhood,Unnamed: 1_level_1
"Bedford Park, Lawrence Manor East",6.855556
"High Park, The Junction South",7.071429
"India Bazaar, The Beaches West",6.78
"Runnymede, Swansea",7.3375
Thorncliffe Park,6.525
"University of Toronto, Harbord",7.9125
"Willowdale, Willowdale East",6.755556


In [567]:
# summarize the results by merging the above two tables

summary = pd.merge(counts, AvgRating, on = 'Neighborhood')
summary.reset_index(level = 0, inplace = True)
summary.rename(columns = {'Neighborhood':'Neighbourhood', 'Rating_x': 'Restaurant Counts', 'Rating_y': 'Restaurant Avg. Rating'}, inplace = True)
summary.sort_values(by = 'Restaurant Avg. Rating', ascending = False)

Unnamed: 0,Neighbourhood,Restaurant Counts,Restaurant Avg. Rating
5,"University of Toronto, Harbord",8,7.9125
3,"Runnymede, Swansea",8,7.3375
1,"High Park, The Junction South",7,7.071429
0,"Bedford Park, Lawrence Manor East",9,6.855556
2,"India Bazaar, The Beaches West",5,6.78
6,"Willowdale, Willowdale East",9,6.755556
4,Thorncliffe Park,4,6.525
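
The count and the average can also be computed in a single `groupby` pass instead of two tables that are merged afterwards. The sketch below uses the ratings of two neighbourhoods copied from the table above; the column names `restaurant_count` and `avg_rating` are hypothetical.

```python
import pandas as pd

# Ratings copied from the Rated_Restaurants output above (two neighbourhoods only)
ratings = pd.DataFrame({
    'Neighborhood': ['Thorncliffe Park'] * 4 + ['India Bazaar, The Beaches West'] * 5,
    'Rating': [7.5, 6.3, 6.4, 5.9, 7.7, 7.4, 7.1, 6.2, 5.5],
})

# Named aggregation: one output column per (name=function) pair
summary = (
    ratings.groupby('Neighborhood')['Rating']
    .agg(restaurant_count='count', avg_rating='mean')
    .sort_values('avg_rating', ascending=False)
    .reset_index()
)
print(summary)
```

With this form no `Rating_x` / `Rating_y` suffixes appear, so the rename step is not needed.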


In [575]:
# to have a bigger picture, let's merge this table with the dataframe Potential_neighbourhoods

Potential_neighbourhoods = pd.merge(summary, Potential_neighbourhoods, on = 'Neighbourhood')
Potential_neighbourhoods.drop(['Cluster Labels'], axis = 1, inplace = True)
Potential_neighbourhoods

Unnamed: 0,Neighbourhood,Restaurant Counts,Restaurant Avg. Rating,Postal Code,Borough,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,"Bedford Park, Lawrence Manor East",9,6.855556,M5M,North York,43.733283,-79.41975,Sandwich Place,Italian Restaurant,Coffee Shop,Greek Restaurant,Thai Restaurant
1,"High Park, The Junction South",7,7.071429,M6P,West Toronto,43.661608,-79.464763,Thai Restaurant,Café,Mexican Restaurant,Bakery,Fried Chicken Joint
2,"India Bazaar, The Beaches West",5,6.78,M4L,East Toronto,43.668999,-79.315572,Sandwich Place,Fast Food Restaurant,Park,Sushi Restaurant,Pet Store
3,"Runnymede, Swansea",8,7.3375,M6S,West Toronto,43.651571,-79.48445,Café,Coffee Shop,Pizza Place,Diner,Sushi Restaurant
4,Thorncliffe Park,4,6.525,M4H,East York,43.705369,-79.349372,Indian Restaurant,Sandwich Place,Yoga Studio,Supermarket,Gym
5,"University of Toronto, Harbord",8,7.9125,M5S,Downtown Toronto,43.662696,-79.400049,Café,Italian Restaurant,Japanese Restaurant,Bar,Bookstore
6,"Willowdale, Willowdale East",9,6.755556,M2N,North York,43.77012,-79.408493,Ramen Restaurant,Sandwich Place,Café,Shopping Mall,Restaurant
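
`pd.merge` performs an inner join by default, so a neighbourhood present in only one of the two tables would drop out without warning. A minimal sketch of one way to check for that, using an outer join with `indicator=True`, is shown here. The frames and the names `left_demo`, `right_demo` and `unmatched` are hypothetical and do not come from the notebook.

```python
import pandas as pd

# Hypothetical toy frames; 'C' and 'D' each appear on one side only.
left_demo = pd.DataFrame({
    'Neighbourhood': ['A', 'B', 'C'],
    'Restaurant Counts': [9, 7, 5],
})
right_demo = pd.DataFrame({
    'Neighbourhood': ['A', 'B', 'D'],
    'Borough': ['North York', 'West Toronto', 'East York'],
})

# An outer join keeps every row; the '_merge' column records where it came from.
check = pd.merge(left_demo, right_demo, on='Neighbourhood',
                 how='outer', indicator=True)

# Rows not found in both tables would be lost by the default inner join.
unmatched = check.loc[check['_merge'] != 'both', ['Neighbourhood', '_merge']]
print(unmatched)
```

An empty `unmatched` frame would indicate that the inner join kept every neighbourhood.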


#### The final decision based on the table above is more subjective. Without further supporting information, I will remove the neighbourhoods whose average rating across all restaurants within a 500-meter radius is lower than 7. That leaves three neighbourhoods: University of Toronto, High Park, and Runnymede.

In [576]:
Potential_neighbourhoods = Potential_neighbourhoods[Potential_neighbourhoods['Restaurant Avg. Rating'] >= 7]
Potential_neighbourhoods

Unnamed: 0,Neighbourhood,Restaurant Counts,Restaurant Avg. Rating,Postal Code,Borough,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
1,"High Park, The Junction South",7,7.071429,M6P,West Toronto,43.661608,-79.464763,Thai Restaurant,Café,Mexican Restaurant,Bakery,Fried Chicken Joint
3,"Runnymede, Swansea",8,7.3375,M6S,West Toronto,43.651571,-79.48445,Café,Coffee Shop,Pizza Place,Diner,Sushi Restaurant
5,"University of Toronto, Harbord",8,7.9125,M5S,Downtown Toronto,43.662696,-79.400049,Café,Italian Restaurant,Japanese Restaurant,Bar,Bookstore
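
The cut-off of 7 is a judgment call, so it can help to keep it in a named constant and rank what remains. The following is a minimal sketch under that assumption. The constant `MIN_AVG_RATING` and the frame `shortlist_demo` are hypothetical names, and the ratings are copied from the merged table above.

```python
import pandas as pd

MIN_AVG_RATING = 7  # threshold used in the text; adjust to test sensitivity

# Average ratings copied from the merged table (seven candidate neighbourhoods).
shortlist_demo = pd.DataFrame({
    'Neighbourhood': [
        'Bedford Park, Lawrence Manor East',
        'High Park, The Junction South',
        'India Bazaar, The Beaches West',
        'Runnymede, Swansea',
        'Thorncliffe Park',
        'University of Toronto, Harbord',
        'Willowdale, Willowdale East',
    ],
    'Restaurant Avg. Rating': [6.855556, 7.071429, 6.78, 7.3375,
                               6.525, 7.9125, 6.755556],
})

# Keep rows at or above the threshold; .copy() avoids chained-assignment
# warnings if the result is modified later.
kept = shortlist_demo[shortlist_demo['Restaurant Avg. Rating'] >= MIN_AVG_RATING].copy()

# Rank the survivors from highest to lowest average rating.
kept = kept.sort_values('Restaurant Avg. Rating', ascending=False)
print(kept['Neighbourhood'].tolist())
```

Changing `MIN_AVG_RATING` shows how sensitive the shortlist is to the chosen cut-off.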


### Summary: The three neighbourhoods that 'survive' our filtering are statistically good locations to open a Chinese restaurant, each with its own pros and cons.
### University of Toronto, Harbord: highest average restaurant rating; closest to the city centre, which may mean higher rent; the second-highest number of venues (33) within a 500-meter radius, with good diversity, which should bring good foot traffic.
### Runnymede, Swansea: second-highest average restaurant rating; the highest number of venues in the area (39), which may bring the best foot traffic of the three options.
### High Park, The Junction South: lowest average restaurant rating of the three, only slightly above 7, whereas the highest-rated neighbourhood almost reaches 8; Thai restaurants are the most common venue in this area, which may mean a customer base for Asian food already exists.