## 1) Introduction/Business Problem 

### 1.1 Background 
Toronto is the largest city in Ontario, Canada: a large, diverse metropolis that is home to 2.93 million people. Over 180 languages and dialects are spoken there, and 79 multilingual publications are published in the city, which reflects how multicultural it is. There are approximately 7,500 restaurants, bars and nightclubs in Toronto.
### 1.2 Business Proposition 
Sally Roberts has recently returned to Toronto after living in Japan for 5 years. While living in Tokyo, she fell in love with Japanese food, and she has decided to open a Japanese restaurant in Toronto. As more people look to improve their eating habits, Japanese food offers a delicious and healthy option, which is an additional reason why Sally feels the restaurant could succeed and turn a consistent profit. However, as with any business, opening a new restaurant requires serious consideration and good planning, and the most important consideration is location. This project therefore attempts to answer two questions: “Where should the investor open a Japanese restaurant?” and “How many similar restaurants are operating in the area under consideration?” The study aims to give the client a better understanding of Toronto’s boroughs and neighbourhoods in terms of restaurant density, which should help her decide which area would be best for a Japanese restaurant.
### 1.3 Target Audience 
This Japanese restaurant will be aimed at people from all walks of life, which includes Toronto’s population of 2.9 million as well as its 27.5 million annual tourists. A central location would add to the appeal, along with the price and menu choices.

## 2) Data 

### 2.1 Data Description 
To complete this study, data is needed to analyse restaurants in Toronto: data on Toronto’s boroughs and neighbourhoods, as well as data about the restaurants in these areas.
### 2.2 Data Sources 
Data will be obtained from the following sources:

- https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M provides the boroughs and neighbourhoods of Toronto with their postal codes.
- http://cocl.us/Geospatial_data provides the geographical coordinates (latitude and longitude) of each postal code.
- The Foursquare API provides the venue data, which is then filtered for this study: restaurants in Toronto and, specifically, Japanese restaurants in Toronto.

### Import all required Libraries

In [1]:
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files


#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, available since pandas 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library
from folium import plugins
from folium.plugins import MarkerCluster

print('Libraries imported.')

Libraries imported.


### Get Toronto Borough and Neighbourhood Data

In [2]:
# download the archived wiki page
url = "https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=1011037969"
wiki_page = requests.get(url)
wiki_page.raise_for_status() # stop early if the download failed

In [3]:
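# load the first table on the wiki page into a dataframe
from io import StringIO # newer pandas versions expect a file-like object rather than raw HTML

df_raw = pd.read_html(StringIO(wiki_page.text), header=0)[0]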
df_raw.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


#### Remove all rows where the Borough is 'Not assigned'

In [4]:
# Ignore cells where the Borough is not assigned.
df_new = df_raw[df_raw.Borough != 'Not assigned']

df_new.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


### Place all neighbourhoods with the same postal code in the same row
Neighbourhoods that share a postal code are separated by commas.


In [5]:
# Group by postal code and borough, joining the neighbourhood names with commas
df_toronto = df_new.groupby(['Postal Code', 'Borough'])['Neighbourhood'].apply(lambda x: ', '.join(x))
df_toronto = df_toronto.reset_index()

# Rename the columns as expected for the assignment
df_toronto.rename(columns={'Postal Code': 'PostalCode', 'Neighbourhood': 'Neighborhood'}, inplace=True)
df_toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


In [6]:
#Check shape of dataframe - rows & columns
df_toronto.shape

(103, 3)
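
The grouping step above can be illustrated on a small example. This is a minimal sketch on hypothetical toy rows (the `rows` frame is made up for illustration), assuming a postal code may appear on more than one row before grouping:

```python
import pandas as pd

# Hypothetical toy rows: postal code M5A is split across two rows
rows = pd.DataFrame({
    "Postal Code": ["M5A", "M5A", "M3A"],
    "Borough": ["Downtown Toronto", "Downtown Toronto", "North York"],
    "Neighbourhood": ["Regent Park", "Harbourfront", "Parkwoods"],
})

# One row per (postal code, borough); neighbourhood names joined with commas
combined = (
    rows.groupby(["Postal Code", "Borough"])["Neighbourhood"]
    .agg(", ".join)
    .reset_index()
)
print(combined)
```

Rows that already hold a single neighbourhood pass through unchanged, so the step is safe to run even when the source table is already one row per postal code.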

### Get the Longitude and Latitude Coordinates of each Neighbourhood

In [7]:
!pip install geocoder

Collecting geocoder
  Downloading geocoder-1.38.1-py2.py3-none-any.whl (98 kB)
Collecting ratelim
  Downloading ratelim-0.1.6-py2.py3-none-any.whl (4.0 kB)
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6


In [8]:
import geocoder
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

In [9]:
#read geospatial data file
url = 'https://cocl.us/Geospatial_data'
geotable = pd.read_csv(url)
geotable.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [10]:
#append lat & long into the dataframe
toronto = pd.merge(df_toronto, geotable, left_on='PostalCode', right_on='Postal Code', left_index=False, right_index=False)
toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Postal Code,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",M1B,43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",M1C,43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",M1E,43.763573,-79.188711
3,M1G,Scarborough,Woburn,M1G,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,M1H,43.773136,-79.239476


In [11]:
#drop extra Postal Code column
toronto.drop(columns = ['Postal Code'], inplace = True)

In [12]:
toronto.shape

(103, 5)
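
The row count is unchanged after the merge, which suggests every postal code found its coordinates. A more explicit check is possible with the `indicator` and `validate` options of `DataFrame.merge`. The sketch below uses hypothetical toy frames (including the made-up code `M9Z`) rather than the real data:

```python
import pandas as pd

# Hypothetical toy frames mirroring the column names used above
areas = pd.DataFrame({
    "PostalCode": ["M1B", "M1C", "M9Z"],  # "M9Z" is made up and has no coordinates
    "Borough": ["Scarborough", "Scarborough", "Unknown"],
})
coords = pd.DataFrame({
    "Postal Code": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

checked = areas.merge(
    coords.rename(columns={"Postal Code": "PostalCode"}),  # avoids a duplicate key column
    on="PostalCode",
    how="left",              # keep every area, matched or not
    validate="one_to_one",   # raise if a postal code is duplicated on either side
    indicator=True,          # adds a "_merge" column describing each row's match status
)

# Postal codes that found no coordinates
unmatched = checked.loc[checked["_merge"] == "left_only", "PostalCode"].tolist()
print(unmatched)
```

Renaming the key before merging also removes the need to drop the extra `Postal Code` column afterwards.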

### Use geopy library to get the latitude and longitude values of Toronto.

In [13]:
address = 'Toronto, Ontario'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.
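
geopy's `geocode` returns `None` when it finds no match, in which case `location.latitude` would raise an `AttributeError`. A minimal sketch of a guard follows; the helper name `coordinates_or_none` is hypothetical, and it takes the geocoding function as an argument so it can be tried without network access:

```python
from typing import Callable, Optional, Tuple


def coordinates_or_none(geocode: Callable[[str], object],
                        address: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude), or None when the geocoder finds nothing."""
    location = geocode(address)
    if location is None:
        return None
    return (location.latitude, location.longitude)
```

With the notebook's objects this would be called as `coordinates_or_none(geolocator.geocode, address)`.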


### Create a Map of Toronto 

In [14]:
# create map of Toronto using latitude and longitude values
map_Toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(toronto['Latitude'], toronto['Longitude'], toronto['Borough'], toronto['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Toronto)  
    
map_Toronto

### Exploring Toronto (Downtown, East, West, Central) only

In [15]:
toronto = toronto[toronto['Borough'].str.contains("Toronto")]
toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
37,M4E,East Toronto,The Beaches,43.676357,-79.293031
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
42,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572
43,M4M,East Toronto,Studio District,43.659526,-79.340923
44,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879


In [17]:
toronto

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
37,M4E,East Toronto,The Beaches,43.676357,-79.293031
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
42,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572
43,M4M,East Toronto,Studio District,43.659526,-79.340923
44,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879
45,M4P,Central Toronto,Davisville North,43.712751,-79.390197
46,M4R,Central Toronto,"North Toronto West, Lawrence Park",43.715383,-79.405678
47,M4S,Central Toronto,Davisville,43.704324,-79.38879
48,M4T,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316
49,M4V,Central Toronto,"Summerhill West, Rathnelly, South Hill, Forest...",43.686412,-79.400049


### Define Foursquare Credentials and Version

In [18]:
CLIENT_ID ='C3V5GZTFZEUTLE0W2F1JWHULJPSTK4HWCBHX51PZLVGISDTA' # your Foursquare ID
CLIENT_SECRET = 'XLKHQMUPV45ZRUVFU24UWHNFQC0HP51E1GKBEHR33TAB30Q1' # your Foursquare Secret
VERSION = '20210320' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: C3V5GZTFZEUTLE0W2F1JWHULJPSTK4HWCBHX51PZLVGISDTA
CLIENT_SECRET: XLKHQMUPV45ZRUVFU24UWHNFQC0HP51E1GKBEHR33TAB30Q1
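
Hard-coding and printing API credentials makes them visible to anyone who reads the notebook. One common alternative is to read them from environment variables and to mask them when printing. The sketch below is an assumption, not part of the original workflow: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are hypothetical.

```python
import os


def load_credentials(env=os.environ):
    """Read Foursquare credentials from environment variables (names are assumptions)."""
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first")
    return client_id, client_secret


def mask(secret, visible=4):
    """Keep the first few characters and replace the rest with asterisks."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)
```

A cell could then run `CLIENT_ID, CLIENT_SECRET = load_credentials()` and print `mask(CLIENT_SECRET)` instead of the full value.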


In [19]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        print(name, len(results))
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['PostalCode', 
                  'PostalCode Latitude', 
                  'PostalCode Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
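
The parsing step in `getNearbyVenues` indexes `v['venue']['categories'][0]` directly, which raises an `IndexError` if a venue comes back with an empty category list. Below is a minimal sketch of a more tolerant parser, assuming the same response keys used above. The function name `extract_venue_rows`, the sample payload and the `'Uncategorized'` placeholder are hypothetical.

```python
def extract_venue_rows(name, lat, lng, payload):
    """Flatten an explore-style payload into tuples, tolerating missing pieces."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or [{}]  # empty or missing list -> placeholder
        rows.append((
            name, lat, lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            categories[0].get("name", "Uncategorized"),
        ))
    return rows


# Hypothetical payload shaped like the keys accessed in getNearbyVenues
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "O Sushi",
               "location": {"lat": 43.666684, "lng": -79.316614},
               "categories": [{"name": "Sushi Restaurant"}]}},
    {"venue": {"name": "Venue Without Category",
               "location": {"lat": 43.0, "lng": -79.0},
               "categories": []}},
]}]}}

rows = extract_venue_rows("M4L", 43.668999, -79.315572, sample)
```

An empty or malformed payload yields an empty list rather than an exception, so one failed request does not abort the whole loop.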

### Get Venues 

In [20]:
toronto_venues = getNearbyVenues(names=toronto['PostalCode'],
                                   latitudes=toronto['Latitude'],
                                   longitudes=toronto['Longitude']
                                  )

M4E
M4E 5
M4K
M4K 43
M4L
M4L 20
M4M
M4M 38
M4N
M4N 3
M4P
M4P 8
M4R
M4R 16
M4S
M4S 34
M4T
M4T 4
M4V
M4V 15
M4W
M4W 4
M4X
M4X 44
M4Y
M4Y 70
M5A
M5A 45
M5B
M5B 100
M5C
M5C 78
M5E
M5E 57
M5G
M5G 64
M5H
M5H 93
M5J
M5J 100
M5K
M5K 100
M5L
M5L 100
M5N
M5N 3
M5P
M5P 4
M5R
M5R 22
M5S
M5S 33
M5T
M5T 61
M5V
M5V 14
M5W
M5W 99
M5X
M5X 100
M6G
M6G 16
M6H
M6H 14
M6J
M6J 44
M6K
M6K 23
M6N
M6N 4
M6P
M6P 24
M6R
M6R 15
M6S
M6S 38
M7A
M7A 30
M7Y
M7Y 19


In [21]:
toronto_venues.head()

Unnamed: 0,PostalCode,PostalCode Latitude,PostalCode Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,M4E,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
1,M4E,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
2,M4E,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
3,M4E,43.676357,-79.293031,Domino's Pizza,43.679058,-79.297382,Pizza Place
4,M4E,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood


### Finding nearby venues


In [22]:
len(toronto_venues['Venue'].unique()) #Number of venues

1040

In [23]:
len(toronto_venues['Venue Category'].unique()) #Number of types of venues

233

### Food Selling Venues

In [24]:
toronto_venues = toronto_venues[toronto_venues['Venue Category'].str.contains("Restaurant|Pizza|Burger|Diner|Salad|BBQ Joint|Food|Burrito")]
toronto_venues.head()

Unnamed: 0,PostalCode,PostalCode Latitude,PostalCode Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
1,M4E,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
3,M4E,43.676357,-79.293031,Domino's Pizza,43.679058,-79.297382,Pizza Place
6,M4K,43.679557,-79.352188,Pantheon,43.677621,-79.351434,Greek Restaurant
7,M4K,43.679557,-79.352188,Cafe Fiorentina,43.677743,-79.350115,Italian Restaurant
11,M4K,43.679557,-79.352188,Mezes,43.677962,-79.350196,Greek Restaurant


In [25]:
len(toronto_venues['Venue'].unique())

327

In [26]:
len(toronto_venues['Venue Category'].unique())

56

In [27]:
toronto_venues.shape

(475, 7)
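
Because `str.contains` matches substrings, the keyword `Food` also keeps categories such as `Health Food Store`, which appears in the filtered output above even though it is a shop rather than a restaurant. The sketch below contrasts a broad and a narrower pattern on hypothetical toy data; the narrower pattern is only an illustration, not the filter used in this study:

```python
import pandas as pd

# Hypothetical sample of category names, including a missing value
categories = pd.Series(
    ["Health Food Store", "Sushi Restaurant", "Pizza Place", "Trail", None]
)

# Broad pattern: any category containing one of the keywords
broad = categories.str.contains("Restaurant|Pizza|Food", na=False)

# Narrower pattern (an assumption for illustration): drops the bare "Food" keyword
strict = categories.str.contains("Restaurant|Pizza Place", na=False)

print(categories[broad].tolist())
print(categories[strict].tolist())
```

Passing `na=False` treats missing categories as non-matches, so the mask stays purely boolean.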

### Japanese Restaurants in Toronto

In [28]:
toronto_venues_Jpn = toronto_venues[toronto_venues['Venue Category'].str.contains("Japanese|Sushi")]
toronto_venues_Jpn.head()

Unnamed: 0,PostalCode,PostalCode Latitude,PostalCode Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
52,M4L,43.668999,-79.315572,O Sushi,43.666684,-79.316614,Sushi Restaurant
141,M4S,43.704324,-79.38879,Sakae Sushi,43.704944,-79.388704,Sushi Restaurant
150,M4S,43.704324,-79.38879,Hokkaido Sushi,43.708082,-79.389995,Sushi Restaurant
174,M4V,43.686412,-79.400049,Daeco Sushi,43.687838,-79.395652,Sushi Restaurant
195,M4X,43.667967,-79.367675,Kingyo Toronto,43.665895,-79.368415,Japanese Restaurant


In [29]:
toronto_venues_Jpn.shape

(59, 7)

In [30]:
# import k-means from clustering stage
from sklearn.cluster import KMeans
import requests
import numpy as np

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [31]:
map_toronto = folium.Map(location=[43.6532, -79.3832], zoom_start=11)
map_toronto

### Adding the Food Selling Venues to the Map as Markers

In [32]:
# add markers to map
for lat, lng, postal, venue in zip(toronto_venues['Venue Latitude'], toronto_venues['Venue Longitude'], toronto_venues['PostalCode'], toronto_venues['Venue']):
    label = '{}, {}'.format(postal, venue)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

### Restaurant Analysis

In [33]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add the postal code column back to the dataframe
toronto_onehot['PostalCode'] = toronto_venues['PostalCode'] 

# move the postal code column to the first position
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,PostalCode,Airport Food Court,American Restaurant,Asian Restaurant,BBQ Joint,Belgian Restaurant,Brazilian Restaurant,Burger Joint,Burrito Place,Cajun / Creole Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Diner,Doner Restaurant,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Health Food Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant,Moroccan Restaurant,New American Restaurant,Pizza Place,Portuguese Restaurant,Ramen Restaurant,Restaurant,Salad Place,Seafood Restaurant,Sushi Restaurant,Taiwanese Restaurant,Thai Restaurant,Theme Restaurant,Tibetan Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
1,M4E,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,M4E,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
6,M4K,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,M4K,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
11,M4K,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [34]:
toronto_grouped = toronto_onehot.groupby('PostalCode').mean().reset_index()
toronto_grouped.head()

Unnamed: 0,PostalCode,Airport Food Court,American Restaurant,Asian Restaurant,BBQ Joint,Belgian Restaurant,Brazilian Restaurant,Burger Joint,Burrito Place,Cajun / Creole Restaurant,Caribbean Restaurant,Chinese Restaurant,Colombian Restaurant,Comfort Food Restaurant,Cuban Restaurant,Diner,Doner Restaurant,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,German Restaurant,Gluten-free Restaurant,Greek Restaurant,Health Food Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant,Moroccan Restaurant,New American Restaurant,Pizza Place,Portuguese Restaurant,Ramen Restaurant,Restaurant,Salad Place,Seafood Restaurant,Sushi Restaurant,Taiwanese Restaurant,Thai Restaurant,Theme Restaurant,Tibetan Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,M4E,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M4K,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.444444,0.0,0.055556,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0
2,M4L,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.166667,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0
3,M4M,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.1,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.1,0.0,0.0,0.0,0.0
4,M4P,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [35]:
toronto_grouped.shape

(37, 57)
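
Taking the mean of one-hot columns per postal code gives the share of each category among that postal code's food venues, so every row sums to 1. Here is a minimal sketch on hypothetical toy data (the `venues` frame is made up for illustration):

```python
import pandas as pd

# Hypothetical venues: two in M4E, four in M4K
venues = pd.DataFrame({
    "PostalCode": ["M4E", "M4E", "M4K", "M4K", "M4K", "M4K"],
    "Venue Category": ["Health Food Store", "Pizza Place",
                       "Greek Restaurant", "Greek Restaurant",
                       "Italian Restaurant", "Pizza Place"],
})

# One indicator column per category, as floats so the mean is a proportion
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "PostalCode", venues["PostalCode"])

# Share of each category within each postal code
shares = onehot.groupby("PostalCode").mean()
print(shares.round(2))
```

Since these are proportions, a postal code with only two venues can show a frequency of 0.5 for a category that occurs just once, as M4E does in the output above.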

### Top 5 venue categories for each postal code

In [36]:
num_top_venues = 5

for hood in toronto_grouped['PostalCode']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['PostalCode'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----M4E----
                 venue  freq
0    Health Food Store   0.5
1          Pizza Place   0.5
2   Airport Food Court   0.0
3  Moroccan Restaurant   0.0
4    Indian Restaurant   0.0


----M4K----
                  venue  freq
0      Greek Restaurant  0.44
1    Italian Restaurant  0.17
2            Restaurant  0.11
3           Pizza Place  0.06
4  Caribbean Restaurant  0.06


----M4L----
                  venue  freq
0    Italian Restaurant  0.17
1     Food & Drink Shop  0.17
2      Sushi Restaurant  0.17
3            Restaurant  0.17
4  Fast Food Restaurant  0.17


----M4M----
                 venue  freq
0  American Restaurant   0.2
1   Seafood Restaurant   0.1
2   Italian Restaurant   0.1
3      Thai Restaurant   0.1
4                Diner   0.1


----M4P----
                venue  freq
0   Food & Drink Shop   1.0
1  Airport Food Court   0.0
2    Greek Restaurant   0.0
3   Indian Restaurant   0.0
4  Italian Restaurant   0.0


----M4R----
                  venue  freq
0           

In [37]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

### Top 10 venue categories for each postal code

In [38]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['PostalCode']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
toronto_venues_sorted = pd.DataFrame(columns=columns)
toronto_venues_sorted['PostalCode'] = toronto_grouped['PostalCode']

for ind in np.arange(toronto_grouped.shape[0]):
    toronto_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

toronto_venues_sorted

Unnamed: 0,PostalCode,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,Health Food Store,Pizza Place,Airport Food Court,Moroccan Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Mediterranean Restaurant
1,M4K,Greek Restaurant,Italian Restaurant,Restaurant,Pizza Place,Caribbean Restaurant,Indian Restaurant,American Restaurant,Tibetan Restaurant,Sushi Restaurant,Seafood Restaurant
2,M4L,Italian Restaurant,Food & Drink Shop,Sushi Restaurant,Restaurant,Fast Food Restaurant,Pizza Place,Airport Food Court,Molecular Gastronomy Restaurant,Indian Restaurant,Japanese Restaurant
3,M4M,American Restaurant,Seafood Restaurant,Italian Restaurant,Thai Restaurant,Diner,Latin American Restaurant,Middle Eastern Restaurant,Comfort Food Restaurant,Food,Mexican Restaurant
4,M4P,Food & Drink Shop,Airport Food Court,Greek Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Mediterranean Restaurant,Mexican Restaurant
5,M4R,Diner,Mexican Restaurant,Fast Food Restaurant,Chinese Restaurant,Restaurant,New American Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant
6,M4S,Pizza Place,Sushi Restaurant,Italian Restaurant,Diner,Greek Restaurant,Restaurant,Seafood Restaurant,Thai Restaurant,Indian Restaurant,Korean Restaurant
7,M4T,Restaurant,Airport Food Court,Moroccan Restaurant,Health Food Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Mediterranean Restaurant
8,M4V,Vietnamese Restaurant,Sushi Restaurant,Restaurant,Pizza Place,American Restaurant,Molecular Gastronomy Restaurant,Health Food Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant
9,M4X,Italian Restaurant,Pizza Place,Restaurant,Diner,Caribbean Restaurant,Japanese Restaurant,Chinese Restaurant,Indian Restaurant,Taiwanese Restaurant,Thai Restaurant
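
The column labels above rely on a three-item suffix list with a fallback to `th`, which works for ranks 1 to 10 but would produce `21th` for larger values. A small helper can cover the general case. The sketch below is a hypothetical alternative, not the code that produced the table:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


# Same column names as in the table above
columns = ["PostalCode"] + [f"{ordinal(i)} Most Common Venue" for i in range(1, 11)]
print(columns)
```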


### K-Means Clustering

In [41]:
# set number of clusters
kclusters = 3

toronto_grouped_clustering = toronto_grouped.drop(columns='PostalCode') # keyword form; the positional axis argument is not accepted by recent pandas versions

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:37]

array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 2, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0], dtype=int32)
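
The value `kclusters = 3` is set without a stated selection criterion, and the labels above put most postal codes into a single cluster. One common way to compare candidate values of k is the silhouette score. The sketch below is an assumption about how such a check could look; it runs on synthetic data rather than on `toronto_grouped_clustering`, and the helper name `silhouette_by_k` is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def silhouette_by_k(features, k_values, random_state=0):
    """Fit k-means for each k and return {k: silhouette score}."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    return scores


# Synthetic stand-in data: three tight, well-separated groups of 12 points each
rng = np.random.default_rng(0)
toy = np.vstack([rng.normal(loc=c, scale=0.05, size=(12, 4)) for c in (0.0, 0.5, 1.0)])

scores = silhouette_by_k(toy, range(2, 6))
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On the real frequency table the same function could be called as `silhouette_by_k(toronto_grouped_clustering, range(2, 8))`; a low score for every k would suggest the postal codes do not separate cleanly.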

In [42]:
# take an explicit copy so that adding a column does not trigger SettingWithCopyWarning
toronto_merged = toronto[0:37].copy()

# add clustering labels
# caution: kmeans.labels_ follows the row order of toronto_grouped, which omits postal codes
# without food venues (e.g. M4N), so this positional assignment can attach a label to the wrong postal code
toronto_merged['Cluster Labels'] = kmeans.labels_

# join toronto_venues_sorted onto toronto_merged to add the most common venue categories for each postal code
toronto_merged = toronto_merged.join(toronto_venues_sorted.set_index('PostalCode'), on='PostalCode')

toronto_merged.head() # check the last columns!


Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
37,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Health Food Store,Pizza Place,Airport Food Court,Moroccan Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Mediterranean Restaurant
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,0,Greek Restaurant,Italian Restaurant,Restaurant,Pizza Place,Caribbean Restaurant,Indian Restaurant,American Restaurant,Tibetan Restaurant,Sushi Restaurant,Seafood Restaurant
42,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,0,Italian Restaurant,Food & Drink Shop,Sushi Restaurant,Restaurant,Fast Food Restaurant,Pizza Place,Airport Food Court,Molecular Gastronomy Restaurant,Indian Restaurant,Japanese Restaurant
43,M4M,East Toronto,Studio District,43.659526,-79.340923,0,American Restaurant,Seafood Restaurant,Italian Restaurant,Thai Restaurant,Diner,Latin American Restaurant,Middle Eastern Restaurant,Comfort Food Restaurant,Food,Mexican Restaurant
44,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,0,,,,,,,,,,
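
The row for M4N above carries a cluster label but no venue categories: that postal code has no food venues, so it is absent from `toronto_grouped`, and the labels were attached by row position rather than by postal code. Attaching the labels to the grouped postal codes first and then merging on the key avoids this. The sketch below uses hypothetical toy frames, not the real data:

```python
import pandas as pd

# Hypothetical toy frames: "M4N" has no food venues, so it is missing from `grouped`
areas = pd.DataFrame({
    "PostalCode": ["M4M", "M4N", "M4P"],
    "Borough": ["East Toronto", "Central Toronto", "Central Toronto"],
})
grouped = pd.DataFrame({"PostalCode": ["M4M", "M4P"]})
labels = [0, 1]  # stand-in for kmeans.labels_, one label per row of `grouped`

# Attach each label to the postal code it was computed for
labelled = grouped.assign(**{"Cluster Labels": labels})

# Merge on the key, so postal codes without a label are dropped instead of mislabelled
merged = areas.merge(labelled, on="PostalCode", how="inner")
print(merged)
```

With the notebook's objects, `grouped` would correspond to `toronto_grouped[['PostalCode']]` and `areas` to `toronto`.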


In [43]:
import matplotlib.cm as cm
import matplotlib.colors as colors
# create map
map_clusters = folium.Map(location=[43.6532, -79.3832], zoom_start=11)

# set color scheme for the clusters: one rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['PostalCode'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)],
        fill=True,
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Cluster 1

In [44]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
37,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Health Food Store,Pizza Place,Airport Food Court,Moroccan Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Mediterranean Restaurant
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,0,Greek Restaurant,Italian Restaurant,Restaurant,Pizza Place,Caribbean Restaurant,Indian Restaurant,American Restaurant,Tibetan Restaurant,Sushi Restaurant,Seafood Restaurant
42,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,0,Italian Restaurant,Food & Drink Shop,Sushi Restaurant,Restaurant,Fast Food Restaurant,Pizza Place,Airport Food Court,Molecular Gastronomy Restaurant,Indian Restaurant,Japanese Restaurant
43,M4M,East Toronto,Studio District,43.659526,-79.340923,0,American Restaurant,Seafood Restaurant,Italian Restaurant,Thai Restaurant,Diner,Latin American Restaurant,Middle Eastern Restaurant,Comfort Food Restaurant,Food,Mexican Restaurant
44,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,0,,,,,,,,,,
45,M4P,Central Toronto,Davisville North,43.712751,-79.390197,0,Food & Drink Shop,Airport Food Court,Greek Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Mediterranean Restaurant,Mexican Restaurant
46,M4R,Central Toronto,"North Toronto West, Lawrence Park",43.715383,-79.405678,0,Diner,Mexican Restaurant,Fast Food Restaurant,Chinese Restaurant,Restaurant,New American Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant
48,M4T,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316,0,Restaurant,Airport Food Court,Moroccan Restaurant,Health Food Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Mediterranean Restaurant
49,M4V,Central Toronto,"Summerhill West, Rathnelly, South Hill, Forest...",43.686412,-79.400049,0,Vietnamese Restaurant,Sushi Restaurant,Restaurant,Pizza Place,American Restaurant,Molecular Gastronomy Restaurant,Health Food Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant
50,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529,0,,,,,,,,,,


### Cluster 2

In [45]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
47,M4S,Central Toronto,Davisville,43.704324,-79.38879,1,Pizza Place,Sushi Restaurant,Italian Restaurant,Diner,Greek Restaurant,Restaurant,Seafood Restaurant,Thai Restaurant,Indian Restaurant,Korean Restaurant
69,M5W,Downtown Toronto,Stn A PO Boxes,43.646435,-79.374846,1,Seafood Restaurant,Restaurant,Japanese Restaurant,Italian Restaurant,Comfort Food Restaurant,Fast Food Restaurant,Eastern European Restaurant,Food Truck,Diner,French Restaurant
76,M6H,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259,1,Middle Eastern Restaurant,Portuguese Restaurant,Airport Food Court,Moroccan Restaurant,Health Food Store,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant
78,M6K,West Toronto,"Brockton, Parkdale Village, Exhibition Place",43.636847,-79.428191,1,Italian Restaurant,Burrito Place,Restaurant,Airport Food Court,New American Restaurant,Indian Restaurant,Japanese Restaurant,Korean Restaurant,Latin American Restaurant,Mediterranean Restaurant


### Cluster 3

In [46]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
66,M5S,Downtown Toronto,"University of Toronto, Harbord",43.662696,-79.400049,2,Italian Restaurant,Japanese Restaurant,French Restaurant,Sushi Restaurant,Restaurant,Comfort Food Restaurant,Airport Food Court,Moroccan Restaurant,Indian Restaurant,Korean Restaurant
