# Introduction 

Discussion of the business problem and target audience of this project

Having recently opened its first overseas branch in New York City, USA, this project aims to inform Zooba’s management if there is a potential to expand its business to the North America market, specifically in Toronto, Canada. 

Zooba (www.zoobaeats.com) is a popular fast-food chain in Egypt that offers street food style of cuisine.  In 2019, it begun its first overseas operation with their branch in New York City (https://tinyurl.com/y38kccmo). This event is a positive sign and as an opportunity for this brand (and the Egyptian street food, in general) to be known in the broader North America market. Having an interest in Egyptian street food, it is interesting to know whether there would be a positive reception for this brand if it exanpds in Toronto, Canada.  

Certainly, among the target market is the Egyptian diaspora.  Thus, it is important to know if there is enough critical mass of Egyptians in Toronto, Canada.  These Egyptians will be the catalyst in promoting their kind of street food to their communities especially to the younger Egyptians who were born and grew-up in Toronto, Canada and have not given such experience.  Expectedly, it will expand to the non-Egyptian market who through the diaspora serving as the brand ambassadors.  By non-Egyptian, it means to capture both the Arab and non-Arab clienteles who would want to experience the taste and flavour of the Egyptian street food.  

# Data

Description of the data that will be used to solve the problem and its sources

The following data are required to answer the problem:  

No. 1 - The population of Egyptians who migrated and have settled in Toronto, Canada.  
No. 2 - The population of Toronto, Canada distributed by ethnic origins.  
No. 3 - The level of Arabs as ethnic origin in Toronto, Canada.    
No. 4 - The locations and concentrations of Arabic restaurants in Toronto, Canada.

No. 5 - The optimal location where the next Zooba branch in Toronto, Canada should be located.

    

# Methodology

This section discusses and describes the exploratory data, the inferential statistical testing performed, and the machine learnings used and its reason of use.

In [1]:
# The code was removed by Watson Studio for sharing.

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - xlrd


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    xlrd-1.2.0                 |           py36_0         188 KB  anaconda
    ca-certificates-2019.8.28  |                0         132 KB  anaconda
    openssl-1.1.1              |       h7b6447c_0         5.0 MB  anaconda
    certifi-2019.9.11          |           py36_0         154 KB  anaconda
    ------------------------------------------------------------
                                           Total:         5.5 MB

The following packages will be UPDATED:

    ca-certificates: 2019.5.15-1       --> 2019.8.28-0      anaconda
    certifi:         2019.6.16-py36_1  --> 2019.9.11-py36_0 anaconda
    openssl:         1.1.1d-h7b6447c_1 --> 1.1.1-h7b6447c_0 anaconda
    xlrd:            1.2.0-py36_0      --

Unnamed: 0,Type,Coverage,OdName,AREA,AreaName,REG,RegName,DEV,DevName,1980,...,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013
0,Immigrants,Foreigners,Afghanistan,935,Asia,5501,Southern Asia,902,Developing regions,16,...,2978,3436,3009,2652,2111,1746,1758,2203,2635,2004
1,Immigrants,Foreigners,Albania,908,Europe,925,Southern Europe,901,Developed regions,1,...,1450,1223,856,702,560,716,561,539,620,603
2,Immigrants,Foreigners,Algeria,903,Africa,912,Northern Africa,902,Developing regions,80,...,3616,3626,4807,3623,4005,5393,4752,4325,3774,4331
3,Immigrants,Foreigners,American Samoa,909,Oceania,957,Polynesia,902,Developing regions,0,...,0,0,1,0,0,0,0,0,0,0
4,Immigrants,Foreigners,Andorra,908,Europe,925,Southern Europe,901,Developed regions,0,...,0,0,1,1,0,0,0,0,1,1


### Remove columns that are not needed for the analysis.  Also, rename columns to make it more readable.

In [2]:
df = pd.DataFrame (df_data_0)
df = df.drop ("Type", 1)
df = df.drop ("Coverage" , 1)
df = df.drop ("AREA", 1)
df = df.drop ("REG" , 1)
df = df.drop ("DEV" , 1)
df.drop(df.index[[195,196]] , inplace=True)
df.rename(columns={'OdName': 'Country'}, inplace=True)
df.rename(columns={'AreaName': 'Continent'}, inplace=True)
df.rename(columns={'RegName' : 'Region'}, inplace=True)
df.rename(columns={'DevName' : 'Category'}, inplace=True)
df.head ()


Unnamed: 0,Country,Continent,Region,Category,1980,1981,1982,1983,1984,1985,...,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013
0,Afghanistan,Asia,Southern Asia,Developing regions,16,39,39,47,71,340,...,2978,3436,3009,2652,2111,1746,1758,2203,2635,2004
1,Albania,Europe,Southern Europe,Developed regions,1,0,0,0,0,0,...,1450,1223,856,702,560,716,561,539,620,603
2,Algeria,Africa,Northern Africa,Developing regions,80,67,71,69,63,44,...,3616,3626,4807,3623,4005,5393,4752,4325,3774,4331
3,American Samoa,Oceania,Polynesia,Developing regions,0,1,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
4,Andorra,Europe,Southern Europe,Developed regions,0,0,0,0,0,0,...,0,0,1,1,0,0,0,0,1,1


### Create a column total and add all years from 1980 to 2013 (the years available).

In [3]:
df['TOTAL'] = df.apply(lambda x: x['1980'] + x['1981'] + x['1982']+ x['1983']+ x['1984']  + x['1985']+ x['1986']+ x['1987']+ x['1988']+ x['1980']\
                       + x['1990']+ x['1991']+ x['1992']+ x['1993']+ x['1994']+ x['1995']+ x['1996']+ x['1997']+ x['1998']+ x['1999']+ x['2000']\
                       + x['2001']+ x['2002']+ x['2003']+ x['2004']+ x['2005']+ x['2006']+ x['2007']+ x['2008']+ x['2009']+ x['2010']+ x['2011']\
                       + x['2012']+ x['2013'],axis=1)

df.loc[1:38, ['Country', 'TOTAL']] 


Unnamed: 0,Country,TOTAL
1,Albania,15697
2,Algeria,69085
3,American Samoa,5
4,Andorra,15
5,Angola,2108
6,Antigua and Barbuda,930
7,Argentina,19426
8,Armenia,3310
9,Australia,23978
10,Austria,4992


In [4]:
df[(df.values  == "Africa" ) ]

Unnamed: 0,Country,Continent,Region,Category,1980,1981,1982,1983,1984,1985,...,2005,2006,2007,2008,2009,2010,2011,2012,2013,TOTAL
2,Algeria,Africa,Northern Africa,Developing regions,80,67,71,69,63,44,...,3626,4807,3623,4005,5393,4752,4325,3774,4331,69085
5,Angola,Africa,Middle Africa,Developing regions,1,3,6,6,4,3,...,295,184,106,76,62,61,39,70,45,2108
19,Benin,Africa,Western Africa,Developing regions,2,5,4,3,4,3,...,95,116,183,205,238,290,284,391,397,2840
23,Botswana,Africa,Southern Africa,Developing regions,10,1,3,3,7,4,...,7,11,8,28,15,42,53,64,76,398
27,Burkina Faso,Africa,Western Africa,Developing regions,2,1,3,2,3,2,...,91,147,136,139,162,186,144,269,322,2041
28,Burundi,Africa,Eastern Africa,Developing regions,0,0,0,0,1,2,...,626,468,614,448,566,529,604,684,480,8096
29,Cabo Verde,Africa,Western Africa,Developing regions,1,1,2,0,11,1,...,5,7,2,5,1,3,3,6,2,195
31,Cameroon,Africa,Middle Africa,Developing regions,9,2,16,7,8,13,...,604,697,1025,1279,1344,1800,1638,2507,2439,15992
33,Central African Republic,Africa,Middle Africa,Developing regions,4,3,1,0,0,0,...,49,18,30,28,19,26,18,45,169,550
34,Chad,Africa,Middle Africa,Developing regions,0,0,1,0,0,1,...,126,96,131,95,87,98,79,97,86,1639


In [5]:
total_by_continent = df['TOTAL'].groupby(df['Continent'])

print (total_by_continent)

<pandas.core.groupby.generic.SeriesGroupBy object at 0x7fcbf674a438>


## Analyzing with Folium and Providing Credentials

In [34]:
!conda install -c conda-forge geopy --yes 
!pip install foursquare

from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# tranforming json file into a pandas dataframe library
from pandas.io.json import json_normalize
!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

CLIENT_ID = '0PV4LA2UFQZUDGSHWMXMJ2ZKEFAV0PYZL01P15MBSJWOTVA5' # your Foursquare ID
CLIENT_SECRET = '5SVO3OBC4U0PNJTO2QEUBSZPOD5XJGMXGKR21J3LEPAYDISU' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Solving environment: done

# All requested packages already installed.

Solving environment: done

# All requested packages already installed.

Folium installed
Libraries imported.
Your credentails:
CLIENT_ID: 0PV4LA2UFQZUDGSHWMXMJ2ZKEFAV0PYZL01P15MBSJWOTVA5
CLIENT_SECRET:5SVO3OBC4U0PNJTO2QEUBSZPOD5XJGMXGKR21J3LEPAYDISU


In [35]:
!pip install beautifulsoup4
!pip install lxml
!pip install folium


import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Visualization
import matplotlib.pyplot
import seaborn as sns
# Too see full dataframe...
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', None)

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests

from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library

print('Libraries imported.')
# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory
# First We have to locate the file path and changed accordingly
url ='https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
# Any results you write to the current directory are saved as output

Libraries imported.


In [36]:
# Read File
df_wiki=pd.read_html(url)
#Check the type
type(df_wiki)
# Call the position where the table is stored
neighbourhood=df_wiki[0]
# Rename the Columns
neighbourhood.rename(columns={0:'Postcode', 1: 'Borough', 2: 'Neighbourhood'}, inplace=True)
# Eliminate the first row
neighbourhood=neighbourhood.drop([0])
# Eliminate "Not assigned", categorical values from "Borough" Column
neighbourhood=neighbourhood[neighbourhood.Borough !='Not assigned']
# Making DataFrame
neighbourhood=pd.DataFrame(neighbourhood)
# Merging rows with same Postcode
neighbourhood.set_index(['Postcode','Borough'],inplace=True)
merge_result = neighbourhood.groupby(level=['Postcode','Borough'], sort=False).agg( ','.join)
# Setting the index
serial_wise=merge_result.reset_index()
# Assign the 'Borough' column value to 'Neighborhood' where 'Not assigned' occurs
serial_wise.loc[4, 'Neighbourhood']='Queen\'s Park'
# Saving the file for future use!
serial_wise.to_excel('wikipedia_table.xls')
# Showing the Data Frame
df=pd.DataFrame(serial_wise)
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Harbourfront,Regent Park"
3,M6A,North York,"Lawrence Heights,Lawrence Manor"
4,M7A,Queen's Park,Queen's Park


In [37]:
search_query = 'Arabic'
radius = 2500
print(search_query + ' .... OK!')

Arabic .... OK!


In [38]:
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=0PV4LA2UFQZUDGSHWMXMJ2ZKEFAV0PYZL01P15MBSJWOTVA5&client_secret=5SVO3OBC4U0PNJTO2QEUBSZPOD5XJGMXGKR21J3LEPAYDISU&ll=43.653963,-79.387207&v=20180604&query=Arabic&radius=2500&limit=30'

In [39]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5d8eb591be70780039cd2789'},
 'response': {'venues': []}}

In [40]:
csv_url = 'https://cocl.us/Geospatial_data'
df_location = pd.read_csv(csv_url)
df_location.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [41]:
from bs4 import BeautifulSoup

In [42]:
wiki_Toronto_postal_codes = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
# get page text and parse using BeautifulSoup
source = requests.get(wiki_Toronto_postal_codes).text
soup = BeautifulSoup(source, 'lxml')

# table with data
pcode_table = soup.find('table',{'class':'wikitable sortable'})
table_data = []
# find all table rows
for tr in pcode_table.find_all('tr'):
    row = []
    # find all cells within row
    for td in tr.find_all('td'):
        # append extracted and trimmed cell text into row data  
        row.append(td.get_text(strip=True))
    # skip adding row to table_data in case is empty (header row)
    if len(row):
        table_data.append(row)
# create data frame from list of lists
df_wiki = pd.DataFrame(data=table_data, columns=['PostalCode', 'Borough', 'Neighbourhood'])
# filter out rows with Borough equal to 'Not assigned'
df_wiki = df_wiki[df_wiki.Borough != 'Not assigned']
df_wiki.head(5)

Unnamed: 0,PostalCode,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
5,M5A,Downtown Toronto,Regent Park
6,M6A,North York,Lawrence Heights


In [43]:
df_wiki.loc[df_wiki['Neighbourhood'] == 'Not assigned','Neighbourhood'] = df_wiki['Borough']
df_grouped = df_wiki.groupby(['PostalCode', 'Borough'])['Neighbourhood'].apply(lambda neighbourhoods: ', '.join(neighbourhoods)).to_frame().reset_index()
df_grouped.head(5)

csv_url = 'https://cocl.us/Geospatial_data'
df_location = pd.read_csv(csv_url)
df_location.head()
# rename column
df_location.rename(index=str, columns={'Postal Code': 'PostalCode'}, inplace=True)
# marge datasets on PostalCode column value
df_location = pd.merge(df_grouped, df_location, on='PostalCode')
# df_grouped shape
print('Data frame shape: ', df_grouped.shape)

print('Shape of df_location', df_location.shape)
df_location.head(10)

Data frame shape:  (103, 3)
Shape of df_location (103, 5)


Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


In [44]:
df_filtered = df_location[df_location['Borough'].str.contains('Toronto')]
df_filtered.head()

latlong_df = pd.read_csv("http://cocl.us/Geospatial_data")
latlong_df.columns = ['Postcode', 'Latitude', 'Longitude'] #renaming Postal Code to Post Code to match first dataframe
latlong_df

Unnamed: 0,Postcode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
5,M1J,43.744734,-79.239476
6,M1K,43.727929,-79.262029
7,M1L,43.711112,-79.284577
8,M1M,43.716316,-79.239476
9,M1N,43.692657,-79.264848


In [45]:
Toronto_address = 'Toronto, CA'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(Toronto_address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Toronto City are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Toronto City are 43.653963, -79.387207.


In [46]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, borough, neighbourhood in zip(df_filtered['Latitude'], df_filtered['Longitude'], df_filtered['Borough'], df_filtered['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

In [47]:
# rename column
df_location.rename(index=str, columns={'Postal Code': 'PostalCode'}, inplace=True)
# marge datasets on PostalCode column value
df_location = pd.merge(df_grouped, df_location, on='PostalCode')
print('Shape of df_location', df_location.shape)
df_location.head(10)

Shape of df_location (103, 7)


Unnamed: 0,PostalCode,Borough_x,Neighbourhood_x,Borough_y,Neighbourhood_y,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


In [48]:
Toronto_address = 'Toronto, CA'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(Toronto_address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of CA are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of CA are 43.653963, -79.387207.


In [49]:
# create map of Toronto using latitude and longitude values
map_Toronto = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, borough, neighbourhood in zip(df_filtered['Latitude'], df_filtered['Longitude'], df_filtered['Borough'], df_filtered['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Toronto)  
    
map_Toronto

In [50]:
CLIENT_ID = '0PV4LA2UFQZUDGSHWMXMJ2ZKEFAV0PYZL01P15MBSJWOTVA5' # your Foursquare ID
CLIENT_SECRET = '5SVO3OBC4U0PNJTO2QEUBSZPOD5XJGMXGKR21J3LEPAYDISU' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('My credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

My credentials:
CLIENT_ID: 0PV4LA2UFQZUDGSHWMXMJ2ZKEFAV0PYZL01P15MBSJWOTVA5
CLIENT_SECRET:5SVO3OBC4U0PNJTO2QEUBSZPOD5XJGMXGKR21J3LEPAYDISU


In [51]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, limit=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            limit)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)


# call getNearbyVenues for each Neighbourhood
toronto_venues = getNearbyVenues(names=df_filtered['Neighbourhood'],
                                   latitudes=df_filtered['Latitude'],
                                   longitudes=df_filtered['Longitude']
                                  )

toronto_venues.head(20)

The Beaches
The Danforth West, Riverdale
The Beaches West, India Bazaar
Studio District
Lawrence Park
Davisville North
North Toronto West
Davisville
Moore Park, Summerhill East
Deer Park, Forest Hill SE, Rathnelly, South Hill, Summerhill West
Rosedale
Cabbagetown, St. James Town
Church and Wellesley
Harbourfront, Regent Park
Ryerson, Garden District
St. James Town
Berczy Park
Central Bay Street
Adelaide, King, Richmond
Harbourfront East, Toronto Islands, Union Station
Design Exchange, Toronto Dominion Centre
Commerce Court, Victoria Hotel
Roselawn
Forest Hill North, Forest Hill West
The Annex, North Midtown, Yorkville
Harbord, University of Toronto
Chinatown, Grange Park, Kensington Market
CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara
Stn A PO Boxes 25 The Esplanade
First Canadian Place, Underground city
Christie
Dovercourt Village, Dufferin
Little Portugal, Trinity
Brockton, Exhibition Place, Parkdale Village
High Park, The 

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Beaches,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
1,The Beaches,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
2,The Beaches,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
3,The Beaches,43.676357,-79.293031,Glen Stewart Ravine,43.6763,-79.294784,Other Great Outdoors
4,The Beaches,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood
5,"The Danforth West, Riverdale",43.679557,-79.352188,Pantheon,43.677621,-79.351434,Greek Restaurant
6,"The Danforth West, Riverdale",43.679557,-79.352188,Dolce Gelato,43.677773,-79.351187,Ice Cream Shop
7,"The Danforth West, Riverdale",43.679557,-79.352188,MenEssentials,43.67782,-79.351265,Cosmetics Shop
8,"The Danforth West, Riverdale",43.679557,-79.352188,Mezes,43.677962,-79.350196,Greek Restaurant
9,"The Danforth West, Riverdale",43.679557,-79.352188,La Diperie,43.67753,-79.352295,Ice Cream Shop


In [53]:
toronto_venues.groupby('Neighbourhood').count()

Unnamed: 0_level_0,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighbourhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Adelaide, King, Richmond",100,100,100,100,100,100
Berczy Park,57,57,57,57,57,57
"Brockton, Exhibition Place, Parkdale Village",21,21,21,21,21,21
Business Reply Mail Processing Centre 969 Eastern,19,19,19,19,19,19
"CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara",16,16,16,16,16,16
"Cabbagetown, St. James Town",43,43,43,43,43,43
Central Bay Street,86,86,86,86,86,86
"Chinatown, Grange Park, Kensington Market",100,100,100,100,100,100
Christie,16,16,16,16,16,16
Church and Wellesley,90,90,90,90,90,90


In [54]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")
mid =  toronto_venues['Neighbourhood']
toronto_onehot.drop(labels=['Neighborhood'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Airport'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Airport Food Court'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Airport Gate'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Airport Lounge'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Airport Service'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Airport Terminal'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Antique Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Aquarium'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Art Gallery'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Art Museum'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Arts & Crafts Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Athletics & Sports'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Auto Workshop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Baby Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Bagel Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Bank'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Bar'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Baseball Stadium'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Basketball Stadium'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Beach'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Bed & Breakfast'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Beer Bar'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Beer Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Bike Rental / Bike Share'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Board Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Boat or Ferry'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Bookstore'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Boutique'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Building'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Bus Line'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Butcher'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Café'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Camera Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Cheese Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Chocolate Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Church'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Climbing Gym'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Clothing Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Cocktail Bar'], axis=1,inplace = True)
toronto_onehot.drop(labels=['College Arts Building'], axis=1,inplace = True)
toronto_onehot.drop(labels=['College Gym'], axis=1,inplace = True)
toronto_onehot.drop(labels=['College Rec Center'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Comic Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Concert Hall'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Convenience Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Cosmetics Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Coworking Space'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Dance Studio'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Deli / Bodega'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Department Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Dessert Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Diner'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Discount Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Dog Run'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Electronics Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Event Space'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Farmers Market'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Festival'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Fish Market'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Flea Market'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Flower Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Food'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Food & Drink Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Food Court'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Food Truck'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Fountain'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Fruit & Vegetable Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Furniture / Home Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Gaming Cafe'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Garden'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Garden Center'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Gastropub'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Gay Bar'], axis=1,inplace = True)
toronto_onehot.drop(labels=['General Entertainment'], axis=1,inplace = True)
toronto_onehot.drop(labels=['General Travel'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Gift Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Gluten-free Restaurant'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Gourmet Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Grocery Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Gym'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Gym / Fitness Center'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Harbor / Marina'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Health & Beauty Service'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Health Food Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Historic Site'], axis=1,inplace = True)
toronto_onehot.drop(labels=['History Museum'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Hobby Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Hospital'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Hostel'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Hotel Bar'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Ice Cream Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Indie Movie Theater'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Indoor Play Area'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Intersection'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Irish Pub'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Jazz Club'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Jewelry Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Juice Bar'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Lake'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Light Rail Station'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Lingerie Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Liquor Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Lounge'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Market'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Martial Arts Dojo'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Metro Station'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Miscellaneous Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Monument / Landmark'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Movie Theater'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Museum'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Music Venue'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Nightclub'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Office'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Opera House'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Optical Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Organic Grocery'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Other Great Outdoors'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Park'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Performing Arts Venue'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Pet Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Pharmacy'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Plane'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Playground'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Plaza'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Poke Place'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Post Office'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Poutine Place'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Pub'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Record Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Rental Car Location'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Roof Deck'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Salon / Barbershop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Scenic Lookout'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Sculpture Garden'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Shoe Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Shopping Mall'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Skate Park'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Skating Rink'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Smoke Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Spa'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Speakeasy'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Sporting Goods Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Sports Bar'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Stationery Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Steakhouse'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Strip Club'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Swim School'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Tailor Shop'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Tanning Salon'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Tennis Court'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Toy / Game Store'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Trail'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Train Station'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Wine Bar'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Wings Joint'], axis=1,inplace = True)
toronto_onehot.drop(labels=['Yoga Studio'], axis=1,inplace = True)

# insert back Neighborhood at front
toronto_onehot.insert(0, 'Neighbourhood', mid)

toronto_onehot.head()

Unnamed: 0,Neighbourhood,Afghan Restaurant,American Restaurant,Asian Restaurant,BBQ Joint,Bakery,Belgian Restaurant,Bistro,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Stop,Cajun / Creole Restaurant,Caribbean Restaurant,Chinese Restaurant,Coffee Shop,Colombian Restaurant,Comfort Food Restaurant,Creperie,Cuban Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,French Restaurant,Fried Chicken Joint,Greek Restaurant,Hookah Bar,Hotel,Hotpot Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jewish Restaurant,Korean Restaurant,Latin American Restaurant,Mac & Cheese Joint,Malay Restaurant,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant,New American Restaurant,Noodle House,Persian Restaurant,Pizza Place,Polish Restaurant,Portuguese Restaurant,Ramen Restaurant,Restaurant,Sake Bar,Salad Place,Sandwich Place,Seafood Restaurant,Smoothie Shop,Snack Place,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Stadium,Supermarket,Sushi Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant
0,The Beaches,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,The Beaches,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,The Beaches,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,The Beaches,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,The Beaches,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [55]:
toronto_grouped = toronto_onehot.groupby('Neighbourhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighbourhood,Afghan Restaurant,American Restaurant,Asian Restaurant,BBQ Joint,Bakery,Belgian Restaurant,Bistro,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Stop,Cajun / Creole Restaurant,Caribbean Restaurant,Chinese Restaurant,Coffee Shop,Colombian Restaurant,Comfort Food Restaurant,Creperie,Cuban Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,French Restaurant,Fried Chicken Joint,Greek Restaurant,Hookah Bar,Hotel,Hotpot Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jewish Restaurant,Korean Restaurant,Latin American Restaurant,Mac & Cheese Joint,Malay Restaurant,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant,New American Restaurant,Noodle House,Persian Restaurant,Pizza Place,Polish Restaurant,Portuguese Restaurant,Ramen Restaurant,Restaurant,Sake Bar,Salad Place,Sandwich Place,Seafood Restaurant,Smoothie Shop,Snack Place,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Stadium,Supermarket,Sushi Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant
0,"Adelaide, King, Richmond",0.0,0.03,0.02,0.0,0.02,0.0,0.0,0.01,0.02,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.08,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.03,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.02,0.0,0.0
1,Berczy Park,0.0,0.0,0.0,0.017544,0.035088,0.017544,0.017544,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.070175,0.0,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.017544,0.0,0.017544,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.017544,0.0,0.0
2,"Brockton, Exhibition Place, Parkdale Village",0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.047619,0.0,0.0,0.047619,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Business Reply Mail Processing Centre 969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Cabbagetown, St. James Town",0.0,0.0,0.0,0.0,0.046512,0.0,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.023256,0.093023,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.046512,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.046512,0.0,0.0,0.0,0.046512,0.0,0.0,0.023256,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0
6,Central Bay Street,0.0,0.011628,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.023256,0.034884,0.0,0.0,0.0,0.0,0.023256,0.139535,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.011628,0.0,0.023256,0.046512,0.023256,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.046512,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.011628,0.0,0.0,0.023256,0.034884,0.011628,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.0,0.0,0.011628,0.011628,0.0,0.0,0.011628,0.0,0.0
7,"Chinatown, Grange Park, Kensington Market",0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.01,0.01,0.02,0.01,0.0,0.0,0.02,0.04,0.03,0.0,0.02,0.0,0.0,0.0,0.01,0.01,0.02,0.04,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.06,0.0,0.04
8,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Church and Wellesley,0.011111,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.022222,0.022222,0.011111,0.0,0.0,0.011111,0.011111,0.088889,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011111,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.011111,0.011111,0.055556,0.0,0.0,0.0,0.0,0.0,0.022222,0.022222,0.011111,0.0,0.0,0.0,0.0,0.0,0.011111,0.011111,0.011111,0.0,0.011111,0.033333,0.011111,0.0,0.0,0.011111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.011111,0.011111,0.011111,0.011111,0.0,0.011111,0.011111


In [56]:
num_top_venues = 10

for neighbourhood in toronto_grouped['Neighbourhood']:
    print("---*"+neighbourhood+"*---")
    temp = toronto_grouped[toronto_grouped['Neighbourhood'] == neighbourhood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

---*Adelaide, King, Richmond*---
                           venue  freq
0                    Coffee Shop  0.08
1                Thai Restaurant  0.03
2            American Restaurant  0.03
3                          Hotel  0.03
4                   Burger Joint  0.03
5                     Restaurant  0.03
6                 Breakfast Spot  0.02
7               Sushi Restaurant  0.02
8  Vegetarian / Vegan Restaurant  0.02
9                         Bakery  0.02


---*Berczy Park*---
                     venue  freq
0              Coffee Shop  0.07
1       Seafood Restaurant  0.04
2                   Bakery  0.04
3       Italian Restaurant  0.04
4               Restaurant  0.02
5          Thai Restaurant  0.02
6  Comfort Food Restaurant  0.02
7                    Hotel  0.02
8         Greek Restaurant  0.02
9                 Tea Room  0.02


---*Brockton, Exhibition Place, Parkdale Village*---
                  venue  freq
0        Breakfast Spot  0.10
1           Coffee Shop  0.10
2       

In [57]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

# Preparing a data set venues_sorted in which all neighbourhoods of Toronto are listed along with its top 10 most commonvenues.
# This will help in better visualization of each cluster after they are formed.

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted['Neighbourhood'] = toronto_grouped['Neighbourhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighbourhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Hotel,American Restaurant,Restaurant,Thai Restaurant,Burger Joint,Sushi Restaurant,Vegetarian / Vegan Restaurant,Asian Restaurant,Bakery
1,Berczy Park,Coffee Shop,Seafood Restaurant,Italian Restaurant,Bakery,Restaurant,Creperie,Hotel,Comfort Food Restaurant,French Restaurant,Breakfast Spot
2,"Brockton, Exhibition Place, Parkdale Village",Breakfast Spot,Coffee Shop,Italian Restaurant,Restaurant,Caribbean Restaurant,Burrito Place,Stadium,Bakery,Fried Chicken Joint,Hookah Bar
3,Business Reply Mail Processing Centre 969 Eastern,Burrito Place,Pizza Place,Fast Food Restaurant,Brewery,Restaurant,Vietnamese Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",Coffee Shop,Vietnamese Restaurant,Falafel Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant


In [58]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop('Neighbourhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 0, 4, 4, 0, 0, 4, 0, 0], dtype=int32)

In [59]:
toronto_grouped_clustering.head(5)

Unnamed: 0,Afghan Restaurant,American Restaurant,Asian Restaurant,BBQ Joint,Bakery,Belgian Restaurant,Bistro,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Stop,Cajun / Creole Restaurant,Caribbean Restaurant,Chinese Restaurant,Coffee Shop,Colombian Restaurant,Comfort Food Restaurant,Creperie,Cuban Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,French Restaurant,Fried Chicken Joint,Greek Restaurant,Hookah Bar,Hotel,Hotpot Restaurant,Indian Restaurant,Italian Restaurant,Japanese Restaurant,Jewish Restaurant,Korean Restaurant,Latin American Restaurant,Mac & Cheese Joint,Malay Restaurant,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Modern European Restaurant,Molecular Gastronomy Restaurant,New American Restaurant,Noodle House,Persian Restaurant,Pizza Place,Polish Restaurant,Portuguese Restaurant,Ramen Restaurant,Restaurant,Sake Bar,Salad Place,Sandwich Place,Seafood Restaurant,Smoothie Shop,Snack Place,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Stadium,Supermarket,Sushi Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant
0,0.0,0.03,0.02,0.0,0.02,0.0,0.0,0.01,0.02,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.08,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.03,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.02,0.0,0.0
1,0.0,0.0,0.0,0.017544,0.035088,0.017544,0.017544,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.070175,0.0,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.017544,0.0,0.017544,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.017544,0.0,0.0
2,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.047619,0.0,0.0,0.047619,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [60]:
# add clustering labels
neighbourhoods_venues_sorted.insert (0, 'Cluster Labels', kmeans.labels_)


In [61]:
neighbourhoods_venues_sorted.head()

Unnamed: 0,Cluster Labels,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,"Adelaide, King, Richmond",Coffee Shop,Hotel,American Restaurant,Restaurant,Thai Restaurant,Burger Joint,Sushi Restaurant,Vegetarian / Vegan Restaurant,Asian Restaurant,Bakery
1,0,Berczy Park,Coffee Shop,Seafood Restaurant,Italian Restaurant,Bakery,Restaurant,Creperie,Hotel,Comfort Food Restaurant,French Restaurant,Breakfast Spot
2,0,"Brockton, Exhibition Place, Parkdale Village",Breakfast Spot,Coffee Shop,Italian Restaurant,Restaurant,Caribbean Restaurant,Burrito Place,Stadium,Bakery,Fried Chicken Joint,Hookah Bar
3,4,Business Reply Mail Processing Centre 969 Eastern,Burrito Place,Pizza Place,Fast Food Restaurant,Brewery,Restaurant,Vietnamese Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop
4,4,"CN Tower, Bathurst Quay, Island airport, Harbo...",Coffee Shop,Vietnamese Restaurant,Falafel Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant


In [62]:
df_filtered2 = df_filtered.rename(index=str, columns={'Neighbourhood': 'Neighbourhood'})
toronto_merged = df_filtered2
toronto_merged.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
37,M4E,East Toronto,The Beaches,43.676357,-79.293031
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
42,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572
43,M4M,East Toronto,Studio District,43.659526,-79.340923
44,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879


In [63]:
# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
toronto_merged = toronto_merged.join(neighbourhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')

toronto_merged.head() # check the last columns!

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
37,M4E,East Toronto,The Beaches,43.676357,-79.293031,4,Vietnamese Restaurant,Creperie,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,0,Greek Restaurant,Coffee Shop,Italian Restaurant,Restaurant,Sushi Restaurant,Pizza Place,Caribbean Restaurant,Bubble Tea Shop,Brewery,Indian Restaurant
42,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572,2,Pizza Place,Sandwich Place,Fast Food Restaurant,Coffee Shop,Burrito Place,Burger Joint,Brewery,Sushi Restaurant,Italian Restaurant,Fish & Chips Shop
43,M4M,East Toronto,Studio District,43.659526,-79.340923,0,Coffee Shop,American Restaurant,Italian Restaurant,Bakery,Middle Eastern Restaurant,Thai Restaurant,Comfort Food Restaurant,Latin American Restaurant,Brewery,Chinese Restaurant
44,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,4,Vietnamese Restaurant,Creperie,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant


# Visualization

In [64]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighbourhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters


In [65]:
#Cluster 1
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
41,East Toronto,0,Greek Restaurant,Coffee Shop,Italian Restaurant,Restaurant,Sushi Restaurant,Pizza Place,Caribbean Restaurant,Bubble Tea Shop,Brewery,Indian Restaurant
43,East Toronto,0,Coffee Shop,American Restaurant,Italian Restaurant,Bakery,Middle Eastern Restaurant,Thai Restaurant,Comfort Food Restaurant,Latin American Restaurant,Brewery,Chinese Restaurant
46,Central Toronto,0,Coffee Shop,Restaurant,Burger Joint,Chinese Restaurant,Mexican Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant
49,Central Toronto,0,Coffee Shop,Vietnamese Restaurant,American Restaurant,Restaurant,Pizza Place,Fried Chicken Joint,Sushi Restaurant,Supermarket,Eastern European Restaurant,Cupcake Shop
51,Downtown Toronto,0,Coffee Shop,Restaurant,Italian Restaurant,Pizza Place,Bakery,Breakfast Spot,Indian Restaurant,Chinese Restaurant,Caribbean Restaurant,Sandwich Place
52,Downtown Toronto,0,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Restaurant,Bubble Tea Shop,Burger Joint,Men's Store,Mediterranean Restaurant,Fast Food Restaurant,Hotel
53,Downtown Toronto,0,Coffee Shop,Bakery,Restaurant,Theater,Breakfast Spot,Mexican Restaurant,Hotel,French Restaurant,Brewery,Vietnamese Restaurant
54,Downtown Toronto,0,Coffee Shop,Middle Eastern Restaurant,Japanese Restaurant,Pizza Place,Italian Restaurant,Fast Food Restaurant,Restaurant,Ramen Restaurant,Tea Room,Bubble Tea Shop
55,Downtown Toronto,0,Coffee Shop,Restaurant,Hotel,Italian Restaurant,Breakfast Spot,Bakery,BBQ Joint,Seafood Restaurant,Pizza Place,American Restaurant
56,Downtown Toronto,0,Coffee Shop,Seafood Restaurant,Italian Restaurant,Bakery,Restaurant,Creperie,Hotel,Comfort Food Restaurant,French Restaurant,Breakfast Spot


In [66]:
#Cluster 2
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
64,Central Toronto,1,Sushi Restaurant,Fast Food Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant


In [67]:
#Cluster 3
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
42,East Toronto,2,Pizza Place,Sandwich Place,Fast Food Restaurant,Coffee Shop,Burrito Place,Burger Joint,Brewery,Sushi Restaurant,Italian Restaurant,Fish & Chips Shop
47,Central Toronto,2,Pizza Place,Sandwich Place,Sushi Restaurant,Italian Restaurant,Coffee Shop,Restaurant,Indian Restaurant,Fried Chicken Joint,Seafood Restaurant,Brewery
65,Central Toronto,2,Coffee Shop,Sandwich Place,Pizza Place,Jewish Restaurant,Burger Joint,Indian Restaurant,Vegetarian / Vegan Restaurant,American Restaurant,BBQ Joint,Hookah Bar
84,West Toronto,2,Coffee Shop,Italian Restaurant,Sushi Restaurant,Pizza Place,South American Restaurant,Sandwich Place,Smoothie Shop,Restaurant,Latin American Restaurant,Falafel Restaurant


In [68]:
#Cluster 4
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
48,Central Toronto,3,Restaurant,Falafel Restaurant,Cuban Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant


In [69]:
#Cluster 5
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
37,East Toronto,4,Vietnamese Restaurant,Creperie,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant
44,Central Toronto,4,Vietnamese Restaurant,Creperie,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant
45,Central Toronto,4,Hotel,Sandwich Place,Breakfast Spot,Vietnamese Restaurant,Falafel Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant
50,Downtown Toronto,4,Vietnamese Restaurant,Creperie,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant
63,Central Toronto,4,Vietnamese Restaurant,Creperie,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant,Falafel Restaurant
66,Downtown Toronto,4,Japanese Restaurant,Bakery,Sandwich Place,Restaurant,Theater,Chinese Restaurant,Italian Restaurant,Video Game Store,French Restaurant,Noodle House
67,Downtown Toronto,4,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Chinese Restaurant,Bakery,Mexican Restaurant,Dumpling Restaurant,Coffee Shop,Burger Joint,Caribbean Restaurant,Comfort Food Restaurant
68,Downtown Toronto,4,Coffee Shop,Vietnamese Restaurant,Falafel Restaurant,Cupcake Shop,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Ethiopian Restaurant
76,West Toronto,4,Supermarket,Bakery,Brewery,Middle Eastern Restaurant,Fast Food Restaurant,Dim Sum Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant
77,West Toronto,4,Coffee Shop,Asian Restaurant,Vietnamese Restaurant,Men's Store,French Restaurant,Restaurant,Pizza Place,Brewery,Cuban Restaurant,Brazilian Restaurant


# Discussion and Conclusion Section 

## This section discusses observations noted and any Recommendations/Conclusions made based on the results.

At this point in time, based on the analyses as a result of clustering above and the high population of Egyptian, Arabic, and African populace, there is high success for a Zooba outlet be opened in ANY locations within Toronto.  There is not much competition and the market is still big; thus, the opportunity is still wide. 