# Battle of the Toronto Neighborhoods: The Mexican Restaurant

# Table of Contents

* [Introduction/Business Problem](#intro)
* [Data](#data)
* [Methodology](#methodology)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

# Introduction/Business Problem<a name="intro"></a>

### Introduction
According to Wikipedia, as of 2016 Toronto is Canada's most populous city and the fourth most populous in North America. It is diverse and multicultural, a global hub of commerce and entertainment, and its profile keeps growing. In Toronto's restaurant industry, many businesses have come and gone, and only a rare and resilient few have managed to thrive. That is why any restaurant hoping to stay afloat and succeed needs careful analysis and planning. Before looking at a piece of real estate or buying the first piece of cutlery, stakeholders should gather all the data they can and leverage it to make a sound business decision.

I hope this project illustrates how restaurant stakeholders can use data to plan the launch of a new restaurant in a metropolitan city like Toronto.

### Business Problem
My client, a wealthy and successful restaurateur, is eager to expand their business operations and brand here in Toronto. They hope to create an authentic Mexican dining experience rich in culture and cuisine. Because Toronto is a very competitive market, my client needs insightful data to decide whether to establish this restaurant in the city and, if so, in which neighborhood.

### Interested Audience
Although this project is about breaking ground for a Mexican restaurant, I believe the analyzed data is relevant to any restaurateur trying to make it in Toronto. The data science practices and techniques used throughout also make it a useful example and resource for any data science practitioner.

# Data <a name="data"></a>

###  Data Sources
These are the data sources used for this report:

* From Wikipedia, I will pull data on Toronto's neighborhoods. With this data, I will create dataframes and geographic maps. Here's the link:
https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M

* I will need geographic coordinates for each postal code, which are pulled from this csv file: https://cocl.us/Geospatial_data

* The Foursquare API will be used to obtain location and restaurant data. The venue categories it returns support further analysis to guide the choice of a location for the restaurant.

### Data Acquisition and Usage
I put the data scraped from Wikipedia into dataframes and cleaned it of 'Not assigned' and non-Toronto rows. I then merged it with the coordinate data so each neighborhood can be mapped geographically in Python. With the neighborhoods mapped, the Foursquare API is used to explore restaurant data for each one. This gives my client insight into which neighborhoods have a food scene and which do not. We can also see how frequently Mexican food appears across the city, which should help my client decide where to break ground for their restaurant.

# Methodology<a name="methodology"></a>

## Step 1: Wrangling location data from Wikipedia page

### Importing necessary packages for wrangling data and creating dataframes

In [1]:
!pip install beautifulsoup4 
from bs4 import BeautifulSoup                            # package for scraping data
!pip install urllib3 
import urllib3                                           # package for working with urls

import pandas as pd                                      # package to handle dataframes
import numpy as np                                       # package to handle scientific computing
import requests                                          # package to handle http requests
from requests.adapters import HTTPAdapter                # package containing transport adapters that requests  
                                                         # uses to define and maintain connections
from urllib3.util.retry import Retry                     # retry policy class provided by urllib3
from IPython.display import display_html                 # public API for display tools in IPython
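
The `HTTPAdapter` and `Retry` imports are not used in the cells shown in this section. As a rough sketch of what they are typically for, the snippet below mounts a retry policy on a `requests.Session`. The helper name `make_retrying_session`, the retry count, the back-off factor and the status list are illustrative assumptions, not values taken from this project.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(total_retries=3, backoff=0.5):
    """Return a requests.Session that retries transient HTTP failures.

    Hypothetical helper: the defaults are placeholders, not project settings.
    """
    retry = Retry(
        total=total_retries,
        backoff_factor=backoff,
        status_forcelist=[429, 500, 502, 503, 504],
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    # use the same retry policy for both URL schemes
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = make_retrying_session()
# session.get(url) could then stand in for requests.get(url) in later cells
```

Creating the session sends no request, so the sketch can be tried without network access.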



### Scraping the Wikipedia page for Toronto geospatial data

In [2]:
List_url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
source = requests.get(List_url).text

soup = BeautifulSoup(source, 'html.parser')              # the page is HTML, so use an HTML parser
table = soup.find('table')                               # first table on the page holds the postal codes

print(soup.title)
tab = str(table)
display_html(tab, raw=True)

<title>List of postal codes of Canada: M - Wikipedia</title>


Postal Code,Borough,Neighborhood
M1A,Not assigned,Not assigned
M2A,Not assigned,Not assigned
M3A,North York,Parkwoods
M4A,North York,Victoria Village
M5A,Downtown Toronto,"Regent Park, Harbourfront"
M6A,North York,"Lawrence Manor, Lawrence Heights"
M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
M8A,Not assigned,Not assigned
M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
M1B,Scarborough,"Malvern, Rouge"


### Create pandas dataframe from the scraped Wikipedia data

In [3]:
dfs = pd.read_html(tab)
df=dfs[0]
df.head()

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


## Step 2: Cleaning location data & creating and exploring the dataframe

### Dropping the rows where borough is 'Not assigned'

In [4]:
df1 = df[df.Borough != 'Not assigned']
df1.head()

Unnamed: 0,Postal Code,Borough,Neighborhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


### Combining the neighborhoods with same postal code

In [5]:
df2 = df1.groupby(['Postal Code','Borough'], sort=False).agg(', '.join)
df2.reset_index(inplace=True)
df2.head()

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
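
The first five rows above look unchanged because none of them share a postal code. To make the effect of the `groupby` visible, here is a minimal sketch on made-up rows in which one postal code appears on two lines; the toy dataframe is an assumption for illustration and is not taken from the scraped table.

```python
import pandas as pd

# Toy rows (hypothetical layout): postal code M5A is listed on two rows
toy = pd.DataFrame({
    "Postal Code": ["M5A", "M5A", "M4A"],
    "Borough": ["Downtown Toronto", "Downtown Toronto", "North York"],
    "Neighborhood": ["Regent Park", "Harbourfront", "Victoria Village"],
})

# Join the neighborhood names of each (postal code, borough) pair
combined = (
    toy.groupby(["Postal Code", "Borough"], sort=False)["Neighborhood"]
       .agg(", ".join)
       .reset_index()
)
print(combined)
```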


### Replacing the name of neighborhoods which are 'Not assigned' with names of borough

In [6]:
df2['Neighborhood'] = np.where(df2['Neighborhood'] == 'Not assigned',df2['Borough'], df2['Neighborhood'])
df2.head()

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
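
None of the rows displayed above has a 'Not assigned' neighborhood, so the output does not show the replacement happening. The sketch below applies the same `np.where` call to a small made-up dataframe; the postal codes and names in it are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: the first has a borough but no neighborhood name
toy = pd.DataFrame({
    "Postal Code": ["M0X", "M0Y"],
    "Borough": ["Example Borough", "Downtown Toronto"],
    "Neighborhood": ["Not assigned", "Regent Park, Harbourfront"],
})

# Where the neighborhood is 'Not assigned', fall back to the borough name
toy["Neighborhood"] = np.where(
    toy["Neighborhood"] == "Not assigned",
    toy["Borough"],
    toy["Neighborhood"],
)
print(toy)
```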


### Filtering for only 'Toronto' neighborhoods

In [7]:
df3 = df2[df2['Borough'].str.contains('Toronto',regex=False)]
df3.head()

Unnamed: 0,Postal Code,Borough,Neighborhood
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
9,M5B,Downtown Toronto,"Garden District, Ryerson"
15,M5C,Downtown Toronto,St. James Town
19,M4E,East Toronto,The Beaches


### Shape of 'Toronto' only boroughs data frame

In [8]:
print('The dataframe contains {} boroughs and {} neighborhoods.'.format(
        len(df3['Borough'].unique()),df3.shape[0]))

The dataframe contains 4 boroughs and 39 neighborhoods.


### Looking at neighborhood count by borough

In [9]:
print(df3.groupby('Borough').count()['Neighborhood'])

Borough
Central Toronto      9
Downtown Toronto    19
East Toronto         5
West Toronto         6
Name: Neighborhood, dtype: int64


## Step 3: Adding the longitudes and latitudes to dataset

### Importing csv file containing longitude and latitude data for various Canadian postal codes

In [10]:
long_lat = pd.read_csv('https://cocl.us/Geospatial_data')
long_lat.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


### Merging the two tables so that every postal code from the Step 2 dataset has a longitude and latitude

In [11]:
df4 = pd.merge(df3,long_lat, on = "Postal Code")
df4.tail()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
34,M5W,Downtown Toronto,Stn A PO Boxes,43.646435,-79.374846
35,M4X,Downtown Toronto,"St. James Town, Cabbagetown",43.667967,-79.367675
36,M5X,Downtown Toronto,"First Canadian Place, Underground city",43.648429,-79.38228
37,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316
38,M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558
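
`pd.merge` performs an inner join by default, so a postal code with no match in the coordinate file would be dropped silently. One way to check for that, sketched here on toy data, is a left join with `indicator=True`. The first two coordinate pairs are copied from the outputs in this notebook, while the postal code `M0Z` is a made-up code added to force a mismatch.

```python
import pandas as pd

neighborhoods = pd.DataFrame({
    "Postal Code": ["M5A", "M7A", "M0Z"],   # M0Z is hypothetical
    "Borough": ["Downtown Toronto", "Downtown Toronto", "Example Borough"],
})
coords = pd.DataFrame({
    "Postal Code": ["M5A", "M7A"],
    "Latitude": [43.654260, 43.662301],
    "Longitude": [-79.360636, -79.389494],
})

# Left join keeps every neighborhood row; _merge records where each row matched
checked = neighborhoods.merge(
    coords, on="Postal Code", how="left", indicator=True, validate="one_to_one"
)
missing = checked.loc[checked["_merge"] == "left_only", "Postal Code"].tolist()
print("Postal codes without coordinates:", missing)
```

An empty `missing` list would indicate that the default inner join loses no rows.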


## Step 4: Visualization and Clustering of Toronto neighborhoods

### Importing packages for geomapping visualizations and clustering

In [12]:
!conda install -c conda-forge geopy --yes                 
from geopy.geocoders import Nominatim                    # module for geocoding
!conda install -c conda-forge folium=0.5.0 --yes         
import folium                                            # module to create leaflet maps

import matplotlib.pyplot as plt                          # visualization package
import matplotlib.cm as cm                               # visualization package with colour mapping features
import matplotlib.colors as colors                       # visualization package to display various colors
from sklearn.cluster import KMeans                       # module used to cluster data points


### Use Geopy to get longitude and latitude of Toronto for Folium mapping

In [13]:
address = 'Toronto, CA'

geolocator = Nominatim(user_agent="TO")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


### Creating a visualization using the Folium package and Step 3's dataset to create mapping points

In [14]:
map_toronto = folium.Map(location=[43.6534817,-79.3839347],zoom_start=10)

for lat,lng,borough,neighborhood in zip(df4['Latitude'],df4['Longitude'],df4['Borough'],df4['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
    [lat,lng],
    radius=4,
    popup=label,
    color='red',
    fill=True,
    fill_color='white',
    fill_opacity=.7,
    parse_html=False).add_to(map_toronto)

map_toronto

### To use k-means clustering, I based my k value on the number of boroughs in the Toronto dataset

In [15]:
df4.Borough.unique()

array(['Downtown Toronto', 'East Toronto', 'West Toronto',
       'Central Toronto'], dtype=object)

### Creating the clusters by finding each neighborhood's best fit

In [16]:
k=4

toronto_clustering = df4.drop(['Postal Code','Borough','Neighborhood'],axis=1)
kmeans = KMeans(n_clusters = k,random_state=0).fit(toronto_clustering)
kmeans.labels_

array([3, 3, 3, 3, 0, 3, 3, 1, 3, 1, 3, 1, 0, 3, 1, 0, 3, 0, 2, 2, 2, 2,
       1, 2, 3, 1, 2, 3, 1, 2, 3, 2, 3, 3, 3, 3, 3, 3, 0], dtype=int32)

### Creating a new dataframe that includes each neighborhood's cluster value (Borough ID)

In [17]:
toronto_cluster = df4.copy()
toronto_cluster["Borough ID"] = kmeans.labels_
toronto_cluster.sort_values(by='Borough')

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Borough ID
19,M5N,Central Toronto,Roselawn,43.711695,-79.416936,2
20,M4P,Central Toronto,Davisville North,43.712751,-79.390197,2
21,M5P,Central Toronto,"Forest Hill North & West, Forest Hill Road Park",43.696948,-79.411307,2
23,M4R,Central Toronto,"North Toronto West, Lawrence Park",43.715383,-79.405678,2
24,M5R,Central Toronto,"The Annex, North Midtown, Yorkville",43.67271,-79.405678,3
26,M4S,Central Toronto,Davisville,43.704324,-79.38879,2
29,M4T,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316,2
31,M4V,Central Toronto,"Summerhill West, Rathnelly, South Hill, Forest...",43.686412,-79.400049,2
18,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,2
30,M5T,Downtown Toronto,"Kensington Market, Chinatown, Grange Park",43.653206,-79.400049,3
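
K-means only sees latitude and longitude, so a cluster label does not have to coincide with a borough. In the output above, for instance, The Annex (Central Toronto) shares cluster 3 with Kensington Market (Downtown Toronto). A cross-tabulation is one way to see how far the two groupings agree. The sketch below uses four rows copied from that output rather than the full `toronto_cluster` dataframe.

```python
import pandas as pd

# Four (Borough, Borough ID) pairs taken from the table above
sample = pd.DataFrame({
    "Borough": ["Central Toronto", "Central Toronto",
                "Central Toronto", "Downtown Toronto"],
    "Borough ID": [2, 2, 3, 3],
})

# Rows are boroughs, columns are cluster labels, cells are neighborhood counts
overlap = pd.crosstab(sample["Borough"], sample["Borough ID"])
print(overlap)
```

Applied to the full dataframe, a borough whose row has counts in several columns is split across clusters.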


### Creating visualization of Toronto's boroughs based on kmeans clusters

In [18]:
# create map
map_clusters = folium.Map(location=[43.6534817,-79.3839347],zoom_start=10)

# set color scheme for the clusters
x = np.arange(k)
ys = [i + x + (i*x)**2 for i in range(k)]
colors_array = cm.Dark2(np.linspace(0, 1, len(ys)))
palette = [colors.rgb2hex(i) for i in colors_array]


# add markers to the map while each cluster/borough is defined by a color
markers_colors = []
for lat, lon, neighbourhood, cluster in zip(toronto_cluster['Latitude'], toronto_cluster['Longitude'], toronto_cluster['Neighborhood'], toronto_cluster['Borough ID']):
    label = folium.Popup(' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=palette[cluster-1],
        fill=True,
        fill_color=palette[cluster-1],
        fill_opacity=.7, 
        parse_html=False).add_to(map_clusters)
    
map_clusters
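
Two details of the colour setup may be worth spelling out. The list `ys` is only used for its length, which equals `k`, and `palette[cluster-1]` maps label 0 to `palette[-1]`, the last colour, so every cluster still gets its own colour but shifted by one position. The following sketch, which assumes the same `Dark2` colormap and `k = 4`, builds the palette directly and compares the two indexings.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

k = 4

# One hex colour per cluster, sampled evenly from the Dark2 colormap
palette = [colors.rgb2hex(c) for c in cm.Dark2(np.linspace(0, 1, k))]

# Indexing used in the cell above: label 0 wraps around to the last colour
mapping_shifted = {label: palette[label - 1] for label in range(k)}

# Direct indexing: label i gets colour i
mapping_direct = {label: palette[label] for label in range(k)}

print(mapping_shifted)
print(mapping_direct)
```

Both mappings use the same set of colours, so the choice affects only which colour a given cluster receives.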

## Step 5: Using Foursquare to Explore Neighborhoods

### Import packages to handle json files

In [19]:
import json                                        # library to handle JSON files
from pandas import json_normalize                  # function that flattens nested JSON into a pandas dataframe

### Defining my Foursquare credentials

In [20]:
CLIENT_ID = 'MDJKWOUGU5UE4VAT1W3GCSDCLHX1TOX3KVVOMKECGCEXTKSU' 
CLIENT_SECRET = 'A33NRTEDKWQUV0SPJN3YKVSKOBDW2OXNLLGWYEHACYJ12HHD' 
VERSION = '20180605' 

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: MDJKWOUGU5UE4VAT1W3GCSDCLHX1TOX3KVVOMKECGCEXTKSU
CLIENT_SECRET:A33NRTEDKWQUV0SPJN3YKVSKOBDW2OXNLLGWYEHACYJ12HHD
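
Credentials written into a notebook are visible to anyone who reads it. A common alternative, sketched here as an assumption rather than as part of this project, is to read them from environment variables and mask them when printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, the two helper functions and the demo values are all hypothetical.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the client id and secret from a mapping such as os.environ."""
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first"
        )
    return client_id, client_secret


def mask(value, visible=4):
    """Show only the first few characters of a secret."""
    return value[:visible] + "*" * (len(value) - visible)


# Demo with a plain dict standing in for the real environment
demo_env = {
    "FOURSQUARE_CLIENT_ID": "EXAMPLEID1234",
    "FOURSQUARE_CLIENT_SECRET": "EXAMPLESECRET5678",
}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID:", mask(demo_id))
print("CLIENT_SECRET:", mask(demo_secret))
```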


### Defining Toronto's geolocation coordinates

In [21]:
address = 'Toronto, Canada'

geolocator = Nominatim(user_agent="TO_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


### Defining a function that requests Foursquare venue data as JSON and returns a dataframe

In [22]:
radius = 1000
LIMIT = 100

def getVenues(names, latitudes, longitudes):
    """Query Foursquare for venues near each neighborhood and return them in one dataframe."""

    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request
        venue_results = requests.get(url).json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in venue_results])

    # flatten the per-neighborhood lists into a single dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                  'Neighborhood_Latitude',
                  'Neighborhood_Longitude',
                  'Venue',
                  'Venue_Latitude',
                  'Venue_Longitude',
                  'Venue_Category']

    return nearby_venues
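
The chained lookup `['response']['groups'][0]['items']` and the `['categories'][0]` index both assume that every part of the reply is present, and either would raise an error on a reply that lacks one of them. The sketch below shows a more tolerant way to flatten a reply, using a hand-written payload shaped like the fields `getVenues` reads. The payload, the second venue and the helper `rows_from_payload` are assumptions for illustration; no request is sent.

```python
# Hypothetical payload containing only the fields that getVenues reads
fake_payload = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Tandem Coffee",
                   "location": {"lat": 43.653559, "lng": -79.361809},
                   "categories": [{"name": "Coffee Shop"}]}},
        {"venue": {"name": "Uncategorised Venue",
                   "location": {"lat": 43.65, "lng": -79.36},
                   "categories": []}},
    ]}]}
}


def rows_from_payload(payload, name, lat, lng):
    """Flatten one explore-style payload into tuples, tolerating missing parts."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # a venue without categories gets None instead of raising IndexError
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"],
                     category))
    return rows


rows = rows_from_payload(fake_payload, "Regent Park, Harbourfront",
                         43.654260, -79.360636)
for row in rows:
    print(row)
```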

### Creating a dataframe of venues near each Toronto neighborhood, then filtering it down to restaurants

In [25]:
to_venues = getVenues(names=toronto_cluster['Neighborhood'],
                            latitudes=toronto_cluster['Latitude'],
                            longitudes=toronto_cluster['Longitude'])
to_venues

Unnamed: 0,Neighborhood,Neighborhood_Latitude,Neighborhood_Longitude,Venue,Venue_Latitude,Venue_Longitude,Venue_Category
0,"Regent Park, Harbourfront",43.654260,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,"Regent Park, Harbourfront",43.654260,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,"Regent Park, Harbourfront",43.654260,-79.360636,Corktown Common,43.655618,-79.356211,Park
3,"Regent Park, Harbourfront",43.654260,-79.360636,The Distillery Historic District,43.650244,-79.359323,Historic Site
4,"Regent Park, Harbourfront",43.654260,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center
...,...,...,...,...,...,...,...
3191,"Business reply mail Processing Centre, South C...",43.662744,-79.321558,Leslie Jones,43.662960,-79.331834,American Restaurant
3192,"Business reply mail Processing Centre, South C...",43.662744,-79.321558,Tim Hortons,43.662644,-79.309945,Coffee Shop
3193,"Business reply mail Processing Centre, South C...",43.662744,-79.321558,Breakfast Club,43.662811,-79.310174,Breakfast Spot
3194,"Business reply mail Processing Centre, South C...",43.662744,-79.321558,Carters Landing,43.662414,-79.309898,Bistro


In [53]:
# keep categories containing 'staurant', which matches both 'Restaurant' and 'restaurant'
tor_venues = to_venues[to_venues['Venue_Category'].str.contains('staurant')]
tor_venues.head()

Unnamed: 0,Neighborhood,Neighborhood_Latitude,Neighborhood_Longitude,Venue,Venue_Latitude,Venue_Longitude,Venue_Category
6,"Regent Park, Harbourfront",43.65426,-79.360636,Impact Kitchen,43.656369,-79.35698,Restaurant
10,"Regent Park, Harbourfront",43.65426,-79.360636,Souk Tabule,43.653756,-79.35439,Mediterranean Restaurant
20,"Regent Park, Harbourfront",43.65426,-79.360636,Cluny Bistro & Boulangerie,43.650565,-79.357843,French Restaurant
24,"Regent Park, Harbourfront",43.65426,-79.360636,Mangia and Bevi Resto-Bar,43.65225,-79.366355,Italian Restaurant
31,"Regent Park, Harbourfront",43.65426,-79.360636,Sukhothai,43.658444,-79.365681,Thai Restaurant


In [55]:
print('There are {} restaurants in the city of Toronto.'.format(
    tor_venues.Venue_Category.count()))

There are 789 restaurants in the city of Toronto.


In [56]:
print('There are {} unique restaurant cuisines.'.format(len(tor_venues['Venue_Category'].unique())))

There are 53 unique restaurant cuisines.


### Creating two dataframes: one with only Mexican restaurants and one with no Mexican restaurants

In [57]:
MexTO = tor_venues.loc[tor_venues['Venue_Category'] == 'Mexican Restaurant']
MexTO.head()

Unnamed: 0,Neighborhood,Neighborhood_Latitude,Neighborhood_Longitude,Venue,Venue_Latitude,Venue_Longitude,Venue_Category
114,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,Como En Casa,43.66516,-79.384796,Mexican Restaurant
227,"Garden District, Ryerson",43.657162,-79.378937,Chipotle Mexican Grill,43.65686,-79.38091,Mexican Restaurant
415,The Beaches,43.676357,-79.293031,Xola,43.672603,-79.28808,Mexican Restaurant
640,Central Bay Street,43.657952,-79.387383,Chipotle Mexican Grill,43.65686,-79.38091,Mexican Restaurant
657,Central Bay Street,43.657952,-79.387383,Como En Casa,43.66516,-79.384796,Mexican Restaurant
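
Because each neighborhood is queried with a 1,000 m radius, a venue that lies within reach of two neighborhood centres appears once per neighborhood. In the rows above, Como En Casa and Chipotle Mexican Grill are each listed twice with identical coordinates. The sketch below rebuilds those five rows by hand and counts distinct venues; treating name plus coordinates as the identity of a venue is an assumption of this sketch.

```python
import pandas as pd

# The five rows shown above, reduced to the columns needed here
mex = pd.DataFrame({
    "Neighborhood": ["Queen's Park, Ontario Provincial Government",
                     "Garden District, Ryerson", "The Beaches",
                     "Central Bay Street", "Central Bay Street"],
    "Venue": ["Como En Casa", "Chipotle Mexican Grill", "Xola",
              "Chipotle Mexican Grill", "Como En Casa"],
    "Venue_Latitude": [43.66516, 43.65686, 43.672603, 43.65686, 43.66516],
    "Venue_Longitude": [-79.384796, -79.38091, -79.28808, -79.38091, -79.384796],
})

# Treat rows with the same name and coordinates as the same physical venue
unique_mex = mex.drop_duplicates(
    subset=["Venue", "Venue_Latitude", "Venue_Longitude"]
)
print(len(mex), "rows,", len(unique_mex), "distinct venues")
```

The same caveat applies to the restaurant count reported earlier, which counts neighborhood and venue pairs rather than distinct restaurants.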


In [84]:
noMexTO = tor_venues.loc[tor_venues['Venue_Category'] != 'Mexican Restaurant']
noMexTO.head()

Unnamed: 0,Neighborhood,Neighborhood_Latitude,Neighborhood_Longitude,Venue,Venue_Latitude,Venue_Longitude,Venue_Category
6,"Regent Park, Harbourfront",43.65426,-79.360636,Impact Kitchen,43.656369,-79.35698,Restaurant
10,"Regent Park, Harbourfront",43.65426,-79.360636,Souk Tabule,43.653756,-79.35439,Mediterranean Restaurant
20,"Regent Park, Harbourfront",43.65426,-79.360636,Cluny Bistro & Boulangerie,43.650565,-79.357843,French Restaurant
24,"Regent Park, Harbourfront",43.65426,-79.360636,Mangia and Bevi Resto-Bar,43.65225,-79.366355,Italian Restaurant
31,"Regent Park, Harbourfront",43.65426,-79.360636,Sukhothai,43.658444,-79.365681,Thai Restaurant


## Step 6: Clustering Foursquare data 

### Create a dataframe of restaurant-type frequencies for each neighborhood (built from `tor_venues`, so Mexican restaurants are included). We do this by using one-hot encoding to turn the restaurant type into numerical data

In [59]:
# one hot encoding
to_onehot = pd.get_dummies(tor_venues[['Venue_Category']], prefix="", prefix_sep="")

# add neighborhood column into dataframe
to_onehot['Neighborhood'] = tor_venues['Neighborhood'] 

# create dataframe with the mean frequency of each restaurant category by Neighborhood
to_rank = to_onehot.groupby('Neighborhood').mean().reset_index()
print(to_rank.shape)
to_rank

(37, 54)


Unnamed: 0,Neighborhood,American Restaurant,Asian Restaurant,Belgian Restaurant,Brazilian Restaurant,Cajun / Creole Restaurant,Cantonese Restaurant,Caribbean Restaurant,Chinese Restaurant,Comfort Food Restaurant,...,Sushi Restaurant,Syrian Restaurant,Taiwanese Restaurant,Tapas Restaurant,Thai Restaurant,Theme Restaurant,Tibetan Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant
0,Berczy Park,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,...,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.055556,0.0
1,"Brockton, Parkdale Village, Exhibition Place",0.038462,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.038462,...,0.0,0.0,0.0,0.038462,0.0,0.0,0.115385,0.0,0.076923,0.0
2,"Business reply mail Processing Centre, South C...",0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.222222,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0
3,Central Bay Street,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.166667,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.083333,0.0
4,Christie,0.029412,0.0,0.0,0.0,0.0,0.0,0.029412,0.029412,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.029412
5,Church and Wellesley,0.043478,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,...,0.130435,0.0,0.0,0.0,0.086957,0.043478,0.0,0.0,0.0,0.0
6,"Commerce Court, Victoria Hotel",0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.043478,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.086957,0.0
7,Davisville,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,...,0.179487,0.025641,0.0,0.0,0.025641,0.0,0.0,0.0,0.025641,0.025641
8,Davisville North,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,...,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04
9,"Dufferin, Dovercourt Village",0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,...,0.2,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.066667
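
Each value in the table above is the share of a neighborhood's restaurants that fall in a given category. The arithmetic is easier to follow on a tiny example, so the sketch below repeats the one-hot encoding and group mean on four made-up venues in two made-up neighborhoods, "A" and "B".

```python
import pandas as pd

# Hypothetical venues: three in neighborhood A, one in neighborhood B
toy = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B"],
    "Venue_Category": ["Sushi Restaurant", "Sushi Restaurant",
                       "Thai Restaurant", "Thai Restaurant"],
})

# One column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(toy[["Venue_Category"]], prefix="", prefix_sep="",
                        dtype=float)
onehot.insert(0, "Neighborhood", toy["Neighborhood"])

# The mean of the indicator columns is the category's share per neighborhood
freq = onehot.groupby("Neighborhood").mean().reset_index()
print(freq)
```

Neighborhood A has two sushi restaurants out of three venues, so its sushi share is 2/3 and its Thai share is 1/3, while neighborhood B is entirely Thai.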


### Printing each neighborhood's top 5 most common restaurant types

In [60]:
num_top_venues = 5

for hood in to_rank['Neighborhood']:
    print("----"+hood+"----")
    temp = to_rank[to_rank['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Berczy Park----
                           venue  freq
0            Japanese Restaurant  0.28
1                     Restaurant  0.22
2              French Restaurant  0.06
3  Vegetarian / Vegan Restaurant  0.06
4                Thai Restaurant  0.06


----Brockton, Parkdale Village, Exhibition Place----
                           venue  freq
0                     Restaurant  0.23
1             Tibetan Restaurant  0.12
2  Vegetarian / Vegan Restaurant  0.08
3             Italian Restaurant  0.08
4              Indian Restaurant  0.08


----Business reply mail Processing Centre, South Central Letter Processing Plant Toronto----
                  venue  freq
0  Fast Food Restaurant  0.22
1    Italian Restaurant  0.22
2      Sushi Restaurant  0.22
3   American Restaurant  0.11
4       Thai Restaurant  0.11


----Central Bay Street----
                           venue  freq
0               Sushi Restaurant  0.17
1            Japanese Restaurant  0.12
2               Ramen Restaurant  0.

### Creating a dataframe of each neighborhood's top 10 most common restaurant types

In [61]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
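
As a quick check of what this function returns, the sketch below calls it on a single hand-built row. The three frequencies are the rounded Berczy Park values from the printout above; the row itself is constructed for illustration and is not a row of `to_rank`.

```python
import pandas as pd


def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighborhood name) and rank the rest
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


# Hand-built row: neighborhood name first, then category frequencies
toy_row = pd.Series({
    "Neighborhood": "Berczy Park",
    "French Restaurant": 0.06,
    "Japanese Restaurant": 0.28,
    "Restaurant": 0.22,
})

top2 = list(return_most_common_venues(toy_row, 2))
print(top2)
```

When several categories share the same frequency, their relative order depends on the sort, which may be why tied categories can appear in a different order in different outputs.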

In [62]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
venues_sorted = pd.DataFrame(columns=columns)
venues_sorted['Neighborhood'] = to_rank['Neighborhood']

for ind in np.arange(to_rank.shape[0]):
    venues_sorted.iloc[ind, 1:] = return_most_common_venues(to_rank.iloc[ind, :], num_top_venues)

venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Japanese Restaurant,Restaurant,Seafood Restaurant,Comfort Food Restaurant,French Restaurant,Greek Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Middle Eastern Restaurant,American Restaurant
1,"Brockton, Parkdale Village, Exhibition Place",Restaurant,Tibetan Restaurant,Indian Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Japanese Restaurant,New American Restaurant,Caribbean Restaurant,Comfort Food Restaurant,Ethiopian Restaurant
2,"Business reply mail Processing Centre, South C...",Italian Restaurant,Fast Food Restaurant,Sushi Restaurant,Thai Restaurant,French Restaurant,American Restaurant,Cajun / Creole Restaurant,Cantonese Restaurant,Indian Restaurant,Indian Chinese Restaurant
3,Central Bay Street,Sushi Restaurant,Japanese Restaurant,Ramen Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Mexican Restaurant,Restaurant,Falafel Restaurant,Fast Food Restaurant,Middle Eastern Restaurant
4,Christie,Korean Restaurant,Mexican Restaurant,Ethiopian Restaurant,Indian Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Ramen Restaurant,Caribbean Restaurant,Chinese Restaurant,Eastern European Restaurant
5,Church and Wellesley,Japanese Restaurant,Sushi Restaurant,Restaurant,Caribbean Restaurant,Thai Restaurant,Italian Restaurant,Ramen Restaurant,Falafel Restaurant,Indian Restaurant,Mediterranean Restaurant
6,"Commerce Court, Victoria Hotel",Restaurant,Japanese Restaurant,Seafood Restaurant,Vegetarian / Vegan Restaurant,Thai Restaurant,French Restaurant,Italian Restaurant,Mediterranean Restaurant,Middle Eastern Restaurant,New American Restaurant
7,Davisville,Italian Restaurant,Sushi Restaurant,Fast Food Restaurant,Indian Restaurant,Restaurant,Middle Eastern Restaurant,Mexican Restaurant,Vietnamese Restaurant,Caribbean Restaurant,French Restaurant
8,Davisville North,Italian Restaurant,Fast Food Restaurant,Restaurant,Sushi Restaurant,Mexican Restaurant,Vietnamese Restaurant,Seafood Restaurant,Chinese Restaurant,Greek Restaurant,Vegetarian / Vegan Restaurant
9,"Dufferin, Dovercourt Village",Sushi Restaurant,Italian Restaurant,Brazilian Restaurant,Portuguese Restaurant,Vietnamese Restaurant,Mexican Restaurant,Mediterranean Restaurant,Restaurant,Middle Eastern Restaurant,Thai Restaurant


## Step 7: Clustering Toronto neighborhood venues

### Using the previous k value of 4, I created clusters based upon the Foursquare restaurant-type frequencies of each neighborhood

In [63]:
k_clusters = 4

to_fsquare_clustering = to_rank.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans1 = KMeans(init = "k-means++", n_clusters=k_clusters, random_state=None).fit(to_fsquare_clustering)

# check cluster labels generated for each row in the dataframe
kmeans1.labels_

array([1, 1, 2, 0, 0, 0, 1, 2, 2, 2, 1, 2, 0, 1, 2, 3, 0, 1, 2, 2, 2, 0,
       2, 1, 3, 2, 2, 1, 0, 1, 2, 2, 0, 0, 0, 1, 0], dtype=int32)
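
With `random_state=None`, k-means starts from a different random initialisation on every run, so the labels above may not be reproduced when the notebook is re-executed. The sketch below shows, on random stand-in data, that fixing the seed gives identical labels across runs. The data, the seed value and `n_init=10` are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Random stand-in for the neighborhood-by-category frequency table
rng = np.random.default_rng(0)
toy_freq = rng.random((12, 5))


def cluster(data, seed):
    """Run k-means with a fixed seed and return the labels."""
    model = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=seed)
    return model.fit(data).labels_


labels_a = cluster(toy_freq, seed=0)
labels_b = cluster(toy_freq, seed=0)
print(labels_a)
print("identical across runs:", np.array_equal(labels_a, labels_b))
```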

### Merging the Foursquare cluster labels with the sorted venue data

In [64]:
# add clustering labels
venues_sorted.insert(0, 'Cluster_ID', kmeans1.labels_ )

toronto_rank = noMexTO

# join the sorted venues onto df4 so each neighborhood keeps its latitude/longitude
to_merged = df4.join(venues_sorted.set_index('Neighborhood'), on='Neighborhood')

to_merged

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster_ID,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,2.0,Italian Restaurant,Restaurant,Indian Restaurant,Sushi Restaurant,Thai Restaurant,Middle Eastern Restaurant,Mediterranean Restaurant,French Restaurant,German Restaurant,Asian Restaurant
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,0.0,Japanese Restaurant,Italian Restaurant,Ramen Restaurant,Sushi Restaurant,Thai Restaurant,Middle Eastern Restaurant,Restaurant,Theme Restaurant,Indian Restaurant,Vegetarian / Vegan Restaurant
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,0.0,Japanese Restaurant,Italian Restaurant,Ramen Restaurant,Restaurant,Middle Eastern Restaurant,Falafel Restaurant,Fast Food Restaurant,German Restaurant,Mexican Restaurant,Modern European Restaurant
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1.0,Restaurant,Japanese Restaurant,Italian Restaurant,Seafood Restaurant,Comfort Food Restaurant,Fast Food Restaurant,French Restaurant,German Restaurant,Vegetarian / Vegan Restaurant,Middle Eastern Restaurant
4,M4E,East Toronto,The Beaches,43.676357,-79.293031,0.0,Japanese Restaurant,Caribbean Restaurant,Restaurant,Vegetarian / Vegan Restaurant,Greek Restaurant,French Restaurant,Mediterranean Restaurant,Mexican Restaurant,Ramen Restaurant,Indian Restaurant
5,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,1.0,Japanese Restaurant,Restaurant,Seafood Restaurant,Comfort Food Restaurant,French Restaurant,Greek Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Middle Eastern Restaurant,American Restaurant
6,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,0.0,Sushi Restaurant,Japanese Restaurant,Ramen Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Mexican Restaurant,Restaurant,Falafel Restaurant,Fast Food Restaurant,Middle Eastern Restaurant
7,M6G,Downtown Toronto,Christie,43.669542,-79.422564,0.0,Korean Restaurant,Mexican Restaurant,Ethiopian Restaurant,Indian Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Ramen Restaurant,Caribbean Restaurant,Chinese Restaurant,Eastern European Restaurant
8,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568,1.0,Japanese Restaurant,Restaurant,Italian Restaurant,Sushi Restaurant,Seafood Restaurant,Brazilian Restaurant,Fast Food Restaurant,Vegetarian / Vegan Restaurant,Mediterranean Restaurant,New American Restaurant
9,M6H,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259,2.0,Sushi Restaurant,Italian Restaurant,Brazilian Restaurant,Portuguese Restaurant,Vietnamese Restaurant,Mexican Restaurant,Mediterranean Restaurant,Restaurant,Middle Eastern Restaurant,Thai Restaurant


### Dropping all rows that contain 'NaN' values

In [65]:
# drop every row that has a missing value in any column, then renumber the index from 0
to_merged.dropna(axis=0, how='any',inplace=True)
to_merged.reset_index(inplace=True, drop=True)
to_merged

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster_ID,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,2.0,Italian Restaurant,Restaurant,Indian Restaurant,Sushi Restaurant,Thai Restaurant,Middle Eastern Restaurant,Mediterranean Restaurant,French Restaurant,German Restaurant,Asian Restaurant
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,0.0,Japanese Restaurant,Italian Restaurant,Ramen Restaurant,Sushi Restaurant,Thai Restaurant,Middle Eastern Restaurant,Restaurant,Theme Restaurant,Indian Restaurant,Vegetarian / Vegan Restaurant
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,0.0,Japanese Restaurant,Italian Restaurant,Ramen Restaurant,Restaurant,Middle Eastern Restaurant,Falafel Restaurant,Fast Food Restaurant,German Restaurant,Mexican Restaurant,Modern European Restaurant
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1.0,Restaurant,Japanese Restaurant,Italian Restaurant,Seafood Restaurant,Comfort Food Restaurant,Fast Food Restaurant,French Restaurant,German Restaurant,Vegetarian / Vegan Restaurant,Middle Eastern Restaurant
4,M4E,East Toronto,The Beaches,43.676357,-79.293031,0.0,Japanese Restaurant,Caribbean Restaurant,Restaurant,Vegetarian / Vegan Restaurant,Greek Restaurant,French Restaurant,Mediterranean Restaurant,Mexican Restaurant,Ramen Restaurant,Indian Restaurant
5,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,1.0,Japanese Restaurant,Restaurant,Seafood Restaurant,Comfort Food Restaurant,French Restaurant,Greek Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Middle Eastern Restaurant,American Restaurant
6,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,0.0,Sushi Restaurant,Japanese Restaurant,Ramen Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Mexican Restaurant,Restaurant,Falafel Restaurant,Fast Food Restaurant,Middle Eastern Restaurant
7,M6G,Downtown Toronto,Christie,43.669542,-79.422564,0.0,Korean Restaurant,Mexican Restaurant,Ethiopian Restaurant,Indian Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Ramen Restaurant,Caribbean Restaurant,Chinese Restaurant,Eastern European Restaurant
8,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568,1.0,Japanese Restaurant,Restaurant,Italian Restaurant,Sushi Restaurant,Seafood Restaurant,Brazilian Restaurant,Fast Food Restaurant,Vegetarian / Vegan Restaurant,Mediterranean Restaurant,New American Restaurant
9,M6H,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259,2.0,Sushi Restaurant,Italian Restaurant,Brazilian Restaurant,Portuguese Restaurant,Vietnamese Restaurant,Mexican Restaurant,Mediterranean Restaurant,Restaurant,Middle Eastern Restaurant,Thai Restaurant


### Visualizing restaurant clusters, with existing Mexican restaurants marked in black

In [67]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(k_clusters)
ys = [i + x + (i*x)**2 for i in range(k_clusters)]
colors_array = cm.Dark2(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one marker per neighborhood, colored by its cluster
# note: rainbow[cluster-1] wraps cluster 0 around to the last color in the list
for lat, lon, poi, cluster in zip(to_merged['Latitude'], to_merged['Longitude'], to_merged['Neighborhood'], to_merged['Cluster_ID']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=6,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)


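# overlay every existing Mexican restaurant as a black marker
for lat, lon, poi in zip(MexTO['Venue_Latitude'], MexTO['Venue_Longitude'], MexTO['Neighborhood']):
    # no cluster is attached to a venue, so label it by neighborhood only
    # (the old label reused the stale 'cluster' value left over from the loop above)
    label = folium.Popup(str(poi) + ' (Mexican restaurant)', parse_html=True)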
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color='black',
        fill=True,
        fill_color='black',
        fill_opacity=1).add_to(map_clusters)    
    

map_clusters

## Step 8: Exploring the clusters

In [86]:
# cluster 0 (gray markers): keep every column except the first one (Postal Code)
GrayPts = to_merged.loc[to_merged['Cluster_ID'] == 0, to_merged.columns[1:]]
GrayPts

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster_ID,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,0.0,Japanese Restaurant,Italian Restaurant,Ramen Restaurant,Sushi Restaurant,Thai Restaurant,Middle Eastern Restaurant,Restaurant,Theme Restaurant,Indian Restaurant,Vegetarian / Vegan Restaurant
2,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,0.0,Japanese Restaurant,Italian Restaurant,Ramen Restaurant,Restaurant,Middle Eastern Restaurant,Falafel Restaurant,Fast Food Restaurant,German Restaurant,Mexican Restaurant,Modern European Restaurant
4,East Toronto,The Beaches,43.676357,-79.293031,0.0,Japanese Restaurant,Caribbean Restaurant,Restaurant,Vegetarian / Vegan Restaurant,Greek Restaurant,French Restaurant,Mediterranean Restaurant,Mexican Restaurant,Ramen Restaurant,Indian Restaurant
6,Downtown Toronto,Central Bay Street,43.657952,-79.387383,0.0,Sushi Restaurant,Japanese Restaurant,Ramen Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Mexican Restaurant,Restaurant,Falafel Restaurant,Fast Food Restaurant,Middle Eastern Restaurant
7,Downtown Toronto,Christie,43.669542,-79.422564,0.0,Korean Restaurant,Mexican Restaurant,Ethiopian Restaurant,Indian Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Ramen Restaurant,Caribbean Restaurant,Chinese Restaurant,Eastern European Restaurant
12,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,0.0,Greek Restaurant,Fast Food Restaurant,Italian Restaurant,Ramen Restaurant,Restaurant,Japanese Restaurant,Asian Restaurant,Caribbean Restaurant,Chinese Restaurant,Cuban Restaurant
23,Central Toronto,"The Annex, North Midtown, Yorkville",43.67271,-79.405678,0.0,Italian Restaurant,Vegetarian / Vegan Restaurant,Restaurant,Thai Restaurant,Indian Restaurant,Mexican Restaurant,Japanese Restaurant,Modern European Restaurant,Caribbean Restaurant,Eastern European Restaurant
26,Downtown Toronto,"University of Toronto, Harbord",43.662696,-79.400049,0.0,Vegetarian / Vegan Restaurant,Mexican Restaurant,Restaurant,Japanese Restaurant,Comfort Food Restaurant,Italian Restaurant,Persian Restaurant,Belgian Restaurant,Caribbean Restaurant,Doner Restaurant
29,Downtown Toronto,"Kensington Market, Chinatown, Grange Park",43.653206,-79.400049,0.0,Vegetarian / Vegan Restaurant,Mexican Restaurant,Vietnamese Restaurant,Caribbean Restaurant,French Restaurant,Restaurant,Belgian Restaurant,Comfort Food Restaurant,Doner Restaurant,Filipino Restaurant
33,Downtown Toronto,"St. James Town, Cabbagetown",43.667967,-79.367675,0.0,Japanese Restaurant,Taiwanese Restaurant,Caribbean Restaurant,Indian Restaurant,Italian Restaurant,Restaurant,Sushi Restaurant,American Restaurant,Thai Restaurant,Syrian Restaurant


In [73]:
# cluster 1 (green markers): keep every column except the first one (Postal Code)
GreenPts = to_merged.loc[to_merged['Cluster_ID'] == 1, to_merged.columns[1:]]
GreenPts

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster_ID,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Downtown Toronto,St. James Town,43.651494,-79.375418,1.0,Restaurant,Japanese Restaurant,Italian Restaurant,Seafood Restaurant,Comfort Food Restaurant,Fast Food Restaurant,French Restaurant,German Restaurant,Vegetarian / Vegan Restaurant,Middle Eastern Restaurant
5,Downtown Toronto,Berczy Park,43.644771,-79.373306,1.0,Japanese Restaurant,Restaurant,Seafood Restaurant,Comfort Food Restaurant,French Restaurant,Greek Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Middle Eastern Restaurant,American Restaurant
8,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568,1.0,Japanese Restaurant,Restaurant,Italian Restaurant,Sushi Restaurant,Seafood Restaurant,Brazilian Restaurant,Fast Food Restaurant,Vegetarian / Vegan Restaurant,Mediterranean Restaurant,New American Restaurant
10,Downtown Toronto,"Harbourfront East, Union Station, Toronto Islands",43.640816,-79.381752,1.0,Japanese Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Restaurant,Chinese Restaurant,Greek Restaurant,French Restaurant,Mediterranean Restaurant,Seafood Restaurant,Indian Restaurant
11,West Toronto,"Little Portugal, Trinity",43.647927,-79.41975,1.0,Restaurant,Vegetarian / Vegan Restaurant,Asian Restaurant,Italian Restaurant,Vietnamese Restaurant,Seafood Restaurant,Japanese Restaurant,American Restaurant,New American Restaurant,French Restaurant
13,Downtown Toronto,"Toronto Dominion Centre, Design Exchange",43.647177,-79.381576,1.0,Japanese Restaurant,Restaurant,Italian Restaurant,Seafood Restaurant,Thai Restaurant,Vegetarian / Vegan Restaurant,Brazilian Restaurant,French Restaurant,Mediterranean Restaurant,New American Restaurant
14,West Toronto,"Brockton, Parkdale Village, Exhibition Place",43.636847,-79.428191,1.0,Restaurant,Tibetan Restaurant,Indian Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Japanese Restaurant,New American Restaurant,Caribbean Restaurant,Comfort Food Restaurant,Ethiopian Restaurant
16,Downtown Toronto,"Commerce Court, Victoria Hotel",43.648198,-79.379817,1.0,Restaurant,Japanese Restaurant,Seafood Restaurant,Vegetarian / Vegan Restaurant,Thai Restaurant,French Restaurant,Italian Restaurant,Mediterranean Restaurant,Middle Eastern Restaurant,New American Restaurant
32,Downtown Toronto,Stn A PO Boxes,43.646435,-79.374846,1.0,Japanese Restaurant,Restaurant,Seafood Restaurant,American Restaurant,New American Restaurant,Greek Restaurant,Thai Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,French Restaurant
34,Downtown Toronto,"First Canadian Place, Underground city",43.648429,-79.38228,1.0,Restaurant,Japanese Restaurant,Seafood Restaurant,Italian Restaurant,Thai Restaurant,Vegetarian / Vegan Restaurant,Brazilian Restaurant,French Restaurant,Mediterranean Restaurant,New American Restaurant


In [74]:
# cluster 2 (purple markers): keep every column except the first one (Postal Code)
PurplePts = to_merged.loc[to_merged['Cluster_ID'] == 2, to_merged.columns[1:]]
PurplePts

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster_ID,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,2.0,Italian Restaurant,Restaurant,Indian Restaurant,Sushi Restaurant,Thai Restaurant,Middle Eastern Restaurant,Mediterranean Restaurant,French Restaurant,German Restaurant,Asian Restaurant
9,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259,2.0,Sushi Restaurant,Italian Restaurant,Brazilian Restaurant,Portuguese Restaurant,Vietnamese Restaurant,Mexican Restaurant,Mediterranean Restaurant,Restaurant,Middle Eastern Restaurant,Thai Restaurant
17,East Toronto,Studio District,43.659526,-79.340923,2.0,Vietnamese Restaurant,American Restaurant,Italian Restaurant,French Restaurant,Thai Restaurant,Latin American Restaurant,Sushi Restaurant,Restaurant,Falafel Restaurant,Comfort Food Restaurant
18,Central Toronto,Roselawn,43.711695,-79.416936,2.0,Sushi Restaurant,Italian Restaurant,Japanese Restaurant,Asian Restaurant,Ethiopian Restaurant,Indian Restaurant,Indian Chinese Restaurant,Hawaiian Restaurant,Halal Restaurant,Greek Restaurant
19,Central Toronto,Davisville North,43.712751,-79.390197,2.0,Italian Restaurant,Fast Food Restaurant,Restaurant,Sushi Restaurant,Mexican Restaurant,Vietnamese Restaurant,Seafood Restaurant,Chinese Restaurant,Greek Restaurant,Vegetarian / Vegan Restaurant
20,Central Toronto,"Forest Hill North & West, Forest Hill Road Park",43.696948,-79.411307,2.0,Sushi Restaurant,Japanese Restaurant,Italian Restaurant,Persian Restaurant,Vegetarian / Vegan Restaurant,Middle Eastern Restaurant,Fast Food Restaurant,Hawaiian Restaurant,Halal Restaurant,Greek Restaurant
21,West Toronto,"High Park, The Junction South",43.661608,-79.464763,2.0,Italian Restaurant,Thai Restaurant,Sushi Restaurant,Mexican Restaurant,Restaurant,Seafood Restaurant,Mediterranean Restaurant,Fast Food Restaurant,Latin American Restaurant,Vietnamese Restaurant
22,Central Toronto,"North Toronto West, Lawrence Park",43.715383,-79.405678,2.0,Italian Restaurant,Mexican Restaurant,Restaurant,Chinese Restaurant,Fast Food Restaurant,Sushi Restaurant,Vietnamese Restaurant,Caribbean Restaurant,Belgian Restaurant,Indian Chinese Restaurant
24,West Toronto,"Parkdale, Roncesvalles",43.64896,-79.456325,2.0,Thai Restaurant,Sushi Restaurant,Restaurant,Eastern European Restaurant,American Restaurant,Chinese Restaurant,Cuban Restaurant,Falafel Restaurant,Italian Restaurant,Mediterranean Restaurant
25,Central Toronto,Davisville,43.704324,-79.38879,2.0,Italian Restaurant,Sushi Restaurant,Fast Food Restaurant,Indian Restaurant,Restaurant,Middle Eastern Restaurant,Mexican Restaurant,Vietnamese Restaurant,Caribbean Restaurant,French Restaurant


In [75]:
# cluster 3 (yellow markers): keep every column except the first one (Postal Code)
YellowPts = to_merged.loc[to_merged['Cluster_ID'] == 3, to_merged.columns[1:]]
YellowPts

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster_ID,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
15,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,3.0,Indian Restaurant,Fast Food Restaurant,Restaurant,Italian Restaurant,Vegetarian / Vegan Restaurant,Halal Restaurant,French Restaurant,Middle Eastern Restaurant,Pakistani Restaurant,Indian Chinese Restaurant
31,Downtown Toronto,Rosedale,43.679563,-79.377529,3.0,Japanese Restaurant,Indian Restaurant,Filipino Restaurant,Italian Restaurant,Indian Chinese Restaurant,Hawaiian Restaurant,Halal Restaurant,Greek Restaurant,German Restaurant,French Restaurant


# Results and Discussion <a name="results"></a>

Our analysis shows that even though Toronto has an abundance of restaurants (789, according to Foursquare), many areas contain only sparse restaurant data. Although a new restaurant can vitalize a neighborhood and create a new food scene, that is a gamble my client would like to avoid. The map above shows the established neighborhoods, and the black points on it mark existing Mexican restaurants, so we would want to avoid the areas surrounding those points as well.
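As a rough illustration of "avoiding the surrounding areas", the sketch below measures how far a neighborhood centre is from its nearest Mexican restaurant using the haversine formula. This was not part of my analysis: the function names, the 0.5 km buffer and the two restaurant coordinates are assumptions made up for the example, and only the neighborhood coordinates come from the tables above.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def nearest_mexican_km(lat, lon, mexican_points):
    """Distance from one neighborhood centre to its closest Mexican restaurant."""
    return min(haversine_km(lat, lon, mlat, mlon) for mlat, mlon in mexican_points)

# Neighborhood centres copied from the tables above.
neighborhoods = {
    "Berczy Park": (43.644771, -79.373306),
    "Christie": (43.669542, -79.422564),
}

# Hypothetical restaurant coordinates, for illustration only (not Foursquare data).
mexican_points = [(43.6690, -79.4200), (43.6550, -79.4000)]

MIN_GAP_KM = 0.5  # assumed buffer around an existing Mexican restaurant

for name, (lat, lon) in neighborhoods.items():
    d = nearest_mexican_km(lat, lon, mexican_points)
    status = "too close" if d < MIN_GAP_KM else "clear"
    print(f"{name}: {d:.2f} km to nearest Mexican restaurant ({status})")
```

With the real `MexTO` coordinates in place of the made-up points, the same check could be run for every row of `to_merged`; the right buffer distance is a business judgment rather than something the data decides.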

Furthermore, the breakdown of each cluster's most common restaurants shows that Mexican restaurants are not especially popular in the city compared with Asian cuisines (Japanese/Sushi, Korean, Vietnamese, Thai), South Asian and Middle Eastern cuisines (Indian, Pakistani, Halal), and Italian, which looks like the most common of all. Adding a Mexican restaurant might therefore bring some variety to dining in any of these neighborhoods.

Given the two criteria mentioned above, an established food scene and a lack of Mexican restaurants in a cluster, I recommend the green cluster (Cluster_ID 1) as the best place to break ground for a new Mexican restaurant.
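The "lack of Mexican restaurants" criterion can also be checked in code rather than by reading the tables. The sketch below is a minimal example, assuming the same column names as `to_merged`; it uses five rows copied from the cluster tables above, and the helper column `Has_Mexican` and the `summary` frame are names I introduce here. A top-10 list is only a proxy, so a neighborhood can still contain a Mexican restaurant that does not rank in its top ten.

```python
import pandas as pd

venue_cols = [f"{rank} Most Common Venue" for rank in
              ["1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th", "10th"]]

# Five rows copied from the cluster tables above (not the full dataset).
rows = [
    ("Christie", 0,
     ["Korean Restaurant", "Mexican Restaurant", "Ethiopian Restaurant",
      "Indian Restaurant", "Vegetarian / Vegan Restaurant", "Vietnamese Restaurant",
      "Ramen Restaurant", "Caribbean Restaurant", "Chinese Restaurant",
      "Eastern European Restaurant"]),
    ("St. James Town, Cabbagetown", 0,
     ["Japanese Restaurant", "Taiwanese Restaurant", "Caribbean Restaurant",
      "Indian Restaurant", "Italian Restaurant", "Restaurant", "Sushi Restaurant",
      "American Restaurant", "Thai Restaurant", "Syrian Restaurant"]),
    ("St. James Town", 1,
     ["Restaurant", "Japanese Restaurant", "Italian Restaurant", "Seafood Restaurant",
      "Comfort Food Restaurant", "Fast Food Restaurant", "French Restaurant",
      "German Restaurant", "Vegetarian / Vegan Restaurant", "Middle Eastern Restaurant"]),
    ("Berczy Park", 1,
     ["Japanese Restaurant", "Restaurant", "Seafood Restaurant",
      "Comfort Food Restaurant", "French Restaurant", "Greek Restaurant",
      "Italian Restaurant", "Vegetarian / Vegan Restaurant",
      "Middle Eastern Restaurant", "American Restaurant"]),
    ("Davisville", 2,
     ["Italian Restaurant", "Sushi Restaurant", "Fast Food Restaurant",
      "Indian Restaurant", "Restaurant", "Middle Eastern Restaurant",
      "Mexican Restaurant", "Vietnamese Restaurant", "Caribbean Restaurant",
      "French Restaurant"]),
]

sample = pd.DataFrame(
    [[name, cluster_id, *venues] for name, cluster_id, venues in rows],
    columns=["Neighborhood", "Cluster_ID", *venue_cols],
)

# True where any of the ten ranked venue columns is a Mexican restaurant.
sample["Has_Mexican"] = sample[venue_cols].eq("Mexican Restaurant").any(axis=1)

# Per cluster: neighborhoods listing a Mexican restaurant vs. neighborhoods in total.
summary = sample.groupby("Cluster_ID")["Has_Mexican"].agg(["sum", "count"])
summary.columns = ["with_mexican", "neighborhoods"]
print(summary)
```

In this five-row sample, cluster 1 is the only cluster with no Mexican entry in any top-10 list. Running the same two steps on the full `to_merged` frame would give the cluster-level counts behind the recommendation.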


# Conclusion <a name="conclusion"></a>

The purpose of this project was to identify areas in Toronto that could support a new Mexican restaurant for my client. By taking clean datasets and cross-referencing them with Foursquare data, albeit a limited collection of data points, I was able to produce some insights that my client should consider moving forward. The recommended green cluster is a sound choice: south downtown Toronto is an area filled with various food options, shopping districts and tourist attractions, all placed along the subway line.

If only they had a Mexican restaurant…
