# Battle of the Neighbourhoods - Toronto

### IBM Data Science Capstone Project

## Contents
1. [Introduction](#introduction)
2. [Data](#data)
3. [Methodology](#methodology)
4. [Analysis and Observations](#analysis)
5. [Results and Discussions](#results)
6. [Conclusion](#conclusion)


## 1. Introduction <a name="introduction"></a>

We have been tasked with finding prime locations to open a restaurant in **Toronto, Canada**.

In this project, we will try to create a database of top restaurants in various neighbourhoods in Downtown Toronto. We have chosen Downtown Toronto as we would like our restaurant to be in the city center.

Then, we will cluster the restaurants into pockets of high density, and accordingly pick suitable locations for our restaurant.

## 2. Data <a name="data"></a>

For this project, we need a list of the boroughs in Toronto and the neighbourhoods within them. To obtain it, we will scrape a Wikipedia page that lists all neighbourhoods in Toronto by postal code.

To find restaurants in the neighbourhoods, we also need the coordinates of the neighbourhood centers, from where we can search radially outwards. The following file contains the coordinates of the postal codes in Toronto:

* https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/labs_v1/Geospatial_Coordinates.csv

Finally, we will use the **Foursquare API** to look for top venues around these coordinates.

#### 2.1 Dataframe consisting of Neighbourhoods, Postal Codes and Boroughs

In this section, we create a dataframe that lists all the neighbourhoods and boroughs grouped by postal codes in the city of _Toronto, Canada._

First, we install and import the libraries needed for this project:

In [1]:
!pip install bs4
!pip install requests
!pip install folium

import pandas as pd
import numpy as np
import requests
import folium
import matplotlib.cm as cm
import matplotlib.colors as colors
import sklearn.utils

from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler
from bs4 import BeautifulSoup
from pandas.io.json import json_normalize
from geopy.geocoders import Nominatim

print("All libraries imported")

All libraries imported


Now, let's use `requests` to get the HTML of the given Wikipedia page as text.

We will also create a BeautifulSoup object to parse the HTML.

In [2]:
# The html text will be stored in the variable 'data'
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
data = requests.get(url).text
 
# Uncomment the following line and run the cell to view the data
#data

In [3]:
# Create a BeautifulSoup object 'soup', using 'html5lib' to parse the html file
soup = BeautifulSoup(data, 'html5lib')
 
# Use 'find' to find the html element with a table tag
table = soup.find("table")
 
# Uncomment the following line and run the cell to view the table in html
#table

Now, we traverse through the table and get the necessary data.

Finally, we add the data, row by row, to our dataframe.

In [4]:
flag = True
flag2 = True

# Collect one dict per table cell, then build the 'toronto' dataframe in a single step
# (DataFrame.append was removed in pandas 2.0)
rows = []
 
# Loop through the table cells to find the postal codes and names of boroughs and neighbourhoods
for cell in table.find_all('td'):
    
    if cell.span.text == 'Not assigned':          # Skip all cells without borough names
        continue
    
    post = cell.p.text[:3]                                                              # Get the postal code
    brgh = (cell.span.text).split('(')[0]                                               # Get the borough name by splitting the neighbourhood names away from the combined text
    nbhd = ((cell.span.text).split('(')[1].replace(' /', ',')).replace(')', '')         # Get the neighbourhood names, remove brackets, replace forward slashes by commas to separate individual neighbourhoods
    
    rows.append({'Postal Code':post, 'Borough':brgh, 'Neighbourhood':nbhd})             # Collect one row of data

toronto = pd.DataFrame(rows, columns=['Postal Code','Borough','Neighbourhood'])
 
# There are some erroneous borough names, so we manually replace them with the correct names
toronto['Borough'] = toronto['Borough'].replace({'Downtown TorontoStn A PO Boxes25 The Esplanade':'Downtown Toronto Stn A',
                                             'East TorontoBusiness reply mail Processing Centre969 Eastern':'East Toronto Business',
                                             'EtobicokeNorthwest':'Etobicoke Northwest','East YorkEast Toronto':'East York/East Toronto',
                                             'MississaugaCanada Post Gateway Processing Centre':'Mississauga'})
toronto

| | Postal Code | Borough | Neighbourhood |
|---|---|---|---|
| 0 | M3A | North York | Parkwoods |
| 1 | M4A | North York | Victoria Village |
| 2 | M5A | Downtown Toronto | Regent Park, Harbourfront |
| 3 | M6A | North York | Lawrence Manor, Lawrence Heights |
| 4 | M7A | Queen's Park | Ontario Provincial Government |
| ... | ... | ... | ... |
| 98 | M8X | Etobicoke | The Kingsway, Montgomery Road, Old Mill North |
| 99 | M4Y | Downtown Toronto | Church and Wellesley |
| 100 | M7Y | East Toronto Business | Enclave of M4L |
| 101 | M8Y | Etobicoke | Old Mill South, King's Mill Park, Sunnylea, Hu... |
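
To make the string handling in the loop concrete, the sketch below applies the same slicing and splitting to a single cell. The helper name `parse_cell` and the sample strings are illustrative assumptions: the sample only mimics the `Borough(Neighbourhood / Neighbourhood)` layout that the loop expects. As a small variation, `str.partition` is used instead of `split('(')[1]`, so a cell without brackets yields an empty neighbourhood string rather than raising an `IndexError`.

```python
def parse_cell(p_text, span_text):
    """Split one table cell into (postal code, borough, neighbourhoods)."""
    post = p_text[:3]                                   # first three characters are the postal code
    borough, _, rest = span_text.partition('(')         # text before '(' is the borough
    neighbourhoods = rest.replace(' /', ',').replace(')', '')
    return post, borough, neighbourhoods


# Hypothetical cell contents, shaped like the text the loop reads from the page
sample = parse_cell('M5A\n', 'Downtown Toronto(Regent Park / Harbourfront)')
print(sample)
```

The resulting tuple has the same three fields that the loop stores per row.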


#### 2.2 Adding coordinates to each Postal Code in the dataframe

In this section, we will add geographical coordinates to each postal code using a separate dataset at:
- https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/labs_v1/Geospatial_Coordinates.csv

In [5]:
# Read the csv data file into a dataframe
coordinates = pd.read_csv('https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/labs_v1/Geospatial_Coordinates.csv')
coordinates

| | Postal Code | Latitude | Longitude |
|---|---|---|---|
| 0 | M1B | 43.806686 | -79.194353 |
| 1 | M1C | 43.784535 | -79.160497 |
| 2 | M1E | 43.763573 | -79.188711 |
| 3 | M1G | 43.770992 | -79.216917 |
| 4 | M1H | 43.773136 | -79.239476 |
| ... | ... | ... | ... |
| 98 | M9N | 43.706876 | -79.518188 |
| 99 | M9P | 43.696319 | -79.532242 |
| 100 | M9R | 43.688905 | -79.554724 |
| 101 | M9V | 43.739416 | -79.588437 |


We will now merge the two datasets with an inner join, using 'Postal Code' as the key.

However, if the merge cell is executed repeatedly, the coordinates are merged in again and duplicate, suffixed latitude and longitude columns are added.

To prevent that, we use a control variable called `flag` and execute the merge only if it is `True`. We initialised `flag` to `True` at the beginning, before creating the toronto dataframe, and we set it to `False` after the merge.

In [6]:
# Merging the two dataframes to add latitudes and longitudes to toronto dataframe
if flag:
    toronto = pd.merge(toronto, coordinates, on='Postal Code', how='inner')
    flag = False          # To prevent further merging
 
toronto

| | Postal Code | Borough | Neighbourhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M3A | North York | Parkwoods | 43.753259 | -79.329656 |
| 1 | M4A | North York | Victoria Village | 43.725882 | -79.315572 |
| 2 | M5A | Downtown Toronto | Regent Park, Harbourfront | 43.654260 | -79.360636 |
| 3 | M6A | North York | Lawrence Manor, Lawrence Heights | 43.718518 | -79.464763 |
| 4 | M7A | Queen's Park | Ontario Provincial Government | 43.662301 | -79.389494 |
| ... | ... | ... | ... | ... | ... |
| 98 | M8X | Etobicoke | The Kingsway, Montgomery Road, Old Mill North | 43.653654 | -79.506944 |
| 99 | M4Y | Downtown Toronto | Church and Wellesley | 43.665860 | -79.383160 |
| 100 | M7Y | East Toronto Business | Enclave of M4L | 43.662744 | -79.321558 |
| 101 | M8Y | Etobicoke | Old Mill South, King's Mill Park, Sunnylea, Hu... | 43.636258 | -79.498509 |
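
As an alternative to a control variable, the merge can be made safe to re-run by checking whether the coordinate columns are already present. The sketch below is only an illustration under assumptions: the helper name `add_coordinates` and the two small demo dataframes are hypothetical, with values copied from the tables above. The `validate='one_to_one'` argument of `pd.merge` additionally raises an error if a postal code appears more than once on either side.

```python
import pandas as pd

# Small stand-ins for the 'toronto' and 'coordinates' dataframes (demo data only)
toronto_demo = pd.DataFrame({
    'Postal Code': ['M3A', 'M4A'],
    'Borough': ['North York', 'North York'],
})
coords_demo = pd.DataFrame({
    'Postal Code': ['M3A', 'M4A', 'M1B'],
    'Latitude': [43.753259, 43.725882, 43.806686],
    'Longitude': [-79.329656, -79.315572, -79.194353],
})


def add_coordinates(df, coords):
    """Merge coordinates into df unless they have already been added."""
    if {'Latitude', 'Longitude'}.issubset(df.columns):
        return df                                  # already merged: nothing to do
    return pd.merge(df, coords, on='Postal Code', how='inner', validate='one_to_one')


once = add_coordinates(toronto_demo, coords_demo)
twice = add_coordinates(once, coords_demo)         # second call leaves the columns unchanged
print(list(twice.columns))
```

Calling the helper a second time returns the dataframe untouched, which is the behaviour the `flag` variable provides in the notebook.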


## 3. Methodology <a name="methodology"></a>

In this project, we will try to locate restaurants in Downtown Toronto, i.e. close to the city center, and will look for areas where restaurant density is moderate (neither too high nor too low).

We have collected the boroughs and neighbourhoods of Toronto by postal code, together with their coordinates, and combined all of this information into a single dataframe.

Next, we will create a similar dataset exclusively for **Downtown Toronto**, which is our area of interest. We will then explore these neighbourhoods using the Foursquare API and get the **top 50 venues** within a radius of **500 metres** of the given coordinates.

We will also visualise these on maps of Toronto and Downtown Toronto.

Then, we will extract all kinds of restaurants from the list of venues in each neighbourhood, create clusters of restaurants based on density, and show them on a map of Downtown Toronto. For this, we will use **density-based clustering**, aiming for clusters with a minimum of **5 restaurants** within roughly **250 metres** of each other.
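
Radii such as 500 metres and 250 metres refer to distances on the ground, whereas the data stores latitude and longitude in degrees. As a rough illustration of the scale involved, the sketch below computes the great-circle (haversine) distance between two postal code centres taken from the tables in this notebook. The function name `haversine_m` and the Earth radius constant are assumptions made for this example; the notebook itself does not compute distances this way.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (an approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Centres of postal codes M5A and M5B
d = haversine_m(43.654260, -79.360636, 43.657162, -79.378937)
print(round(d), 'metres')
```

The two centres are well over a kilometre apart, so 500-metre searches around neighbouring postal codes need not overlap.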

Finally, we will show these clusters on a map, and draw conclusions from it, and choose the most suitable locations for our new restaurant.

## 4. Analysis and Observations <a name = "analysis"></a>


In this section, we will explore the neighbourhoods of Downtown Toronto using the Foursquare API, get the top 50 venues in each neighbourhood, extract all types of restaurants, and then cluster these restaurants by density.

We will also present our observations on a map of Downtown Toronto.

#### 4.1 Exploring the Neighbourhoods

Here, we will explore the postal codes of Toronto to look for places of interest.

First, let's get the geographical coordinates of Toronto, and create a map with all the postal codes displayed.

In [7]:
address = 'Toronto, ON'
 
geolocator = Nominatim(user_agent="tor")          # named 'geolocator' so that the 'coordinates' dataframe is not overwritten
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


In [8]:
toronto_map = folium.Map(location=[latitude, longitude], zoom_start=11)
 
# Adding markers for postal codes in toronto on the map
for lat, lon, borough, postcode in zip(toronto['Latitude'], toronto['Longitude'], toronto['Borough'], toronto['Postal Code']):
    label = '{}, {}'.format(postcode, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        popup=label,
        radius=3.25,
        color='green',
        fill=True,
        fill_color='#16ff94',
        fill_opacity=0.7,
        parse_html=False).add_to(toronto_map)  
    
toronto_map

#### 4.2 Downtown Toronto

Let us focus on Downtown Toronto for our restaurant.

In [9]:
down_toronto = toronto[toronto['Borough'] == 'Downtown Toronto'].reset_index(drop=True)
down_toronto

| | Postal Code | Borough | Neighbourhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M5A | Downtown Toronto | Regent Park, Harbourfront | 43.65426 | -79.360636 |
| 1 | M5B | Downtown Toronto | Garden District, Ryerson | 43.657162 | -79.378937 |
| 2 | M5C | Downtown Toronto | St. James Town | 43.651494 | -79.375418 |
| 3 | M5E | Downtown Toronto | Berczy Park | 43.644771 | -79.373306 |
| 4 | M5G | Downtown Toronto | Central Bay Street | 43.657952 | -79.387383 |
| 5 | M6G | Downtown Toronto | Christie | 43.669542 | -79.422564 |
| 6 | M5H | Downtown Toronto | Richmond, Adelaide, King | 43.650571 | -79.384568 |
| 7 | M5J | Downtown Toronto | Harbourfront East, Union Station, Toronto Islands | 43.640816 | -79.381752 |
| 8 | M5K | Downtown Toronto | Toronto Dominion Centre, Design Exchange | 43.647177 | -79.381576 |
| 9 | M5L | Downtown Toronto | Commerce Court, Victoria Hotel | 43.648198 | -79.379817 |


In [10]:
# The code was removed by Watson Studio for sharing.
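
The hidden cell presumably defined the names that `get_venues` reads as globals: `client_id`, `client_secret`, `version` and `limit`. A minimal sketch of such a cell is shown here under stated assumptions: the environment variable names and the version date are placeholders invented for this example, and only the limit of 50 comes from the Methodology section.

```python
import os

# Credentials are read from environment variables so they never appear in the notebook.
# The variable names below are hypothetical; use whatever names your environment defines.
client_id = os.environ.get('FOURSQUARE_CLIENT_ID', '')
client_secret = os.environ.get('FOURSQUARE_CLIENT_SECRET', '')

version = '20180605'   # placeholder API version date in YYYYMMDD form (assumption)
limit = 50             # top 50 venues per postal code, as stated in the Methodology

print('Credentials loaded:', bool(client_id and client_secret))
```

Keeping secrets outside the notebook also means the cell can be shared without being removed.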

Now, using Foursquare, we will get a list of all venues around the area in each postal code, within a radius of 500 meters from the center.

In [11]:
def get_venues(codes, latitudes, longitudes, radius=500):          # this function will loop through the various postcodes in Downtown Toronto,
                                                                   #and return 50 nearby venues within a radius of 500 meters
    
    venues_list=[]
    for code, lat, lng in zip(codes, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            client_id, 
            client_secret, 
            version, 
            lat, 
            lng, 
            radius, 
            limit)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # get the required information about the venues
        venues_list.append([(
            code, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Postal Code', 
                  'Latitude', 
                  'Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
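
The request URL in `get_venues` is assembled with string formatting. As a hedged alternative, the query string could be built with the standard library's `urlencode`, which escapes parameter values. The helper name `build_explore_url` and the dummy credentials are assumptions for illustration; the endpoint and parameter names are the ones used in the function above.

```python
from urllib.parse import urlencode


def build_explore_url(lat, lng, client_id, client_secret, version, radius=500, limit=50):
    """Build the venues/explore URL with properly encoded query parameters."""
    base = 'https://api.foursquare.com/v2/venues/explore'
    query = urlencode({
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': f'{lat},{lng}',      # latitude,longitude of the search centre
        'radius': radius,          # search radius in metres
        'limit': limit,            # maximum number of venues returned
    }, safe=',')                   # keep the comma in 'll' unescaped
    return f'{base}?{query}'


# Dummy credentials and version, for illustration only
url = build_explore_url(43.65426, -79.360636, 'ID', 'SECRET', '20180605')
print(url)
```

Passing the same dictionary to `requests.get(base, params=...)` would be another way to achieve this.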

In [12]:
down_toronto_venues = get_venues(down_toronto['Postal Code'], down_toronto['Latitude'], down_toronto['Longitude'])
down_toronto_venues

| | Postal Code | Latitude | Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | M5A | 43.65426 | -79.360636 | Roselle Desserts | 43.653447 | -79.362017 | Bakery |
| 1 | M5A | 43.65426 | -79.360636 | Tandem Coffee | 43.653559 | -79.361809 | Coffee Shop |
| 2 | M5A | 43.65426 | -79.360636 | Cooper Koo Family YMCA | 43.653249 | -79.358008 | Distribution Center |
| 3 | M5A | 43.65426 | -79.360636 | Impact Kitchen | 43.656369 | -79.356980 | Restaurant |
| 4 | M5A | 43.65426 | -79.360636 | Body Blitz Spa East | 43.654735 | -79.359874 | Spa |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 711 | M4Y | 43.66586 | -79.383160 | The Yoga Sanctuary | 43.661499 | -79.383636 | Yoga Studio |
| 712 | M4Y | 43.66586 | -79.383160 | Rooster Coffee House | 43.669654 | -79.379871 | Coffee Shop |
| 713 | M4Y | 43.66586 | -79.383160 | Wow! Sushi | 43.668514 | -79.386686 | Sushi Restaurant |
| 714 | M4Y | 43.66586 | -79.383160 | Coffee Island | 43.664271 | -79.386972 | Coffee Shop |
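
Each row above comes from one item in the nested response that `get_venues` reads. The sketch below shows that flattening step on a made-up payload: the helper name `flatten_items` and the two fake venues are assumptions, and the payload only mirrors the keys the function accesses. Unlike the function above, it tolerates a venue with an empty `categories` list instead of raising an `IndexError`.

```python
def flatten_items(code, lat, lng, items):
    """Turn the 'items' list of one response into flat tuples, one per venue."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None   # guard against empty lists
        rows.append((code, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows


# Fake payload shaped like response['groups'][0]['items'] (illustrative data only)
fake_items = [
    {'venue': {'name': 'Example Cafe',
               'location': {'lat': 43.6535, 'lng': -79.3620},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Uncategorised Place',
               'location': {'lat': 43.6540, 'lng': -79.3610},
               'categories': []}},
]

rows = flatten_items('M5A', 43.65426, -79.360636, fake_items)
for row in rows:
    print(row)
```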


Now, we shall proceed to create a map of Downtown Toronto and display all these venues on it.

In [13]:
address = 'Downtown Toronto, Toronto, ON'

geolocator = Nominatim(user_agent="down_tor")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Downtown Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Downtown Toronto are 43.6563221, -79.3809161.


In [14]:
down_toronto_map = folium.Map(location=[latitude, longitude], zoom_start=14)
 
# Adding markers for venues in toronto on the map
for lat, lon, place, category in zip(down_toronto_venues['Venue Latitude'],
                                     down_toronto_venues['Venue Longitude'],
                                     down_toronto_venues['Venue'],
                                     down_toronto_venues['Venue Category']):
    label = '{}, {}'.format(place, category)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        popup=label,
        radius=3.25,
        color='green',
        fill=True,
        fill_color='#16ff94',
        fill_opacity=0.7,
        parse_html=False).add_to(down_toronto_map)  
    
down_toronto_map

In [22]:
print('There are ', len(down_toronto_venues['Venue Category'].unique()), 'unique types of venues in Downtown Toronto')

There are  186 unique types of venues in Downtown Toronto


#### 4.3 Filtering out the Restaurants

Now, we will create a new dataframe containing a list of all kinds of restaurants returned by Foursquare.

In [16]:
down_toronto_restaurants = pd.DataFrame()

string1 = 'Restaurant'
string2 = 'restaurant'

for i in range(0,down_toronto_venues.shape[0]):
  if string1 in down_toronto_venues.loc[i,'Venue Category'] or string2 in down_toronto_venues.loc[i,'Venue Category']:
    down_toronto_restaurants = down_toronto_restaurants.append({'Postal Code':down_toronto_venues.loc[i,'Postal Code'],
                                                                'Latitude':down_toronto_venues.loc[i,'Latitude'],
                                                                'Longitude':down_toronto_venues.loc[i,'Longitude'],
                                                                'Restaurant Name':down_toronto_venues.loc[i,'Venue'],
                                                                'Restaurant Latitude':down_toronto_venues.loc[i,'Venue Latitude'],
                                                                'Restaurant Longitude':down_toronto_venues.loc[i,'Venue Longitude'],
                                                                'Restaurant Type':down_toronto_venues.loc[i,'Venue Category'],}, ignore_index=True)

In [17]:
down_toronto_restaurants

| | Latitude | Longitude | Postal Code | Restaurant Latitude | Restaurant Longitude | Restaurant Name | Restaurant Type |
|---|---|---|---|---|---|---|---|
| 0 | 43.65426 | -79.360636 | M5A | 43.656369 | -79.356980 | Impact Kitchen | Restaurant |
| 1 | 43.65426 | -79.360636 | M5A | 43.650565 | -79.357843 | Cluny Bistro & Boulangerie | French Restaurant |
| 2 | 43.65426 | -79.360636 | M5A | 43.650601 | -79.358920 | El Catrin | Mexican Restaurant |
| 3 | 43.65426 | -79.360636 | M5A | 43.649970 | -79.360153 | Izumi | Asian Restaurant |
| 4 | 43.65426 | -79.360636 | M5A | 43.653475 | -79.355458 | Copper Branch | Vegetarian / Vegan Restaurant |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 160 | 43.66586 | -79.383160 | M4Y | 43.667872 | -79.385659 | Kothur Indian Cuisine | Indian Restaurant |
| 161 | 43.66586 | -79.383160 | M4Y | 43.664665 | -79.380641 | Loaded Pierogi | Polish Restaurant |
| 162 | 43.66586 | -79.383160 | M4Y | 43.668759 | -79.385694 | Wish | Restaurant |
| 163 | 43.66586 | -79.383160 | M4Y | 43.663894 | -79.380210 | Kawa Sushi | Japanese Restaurant |
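
The row-by-row loop relies on `DataFrame.append`, which was removed in pandas 2.0. A vectorised filter is one possible replacement; the sketch below illustrates the idea on four venues copied from the venue table, using a case-insensitive `str.contains` match. The names `venues_demo` and `restaurants_demo` are chosen for this example only, and the column renaming done in the notebook is left out.

```python
import pandas as pd

# Four venues taken from the Foursquare results shown earlier
venues_demo = pd.DataFrame({
    'Venue': ['Roselle Desserts', 'Impact Kitchen', 'Wow! Sushi', 'Tandem Coffee'],
    'Venue Category': ['Bakery', 'Restaurant', 'Sushi Restaurant', 'Coffee Shop'],
})

# True where the category mentions 'restaurant' in any letter case
is_restaurant = venues_demo['Venue Category'].str.contains('restaurant', case=False, na=False)

restaurants_demo = venues_demo[is_restaurant].reset_index(drop=True)
print(restaurants_demo)
```

The boolean mask replaces both string comparisons in the loop, because `case=False` matches 'Restaurant' and 'restaurant' alike.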


Let us put these restaurants on our map of Downtown Toronto.

In [18]:
down_toronto_map = folium.Map(location=[latitude, longitude], zoom_start=14)
 
# Adding markers for venues in toronto on the map
for lat, lon, place, category in zip(down_toronto_restaurants['Restaurant Latitude'],
                                     down_toronto_restaurants['Restaurant Longitude'],
                                     down_toronto_restaurants['Restaurant Name'],
                                     down_toronto_restaurants['Restaurant Type']):
    label = '{}, {}'.format(place, category)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        popup=label,
        radius=3.25,
        color='green',
        fill=True,
        fill_color='#16ff94',
        fill_opacity=0.7,
        parse_html=False).add_to(down_toronto_map)  
    
down_toronto_map

#### 4.4 Clustering the Restaurants

Here, we use scikit-learn's DBSCAN to group together restaurants that lie close to one another, so that each cluster marks an area of high restaurant density.

In [19]:
# DBSCAN uses no random initialisation, so no random seed is required
Clus_dataSet = down_toronto_restaurants[['Restaurant Latitude','Restaurant Longitude']]           # Set of coordinates to begin clustering
Clus_dataSet = np.nan_to_num(Clus_dataSet)
Clus_dataSet = StandardScaler().fit_transform(Clus_dataSet)

# eps is measured in standardised units (standard deviations of latitude/longitude), not metres,
# and min_samples=5 counts the point itself
db = DBSCAN(eps=0.25, min_samples=5).fit(Clus_dataSet)
core_samples_mask = np.zeros_like(db.labels_, dtype=bool)
core_samples_mask[db.core_sample_indices_] = True
labels = db.labels_
down_toronto_restaurants["Cluster"]=labels                                                        # Assigning cluster labels to the restaurants

realClusterNum=len(set(labels)) - (1 if -1 in labels else 0)
clusterNum = len(set(labels)) 

down_toronto_restaurants

| | Latitude | Longitude | Postal Code | Restaurant Latitude | Restaurant Longitude | Restaurant Name | Restaurant Type | Cluster |
|---|---|---|---|---|---|---|---|---|
| 0 | 43.65426 | -79.360636 | M5A | 43.656369 | -79.356980 | Impact Kitchen | Restaurant | -1 |
| 1 | 43.65426 | -79.360636 | M5A | 43.650565 | -79.357843 | Cluny Bistro & Boulangerie | French Restaurant | -1 |
| 2 | 43.65426 | -79.360636 | M5A | 43.650601 | -79.358920 | El Catrin | Mexican Restaurant | -1 |
| 3 | 43.65426 | -79.360636 | M5A | 43.649970 | -79.360153 | Izumi | Asian Restaurant | -1 |
| 4 | 43.65426 | -79.360636 | M5A | 43.653475 | -79.355458 | Copper Branch | Vegetarian / Vegan Restaurant | -1 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 160 | 43.66586 | -79.383160 | M4Y | 43.667872 | -79.385659 | Kothur Indian Cuisine | Indian Restaurant | 5 |
| 161 | 43.66586 | -79.383160 | M4Y | 43.664665 | -79.380641 | Loaded Pierogi | Polish Restaurant | -1 |
| 162 | 43.66586 | -79.383160 | M4Y | 43.668759 | -79.385694 | Wish | Restaurant | -1 |
| 163 | 43.66586 | -79.383160 | M4Y | 43.663894 | -79.380210 | Kawa Sushi | Japanese Restaurant | -1 |


Note that the cluster label **-1** marks a data point as an outlier (noise), i.e. a restaurant that belongs to no cluster.

In [20]:
print('Number of clusters (excluding outliers) is ', realClusterNum)

Number of clusters (excluding outliers) is  6
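
The clustering cell applies `eps=0.25` to standardised coordinates, so the radius is expressed in standard deviations rather than metres. If an explicit 250-metre radius is wanted, one commonly used option is DBSCAN with the haversine metric on coordinates in radians. The sketch below illustrates this on six made-up points (five close together, one far away); it is not the configuration that produced the results reported in this notebook, and the Earth radius constant is an approximation.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0   # approximate mean Earth radius

# Synthetic (latitude, longitude) pairs in degrees: five points within a few tens of
# metres of each other, plus one point several kilometres away
coords_deg = np.array([
    [43.6500, -79.3800],
    [43.6501, -79.3800],
    [43.6502, -79.3801],
    [43.6500, -79.3802],
    [43.6501, -79.3801],
    [43.6700, -79.4200],
])

# The haversine metric works on radians, so eps is 0.25 km divided by the Earth radius
db_demo = DBSCAN(eps=0.25 / EARTH_RADIUS_KM, min_samples=5,
                 metric='haversine', algorithm='ball_tree').fit(np.radians(coords_deg))
print(db_demo.labels_)
```

With this setup the five nearby points form one cluster and the distant point is labelled as noise.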


Finally, we create a map of Downtown Toronto and superimpose our Restaurant Clusters onto it.

In [21]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=14)

# set color scheme
colors = ['red', 'blue', 'green', 'magenta', 'orange', 'purple', 'brown']

# add markers to the map
for lat, lon, poi, nam, typ, cluster in zip(down_toronto_restaurants['Restaurant Latitude'],
                                            down_toronto_restaurants['Restaurant Longitude'],
                                            down_toronto_restaurants['Postal Code'],
                                            down_toronto_restaurants['Restaurant Name'],
                                            down_toronto_restaurants['Restaurant Type'],
                                            down_toronto_restaurants['Cluster']):
    label = folium.Popup(str(poi) + ',\n' + str(nam) + ',\n' + str(typ) + ((',\nCluster ' + str(cluster+1)) if cluster != -1 else ',\nOutlier'), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=3.5,
        popup=label,
        color=colors[cluster],
        fill=True,
        fill_opacity=0.5).add_to(map_clusters)
       
map_clusters

Markers shown in brown above indicate **outliers** (restaurants that aren't members of any cluster).
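
The map code colours outliers brown through Python's negative indexing: `colors[-1]` is the last entry of the list. That works while there are at most six clusters, but a seventh cluster would also be drawn in brown and an eighth would raise an `IndexError`. A slightly more explicit mapping is sketched here; the names `palette`, `OUTLIER_COLOUR` and `colour_for` are assumptions for this example, and reusing colours cyclically is one possible design choice among several.

```python
palette = ['red', 'blue', 'green', 'magenta', 'orange', 'purple']
OUTLIER_COLOUR = 'brown'


def colour_for(cluster):
    """Return the marker colour for a DBSCAN label; -1 denotes an outlier."""
    if cluster == -1:
        return OUTLIER_COLOUR
    return palette[cluster % len(palette)]   # reuse colours if there are many clusters


print([colour_for(c) for c in (-1, 0, 5, 6)])
```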

## 5. Results and Discussions <a name = "results"></a>

We have, using Foursquare data, obtained the locations of several restaurants in Downtown Toronto, and clustered them by density, i.e. by the number of restaurants in a given area in Downtown Toronto.

For clustering, we used a density-based approach rather than k-means or any other method. The reason is that our business problem does not specify a type (or cuisine) for our restaurant, so clustering the restaurants by type would be of no use. Rather, our goal is simply to find locations that are not too densely packed with restaurants.

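To define the clusters, we set DBSCAN's `eps` to **0.25** and `min_samples` to **5**, aiming for a radius of about **250 metres** and **at least 5 restaurants**. A restaurant is a core point of a cluster if at least 5 restaurants, counting itself, lie within `eps` of it, and a non-core restaurant still joins a cluster if it lies within `eps` of a core point. Because the coordinates are standardised before clustering, `eps` is measured in standard deviations of latitude and longitude rather than in metres, so the 250-metre figure is the intended scale rather than an exact radius.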

This resulted in the formation of 6 clusters (as of now; results may differ as Foursquare data gets updated), with several restaurants remaining outliers. For our restaurant, we shall choose a region that is not too dense with restaurants, as a dense region would be too competitive. When faced with too many choices, consumers tend to default to their *regular* choice, which would be bad for our business.

We shall also not choose an *outlier* area, as it could mean the area might not be competitive enough, i.e. there may not be enough customers, or that it may not be a well-developed commercial area, which is again bad for business.

A restaurant in these areas can work, but we'd rather stay on the safe side. Hence, we choose locations either from low-density clusters or from outliers that lie reasonably close to one another, as suitable places for our restaurant.

As per our selection criteria, the following locations make good candidates for our final list:
* Richmond.
* St. James Town.
* Bay Street.
* Church and Wellesley.
* Harbourfront and Union Station.

All of the above locations are very close to the center of Downtown Toronto, and in areas that are not too dense, but just dense enough with restaurants.
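
One possible way to compare clusters when applying such criteria is to count the members of each cluster and rank the clusters from smallest to largest. The sketch below does this for a made-up list of labels; the values in `labels_demo` are hypothetical, and using member count as a proxy for density is an assumption of this example rather than the procedure documented above.

```python
from collections import Counter

# Hypothetical DBSCAN labels: three clusters (0, 1, 2) and two outliers (-1)
labels_demo = [0, 0, 0, 0, 0, 0, 0, 0,
               1, 1, 1, 1, 1,
               2, 2, 2, 2, 2, 2,
               -1, -1]

# Count members per cluster, ignoring outliers
sizes = Counter(label for label in labels_demo if label != -1)

# Smallest clusters first: these are the less crowded candidates
ranked = sorted(sizes.items(), key=lambda pair: pair[1])
print(ranked)
```

In the notebook, the same counts could be obtained from the `Cluster` column of `down_toronto_restaurants`.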

## 6. Conclusion <a name = "conclusion"></a>

The stated goal of this project was to identify the most suitable locations for opening a new restaurant in Downtown Toronto, Canada. With the help of Foursquare, we identified the top restaurants in the neighbourhoods of Downtown Toronto. Then these restaurants were clustered together on the basis of density, and then locations from low-density clusters and high-density outliers were finalised.

However, this is by no means a comprehensive study, and follow-up studies must be carried out to evaluate other factors such as emergency services, accessibility, look and feel, prices etc. Our project also does not consider fast food chains, pubs, bars, coffee shops and other eateries and diners, which can also be addressed in the follow-up studies.