# Discovering the Best Location for a New Restaurant in Toronto Neighborhoods
### By Adedamola Adedokun

## 1. INTRODUCTION
### 1.1 BACKGROUND


Toronto is the most populous city in Canada, with a population of over 2 million people. The city is located on the shores of the western end of Lake Ontario. As the largest city in the country, it accounts for a significant portion of Canada's economic activity and population. Toronto is an international centre of business, finance, arts, and culture. It is considered one of the most diverse cities in the world, with a large population of immigrants from around the globe, which has also made it one of the most multicultural and cosmopolitan cities in the world.

The diverse population of Toronto reflects its current and historical role as a destination of choice for immigrants. More than 50 percent of residents belong to a visible minority population group, and over 200 distinct ethnic origins are represented among its inhabitants. While the majority of Torontonians speak English as their primary language, over 160 languages are spoken in the city. In a city this large and diverse, information on hotspots and the best locations would contribute greatly to the success of a new restaurant business.

### 1.2 PROBLEM

This project aims to identify the best location for a new restaurant business based on geographical data for the commercial neighborhoods of Toronto. It is important for a client to be able to find a specific location that is well placed within a cluster of similar businesses already familiar to prospective customers.




### 1.3 INTEREST

Location is considered one of the key factors contributing to the success of a restaurant. Entrepreneurs and business owners will be interested in this project because it will guide their decision in selecting the best neighborhood for establishing a new restaurant business. The results can be shown on a map and an information chart in which each neighborhood is clustered according to venue density.

## 2. Data Acquisition and Cleaning

To address the problem, the following data sources were used:

Data was scraped from the Wikipedia page that lists the postal codes within the city of Toronto. Here is the link to the dataset: https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M. The geographical coordinates of each postal code were obtained from: http://cocl.us/Geospatial_data

The Foursquare API was used to get the most common venues in the given boroughs of Toronto.

The data was cleaned and transformed. Only records with an assigned borough were used; those without an assigned borough were excluded. The scraped and downloaded data were then combined into one table. Afterwards, the K-means clustering algorithm is used to find clusters of businesses, and the result is shown on a Folium map.
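The cleaning and merging steps described above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the column names, the `"Not assigned"` marker, and the toy rows are assumptions standing in for the scraped table and the coordinates file.

```python
import pandas as pd

# Toy stand-in for the scraped postal-code table (illustrative rows only).
postal = pd.DataFrame({
    "Postal Code": ["M1A", "M3A", "M4A", "M5A"],
    "Borough": ["Not assigned", "North York", "North York", "Downtown Toronto"],
    "Neighborhood": ["Not assigned", "Parkwoods", "Victoria Village", "Regent Park"],
})

# Toy stand-in for the coordinates file (approximate, illustrative values).
coords = pd.DataFrame({
    "Postal Code": ["M3A", "M4A", "M5A"],
    "Latitude": [43.7533, 43.7259, 43.6543],
    "Longitude": [-79.3297, -79.3156, -79.3606],
})

# Keep only records that have an assigned borough.
cleaned = postal[postal["Borough"] != "Not assigned"].reset_index(drop=True)

# Combine both sources into one table, matching rows on the postal code.
combined = cleaned.merge(coords, on="Postal Code", how="left")
print(combined)
```

A left merge keeps every cleaned postal code even if its coordinates are missing, which makes gaps in the coordinates file easy to spot as empty latitude or longitude values.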


#### Initial Dataset

![image.png](attachment:image.png)

#### After Clean-up

![image.png](attachment:image.png)

#### Combined Dataset with Coordinates

![image.png](attachment:image.png)

## 3. Methodology

I used the Python Folium library to visualize the geography of Toronto and its boroughs, creating a map of Toronto with the boroughs superimposed on top. Latitude and longitude values were used to produce the visual below:

![image.png](attachment:image.png)

I used the Foursquare API to explore the boroughs and segment them. I set the limit to 100 venues and the radius to 500 meters around each borough's latitude and longitude. Here is the head of the resulting list, showing each venue's name, category, latitude, and longitude from the Foursquare API.

![image.png](attachment:image.png)
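The query parameters described above (a limit of 100 venues within a 500 meter radius) can be illustrated by building the request URL. The report does not say which endpoint was called, so the legacy Foursquare v2 `venues/explore` endpoint, the version date, and the helper name `build_explore_url` are assumptions; the credentials are placeholders.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Assumed endpoint: the legacy Foursquare v2 "explore" API.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(lat, lng, client_id, client_secret,
                      version="20180605", radius=500, limit=100):
    """Build the URL for venues within `radius` meters of (lat, lng)."""
    params = {
        "client_id": client_id,          # placeholder credential
        "client_secret": client_secret,  # placeholder credential
        "v": version,                    # API version date (assumed value)
        "ll": f"{lat},{lng}",            # latitude,longitude of the search centre
        "radius": radius,                # search radius in meters
        "limit": limit,                  # maximum number of venues to return
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"

url = build_explore_url(43.6543, -79.3606, "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
query = parse_qs(urlparse(url).query)
print(query["radius"], query["limit"], query["ll"])
```

The sketch only constructs the URL and does not send it; a real run would issue one such request per borough and collect each venue's name, category, and coordinates from the JSON response.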

In summary, 46 venues were returned by Foursquare. Here is the merged table of neighborhoods and venues.

![image.png](attachment:image.png)

I then created a table listing the top 10 venue categories for each neighborhood, shown below.

![image.png](attachment:image.png)
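One way to derive such a ranking is to one-hot encode the venue categories, average them per neighborhood to get category frequencies, and keep the most frequent ones. The sketch below uses hypothetical venue rows and a hypothetical helper `top_categories`; it shows the top 2 rather than the top 10 only to keep the toy data small.

```python
import pandas as pd

# Hypothetical venue rows; a real table would come from the Foursquare results.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "A", "A", "B", "B", "B"],
    "Venue Category": ["Coffee Shop", "Coffee Shop", "Coffee Shop",
                       "Park", "Park", "Bakery",
                       "Park", "Park", "Pub"],
})

# One-hot encode the categories, then average per neighborhood:
# each cell becomes the share of that neighborhood's venues in that category.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
freq = onehot.groupby("Neighborhood").mean()

def top_categories(row, n=10):
    """Return the n most frequent categories that occur in this neighborhood."""
    ranked = row[row > 0].sort_values(ascending=False)
    return list(ranked.index[:n])

top = {name: top_categories(row, n=2) for name, row in freq.iterrows()}
print(top)
```

The frequency table `freq` is also a natural input for clustering, since every neighborhood is described by the same set of numeric columns.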

Several venue categories are common across neighborhoods. For this reason, I used the K-means algorithm, one of the most common unsupervised clustering methods, to group the neighborhoods into 5 clusters.

Here is my merged table with cluster labels for each neighborhood.

![image.png](attachment:image.png)
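The clustering step can be sketched as follows. The report does not name the library used, so scikit-learn's `KMeans` is an assumption here, and the frequency matrix is a hand-made toy example rather than the project's data.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy venue-category frequencies: one row per neighborhood, one column per
# category (cafe, restaurant, park, grocery store, airport). Illustrative only.
freq = np.array([
    [0.8, 0.2, 0.0, 0.0, 0.0],
    [0.7, 0.3, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.0, 0.0, 0.0],
    [0.2, 0.8, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.1, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.1, 0.0, 0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.1, 0.0, 0.9],
])

# Group the neighborhoods into 5 clusters; a fixed seed keeps runs repeatable.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(freq)
labels = kmeans.labels_
print(labels)
```

Each entry of `labels` is the cluster assigned to the corresponding row, so it can be added as a new column to the neighborhood table. The label numbers themselves are arbitrary; only the grouping carries meaning.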

## 4. Results

![image.png](attachment:image.png)

Visual representation of the 5 clusters on the map:
1. The red dots are areas dominated by social venues, cafes, and restaurants.
2. The yellow dots are areas that also have coffee shops and some restaurants.
3. The purple dots are park-related areas for outdoor activities.
4. The sky blue dots are areas with grocery stores and some social venues.
5. The light green dots are areas related to transportation and the city airport.


## 5. Discussion

As mentioned earlier, Toronto is a big city with a high population density. The clusters reveal the hotspots for a prospective restaurant in the commercial part of the city.
I used the K-means algorithm for this clustering study. Based on the elbow method, I set the optimum k value to 5. However, only the coordinates for the downtown part of the city were used. For more detailed and accurate guidance, the data set can be expanded and individual neighborhoods or streets can also be explored.

I finalized the study by visualizing the data and clustering information on a map of Toronto. In future studies, more in-depth analyses can be carried out for prospective clients in other parts of the city.
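The elbow check mentioned in this section can be illustrated as below. This is a sketch on synthetic data with three obvious groups, assuming scikit-learn's `KMeans`; it is not the project's data and does not reproduce the k = 5 result.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2-D points around three well-separated centres (illustrative only).
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.3, size=(30, 2)) for c in centers])

# Fit K-means for a range of k and record the inertia
# (sum of squared distances from each point to its nearest cluster centre).
inertias = {}
for k in range(1, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(f"k={k}: inertia={value:.2f}")
```

Inertia drops sharply until k reaches the number of natural groups and only slowly afterwards; the bend in that curve is the "elbow". On real venue data the bend is usually less clear-cut, so the chosen k remains a judgment call.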

## 6. Conclusion

People are increasingly turning to big cities to work or start a business. They can achieve greater success if their decisions are guided by the right information and facts. A diverse and promising city offers great opportunities to those who are strategic about where to pursue them. This holds true for a client seeking to establish a new restaurant in the city: the benefit of having the right location cannot be overemphasized.



## 7. References

1. Wikipedia 
2. Foursquare API