# Capstone Project - The Battle of the Neighborhoods-Week 5
### Applied Data Science Capstone by IBM-Coursera

## Table of Contents
* [Business Problem](#intro)
* [Data](#Data)
* [Methodology](#Methodology)
* [Analysis](#Analysis)
* [Results and Discussion](#Results)
* [Conclusion](#Conclusion)

## Business Problem <a name ="intro"></a>

In this project, we provide *recommendations* to stakeholders who are interested in opening a restaurant or another business in the city of **Toronto**, Canada.

This report addresses two kinds of stakeholders: those who want to open a restaurant, and those who want to start another kind of business and need to decide where to set up an office.

We will use the data science toolbox, together with other research, to inform stakeholders and recommend the best location for their business.

## Data <a name="Data"></a>

Given our problem, the following considerations will drive our decisions. Let's visit each case in turn:
1. Restaurant opening:
    * To recommend a good location for the restaurant, we will look for neighborhoods with fewer restaurants. This helps identify areas with less **competition**.
    * We might also check what types of *cuisine* nearby restaurants serve. This should help the stakeholder offer something different, which might also attract more customers.
    * A good location should also have public services nearby, such as <code>hospitals, fire departments, etc.</code>; after all, the people there are your customers.
2. Best office opening:
    * For someone looking for the best office location, the area must have public services nearby, as stated above.
    * The type of business should be close to unique in the area, which means less competition.
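
The first criterion above (fewer restaurants means less competition) can be sketched as a simple count per neighborhood. This is a minimal illustration on made-up sample rows, not the notebook's data; the function name `restaurant_counts` and the sample values are assumptions.

```python
from collections import Counter

# Hypothetical sample rows: (neighborhood, venue category).
venues = [
    ("Annex", "Thai Restaurant"),
    ("Annex", "Coffee Shop"),
    ("Annex", "Italian Restaurant"),
    ("Alderwood", "Park"),
    ("Alderwood", "Sushi Restaurant"),
    ("Bendale", "Bank"),
]

def restaurant_counts(rows):
    """Count venues whose category mentions 'Restaurant', per neighborhood."""
    # Start every neighborhood at zero so areas without restaurants are kept.
    counts = Counter({name: 0 for name, _ in rows})
    for name, category in rows:
        if "Restaurant" in category:
            counts[name] += 1
    return counts

# Fewest restaurants first: a rough proxy for low competition.
ranking = sorted(restaurant_counts(venues).items(), key=lambda kv: (kv[1], kv[0]))
print(ranking)  # [('Bendale', 0), ('Alderwood', 1), ('Annex', 2)]
```

A low count alone does not prove demand, so this ranking is only a starting point for the other criteria in the list.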


We will use the **Foursquare API** to collect data on restaurants in Toronto. Other data sources are the neighborhood data scraped from the City of Toronto website, as well as any other datasets that help us make good decisions from the analysis.

**Note:** *Some of the cells below were left unrun because the data collection had already been carried out and the data saved. I kept them to show the data collection process.*

### Data Collection

In [1]:
import numpy as np
import pandas as pd
import json
import requests
import time
import folium
import re
import matplotlib.cm as cm
import matplotlib.colors as colors

from IPython import display
from bs4 import BeautifulSoup

In [None]:
url = "https://www.toronto.ca/city-government/data-research-maps/neighbourhoods-communities/neighbourhood-profiles/"
web = requests.get(url).text
soup = BeautifulSoup(web, 'html5lib') # soup object
divs = soup.find_all('div') #the table is located on one div
print('There are %d divs' %len(divs))

In [None]:
#div tags that contains the table
div_tag = soup.find_all(id = "neighbourhoodApp")
div_tag

In [None]:
#find divs with areas
divs= soup.find('div',{"id":"neighbourhoodMap"})
#content = str(divs)
areas = soup.find_all('area') #find all tags with area
len(areas)

In [None]:
#Extracting text in area Tags
location = []
coordinates = []
for area in areas:
    location.append(area['alt'])
    coordinates.append(area['coords'])
print('#of location is {} and coordinates:{}'.format(len(location),len(coordinates)))
print('Done')

In [None]:
coordinates[0]

In [None]:
#Creating a dataframe for our data
# Build all rows first; DataFrame.append was removed in pandas 2.0
rows = [{"Location": area['alt'], "Coordinates": area['coords']} for area in areas]
neighborhood_data = pd.DataFrame(rows, columns=["Location", "Coordinates"])

# index=False avoids an extra "Unnamed: 0" column when the file is read back
neighborhood_data.to_csv('Neighborhood.csv', index=False)#saving the data for future use

In [None]:
neighborhood_data = pd.read_csv('Neighborhood.csv')
neighborhood_data.head()
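
In an HTML image map, the `coords` attribute of an `<area>` tag holds pixel positions on the image, not latitude and longitude, which is why a geocoding step is still needed. As a minimal sketch, assuming the values are comma-separated integers describing a polygon, the string can be split into (x, y) pairs like this. The helper name `parse_coords` and the sample string are hypothetical.

```python
def parse_coords(coords):
    """Split an HTML <area> 'coords' string into (x, y) integer pairs."""
    numbers = [int(part) for part in coords.split(",") if part.strip()]
    if len(numbers) % 2 != 0:
        raise ValueError("expected an even number of values")
    # Pair every even-positioned value (x) with the value after it (y).
    return list(zip(numbers[0::2], numbers[1::2]))

# Hypothetical polygon string in the image-map format.
sample = "10,20,30,40,50,60"
print(parse_coords(sample))  # [(10, 20), (30, 40), (50, 60)]
```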

Let's find the coordinates of a neighborhood using the geopy geocoder (Nominatim)

In [None]:
from geopy.geocoders import Nominatim

address ="Etobicoke West Mall" 
geolocator = Nominatim(user_agent = 'foursquare_agent')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)
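
The cell above geocodes a single address. To repeat the lookup for every neighborhood, a loop that pauses between requests and records names that return nothing may be useful. This is a sketch under assumptions: `geocode_all` and `fake_geocode` are hypothetical names, the coordinates in the stub are made up, and in the notebook the `geocode` argument would be `geolocator.geocode`, which returns `None` when no match is found.

```python
import time
from types import SimpleNamespace

def geocode_all(names, geocode, pause_seconds=1.0):
    """Look up each name with `geocode`, pausing between calls.

    `geocode` is any callable that returns an object with .latitude and
    .longitude attributes, or None when nothing is found.
    """
    found, missing = {}, []
    for i, name in enumerate(names):
        if i > 0:
            time.sleep(pause_seconds)  # avoid hammering the geocoding service
        hit = geocode(name)
        if hit is None:
            missing.append(name)
        else:
            found[name] = (hit.latitude, hit.longitude)
    return found, missing

# Stand-in geocoder with made-up coordinates, so the sketch runs offline.
def fake_geocode(name):
    table = {"Etobicoke West Mall": SimpleNamespace(latitude=43.64, longitude=-79.57)}
    return table.get(name)

found, missing = geocode_all(["Etobicoke West Mall", "Nowhere"], fake_geocode, pause_seconds=0)
print(found)
print(missing)
```

Keeping the list of missing names makes it easier to see which neighborhoods need a more specific query.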

Loading the FourSquare_API credentials

In [None]:
with open('Foursquare_cred.json','r') as cred:
    intel = json.load(cred)
    
client_id = intel[0]['ClientId']
secret = intel[0]['ClientSecret']
code = intel[0]['Code']
token = intel[0]['access_token']

### Exploring the location with FourSquare API

Let's create the GET request URL

In [None]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500
version = '20210605'
location = 'Eringate-Centennial-West Deane'
url = 'https://api.foursquare.com/v2/venues/explore?near={}&client_id={}&client_secret={}&v={}&radius={}&limit={}'.format(
    location,
    client_id,
    secret,
    version,
    radius,
    LIMIT)
results = requests.get(url).json()['response']

results
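
The request URL above is assembled with `str.format`, so a neighborhood name containing spaces goes into the query string as typed. One way to make the encoding explicit is to build the query with `urllib.parse.urlencode`. This is a minimal sketch; the helper name `build_explore_url` and the placeholder credentials are assumptions. `requests.get` also accepts a `params=` dictionary, which achieves the same thing.

```python
from urllib.parse import urlencode

def build_explore_url(near, client_id, client_secret, version, radius=500, limit=100):
    """Return the venues/explore URL with a properly encoded query string."""
    base = "https://api.foursquare.com/v2/venues/explore"
    params = {
        "near": near,
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "radius": radius,
        "limit": limit,
    }
    return base + "?" + urlencode(params)

# Placeholder credentials, for illustration only.
url = build_explore_url("Eringate-Centennial-West Deane", "ID", "SECRET", "20210605")
print(url)
```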

In [None]:
# `results` already holds the 'response' object from the previous cell
r = results['groups'][0]['items']

# Collect one dict per venue, then build the dataframe in one step
# (DataFrame.append was removed in pandas 2.0)
rows = []
for v in r:
    rows.append({'venue_name': v['venue']['name'],
                 'latitude': v['venue']['location']['lat'],
                 'longitude': v['venue']['location']['lng'],
                 'category': v['venue']['categories'][0]['name']})

df = pd.DataFrame(rows, columns=['venue_name', 'latitude', 'longitude', 'category'])

We can wrap this in a function to get the venues for every neighborhood

In [None]:
def getNearbyVenues(locations):
    LIMIT = 50 # limit of number of venues returned by Foursquare API
    radius = 500
    version = '20210605'
    
    venues_list=[]
    
    for location in locations:
        print(location)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?near={}&client_id={}&client_secret={}&v={}&radius={}&limit={}'.format(
            location,
            client_id,
            secret,
            version,
            radius,
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()['response']
        if len(results)==0:
            print("====> No Data for {}".format(location))
            continue
        result = results['groups'][0]['items']
        
        venues_list.append([(
            location,  
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in result])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    # underscore names match the columns of the saved VenuesToronto.csv used below
    nearby_venues.columns = ['Neighborhood',  
                  'Venue', 
                  'Venue_Latitude', 
                  'Venue_Longitude', 
                  'Venue_Category']
    
    return(nearby_venues)

In [None]:
venues_Toronto = getNearbyVenues(locations = neighborhood_data['Location'])

In [None]:
venues_Toronto.head()
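
`getNearbyVenues` indexes `v['venue']['categories'][0]` directly, which would raise an error if a venue ever came back without a category. A more defensive extraction could look like the sketch below. The helper name `parse_items` and the sample items are hypothetical; only the field names are taken from the function above.

```python
def parse_items(neighborhood, items):
    """Turn explore 'items' into rows, tolerating missing fields."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        loc = venue.get("location", {})
        categories = venue.get("categories") or []
        # Use None instead of failing when the category list is empty.
        category = categories[0].get("name") if categories else None
        rows.append((neighborhood, venue.get("name"),
                     loc.get("lat"), loc.get("lng"), category))
    return rows

# Hypothetical items shaped like the fields used in getNearbyVenues.
sample_items = [
    {"venue": {"name": "Cafe A", "location": {"lat": 43.65, "lng": -79.38},
               "categories": [{"name": "Coffee Shop"}]}},
    {"venue": {"name": "Venue B", "location": {"lat": 43.66, "lng": -79.39},
               "categories": []}},
]
rows = parse_items("Sample Area", sample_items)
print(rows)
```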

In [None]:
#save the data to csv
#venues_Toronto.to_csv('VenuesToronto.csv')

In [2]:
venues_Toronto = pd.read_csv('VenuesToronto.csv', index_col=0)
venues_Toronto.head()

|   | Neighborhood | Venue | Venue_Latitude | Venue_Longitude | Venue_Category |
| - | ------------ | ----- | -------------- | --------------- | -------------- |
| 0 | Etobicoke West Mall | Centennial Park | 43.656154 | -79.58754 | Park |
| 1 | Etobicoke West Mall | Tim Hortons | 43.644742 | -79.567681 | Coffee Shop |
| 2 | Etobicoke West Mall | The Beer Store | 43.641313 | -79.576925 | Beer Store |
| 3 | Etobicoke West Mall | Porta Via | 43.663449 | -79.589638 | Sandwich Place |
| 4 | Etobicoke West Mall | Best for Bride | 43.635767 | -79.539916 | Women's Store |


We can visualize the venues in the neighborhoods using folium

In [3]:
#Toronto coordinate
latitude = 43.651070
longitude = -79.347015
# create map of Toronto using latitude and longitude values
venue_map = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, venue, category in zip(venues_Toronto['Venue_Latitude'], venues_Toronto['Venue_Longitude'], venues_Toronto['Venue'], venues_Toronto['Venue_Category']):
    label = '{}, {}'.format(category, venue)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(venue_map)  
    
venue_map

Now that we have the data, we can start the analysis to find solutions to our problem

## Methodology <a name="Methodology"></a>

The goal of this project is to provide **recommendations** to stakeholders who want to open a restaurant in a Toronto neighborhood, and to those with a business venture other than a restaurant.

The first step is to identify the number of restaurants in every neighborhood and their locations. This helps identify areas with few restaurants.

Second, we will identify other services in the neighborhoods to see which areas need additional services.

We will use a clustering method (k-means) to group the neighborhoods and then study each cluster.

## Analysis <a name="Analysis"></a>

In [4]:
#The shape of our data
print('Shape of our data:',venues_Toronto.shape)
venues_Toronto['Venue_Category'].value_counts().to_frame()

Shape of our data: (1863, 5)


|   | Venue_Category |
| - | -------------- |
| Sandwich Place | 98 |
| Coffee Shop | 78 |
| Bank | 63 |
| Pizza Place | 56 |
| Park | 49 |
| ... | ... |
| Motorcycle Shop | 1 |
| Post Office | 1 |
| Hospital | 1 |
| Dance Studio | 1 |


We can see that venues in the <code>Sandwich Place</code> and <code>Coffee Shop</code> categories are the most numerous in our Toronto data, with 98 and 78 venues respectively.

<code>Bank</code> is also among the top five categories, with 63 venues.

We can filter the data and keep only the venues whose category is a restaurant

In [5]:
restaurants = []

for category in venues_Toronto.Venue_Category:
    # keep every category whose name contains the word 'Restaurant'
    res = re.search('Restaurant', category)
    if res is not None:
        restaurants.append(category)

In [6]:
len(restaurants)

483

The extracted data contains <code>483</code> restaurant venues.

In [7]:
restaurants_df = venues_Toronto.loc[venues_Toronto.Venue_Category.isin(restaurants)]
restaurants_df.set_index('Neighborhood',inplace = True)
print(restaurants_df.shape)
restaurants_df.head()

(483, 4)


| Neighborhood | Venue | Venue_Latitude | Venue_Longitude | Venue_Category |
| ------------ | ----- | -------------- | --------------- | -------------- |
| Etobicoke West Mall | Mrakovic | 43.666641 | -79.57885 | Eastern European Restaurant |
| Etobicoke West Mall | Taste of Thailand Cuisine | 43.635928 | -79.540785 | Thai Restaurant |
| Etobicoke West Mall | Bravo Bistro | 43.65942 | -79.603604 | Eastern European Restaurant |
| Etobicoke West Mall | Anatolia Restaurant | 43.644596 | -79.53281 | Turkish Restaurant |
| Etobicoke West Mall | Astoria Shish Kebob House | 43.621795 | -79.57054 | Greek Restaurant |


**Let's find out how many unique categories there are in our data**

In [8]:
print('There are {} uniques categories.'.format(len(restaurants_df['Venue_Category'].unique())))

There are 48 uniques categories.


In [9]:
#one-hot encoding
restaurants_onehot = pd.get_dummies(restaurants_df[['Venue_Category']])
restaurants_onehot.head()

| Neighborhood | Venue_Category_Afghan Restaurant | Venue_Category_American Restaurant | Venue_Category_Asian Restaurant | Venue_Category_Brazilian Restaurant | Venue_Category_Cajun / Creole Restaurant | Venue_Category_Caribbean Restaurant | Venue_Category_Chinese Restaurant | Venue_Category_Comfort Food Restaurant | Venue_Category_Cuban Restaurant | Venue_Category_Dim Sum Restaurant | ... | Venue_Category_Spanish Restaurant | Venue_Category_Sri Lankan Restaurant | Venue_Category_Sushi Restaurant | Venue_Category_Tapas Restaurant | Venue_Category_Thai Restaurant | Venue_Category_Tibetan Restaurant | Venue_Category_Turkish Restaurant | Venue_Category_Vegetarian / Vegan Restaurant | Venue_Category_Vietnamese Restaurant | Venue_Category_Xinjiang Restaurant |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Etobicoke West Mall | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Etobicoke West Mall | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Etobicoke West Mall | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Etobicoke West Mall | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Etobicoke West Mall | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |


In [10]:
restaurants_grouped = restaurants_onehot.groupby('Neighborhood').mean().reset_index()
restaurants_grouped.head()

|   | Neighborhood | Venue_Category_Afghan Restaurant | Venue_Category_American Restaurant | Venue_Category_Asian Restaurant | Venue_Category_Brazilian Restaurant | Venue_Category_Cajun / Creole Restaurant | Venue_Category_Caribbean Restaurant | Venue_Category_Chinese Restaurant | Venue_Category_Comfort Food Restaurant | Venue_Category_Cuban Restaurant | ... | Venue_Category_Spanish Restaurant | Venue_Category_Sri Lankan Restaurant | Venue_Category_Sushi Restaurant | Venue_Category_Tapas Restaurant | Venue_Category_Thai Restaurant | Venue_Category_Tibetan Restaurant | Venue_Category_Turkish Restaurant | Venue_Category_Vegetarian / Vegan Restaurant | Venue_Category_Vietnamese Restaurant | Venue_Category_Xinjiang Restaurant |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Agincourt North | 0.0 | 0.0 | 0.045455 | 0.0 | 0.0 | 0.181818 | 0.272727 | 0.0 | 0.0 | ... | 0.0 | 0.045455 | 0.045455 | 0.0 | 0.0 | 0.0 | 0.0 | 0.045455 | 0.045455 | 0.0 |
| 1 | Alderwood | 0.0 | 0.111111 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.111111 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.111111 | 0.0 |
| 2 | Annex | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 3 | Bayview Village | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.071429 | 0.214286 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.071429 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | Bendale | 0.0 | 0.0 | 0.125 | 0.0 | 0.125 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.125 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |


In [11]:
#the size of our new dataframe
restaurants_grouped.shape

(44, 49)
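
One-hot encoding followed by a per-neighborhood mean gives, for each neighborhood, the share of its restaurants that fall in each category. The sketch below reproduces that calculation in plain Python on made-up rows, to show what the numbers mean; `category_shares` and the sample data are assumptions. A value of 0.083333 in such a table would correspond to one restaurant out of twelve.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (neighborhood, restaurant category).
rows = [
    ("Annex", "Thai Restaurant"),
    ("Annex", "Thai Restaurant"),
    ("Annex", "Greek Restaurant"),
    ("Annex", "Thai Restaurant"),
    ("Bendale", "Sushi Restaurant"),
]

def category_shares(rows):
    """Return sorted categories and, per neighborhood, the share of each."""
    per_hood = defaultdict(Counter)
    for hood, category in rows:
        per_hood[hood][category] += 1
    categories = sorted({category for _, category in rows})
    shares = {}
    for hood, counts in per_hood.items():
        total = sum(counts.values())
        # Same result as averaging the 0/1 one-hot columns for this neighborhood.
        shares[hood] = [counts[c] / total for c in categories]
    return categories, shares

categories, shares = category_shares(rows)
print(categories)       # ['Greek Restaurant', 'Sushi Restaurant', 'Thai Restaurant']
print(shares["Annex"])  # [0.25, 0.0, 0.75]
```

Because every row is a vector of shares that sums to 1, k-means groups neighborhoods by their mix of cuisines rather than by how many restaurants they have.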

Let's apply the **clustering method** to our data

In [12]:
# import k-means from clustering stage
from sklearn.cluster import KMeans

In [13]:
#number of clusters
k_clusters = 4
restaurants_cluster = restaurants_grouped.drop(columns='Neighborhood')  # positional axis argument was removed in pandas 2.0
#k-means clustering
Kmeans = KMeans(n_clusters = k_clusters, random_state =0).fit(restaurants_cluster)
# check cluster labels generated for each row in the dataframe
Kmeans.labels_[0:10]

array([2, 2, 3, 2, 2, 0, 2, 0, 0, 2])
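
The number of clusters is fixed at 4 here. One common way to sanity-check such a choice is to compare the within-cluster sum of squares for several values of k and look for the point where it stops dropping quickly (the "elbow"). scikit-learn exposes this quantity as the `inertia_` attribute of a fitted `KMeans`. The plain-Python sketch below shows what is being measured, on made-up points; `within_cluster_ss` is a hypothetical helper, not part of the notebook.

```python
def within_cluster_ss(points, labels):
    """Sum of squared distances from each point to the mean of its cluster."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total

# Made-up 2-D points forming two obvious groups.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(within_cluster_ss(points, [0, 0, 1, 1]))  # 4.0
print(within_cluster_ss(points, [0, 0, 0, 0]))  # 104.0
```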

In [None]:
restaurants_grouped.drop(columns='Cluster_labels', inplace = True) #run this in case you change the number of clusters

In [14]:
#Add the clustering labels
restaurants_grouped.insert(0, 'Cluster_labels', Kmeans.labels_)

#joining the dataframes to add the latitudes and longitudes of each venue
restaurants_merged = restaurants_df.join(restaurants_grouped.set_index('Neighborhood'), on = 'Neighborhood')
restaurants_merged.head()

| Neighborhood | Venue | Venue_Latitude | Venue_Longitude | Venue_Category | Cluster_labels | Venue_Category_Afghan Restaurant | Venue_Category_American Restaurant | Venue_Category_Asian Restaurant | Venue_Category_Brazilian Restaurant | Venue_Category_Cajun / Creole Restaurant | ... | Venue_Category_Spanish Restaurant | Venue_Category_Sri Lankan Restaurant | Venue_Category_Sushi Restaurant | Venue_Category_Tapas Restaurant | Venue_Category_Thai Restaurant | Venue_Category_Tibetan Restaurant | Venue_Category_Turkish Restaurant | Venue_Category_Vegetarian / Vegan Restaurant | Venue_Category_Vietnamese Restaurant | Venue_Category_Xinjiang Restaurant |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Etobicoke West Mall | Mrakovic | 43.666641 | -79.57885 | Eastern European Restaurant | 2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 |
| Etobicoke West Mall | Taste of Thailand Cuisine | 43.635928 | -79.540785 | Thai Restaurant | 2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 |
| Etobicoke West Mall | Bravo Bistro | 43.65942 | -79.603604 | Eastern European Restaurant | 2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 |
| Etobicoke West Mall | Anatolia Restaurant | 43.644596 | -79.53281 | Turkish Restaurant | 2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 |
| Etobicoke West Mall | Astoria Shish Kebob House | 43.621795 | -79.57054 | Greek Restaurant | 2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 | 0.083333 | 0.0 |


Now we can visualize the resulting clusters

In [30]:
#Toronto coordinate
latitude = 43.651070
longitude = -79.347015
# create map of Toronto using latitude and longitude values
restaurants_map = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters
x = np.arange(k_clusters)
ys = [i + x + (i*x)**2 for i in range(k_clusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to map
for lat, lng, venue, cluster in zip(restaurants_merged['Venue_Latitude'], restaurants_merged['Venue_Longitude'], restaurants_merged['Venue'], restaurants_merged['Cluster_labels']):
    label = 'Cluster:{}, {}'.format(cluster, venue)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7,
        parse_html=False).add_to(restaurants_map)  
    
restaurants_map
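
The color setup in the cell above relies on matplotlib's `cm.rainbow`. As an alternative sketch (a swapped-in technique, not what the notebook uses), evenly spaced hues from the standard-library `colorsys` module give one distinct hex color per cluster. The function name `cluster_colors` and the saturation and brightness constants are assumptions.

```python
import colorsys

def cluster_colors(k, saturation=0.8, value=0.9):
    """Return k hex colors with evenly spaced hues, one per cluster label."""
    palette = []
    for i in range(k):
        r, g, b = colorsys.hsv_to_rgb(i / k, saturation, value)
        palette.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette

palette = cluster_colors(4)
print(palette)  # index the list with the cluster label, e.g. palette[2]
```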

### Examining each cluster

In [16]:
for k in range(k_clusters):
    print(restaurants_merged.loc[restaurants_merged['Cluster_labels'] == k, restaurants_merged.columns[[1,3]]])
    print('===='*20)

                   Venue_Latitude             Venue_Category
Neighborhood                                                
Trinity-Bellwoods       43.646553  Middle Eastern Restaurant
Trinity-Bellwoods       43.646145            Thai Restaurant
Trinity-Bellwoods       43.645892        Dumpling Restaurant
Trinity-Bellwoods       43.645550         Italian Restaurant
Trinity-Bellwoods       43.647265  Middle Eastern Restaurant
...                           ...                        ...
Dorset Park             41.829131        American Restaurant
Malvern                 40.035973         Italian Restaurant
Malvern                 40.035356         Italian Restaurant
Malvern                 40.037536        American Restaurant
Malvern                 40.035459         Seafood Restaurant

[63 rows x 2 columns]
                   Venue_Latitude                   Venue_Category
Neighborhood                                                      
Moss Park               28.392298             Fast
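
Some latitudes in the output above (for example 40.03 and 28.39) are far from the map center of 43.651070 used in this notebook, which suggests that a few neighborhood names were resolved to places outside Toronto. A minimal sketch of a bounding-box filter follows. The box limits are rough assumed values for illustration, the function name `in_box` is hypothetical, and the second sample row uses a made-up longitude.

```python
# Rough, assumed bounding box around Toronto (illustrative values only).
LAT_MIN, LAT_MAX = 43.5, 43.9
LNG_MIN, LNG_MAX = -79.7, -79.1

def in_box(lat, lng):
    """True when the point lies inside the assumed Toronto bounding box."""
    return LAT_MIN <= lat <= LAT_MAX and LNG_MIN <= lng <= LNG_MAX

# (neighborhood, latitude, longitude); the second longitude is made up.
sample_rows = [
    ("Etobicoke West Mall", 43.656154, -79.58754),
    ("Example outlier", 40.035973, -75.0),
]
kept = [row for row in sample_rows if in_box(row[1], row[2])]
print(kept)  # [('Etobicoke West Mall', 43.656154, -79.58754)]
```

Filtering such rows before clustering would keep venues from unrelated cities out of the category shares.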

We can inspect each cluster individually

**CLUSTER 1**

In [18]:
cluster1 = restaurants_merged.loc[restaurants_merged['Cluster_labels'] == 0, restaurants_merged.columns[[0,3]]]

print('There are {} uniques categories.'.format(len(cluster1['Venue_Category'].unique())))
cluster1['Venue_Category'].value_counts().to_frame()

There are 18 uniques categories.


|   | Venue_Category |
| - | -------------- |
| American Restaurant | 15 |
| Restaurant | 10 |
| Seafood Restaurant | 5 |
| Italian Restaurant | 5 |
| Thai Restaurant | 5 |
| Middle Eastern Restaurant | 4 |
| Mexican Restaurant | 3 |
| New American Restaurant | 2 |
| Chinese Restaurant | 2 |
| Mediterranean Restaurant | 2 |


There are 15 American Restaurants in this cluster. The least represented categories are *Asian, Greek, Cuban and Indian* Restaurants.

**CLUSTER 2**

In [27]:
cluster2 = restaurants_merged.loc[restaurants_merged['Cluster_labels'] == 1, restaurants_merged.columns[[0,3]]]

print('There are {} uniques categories.'.format(len(cluster2['Venue_Category'].unique())))
cluster2['Venue_Category'].value_counts().to_frame()

There are 14 uniques categories.


|   | Venue_Category |
| - | -------------- |
| Fast Food Restaurant | 16 |
| American Restaurant | 5 |
| Mexican Restaurant | 5 |
| Italian Restaurant | 4 |
| Chinese Restaurant | 3 |
| Indian Restaurant | 2 |
| Thai Restaurant | 2 |
| New American Restaurant | 2 |
| Hakka Restaurant | 1 |
| Caribbean Restaurant | 1 |


**CLUSTER 3**

In [28]:
cluster3 = restaurants_merged.loc[restaurants_merged['Cluster_labels'] == 2, restaurants_merged.columns[[0,3]]]

print('There are {} uniques categories.'.format(len(cluster3['Venue_Category'].unique())))
cluster3['Venue_Category'].value_counts().to_frame()

There are 46 uniques categories.


|   | Venue_Category |
| - | -------------- |
| Indian Restaurant | 32 |
| Italian Restaurant | 29 |
| Sushi Restaurant | 25 |
| Chinese Restaurant | 25 |
| Restaurant | 21 |
| Middle Eastern Restaurant | 20 |
| Asian Restaurant | 18 |
| Thai Restaurant | 16 |
| Caribbean Restaurant | 14 |
| Seafood Restaurant | 13 |


In this cluster, Indian and Italian Restaurants are the most numerous, whereas categories such as *Afghan* and *Ramen* have only a small number of venues.

**CLUSTER 4**

In [29]:
cluster4 = restaurants_merged.loc[restaurants_merged['Cluster_labels'] == 3, restaurants_merged.columns[[0,3]]]

print('There are {} uniques categories.'.format(len(cluster4['Venue_Category'].unique())))
cluster4['Venue_Category'].value_counts().to_frame()

There are 5 uniques categories.


|   | Venue_Category |
| - | -------------- |
| Mexican Restaurant | 5 |
| Fast Food Restaurant | 2 |
| Italian Restaurant | 1 |
| Japanese Restaurant | 1 |
| Spanish Restaurant | 1 |
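
The four cluster summaries above repeat the same steps with a different label each time. The same idea can be sketched as a single loop that collects the most frequent categories per cluster. The rows below are made up and `top_categories` is a hypothetical helper, shown only to illustrate the pattern.

```python
from collections import Counter, defaultdict

def top_categories(rows, n=3):
    """Return the n most frequent categories for each cluster label."""
    by_cluster = defaultdict(Counter)
    for cluster, category in rows:
        by_cluster[cluster][category] += 1
    return {cluster: counts.most_common(n)
            for cluster, counts in sorted(by_cluster.items())}

# Hypothetical (cluster label, category) pairs.
sample = [
    (0, "American Restaurant"), (0, "American Restaurant"), (0, "Thai Restaurant"),
    (1, "Fast Food Restaurant"), (1, "Fast Food Restaurant"),
    (1, "Fast Food Restaurant"), (1, "Mexican Restaurant"),
]
summary = top_categories(sample, n=1)
print(summary)  # {0: [('American Restaurant', 2)], 1: [('Fast Food Restaurant', 3)]}
```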


### Let's check other services in neighborhoods other than restaurants

In [20]:
#services_df = venues_Toronto.loc[venues_Toronto.Venue_Category.not(restaurants)]
services = [service for service in venues_Toronto.Venue_Category if service not in restaurants]    

In [21]:
#checking the length of services
len(services)

1380

We can see that there are <code>1380</code> venues offering **services** other than restaurants in our *Toronto* data.
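
The restaurant and service counts (483 and 1380) add up to the 1863 rows of the dataset, because every venue falls on exactly one side of the "contains the word Restaurant" test. A minimal sketch of that split, done with one substring test per category, is shown below; `split_venues` and the sample list are assumptions. With pandas, a boolean mask from `Series.str.contains` could serve the same purpose.

```python
def split_venues(categories, keyword="Restaurant"):
    """Split category names into those containing the keyword and the rest."""
    restaurants = [c for c in categories if keyword in c]
    services = [c for c in categories if keyword not in c]
    return restaurants, services

# Hypothetical category column.
sample_categories = ["Thai Restaurant", "Bank", "Park", "Restaurant", "Coffee Shop"]
restaurant_list, service_list = split_venues(sample_categories)
print(len(restaurant_list), len(service_list))  # 2 3
```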

In [22]:
# dataframe of these services
services_df = venues_Toronto.loc[venues_Toronto.Venue_Category.isin(services)]
services_df.set_index('Neighborhood', inplace = True)
#shape of our dataframe
print('The shape is:',services_df.shape)
#Uniques values are:
print('The number of unique categories', len(services_df['Venue_Category'].unique()))
services_df.head()

The shape is: (1380, 4)
The number of unique categories 192


| Neighborhood | Venue | Venue_Latitude | Venue_Longitude | Venue_Category |
| ------------ | ----- | -------------- | --------------- | -------------- |
| Etobicoke West Mall | Centennial Park | 43.656154 | -79.58754 | Park |
| Etobicoke West Mall | Tim Hortons | 43.644742 | -79.567681 | Coffee Shop |
| Etobicoke West Mall | The Beer Store | 43.641313 | -79.576925 | Beer Store |
| Etobicoke West Mall | Porta Via | 43.663449 | -79.589638 | Sandwich Place |
| Etobicoke West Mall | Best for Bride | 43.635767 | -79.539916 | Women's Store |


Let's visualize the services on a map

In [26]:
#Toronto coordinate
latitude = 43.651070
longitude = -79.347015
# create map of Toronto using latitude and longitude values
services_map = folium.Map(location=[latitude, longitude], zoom_start=10)


# add markers to map
for lat, lng, venue, category in zip(services_df['Venue_Latitude'], services_df['Venue_Longitude'], services_df['Venue'], services_df['Venue_Category']):
    label = 'Category:{}, {}'.format(category, venue)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius = 5,
        popup=label,
        #color='blue',
        fill=True,
        fill_color='blue',
        fill_opacity=0.7,
        parse_html=False).add_to(services_map) 
    
services_map

## Results and Discussion <a name = "Results"></a>

Through this analysis, we found a high density of restaurants in the area east of Eglinton Avenue and Lawrence Avenue, and a low density of restaurants in the north of Etobicoke.
The restaurant clusters also show that some areas of Toronto have high competition and others low. <code>cluster1</code> contains a large number of **American Restaurants** (15), but their number drops in the other clusters.

**Fast Food Restaurants** seem to be present in all clusters, which points to strong competition for anyone opening a fast-food restaurant.

Restaurants based on a particular country's cuisine are scattered across Toronto. Examples include *Greek, Chinese, Italian, Persian, Cuban and Afghan* Restaurants. Most of these do not appear in large numbers in any cluster, so for someone looking to open a restaurant based on the cuisine of his/her country, it is worth a shot.

That said, the analysis shows that **Indian and Italian** Restaurants are the most popular of these types. Opening one in <code>cluster1</code> would mean less competition than in the other clusters.

According to the map of services, there is a large number of services and businesses other than restaurants north-east of the University of Toronto, which could make that area the best place for stakeholders interested in locating their *offices* in Toronto. The businesses available there include *bars, breweries, cafes, clothing stores and bakeries*. There are also *parks, stables and banks*.

## Conclusion <a name ="Conclusion"></a>

The purpose of this project was to gather data on Toronto's neighborhoods and analyze the restaurants and services in them, in order to help stakeholders who want to open a restaurant or another business find the best area in Toronto.

We gathered neighborhood data from the City of Toronto website and used it to collect the venues in those neighborhoods through the Foursquare API. Even though data was unavailable for some neighborhoods, the available data allowed us to cluster the neighborhoods' venues and to find the areas best suited for opening restaurants, along with the types of restaurants suited to each cluster.

In the end, the decision is up to the stakeholder, based on the specific characteristics of the neighborhoods in the recommended areas and on additional factors such as culture-based business, attractiveness of the neighborhood, available services, prices and major roads.

## Change Log

| Date (YYYY-MM-DD) | Change Description                           |
| ----------------- | -------------------------------------------- |
| 2021-06-13        | Gathered the neighborhoods names             |
| 2021-06-14        | Gathered the data of venues in neighborhoods |
| 2021-06-17        | Methodology and Analysis                     |
| 2021-06-23        | Results and Conclusion                       |