## 1) Introduction/Business Problem<a id="Introduction/BusinessProblem"></a>

Introduction where you discuss the business problem and who would be interested in this project.

**_The main purpose of this project is to help people who are planning to open a restaurant in the Toronto area. This study provides data on competitors, population, and income across Toronto's neighborhoods._**

## 2) Downloading and Prepping Data<a id="#1"></a>

Data where you describe the data that will be used to solve the problem and the source of the data.

**_To provide the data for this study, I used the population and average income per neighborhood from Toronto’s 2016 Census. This information was then combined with Foursquare API data and Toronto’s neighborhood shapefile to capture the competitors within each area.
Links to this data are listed below:_**

**_Toronto Neighborhoods' shape file https://open.toronto.ca/catalogue/?sort=last_refreshed%20desc_**

**_Toronto's Census 2016
https://open.toronto.ca/dataset/wellbeing-toronto-demographics/_**



Before we get the data and start exploring it, let's download all the dependencies that we will need.

In [0]:
# !conda install -c conda-forge geopy --yes 
# !conda install -c conda-forge geocoder --yes
# !conda install -c conda-forge/label/gcc7 geopandas --yes
# !conda install -c conda-forge folium=0.5.0 --yes

In [0]:
import numpy as np

import json

import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.options.mode.chained_assignment = None  

import time

import geopy
from geopy.geocoders import Nominatim

import geocoder

import geopandas as gpd

import folium
from folium.plugins import HeatMap

import requests 
from pandas import json_normalize  # top-level since pandas 1.0 (previously pandas.io.json)

import csv

print('Libraries imported.')

Libraries imported.


### 2.1) Download and Explore Datasets

#### 2.1.1) Loading Toronto's 2016 Census into a dataframe

In [0]:
csv_path='https://www.toronto.ca/ext/open_data/catalog/data_set_files/2016_neighbourhood_profiles.csv'
df = pd.read_csv(csv_path, thousands=',',encoding='latin1')
print('Data loaded')

**Data Sample**

Exploring collected data

In [0]:
df.head()

Unnamed: 0,Category,Topic,Data Source,Characteristic,City of Toronto,Agincourt North,Agincourt South-Malvern West,Alderwood,Annex,Banbury-Don Mills,Bathurst Manor,Bay Street Corridor,Bayview Village,Bayview Woods-Steeles,Bedford Park-Nortown,Beechborough-Greenbrook,Bendale,Birchcliffe-Cliffside,Black Creek,Blake-Jones,Briar Hill-Belgravia,Bridle Path-Sunnybrook-York Mills,Broadview North,Brookhaven-Amesbury,Cabbagetown-South St. James Town,Caledonia-Fairbank,Casa Loma,Centennial Scarborough,Church-Yonge Corridor,Clairlea-Birchmount,Clanton Park,Cliffcrest,Corso Italia-Davenport,Danforth,Danforth East York,Don Valley Village,Dorset Park,Dovercourt-Wallace Emerson-Junction,Downsview-Roding-CFB,Dufferin Grove,East End-Danforth,Edenbridge-Humber Valley,Eglinton East,Elms-Old Rexdale,Englemount-Lawrence,Eringate-Centennial-West Deane,Etobicoke West Mall,Flemingdon Park,Forest Hill North,Forest Hill South,Glenfield-Jane Heights,Greenwood-Coxwell,Guildwood,Henry Farm,High Park North,High Park-Swansea,Highland Creek,Hillcrest Village,Humber Heights-Westmount,Humber Summit,Humbermede,Humewood-Cedarvale,Ionview,Islington-City Centre West,Junction Area,Keelesdale-Eglinton West,Kennedy Park,Kensington-Chinatown,Kingsview Village-The Westway,Kingsway South,Lambton Baby Point,L'Amoreaux,Lansing-Westgate,Lawrence Park North,Lawrence Park South,Leaside-Bennington,Little Portugal,Long Branch,Malvern,Maple Leaf,Markland Wood,Milliken,Mimico (includes Humber Bay Shores),Morningside,Moss Park,Mount Dennis,Mount Olive-Silverstone-Jamestown,Mount Pleasant East,Mount Pleasant West,New Toronto,Newtonbrook East,Newtonbrook West,Niagara,North Riverdale,North St. James Town,Oakridge,Oakwood Village,O'Connor-Parkview,Old East York,Palmerston-Little Italy,Parkwoods-Donalda,Pelmo Park-Humberlea,Playter Estates-Danforth,Pleasant View,Princess-Rosethorn,Regent Park,Rexdale-Kipling,Rockcliffe-Smythe,Roncesvalles,Rosedale-Moore Park,Rouge,Runnymede-Bloor West Village,Rustic,Scarborough Village,South Parkdale,South Riverdale,St.Andrew-Windfields,Steeles,Stonegate-Queensway,Tam O'Shanter-Sullivan,Taylor-Massey,The Beaches,Thistletown-Beaumond Heights,Thorncliffe Park,Trinity-Bellwoods,University,Victoria Village,Waterfront Communities-The Island,West Hill,West Humber-Clairville,Westminster-Branson,Weston,Weston-Pelham Park,Wexford/Maryvale,Willowdale East,Willowdale West,Willowridge-Martingrove-Richview,Woburn,Woodbine Corridor,Woodbine-Lumsden,Wychwood,Yonge-Eglinton,Yonge-St.Clair,York University Heights,Yorkdale-Glen Park
0,Neighbourhood Information,Neighbourhood Information,City of Toronto,Neighbourhood Number,,129,128,20,95,42,34,76,52,49,39,112,127,122,24,69,108,41,57,30,71,109,96,133,75,120,33,123,92,66,59,47,126,93,26,83,62,9,138,5,32,11,13,44,102,101,25,65,140,53,88,87,134,48,8,21,22,106,125,14,90,110,124,78,6,15,114,117,38,105,103,56,84,19,132,29,12,130,17,135,73,115,2,99,104,18,50,36,82,68,74,121,107,54,58,80,45,23,67,46,10,72,4,111,86,98,131,89,28,139,85,70,40,116,16,118,61,63,3,55,81,79,43,77,136,1,35,113,91,119,51,37,7,137,64,60,94,100,97,27,31
1,Neighbourhood Information,Neighbourhood Information,City of Toronto,TSNS2020 Designation,,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,NIA,No Designation,No Designation,NIA,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,Emerging Neighbourhood,No Designation,NIA,No Designation,No Designation,No Designation,NIA,NIA,Emerging Neighbourhood,No Designation,No Designation,NIA,No Designation,No Designation,NIA,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,Emerging Neighbourhood,NIA,NIA,No Designation,NIA,No Designation,No Designation,NIA,NIA,No Designation,NIA,No Designation,No Designation,Emerging Neighbourhood,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,Emerging Neighbourhood,No Designation,No Designation,No Designation,No Designation,NIA,No Designation,NIA,NIA,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,NIA,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,No Designation,NIA,No Designation,NIA,No Designation,No Designation,No Designation,No Designation,NIA,NIA,NIA,No Designation,No Designation,Emerging Neighbourhood,No Designation,No Designation,NIA,No Designation,NIA,NIA,No Designation,No Designation,NIA,No Designation,NIA,No Designation,Emerging Neighbourhood,NIA,NIA,No Designation,No Designation,No Designation,No Designation,NIA,No Designation,No Designation,No Designation,No Designation,No Designation,NIA,Emerging Neighbourhood
2,Population,Population and dwellings,Census Profile 98-316-X2016001,"Population, 2016",2731571,29113,23757,12054,30526,27695,15873,25797,21396,13154,23236,6577,29960,22291,21737,7727,14257,9266,11499,17757,11669,9955,10968,13362,31340,26984,16472,15935,14133,9666,17180,27051,25003,36625,35052,11785,21381,15535,22776,9456,22372,18588,11848,21933,12806,10732,30491,14417,9917,15723,22162,23925,12494,16934,10948,12416,15545,14365,13641,43965,14366,11058,17123,17945,22000,9271,7985,43993,16164,14607,15179,16828,15559,10084,43794,10111,10554,26572,33964,17455,20506,13593,32954,16775,29658,11463,16097,23831,31180,11916,18615,13845,21210,18675,9233,13826,34805,10722,7804,15818,11051,10803,10529,22246,14974,20923,46496,10070,9941,16724,21849,27876,17812,24623,25051,27446,15683,21567,10360,21108,16556,7607,17510,65913,27392,33312,26274,17992,11098,27917,50434,16936,22156,53485,12541,7865,14349,11817,12528,27593,14804
3,Population,Population and dwellings,Census Profile 98-316-X2016001,"Population, 2011",2615060,30279,21988,11904,29177,26918,15434,19348,17671,13530,23185,6488,27876,21856,22057,7763,14302,8713,11563,17787,12053,9851,10487,13093,28349,24770,14612,15703,13743,9444,16712,26739,24363,34631,34659,11449,20839,14943,22829,9550,22086,18810,10927,22168,12474,10926,31390,14083,9816,11333,21292,21740,13097,17656,10583,12525,15853,14108,13091,38084,14027,10638,17058,18495,21723,9170,7921,44919,14642,14541,15070,17011,12050,9632,45086,10197,10436,27167,26541,17587,16306,13145,32788,15982,28593,10900,16423,23052,21274,12191,17832,13497,21073,18316,9118,13746,34617,8710,7653,16144,11197,10007,10488,22267,15050,20631,45912,9632,9951,16609,21251,25642,17958,25017,24691,27398,15594,21130,10138,19225,16802,7782,17182,43361,26547,34100,25446,18170,12010,27018,45041,15004,21343,53350,11703,7826,13986,10578,11652,27713,14687
4,Population,Population and dwellings,Census Profile 98-316-X2016001,Population Change 2011-2016,4.50%,-3.90%,8.00%,1.30%,4.60%,2.90%,2.80%,33.30%,21.10%,-2.80%,0.20%,1.40%,7.50%,2.00%,-1.50%,-0.50%,-0.30%,6.30%,-0.60%,-0.20%,-3.20%,1.10%,4.60%,2.10%,10.60%,8.90%,12.70%,1.50%,2.80%,2.40%,2.80%,1.20%,2.60%,5.80%,1.10%,2.90%,2.60%,4.00%,-0.20%,-1.00%,1.30%,-1.20%,8.40%,-1.10%,2.70%,-1.80%,-2.90%,2.40%,1.00%,38.70%,4.10%,10.10%,-4.60%,-4.10%,3.40%,-0.90%,-1.90%,1.80%,4.20%,15.40%,2.40%,3.90%,0.40%,-3.00%,1.30%,1.10%,0.80%,-2.10%,10.40%,0.50%,0.70%,-1.10%,29.10%,4.70%,-2.90%,-0.80%,1.10%,-2.20%,28.00%,-0.80%,25.80%,3.40%,0.50%,5.00%,3.70%,5.20%,-2.00%,3.40%,46.60%,-2.30%,4.40%,2.60%,0.70%,2.00%,1.30%,0.60%,0.50%,23.10%,2.00%,-2.00%,-1.30%,8.00%,0.40%,-0.10%,-0.50%,1.40%,1.30%,4.50%,-0.10%,0.70%,2.80%,8.70%,-0.80%,-1.60%,1.50%,0.20%,0.60%,2.10%,2.20%,9.80%,-1.50%,-2.20%,1.90%,52.00%,3.20%,-2.30%,3.30%,-1.00%,-7.60%,3.30%,12.00%,12.90%,3.80%,0.30%,7.20%,0.50%,2.60%,11.70%,7.50%,-0.40%,0.80%


#### 2.1.2) Collecting neighborhood names

In [0]:
Neighbourhoods = list(df.columns.values)
Neighbourhoods = Neighbourhoods[5:]
print(Neighbourhoods)

['Agincourt North', 'Agincourt South-Malvern West', 'Alderwood', 'Annex', 'Banbury-Don Mills', 'Bathurst Manor', 'Bay Street Corridor', 'Bayview Village', 'Bayview Woods-Steeles', 'Bedford Park-Nortown', 'Beechborough-Greenbrook', 'Bendale', 'Birchcliffe-Cliffside', 'Black Creek', 'Blake-Jones', 'Briar Hill-Belgravia', 'Bridle Path-Sunnybrook-York Mills', 'Broadview North', 'Brookhaven-Amesbury', 'Cabbagetown-South St. James Town', 'Caledonia-Fairbank', 'Casa Loma', 'Centennial Scarborough', 'Church-Yonge Corridor', 'Clairlea-Birchmount', 'Clanton Park', 'Cliffcrest', 'Corso Italia-Davenport', 'Danforth', 'Danforth East York', 'Don Valley Village', 'Dorset Park', 'Dovercourt-Wallace Emerson-Junction', 'Downsview-Roding-CFB', 'Dufferin Grove', 'East End-Danforth', 'Edenbridge-Humber Valley', 'Eglinton East', 'Elms-Old Rexdale', 'Englemount-Lawrence', 'Eringate-Centennial-West Deane', 'Etobicoke West Mall', 'Flemingdon Park', 'Forest Hill North', 'Forest Hill South', 'Glenfield-Jane Heig

#### 2.1.3) Collecting Population and Average Income per neighborhood

Creating a new dataset

In [0]:
dfToronto = pd.DataFrame(index=Neighbourhoods, columns=['Population_2016','Income_2016','Neighbourhood_Number'])

Populating the dataset with the data

In [0]:
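# Population_2016 = Population, 2016
# Income_2016 = Total income: Average amount ($)

# Copy the relevant census rows (addressed by row position) into dfToronto
for index, row in dfToronto.iterrows():
    dfToronto.at[index, 'Population_2016'] = df[index][2]       # row 2: "Population, 2016"
    dfToronto.at[index, 'Income_2016'] = df[index][2264]        # row 2264: average total income
    dfToronto.at[index, 'Neighbourhood_Number'] = df[index][0]  # row 0: "Neighbourhood Number"

# Note: replacing ',' with '.' turns the thousands separator into a decimal point,
# so both columns end up expressed in thousands (e.g. '9,266' becomes 9.266)
dfToronto['Population_2016'] = dfToronto['Population_2016'].str.replace(',','.').astype(float)
dfToronto['Income_2016'] = dfToronto['Income_2016'].str.replace(',','.').astype(float)

dfToronto = dfToronto.sort_values('Income_2016', ascending=False)
dfToronto.head()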

| | Population_2016 | Income_2016 | Neighbourhood_Number |
|---|---|---|---|
| Bridle Path-Sunnybrook-York Mills | 9.266 | 308.01 | 41 |
| Rosedale-Moore Park | 20.923 | 207.903 | 98 |
| Forest Hill South | 10.732 | 204.521 | 101 |
| Lawrence Park South | 15.179 | 169.203 | 103 |
| Casa Loma | 10.968 | 165.047 | 96 |


The Toronto Open Data website provides a shapefile with all neighbourhoods. NEIGHBORHOODS_WGS84.shp can be found at https://www.toronto.ca/city-government/data-research-maps/open-data/open-data-catalogue/#a45bd45a-ede8-730e-1abc-93105b2c439f


Converting the shapefile to GeoJSON using shp2gj.py - https://gist.github.com/frankrowe/6071443 

In [0]:
import shapefile

# read the shapefile
reader = shapefile.Reader("NEIGHBORHOODS_WGS84.shp")
fields = reader.fields[1:]
field_names = [field[0] for field in fields]
buffer = []
for sr in reader.shapeRecords():
    atr = dict(zip(field_names, sr.record))
    geom = sr.shape.__geo_interface__
    buffer.append(dict(type="Feature", geometry=geom, properties=atr))

# write the GeoJSON file (the with-block closes the file automatically)
from json import dumps
with open("torontoNeighbourhoods.json", "w") as geojson:
    geojson.write(dumps({"type": "FeatureCollection", "features": buffer}, indent=2) + "\n")

print("torontoNeighbourhoods.json created!")

torontoNeighbourhoods.json created!
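
The choropleth maps later in this notebook join the dataframe to this file through the `AREA_S_CD` property, so it can be worth checking that every neighbourhood ID has a matching feature. The sketch below is one possible check, not part of the original workflow: `area_codes` and `unmatched_ids` are hypothetical helpers, and the sample feature collection is a made-up stand-in for `torontoNeighbourhoods.json`.

```python
def area_codes(feature_collection, key="AREA_S_CD"):
    """Return the set of area codes found in a GeoJSON FeatureCollection."""
    return {
        str(feature["properties"][key])
        for feature in feature_collection.get("features", [])
        if key in feature.get("properties", {})
    }


def unmatched_ids(ids, feature_collection, key="AREA_S_CD"):
    """Return the sorted IDs that have no matching feature in the GeoJSON."""
    return sorted(set(map(str, ids)) - area_codes(feature_collection, key))


# Made-up stand-in for the contents of torontoNeighbourhoods.json
sample = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "geometry": None, "properties": {"AREA_S_CD": "041"}},
    {"type": "Feature", "geometry": None, "properties": {"AREA_S_CD": "098"}},
]}

# "98" is reported because it is not zero-padded like the code in the file
print(unmatched_ids(["041", "98"], sample))
```

For the real file, the dictionary could be loaded with `json.load(open("torontoNeighbourhoods.json"))` and compared against the `Neighbourhood_Number` column.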


#### 2.1.4) Collecting geocodes for each neighborhood

In [0]:
dfToronto['Latitude'] = pd.Series("", index=dfToronto.index)
dfToronto['Longitude'] = pd.Series("", index=dfToronto.index)
dfToronto.reset_index()


| | index | Population_2016 | Income_2016 | Neighbourhood_Number | Latitude | Longitude |
|---|---|---|---|---|---|---|
| 0 | Bridle Path-Sunnybrook-York Mills | 9.266 | 308.01 | 41 | | |
| 1 | Rosedale-Moore Park | 20.923 | 207.903 | 98 | | |
| 2 | Forest Hill South | 10.732 | 204.521 | 101 | | |
| 3 | Lawrence Park South | 15.179 | 169.203 | 103 | | |
| 4 | Casa Loma | 10.968 | 165.047 | 96 | | |
| 5 | Kingsway South | 9.271 | 144.642 | 15 | | |
| 6 | Leaside-Bennington | 16.828 | 125.564 | 56 | | |
| 7 | Bedford Park-Nortown | 23.236 | 123.077 | 39 | | |
| 8 | Yonge-St.Clair | 12.528 | 114.174 | 97 | | |
| 9 | Annex | 30.526 | 112.766 | 95 | | |
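
The next cell loads a `dfToronto.csv` file that already contains coordinates; the geocoding step that produced it is not shown. One way it could be done is sketched below. This is an assumption rather than the original method: `geocode_neighbourhoods` is a hypothetical helper, the `", Toronto, ON"` suffix is a guess at a reasonable query, and the one-second pause is there to respect the public Nominatim rate limit.

```python
import time


def geocode_neighbourhoods(names, geocode, suffix=", Toronto, ON", delay=1.0):
    """Return {name: (lat, lon)}, with (None, None) when nothing is found.

    `geocode` is any callable that takes an address string and returns an
    object with .latitude and .longitude, or None (as geopy geocoders do).
    """
    coords = {}
    for name in names:
        location = geocode(name + suffix)
        if location is None:
            coords[name] = (None, None)
        else:
            coords[name] = (location.latitude, location.longitude)
        time.sleep(delay)  # pause between requests to the geocoding service
    return coords


# Usage with geopy (needs network access), left commented out:
# from geopy.geocoders import Nominatim
# geolocator = Nominatim(user_agent="my-application")
# coords = geocode_neighbourhoods(dfToronto.index, geolocator.geocode)
```

Passing the geocoder in as a callable keeps the loop easy to try out with a stub before sending real requests.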


In [0]:
# Column names for dfToronto.csv (assumed from the output shown below)
column_names = ['Neighborhood', 'Population_2016', 'Income_2016',
                'Neighbourhood_Number', 'Latitude', 'Longitude']
dfToronto = pd.read_csv("dfToronto.csv", header=0, names=column_names, sep=';')

# Convert the neighbourhood IDs to strings and left-pad them with zeros
# to 3 digits so that they match the IDs in the GeoJSON file
dfToronto['Neighbourhood_Number'] = dfToronto['Neighbourhood_Number'].astype(str).str.zfill(3)

dfToronto.head()

| | Neighborhood | Population_2016 | Income_2016 | Neighbourhood_Number | Latitude | Longitude |
|---|---|---|---|---|---|---|
| 0 | Bridle Path-Sunnybrook-York Mills | 9.266 | 308.01 | 41 | | |
| 1 | Rosedale-Moore Park | 20.923 | 207.903 | 98 | 43.690388 | -79.383297 |
| 2 | Forest Hill South | 10.732 | 204.521 | 101 | 43.693559 | -79.413902 |
| 3 | Lawrence Park South | 15.179 | 169.203 | 103 | 43.729199 | -79.403253 |
| 4 | Casa Loma | 10.968 | 165.047 | 96 | 43.678111 | -79.409408 |


#### 2.1.5) Creating a new map of Toronto and Choropleth

In [0]:
#https://www.toronto.ca/city-government/data-research-maps/open-data/open-data-catalogue/#a45bd45a-ede8-730e-1abc-93105b2c439f

#Use geopy library to get the latitude and longitude values of Toronto
address = 'Toronto,ON'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


#### 2.1.6) Toronto's Income per Neighborhood

In [0]:
# Load the shape of the neighborhoods
state_geo = 'torontoNeighbourhoods.json'

# Initialize the map:
m = folium.Map(tiles='Stamen Terrain', location=[latitude+0.05, longitude+0.1], zoom_start=11)

# Add the color for the Income_2016 choropleth:
m.choropleth(
    geo_data=state_geo,
    name='Income_2016',
    data=dfToronto,
    columns=['Neighbourhood_Number', 'Income_2016'],
    key_on='properties.AREA_S_CD',
    fill_color='YlGn',
    fill_opacity=0.5,
    line_opacity=0.5,
    legend_name='Income 2016'
)
m.add_child(folium.map.LayerControl())

# Display the map
m

#### 2.1.7) Toronto's Population per Neighborhood

In [0]:
# Initialize the map:
m1 = folium.Map(tiles='Stamen Terrain', location=[latitude+0.05, longitude+0.1], zoom_start=11)

# Add the color for the Population_2016 choropleth:
m1.choropleth(
    geo_data=state_geo,
    name='Population_2016',
    data=dfToronto,
    columns=['Neighbourhood_Number', 'Population_2016'],
    key_on='properties.AREA_S_CD',
    fill_color='YlOrRd',
    fill_opacity=0.5,
    line_opacity=0.5,
    legend_name='Population 2016'
)

m1.add_child(folium.map.LayerControl())

# Display the map
m1

In [0]:
# Initialize the map:
m2 = folium.Map(tiles='Stamen Terrain', location=[latitude+0.05, longitude+0.1], zoom_start=11)

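# Add a marker for every neighborhood that has coordinates
for i in range(len(dfToronto)):
    try:
        folium.Marker(location=[dfToronto['Latitude'][i], dfToronto['Longitude'][i]],
                      popup=folium.Popup(dfToronto['Neighborhood'][i], parse_html=True)
                      ).add_to(m2)
    except Exception:
        # skip neighborhoods whose coordinates are missing or invalid
        pass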
    
m2

#### 2.1.8) Utilizing the Foursquare API to explore the neighborhoods venues

In [0]:
# Define Foursquare credentials and version

CLIENT_ID = 'xxxxxxxxxxxxxx' # your Foursquare ID
CLIENT_SECRET = 'xxxxxxxxxxxxxxxxx' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

In [0]:
LIMIT = 100
RADIUS = 500

# function that fetches the venues near a location and returns them as a dataframe
def getNearbyVenues(name, lat, lng, radius=RADIUS):
    
    venues_list=[]

    # create the API request URL
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
        CLIENT_ID, 
        CLIENT_SECRET, 
        VERSION, 
        lat, 
        lng, 
        radius, 
        LIMIT)

    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']

    # return only relevant information for each nearby venue
    venues_list.append([(
        name, 
        lat, 
        lng, 
        v['venue']['name'], 
        v['venue']['location']['lat'], 
        v['venue']['location']['lng'],  
        v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
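
The function above indexes `categories[0]` directly, which raises an error for any venue that has no category. As a sketch of a more defensive variant of the parsing step only, the hypothetical helper `extract_venue_rows` below flattens a response with the same nesting (`response` → `groups` → `items` → `venue`) and falls back to `None` when a category is missing. The payload is made up for illustration and is not real Foursquare data.

```python
def extract_venue_rows(name, lat, lng, payload):
    """Flatten an 'explore'-style response into one tuple per venue."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Fall back to None instead of raising IndexError on an empty list
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue["name"],
                     venue["location"]["lat"], venue["location"]["lng"],
                     category))
    return rows


# Made-up payload that mimics the structure parsed by getNearbyVenues
sample_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Venue A", "location": {"lat": 43.69, "lng": -79.38},
               "categories": [{"name": "Sushi Restaurant"}]}},
    {"venue": {"name": "Venue B", "location": {"lat": 43.70, "lng": -79.39},
               "categories": []}},
]}]}}

rows = extract_venue_rows("Example", 43.69, -79.38, sample_payload)
print(rows)
```

The returned tuples follow the same column order as the dataframe built in `getNearbyVenues`, so they could be passed to `pd.DataFrame` with the same column names.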

In [0]:
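## List all venues of all neighborhoods

# Collect one dataframe per neighborhood and combine them at the end
# (DataFrame.append was removed in pandas 2.0)
frames = []

for i in range(len(dfToronto)):
    print(".", end="")
    try:
        frames.append(getNearbyVenues(name=dfToronto['Neighborhood'][i], lat=dfToronto['Latitude'][i], lng=dfToronto['Longitude'][i]))
    except Exception:
        # skip neighborhoods with missing coordinates or failed requests
        pass

nearby_venues = pd.concat(frames) if frames else pd.DataFrame()

nearby_venues.head()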

............................................................................................................................................

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Rosedale-Moore Park | 43.690388 | -79.383297 | Pure Fitness | 43.693007 | -79.387781 | Gym |
| 1 | Rosedale-Moore Park | 43.690388 | -79.383297 | Moore Park Tennis Club | 43.693289 | -79.3829 | Tennis Court |
| 2 | Rosedale-Moore Park | 43.690388 | -79.383297 | Moorevale Park | 43.69361 | -79.383465 | Playground |
| 3 | Rosedale-Moore Park | 43.690388 | -79.383297 | Mount Pleasant Road And Moore | 43.69356 | -79.3846 | Intersection |
| 4 | Rosedale-Moore Park | 43.690388 | -79.383297 | On The Run | 43.69341 | -79.386506 | Convenience Store |


In [0]:
print('There are {} unique categories.'.format(len(nearby_venues['Venue Category'].unique())))

nearby_venues['Venue Category'].unique()

There are 260 unique categories.


array(['Gym', 'Tennis Court', 'Playground', 'Intersection',
       'Convenience Store', 'Bank', 'Arts & Crafts Store', 'Park',
       'Accessories Store', 'Mediterranean Restaurant', 'Bakery',
       'Tea Room', 'Bubble Tea Shop', 'Ice Cream Shop',
       'Toy / Game Store', 'BBQ Joint', 'Burger Joint',
       'Japanese Restaurant', 'Asian Restaurant', 'Sushi Restaurant',
       'Café', 'Coffee Shop', 'Pub', 'Italian Restaurant', 'Diner',
       'Hobby Shop', 'Frozen Yogurt Shop', 'Pharmacy', 'Pizza Place',
       'Lingerie Store', 'Cosmetics Shop', 'Seafood Restaurant',
       'Sandwich Place', 'Gastropub', 'Dance Studio', 'Bus Line',
       'Fast Food Restaurant', 'Mobile Phone Shop', 'Pool',
       'Salon / Barbershop', 'Shoe Store', 'Shopping Mall',
       'Metro Station', 'Massage Studio', 'Deli / Bodega', 'Spa',
       'Castle', 'Historic Site', 'Museum', 'History Museum',
       'Indian Restaurant', 'French Restaurant', 'Steakhouse',
       'Vegetarian / Vegan Restaurant', 'Thea

In [0]:
## Keep only the restaurants out of all venues
# .copy() gives an independent dataframe so later column assignments are safe
nearby_restaurants = nearby_venues[nearby_venues['Venue Category'].str.contains("Restaurant")].copy()
nearby_restaurants.head()

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 5 | Forest Hill South | 43.693559 | -79.413902 | Sofra Mediterranean Cuisine | 43.689251 | -79.412882 | Mediterranean Restaurant |
| 7 | Lawrence Park South | 43.729199 | -79.403253 | Shinobu by Maki Sushi | 43.732562 | -79.404147 | Japanese Restaurant |
| 8 | Lawrence Park South | 43.729199 | -79.403253 | Riz North | 43.730792 | -79.403719 | Asian Restaurant |
| 10 | Lawrence Park South | 43.729199 | -79.403253 | Shoushin | 43.73138 | -79.403961 | Sushi Restaurant |
| 16 | Lawrence Park South | 43.729199 | -79.403253 | Yonge Sushi | 43.733182 | -79.404323 | Sushi Restaurant |
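
Since the study is about competition, a natural follow-up to this filter is counting the restaurants found in each neighborhood. The sketch below shows the idea with the standard library. The rows are copied from the five sample rows shown above, so the counts are illustrative only and do not reflect the full dataset; `restaurant_counts` is a name introduced here.

```python
from collections import Counter

# (Neighborhood, Venue Category) pairs taken from the sample output above
rows = [
    ("Forest Hill South", "Mediterranean Restaurant"),
    ("Lawrence Park South", "Japanese Restaurant"),
    ("Lawrence Park South", "Asian Restaurant"),
    ("Lawrence Park South", "Sushi Restaurant"),
    ("Lawrence Park South", "Sushi Restaurant"),
]

# Count how many restaurants were returned for each neighborhood
restaurant_counts = Counter(neighborhood for neighborhood, _ in rows)

for neighborhood, count in restaurant_counts.most_common():
    print(neighborhood, count)
```

On the full dataframe, `nearby_restaurants.groupby('Neighborhood').size()` would give the same kind of count.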


#### 2.1.9) Plotting the restaurants on the map

In [0]:
# Initialize the map:
m3 = folium.Map(tiles='Stamen Terrain', location=[latitude+0.05, longitude+0.1], zoom_start=11)

nearby_restaurants['Venue Latitude'] = pd.to_numeric(nearby_restaurants['Venue Latitude'], errors='coerce').fillna(0)
nearby_restaurants['Venue Longitude'] = pd.to_numeric(nearby_restaurants['Venue Longitude'], errors='coerce').fillna(0)

# Add a green marker for every restaurant
for la, lo, ve in zip(nearby_restaurants['Venue Latitude'], nearby_restaurants['Venue Longitude'], nearby_restaurants['Venue']):
    try:
        m3.add_child(folium.Marker(location=[la, lo],
                                   popup=folium.Popup(ve, parse_html=True),
                                   icon=folium.Icon(icon='info-sign', color='green')))
    except Exception:
        print("error")

m3

## 3) Methodology

Methodology section, which represents the main component of the report, where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, and what machine learning techniques were used and why.

**_In this report I used several maps to help an individual or investor decide in which neighborhood to open a new restaurant, based on population, income, and competition. To build these maps, I used Foursquare data to display the restaurants currently in each neighborhood, and Toronto’s 2016 Census combined with choropleth maps to show the population and income of each neighborhood._**

## 4) Results

Results section where you discuss the results.

**_As the maps show, the majority of the restaurants are located in the south of Toronto and along its main streets. Most of the high-income neighborhoods are located further north. A larger population in an area did not correspond to a larger number of restaurants in that area._**

## 5) Discussion

Discussion section where you discuss any observations you noted and any recommendations you can make based on the results.

**_At the beginning of this project I expected to find clusters of restaurants in different parts of Toronto, but in the end the results did not support this hypothesis._**

## 6) Conclusion

Conclusion section where you conclude the report.

**_In conclusion, this study can be useful for investors who are looking to open a new restaurant in Toronto, as it compares neighborhood profiles based on income, population, and competition. The study uses only a few variables and does not cover every factor that may need to be taken into consideration, so it cannot serve as the sole decision-making tool for an individual or investor._**