# Capstone Project - The Battle of the Neighborhoods (Week 2)

### Applied Data Science Capstone by Briana Grant

## Table of Contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)   

## Introduction: Business Problem <a name="introduction"></a>

The stakeholder wants to open an affordable Vegan/Vegetarian restaurant for an underserved population.

The most desirable location would be one that already has other Vegan/Vegetarian restaurants nearby, yet where residents have a low average income. This study displays the neighborhoods with the most Vegan/Vegetarian venues, the lowest income, and the fewest restaurants, so that anyone seeking to open a new business can use competitive data to inform their decision.

## Data <a name="data"></a>

Based on the definition of our problem, the factors that will influence our decision are:

* number of existing restaurants in the neighborhood (any type of restaurant)
* number of Vegan/Vegetarian restaurants in the neighborhood, if any

The data being used in this study will be sourced from:

* Neighborhood population: **2016 Toronto Census**
* Average income per neighborhood: **2016 Toronto Census**
* Existing venue data: the number of restaurants in every neighborhood, with their type and location, obtained using the **Foursquare API**

In [50]:
import requests
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
import json
from pandas.io.json import json_normalize
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans

In [51]:
from pandas import DataFrame
from geopy.geocoders import Nominatim
import folium

## Methodology <a name="methodology"></a>

This project highlights areas of Toronto that have a low density of Vegan/Vegetarian restaurants.

In the first step we collected the required data: the location and type (category) of every venue that Foursquare returns within 500 m of each Toronto neighborhood's coordinates (the default `radius` of the `getNearbyVenues` function). Vegan/Vegetarian restaurants were then identified according to Foursquare's categorization.

The second step of the analysis is the calculation and exploration of 'restaurant density' across different areas of Toronto; we use map markers to identify clusters of the restaurants of interest.

In the third and final step we focus on the most promising areas, combining restaurant density with whether an area is low-income or high-income.
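
The second and third steps can be sketched roughly as follows. This is a minimal illustration rather than the notebook's own code: the venue rows and the `avg_income` figures are made-up placeholders, and treating any category whose name contains "Restaurant" as a restaurant is an assumption of this example.

```python
from collections import Counter

VEG_CATEGORY = 'Vegetarian / Vegan Restaurant'

# (neighborhood, venue category) pairs; placeholder rows, not real query results
venues = [
    ('A', 'Fast Food Restaurant'),
    ('A', 'Bank'),
    ('B', 'Vegetarian / Vegan Restaurant'),
    ('B', 'Italian Restaurant'),
    ('B', 'Sushi Restaurant'),
    ('C', 'Golf Course'),
]

# hypothetical average income per neighborhood (placeholder values)
avg_income = {'A': 30000, 'B': 60000, 'C': 45000}

# assumption: a category containing "Restaurant" counts as a restaurant
restaurants = Counter(n for n, cat in venues if 'Restaurant' in cat)
veg = Counter(n for n, cat in venues if cat == VEG_CATEGORY)


def summarize(neighborhoods):
    """Return (neighborhood, restaurants, veg restaurants, income), lowest income first."""
    rows = [(n, restaurants[n], veg[n], avg_income[n]) for n in neighborhoods]
    return sorted(rows, key=lambda row: row[3])


summary = summarize(avg_income)
for row in summary:
    print(row)
```

Sorting by income first is only one possible choice; a stakeholder could equally rank by the number of Vegan/Vegetarian venues or by a weighted combination of the columns.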

# Toronto Boroughs and Neighborhoods

The following data on Toronto's boroughs and neighborhoods, with their latitude and longitude, comes from a dataset previously created in the Capstone repository. It has already been cleaned.

In [52]:
df = pd.read_pickle('Toronto')

In [13]:
df

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
...,...,...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944
99,M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.636258,-79.498509


# Borough: Scarborough

In [14]:
scarborough_data= df[df['Borough'] == 'Scarborough'].reset_index(drop=True)
scarborough_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476


In [15]:
address = 'Scarborough, Toronto'

geolocator = Nominatim(user_agent="ga_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Scarborough, CA are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Scarborough, CA are 43.773077, -79.257774.


In [68]:
map_scarborough = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(scarborough_data['Latitude'], scarborough_data['Longitude'], scarborough_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_scarborough)  
print("Scarborough's Neighborhoods")   
map_scarborough

Scarborough's Neighborhoods


In [17]:
import os

# Foursquare credentials are read from environment variables so that secrets
# are neither stored in nor printed from the notebook. The variable names
# FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are chosen here for
# illustration; set them in your shell before starting Jupyter.
CLIENT_ID = os.environ["FOURSQUARE_CLIENT_ID"]
CLIENT_SECRET = os.environ["FOURSQUARE_CLIENT_SECRET"]
VERSION = "20200712"
LIMIT = 30


In [18]:

def getNearbyVenues(names, latitudes, longitudes, radius=500, limit=100):
    """Return one row per venue found within `radius` metres of each neighborhood."""
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        # query the Foursquare "explore" endpoint around this neighborhood
        response = requests.get(
            'https://api.foursquare.com/v2/venues/explore',
            params={
                'client_id': CLIENT_ID,
                'client_secret': CLIENT_SECRET,
                'v': VERSION,
                'll': '{},{}'.format(lat, lng),
                'radius': radius,
                'limit': limit,
            })
        # stop early on an HTTP error instead of failing on a missing key below
        response.raise_for_status()
        results = response.json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # passing the column names to the constructor also works when no venue was found
    columns = ['Neighborhood',
               'Neighborhood Latitude',
               'Neighborhood Longitude',
               'Venue',
               'Venue Latitude',
               'Venue Longitude',
               'Venue Category']
    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list],
        columns=columns)

    print('Found {} venues in {} neighborhoods.'.format(nearby_venues.shape[0], len(venues_list)))

    return nearby_venues
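
To make the shape of the data that `getNearbyVenues` unpacks easier to see, the sketch below applies the same extraction to a hand-written sample payload. The payload is a made-up stand-in that mirrors only the keys the function reads (`response`, `groups`, `items`, `venue`, `location`, `categories`) and is not a real Foursquare response. The helper name `extract_rows` and the `'Uncategorized'` fallback for venues without a category are assumptions of this example.

```python
# hand-written stand-in for the JSON returned by the "explore" request
sample = {
    'response': {
        'groups': [{
            'items': [
                {'venue': {'name': "Wendy's",
                           'location': {'lat': 43.807448, 'lng': -79.199056},
                           'categories': [{'name': 'Fast Food Restaurant'}]}},
                # placeholder venue with no category, to exercise the fallback
                {'venue': {'name': 'Placeholder Venue',
                           'location': {'lat': 43.8, 'lng': -79.2},
                           'categories': []}},
            ]
        }]
    }
}


def extract_rows(neighborhood, payload):
    """Flatten one payload into (neighborhood, venue, lat, lng, category) tuples."""
    rows = []
    for item in payload['response']['groups'][0]['items']:
        venue = item['venue']
        categories = venue['categories']
        # assumption: label venues that carry no category instead of raising IndexError
        category = categories[0]['name'] if categories else 'Uncategorized'
        rows.append((neighborhood,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows


rows = extract_rows('Malvern, Rouge', sample)
for row in rows:
    print(row)
```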

In [19]:
scarborough_venues = getNearbyVenues(names=scarborough_data['Neighborhood'],
                                   latitudes=scarborough_data['Latitude'],
                                   longitudes=scarborough_data['Longitude'])

Found 88 venues in 17 neighborhoods.


In [20]:
print(scarborough_venues.shape)
scarborough_venues.head()

(88, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Malvern, Rouge",43.806686,-79.194353,Wendy’s,43.807448,-79.199056,Fast Food Restaurant
1,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,RIGHT WAY TO GOLF,43.785177,-79.161108,Golf Course
2,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,Great Shine Window Cleaning,43.783145,-79.157431,Home Service
3,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,Royal Canadian Legion,43.782533,-79.163085,Bar
4,"Guildwood, Morningside, West Hill",43.763573,-79.188711,RBC Royal Bank,43.76679,-79.191151,Bank


In [30]:
scarborough_veg = scarborough_venues[scarborough_venues['Venue Category'] == 'Vegetarian / Vegan Restaurant']
scarborough_veg.head(9)

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category


There are zero (0) neighborhoods in the Scarborough borough of Toronto with a Vegan/Vegetarian restaurant in the Foursquare results.

# Borough: Downtown Toronto

In [22]:
# select the borough's rows first so that the map cell below can use them
metro_data = df[df['Borough'] == 'Downtown Toronto'].reset_index(drop=True)
metro_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
4,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306


In [69]:
address = 'Downtown Toronto, Toronto'

geolocator = Nominatim(user_agent="ga_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Downtown Toronto, CA are {}, {}.'.format(latitude, longitude))

map_metro = folium.Map(location=[latitude, longitude], zoom_start=11)

# add one marker per neighborhood to the map
for lat, lng, label in zip(metro_data['Latitude'], metro_data['Longitude'], metro_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_metro)
print("Metropolitan Toronto Neighborhoods")
map_metro

The geographical coordinates of Downtown Toronto, CA are 43.6541737, -79.38081164513409.
Metropolitan Toronto Neighborhoods


In [38]:
metro_venues = getNearbyVenues(names=metro_data['Neighborhood'],
                                   latitudes=metro_data['Latitude'],
                                   longitudes=metro_data['Longitude'])

metro_veg = metro_venues[metro_venues['Venue Category'] == 'Vegetarian / Vegan Restaurant']
metro_veg.head(9)

Found 1228 venues in 19 neighborhoods.


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
76,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,The Green Beet Cafe,43.662096,-79.394153,Vegetarian / Vegan Restaurant
204,St. James Town,43.651494,-79.375418,Fresh On Front,43.647815,-79.374453,Vegetarian / Vegan Restaurant
266,Berczy Park,43.644771,-79.373306,Fresh On Front,43.647815,-79.374453,Vegetarian / Vegan Restaurant
343,Central Bay Street,43.657952,-79.387383,Vegetarian Haven,43.656016,-79.392758,Vegetarian / Vegan Restaurant
410,"Richmond, Adelaide, King",43.650571,-79.384568,Rosalinda,43.650252,-79.385156,Vegetarian / Vegan Restaurant
562,"Harbourfront East, Union Station, Toronto Islands",43.640816,-79.381752,Kupfert & Kim,43.641179,-79.378144,Vegetarian / Vegan Restaurant
647,"Toronto Dominion Centre, Design Exchange",43.647177,-79.381576,Rosalinda,43.650252,-79.385156,Vegetarian / Vegan Restaurant
747,"Commerce Court, Victoria Hotel",43.648198,-79.379817,Fresh On Front,43.647815,-79.374453,Vegetarian / Vegan Restaurant
787,"Commerce Court, Victoria Hotel",43.648198,-79.379817,Rosalinda,43.650252,-79.385156,Vegetarian / Vegan Restaurant
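
Several rows above refer to the same restaurant, because the 500 m search circles of adjacent postal codes overlap. A rough sketch of counting distinct venues, using only the nine rows displayed above and treating an identical name and coordinates as the same venue (an assumption), could look like this:

```python
from collections import defaultdict

# (neighborhood, venue, venue latitude, venue longitude) for the nine rows shown above
veg_rows = [
    ("Queen's Park, Ontario Provincial Government", 'The Green Beet Cafe', 43.662096, -79.394153),
    ('St. James Town', 'Fresh On Front', 43.647815, -79.374453),
    ('Berczy Park', 'Fresh On Front', 43.647815, -79.374453),
    ('Central Bay Street', 'Vegetarian Haven', 43.656016, -79.392758),
    ('Richmond, Adelaide, King', 'Rosalinda', 43.650252, -79.385156),
    ('Harbourfront East, Union Station, Toronto Islands', 'Kupfert & Kim', 43.641179, -79.378144),
    ('Toronto Dominion Centre, Design Exchange', 'Rosalinda', 43.650252, -79.385156),
    ('Commerce Court, Victoria Hotel', 'Fresh On Front', 43.647815, -79.374453),
    ('Commerce Court, Victoria Hotel', 'Rosalinda', 43.650252, -79.385156),
]

# group the neighborhoods that reach each distinct venue
reached_from = defaultdict(list)
for neighborhood, venue, lat, lng in veg_rows:
    reached_from[(venue, lat, lng)].append(neighborhood)

print('{} rows, {} distinct venues'.format(len(veg_rows), len(reached_from)))
for (venue, lat, lng), neighborhoods in reached_from.items():
    print('{}: within 500 m of {} neighborhood centre(s)'.format(venue, len(neighborhoods)))
```

For the nine rows shown this reports 5 distinct venues, so row counts overstate the number of restaurants wherever search circles overlap.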


# Borough: Etobicoke

In [31]:
# select the borough's rows first so that the map cell below can use them
etobicoke_data = df[df['Borough'] == 'Etobicoke'].reset_index(drop=True)
etobicoke_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M9A,Etobicoke,"Islington Avenue, Humber Valley Village",43.667856,-79.532242
1,M9B,Etobicoke,"West Deane Park, Princess Gardens, Martin Grov...",43.650943,-79.554724
2,M9C,Etobicoke,"Eringate, Bloordale Gardens, Old Burnhamthorpe...",43.643515,-79.577201
3,M9P,Etobicoke,Westmount,43.696319,-79.532242
4,M9R,Etobicoke,"Kingsview Village, St. Phillips, Martin Grove ...",43.688905,-79.554724


In [71]:
address = 'Etobicoke, Toronto'

geolocator = Nominatim(user_agent="ga_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Etobicoke, CA are {}, {}.'.format(latitude, longitude))

map_etobi = folium.Map(location=[latitude, longitude], zoom_start=11)

# add one marker per neighborhood to the map
for lat, lng, label in zip(etobicoke_data['Latitude'], etobicoke_data['Longitude'], etobicoke_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_etobi)
print("Etobicoke Neighborhoods")
map_etobi

The geographical coordinates of Etobicoke, CA are 43.6435559, -79.5656326.
Etobicoke Neighborhoods


In [36]:
etobi_venues = getNearbyVenues(names=etobicoke_data['Neighborhood'],
                                   latitudes=etobicoke_data['Latitude'],
                                   longitudes=etobicoke_data['Longitude'])

etobi_veg = etobi_venues[etobi_venues['Venue Category'] == 'Vegetarian / Vegan Restaurant']
etobi_veg.head(9)

Found 72 venues in 12 neighborhoods.


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category


# Borough: North York

In [37]:
# select the borough's rows first so that the map cell below can use them
nyork_data = df[df['Borough'] == 'North York'].reset_index(drop=True)
nyork_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
3,M3B,North York,Don Mills,43.745906,-79.352188
4,M6B,North York,Glencairn,43.709577,-79.445073


In [77]:
address = 'North York, Toronto'

geolocator = Nominatim(user_agent="ga_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of North York, CA are {}, {}.'.format(latitude, longitude))

map_nyork = folium.Map(location=[latitude, longitude], zoom_start=11)

# add one marker per neighborhood to the map
for lat, lng, label in zip(nyork_data['Latitude'], nyork_data['Longitude'], nyork_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_nyork)
print("North York Neighborhoods")
map_nyork

The geographical coordinates of North York, CA are 43.7543263, -79.44911696639593.
North York Neighborhoods


In [39]:
nyork_venues = getNearbyVenues(names=nyork_data['Neighborhood'],
                                   latitudes=nyork_data['Latitude'],
                                   longitudes=nyork_data['Longitude'])

nyork_veg = nyork_venues[nyork_venues['Venue Category'] == 'Vegetarian / Vegan Restaurant']
nyork_veg.head(9)

Found 239 venues in 24 neighborhoods.


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category


# Borough: East York

In [40]:
# select the borough's rows first so that the map cell below can use them
eyork_data = df[df['Borough'] == 'East York'].reset_index(drop=True)
eyork_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M4B,East York,"Parkview Hill, Woodbine Gardens",43.706397,-79.309937
1,M4C,East York,Woodbine Heights,43.695344,-79.318389
2,M4G,East York,Leaside,43.70906,-79.363452
3,M4H,East York,Thorncliffe Park,43.705369,-79.349372
4,M4J,East York,"East Toronto, Broadview North (Old East York)",43.685347,-79.338106


In [78]:
address = 'East York, Toronto'

geolocator = Nominatim(user_agent="ga_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of East York, CA are {}, {}.'.format(latitude, longitude))

map_eyork = folium.Map(location=[latitude, longitude], zoom_start=11)

# add one marker per neighborhood to the map
for lat, lng, label in zip(eyork_data['Latitude'], eyork_data['Longitude'], eyork_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_eyork)
print("East York Neighborhoods")
map_eyork

The geographical coordinates of East York, CA are 43.699971000000005, -79.33251996261595.
East York Neighborhoods


In [41]:
eyork_venues = getNearbyVenues(names=eyork_data['Neighborhood'],
                                   latitudes=eyork_data['Latitude'],
                                   longitudes=eyork_data['Longitude'])

eyork_veg = eyork_venues[eyork_venues['Venue Category'] == 'Vegetarian / Vegan Restaurant']
eyork_veg.head(9)

Found 77 venues in 5 neighborhoods.


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category


# Borough: York

In [43]:
# select the borough's rows first so that the map cell below can use them
york_data = df[df['Borough'] == 'York'].reset_index(drop=True)
york_data.head()

york_venues = getNearbyVenues(names=york_data['Neighborhood'],
                                   latitudes=york_data['Latitude'],
                                   longitudes=york_data['Longitude'])

york_veg = york_venues[york_venues['Venue Category'] == 'Vegetarian / Vegan Restaurant']
york_veg.head(9)

Found 16 venues in 5 neighborhoods.


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category


In [79]:
address = 'York, Toronto'

geolocator = Nominatim(user_agent="ga_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of York, CA are {}, {}.'.format(latitude, longitude))

map_york = folium.Map(location=[latitude, longitude], zoom_start=11)

# add one marker per neighborhood to the map
for lat, lng, label in zip(york_data['Latitude'], york_data['Longitude'], york_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_york)
print("York Neighborhoods")
map_york

The geographical coordinates of York, CA are 43.6896191, -79.479188.
York Neighborhoods


# Borough: East Toronto

In [46]:
# select the borough's rows first so that the map cell below can use them
etor_data = df[df['Borough'] == 'East Toronto'].reset_index(drop=True)
etor_data.head()

etor_venues = getNearbyVenues(names=etor_data['Neighborhood'],
                                   latitudes=etor_data['Latitude'],
                                   longitudes=etor_data['Longitude'])

etor_veg = etor_venues[etor_venues['Venue Category'] == 'Vegetarian / Vegan Restaurant']
etor_veg.head(9)

Found 122 venues in 5 neighborhoods.


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category


In [80]:
address = 'East, Toronto'

geolocator = Nominatim(user_agent="ga_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of East Toronto, CA are {}, {}.'.format(latitude, longitude))

map_etor = folium.Map(location=[latitude, longitude], zoom_start=11)

# add one marker per neighborhood to the map
for lat, lng, label in zip(etor_data['Latitude'], etor_data['Longitude'], etor_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_etor)
print("East Toronto Neighborhoods")
map_etor

The geographical coordinates of East Toronto, CA are 43.6534817, -79.3839347.
East Toronto Neighborhoods
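
The "Found N venues in M neighborhoods" lines printed for each borough give a rough first view of venue density. The sketch below simply divides the reported totals. Note that these totals count all venue types rather than restaurants only, so the figures are at best an approximation of restaurant density.

```python
# (venues found, neighborhoods queried) as printed by getNearbyVenues for each borough
found = {
    'Scarborough': (88, 17),
    'Downtown Toronto': (1228, 19),
    'Etobicoke': (72, 12),
    'North York': (239, 24),
    'East York': (77, 5),
    'York': (16, 5),
    'East Toronto': (122, 5),
}

# average number of venues returned per neighborhood in each borough
density = {borough: venues / hoods for borough, (venues, hoods) in found.items()}

# boroughs ordered from the densest to the sparsest
ranked = sorted(density.items(), key=lambda item: item[1], reverse=True)
for borough, value in ranked:
    print('{:<18}{:6.1f} venues per neighborhood'.format(borough, value))
```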


# Results/Conclusion <a name="results"></a>

Based on our Foursquare data, Downtown (Metropolitan) Toronto is the only borough in which Vegan/Vegetarian venues were found, and it would therefore offer the most competitive environment for a stakeholder seeking to open an affordable Vegan/Vegetarian restaurant.

However, if the stakeholder is seeking to serve an underserved community, the best candidate would be the borough farthest from Downtown Toronto, because its residents are also farthest from the existing Vegan/Vegetarian restaurants.
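
One way to make "farthest from Downtown Toronto" concrete is the great-circle (haversine) distance between the borough coordinates that the geocoder printed earlier. The sketch below is only an illustration: it uses a single point per borough rather than neighborhood-level data, assumes a mean Earth radius of 6371 km, and leaves out East Toronto because its geocoded point lies within a few hundred metres of the Downtown Toronto point.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lng1, lat2, lng2, radius_km=6371.0):
    """Great-circle distance in kilometres between two (latitude, longitude) points."""
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2)
    return 2 * radius_km * asin(sqrt(a))


# coordinates as printed by the geocoding cells of this notebook
downtown = (43.6541737, -79.38081164513409)
boroughs = {
    'Scarborough': (43.773077, -79.257774),
    'Etobicoke': (43.6435559, -79.5656326),
    'North York': (43.7543263, -79.44911696639593),
    'East York': (43.699971000000005, -79.33251996261595),
    'York': (43.6896191, -79.479188),
}

# distance from the Downtown Toronto point to each borough point
distances = {name: haversine_km(*downtown, *coords) for name, coords in boroughs.items()}

# list boroughs from farthest to nearest
for name, km in sorted(distances.items(), key=lambda item: item[1], reverse=True):
    print('{:<12}{:6.1f} km'.format(name, km))
```

Straight-line distance ignores travel time and transit access, so it is best read as a first filter before the income and population variables are considered.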

The final decision on the optimal restaurant location will be made by stakeholders based on the specific characteristics of neighborhoods and locations in every recommended zone, taking into account variables such as average household income, population, and trends.