<a href="https://colab.research.google.com/github/ktsmit1/Coursera_Capstone/blob/master/Battle_of_the_Neighborhoods_Week_5.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Applied Data Science Capstone Class - Battle of the Neighborhoods - Week 5

## 1. Description of the problem and the topic background

Portillo's is a fast-food, casual style restaurant that is based in the Chicago area. Founded by Dick Portillo in 1963, it specializes in serving Chicago-style food such as hot dogs, Maxwell Street Polish, and Italian Beef. After returning from the U.S. Marine Corps, Dick Portillo took his savings and investment funds from his brother to purchase a 12-foot trailer (named, "The Dog House") and operated it on North Avenue in Villa Park, Illinois as a hot dog stand. By the 1990's, after rebranding to "Portillo's", there were 25 stores in the Chicago area.

Portillo's is a Chicago institution. The stores are decorated with a variety of historic memorabilia from the 20s, 30s, 50s, and 60s. From the music to the décor, each restaurant has the old-time Chicago feel. Families typically have weekly meals at Portillo's, enjoying comfort food at reasonable prices in a casual, eclectic atmosphere.

Portillo continued to expand into Northern Illinois and the metropolitan areas of Indianapolis, Minneapolis-St. Paul, Milwaukee, Phoenix, Southern California, and Tampa, Florida.

In 2014, the chain was sold to Berkshires Partners. There are 60 stores in the chain. Plans are to continue to expand by 5 to 7 stores per year.

**Problem Statement**

In this project we will try to find the optimal location for the new Portillo’s restaurant.  We will first identify areas that do not have many restaurants in the area.  Then we will look for areas that do not have many/any fast food restaurants in the area.  Since we would like to “land and expand”, we will look for an area closer to downtown Toronto, to better promote the brand.

The final results will be presented to the leadership team for the Portillo's chain at Berkshire Hathaway.  The leadership will make the final location selection.


## 2. Description of the data and how it will be used to solve the problem

**2.1 Data Requirements**

<b>Toronto Data Requirements</b>

According to the Toronto newspaper, The Star, the center of Toronto is at 33 Wanless Crescent, Toronto, ON, Canada.  We will use Google Maps API reverse geocoding to identify the longitude and latitude for this address.

Rather than focusing on designated neighborhoods (e.g., St. James Town), we will create a grid, centered around the geographic center of Toronto, to define our “neighborhoods”.

<b>Geospatial Data</b>

Google Maps API reverse geocoding will be used to define the coordinates of each “neighborhood” (i.e., individual grid area). Toronto is 630.2 km square (or roughly 25 km by 25 km).

<b>Foursquare Data</b>

Foursquare data will be used to identify the fast food venues in every “neighborhood”.

<b>Import the Required Libraries</b>



In [64]:
# Install required libraries
#Install Beautifulsoup version 4
!pip install beautifulsoup4
!pip install lxml
!pip install html5lib
!pip install requests

#Import the necessary libraries
import pandas as pd
import numpy as np
import itertools
import json

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
from matplotlib.ticker import NullFormatter
import matplotlib.ticker as ticker
import matplotlib.pyplot as plt

# import k-means from clustering stage
from sklearn.cluster import KMeans
%matplotlib inline

#Install geopy
!pip install geopy
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

#Install Google Maps API
!pip install -U googlemaps
import googlemaps

#Install shapely
!pip install shapely

#Install pyproj
!pip install pyproj

#Import the libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 

#Transform the json file into a pandas dataframe
from pandas.io.json import json_normalize

#Install folium
!pip install folium
import folium

print('Libraries imported.')

Requirement already up-to-date: googlemaps in /usr/local/lib/python3.6/dist-packages (4.2.2)
Libraries imported.


<b>CREATE GRID AREA (i.e., "NEIGHBORHOODS")</b>

<b>Find the latitude & longitude of the geographic center of Toronto using Google Maps API.</b>

In [0]:
#@title Hide Key
api_key = 'AIzaSyAkqWoikp1bMRkibN7Xb_n1sDmL_n2SoY0'
address = '33 Wanless Crescent, Toronto, Canada'

In [66]:
import requests

def get_coordinates(api_key, address, verbose=False):
    try:
        url = 'https://maps.googleapis.com/maps/api/geocode/json?key={}&address={}'.format(api_key, address)
        response = requests.get(url).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        geographical_data = results[0]['geometry']['location'] # get geographical coordinates
        lat = geographical_data['lat']
        lon = geographical_data['lng']
        return [lat, lon]
    except:
        return [None, None]
    
toronto_center = get_coordinates(api_key, address)
print('Coordinate of {}: {}'.format(address, toronto_center))

Coordinate of 33 Wanless Crescent, Toronto, Canada: [43.7267632, -79.3905724]


We will not begin to build out the "neighborhoods" (i.e., grid areas).  However, we will first need to "re-project" our representation of locations from one using latitude/longitude (i.e., a spherical coordinate system) to one that uses x,y coordinates in meters (i.e., a UTM Cartesian coordinate system).

The logic below was taken from the GeoPandas site (https://geopandas.readthedocs.io/en/latest/projections.html) and the example notebook, https://cocl.us/coursera_capstone_notebook.  

The code below takes the latitude and longitude for the geographic center of Toronto at 33 Wanless Cresent as inputs.

In [67]:
import shapely.geometry
import pyproj
import math

def lonlat_to_xy(lon, lat):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def xy_to_lonlat(x, y):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)

print('Coordinate transformation check')
print('-------------------------------')
print('Toronto center longitude={}, latitude={}'.format(toronto_center[1], toronto_center[0]))
x, y = lonlat_to_xy(toronto_center[1], toronto_center[0])
print('Toronto center UTM X={}, Y={}'.format(x, y))
lo, la = xy_to_lonlat(x, y)
print('Toronto center longitude={}, latitude={}'.format(lo, la))

Coordinate transformation check
-------------------------------
Toronto center longitude=-79.3905724, latitude=43.7267632
Toronto center UTM X=-5298790.4967889115, Y=10507014.992347237
Toronto center longitude=-79.39057240000045, latitude=43.72676319999977


<b>Create Grid Areas (i.e., "Neighborhoods")</b>

Next step is to create a grid of “neighborhoods”.  In our case, these will be equally spaced and be within 25 km of the geographic center of Toronto.  The “neighborhoods” will have a radius of 500 km. 

In [68]:
toronto_center_x, toronto_center_y = lonlat_to_xy(toronto_center[1], toronto_center[0]) # City center in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = toronto_center_x - 12500
x_step = 1000
y_min = toronto_center_y - 12500 - (int(21/k)*k*1000 - 25000)/2
y_step = 1000 * k 

latitudes = []
longitudes = []
distances_from_center = []
xs = []
ys = []
for i in range(0, int(21/k)):
    y = y_min + i * y_step
    x_offset = 300 if i%2==0 else 0
    for j in range(0, 21):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(toronto_center_x, toronto_center_y, x, y)
        if (distance_from_center <= 12501):
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            xs.append(x)
            ys.append(y)

print(len(latitudes), 'neighborhoods generated.')

455 neighborhoods generated.


<b>Visualize the "Neighborhoods" (i.e., Grid Areas)</b>

In [69]:
map_toronto = folium.Map(location=toronto_center, zoom_start=12)
folium.Marker(toronto_center, popup='33 Wanless Cresent').add_to(map_toronto)
for lat, lon in zip(latitudes, longitudes):
    #folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_toronto) 
    folium.Circle([lat, lon], radius=300, color='blue', fill=False).add_to(map_toronto)
    #folium.Marker([lat, lon]).add_to(map_toronto)
map_toronto

<b>Now that we have the "Neighborhoods", we will approximate the addresses for each using the Google Maps API</b>

In [0]:
def get_address(api_key, latitude, longitude, verbose=False):
    try:
        url = 'https://maps.googleapis.com/maps/api/geocode/json?key={}&latlng={},{}'.format(api_key, latitude, longitude)
        response = requests.get(url).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        address = results[0]['formatted_address']
        return address
    except:
        return None

addr = get_address(api_key, toronto_center[0], toronto_center[1])
#print('Reverse geocoding check')
#print('-----------------------')
#print('Address of [{}, {}] is: {}'.format(toronto_center[0], toronto_center[1], addr))

In [71]:
print('Obtaining location addresses: ', end='')
addresses = []
for lat, lon in zip(latitudes, longitudes):
    address = get_address(api_key, lat, lon)
    if address is None:
        address = 'NO ADDRESS'
    address = address.replace(', Canada', '') # We don't need country part of address
    addresses.append(address)
    print(' .', end='')
print(' done.')

Obtaining location addresses:  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


In [72]:
addresses[220:240]

['23 Chieftain Crescent, North York, ON M2L 2H3',
 '38 Carluke Crescent, North York, ON M2L 2J4',
 '10 Mallingham Ct, Toronto, ON M2N',
 'Empress Parkette, Empress Parkette, 433 Empress Ave, Toronto, ON M2N, 433 Empress Ave, Toronto, ON M2N',
 '938 The PATH - Bay Adelaide Centre, Toronto, ON M5H 1Y6',
 '285 Victoria St, Toronto, ON M5B 1W1',
 '70 Donna Shaw Lane, Toronto, ON M4Y 1B4',
 '80 Charles St E, Toronto, ON M4Y 2W6',
 '104 Park Rd, Toronto, ON M4W 2N7',
 '200 Mt Pleasant Rd, Toronto, ON M4W 1W2',
 '127 Inglewood Dr, Toronto, ON M4T 1H6',
 '374 Mt Pleasant Rd, Toronto, ON M4T 1V3',
 '250 Davisville Ave, Toronto, ON M4S 1H2',
 '304 Soudan Ave, Toronto, ON M4S 1W5',
 '261 Erskine Ave, Toronto, ON M4P 1Z6',
 '2 Strathgowan Crescent, Toronto, ON M4N 2Z5',
 '45 St Ives Crescent, Toronto, ON M4N 3B5',
 '206 Riverview Dr, Toronto, ON M4N 3C8',
 '21 Knightswood Rd, North York, ON M4N 2H1',
 '7 Cedarwood Ave, North York, ON M2L 1L7']

<b>Put Address, Latitude, Longitude, and Distance from Center into a Pandas dataframe</b>

In [73]:
df_locations = pd.DataFrame({'Address': addresses,
                             'Latitude': latitudes,
                             'Longitude': longitudes,
                             'X': xs,
                             'Y': ys,
                             'Distance from center': distances_from_center})

df_locations.head(10)

Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance from center
0,"104 Goodwood Park Ct, East York, ON M4C 2H1",43.695298,-79.295703,-5304990.0,10496620.0,12101.239606
1,"385 Dawes Rd, East York, ON M4B 2E6",43.701513,-79.296631,-5303990.0,10496620.0,11620.671237
2,"103 Glencrest Blvd, East York, ON M4B 1L9",43.707728,-79.297559,-5302990.0,10496620.0,11208.925015
3,"1318 Victoria Park Ave, East York, ON M4B 2L4",43.713944,-79.298488,-5301990.0,10496620.0,10873.821775
4,"10 Holswade Rd, Scarborough, ON M1L 2G2",43.720161,-79.299417,-5300990.0,10496620.0,10622.61738
5,"1880 Eglinton Ave E, Scarborough, ON M1L 2L1",43.726379,-79.300346,-5299990.0,10496620.0,10461.357464
6,"29 Shaneen Blvd, Scarborough, ON M1R 1B6",43.732597,-79.301275,-5298990.0,10496620.0,10394.229168
7,"1029 Pharmacy Ave, Scarborough, ON M1R 2G8",43.738816,-79.302205,-5297990.0,10496620.0,10423.051377
8,"16 Gooderham Dr, Scarborough, ON M1R 3G5",43.745035,-79.303135,-5296990.0,10496620.0,10547.037499
9,"106 Elinor Ave, Scarborough, ON M1R 3H4",43.751256,-79.304065,-5295990.0,10496620.0,10762.899238


<b>Save the data to a local file</b>

In [0]:
df_locations.to_pickle('./locations.pkl')  

##Foursquare

The next step is to marry the “neighborhood” locations with data from the Foursquare API, in order to get information on restaurants in the area.

The venues we will be interested in are the “food” venue category.  We will exclude coffee shops, bakeries, etc. from the list, since these would not be considered alternatives for a Portillo’s.  We will also make sure to include all subcategories of the “fast food” category, since we are trying to identify areas lacking fast food restaurants.

In [0]:
#@title Hidden Foursquare Credentials

CLIENT_ID = 'PX34EKN1YODZERJIGF5AWLHWHHLSNZBZUM53F3VWRFD1V4W0' # your Foursquare ID
CLIENT_SECRET = 'YTHIRHEHDCFMG4ML12GZZ3XDKVK3WFNMXYSZYOYVFYFJYOAE' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

In [0]:
# Category IDs corresponding to Italian restaurants were taken from Foursquare web site (https://developer.foursquare.com/docs/resources/categories):

food_category = '4d4b7105d754a06374d81259' # 'Root' category for all food-related venues

fast_food_restaurant_categories= ['4bf58dd8d48988d16e941735']

def is_restaurant(categories, specific_filter=None):
    restaurant_words = ['restaurant', 'diner', 'taverna', 'steakhouse', 'joint']
    restaurant = False
    specific = False
    for c in categories:
        category_name = c[0].lower()
        category_id = c[1]
        for r in restaurant_words:
            if r in category_name:
                restaurant = True
        if 'fast food' in category_name:
            restaurant = False
        if not(specific_filter is None) and (category_id in specific_filter):
            specific = True
            restaurant = True
    return restaurant, specific

def get_categories(categories):
    return [(cat['name'], cat['id']) for cat in categories]

def format_address(location):
    address = ', '.join(location['formattedAddress'])
    address = address.replace(', Canada', '')
    return address

def get_venues_near_location(lat, lon, category, client_id, client_secret, radius=500, limit=100):
    version = '20180724'
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
        client_id, client_secret, version, lat, lon, category, radius, limit)
    try:
        results = requests.get(url).json()['response']['groups'][0]['items']
        venues = [(item['venue']['id'],
                   item['venue']['name'],
                   get_categories(item['venue']['categories']),
                   (item['venue']['location']['lat'], item['venue']['location']['lng']),
                   format_address(item['venue']['location']),
                   item['venue']['location']['distance']) for item in results]        
    except:
        venues = []
    return venues

For each of our “neighborhoods”, identify nearby restaurants.  Also maintain a listing of all restaurants and identified fast food restaurants.

In [77]:
import pickle

def get_restaurants(lats, lons):
    restaurants = {}
    fast_food_restaurants = {}
    location_restaurants = []

    print('Obtaining venues around candidate locations:', end='')
    for lat, lon in zip(lats, lons):
        # Using radius=550 to meke sure we have overlaps/full coverage so we don't miss any restaurant (we're using dictionaries to remove any duplicates resulting from area overlaps)
        venues = get_venues_near_location(lat, lon, food_category, CLIENT_ID, CLIENT_SECRET, radius=550, limit=100)
        area_restaurants = []
        for venue in venues:
            venue_id = venue[0]
            venue_name = venue[1]
            venue_categories = venue[2]
            venue_latlon = venue[3]
            venue_address = venue[4]
            venue_distance = venue[5]
            is_res, is_fast_food = is_restaurant(venue_categories, specific_filter=fast_food_restaurant_categories)
            if is_res:
                x, y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                restaurant = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, is_fast_food, x, y)
                if venue_distance<=500:
                    area_restaurants.append(restaurant)
                restaurants[venue_id] = restaurant
                if is_fast_food:
                    fast_food_restaurants[venue_id] = restaurant
        location_restaurants.append(area_restaurants)
        print(' .', end='')
    print(' done.')
    return restaurants, fast_food_restaurants, location_restaurants

# Try to load from local file system in case we did this before
restaurants = {}
fast_food_restaurants = {}
location_restaurants = []
loaded = False
try:
    with open('restaurants_550.pkl', 'rb') as f:
        restaurants = pickle.load(f)
    with open('fast_food_restaurants_550.pkl', 'rb') as f:
        fast_food_restaurants = pickle.load(f)
    with open('location_restaurants_550.pkl', 'rb') as f:
        location_restaurants = pickle.load(f)
    print('Restaurant data loaded.')
    loaded = True
except:
    pass

# If load failed use the Foursquare API to get the data
if not loaded:
    restaurants, fast_food_restaurants, location_restaurants = get_restaurants(latitudes, longitudes)
    
    # Let's persists this in local file system
    with open('restaurants_550.pkl', 'wb') as f:
        pickle.dump(restaurants, f)
    with open('fast_food_restaurants_550.pkl', 'wb') as f:
        pickle.dump(fast_food_restaurants, f)
    with open('location_restaurants_550.pkl', 'wb') as f:
        pickle.dump(location_restaurants, f)

Restaurant data loaded.


In [78]:
print('Total number of restaurants:', len(restaurants))
print('Total number of fast food restaurants:', len(fast_food_restaurants))
print('Percentage of fast food restaurants: {:.2f}%'.format(len(fast_food_restaurants) / len(restaurants) * 100))
print('Average number of restaurants in neighborhood:', np.array([len(r) for r in location_restaurants]).mean())

Total number of restaurants: 1872
Total number of fast food restaurants: 91
Percentage of fast food restaurants: 4.86%
Average number of restaurants in neighborhood: 6.578021978021978


In [79]:
print('List of Fast Food Restaurants')
print('---------------------------')
for r in list(fast_food_restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(fast_food_restaurants))

List of Fast Food Restaurants
---------------------------
('4b7c9784f964a520b29c2fe3', 'KFC', 43.7247, -79.2987, '#151 - 1 Eglinton Square, Toronto ON M1L 2K1', 229, True, -5300277.9295531465, 10496461.873529654)
('4bc0c3c6b492d13a55d5a460', 'A&W', 43.726426, -79.29693, '1896 Eglinton Ave. East, Scarborough ON M1L 2L9', 274, True, -5300025.660092352, 10496228.106965734)
('4bc893342f94d13af8db137f', 'Burger King', 43.725089, -79.298033, '100 Eglinton Square, Scarborough ON M1L 2K1', 235, True, -5300224.422242652, 10496378.28052765)
('4e3c57fac65b4ec275d3ae06', 'KFC', 43.74290836654114, -79.30796427681433, '1760 Lawrence Avenue East, Scarborough ON M1R 2Y1', 190, True, -5297268.161168148, 10497215.712831855)
('4b26f64ff964a520868324e3', 'A&W', 43.76627195201865, -79.30184693144743, '1585 Warden Avenue (401), Scarborough ON M1R 2S9', 435, True, -5293632.582627236, 10496108.690284625)
('4c292b173492a5938588b828', 'KFC', 43.6881, -79.3029, '2500 Danforth Avenue, Toronto ON M4C 1L2', 540, Tr

In [80]:
print('Restaurants around location')
print('---------------------------')
for i in range(100, 110):
    rs = location_restaurants[i][:8]
    names = ', '.join([r[1] for r in rs])
    print('Restaurants around location {}: {}'.format(i+1, names))

Restaurants around location
---------------------------
Restaurants around location 101: Shawarma and Kebob, Stirling Room
Restaurants around location 102: Tekka Sushi, Beijing Hotpot 后海•味北京火鍋, Iqbal Kebab & Sweets Centre
Restaurants around location 103: Hero Certified Burgers, New York Fries - Fairview Mall, Moxie's Classic Grill, Thai Express, Heart Sushi, Spring Rolls, KFC, Bourbon St. Grill
Restaurants around location 104: Completo, Hanoi 3 Seasons, Ascari Enoteca, Kibo Sushi House, Brooklyn Tavern, Baldini, eastside social, Goods And Provisions
Restaurants around location 105: Maple Leaf Tavern, La Cubana East, Great Burger Kitchen, Loaded Pierogi, Gare De L'est, Tropical Joe's, Blackjack BBQ, Com Tam Dao Vien/Peach Garden Express
Restaurants around location 106: Tropical Joe's, La Cubana East, Great Burger Kitchen, Loaded Pierogi, Blackjack BBQ, Chula Taberna Mexicana, Com Tam Dao Vien/Peach Garden Express, KFC
Restaurants around location 107: Motorama Restaurant, Danforth Dragon

<b>Display the restaurants identified on our map, showing fast food restaurants in a different color.</b>

In [81]:
map_toronto = folium.Map(location=toronto_center, zoom_start=13)
folium.Marker(toronto_center, popup='Toronto Center').add_to(map_toronto)
for res in restaurants.values():
    lat = res[2]; lon = res[3]
    is_fast_food = res[6]
    color = 'red' if is_fast_food else 'blue'
    folium.CircleMarker([lat, lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_toronto)
map_toronto

At this point we have the restaurants that are near the Toronto city center and have identified which ones are fast food restaurants.  We also know which restaurants are near each “neighborhood”.

Since we have completed all of our data gathering, we will pivot to creating a report on the suggested location(s) for a new Portillo’s store(s).

## Methodology



The methodology we will use is to first identify areas that do not have many restaurants, particularly fast food restaurants.  We will limit our analysis to a radius of 12.5 km around the center of Toronto (33 Wanless Crescent, Toronto ).

We have collected the required data for the location and type of every restaurant within the defined radius of the city center.  We have also identified those restaurants that fall into the Foursquare venue category of “Fast Food”.

The next step in our analysis will be to explore the density of restaurants across the different areas of Toronto.  We will use heat maps to identify areas that are close to the center of Toronto that have a low number of restaurants and few to no fast food restaurants in that area.  We will focus our attention on these areas.

In the last step we will focus on the areas of interest and create clusters of locations.  We will take into consideration locations with no more than two restaurants within 500 meters and locations without any fast food restaurants within 1 kilometer.  We will display a map of those locations to identify areas that can be used by the leadership team of the Portillo’s chain to begin a physical location search for a Portillo’s store location.

## Analysis

<b>Exploratory analysis to determine the number of restaurants in every area</b>

In [82]:
location_restaurants_count = [len(res) for res in location_restaurants]

df_locations['Restaurants in area'] = location_restaurants_count

print('Average number of restaurants in every area with radius=500m:', np.array(location_restaurants_count).mean())

df_locations.head(10)

Average number of restaurants in every area with radius=500m: 6.578021978021978


Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance from center,Restaurants in area
0,"104 Goodwood Park Ct, East York, ON M4C 2H1",43.695298,-79.295703,-5304990.0,10496620.0,12101.239606,0
1,"385 Dawes Rd, East York, ON M4B 2E6",43.701513,-79.296631,-5303990.0,10496620.0,11620.671237,2
2,"103 Glencrest Blvd, East York, ON M4B 1L9",43.707728,-79.297559,-5302990.0,10496620.0,11208.925015,3
3,"1318 Victoria Park Ave, East York, ON M4B 2L4",43.713944,-79.298488,-5301990.0,10496620.0,10873.821775,0
4,"10 Holswade Rd, Scarborough, ON M1L 2G2",43.720161,-79.299417,-5300990.0,10496620.0,10622.61738,4
5,"1880 Eglinton Ave E, Scarborough, ON M1L 2L1",43.726379,-79.300346,-5299990.0,10496620.0,10461.357464,7
6,"29 Shaneen Blvd, Scarborough, ON M1R 1B6",43.732597,-79.301275,-5298990.0,10496620.0,10394.229168,2
7,"1029 Pharmacy Ave, Scarborough, ON M1R 2G8",43.738816,-79.302205,-5297990.0,10496620.0,10423.051377,2
8,"16 Gooderham Dr, Scarborough, ON M1R 3G5",43.745035,-79.303135,-5296990.0,10496620.0,10547.037499,10
9,"106 Elinor Ave, Scarborough, ON M1R 3H4",43.751256,-79.304065,-5295990.0,10496620.0,10762.899238,0


<b>Calculate the distance to the nearest fast food restaurant from the centers of the neighborhoods</b>

In [0]:
distances_to_fast_food_restaurant = []

for area_x, area_y in zip(xs, ys):
    min_distance = 12500
    for res in fast_food_restaurants.values():
        res_x = res[7]
        res_y = res[8]
        d = calc_xy_distance(area_x, area_y, res_x, res_y)
        if d<min_distance:
            min_distance = d
    distances_to_fast_food_restaurant.append(min_distance)

df_locations['Distance to the fast food restaurant'] = distances_to_fast_food_restaurant

In [84]:
df_locations.head(10)

Unnamed: 0,Address,Latitude,Longitude,X,Y,Distance from center,Restaurants in area,Distance to the fast food restaurant
0,"104 Goodwood Park Ct, East York, ON M4C 2H1",43.695298,-79.295703,-5304990.0,10496620.0,12101.239606,0,1422.756467
1,"385 Dawes Rd, East York, ON M4B 2E6",43.701513,-79.296631,-5303990.0,10496620.0,11620.671237,2,2091.277771
2,"103 Glencrest Blvd, East York, ON M4B 1L9",43.707728,-79.297559,-5302990.0,10496620.0,11208.925015,3,1433.877553
3,"1318 Victoria Park Ave, East York, ON M4B 2L4",43.713944,-79.298488,-5301990.0,10496620.0,10873.821775,0,1318.54729
4,"10 Holswade Rd, Scarborough, ON M1L 2G2",43.720161,-79.299417,-5300990.0,10496620.0,10622.61738,4,730.488329
5,"1880 Eglinton Ave E, Scarborough, ON M1L 2L1",43.726379,-79.300346,-5299990.0,10496620.0,10461.357464,7,329.36109
6,"29 Shaneen Blvd, Scarborough, ON M1R 1B6",43.732597,-79.301275,-5298990.0,10496620.0,10394.229168,2,1107.816259
7,"1029 Pharmacy Ave, Scarborough, ON M1R 2G8",43.738816,-79.302205,-5297990.0,10496620.0,10423.051377,2,934.584288
8,"16 Gooderham Dr, Scarborough, ON M1R 3G5",43.745035,-79.303135,-5296990.0,10496620.0,10547.037499,10,654.810316
9,"106 Elinor Ave, Scarborough, ON M1R 3H4",43.751256,-79.304065,-5295990.0,10496620.0,10762.899238,0,1408.58273


In [85]:
print('Average distance in meters to the nearest fast food restaurant from the neighborhood centers:', df_locations['Distance to the fast food restaurant'].mean())

Average distance in meters to the nearest fast food restaurant from the neighborhood centers: 1416.8820857912706


So the average distance to a fast food restaurant from a neighborhood center is about 1.4 km.  

<b>Draw the borders of the Toronto neighborhoods on our map and a few circles indicating distance of 1km, 3km, 5km, 8km and 12km from the center of Toronto</b>

In [0]:
toronto_boroughs_url = 'http://raw.githubusercontent.com/ktsmit1/Coursera_Capstone/master/toronto.geojson'
toronto_boroughs = requests.get(toronto_boroughs_url).json()

def boroughs_style(feature):
    return { 'color': 'blue', 'fill': False }

In [0]:
restaurant_latlons = [[res[2], res[3]] for res in restaurants.values()]

fast_food_latlons = [[res[2], res[3]] for res in fast_food_restaurants.values()]

In [88]:
from folium import plugins
from folium.plugins import HeatMap

map_toronto = folium.Map(location=toronto_center, zoom_start=12)
folium.TileLayer('cartodbpositron').add_to(map_toronto) #cartodbpositron cartodbdark_matter
HeatMap(restaurant_latlons).add_to(map_toronto)
folium.Marker(toronto_center).add_to(map_toronto)
folium.Circle(toronto_center, radius=1000, fill=False, color='white').add_to(map_toronto)
folium.Circle(toronto_center, radius=3000, fill=False, color='white').add_to(map_toronto)
folium.Circle(toronto_center, radius=5000, fill=False, color='white').add_to(map_toronto)
folium.Circle(toronto_center, radius=8000, fill=False, color='white').add_to(map_toronto)
folium.Circle(toronto_center, radius=12000, fill=False, color='white').add_to(map_toronto)
folium.GeoJson(toronto_boroughs, style_function=boroughs_style, name='geojson').add_to(map_toronto)
map_toronto

There is low restaurant density Northeast of the Toronto city center.

We will now create a heat map showing only fast food restaurants.

In [89]:
from folium import plugins
from folium.plugins import HeatMap

map_toronto = folium.Map(location=toronto_center, zoom_start=12)
folium.TileLayer('cartodbpositron').add_to(map_toronto) #cartodbpositron cartodbdark_matter
HeatMap(fast_food_latlons).add_to(map_toronto)
folium.Marker(toronto_center).add_to(map_toronto)
folium.Circle(toronto_center, radius=1000, fill=False, color='white').add_to(map_toronto)
folium.Circle(toronto_center, radius=3000, fill=False, color='white').add_to(map_toronto)
folium.Circle(toronto_center, radius=5000, fill=False, color='white').add_to(map_toronto)
folium.Circle(toronto_center, radius=8000, fill=False, color='white').add_to(map_toronto)
folium.Circle(toronto_center, radius=12000, fill=False, color='white').add_to(map_toronto)
folium.GeoJson(toronto_boroughs, style_function=boroughs_style, name='geojson').add_to(map_toronto)
map_toronto

This map shows that the areas within 1 kilometer of Toronto’s geographic center and 1 – 3 kilometers North and East of the geographic center do not have many restaurants, particularly fast food restaurants.  These include the neighborhoods of North Toronto/Midtown, York Mills, and Don Mills.

### <b>North Toronto/Midtown, York Mills, and Don Mills</b>

York Mills is one of the most affluent areas in Toronto, so property values would me much higher than other areas.  Don Mills was originally a planned community that has a modernist principle, which would contrast with Portillo’s emphasis on memorabilia from the 20’s, 30’s, 40’s, and 50’s.  

The North Toronto/Midtown area has rapidly changing to a greater density with the construction of residential condominium buildings in the area.  The area around the intersection of Yonge Street and Eglinton Avenue has access to the Line 1 subway, but also has the Eglinton Mall and Cineplex and other restaurant in the area.  

The Crosstown LTR (light-rail transit), which will improve east/west transit  along Eglinton Avenue is currently causing frustration, but will be completed in 2021.  The area of Eglinton West would benefit when this construction is completed.  The area’s strong Caribbean character seems be changing, accommodating other food aesthetics.  This would be the recommended location for a physical search for the next store location.

## Results

Our analysis shows that there are a large number of restaurants overall within a 12.5 km radius around the geographic center of Toronto, of which a much smaller number are classified as fast food restaurants, there still are pockets 1-5 km to the north and east of the geographic center.  

After visualizing the concentration of restaurants and fast food restaurants in particular around the geographic center, our analysis brought our attention to three neighborhoods: North Toronto/Midtown, York Mills and Don Mills.  After reviewing the overall affluence of the areas (which would translate to higher land costs) and the accessibility (via mass transit), we decided to focus on the North Toronto/Midtown location.

After reviewing the areas within the North Toronto/Midtown neighborhood, we determined that the Eglinton West showed the most promise and will not have to compete with other restaurant options in the Yonge Street and Eglinton Avenue area.
This recommendation should be taken as a starting point for a physical street level study and discussion of any other (at this point unknown) factors that might impact the success of a store location site.

## Conclusion

The purpose of this project was to identify an area(s) to locate a new Portillo’s restaurant store in the Toronto area, in order to provide guidance on beginning a street-level physical search.  

We utilized Foursquare data to determine the number of restaurants around the geographic center of Toronto and identifying those classified as fast food restaurants (which would could be seen as substitutes for a Portillo’s store).  We also clustered these data into geographic areas around the geographic center, identifying the latitude and longitude of their centers.  These were visualized, utilizing a geographic heat map of Toronto, to identify areas around the geographic center of Toronto that did not have many (if any) restaurants and (in particular) fast food restaurants.

Through this process we were able to identify three neighborhoods to consider targeting.  After considering the characteristics of each, which could lead to higher prices for the land (increasing the cost of building a new store) and the architectural design of the area (which might conflict with the Portillo’s style), accessibility, and growth in the areas, the North Toronto/Midtown area was identify.  The Eglinton West was identified as the area requiring a physical street-level search for the next store.

The leadership of the Portillo’s chain will want to review factors such as zoning restrictions, detractors of the area (e.g., traffic congestion, noisiness of the area), and overall land costs prior to making a final decision on the new store location.