# Capstone Project - The Battle of the Neighborhoods

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

In this project we assess the viability of opening a food truck in Los Angeles, specifically at Pershing Square. 
This report is targeted at stakeholders interested in opening a **Food Truck** in **Los Angeles**, California.

Since Los Angeles already has many food trucks and restaurants, we will try to detect the **locations of the Restaurants and Food Trucks** nearby. This shows how much competition surrounds the chosen location and whether it is a viable spot for a new Food Truck.

## Data <a name="data"></a>

Based on the definition of our problem, the factors that will influence our decision are:
* number of existing restaurants in the neighborhood (any type of restaurant)
* number of existing food trucks in the neighborhood 

We define the neighborhood as the area within a radius of 1 km around Pershing Square, in the city center.

The following data sources are needed to extract the required information:
* the number of restaurants and food trucks near Pershing Square, together with their type and location, will be obtained using the **Foursquare API**
* the coordinates of Pershing Square in Los Angeles will be obtained by geocoding its address with the **Nominatim** geocoder from `geopy`

### Neighborhoods Coordinates

Import all the libraries needed

In [1]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# transforming a JSON file into a pandas dataframe
from pandas import json_normalize # pandas >= 1.0; older versions expose it as pandas.io.json.json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library


Set the Foursquare API credentials and the query parameters

In [2]:
CLIENT_ID = '0H4ZHQIZXRL3EVHBNTC3KCLP3X5VTXNNHIM23KEQ55Q1JKFG' # your Foursquare ID
CLIENT_SECRET = 'BN2PBTW23KGNA1CRKJ3W00X0Z1VRACTO0MHUY2JVHHPN0YAI' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
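
Hard-coding API credentials in a notebook exposes them to anyone who can read the file. As a hedged alternative, the sketch below reads them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_credentials` are assumptions made for illustration; they are not prescribed by Foursquare.

```python
import os

def load_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; use whatever names you export.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None

# Example with an explicit mapping instead of the real environment
demo_id, demo_secret = load_credentials(
    {"FOURSQUARE_CLIENT_ID": "my-id", "FOURSQUARE_CLIENT_SECRET": "my-secret"}
)
print(demo_id, demo_secret)
```

In the notebook, the result of `load_credentials()` could be assigned to `CLIENT_ID` and `CLIENT_SECRET` in place of the literal strings.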

Get the latitude and longitude coordinates of Pershing Square, in the center of Los Angeles, using the Nominatim geocoder

In [3]:
address = '532 S Olive St, Los Angeles, CA'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

34.0489194285714 -118.253233816327


Let's get the latitude and longitude coordinates of the food trucks near Pershing Square (at most `LIMIT` results within `radius` metres)

In [4]:
search_queryFoodTrucks = '4bf58dd8d48988d1cb941735' # Foursquare category ID used to search for food trucks

radius = 1000
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&categoryId={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_queryFoodTrucks, radius, LIMIT)
resultsFoodTrucks = requests.get(url).json()
venuesFoodTrucks = resultsFoodTrucks['response']['venues']
# transform venues into a dataframe
dataframeFoodTrucks = json_normalize(venuesFoodTrucks)
dataframeFoodTrucks.shape

(30, 18)

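Let's get the latitude and longitude coordinates of the restaurants near Pershing Square (at most `LIMIT` results within `radius` metres)

In [5]:
search_queryRest = '4d4b7105d754a06374d81259' # Foursquare category ID used to search for restaurants (the results also include coffee shops)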

url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&categoryId={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, search_queryRest, radius, LIMIT)
resultsRest = requests.get(url).json()
venuesRest = resultsRest['response']['venues']
# transform venues into a dataframe
dataframeRest = json_normalize(venuesRest)
dataframeRest.shape

(30, 25)
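
The two search requests above differ only in the category ID. A small helper can build the request URL from a parameter dictionary, so that the values are URL-encoded and the endpoint is written once. This is a sketch: `build_search_url` is a hypothetical helper name, the credentials are placeholders, and the coordinates are rounded for readability.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"

def build_search_url(client_id, client_secret, lat, lng, version,
                     category_id, radius, limit):
    """Build a venue-search URL with the same parameters as the cells above."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",
        "v": version,
        "categoryId": category_id,
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in "lat,lng" unescaped, as in the original URL
    return f"{SEARCH_ENDPOINT}?{urlencode(params, safe=',')}"

url_food_trucks = build_search_url(
    "YOUR_ID", "YOUR_SECRET", 34.0489, -118.2532, "20180604",
    "4bf58dd8d48988d1cb941735", 1000, 30,
)
print(url_food_trucks)
```

Calling the helper a second time with the restaurant category ID would give the URL for the second query.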

#### Filtering the Dataset

In [6]:
# keep only the venue name, category, and coordinates
filtered_columnsFT = ['name', 'categories', 'location.lat', 'location.lng']
# .copy() makes the later column assignment act on an independent dataframe
dataframe_filteredFT = dataframeFoodTrucks.loc[:, filtered_columnsFT].copy()

# function that extracts the category of the venue
def get_category_typeFT(row):
    try:
        categories_listFT = row['categories']
    except KeyError:  # fall back to the nested column name
        categories_listFT = row['venue.categories']
        
    if len(categories_listFT) == 0:
        return None
    else:
        return categories_listFT[0]['name']

# filter the category for each row
dataframe_filteredFT['categories'] = dataframe_filteredFT.apply(get_category_typeFT, axis=1)

# clean column names by keeping only last term
dataframe_filteredFT.columns = [column.split('.')[-1] for column in dataframe_filteredFT.columns]
dataframe_filteredFT.head(5)

| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Mr V's Cheesesteak & Burritos Truck | Food Truck | 34.04828 | -118.251186 |
| 1 | Germany's Famous Bratwurst Truck | Food Truck | 34.054339 | -118.262098 |
| 2 | Cousins Maine Lobster Truck | Food Truck | 34.053353 | -118.253111 |
| 3 | made in brooklyn pizza | Food Truck | 34.053605 | -118.245738 |
| 4 | Love Bird Chicken | Food Truck | 34.053252 | -118.252694 |


In [7]:
# keep only the venue name, category, and coordinates
filtered_columnsRest = ['name', 'categories', 'location.lat', 'location.lng']
# .copy() makes the later column assignment act on an independent dataframe
dataframe_filteredRest = dataframeRest.loc[:, filtered_columnsRest].copy()

# function that extracts the category of the venue
def get_category_typeRest(row):
    try:
        categories_listRest = row['categories']
    except KeyError:  # fall back to the nested column name
        categories_listRest = row['venue.categories']
        
    if len(categories_listRest) == 0:
        return None
    else:
        return categories_listRest[0]['name']

# filter the category for each row
dataframe_filteredRest['categories'] = dataframe_filteredRest.apply(get_category_typeRest, axis=1)

# clean column names by keeping only last term
dataframe_filteredRest.columns = [column.split('.')[-1] for column in dataframe_filteredRest.columns]
dataframe_filteredRest.head(5)

| | name | categories | lat | lng |
|---|---|---|---|---|
| 0 | Perch | French Restaurant | 34.048919 | -118.251428 |
| 1 | 71Above | New American Restaurant | 34.051045 | -118.254403 |
| 2 | 24/7 Restaurant at The Standard, Downtown LA | American Restaurant | 34.049743 | -118.256548 |
| 3 | The Coffee Bean & Tea Leaf | Coffee Shop | 34.048866 | -118.25883 |
| 4 | Starbucks | Coffee Shop | 34.05338 | -118.25348 |
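
The two category helpers above are identical apart from their names. As a sketch, a single function could serve both dataframes. `get_category_type` is a hypothetical name, and the sample rows are hand-written dictionaries in the shape of a venue record, not real API output.

```python
def get_category_type(row):
    """Return the name of the first category of a venue row, or None."""
    try:
        categories = row['categories']
    except KeyError:
        # fall back to the nested column name
        categories = row['venue.categories']
    return categories[0]['name'] if categories else None

# hand-written sample rows covering the three cases the function handles
sample_rows = [
    {'name': 'Truck A', 'categories': [{'name': 'Food Truck'}]},
    {'name': 'Venue B', 'categories': []},
    {'name': 'Venue C', 'venue.categories': [{'name': 'Coffee Shop'}]},
]
sample_categories = [get_category_type(row) for row in sample_rows]
print(sample_categories)
```

Because a pandas row also raises `KeyError` for a missing label, the same function could be passed to `DataFrame.apply(..., axis=1)` for either dataframe.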


Plotting all the Food Trucks and Restaurants near Pershing Square.
* Pershing Square is red
* Food Trucks are blue 
* Restaurants are green

In [8]:
venues_map = folium.Map(location=[latitude, longitude], zoom_start=16) # generate map centred on Pershing Square

# add a red circle marker to represent the new food truck location
folium.features.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the Food truck as blue circle markers
for lat, lng, label in zip(dataframe_filteredFT.lat, dataframe_filteredFT.lng, dataframe_filteredFT.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)
    
# add the Restaurants as green circle markers
for lat, lng, label in zip(dataframe_filteredRest.lat, dataframe_filteredRest.lng, dataframe_filteredRest.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='green',
        popup=label,
        fill = True,
        fill_color='green',
        fill_opacity=0.6
    ).add_to(venues_map)    

# display map
venues_map

## Methodology <a name="methodology"></a>

In this project we focus on detecting the density of restaurants and food trucks around the point chosen for the new Food Truck. 
We limit our analysis to the area within 1 km of Pershing Square.

In the first step we collected the required **data: the location and type (category) of restaurants and Food Trucks within 1 km of Pershing Square**.

The second step is the calculation and exploration of '**restaurant density**' around Pershing Square: we use **heatmaps** to identify the areas with a high number of restaurants nearby. 

In the third and final step we group the restaurants and Food Trucks into clusters (using **k-means clustering**) to identify how many venues there are and how far apart they lie. This tells us about the business need for restaurants in those areas.

## Analysis <a name="analysis"></a>

#### Exploratory data analysis on raw data.

Number of Restaurants and Food Trucks near Pershing Square

In [9]:
Quantity = len(dataframe_filteredRest) + len(dataframe_filteredFT)
Quantity

60

Distance from Pershing Square to a single venue (Perch, the first restaurant in the table)

In [10]:
from geopy.distance import geodesic
origin = (latitude, longitude)  # order is (latitude, longitude)
dist = (34.048919, -118.251428)  # coordinates of Perch
print(geodesic(origin, dist).meters)  # distance in metres

166.73426722420623
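
To show what a distance calculation of this kind involves, here is a sketch of the haversine formula in plain Python. It is not what `geodesic` does internally: `geodesic` works on an ellipsoid model of the Earth, while haversine assumes a sphere, so the two results differ slightly (usually by well under one percent). The function name `haversine_m` and the mean Earth radius constant are choices made for this example.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (spherical model)

def haversine_m(origin, destination):
    """Great-circle distance in metres between two (lat, lng) pairs in degrees."""
    lat1, lng1 = map(math.radians, origin)
    lat2, lng2 = map(math.radians, destination)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

pershing_square = (34.0489194285714, -118.253233816327)
perch = (34.048919, -118.251428)
perch_distance = haversine_m(pershing_square, perch)
print(round(perch_distance, 1))
```

The printed value should be close to, but not identical with, the geodesic result above.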


In [11]:
from geopy.distance import geodesic
distancesRest = []
origin = (latitude, longitude)  # order is (latitude, longitude)
for x_dist, y_dist in zip(dataframe_filteredRest.lat, dataframe_filteredRest.lng):
    dist = (x_dist, y_dist)
    distancesRest.append(geodesic(origin, dist).meters)

dataframe_filteredRest['Distance'] = distancesRest
dataframe_filteredRest.head(5)

| | name | categories | lat | lng | Distance |
|---|---|---|---|---|---|
| 0 | Perch | French Restaurant | 34.048919 | -118.251428 | 166.734267 |
| 1 | 71Above | New American Restaurant | 34.051045 | -118.254403 | 259.340394 |
| 2 | 24/7 Restaurant at The Standard, Downtown LA | American Restaurant | 34.049743 | -118.256548 | 319.315692 |
| 3 | The Coffee Bean & Tea Leaf | Coffee Shop | 34.048866 | -118.25883 | 516.762082 |
| 4 | Starbucks | Coffee Shop | 34.05338 | -118.25348 | 495.303175 |


In [12]:
distancesFC = []
origin = (latitude, longitude)  # order is (latitude, longitude)
for x_dist, y_dist in zip(dataframe_filteredFT.lat, dataframe_filteredFT.lng):
    dist = (x_dist, y_dist)
    distancesFC.append(geodesic(origin, dist).meters)

dataframe_filteredFT['Distance'] = distancesFC
dataframe_filteredFT.head(5)

| | name | categories | lat | lng | Distance |
|---|---|---|---|---|---|
| 0 | Mr V's Cheesesteak & Burritos Truck | Food Truck | 34.04828 | -118.251186 | 201.941813 |
| 1 | Germany's Famous Bratwurst Truck | Food Truck | 34.054339 | -118.262098 | 1015.510431 |
| 2 | Cousins Maine Lobster Truck | Food Truck | 34.053353 | -118.253111 | 491.96793 |
| 3 | made in brooklyn pizza | Food Truck | 34.053605 | -118.245738 | 865.495395 |
| 4 | Love Bird Chicken | Food Truck | 34.053252 | -118.252694 | 483.191557 |


Getting the nearest distance

In [13]:
# smallest value of the Distance column, in metres
minDist = dataframe_filteredRest.Distance.min()
print("Nearest Restaurant in mts:",  minDist)

Nearest Restaurant in mts: 166.73426722420623


In [14]:
# smallest value of the Distance column, in metres
minDist = dataframe_filteredFT.Distance.min()
print("Nearest Food Truck in mts:",  minDist)

Nearest Food Truck in mts: 120.75844380843405
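
The cells above return only the distance, not the venue it belongs to. The sketch below keeps the name alongside the distance, using the five restaurant rows shown earlier as plain dictionaries. The variable names are illustrative, and the list is only the displayed sample, not the full dataframe.

```python
# the five restaurant rows displayed above (name and distance in metres)
restaurants = [
    {"name": "Perch", "Distance": 166.734267},
    {"name": "71Above", "Distance": 259.340394},
    {"name": "24/7 Restaurant at The Standard, Downtown LA", "Distance": 319.315692},
    {"name": "The Coffee Bean & Tea Leaf", "Distance": 516.762082},
    {"name": "Starbucks", "Distance": 495.303175},
]

# pick the row with the smallest distance
nearest = min(restaurants, key=lambda venue: venue["Distance"])
print(nearest["name"], nearest["Distance"])
```

On the full dataframe, an equivalent lookup would be `dataframe_filteredRest.loc[dataframe_filteredRest.Distance.idxmin()]`.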


#### Heatmap

Use heatmaps to identify the areas with a high number of restaurants nearby.

In [15]:
dataframe_filteredAll = pd.concat([dataframe_filteredFT,dataframe_filteredRest])
dataframe_filteredAll.shape

(60, 5)

In [16]:
rest_latlons = []
for la, lo in zip(dataframe_filteredAll.lat, dataframe_filteredAll.lng):
    rest_latlons.extend([[la, lo]])
    

Plot the heatmap; the red circle marks a radius of 100 m around Pershing Square

In [17]:
from folium import plugins
from folium.plugins import HeatMap

LA_map = folium.Map(location=[latitude, longitude], zoom_start=16) 
folium.TileLayer('cartodbpositron').add_to(LA_map) 
HeatMap(rest_latlons).add_to(LA_map)
folium.Circle([latitude, longitude], radius=100, fill=False, color='red').add_to(LA_map)
LA_map

The hottest parts of the map are located to the north and west, indicating a higher density of existing restaurants and food trucks in these areas. 
**The lowest density is located in the east.** 

There is a low density of restaurants and food trucks in the immediate vicinity of Pershing Square.
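
A visual reading of a heatmap can be cross-checked by counting venues per compass quadrant relative to Pershing Square. The sketch below does this for the ten venues listed in the tables earlier, so it illustrates the technique rather than summarising all 60 venues. The `quadrant` helper is a hypothetical name introduced for this example.

```python
from collections import Counter

origin_lat, origin_lng = 34.0489194285714, -118.253233816327  # Pershing Square

# (lat, lng) of the ten venues listed in the tables above
venues = [
    (34.048919, -118.251428), (34.051045, -118.254403),
    (34.049743, -118.256548), (34.048866, -118.25883),
    (34.05338, -118.25348), (34.04828, -118.251186),
    (34.054339, -118.262098), (34.053353, -118.253111),
    (34.053605, -118.245738), (34.053252, -118.252694),
]

def quadrant(lat, lng):
    """Label a point NE, NW, SE or SW relative to the origin."""
    north_south = "N" if lat >= origin_lat else "S"
    east_west = "E" if lng >= origin_lng else "W"
    return north_south + east_west

counts = Counter(quadrant(lat, lng) for lat, lng in venues)
print(counts)
```

Running the same count over `dataframe_filteredAll` would give the split for the whole dataset.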

#### Creating clusters using KMeans

In [19]:
from sklearn.cluster import KMeans
f1 = dataframe_filteredAll['lng'].values
f2 = dataframe_filteredAll['lat'].values
X = np.array(list(zip(f1, f2)))
# Number of clusters
kmeans = KMeans(n_clusters=3)
# Fitting the input data
kmeans = kmeans.fit(X)
# Getting the cluster labels
labels = kmeans.predict(X)
# Centroid values
centroids = kmeans.cluster_centers_
print(centroids) # each row is a centroid in [lng, lat] order

[[-118.25029118   34.05151975]
 [-118.25389106   34.04504486]
 [-118.26012744   34.048395  ]]
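
For readers unfamiliar with k-means, the sketch below implements its basic iteration (Lloyd's algorithm) in plain Python: assign each point to its nearest centroid, then move each centroid to the mean of its points, and repeat until nothing changes. It is an illustration, not scikit-learn's implementation, which adds a smarter initialisation. The function name `lloyd_kmeans`, the choice of the first three points as starting centroids, and the use of only the ten venues shown in the tables are assumptions made for the example. As in the cell above, longitude and latitude are treated as plane coordinates.

```python
import math

def lloyd_kmeans(points, centroids, max_iter=100):
    """Run Lloyd's algorithm from the given starting centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # assignment step: each point joins its nearest centroid
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # update step: each centroid moves to the mean of its members
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # no centroid moved, so the run has converged
            break
        centroids = updated
    return centroids, clusters

# (lng, lat) of the ten venues listed in the tables above
points = [
    (-118.251428, 34.048919), (-118.254403, 34.051045),
    (-118.256548, 34.049743), (-118.25883, 34.048866),
    (-118.25348, 34.05338), (-118.251186, 34.04828),
    (-118.262098, 34.054339), (-118.253111, 34.053353),
    (-118.245738, 34.053605), (-118.252694, 34.053252),
]
sample_centroids, sample_clusters = lloyd_kmeans(points, points[:3])
print(sample_centroids)
```

Because the starting centroids are fixed, repeated runs of this sketch give the same result, whereas `KMeans` without a `random_state` may vary between runs.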


In [20]:
LA_map = folium.Map(location=[latitude, longitude], zoom_start=14)
folium.Marker([latitude, longitude]).add_to(LA_map)
for lat, lon in zip(dataframe_filteredAll.lat, dataframe_filteredAll.lng):
    folium.Circle([lat, lon], radius=250, color='#00000000', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(LA_map)
for lat, lon in zip(dataframe_filteredAll.lat, dataframe_filteredAll.lng):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(LA_map)
for lon, lat in centroids:
    folium.Circle([lat, lon], radius=500, color='green', fill=False).add_to(LA_map) 
LA_map
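
To put numbers on the cluster map, the distance from Pershing Square to each centroid can be estimated. The sketch below uses the centroid values printed above and an equirectangular approximation, which is adequate over distances of a few kilometres. The helper name `approx_distance_m` and the spherical Earth radius are assumptions of this example; `geodesic` from geopy could be used instead.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres (spherical model)

def approx_distance_m(lat1, lng1, lat2, lng2):
    """Equirectangular approximation of the distance in metres."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lng2 - lng1) * math.cos(mean_lat)  # east-west component
    dy = math.radians(lat2 - lat1)                       # north-south component
    return EARTH_RADIUS_M * math.hypot(dx, dy)

pershing_lat, pershing_lng = 34.0489194285714, -118.253233816327

# centroids as printed by scikit-learn above, in (lng, lat) order
printed_centroids = [
    (-118.25029118, 34.05151975),
    (-118.25389106, 34.04504486),
    (-118.26012744, 34.048395),
]

centroid_distances = [
    approx_distance_m(pershing_lat, pershing_lng, lat, lng)
    for lng, lat in printed_centroids
]
for (lng, lat), metres in zip(printed_centroids, centroid_distances):
    print(f"centroid ({lat:.5f}, {lng:.5f}): {metres:.0f} m")
```

Comparing these distances with the 500 m circles drawn on the map shows how close the chosen location is to each zone of competition.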

## Results and Discussion <a name="results"></a>

Our analysis found 60 restaurants and food trucks around Pershing Square in Los Angeles (in an area of interest with a radius of about 1 km): 30 restaurants and 30 food trucks. Each query was capped at `LIMIT = 30` results and both returned exactly 30 venues, so the true number of venues in the area may be higher. 

The highest concentration of restaurants was detected north and west of Pershing Square, and the lowest density lies to the east. In the immediate vicinity of Pershing Square the density of restaurants and food trucks is low. 

Despite this low density, the nearest restaurant was 166.7 m and the nearest food truck 120.8 m from our desired food truck location.

We divided the data into 3 clusters using k-means; these clusters mark the locations with the greatest number of restaurants and food trucks. Pershing Square lies between 2 of the clusters. 

## Conclusion <a name="conclusion"></a>

The purpose of this project was to identify the restaurants and food trucks close to Pershing Square in Los Angeles, in order to help stakeholders narrow down the possible direct competition for the new food truck. Using Foursquare data, we first examined the density distribution of restaurants and food trucks to identify the areas that justify further analysis. We then clustered the venue locations to create major zones of interest (the zones containing the greatest number of restaurants and food trucks); the centers of those zones can be used as starting points for final exploration by stakeholders.

The final decision on the viability of the food truck location and on its direct competition will be made by stakeholders, based on the specific characteristics of the neighborhoods and locations in every analyzed zone.