# The Battle of Neighborhoods: Pop-ups in San Francisco

In [2]:
import config as cfg
import pandas as pd
from sodapy import Socrata

import requests
from bs4 import BeautifulSoup
import folium
from geopy.geocoders import Nominatim
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors

pd.options.mode.chained_assignment = None

### Table of Contents

* [Introduction](#introduction)
* [Data](#data)


## Introduction: Business Problem <a class="anchor" id="introduction"></a>

The Covid-19 pandemic continues to cause widespread economic disruption, leading to the permanent closure of thousands of businesses. As many eateries shut down, an increasing number of people have difficulty procuring food. Restaurants are trying to remain profitable despite losing a significant amount of business to shelter-in-place orders, bans on indoor seating, and social distancing amid health and safety concerns. However, one kind of alternative eatery continues to operate and is doing better than its operators imagined: food trucks, which serve meals from motorized vehicles or carts.
Food trucks have also seen lower sales because office workers are absent and street traffic has declined sharply. However, unlike restaurants, which are fixed facilities, food trucks can quickly change location, menu, and market. Operators have adapted by branching out into residential areas to capitalize on the large number of people staying at home. Food truck sales fluctuate widely depending on a number of factors, most of which are tied to location.


This report uses machine learning tools to assist **food truck operators** looking for the best locations in San Francisco. Because office workers are largely absent, we will look for locations near **residential areas**. We are also interested in locations near the **workplaces of essential workers**. The report will use data science analysis to identify promising San Francisco neighborhoods based on these criteria. The advantages of each candidate location will be described so that stakeholders can choose the best one.

## Data <a class="anchor" id="data"></a>

In San Francisco, food trucks must satisfy the requirements of [DPW Order 182,101](https://www.sfpublicworks.org/sites/default/files/3858-DPW%20Order_182101-MFF.pdf) to be legal street-food vendors. Hence they can only operate in the approved zones shown in red on the Mobile Food Facility Permit map:

<img src="mff_rev_092014.jpg" width="600"/>

The report will look at areas that are approved for food trucks by using [Mobile Food Facility Permits data](https://data.sfgov.org/Economy-and-Community/Mobile-Food-Facility-Permit/rqzj-sfat) provided by San Francisco Department of Public Works on DataSF.

Factors that will influence our recommendations:
- Whether location is in an approved zone
- The type and location of venues in the neighborhood

We will use the Mobile Food Facility Permits data to define our candidate locations in the approved zones. The fields we need are:
- **facilitytype**: Type of facility permitted: truck or push cart
- **address**
- **location**: Latitude and longitude
- **status**: Status of the permit, for example Approved, Requested, or Issued

This will be joined with location data from the Foursquare API, which provides venue data for those neighborhoods.

### Neighborhood Candidates

Let's derive the latitude and longitude coordinates of our candidate neighborhoods from the Mobile Food Facility Permits data. Each permitted location serves as the center of one candidate neighborhood.

We will keep only permits whose **status** is `ISSUED`, which excludes expired permits, and whose **facilitytype** is `Truck`.

In [5]:
client = Socrata("data.sfgov.org",
                cfg.datasf["App Token"],
                username=cfg.datasf["username"],
                password=cfg.datasf["password"])
results = client.get("rqzj-sfat", limit=2000)
results_df = pd.DataFrame.from_records(results)

mff_df = results_df[["facilitytype", "address", "location", 'status']] 
mff_df = mff_df.loc[mff_df['status'] == "ISSUED"]
mff_df = mff_df.loc[mff_df['facilitytype'] == "Truck"]
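
As an alternative to filtering in pandas, the same two conditions could be pushed to the server as a SoQL query so that only matching rows are downloaded. The following is a minimal sketch, not the approach used in this notebook. It assumes the `status` and `facilitytype` values are exactly those shown here (`ISSUED`, `Truck`); `build_permit_query` and `fetch_truck_permits` are hypothetical helper names, and the `select`, `where`, and `limit` entries are passed through as keyword arguments to sodapy's `client.get`.

```python
def build_permit_query(status="ISSUED", facilitytype="Truck", limit=2000):
    """Build SoQL keyword arguments for Socrata.get (hypothetical helper)."""
    # Single quotes are doubled so that a value containing a quote
    # cannot break out of the SoQL string literal.
    def quote(value):
        return "'" + str(value).replace("'", "''") + "'"

    return {
        "select": "facilitytype, address, location, status",
        "where": f"status = {quote(status)} AND facilitytype = {quote(facilitytype)}",
        "limit": limit,
    }


def fetch_truck_permits(client, dataset_id="rqzj-sfat"):
    """Run the query with an existing sodapy Socrata client (needs network access)."""
    return client.get(dataset_id, **build_permit_query())


query = build_permit_query()
print(query["where"])
```

With the `client` created above, `pd.DataFrame.from_records(fetch_truck_permits(client))` would then yield a frame that needs no further row filtering, provided the assumptions about the column values hold.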

In [6]:
latitudes = mff_df.loc[:,"location"].apply(lambda row: row.get('latitude'))
longitudes = mff_df.loc[:,"location"].apply(lambda row: row.get('longitude'))
mff_df['latitude'] = latitudes
mff_df['longitude'] = longitudes
mff_df.head()

| | facilitytype | address | location | status | latitude | longitude |
|---|---|---|---|---|---|---|
| 187 | Truck | 727 SANSOME ST | {'latitude': '37.7969490060212', 'longitude': ... | ISSUED | 37.7969490060212 | -122.402183431894 |
| 188 | Truck | 400 CALIFORNIA ST | {'latitude': '37.793304275561', 'longitude': '... | ISSUED | 37.793304275561 | -122.401458998413 |
| 239 | Truck | 601 03RD ST | {'latitude': '37.7800771744392', 'longitude': ... | ISSUED | 37.7800771744392 | -122.393767294483 |
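
The quoted values inside the `location` column show that the coordinates arrive as strings, and a permit without a `location` entry would make `row.get` fail. The sketch below shows one way the extraction could be made more defensive by converting the coordinates to floats and dropping rows that lack them. It is an illustration under assumptions: `extract_coordinate` is a hypothetical helper, and the third sample row is made up to demonstrate the missing-value case.

```python
import pandas as pd

# Two rows copied from the output above plus one made-up row with no location,
# included only to show how missing coordinates are handled.
sample = pd.DataFrame({
    "address": ["727 SANSOME ST", "400 CALIFORNIA ST", "EXAMPLE ROW WITHOUT LOCATION"],
    "location": [
        {"latitude": "37.7969490060212", "longitude": "-122.402183431894"},
        {"latitude": "37.793304275561", "longitude": "-122.401458998413"},
        None,
    ],
})


def extract_coordinate(location, key):
    """Return the value stored under `key`, or None if location is not a dict."""
    return location.get(key) if isinstance(location, dict) else None


for key in ("latitude", "longitude"):
    raw = pd.Series(
        [extract_coordinate(loc, key) for loc in sample["location"]],
        index=sample.index,
    )
    # errors="coerce" turns missing or unparseable values into NaN.
    sample[key] = pd.to_numeric(raw, errors="coerce")

# Keep only rows that have both coordinates.
clean = sample.dropna(subset=["latitude", "longitude"]).reset_index(drop=True)
print(clean[["address", "latitude", "longitude"]])
```

Numeric coordinates also make later steps such as plotting with folium or clustering less error-prone than string values.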


### Foursquare

Now that we have our location candidates, we will use the Foursquare API to retrieve information about the venues around each one.

In [20]:
CLIENT_ID = cfg.foursquare["Client Id"]
CLIENT_SECRET = cfg.foursquare["Client Secret"]
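VERSION = '20210118'  # API version date (YYYYMMDD)
LIMIT = 100  # maximum number of venues requested per location
radius = 500  # search radius in meters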

In [17]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
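
The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, so a venue returned without any category would raise an `IndexError`. The sketch below shows how the flattening step could tolerate that case. It runs on hand-written items shaped like the response fields accessed above rather than on a live API call; `flatten_items` and the `"Uncategorized"` default are hypothetical, and the second sample venue is made up.

```python
def flatten_items(name, lat, lng, items, default_category="Uncategorized"):
    """Turn Foursquare 'items' into rows, tolerating venues without categories."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else default_category
        rows.append((
            name, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows


# Hand-written items shaped like the response accessed in getNearbyVenues.
# The first venue appears in this notebook's output; the second is made up.
items = [
    {"venue": {"name": "Kusakabe",
               "location": {"lat": 37.795498, "lng": -122.402952},
               "categories": [{"name": "Sushi Restaurant"}]}},
    {"venue": {"name": "Example Venue",
               "location": {"lat": 37.7960, "lng": -122.4020},
               "categories": []}},
]

rows = flatten_items("727 SANSOME ST", 37.7969490060212, -122.402183431894, items)
for row in rows:
    print(row[3], "->", row[6])
```

The returned tuples have the same seven fields, in the same order, as the columns assigned to `nearby_venues`, so they could be collected and passed to `pd.DataFrame` in the same way.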

In [18]:
mff_venues = getNearbyVenues(
    names = mff_df['address'],
    latitudes = mff_df['latitude'],
    longitudes = mff_df['longitude'],
)

727 SANSOME ST
400 CALIFORNIA ST
601 03RD ST


In [22]:
mff_venues.head()

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | 727 SANSOME ST | 37.7969490060212 | -122.402183431894 | Kusakabe | 37.795498 | -122.402952 | Sushi Restaurant |
| 1 | 727 SANSOME ST | 37.7969490060212 | -122.402183431894 | Shinola | 37.796092 | -122.402913 | Men's Store |
| 2 | 727 SANSOME ST | 37.7969490060212 | -122.402183431894 | Verjus | 37.795579 | -122.402675 | Wine Bar |
| 3 | 727 SANSOME ST | 37.7969490060212 | -122.402183431894 | Cotogna | 37.797346 | -122.403624 | Italian Restaurant |
| 4 | 727 SANSOME ST | 37.7969490060212 | -122.402183431894 | Kokkari Estiatorio | 37.796883 | -122.399655 | Greek Restaurant |
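
Since the type of venues around each location is one of the factors named in the Data section, a per-location summary is a natural way to inspect this table. The sketch below is an illustration only: it rebuilds the five rows shown above by hand and computes the venue count and the share of venues whose category contains the word "Restaurant", which is an assumed, rough proxy for nearby food competition rather than a measure defined in this report.

```python
import pandas as pd

# The five rows shown in the output above, re-entered by hand.
venues = pd.DataFrame({
    "Neighborhood": ["727 SANSOME ST"] * 5,
    "Venue": ["Kusakabe", "Shinola", "Verjus", "Cotogna", "Kokkari Estiatorio"],
    "Venue Category": ["Sushi Restaurant", "Men's Store", "Wine Bar",
                       "Italian Restaurant", "Greek Restaurant"],
})

# Number of venues found around each candidate location.
venue_counts = venues.groupby("Neighborhood")["Venue"].count()

# Share of venues per location whose category mentions "Restaurant".
is_restaurant = venues["Venue Category"].str.contains("Restaurant")
restaurant_share = is_restaurant.groupby(venues["Neighborhood"]).mean()

summary = pd.DataFrame({
    "venue_count": venue_counts,
    "restaurant_share": restaurant_share,
})
print(summary)
```

Applied to the full `mff_venues` frame, the same grouping would give one row per permitted address.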
