## Capstone Project-The Battle of the Neighborhoods (Week 1)
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)

## Introduction: Business Problem <a name="introduction"></a>

With the COVID-19 pandemic, many businesses like offices, schools, colleges, etc. have been shut down forcing people to work/study from home. However, the food industry personnel do not have the opportunity to work from home and are required to continue with their daily work activities while adhering to the safety measures.

What’s concerning is that it can survive on different surfaces for quite a long time and once we come in contact with a contaminated surface, we can get infected too. This naturally makes us think twice before ordering food online or before visiting a restaurant to dine-in.

This would also change consumer behavior and would affect our decision while choosing any food outlet. Consumers would not only look at food outlets that serve good quality food at a good price but will also look at the hygiene rating and the area in which the food outlet is located in to ensure that safety and hygiene are not being compromised.

## Data <a name="data"></a>

The aim of this project is to cluster the food outlets in San Francisco, California based on:

1. Customer Rating
2. Inspection Score
3. Location
4. Coronavirus cases in the neighborhood
5. Online Delivery Service 

Based on definition of our problem, factors that will influence our decission are:
* Health Aspects of food outlet looking at Location Scores and Cornavirus cases.
* Number of existing restaurants in the neighborhood (any type of restaurant)
* Number of and distance to Italian restaurants in the neighborhood, if any

We will be using data from the San Francisco Government API for Covid-19 and Health Inspection Data and the Foursquare API:

- [San Francisco Neighborhood Covid-19 Data](#Covid_Data) - To get all the confirmed coronavirus cases in the different neighborhoods of San Francisco
                
- [San Francisco Government Restaurant Health Inspection Data](#SF_Data) - Using the San Francisco's LIVES restaurant inspection data leverages the LIVES Flattened Schema (https://goo.gl/c3nNvr), which is based on LIVES version 2.0, cited on Yelp's website (http://www.yelp.com/healthscores).

- [FourSquare API](#FourSquare) - Use the location coordinates of the districts we received from the Covid19 API and pass it as input to the FourSqaure API to retrieve 100 venues within 4 kms for each Neighborhood of San Francisco.

In [1]:
import warnings
warnings. filterwarnings("ignore")

In [2]:
#Install Libraries
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors

from pandas.io.json import json_normalize
import requests
import json

!pip install geopy
from geopy.geocoders import Nominatim
from sklearn.cluster import KMeans
!pip install folium
import folium
print("Libraries imported!")

Libraries imported!


### San Francisco Covid19 Data <a name="Covid_Data"></a>

In [3]:
# Create URL to JSON file (alternatively this can be a filepath)
#url = 'https://data.sfgov.org/resource/tef6-3vsw.json'
url = 'https://data.sfgov.org/resource/tpyr-dvnc.json'
# Load the first sheet of the JSON file into a data frame
df = pd.read_json(url, orient='columns')

# View the first five rows
df.head()
covid_df = df.loc[df['area_type'] == 'Analysis Neighborhood']
covid_df.rename(columns={'id': 'Neighborhood', 'count': 'Cases','rate': 'Rate of Cases per 10k'}, inplace=True)
covid_df.head()

Unnamed: 0,area_type,Neighborhood,Cases,Rate of Cases per 10k,deaths,acs_population,last_updated_at,multipolygon
1,Analysis Neighborhood,Financial District/South Beach,79.0,40.600267,0.0,19458,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1..."
2,Analysis Neighborhood,Haight Ashbury,43.0,23.213129,0.0,18524,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1..."
8,Analysis Neighborhood,Outer Richmond,95.0,20.701227,,45891,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1..."
9,Analysis Neighborhood,Visitacion Valley,225.0,118.389897,,19005,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1..."
16,Analysis Neighborhood,Lincoln Park,,,,305,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1..."


In [77]:
covid_df['latitude'] = 0

In [78]:
covid_df['longitude'] = 0

In [66]:
address = "Lakeshore, San Francisco, CA"
geolocator = Nominatim(user_agent="SanFran_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

print('The geograpical coordinate of this city are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of this city are 42.3038299, -82.8189267.


In [98]:
covid_df.head()

Unnamed: 0,area_type,Neighborhood,Cases,Rate of Cases per 10k,deaths,acs_population,last_updated_at,multipolygon,latitude,longitude
1,Analysis Neighborhood,Financial District/South Beach,79.0,40.600267,0.0,19458,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.793647,-122.398938
2,Analysis Neighborhood,Haight Ashbury,43.0,23.213129,0.0,18524,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.770015,-122.446952
8,Analysis Neighborhood,Outer Richmond,95.0,20.701227,,45891,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.777046,-122.465453
9,Analysis Neighborhood,Visitacion Valley,225.0,118.389897,,19005,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.712132,-122.409713
16,Analysis Neighborhood,Lincoln Park,,,,305,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.784625,-122.499086
22,Analysis Neighborhood,Nob Hill,96.0,36.11874,,26579,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.793262,-122.415249
34,Analysis Neighborhood,Glen Park,30.0,34.718204,,8641,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.734281,-122.43447
36,Analysis Neighborhood,Bernal Heights,187.0,72.318045,,25858,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.742986,-122.415804
37,Analysis Neighborhood,Castro/Upper Market,66.0,29.617663,0.0,22284,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.760856,-122.434957
38,Analysis Neighborhood,Mission,810.0,135.817167,,59639,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",49.158935,-122.283583


In [130]:
nbr = 'McLaren Park'
latitude = 37.7180842
longitude = -122.4190721
covid_df.loc[covid_df['Neighborhood'] == nbr, 'latitude'] = latitude
covid_df.loc[covid_df['Neighborhood'] == nbr, 'longitude'] = longitude

In [131]:
covid_df

Unnamed: 0,area_type,Neighborhood,Cases,Rate of Cases per 10k,deaths,acs_population,last_updated_at,multipolygon,latitude,longitude
1,Analysis Neighborhood,Financial District/South Beach,79.0,40.600267,0.0,19458,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.793647,-122.398938
2,Analysis Neighborhood,Haight Ashbury,43.0,23.213129,0.0,18524,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.770015,-122.446952
8,Analysis Neighborhood,Outer Richmond,95.0,20.701227,,45891,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.777046,-122.465453
9,Analysis Neighborhood,Visitacion Valley,225.0,118.389897,,19005,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.712132,-122.409713
16,Analysis Neighborhood,Lincoln Park,,,,305,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.784625,-122.499086
22,Analysis Neighborhood,Nob Hill,96.0,36.11874,,26579,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.793262,-122.415249
34,Analysis Neighborhood,Glen Park,30.0,34.718204,,8641,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.734281,-122.43447
36,Analysis Neighborhood,Bernal Heights,187.0,72.318045,,25858,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.742986,-122.415804
37,Analysis Neighborhood,Castro/Upper Market,66.0,29.617663,0.0,22284,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.760856,-122.434957
38,Analysis Neighborhood,Mission,810.0,135.817167,,59639,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.76,-122.42


In [132]:
covid_df.head()

Unnamed: 0,area_type,Neighborhood,Cases,Rate of Cases per 10k,deaths,acs_population,last_updated_at,multipolygon,latitude,longitude
1,Analysis Neighborhood,Financial District/South Beach,79.0,40.600267,0.0,19458,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.793647,-122.398938
2,Analysis Neighborhood,Haight Ashbury,43.0,23.213129,0.0,18524,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.770015,-122.446952
8,Analysis Neighborhood,Outer Richmond,95.0,20.701227,,45891,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.777046,-122.465453
9,Analysis Neighborhood,Visitacion Valley,225.0,118.389897,,19005,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.712132,-122.409713
16,Analysis Neighborhood,Lincoln Park,,,,305,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.784625,-122.499086


In [5]:
# Base San Francisco Map
address = "San Francisco, CA"
geolocator = Nominatim(user_agent="SanFran_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
# San Francisco latitude and longitude values
latitude = 37.77
longitude = -122.42

print('The geograpical coordinate of this city are {}, {}.'.format(latitude, longitude))
map_SF = folium.Map(location=[latitude, longitude], zoom_start=12)
map_SF

The geograpical coordinate of this city are 37.77, -122.42.


### San Francisco Health Inspection Data <a name="SF_Data"></a>

In [6]:
# Create URL to JSON file (alternatively this can be a filepath)
url = 'https://data.sfgov.org/resource/pyih-qa8i.json'

# Load the first sheet of the JSON file into a data frame
health_df = pd.read_json(url, orient='columns')
health_df['business_name'].str.strip()
# View the first five rows
health_df.head()

Unnamed: 0,business_id,business_name,business_address,business_city,business_state,business_postal_code,inspection_id,inspection_date,inspection_type,violation_id,...,inspection_score,business_latitude,business_longitude,business_location,:@computed_region_fyvs_ahh9,:@computed_region_p5aj_wyqh,:@computed_region_rxqg_mtj9,:@computed_region_yftq_j783,:@computed_region_bh8s_q3mv,:@computed_region_ajp5_b2md
0,69618,Fancy Wheatfield Bakery,1362 Stockton St,San Francisco,CA,94133,6961820190304,2019-03-04T00:00:00.000,Complaint,6.96182e+18,...,,,,,,,,,,
1,97975,BREADBELLY,1408 Clement St,San Francisco,CA,94118,9797520190725,2019-07-25T00:00:00.000,Routine - Unscheduled,9.79752e+18,...,96.0,,,,,,,,,
2,69487,Hakkasan San Francisco,1 Kearny St,San Francisco,CA,94108,6948720180418,2018-04-18T00:00:00.000,Routine - Unscheduled,6.94872e+18,...,88.0,,,,,,,,,
3,91044,Chopsticks Restaurant,4615 Mission St,San Francisco,CA,94112,9104420170818,2017-08-18T00:00:00.000,Non-inspection site visit,,...,,,,,,,,,,
4,85987,Tselogs,552 Jones St,San Francisco,CA,94102,8598720180412,2018-04-12T00:00:00.000,Routine - Unscheduled,8.59872e+18,...,94.0,,,,,,,,,


In [7]:
# Only work with Restaurants with an inspection score
health_df = health_df[health_df['inspection_score'].notnull()]
health_df.sort_values(by='inspection_score', ascending=False)

Unnamed: 0,business_id,business_name,business_address,business_city,business_state,business_postal_code,inspection_id,inspection_date,inspection_type,violation_id,...,inspection_score,business_latitude,business_longitude,business_location,:@computed_region_fyvs_ahh9,:@computed_region_p5aj_wyqh,:@computed_region_rxqg_mtj9,:@computed_region_yftq_j783,:@computed_region_bh8s_q3mv,:@computed_region_ajp5_b2md
141,93193,Ballast Coffee,329 West Portal Ave,San Francisco,CA,94127,9319320181101,2018-11-01T00:00:00.000,Routine - Unscheduled,,...,100.0,,,,,,,,,
160,90010,Noeteca,1551 Dolores St,San Francisco,CA,94110,9001020190729,2019-07-29T00:00:00.000,Routine - Unscheduled,,...,100.0,,,,,,,,,
407,95129,Homeplate Boba Cart,"24 Willie Mays Pl View Level, Section 319",San Francisco,CA,94107,9512920180911,2018-09-11T00:00:00.000,Routine - Unscheduled,,...,100.0,,,,,,,,,
781,94935,94635 Baby Bull Cart,24 Willie Mays Pl Upper CF Sec 143,San Francisco,CA,94107,9493520190412,2019-04-12T00:00:00.000,Routine - Unscheduled,,...,100.0,,,,,,,,,
247,86381,The Saratoga,1000 Larkin St,San Francisco,CA,94109,8638120190822,2019-08-22T00:00:00.000,Routine - Unscheduled,,...,100.0,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
888,71008,House of Pancakes,937 TARAVAL,San Francisco,CA,94116,7100820190820,2019-08-20T00:00:00.000,Routine - Unscheduled,7.100820e+18,...,59.0,,,,,,,,,
136,90622,Taqueria Lolita,750 Phelps St,San Francisco,CA,94124,9062220180821,2018-08-21T00:00:00.000,Routine - Unscheduled,9.062220e+18,...,57.0,,,,,,,,,
890,91843,Hello Sandwich & Noodle,426 Larkin St,San Francisco,CA,94102,9184320180822,2018-08-22T00:00:00.000,Routine - Unscheduled,9.184320e+18,...,55.0,,,,,,,,,
707,1154,SUNFLOWER RESTAURANT,506 Valencia St,San Francisco,CA,94103,115420190327,2019-03-27T00:00:00.000,Routine - Unscheduled,1.154202e+17,...,46.0,37.764678,-122.421905,"{'type': 'Point', 'coordinates': [-122.421905,...",19.0,4.0,5.0,8.0,28859.0,20.0


In [9]:
#Remove duplicates
health_df = health_df.drop_duplicates(subset='business_id', keep="first")
health_df.shape

(408, 23)

In [10]:
# df['risk_category'].value_counts()
health_df['inspection_score'].describe()

count    408.000000
mean      87.473039
std        8.921677
min       46.000000
25%       82.000000
50%       88.000000
75%       94.000000
max      100.000000
Name: inspection_score, dtype: float64

#### 408 Businesses Inspections Scores range between 46 and 100

In [296]:
health_df.head()

Unnamed: 0,business_id,business_name,business_address,business_city,business_state,business_postal_code,inspection_id,inspection_date,inspection_type,violation_id,violation_description,risk_category,business_phone_number,inspection_score,business_latitude,business_longitude,business_location,:@computed_region_fyvs_ahh9,:@computed_region_p5aj_wyqh,:@computed_region_rxqg_mtj9,:@computed_region_yftq_j783,:@computed_region_bh8s_q3mv,:@computed_region_ajp5_b2md
1,97975,BREADBELLY,1408 Clement St,San Francisco,CA,94118,9797520190725,2019-07-25T00:00:00.000,Routine - Unscheduled,9.79752e+18,Inadequately cleaned or sanitized food contact surfaces,Moderate Risk,14157240000.0,96.0,,,,,,,,,
2,69487,Hakkasan San Francisco,1 Kearny St,San Francisco,CA,94108,6948720180418,2018-04-18T00:00:00.000,Routine - Unscheduled,6.94872e+18,Inadequate and inaccessible handwashing facilities,Moderate Risk,,88.0,,,,,,,,,
4,85987,Tselogs,552 Jones St,San Francisco,CA,94102,8598720180412,2018-04-12T00:00:00.000,Routine - Unscheduled,8.59872e+18,Improper thawing methods,Moderate Risk,,94.0,,,,,,,,,
8,77901,"The Estate Kitchen, LLC",799 Bryant St,San Francisco,CA,94107,7790120180416,2018-04-16T00:00:00.000,Routine - Unscheduled,7.79012e+18,Improper food storage,Low Risk,,86.0,,,,,,,,,
9,87782,Beloved Cafe,3338 24th St,San Francisco,CA,94110,8778220180502,2018-05-02T00:00:00.000,Routine - Unscheduled,8.77822e+18,Low risk vermin infestation,Low Risk,14155540000.0,96.0,,,,,,,,,


In [12]:
def get_geocode(address):
    LatLong = []
    address = address + ', San Francisco, CA'
    geolocator = Nominatim(user_agent="SF_explorer")
    location = geolocator.geocode(address)
    if location is None:
        latitude = 0
        longitude = 0
    else:
        latitude = location.latitude
        longitude = location.longitude
    
    return latitude, longitude

In [332]:
health_df.head()

Unnamed: 0,business_id,business_name,business_address,business_city,business_state,business_postal_code,inspection_id,inspection_date,inspection_type,violation_id,violation_description,risk_category,business_phone_number,inspection_score,business_latitude,business_longitude,business_location,:@computed_region_fyvs_ahh9,:@computed_region_p5aj_wyqh,:@computed_region_rxqg_mtj9,:@computed_region_yftq_j783,:@computed_region_bh8s_q3mv,:@computed_region_ajp5_b2md
1,97975,BREADBELLY,1408 Clement St,San Francisco,CA,94118,9797520190725,2019-07-25T00:00:00.000,Routine - Unscheduled,9.79752e+18,Inadequately cleaned or sanitized food contact surfaces,Moderate Risk,14157200000.0,96,,,,,,,,,
2,69487,Hakkasan San Francisco,1 Kearny St,San Francisco,CA,94108,6948720180418,2018-04-18T00:00:00.000,Routine - Unscheduled,6.94872e+18,Inadequate and inaccessible handwashing facilities,Moderate Risk,,88,,,,,,,,,
4,85987,Tselogs,552 Jones St,San Francisco,CA,94102,8598720180412,2018-04-12T00:00:00.000,Routine - Unscheduled,8.59872e+18,Improper thawing methods,Moderate Risk,,94,,,,,,,,,
8,77901,"The Estate Kitchen, LLC",799 Bryant St,San Francisco,CA,94107,7790120180416,2018-04-16T00:00:00.000,Routine - Unscheduled,7.79012e+18,Improper food storage,Low Risk,,86,,,,,,,,,
9,87782,Beloved Cafe,3338 24th St,San Francisco,CA,94110,8778220180502,2018-05-02T00:00:00.000,Routine - Unscheduled,8.77822e+18,Low risk vermin infestation,Low Risk,14155500000.0,96,,,,,,,,,


In [13]:
#Populate Coordinates
for index, row in health_df.iterrows():
    lat, lng = get_geocode(row['business_address'])
    row['business_longitude'] = lng
    row['business_latitude'] = lat

In [14]:
health_df.head()

Unnamed: 0,business_id,business_name,business_address,business_city,business_state,business_postal_code,inspection_id,inspection_date,inspection_type,violation_id,...,inspection_score,business_latitude,business_longitude,business_location,:@computed_region_fyvs_ahh9,:@computed_region_p5aj_wyqh,:@computed_region_rxqg_mtj9,:@computed_region_yftq_j783,:@computed_region_bh8s_q3mv,:@computed_region_ajp5_b2md
1,97975,BREADBELLY,1408 Clement St,San Francisco,CA,94118,9797520190725,2019-07-25T00:00:00.000,Routine - Unscheduled,9.79752e+18,...,96.0,,,,,,,,,
2,69487,Hakkasan San Francisco,1 Kearny St,San Francisco,CA,94108,6948720180418,2018-04-18T00:00:00.000,Routine - Unscheduled,6.94872e+18,...,88.0,,,,,,,,,
4,85987,Tselogs,552 Jones St,San Francisco,CA,94102,8598720180412,2018-04-12T00:00:00.000,Routine - Unscheduled,8.59872e+18,...,94.0,,,,,,,,,
8,77901,"The Estate Kitchen, LLC",799 Bryant St,San Francisco,CA,94107,7790120180416,2018-04-16T00:00:00.000,Routine - Unscheduled,7.79012e+18,...,86.0,,,,,,,,,
9,87782,Beloved Cafe,3338 24th St,San Francisco,CA,94110,8778220180502,2018-05-02T00:00:00.000,Routine - Unscheduled,8.77822e+18,...,96.0,,,,,,,,,


In [15]:
health_df = health_df[health_df['business_latitude'] != 0]
health_df

Unnamed: 0,business_id,business_name,business_address,business_city,business_state,business_postal_code,inspection_id,inspection_date,inspection_type,violation_id,...,inspection_score,business_latitude,business_longitude,business_location,:@computed_region_fyvs_ahh9,:@computed_region_p5aj_wyqh,:@computed_region_rxqg_mtj9,:@computed_region_yftq_j783,:@computed_region_bh8s_q3mv,:@computed_region_ajp5_b2md
1,97975,BREADBELLY,1408 Clement St,San Francisco,CA,94118,9797520190725,2019-07-25T00:00:00.000,Routine - Unscheduled,9.797520e+18,...,96.0,,,,,,,,,
2,69487,Hakkasan San Francisco,1 Kearny St,San Francisco,CA,94108,6948720180418,2018-04-18T00:00:00.000,Routine - Unscheduled,6.948720e+18,...,88.0,,,,,,,,,
4,85987,Tselogs,552 Jones St,San Francisco,CA,94102,8598720180412,2018-04-12T00:00:00.000,Routine - Unscheduled,8.598720e+18,...,94.0,,,,,,,,,
8,77901,"The Estate Kitchen, LLC",799 Bryant St,San Francisco,CA,94107,7790120180416,2018-04-16T00:00:00.000,Routine - Unscheduled,7.790120e+18,...,86.0,,,,,,,,,
9,87782,Beloved Cafe,3338 24th St,San Francisco,CA,94110,8778220180502,2018-05-02T00:00:00.000,Routine - Unscheduled,8.778220e+18,...,96.0,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
989,1671,Just For You Cafe Inc.,732 22ND St,San Francisco,CA,94107,167120190624,2019-06-24T00:00:00.000,Routine - Unscheduled,1.671202e+17,...,92.0,37.757948,-122.388862,"{'type': 'Point', 'coordinates': [-122.388862,...",29.0,3.0,8.0,10.0,28856.0,26.0
992,1269,STARBUCKS,201 SPEAR St,San Francisco,CA,94105,126920190716,2019-07-16T00:00:00.000,Routine - Unscheduled,1.269202e+17,...,94.0,37.790944,-122.392051,"{'type': 'Point', 'coordinates': [-122.392051,...",6.0,2.0,9.0,6.0,28855.0,8.0
995,95311,95311 C&C Concessions/Portable 130 Lemonade Sn...,24 Willie Mays Pl Promenade Lvl Sect 130,San Francisco,CA,94107,9531120190412,2019-04-12T00:00:00.000,Routine - Unscheduled,,...,100.0,,,,,,,,,
998,94910,Ike's Kitchen,800 Van Ness Ave,San Francisco,CA,94109,9491020180824,2018-08-24T00:00:00.000,Routine - Unscheduled,9.491020e+18,...,77.0,,,,,,,,,


### Foursquare API Data <a name="FourSquare"></a>

In [16]:
CLIENT_ID = '0KAOUTNBE0UZIMJ0UVOCAWXBISWMGOZ0GRBX53GERNC4GOZR' 
CLIENT_SECRET = 'RMRTIYEFM4QRTZMFDAUZGAAXVZVCTHYR2ULOKXIRO4HHSI5D' 
VERSION = '20200707'
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)
radius=4000 #4 km
LIMIT=100

Your credentials:
CLIENT_ID: 0KAOUTNBE0UZIMJ0UVOCAWXBISWMGOZ0GRBX53GERNC4GOZR
CLIENT_SECRET:RMRTIYEFM4QRTZMFDAUZGAAXVZVCTHYR2ULOKXIRO4HHSI5D


In [162]:
def getNearbyVenues(names, latitudes, longitudes, radius=400):
    venues_list=[]
    LIMIT = 100
    for name, lat, lng in zip(names, latitudes, longitudes):
          # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
          # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng,
            v['venue']['id'],
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
        
        nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue ID',
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    return(nearby_venues)

In [163]:
SF_venues = getNearbyVenues(names=covid_df['Neighborhood'], 
                                    latitudes=covid_df['latitude'], 
                                    longitudes=covid_df['longitude'])
SF_venues

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Financial District/South Beach,37.793647,-122.398938,40c10d00f964a520df001fe3,Embarcadero Center Cinema,37.794871,-122.399648,Indie Movie Theater
1,Financial District/South Beach,37.793647,-122.398938,587449b8abf6322f4435759f,Homegrown,37.793617,-122.400000,Sandwich Place
2,Financial District/South Beach,37.793647,-122.398938,51acf1382fc674aef7a09b1a,Wheel House,37.794481,-122.399970,Gym
3,Financial District/South Beach,37.793647,-122.398938,4bf2e4376a31d13ac307942e,Blue Hawaii Açaí Café,37.794668,-122.397912,Acai House
4,Financial District/South Beach,37.793647,-122.398938,459b7818f964a52089401fe3,Perbacco,37.793288,-122.399134,Italian Restaurant
...,...,...,...,...,...,...,...,...
1418,Lakeshore,37.733611,-122.491389,4af730b2f964a520fb0622e3,Big 5 Sporting Goods,37.732598,-122.490345,Sporting Goods Shop
1419,Lakeshore,37.733611,-122.491389,4b7a016cf964a520b61e2fe3,Noah's Bagels,37.732443,-122.489940,Bagel Shop
1420,Lakeshore,37.733611,-122.491389,4a809dc6f964a520b6f51fe3,YUYU Sushi,37.733055,-122.490703,Sushi Restaurant
1421,Lakeshore,37.733611,-122.491389,54710b85498e6ff092294183,MassageLuXe,37.733346,-122.490851,Spa


In [164]:
SF_venues['Neighborhood'].value_counts()

Hayes Valley                      100
Tenderloin                         92
Chinatown                          92
Castro/Upper Market                80
Mission                            80
Haight Ashbury                     76
Outer Mission                      70
Marina                             66
North Beach                        65
Japantown                          61
South of Market                    55
Financial District/South Beach     49
Noe Valley                         48
Mission Bay                        47
Nob Hill                           43
Inner Sunset                       38
Glen Park                          37
Excelsior                          36
Pacific Heights                    33
Presidio Heights                   25
Russian Hill                       25
Western Addition                   23
Lakeshore                          23
Lone Mountain/USF                  18
Outer Richmond                     18
West of Twin Peaks                 14
Golden Gate 

In [165]:
SF_venues['Venue Category'].value_counts()
#There are a lot of venues present that might not be present on Zomato so let's drop those

Coffee Shop           68
Park                  46
Bakery                31
Café                  30
Mexican Restaurant    30
                      ..
Nabe Restaurant        1
Persian Restaurant     1
Tattoo Parlor          1
Skating Rink           1
Acai House             1
Name: Venue Category, Length: 272, dtype: int64

In [166]:
SF_venues=SF_venues[SF_venues['Venue Category'].isin(['Pizza Place','Italian Restaurant'])]
SF_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category
4,Financial District/South Beach,37.793647,-122.398938,459b7818f964a52089401fe3,Perbacco,37.793288,-122.399134,Italian Restaurant
59,Haight Ashbury,37.770015,-122.446952,5849c58a6ad73d598e7f4174,Slice House by Tony Gemignani,37.769832,-122.44757,Pizza Place
103,Haight Ashbury,37.770015,-122.446952,44786832f964a520ca331fe3,Escape From New York Pizza,37.769416,-122.451361,Pizza Place
121,Haight Ashbury,37.770015,-122.446952,4b6cec46f964a5201a5e2ce3,cookwithjames,37.767981,-122.445134,Italian Restaurant
137,Outer Richmond,37.777046,-122.465453,5463e257498e84f8746a0d9d,Grinders Pizzeria,37.777373,-122.46371,Pizza Place


In [167]:
SF_venues.shape

(56, 8)

In [175]:
SF_venues['rating'] = ""

In [176]:
SF_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category,rating
4,Financial District/South Beach,37.793647,-122.398938,459b7818f964a52089401fe3,Perbacco,37.793288,-122.399134,Italian Restaurant,
59,Haight Ashbury,37.770015,-122.446952,5849c58a6ad73d598e7f4174,Slice House by Tony Gemignani,37.769832,-122.44757,Pizza Place,
103,Haight Ashbury,37.770015,-122.446952,44786832f964a520ca331fe3,Escape From New York Pizza,37.769416,-122.451361,Pizza Place,
121,Haight Ashbury,37.770015,-122.446952,4b6cec46f964a5201a5e2ce3,cookwithjames,37.767981,-122.445134,Italian Restaurant,
137,Outer Richmond,37.777046,-122.465453,5463e257498e84f8746a0d9d,Grinders Pizzeria,37.777373,-122.46371,Pizza Place,


In [178]:
# GET https://api.foursquare.com/v2/venues/VENUE_ID
venues_ids= SF_venues['Venue ID']
ratings=[]
for venue_id in venues_ids.values.tolist():
    url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)
    result = requests.get(url).json()
    try:
        print(result['response']['venue']['rating'])
        # SF_venues.loc[SF_venues.rating == result['response']['venue']['rating'], "Venue ID"] = venue_id
        #ratings=ratings+[venues_rating]
    except:
        print('This venue has not been rated yet.')
#ratings

#venue_id = '4f3232e219836c91c7bfde94' # ID of Conca Cucina Italian Restaurant
#url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)

This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not been rated yet.
This venue has not b

In [179]:
SF_Merged_df = pd.merge(SF_venues, covid_df, on='Neighborhood')

In [180]:
SF_Merged_df.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category,rating,area_type,Cases,Rate of Cases per 10k,deaths,acs_population,last_updated_at,multipolygon,latitude,longitude
0,Financial District/South Beach,37.793647,-122.398938,459b7818f964a52089401fe3,Perbacco,37.793288,-122.399134,Italian Restaurant,,Analysis Neighborhood,79.0,40.600267,0.0,19458,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.793647,-122.398938
1,Haight Ashbury,37.770015,-122.446952,5849c58a6ad73d598e7f4174,Slice House by Tony Gemignani,37.769832,-122.44757,Pizza Place,,Analysis Neighborhood,43.0,23.213129,0.0,18524,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.770015,-122.446952
2,Haight Ashbury,37.770015,-122.446952,44786832f964a520ca331fe3,Escape From New York Pizza,37.769416,-122.451361,Pizza Place,,Analysis Neighborhood,43.0,23.213129,0.0,18524,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.770015,-122.446952
3,Haight Ashbury,37.770015,-122.446952,4b6cec46f964a5201a5e2ce3,cookwithjames,37.767981,-122.445134,Italian Restaurant,,Analysis Neighborhood,43.0,23.213129,0.0,18524,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.770015,-122.446952
4,Outer Richmond,37.777046,-122.465453,5463e257498e84f8746a0d9d,Grinders Pizzeria,37.777373,-122.46371,Pizza Place,,Analysis Neighborhood,95.0,20.701227,,45891,2020-07-21 15:00:16.553,"{'type': 'MultiPolygon', 'coordinates': [[[[-1...",37.777046,-122.465453


In [181]:
SF_Merged_df.shape

(56, 18)

In [None]:
#health_df business_name

In [182]:
c_df = health_df.rename({'business_name': 'Venue'}, axis=1)

In [183]:
c_df.head()

Unnamed: 0,business_id,Venue,business_address,business_city,business_state,business_postal_code,inspection_id,inspection_date,inspection_type,violation_id,...,inspection_score,business_latitude,business_longitude,business_location,:@computed_region_fyvs_ahh9,:@computed_region_p5aj_wyqh,:@computed_region_rxqg_mtj9,:@computed_region_yftq_j783,:@computed_region_bh8s_q3mv,:@computed_region_ajp5_b2md
1,97975,BREADBELLY,1408 Clement St,San Francisco,CA,94118,9797520190725,2019-07-25T00:00:00.000,Routine - Unscheduled,9.79752e+18,...,96.0,,,,,,,,,
2,69487,Hakkasan San Francisco,1 Kearny St,San Francisco,CA,94108,6948720180418,2018-04-18T00:00:00.000,Routine - Unscheduled,6.94872e+18,...,88.0,,,,,,,,,
4,85987,Tselogs,552 Jones St,San Francisco,CA,94102,8598720180412,2018-04-12T00:00:00.000,Routine - Unscheduled,8.59872e+18,...,94.0,,,,,,,,,
8,77901,"The Estate Kitchen, LLC",799 Bryant St,San Francisco,CA,94107,7790120180416,2018-04-16T00:00:00.000,Routine - Unscheduled,7.79012e+18,...,86.0,,,,,,,,,
9,87782,Beloved Cafe,3338 24th St,San Francisco,CA,94110,8778220180502,2018-05-02T00:00:00.000,Routine - Unscheduled,8.77822e+18,...,96.0,,,,,,,,,


In [184]:
SF_Final_df = pd.merge(SF_Merged_df, c_df, on='Venue', how='outer')

In [185]:
SF_Final_df.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue ID,Venue,Venue Latitude,Venue Longitude,Venue Category,rating,area_type,...,inspection_score,business_latitude,business_longitude,business_location,:@computed_region_fyvs_ahh9,:@computed_region_p5aj_wyqh,:@computed_region_rxqg_mtj9,:@computed_region_yftq_j783,:@computed_region_bh8s_q3mv,:@computed_region_ajp5_b2md
0,Financial District/South Beach,37.793647,-122.398938,459b7818f964a52089401fe3,Perbacco,37.793288,-122.399134,Italian Restaurant,,Analysis Neighborhood,...,,,,,,,,,,
1,Haight Ashbury,37.770015,-122.446952,5849c58a6ad73d598e7f4174,Slice House by Tony Gemignani,37.769832,-122.44757,Pizza Place,,Analysis Neighborhood,...,,,,,,,,,,
2,Haight Ashbury,37.770015,-122.446952,44786832f964a520ca331fe3,Escape From New York Pizza,37.769416,-122.451361,Pizza Place,,Analysis Neighborhood,...,,,,,,,,,,
3,Haight Ashbury,37.770015,-122.446952,4b6cec46f964a5201a5e2ce3,cookwithjames,37.767981,-122.445134,Italian Restaurant,,Analysis Neighborhood,...,,,,,,,,,,
4,Outer Richmond,37.777046,-122.465453,5463e257498e84f8746a0d9d,Grinders Pizzeria,37.777373,-122.46371,Pizza Place,,Analysis Neighborhood,...,,,,,,,,,,
