# Introduction and Business Problem Statement

Introduction

Across the country, COVID-19 has closed countless restaurants. According to a Yelp report published in September, there were 32,109 restaurant closures as of August 31, with 19,590 restaurants across the nation having permanently shuttered their doors since March. Yet new restaurants are still opening despite the pandemic. Many studies have found that businesses that work well for delivery and takeout, including food trucks, bakeries and coffee shops, have been able to keep their closure rates lower than others.

The aim of this project is to identify an optimal location for opening a coffee shop in New York City during the COVID-19 pandemic. In this report, we focus on all neighborhoods in the New York City area.


Data
1. NYC Boroughs/Neighborhood Geospatial Dataset 
2. Foursquare venue data through the Foursquare API


Let's get a brief overview of the structure of New York City.


In [102]:
# import all the required libraries
import pandas as pd
import numpy as np
import geopy
import requests
from geopy.geocoders import Nominatim
import json # library to handle JSON files
import wget

print('Libraries imported')

Libraries imported


In [103]:
# Import Folium to display maps
import folium
print('Folium Library imported')

Folium Library imported


In [106]:
# Let's download and explore the above mentioned datasets
wget.download('https://cocl.us/new_york_dataset', 'newyork_data.json')
print('Data downloaded!')


In [107]:
# Open the json file containing NYC data and display a feature
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data) 

In [108]:
newyork_data

{'type': 'FeatureCollection',
 'totalFeatures': 306,
 'features': [{'type': 'Feature',
   'id': 'nyu_2451_34572.1',
   'geometry': {'type': 'Point',
    'coordinates': [-73.84720052054902, 40.89470517661]},
   'geometry_name': 'geom',
   'properties': {'name': 'Wakefield',
    'stacked': 1,
    'annoline1': 'Wakefield',
    'annoline2': None,
    'annoline3': None,
    'annoangle': 0.0,
    'borough': 'Bronx',
    'bbox': [-73.84720052054902,
     40.89470517661,
     -73.84720052054902,
     40.89470517661]}},
  {'type': 'Feature',
   'id': 'nyu_2451_34572.2',
   'geometry': {'type': 'Point',
    'coordinates': [-73.82993910812398, 40.87429419303012]},
   'geometry_name': 'geom',
   'properties': {'name': 'Co-op City',
    'stacked': 2,
    'annoline1': 'Co-op',
    'annoline2': 'City',
    'annoline3': None,
    'annoangle': 0.0,
    'borough': 'Bronx',
    'bbox': [-73.82993910812398,
     40.87429419303012,
     -73.82993910812398,
     40.87429419303012]}},
  {'type': 'Feature',
 

In [109]:
neighborhoods_data=newyork_data['features']
neighborhoods_data

[{'type': 'Feature',
  'id': 'nyu_2451_34572.1',
  'geometry': {'type': 'Point',
   'coordinates': [-73.84720052054902, 40.89470517661]},
  'geometry_name': 'geom',
  'properties': {'name': 'Wakefield',
   'stacked': 1,
   'annoline1': 'Wakefield',
   'annoline2': None,
   'annoline3': None,
   'annoangle': 0.0,
   'borough': 'Bronx',
   'bbox': [-73.84720052054902,
    40.89470517661,
    -73.84720052054902,
    40.89470517661]}},
 {'type': 'Feature',
  'id': 'nyu_2451_34572.2',
  'geometry': {'type': 'Point',
   'coordinates': [-73.82993910812398, 40.87429419303012]},
  'geometry_name': 'geom',
  'properties': {'name': 'Co-op City',
   'stacked': 2,
   'annoline1': 'Co-op',
   'annoline2': 'City',
   'annoline3': None,
   'annoangle': 0.0,
   'borough': 'Bronx',
   'bbox': [-73.82993910812398,
    40.87429419303012,
    -73.82993910812398,
    40.87429419303012]}},
 {'type': 'Feature',
  'id': 'nyu_2451_34572.3',
  'geometry': {'type': 'Point',
   'coordinates': [-73.82780644716412, 

In [110]:
# define the dataframe columns
column_names = ['Borough','Neighborhood', 'Latitude', 'Longitude'] 
# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

In [111]:
neighborhoods

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude


In [113]:
## Store only the required fields from the feature dictionaries above
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

# DataFrame.append was removed in pandas 2.0, so build the dataframe from the collected rows
neighborhoods = pd.DataFrame(rows, columns=column_names)

In [114]:
#quickly examine the resulting dataframe
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [116]:
#neighborhoods.shape
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 5 boroughs and 306 neighborhoods.
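
The totals above can be complemented with a per-borough breakdown. The following is a minimal sketch, assuming a dataframe with the same `Borough` and `Neighborhood` columns as `neighborhoods`. The `sample` frame is illustrative only: the Bronx rows come from the `head()` output above, while the `Example ...` rows are hypothetical placeholders and not real data.

```python
import pandas as pd

# Illustrative sample: Bronx rows are taken from the head() output above;
# the 'Example ...' rows are hypothetical placeholders, not real neighborhoods.
sample = pd.DataFrame(
    [
        ("Bronx", "Wakefield"),
        ("Bronx", "Co-op City"),
        ("Bronx", "Eastchester"),
        ("Manhattan", "Example A"),
        ("Manhattan", "Example B"),
        ("Queens", "Example C"),
    ],
    columns=["Borough", "Neighborhood"],
)

# Count neighborhoods per borough and sort from largest to smallest
borough_counts = (
    sample.groupby("Borough")["Neighborhood"]
    .count()
    .sort_values(ascending=False)
)
print(borough_counts)
```

Running the same `groupby` on the full `neighborhoods` dataframe would show how the 306 neighborhoods are distributed across the 5 boroughs.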


In [117]:
#use geopy library to get the latitude and longitude values of new york city
address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.
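
Nominatim is a remote web service, so the geocoding call can fail or time out. As a rough fallback, which is an assumption on our part and not something the analysis relies on, the map can be centered on the mean of the neighborhood coordinates instead. The helper name `map_center` is hypothetical, and the two sample rows are taken from the `head()` output earlier in this notebook.

```python
import pandas as pd

def map_center(df):
    """Return (latitude, longitude) as the mean of the dataframe's coordinate columns.

    Hypothetical helper: assumes 'Latitude' and 'Longitude' columns exist.
    """
    return float(df["Latitude"].mean()), float(df["Longitude"].mean())

# Two rows copied from the head() output (Wakefield and Co-op City)
sample = pd.DataFrame(
    {
        "Latitude": [40.894705, 40.874294],
        "Longitude": [-73.847201, -73.829939],
    }
)

center = map_center(sample)
print(center)
```

A simple mean is only a reasonable approximation over a small area such as a single city; it is not a true geographic centroid.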


In [121]:
#create a map of new york using the above coordinates
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)  
    
map_newyork

Now we will use the Foursquare API to explore the neighborhoods and segment them.
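
To show what a venue request for one neighborhood might look like, here is a minimal sketch that only builds the request URL and sends nothing. It assumes the legacy Foursquare v2 `venues/explore` endpoint with its `client_id`, `client_secret`, `v`, `ll`, `radius` and `limit` parameters. The helper name `build_explore_url`, the credential strings, the version date, and the radius and limit values are all placeholders chosen for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint: the legacy Foursquare v2 "explore" route
EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(lat, lng, client_id, client_secret,
                      version="20200901", radius=500, limit=100):
    """Build (but do not send) a venues/explore URL around one coordinate.

    Hypothetical helper; the defaults are illustrative, not tuned values.
    """
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,              # API version date in YYYYMMDD form
        "ll": f"{lat},{lng}",      # latitude,longitude of the search center
        "radius": radius,          # search radius in meters
        "limit": limit,            # maximum number of venues to return
    }
    return f"{EXPLORE_ENDPOINT}?{urlencode(params)}"

# Wakefield coordinates from the dataframe above; credentials are placeholders
url = build_explore_url(40.894705, -73.847201,
                        client_id="YOUR_CLIENT_ID",
                        client_secret="YOUR_CLIENT_SECRET")
print(url)
```

With real credentials, the resulting URL could be passed to `requests.get(url).json()`, and the same call repeated for each row of `neighborhoods`.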