## MVP submission
Identifying foot traffic for a single Duane Reade store using MTA station entry data.

### Importing and selecting a Duane Reade store

In [1]:
# Import env variable API key
from dotenv import dotenv_values
config = dotenv_values(".env")
API_KEY = config["API_KEY"]
OSR_TOKEN = config["OSR_TOKEN"]

In [2]:
# Import libraries
import pandas as pd
import numpy as np
import geopandas
import matplotlib.pyplot as plt

import glob
import json
import requests
from geojson import Feature, Point, FeatureCollection

import folium
from openrouteservice import client
from datetime import datetime

drlocations = pd.read_csv('/Users/joycetagal/GitHub/metis-eda/drlocations_new.csv')

In [3]:
# Code for geocoding latlongs from address. Commented out to ensure API isn't pinged unnecessarily
# import googlemaps
# gmaps = googlemaps.Client(key=API_KEY)
# drlocations["latlon"] = drlocations.address.apply(lambda address : gmaps.geocode(address)[0]['geometry']['location'])
# drlocations = pd.concat([drlocations, drlocations['latlon']
#                         .apply(pd.Series)], axis=1
#                       ).drop("latlon", axis=1)
# drlocations.to_csv("drlocations_new.csv", index=False)

In [4]:
drlocations.head()

Unnamed: 0,store_name,address,lat,lng
0,Duane Reade,"250 BROADWAY, NEW YORK, NY 10007",40.713008,-74.007821
1,Duane Reade,"305 BROADWAY, NEW YORK, NY 10007",40.715479,-74.005545
2,Duane Reade,"17 JOHN ST NEW YORK, NY 10038",40.70997,-74.008741
3,Duane Reade,"185 GREENWICH ST, NEW YORK, NY 10007",40.711566,-74.011426
4,Duane Reade,"200 WATER ST, NEW YORK, NY 10038",40.707256,-74.004805


For the purposes of this notebook, I will use just one Duane Reade store from this dataset: the one at 305 Broadway (row index 1).

In [5]:
drlocation = drlocations.iloc[[1,]]
drlocation

Unnamed: 0,store_name,address,lat,lng
1,Duane Reade,"305 BROADWAY, NEW YORK, NY 10007",40.715479,-74.005545


### Identifying isochrones for the DR location

In [6]:
# First we define our MTA stations dataset. For the MVP I'm hardcoding just 3 stations.

d = [['City Hall', '40.7131583', '-74.00773'], 
     ['Park Place', '40.7131736', '-74.0092965'], 
     ['Fulton Street', '40.7115643' , '-74.009928']
    ]

In [68]:
stations = pd.DataFrame(d, columns = ['station_name', 'lat', 'lng'])

In [69]:
stations.values

array([['City Hall', '40.7131583', '-74.00773'],
       ['Park Place', '40.7131736', '-74.0092965'],
       ['Fulton Street', '40.7115643', '-74.009928']], dtype=object)

In [70]:
clnt = client.Client(key=OSR_TOKEN)
map1 = folium.Map(tiles='Stamen Toner', location=[40.715479, -74.005545], zoom_start=15)

params_iso = {'profile': 'foot-walking',
              'range': [300],  # in seconds: 300 / 60 = 5 minutes of walking
              'attributes': ['total_pop']
             }

result = []
for station_name, lat, lng in stations.values:
    point = [float(lng), float(lat)]  # ORS expects [lng, lat]; coordinates are stored as strings
    params_iso['locations'] = [point]
    iso = clnt.isochrones(**params_iso)
    
    folium.features.GeoJson(iso).add_to(map1)
    result.append(iso)
    #result.append(iso['features'][0]['geometry'])
    
    folium.map.Marker(list(reversed(point)),  # folium expects [lat, lng], so reverse the ORS point
                      icon=folium.Icon(color='lightgray',
                                       icon_color='#cc0000',
                                       icon='subway',
                                       prefix='fa',
                                      ),
                      popup=station_name,
                     ).add_to(map1)  # add the station marker to the map
    
#print(result)

#stations['geometry'] = result
    
map1

These isochrones show, for each MTA station, the area reachable within a 5-minute walk. Using them, we can determine which MTA stations are within walking distance of our Duane Reade store and estimate the daily foot traffic arriving from those stations.
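
As a minimal sketch of that containment test, the snippet below uses shapely to check whether a store coordinate falls inside an isochrone polygon. The `fake_iso` FeatureCollection is a hand-made rectangle standing in for an openrouteservice response (an assumption for illustration, not real isochrone data), and `store_in_isochrone` is a hypothetical helper name.

```python
from shapely.geometry import Point, shape


def store_in_isochrone(iso, lng, lat):
    """Return True if (lng, lat) lies inside any polygon of a GeoJSON FeatureCollection."""
    store = Point(lng, lat)  # GeoJSON order is (longitude, latitude)
    return any(shape(f["geometry"]).contains(store) for f in iso["features"])


# Hypothetical stand-in for an isochrone response: a plain rectangle, not real data
fake_iso = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [-74.010, 40.712], [-74.004, 40.712],
                [-74.004, 40.717], [-74.010, 40.717],
                [-74.010, 40.712],
            ]],
        },
    }],
}

# Store coordinates taken from the drlocations table above
inside = store_in_isochrone(fake_iso, -74.005545, 40.715479)   # 305 Broadway
outside = store_in_isochrone(fake_iso, -74.004805, 40.707256)  # 200 Water St
print(inside, outside)
```

With real data, each `iso` appended to `result` in the loop above could presumably be passed in place of `fake_iso`, since it is also a GeoJSON FeatureCollection.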

Later on in this notebook, I will create GeoPandas dataframes from both the `stations` and `drlocations` dataframes, and then do a spatial left join from the `stations` dataframe to identify all the station isochrones that contain the Duane Reade coordinates.
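
One possible shape for that join, sketched with `geopandas.sjoin`. The rectangles and the "Station A" / "Station B" names are hypothetical placeholders for the real isochrone polygons, and the `predicate` argument assumes geopandas 0.10 or newer (older releases call it `op`).

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical rectangles standing in for the per-station isochrone polygons
iso_gdf = gpd.GeoDataFrame(
    {"station_name": ["Station A", "Station B"]},
    geometry=[
        box(-74.010, 40.712, -74.004, 40.717),  # covers the store below
        box(-74.020, 40.700, -74.015, 40.705),  # does not
    ],
    crs="EPSG:4326",
)

# The selected store as a point geometry, in (lng, lat) order
store_gdf = gpd.GeoDataFrame(
    {"address": ["305 BROADWAY, NEW YORK, NY 10007"]},
    geometry=[Point(-74.005545, 40.715479)],
    crs="EPSG:4326",
)

# Left join keeps every station row; the store columns are NaN
# for isochrones that do not contain the store
joined = gpd.sjoin(iso_gdf, store_gdf, how="left", predicate="contains")
nearby = joined.loc[joined["address"].notna(), "station_name"].tolist()
print(nearby)
```

A left join keeps the stations that do not match, which is convenient for checking coverage; `how="inner"` would return only the stations whose isochrone contains the store.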


In [29]:
# Example code
#import geopandas as gpd
#study_area = json.loads("""
# {"type": "FeatureCollection", "features": [{"type": "Feature", "properties": {}, "geometry": {"type": "Polygon", "coordinates": [[[36.394272, -18.626726], [36.394272, -18.558391], [36.489716, -18.558391], [36.489716, -18.626726], [36.394272, -18.626726]]]}}]}
#""")
#gdf = gpd.GeoDataFrame.from_features(study_area["features"])
#print(gdf.head())

                                            geometry
0  POLYGON ((36.39427 -18.62673, 36.39427 -18.558...


In [None]:
#gdf = geopandas.GeoDataFrame.from_features(stations["features"])

In [61]:
#stations['geom'] = result
#stations

Unnamed: 0,station_name,lat,lng,geom
0,City Hall,40.7131583,-74.00773,"{'coordinates': [[[-74.012068, 40.714752], [-7..."
1,Park Place,40.7131736,-74.0092965,"{'coordinates': [[[-74.013437, 40.714451], [-7..."
2,Fulton Street,40.7115643,-74.009928,"{'coordinates': [[[-74.014371, 40.712998], [-7..."


In [None]:
#from shapely.geometry import shape
#stations['geometry'] = result

In [None]:
# DO NOT USE - EXAMPLE CODE FOR LATER USE
# (https://www.linkedin.com/pulse/isochrones-geopandas-paul-whiteside/)
#import time
#
#l_processed = []
#
#for station_name, lat, lng in stations.values:
#    if station_name not in l_processed:
#        print(station_name)
#        params_iso['locations'] = [[float(lng), float(lat)]]  # ORS expects [lng, lat]
#
#        try:
#            r = clnt.isochrones(**params_iso)
#
#            for feature in r['features']:
#                feature['properties']['name'] = station_name
#
#            # the isochrones/ directory must already exist
#            with open(f'isochrones/{station_name}.json', 'w') as f:
#                f.write(json.dumps(r))
#            l_processed.append(station_name)
#        except Exception:
#            print(f"Problem processing {station_name}")
#
#        time.sleep(2)