## MVP submission
Identifying foot traffic for a single Duane Reade store using MTA station entry data.

### Importing and selecting a Duane Reade store

In [126]:
# Import env variable API key
from dotenv import dotenv_values
config = dotenv_values(".env")
API_KEY = config["API_KEY"]
OSR_TOKEN = config["OSR_TOKEN"]

In [127]:
# Import libraries
import pandas as pd
import numpy as np
import geopandas
import matplotlib.pyplot as plt

import glob
import json
import requests
from geojson import Feature, Point, FeatureCollection, Polygon

import folium
from openrouteservice import client
from datetime import datetime

drlocations = pd.read_csv('/Users/joycetagal/GitHub/metis-eda/drlocations_new.csv')

In [128]:
# Geocode each address to lat/lng. Commented out so the Google Maps API isn't called on every run.
# import googlemaps
# gmaps = googlemaps.Client(key=API_KEY)
# drlocations["latlon"] = drlocations.address.apply(lambda address : gmaps.geocode(address)[0]['geometry']['location'])
# drlocations = pd.concat([drlocations, drlocations['latlon']
#                         .apply(pd.Series)], axis=1
#                       ).drop("latlon", axis=1)
# drlocations.to_csv("drlocations_new.csv", index=False)

In [129]:
drlocations.head()

Unnamed: 0,store_name,address,lat,lng
0,Duane Reade,"250 BROADWAY, NEW YORK, NY 10007",40.713008,-74.007821
1,Duane Reade,"305 BROADWAY, NEW YORK, NY 10007",40.715479,-74.005545
2,Duane Reade,"17 JOHN ST NEW YORK, NY 10038",40.70997,-74.008741
3,Duane Reade,"185 GREENWICH ST, NEW YORK, NY 10007",40.711566,-74.011426
4,Duane Reade,"200 WATER ST, NEW YORK, NY 10038",40.707256,-74.004805


For the purposes of this notebook, I will use a single Duane Reade store from this dataset: the one at 305 Broadway (row index 1).

In [130]:
drlocation = drlocations.iloc[[1,]]
drlocation

Unnamed: 0,store_name,address,lat,lng
1,Duane Reade,"305 BROADWAY, NEW YORK, NY 10007",40.715479,-74.005545


## Identifying isochrones for the MTA locations

In [131]:
# First we define our MTA stations dataset inline. For the MVP I'm using just 3 stations.

d = [['City Hall', '40.7131583', '-74.00773'], 
     ['Park Place', '40.7131736', '-74.0092965'], 
     ['Fulton Street', '40.7115643' , '-74.009928']
    ]

In [132]:
stations = pd.DataFrame(d, columns = ['station_name', 'lat', 'lng'])

In [133]:
stations

Unnamed: 0,station_name,lat,lng
0,City Hall,40.7131583,-74.00773
1,Park Place,40.7131736,-74.0092965
2,Fulton Street,40.7115643,-74.009928


In [164]:
stations.values

array([['City Hall', '40.7131583', '-74.00773'],
       ['Park Place', '40.7131736', '-74.0092965'],
       ['Fulton Street', '40.7115643', '-74.009928']], dtype=object)

In [165]:
clnt = client.Client(key=OSR_TOKEN)
map1 = folium.Map(tiles='Stamen Toner', location=([40.715479, -74.005545]), zoom_start=15)    

params_iso = {'profile': 'foot-walking',
              'range': [300], # range is in seconds: 300/60 = 5 mins walking
              'attributes': ['total_pop']
             }

l_dfs = []
for station_name, lat, lng in stations.values:
    point = [lng, lat]
    print(point)
    params_iso['locations'] = [point]
    iso = clnt.isochrones(**params_iso)
    
    for feature in iso['features']:
        feature['properties']['station_name'] = station_name 
   
    gdf = geopandas.GeoDataFrame.from_features(iso)
    l_dfs.append(gdf)
    
    folium.features.GeoJson(iso).add_to(map1)
    
    folium.map.Marker(list(reversed(point)), # folium expects [lat, lon], so reverse the [lng, lat] point
                      icon=folium.Icon(color='lightgray',
                                        icon_color='#cc0000',
                                        icon='subway',
                                        prefix='fa',
                                       ),
                      popup=station_name,
            ).add_to(map1) # Add the station marker to the map
    
#print(l_dfs)


    
map1

['-74.00773', '40.7131583']
['-74.0092965', '40.7131736']
['-74.009928', '40.7115643']


Each isochrone is a polygon covering the area reachable within a 5-minute walk of an MTA station. Using these isochrones, we can work out which MTA stations are within walking distance of our Duane Reade store and estimate the daily foot traffic coming from those nearby stations.
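
As a rough sanity check on the isochrones, the straight-line distance from the store to each station can be compared with the distance a pedestrian covers in 300 seconds. The sketch below uses the haversine formula with the coordinates from this notebook. The 5 km/h walking speed is an assumption (it is not read from the openrouteservice configuration), and a straight line understates the real walking route, so a station that passes this check is only possibly reachable.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres
WALK_SPEED_KMH = 5.0        # ASSUMED walking speed, not taken from openrouteservice
RANGE_SECONDS = 300         # same value as params_iso['range']


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


store = (40.715479, -74.005545)  # 305 Broadway (lat, lng)
station_coords = {
    'City Hall': (40.7131583, -74.00773),
    'Park Place': (40.7131736, -74.0092965),
    'Fulton Street': (40.7115643, -74.009928),
}

# Upper bound on the distance walked in RANGE_SECONDS at the assumed speed
max_walk_m = WALK_SPEED_KMH * 1000 / 3600 * RANGE_SECONDS

distances = {name: haversine_m(*store, *coords)
             for name, coords in station_coords.items()}

for name, d in distances.items():
    print(f"{name}: {d:.0f} m straight-line, possibly reachable: {d <= max_walk_m}")
```

A station whose straight-line distance already exceeds `max_walk_m` cannot have an isochrone that contains the store, which makes this a cheap pre-filter before calling the isochrone API.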

Later in this notebook, I create GeoDataFrames from both the `stations` and `drlocations` DataFrames, then left join onto the `stations` GeoDataFrame to identify every station isochrone that contains the Duane Reade coordinates.
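
The containment test behind that join can be sketched without any geospatial dependencies. The example below is a plain-Python ray-casting check. It is a stand-in for the point-in-polygon test that a spatial join performs, not the method used later in this notebook. The square polygon is a made-up placeholder, not a real isochrone returned by openrouteservice, and the function name `point_in_polygon` is hypothetical.

```python
def point_in_polygon(lng, lat, ring):
    """Ray-casting test: True if (lng, lat) lies inside the ring.

    `ring` is a list of (lng, lat) vertices. Points lying exactly on an
    edge are not handled specially.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray heading east from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lng < x_cross:
                inside = not inside
    return inside


# Hypothetical square "isochrone" (placeholder vertices, NOT ORS output)
toy_isochrone = [
    (-74.010, 40.711),
    (-74.004, 40.711),
    (-74.004, 40.716),
    (-74.010, 40.716),
]

store_lng, store_lat = -74.005545, 40.715479  # 305 Broadway
store_is_inside = point_in_polygon(store_lng, store_lat, toy_isochrone)
print(store_is_inside)
```

With real data, `ring` would be the exterior coordinates of an isochrone polygon, and the check would run once per station.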


In [163]:
l_dfs

[                                            geometry  group_index  value  \
 0  POLYGON ((-74.01207 40.71475, -74.01120 40.711...            0  300.0   
 
                                      center  total_pop station_name  
 0  [-74.00770490753631, 40.713188731245175]     8155.0    City Hall  ,
                                             geometry  group_index  value  \
 0  POLYGON ((-74.01344 40.71445, -74.01327 40.713...            0  300.0   
 
                                    center  total_pop station_name  
 0  [-74.0092982646204, 40.71317211471518]     7655.0   Park Place  ,
                                             geometry  group_index  value  \
 0  POLYGON ((-74.01437 40.71300, -74.01406 40.711...            0  300.0   
 
                                     center  total_pop   station_name  
 0  [-74.00998009442152, 40.71150603102416]     6913.0  Fulton Street  ]

In [141]:
drlocations.head()

Unnamed: 0,store_name,address,lat,lng
0,Duane Reade,"250 BROADWAY, NEW YORK, NY 10007",40.713008,-74.007821
1,Duane Reade,"305 BROADWAY, NEW YORK, NY 10007",40.715479,-74.005545
2,Duane Reade,"17 JOHN ST NEW YORK, NY 10038",40.70997,-74.008741
3,Duane Reade,"185 GREENWICH ST, NEW YORK, NY 10007",40.711566,-74.011426
4,Duane Reade,"200 WATER ST, NEW YORK, NY 10038",40.707256,-74.004805


In [160]:
clnt = client.Client(key=OSR_TOKEN)
map2 = folium.Map(tiles='Stamen Toner', location=([40.715479, -74.005545]), zoom_start=15)    

for store_name, address, lat, lng in drlocations.values:
    point = [lng, lat]
    #folium.features.GeoJson(iso).add_to(map2)
    
    folium.map.Marker(list(reversed(point)), # folium expects [lat, lon], so reverse the [lng, lat] point
                      icon=folium.Icon(color='lightgray',
                                        icon_color='#cc0000',
                                        icon='plus',
                                        prefix='fa',
                                       ),
                      popup=address,
            ).add_to(map2)

In [161]:
map2

## Creating GeoDataFrames

In this section I create GeoDataFrames for both datasets (the selected Duane Reade store and the station isochrones) so that I can identify which MTA stations are within a 5-minute walk of the store. I will then use the entries recorded at those stations to estimate foot traffic in the store's vicinity.
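
One way the final aggregation step could look is sketched below. Everything in it is an assumption for illustration: the entry counts are made-up placeholders rather than MTA data, `nearby_stations` stands in for whatever the join produces, and `mean_daily_foot_traffic` is a hypothetical helper. It sums the average daily entries of the stations whose isochrone contains the store.

```python
from statistics import mean

# Placeholder daily entry counts per station (made-up numbers, NOT MTA data)
daily_entries = {
    'City Hall': [1000, 1200, 1100],
    'Park Place': [500, 700, 600],
    'Fulton Street': [2000, 2200, 2100],
}

# Hypothetical join result: stations whose isochrone contains the store
nearby_stations = ['City Hall', 'Park Place']


def mean_daily_foot_traffic(entries_by_station, station_names):
    """Sum of each nearby station's mean daily entries."""
    return sum(mean(entries_by_station[name]) for name in station_names)


estimated_traffic = mean_daily_foot_traffic(daily_entries, nearby_stations)
print(estimated_traffic)
```

Summing station means treats every rider entering a nearby station as potential foot traffic, so the result is an upper-bound style estimate rather than a count of store visitors.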

In [139]:
dr_gdf = geopandas.GeoDataFrame(drlocation, geometry=geopandas.points_from_xy(drlocation.lng, drlocation.lat))
dr_gdf

Unnamed: 0,store_name,address,lat,lng,geometry
1,Duane Reade,"305 BROADWAY, NEW YORK, NY 10007",40.715479,-74.005545,POINT (-74.00554 40.71548)


In [138]:
gdf_isochrones = pd.concat(l_dfs)
stations_gdf = gdf_isochrones.merge(stations, on='station_name')
stations_gdf

Unnamed: 0,geometry,group_index,value,center,total_pop,station_name,lat,lng
0,"POLYGON ((-74.01207 40.71475, -74.01120 40.711...",0,300.0,"[-74.00770490753631, 40.713188731245175]",8155.0,City Hall,40.7131583,-74.00773
1,"POLYGON ((-74.01344 40.71445, -74.01327 40.713...",0,300.0,"[-74.0092982646204, 40.71317211471518]",7655.0,Park Place,40.7131736,-74.0092965
2,"POLYGON ((-74.01437 40.71300, -74.01406 40.711...",0,300.0,"[-74.00998009442152, 40.71150603102416]",6913.0,Fulton Street,40.7115643,-74.009928
