# Your tasks are as follows:

##### Connect to the Foursquare API
##### Connect to the Yelp API. This API offers services similar to Foursquare's.
##### For each of the bike stations in Part 1, query both APIs to retrieve information for the following in that location:
##### Restaurants or bars
##### Various POIs (points of interest) of your choice
##### Create a DataFrame for the Yelp results and Foursquare results.
##### Compare the quality of the Yelp and Foursquare API. For your location, which API gives you the most complete information/better coverage? NOTE: Your definition of 'coverage' is up to you. It could be simple 'number of POIs in the area', but it could also be something more specific like 'number of reviews per POI', or 'number of different attributes of each POI'.
## Complete the yelp_foursquare_EDA.ipynb notebook to demonstrate how you executed the tasks above.

In [5]:
import pandas as pd
import os # use this to access your environment variables
import requests # this will be used to call the APIs
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
import scipy
import json #json parsing libraries

In [11]:
YELP_API_KEY = os.getenv('YELP_API_KEY')
FOURSQUARE_KEY = os.getenv('FOUR_SQUARE_API_KEY')
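
`os.getenv` returns `None` when a variable is unset, so a missing key would only show up later as an authentication error from the API. A minimal sketch of a fail-fast lookup follows; the helper name `require_env` is hypothetical and not part of the original notebook.

```python
import os

def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or empty.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value
```

With this in place, the keys could be loaded as `YELP_API_KEY = require_env('YELP_API_KEY')`, so that a missing variable stops the notebook at the cell that reads it.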

In [2]:
def get_venues_yelp(latitude, longitude, radius, api_key, categories):
    """
    Gets venues from Yelp with a specified place type and coordinates.
    Args:
        latitude (float): latitude for query (must be combined with longitude)
        longitude (float): longitude for query (must be combined with latitude)
        radius (int): search radius in meters around the coordinates
        api_key (str): Yelp API key to use for query
        categories (str): Place types as found in https://docs.developer.yelp.com/docs/resources-categories
            Separate multiple aliases with commas.

    Returns:
        response: response object from the requests library.
    """
    # Query parameters are passed via `params`, so the URL needs no trailing "?"
    url = "https://api.yelp.com/v3/businesses/search"

    headers = {
        "accept": "application/json",
        "Authorization": f"Bearer {api_key}"
    }

    params = {
        "latitude": float(latitude),
        "longitude": float(longitude),
        "radius": radius,
        "categories": categories
    }

    response = requests.get(url, headers=headers, params=params)
    return response

In [13]:
def get_venues_fs(latitude, longitude, radius, api_key, categories):
    """
    Gets venues from Foursquare with a specified place type and coordinates.
    Args:
        latitude (float): latitude for query (must be combined with longitude)
        longitude (float): longitude for query (must be combined with latitude)
        radius (int): search radius in meters around the coordinates
        api_key (str): Foursquare API key to use for query
        categories (str): Foursquare-recognized place types listed in: https://location.foursquare.com/places/docs/categories
            Separate multiple ids with commas.

    Returns:
        response: response object from the requests library.
    """
    url = "https://api.foursquare.com/v3/places/search"
    params = {
        "ll": f"{latitude},{longitude}",
        "radius": radius,
        "categories": categories
    }
    headers = {
        "Accept": "application/json",
        "Authorization": api_key
    }
    response = requests.get(url, headers=headers, params=params)
    return response

In [None]:
# These functions call the Foursquare and Yelp APIs to get information about locations
# within a given radius of a latitude and longitude.
# A further function is needed to pass in the station data provided by the CityBikes API.

In [15]:
def station_query(row):
    # Foursquare category ids considered for this query:
    #   19046 = CTA L station
    #   19043 = Bus Stop
    #   19054 = all public transit
    api_key = FOURSQUARE_KEY
    response = get_venues_fs(float(row['latitude']), float(row['longitude']), 5000, api_key, 19046)
    response = response.json()
    return response

In [None]:
def station_query_yelp(row):
    # 'bikeshop' is the category alias passed to Yelp for this query
    api_key = YELP_API_KEY
    response = get_venues_yelp(float(row['latitude']), float(row['longitude']), 5000, api_key, 'bikeshop')
    response = response.json()
    return response

In [None]:
# The two functions above take one row of station data from our previous query and pass its coordinates to the Foursquare and Yelp query functions

In [None]:
data = pd.read_json(r'chi_six30_fri.json')

In [None]:
response = data.apply(lambda row: station_query(row), axis=1)
# With axis=1, apply passes each row of `data` to the lambda, which calls station_query
# on that row. The parsed JSON responses come back as a Series, one entry per bike station.

In [None]:
response2 = data.apply(lambda row: station_query_yelp(row), axis=1)

In [None]:
response2[1]['businesses'] # exploring the data received from Yelp shows that this query was a poor choice
len(response2) # the length is still correct: one response per station row

In [None]:
#writing a small function to parse the data and see how many results we get:
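
A minimal sketch of such a function is shown here. The name `count_results` is hypothetical, and the sample payloads are made up; the only assumption carried over from the notebook is that Yelp nests its venues under `'businesses'` and Foursquare under `'results'`.

```python
def count_results(responses, key):
    """Return the number of items found under `key` in each parsed API response.

    `responses` is any iterable of dicts, such as the Series built with data.apply.
    Responses that lack `key` (for example, error payloads) count as 0.
    """
    return [len(r.get(key, [])) for r in responses]

# Made-up payloads standing in for real Yelp responses
sample_yelp = [
    {"businesses": [{"name": "A"}, {"name": "B"}]},
    {"businesses": []},
    {"error": {"code": "EXAMPLE"}},
]
print(count_results(sample_yelp, "businesses"))  # [2, 0, 0]
```

On the real data this would be called as `count_results(response2, 'businesses')` for Yelp and `count_results(response, 'results')` for Foursquare.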

In [None]:
#Now the question is, what information do we need to investigate our question. 
# As addressed in the readme, I have elected to look at the relationship between public transit 
#and the bikeshare program in Chicago.

##### Information to get per bike station:
##### How close is the closest 'L' stop?
##### How many stations are within 1.6 km (1 mi)?
##### How many are within 3.2 km (2 mi)?

In [None]:
# The data from Foursquare, however, is exactly what we are looking for and will be processed.

In [None]:
print(response[0].keys())
print(response[0]['results'][0].keys())
print(response[0]['results'][0]['distance'])
print(response[0]['results'][0]['location'])

In [None]:
# These quick checks show that the fields we want are nested inside 'results'.
# Each query returns a list of CTA stations, each with its distance from the latitude and longitude provided.
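
To make that shape concrete, the sketch below uses a hand-written payload that mimics only the fields probed above (`results`, `distance`, `location`). The addresses and distances are invented for illustration and do not come from the API.

```python
# Hand-written stand-in for one parsed Foursquare response; all values are invented.
sample_response = {
    "results": [
        {"distance": 2100, "location": {"formatted_address": "Example Ave & 2nd St"}},
        {"distance": 450, "location": {"formatted_address": "Example Ave & 1st St"}},
    ]
}

# Distance in meters of every returned venue from the queried point
distances = [venue["distance"] for venue in sample_response["results"]]

# Pick the closest venue explicitly rather than relying on the order of the results
closest = min(sample_response["results"], key=lambda venue: venue["distance"])

print(distances)
print(closest["location"]["formatted_address"])
```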

In [None]:
distance = []
near_station = []
num_stations_1mi = []
num_stations_2mi = []

def fs_append(response):
    """Fill the lists above with one entry per bike station from the Foursquare responses."""
    for station in response:
        results = station['results']
        # If no results are returned (e.g. the radius is too small), record 9999 as an error code
        if len(results) < 1:
            distance.append(9999)
            near_station.append(9999)
            num_stations_1mi.append(9999)
            num_stations_2mi.append(9999)
            continue
        # Count nearby CTA stations to get an idea of public transit density around the bike station.
        # Distances are in meters; 1700 and 3400 are 1 and 2 miles, rounded up.
        count1 = 0  # closer than about 1 mile
        count2 = 0  # between about 1 and 2 miles (not cumulative)
        for venue in results:
            if venue['distance'] < 1700:
                count1 += 1
            elif venue['distance'] < 3400:
                count2 += 1
        # Result order is not guaranteed to be by distance unless a distance sort is requested,
        # so find the closest station explicitly instead of taking results[0]
        nearest = min(results, key=lambda venue: venue['distance'])
        distance.append(nearest['distance'])
        # The address of the nearest CTA station can be used to associate local densities of bike stations with it
        near_station.append(nearest['location']['formatted_address'])
        # According to statistics released by Divvy, most rides are quite short, so we should expect
        # the heaviest use by commuters at the stations within 1-2 miles of CTA stations
        num_stations_1mi.append(count1)
        num_stations_2mi.append(count2)

In [None]:
fs_append(response)

In [None]:
len(distance) # since this returns the expected value (one entry per station), we can merge these values with our data from Divvy
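
As an alternative to filling four module-level lists, the per-station summary could be returned as one dict per station, which is convenient to turn into a DataFrame before merging. This is only a sketch under stated assumptions: `summarize_station` is a hypothetical helper, it uses `None` rather than 9999 for missing data, and unlike `fs_append` its second count is cumulative (all venues within the larger threshold).

```python
def summarize_station(results, near_m=1700, far_m=3400):
    """Summarize the venues returned for one bike station.

    Hypothetical helper; the thresholds (in meters) mirror those used in fs_append.
    Returns None values when `results` is empty, instead of a 9999 sentinel.
    """
    if not results:
        return {"distance": None, "near_station": None,
                "within_near": None, "within_far": None}
    nearest = min(results, key=lambda v: v["distance"])
    return {
        "distance": nearest["distance"],
        "near_station": nearest["location"]["formatted_address"],
        # Booleans sum as 0/1, giving the number of venues under each threshold
        "within_near": sum(v["distance"] < near_m for v in results),
        "within_far": sum(v["distance"] < far_m for v in results),
    }

# Invented example data for one station
example = [
    {"distance": 500, "location": {"formatted_address": "Example St 1"}},
    {"distance": 2500, "location": {"formatted_address": "Example St 2"}},
    {"distance": 4000, "location": {"formatted_address": "Example St 3"}},
]
print(summarize_station(example))
```

Building `rows = [summarize_station(r['results']) for r in response]` and then `pd.DataFrame(rows)` would give one row per station, in the same order as `data`. Using `None` avoids the risk of 9999 being treated as a real distance in later averages or regressions.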