# New Intersection Activation Dates

Because the Miovision API does not expose an activation date, we have traditionally pulled data from the API by hand until we homed in on the first day when data was available. This notebook partly automates this process. It first determines which intersections now available from the [Miovision API](https://docs.api.miovision.com/#!/Intersections/get_intersections) need to be added to `miovision_api.intersections`. Given a user-defined range of dates to analyze for each intersection, it then queries the API one day at a time until it finds the first date when data was available.

The user should manually validate the script's results using the Miovision API. Adding the `px`, appending a geometry, and uploading to `miovision_api.intersections` remain manual tasks.

In [63]:
import configparser
import pathlib

from requests import Session
import pandas as pd
import psycopg2
import datetime
import numpy as np

In [64]:
# Either manually insert your keys here, or use your own configparser.
#config = configparser.ConfigParser()
#config.read(pathlib.Path.home().joinpath('.charlesconfig').as_posix())
#postgres_settings = config['POSTGRES']
#miov_token = config['MIOVISION']['key']

In [65]:
# Either manually insert your keys here, or use your own configparser.
config = configparser.ConfigParser()
config.read('config.cfg')
postgres_settings = config['DBSETTINGS']
miov_token = config['API']['key']
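For readers setting up their own `config.cfg`, the sketch below shows one layout that the cell above could read. The section names come from the notebook, but the keys under `DBSETTINGS` and every value are assumptions chosen for illustration, and the `example_*` names are hypothetical.

```python
import configparser

# Hypothetical contents of config.cfg; every value is a placeholder.
EXAMPLE_CFG = """
[DBSETTINGS]
host = db.example.org
dbname = exampledb
user = example_user
password = example_password

[API]
key = example-api-key
"""

example_config = configparser.ConfigParser()
example_config.read_string(EXAMPLE_CFG)

# dict() turns the section into plain keyword arguments,
# which is the shape psycopg2.connect(**settings) expects.
example_settings = dict(example_config['DBSETTINGS'])
example_token = example_config['API']['key']
```

Reading from a string keeps the example self-contained; with a real file, `config.read('config.cfg')` as used above does the same job.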

In [66]:
session = Session()
session.proxies = {}

headers = {'Content-Type': 'application/json',
           'Authorization': miov_token}

## New Intersections Table

In [67]:
# Get intersections from Miovision API.
response = session.get("https://api.miovision.com/intersections/",
                       params={}, headers=headers, proxies=session.proxies)
df_api = pd.DataFrame(response.json())
df_api = df_api[['id', 'name']].copy()
df_api.columns = ['id', 'intersection_name']

# Get intersections currently stored in `miovision_api` on Postgres.
with psycopg2.connect(**postgres_settings) as conn:
    df_pg = pd.read_sql("SELECT * FROM miovision_api.intersections", con=conn)

# Join the two tables, and select intersections in the API and not in Postgres.
df_intersections = pd.merge(df_pg, df_api[['id', 'intersection_name']], how='outer',
                            left_on='id', right_on='id', suffixes=('', '_api'))
df_newints = df_intersections.loc[df_intersections['intersection_uid'].isna(), ['id', 'intersection_name_api']]
df_newints.index += 1

`df_newints` is a table of the new intersections to be added.

In [68]:
df_newints

| | id | intersection_name_api |
|---|---|---|
| 65 | dbf09553-c593-4bb2-90e5-7eb3bc7ebe08 | Bayview Avenue and River Street |
| 66 | 35425467-0e8d-4fe7-b35d-6ccc9b71b0cf | Sheppard Avenue West and Jane Street |
| 67 | 11dcfdc5-2b37-45c0-ac79-3d6926553582 | Sheppard Avenue West and Keele Street |
| 68 | 9ed9e7f3-9edc-4f58-ae5b-8c9add746886 | Steeles Avenue West and Jane Street |
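The selection of new intersections above relies on `intersection_uid` being null for rows that exist only in the API. A possible alternative, sketched here on made-up toy tables, is to let `pd.merge` label each row's origin with `indicator=True`. The `toy_*` names and their contents are hypothetical stand-ins for `df_pg` and `df_api`.

```python
import pandas as pd

# Toy stand-ins for df_pg and df_api; the ids and names are made up.
toy_pg = pd.DataFrame({'intersection_uid': [1, 2],
                       'id': ['aaa', 'bbb'],
                       'intersection_name': ['A and B', 'C and D']})
toy_api = pd.DataFrame({'id': ['aaa', 'bbb', 'ccc'],
                        'intersection_name': ['A and B', 'C and D', 'E and F']})

# indicator=True adds a '_merge' column saying which side each row came from.
toy_merged = pd.merge(toy_pg, toy_api, how='outer', on='id',
                      suffixes=('', '_api'), indicator=True)

# 'right_only' rows exist in the API table but not in the Postgres table.
toy_new = toy_merged.loc[toy_merged['_merge'] == 'right_only',
                         ['id', 'intersection_name_api']]
```

This variant does not depend on any particular column of the Postgres table being non-null, which may make the intent easier to read.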


## Find First Full Day of Data

The user must add `'test_daterange_start'` and `'test_daterange_end'` columns to `df_newints`, giving the start date and the inclusive end date, respectively, of the range to search for the activation date. Different rows (i.e. intersections) can have different values.

Brent provided us with a list of configuration dates for the new intersections, which correspond to the day that SmartSense configuration is complete and the location starts reporting data. Searching in the vicinity of these dates:

In [69]:
df_newints['test_daterange_start'] = '2021-06-16'
df_newints['test_daterange_end'] = '2021-06-17'

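# Per-intersection overrides, keyed by index label. Each label must already exist
# in df_newints.index; otherwise .loc appends a new row with an empty id.
df_newints.loc[59, 'test_daterange_start'] = '2020-12-20'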
df_newints.loc[59, 'test_daterange_end'] = '2020-12-23'
df_newints.loc[60, 'test_daterange_start'] = '2021-05-31'
df_newints.loc[60, 'test_daterange_end'] = '2021-06-03'
df_newints.loc[61, 'test_daterange_start'] = '2021-05-12'
df_newints.loc[61, 'test_daterange_end'] = '2021-05-15'
df_newints.loc[62, 'test_daterange_start'] = '2021-06-06'
df_newints.loc[62, 'test_daterange_end'] = '2021-06-09'
df_newints.loc[63, 'test_daterange_start'] = '2021-06-07'
df_newints.loc[63, 'test_daterange_end'] = '2021-06-10'
df_newints.loc[66, 'test_daterange_start'] = '2021-05-12'
df_newints.loc[66, 'test_daterange_end'] = '2021-05-15'
df_newints.loc[68, 'test_daterange_start'] = '2021-05-12'
df_newints.loc[68, 'test_daterange_end'] = '2021-05-15'
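Because assigning through `.loc` with a label that is not in the index enlarges the DataFrame instead of failing, a guard around the per-row overrides can help. The following is a minimal sketch on a toy table; `toy`, `overrides`, and `skipped` are hypothetical names and the values are made up.

```python
import pandas as pd

# Toy stand-in for df_newints with two known index labels.
toy = pd.DataFrame({'id': ['aaa', 'bbb']}, index=[65, 66])
toy['test_daterange_start'] = '2021-06-16'

# Hypothetical per-row overrides, keyed by index label.
overrides = {66: '2021-05-12', 59: '2020-12-20'}

skipped = []
for label, start in overrides.items():
    if label in toy.index:
        toy.loc[label, 'test_daterange_start'] = start
    else:
        # Assigning to a missing label would append a row with an empty id,
        # so record the label instead of writing it.
        skipped.append(label)
```

After the loop, `skipped` lists the labels that matched no intersection, which can then be reviewed by hand.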

In [70]:
def get_response_length(intersection_id, params):
    """Return the length of the JSON response from the TMC endpoint, or -1 on a non-200 status."""
    response = session.get(("https://api.miovision.com/intersections/{int_id}/tmc"
                            .format(int_id=intersection_id)),
                           params=params, headers=headers, proxies=session.proxies)

    if response.status_code != 200:
        return -1
    return len(response.json())


def get_first_data_date(intersection_id, start_time, end_time, max_retries=3):
    """Return the estimated activation date, or NaN if no data is found in the range."""

    # Generate a sequence of dates.
    for ctime in pd.date_range(
            start_time, end=end_time, freq='D').to_pydatetime():

        params = {'endTime': ctime + datetime.timedelta(minutes=15),
                  'startTime': ctime}
        
        # The API throws an error when we query same day data.
        if ctime.date() >= datetime.date.today():
            return np.nan

        # For each date, try downloading 00:00 - 00:15 data (at most max_retries
        # times in case we hit HTTP errors).
        for _ in range(max_retries):
            response_length = get_response_length(intersection_id, params)
            if response_length >= 0:
                break

        # If every attempt returned a non-200 status code, raise an error.
        if response_length < 0:
            raise ValueError('keep getting HTTP errors from session!')

        # It's highly unlikely that the first available data falls between midnight and 12:15 AM,
        # so set the actual activation date to the day before ctime.
        if response_length > 0:
            return ctime - datetime.timedelta(days=1)
    
    return np.nan
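The day-by-day scan in `get_first_data_date` can be tried without network access by passing in the function that counts records. This is only a sketch of the scanning idea: `first_date_with_data` and `fake_count` are hypothetical names, the fake data is made up, and the one-day back-off and retry logic of the function above are deliberately left out.

```python
import datetime

import pandas as pd


def first_date_with_data(start, end, count_records):
    """Return the first date in [start, end] with records, or None."""
    for day in pd.date_range(start, end=end, freq='D').to_pydatetime():
        if count_records(day) > 0:
            return day
    return None


# Hypothetical stand-in for the API: data exists from 2021-05-14 onward.
def fake_count(day):
    return 12 if day >= datetime.datetime(2021, 5, 14) else 0


found = first_date_with_data('2021-05-12', '2021-05-15', fake_count)
not_found = first_date_with_data('2021-05-12', '2021-05-13', fake_count)
```

Injecting the counting function keeps the scan testable offline; in the notebook, that role is played by `get_response_length`.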

In [73]:
first_date_of_data = []

for i, row in df_newints.iterrows():
    first_date_of_data.append(
        get_first_data_date(
            row['id'],
            row['test_daterange_start'],
            row['test_daterange_end'],
            max_retries=3))

df_newints['activation_date'] = first_date_of_data

ValueError: keep getting HTTP errors from session!

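`df_newints` now contains the activation dates of the intersections in `'activation_date'`. A `NaT` indicates that no start date was found. If data was already available on `'test_daterange_start'`, `'activation_date'` will be set to the day before `'test_daterange_start'`, so the true activation date may be earlier still. In the run recorded here, the loop stopped with the `ValueError` above, so the table below has no `'activation_date'` column.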
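`get_first_data_date` returns `np.nan` when nothing is found, while the text speaks of `NaT`. The sketch below, using made-up results, shows one way to make the conversion explicit with `pd.to_datetime`; `example_results` and `example_activation` are hypothetical names.

```python
import datetime

import numpy as np
import pandas as pd

# Made-up results: one date found, one intersection with no data in range.
example_results = [datetime.datetime(2021, 5, 13), np.nan]

# pd.to_datetime yields a datetime column in which the missing value is NaT.
example_activation = pd.to_datetime(pd.Series(example_results))
```

Checking `example_activation.isna()` then flags the intersections whose date range needs to be widened.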

In [74]:
df_newints

| | id | intersection_name_api | test_daterange_start | test_daterange_end |
|---|---|---|---|---|
| 65 | dbf09553-c593-4bb2-90e5-7eb3bc7ebe08 | Bayview Avenue and River Street | 2021-06-16 | 2021-06-17 |
| 66 | 35425467-0e8d-4fe7-b35d-6ccc9b71b0cf | Sheppard Avenue West and Jane Street | 2021-05-12 | 2021-05-15 |
| 67 | 11dcfdc5-2b37-45c0-ac79-3d6926553582 | Sheppard Avenue West and Keele Street | 2021-06-16 | 2021-06-17 |
| 68 | 9ed9e7f3-9edc-4f58-ae5b-8c9add746886 | Steeles Avenue West and Jane Street | 2021-05-12 | 2021-05-15 |
| 59 | | | 2020-12-20 | 2020-12-23 |
| 60 | | | 2021-05-31 | 2021-06-03 |
| 61 | | | 2021-05-12 | 2021-05-15 |
| 62 | | | 2021-06-06 | 2021-06-09 |
| 63 | | | 2021-06-07 | 2021-06-10 |
