# Geospatial Analysis on FHWA Traffic Data to Predict Ideal Trailer-friendly Supercharging Locations

## Import data
(sourced from FHWA's LTPP InfoPave database)

In [75]:
import pandas as pd

raw_vehicle_class_adt_annual = pd.read_csv('data/VEHICLE_CLASS_ADT_ANNUAL.csv', usecols=['STATE_CODE_EXP','SHRP_ID','VEHICLE_CLASS', 'CLASS_ADT_ANNUAL'])
raw_equipment_info = pd.read_csv('data/SITE_EQUIPMENT_INFO.csv', usecols=['STATE_CODE_EXP', 'SHRP_ID', 'LATITUDE', 'LONGITUDE'])

## Restructure and clean data
Match each sensor's latitude/longitude (where available) to each vehicle classification record, then drop any record that lacks a sensor location or a vehicle classification. Records for vehicle classes other than 3, 4, and 5 are also excluded from the analysis. The FHWA describes Class 3, 4, and 5 vehicles as including campers, RVs, and trailers. Some passenger vehicles towing hitch trailers are counted as Class 2, for which data is not recorded.

In [76]:
wrk_equipment_info = raw_equipment_info.copy()
wrk_vcaa = raw_vehicle_class_adt_annual.copy()

# Create a column giving each device a unique ID (state name + SHRP ID)
wrk_equipment_info['UNIQUE_ID'] = wrk_equipment_info['STATE_CODE_EXP'] + '-' + wrk_equipment_info['SHRP_ID']
wrk_vcaa['UNIQUE_ID'] = wrk_vcaa['STATE_CODE_EXP'] + '-' + wrk_vcaa['SHRP_ID']
# Drop sensors with missing fields, then drop repeated ID/location rows
wrk_equipment_info = wrk_equipment_info.dropna()
wrk_equipment_info = wrk_equipment_info.drop_duplicates(['UNIQUE_ID', 'LATITUDE', 'LONGITUDE'])

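# Join lat/long columns based on UNIQUE_ID
wrk_equipment_info = wrk_equipment_info.reset_index(drop=True)
wrk_vcaa_located = wrk_vcaa.merge(wrk_equipment_info[['UNIQUE_ID', 'LATITUDE', 'LONGITUDE']], how='left', on='UNIQUE_ID')
# The concatenated equipment rows carry no VEHICLE_CLASS, so the filter below removes them again
wrk = pd.concat([wrk_equipment_info, wrk_vcaa_located], sort=False)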

# Remove data where location or vehicle classification not available
wrk = wrk[wrk['VEHICLE_CLASS'].notna()]
wrk = wrk[wrk['LATITUDE'].notna()]

# Convert classification to strings since classification is categorical
wrk['VEHICLE_CLASS'] = wrk['VEHICLE_CLASS'].astype(str)

# Keep only records for vehicle classes 3, 4, and 5 (stored as floats, hence '3.0' etc.)
wrk = wrk[wrk['VEHICLE_CLASS'].isin(['3.0', '4.0', '5.0'])]

# Combine lat/long into [lat, lng] pairs, the point format folium reads
wrk['LOCATION'] = wrk[['LATITUDE', 'LONGITUDE']].values.tolist()

wrk

,STATE_CODE_EXP,SHRP_ID,LATITUDE,LONGITUDE,UNIQUE_ID,VEHICLE_CLASS,CLASS_ADT_ANNUAL,LOCATION
23,California,2040,40.636520,-124.21181,California-2040,4.0,15.2,"[40.63652039, -124.21181]"
24,California,2040,40.636520,-124.21181,California-2040,5.0,202.5,"[40.63652039, -124.21181]"
35,California,2041,40.636471,-124.21184,California-2041,4.0,0.4,"[40.63647079, -124.21184]"
40,California,2041,40.636471,-124.21184,California-2041,5.0,15.1,"[40.63647079, -124.21184]"
47,California,2041,40.636471,-124.21184,California-2041,4.0,12.5,"[40.63647079, -124.21184]"
...,...,...,...,...,...,...,...,...
422827,Maine,0506,44.998901,-68.70050,Maine-0506,4.0,6.4,"[44.99890137, -68.7005]"
422828,Maine,0507,44.998901,-68.70050,Maine-0507,4.0,6.4,"[44.99890137, -68.7005]"
422829,Maine,0508,44.998901,-68.70050,Maine-0508,4.0,6.4,"[44.99890137, -68.7005]"
422830,Maine,0509,44.998901,-68.70050,Maine-0509,4.0,6.4,"[44.99890137, -68.7005]"
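
The matching step above drops unmatched records silently. As a minimal sketch on invented toy data (the sensor IDs and values below are hypothetical, not taken from the LTPP tables), pandas' `indicator=True` option on `merge` can show how many traffic records found a sensor location before anything is filtered out:

```python
import pandas as pd

# Hypothetical toy data standing in for the two LTPP tables
traffic = pd.DataFrame({
    "UNIQUE_ID": ["California-2040", "California-2040", "Texas-1001"],
    "VEHICLE_CLASS": [4.0, 5.0, 4.0],
    "CLASS_ADT_ANNUAL": [15.2, 202.5, 30.0],
})
sensors = pd.DataFrame({
    "UNIQUE_ID": ["California-2040"],
    "LATITUDE": [40.63652],
    "LONGITUDE": [-124.21181],
})

# A left merge keeps every traffic record; indicator=True adds a `_merge`
# column recording whether a sensor location was found for that record
merged = traffic.merge(sensors, how="left", on="UNIQUE_ID", indicator=True)

# Count matched ("both") and unmatched ("left_only") records
match_counts = merged["_merge"].value_counts()
print(match_counts)

# Keep only located records, mirroring the notna() filter on LATITUDE
located = merged[merged["_merge"] == "both"].drop(columns="_merge")
print(located)
```

Checking these counts on the real tables would indicate how much of the traffic data is lost because a sensor has no recorded coordinates.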


## Analyze
With the cleaned data, we can generate summary statistics and visualize the result. The key value in the summary statistics is CLASS_ADT_ANNUAL, the annual traffic measurement for a given sensor and vehicle classification (ADT conventionally abbreviates average daily traffic, so the values read as vehicles per day averaged over a year).

In [77]:
# Summarize
drv = wrk.copy()
drv.describe()

,LATITUDE,LONGITUDE,CLASS_ADT_ANNUAL
count,60432.0,60432.0,60432.0
mean,39.192967,-106.090309,106.038266
std,4.480453,15.540913,181.284134
min,31.51845,-149.78944,0.0
25%,35.217892,-117.03593,9.4
50%,39.020672,-112.696,29.4
75%,41.617859,-93.49318,127.6
max,64.948631,-60.23534,5933.4
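
The summary above pools every record, so a sensor with many records counts many times. One possible way to rank individual sites is sketched below on invented values. It assumes that repeated records for the same sensor and class should be averaged (the source does not say why they repeat) and that the class averages can then be summed per sensor; neither step is part of the original analysis.

```python
import pandas as pd

# Hypothetical records in the same shape as `wrk` (values invented for illustration)
records = pd.DataFrame({
    "UNIQUE_ID": ["A-1", "A-1", "A-1", "B-2"],
    "LATITUDE": [40.0, 40.0, 40.0, 45.0],
    "LONGITUDE": [-124.0, -124.0, -124.0, -68.0],
    "VEHICLE_CLASS": ["4.0", "5.0", "4.0", "4.0"],
    "CLASS_ADT_ANNUAL": [10.0, 200.0, 20.0, 6.0],
})

site_keys = ["UNIQUE_ID", "LATITUDE", "LONGITUDE"]

# Step 1 (assumption): average repeated records of one class at one sensor
class_mean = records.groupby(
    site_keys + ["VEHICLE_CLASS"], as_index=False
)["CLASS_ADT_ANNUAL"].mean()

# Step 2 (assumption): sum the class averages to get one figure per sensor
per_sensor = (
    class_mean.groupby(site_keys, as_index=False)["CLASS_ADT_ANNUAL"]
    .sum()
    .sort_values("CLASS_ADT_ANNUAL", ascending=False)
    .reset_index(drop=True)
)
print(per_sensor)
```

Applied to `drv`, the top rows of such a table would be the sensors with the most Class 3, 4, and 5 traffic under these assumptions.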


**Important note**: the FHWA is a federal agency and relies on states to report traffic data, and states do not always report everything. For example, the heat map below shows no records for highly populated states such as Texas and Florida, contrary to what one would expect. This is likely because those states did not record traffic data, sensor coordinates, or FHWA vehicle classifications.

It is also important to note that the data may not truly reflect the routes traveled by Class 3, 4, and 5 vehicles. This follows from the research methods state government agencies use: sensors are not located, for example, every 0.5 miles on every interstate.

In [78]:
import folium
from branca.element import Figure
from folium.plugins import HeatMap

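# Fixed-size figure so the map renders at 650x350 in the notebook
fig = Figure(width=650, height=350)
# Center roughly on the continental US
m1 = folium.Map(width=650, height=350, location=[39.192967, -100.090309], zoom_start=3.5)
fig.add_child(m1)
# Each point is one record's [lat, lng]; the heat reflects record density, not CLASS_ADT_ANNUAL
HeatMap(drv['LOCATION'].tolist(), radius=13).add_to(m1)
m1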
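
The heat map above treats every record equally, so it shows where records are dense and not where traffic is heavy. folium's `HeatMap` also accepts points of the form `[lat, lng, weight]`. The sketch below is one possible way to build such points; the helper names `to_weighted_points` and `build_weighted_map`, the choice to sum CLASS_ADT_ANNUAL per location, and the scaling to the 0 to 1 range are all assumptions for illustration, not part of the original analysis.

```python
import pandas as pd


def to_weighted_points(df):
    """Return [lat, lng, weight] rows, with weight scaled to the range 0..1."""
    # Assumption: sum all records that share a location
    per_site = df.groupby(
        ["LATITUDE", "LONGITUDE"], as_index=False
    )["CLASS_ADT_ANNUAL"].sum()
    max_adt = per_site["CLASS_ADT_ANNUAL"].max()
    # Guard against division by zero when every value is 0
    per_site["WEIGHT"] = per_site["CLASS_ADT_ANNUAL"] / max_adt if max_adt > 0 else 0.0
    return per_site[["LATITUDE", "LONGITUDE", "WEIGHT"]].values.tolist()


def build_weighted_map(points):
    """Draw the weighted points on a folium map (hypothetical helper)."""
    # Imported here so the point-building step can run without folium installed
    import folium
    from folium.plugins import HeatMap

    m = folium.Map(location=[39.192967, -100.090309], zoom_start=4)
    HeatMap(points, radius=13).add_to(m)
    return m


# Hypothetical sample in the same shape as `drv` (values invented for illustration)
sample = pd.DataFrame({
    "LATITUDE": [40.0, 40.0, 45.0],
    "LONGITUDE": [-124.0, -124.0, -68.0],
    "CLASS_ADT_ANNUAL": [10.0, 200.0, 6.0],
})
points = to_weighted_points(sample)
print(points)
```

In the notebook, `build_weighted_map(to_weighted_points(drv))` would return a map in which busier locations appear hotter, under the assumptions stated above.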