# Disruption Analysis Compilation

#### File preparation for ArcGIS Analysis

Here we prepare the original Vehicle Location History source files for disruption analysis in ArcGIS, using the workflow outlined below:

- Import and slice the CSV file(s) into a dataframe containing only the fields relevant to the analysis.
- Construct the dataframe's datetime fields.
- Construct disruption index functions for generating a disruption index field in the dataframe.
- Export the dataframe to a new CSV file for spatial analysis in GIS.

Each step is described in more detail in its own section.

In [2]:
# Importing relevant libraries
import pandas as pd
import os
import glob
import numpy as np
import warnings
from datetime import datetime

## Import Data

The data is imported from CSV files provided by Michael Long at the Capital Metropolitan Transportation Authority. The initial data will not be provided, and subsequent data will be stripped of bus and driver identifiers. The only data relevant to our analysis are the vehicle headway times and the time and location of each record.

In [3]:
# Setting up path vars.
path = r'../00_Source_Data/Capital_Metro/Vehicle_Location_History' # Relative source file path
all_files = glob.glob(os.path.join(path , "*.csv")) # all files
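
`glob.glob` returns an empty list when the path is wrong or contains no CSVs, and the later `pd.concat` call raises a `ValueError` on an empty list. A small guard can surface that problem earlier. The sketch below is one possible approach; `find_csv_files` is a hypothetical helper, not part of the original notebook.

```python
import glob
import os


def find_csv_files(folder):
    """Return a sorted list of CSV paths in `folder`, or raise if none exist.

    Hypothetical helper for illustration only.
    """
    # Sorting makes the load order reproducible across runs and platforms.
    files = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if not files:
        raise FileNotFoundError(f"No CSV files found in {folder!r}")
    return files
```

Under this assumption, `all_files = find_csv_files(path)` could replace the direct `glob.glob` call.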

In [4]:
# This block generates a list of dataframes where each item in the list is one file.
li = []

with warnings.catch_warnings():
    warnings.simplefilter('ignore') # ignore dataframe generation warnings
    for filename in all_files:
        df = pd.read_csv(filename, index_col=None, header=0)
        li.append(df)

In [5]:
# This block generates a dataframe with all data from the files stored in the source file path.

frame = pd.concat(li, axis=0, ignore_index=True)
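
To illustrate what `ignore_index=True` does in the concatenation above, here is a minimal sketch on two toy dataframes. The values are made up and stand in for two source files.

```python
import pandas as pd

# Two toy frames standing in for two source files (values are made up).
a = pd.DataFrame({"headwaysecs": [600, 620]})
b = pd.DataFrame({"headwaysecs": [590]})

# axis=0 stacks rows; ignore_index=True renumbers the index from 0.
combined = pd.concat([a, b], axis=0, ignore_index=True)

print(list(combined.index))              # [0, 1, 2]
print(len(combined) == len(a) + len(b))  # True
```

Without `ignore_index=True`, each file's rows would keep their original labels, so the combined index would contain duplicates.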

#### Formatting the Dataframe

The new dataframe we're interested in only needs the following fields:
- **timecentral**: Datetime at CST formatted as YYYYMMDDHHmmss
- **lat**: Latitude at which the reading was taken, in WGS 1984 (WKID 4326)
- **lon**: Longitude at which the reading was taken, in WGS 1984 (WKID 4326)
- **headwaysecs**: Headway reading in seconds

Fields that will be added later will be explained in the Disruption Index section.

In [6]:
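# Keep only the fields needed for the analysis; .copy() avoids SettingWithCopyWarning when fields are added later.
data = frame[['timecentral', 'lat', 'lon', 'headwaysecs']].copy()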
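
If a source file lacks one of these fields, the selection above fails with a `KeyError`. The sketch below shows one way to report exactly which columns are missing and to return an independent copy. `select_fields`, `REQUIRED_FIELDS`, and the `vehicleid` column in the toy data are hypothetical names introduced for illustration.

```python
import pandas as pd

REQUIRED_FIELDS = ["timecentral", "lat", "lon", "headwaysecs"]


def select_fields(frame, fields=REQUIRED_FIELDS):
    """Return an independent copy of `frame` limited to `fields`.

    Hypothetical helper; raises KeyError naming any missing columns.
    """
    missing = [name for name in fields if name not in frame.columns]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    # .copy() so that adding fields later does not touch the source frame.
    return frame[fields].copy()


# Toy input with one extra, made-up column that should be dropped.
toy = pd.DataFrame({
    "timecentral": [1],
    "lat": [30.27],
    "lon": [-97.74],
    "headwaysecs": [600],
    "vehicleid": [7],
})
subset = select_fields(toy)
```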