## Visualize Speed down the I-15S on a Sample of Traffic Data

### Purpose & Motivation
To better understand how traffic jams form and propagate along a freeway, we want an interactive visualization that lets us explore station speeds over time.

### Direction from Advisor
Conduct exploratory analysis.

### Tasks/Questions to Answer
#### Questions to Answer
- What do traffic jams look like when they form?
- How do traffic jams move up/down a freeway?

#### Tasks
- Create an interactive visualization to explore how traffic jams form.

### Results & Conclusions
Jams generally form at one station and then move backwards (opposite to the direction of traffic) along the freeway. When jams recover, they tend to recover in the opposite direction.
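One way to check the backward movement numerically, rather than by eye, is to track the northernmost slow station in each time slot. The sketch below is only an illustration: the speeds are made up, and the `JAM_SPEED` threshold is a hypothetical choice that does not come from this notebook. Since lower station indices are further north and traffic on the I-15S heads south, a jam moving against traffic should show a shrinking index.

```python
import pandas as pd

JAM_SPEED = 35.0  # hypothetical jam threshold, not taken from the notebook

# Made-up speeds: rows are 5-minute slots, columns are station indices (0 = northernmost).
speeds = pd.DataFrame(
    [[65, 64, 66, 30, 63],
     [65, 64, 28, 29, 63],
     [65, 31, 27, 30, 62],
     [29, 30, 28, 31, 62]],
    index=[100, 101, 102, 103],
    columns=[0, 1, 2, 3, 4],
)

jammed = speeds < JAM_SPEED

# Lowest-index (northernmost) jammed station in each time slot.
upstream_edge = jammed.apply(lambda row: row[row].index.min(), axis=1)
```

In this toy data the upstream edge steps from station 3 to station 0 over four slots, which is the pattern the scatter plot shows as slow points spreading toward lower station indices.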

In [11]:
%matplotlib inline
data_5min_path = "../station_5min/2015/d11/"
import pandas as pd
import numpy as np
import gzip
import time
# vform, HBox and VBox are unused here and no longer importable in later Bokeh releases
from bokeh.io import curdoc, output_notebook, push_notebook, output_file, show
from bokeh.models import ColumnDataSource
from bokeh.models.widgets import Slider, Button, DataTable, DateFormatter, TableColumn
from bokeh.plotting import Figure
from bokeh.models.layouts import WidgetBox
from bokeh.layouts import row, column
from os import listdir
from os.path import isfile, join
from ipywidgets import interact
import datetime as dt

In [7]:
big_df = pd.read_csv("../data/I15S_data.csv")
# Keep rows dated before 2015-02-01; .loc replaces .ix, which was removed in pandas 1.0
big_df = big_df.loc[big_df['Date'] < '2015-02-01', :]
# Work on a copy so the column assignments below do not write into a slice of big_df
small_df = big_df.copy()
small_df['Timestamp'] = pd.to_datetime(small_df['Timestamp'])
small_df['Time'] = small_df['Timestamp'].dt.time
small_df['Date'] = small_df['Timestamp'].dt.date

The I15S_data.csv file contains station-level aggregate data (not per lane) at 5-minute intervals. Each row carries an "index" giving the station's order along the freeway: lower indices are further north, higher indices further south.
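As a rough illustration of that layout, the sketch below builds a tiny synthetic frame with the columns this notebook relies on (`Timestamp`, `index`, `AvgSpeed`). The station count and speed values are made up; only the column names and the 5-minute spacing come from the notebook.

```python
import pandas as pd

# Hypothetical stand-in for I15S_data.csv: 3 stations, one hour of 5-minute rows.
timestamps = pd.date_range("2015-01-01 00:00", periods=12, freq="5min")
rows = [
    {"Timestamp": ts, "index": station, "AvgSpeed": 65.0 - station}
    for ts in timestamps
    for station in range(3)  # 0 = northernmost station
]
sample_df = pd.DataFrame(rows)

# Each timestamp should carry one row per station.
stations_per_timestamp = sample_df.groupby("Timestamp")["index"].nunique()

# A full day at 5-minute resolution has 24 * 60 / 5 slots.
slots_per_day = 24 * 60 // 5
```

The 288 slots per day are why the time slider later in the notebook runs from 0 to 287.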

### Functions for Plot

In [8]:
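def scale_time(i):
    """Convert a 5-minute slot index (0-287) into a time of day."""
    base_time = dt.time(0,0,0)
    delta = dt.timedelta(minutes=i*5)
    # Combine with a dummy date so the timedelta can be added, then drop the date
    my_time = (dt.datetime.combine(dt.date(1,1,1),base_time) + delta).time()
    return my_time

def get_day_of_week(date_value):
    """Return the weekday name for a date (weekday() returns 0 for Monday)."""
    day_list = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday']
    day_number = date_value.weekday()
    return day_list[day_number]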

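These helpers make it easier to drive the Bokeh plot from the sliders: `scale_time` turns a slider position into a time of day, and `get_day_of_week` labels the selected date.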
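A quick check of the slot-to-time mapping, as a minimal sketch. `scale_time` is restated compactly so the block runs on its own; `time_to_slot` is a hypothetical inverse added for illustration and is not part of the notebook.

```python
import datetime as dt

def scale_time(i):
    # Same logic as the notebook's helper: slot index -> time of day.
    start = dt.datetime.combine(dt.date(1, 1, 1), dt.time(0, 0, 0))
    return (start + dt.timedelta(minutes=i * 5)).time()

def time_to_slot(t):
    # Hypothetical inverse (not in the notebook): time of day -> slot index.
    return (t.hour * 60 + t.minute) // 5

first_slot = scale_time(0)     # start of the day
last_slot = scale_time(287)    # last 5-minute slot of the day
morning = scale_time(101)      # a slot in the morning
round_trip = time_to_slot(morning)
```

Going back from a clock time to a slot index is handy when a jam is spotted at a known time and the slider needs to be set to it.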

### Bokeh Plot

In [14]:
output_notebook()
date_value = dt.date(2015,1,1)
time_value = dt.time(0,0,0)
# Set up data
# .loc replaces .ix, which was removed in pandas 1.0; build the row mask once
mask = (small_df['Date'] == date_value) & (small_df['Time'] == time_value)
x = small_df.loc[mask,'index']
y = small_df.loc[mask,'AvgSpeed']
source = ColumnDataSource(data=dict(x=x, y=y))

# Set up plot
plot = Figure(plot_height=600, plot_width=900, title="Speed - Jam Analysis",
              tools="",
              x_range=[0, max(x)], y_range=[0, max(y)+10], x_axis_label='Stations', y_axis_label='AvgSpeed')
plot.scatter('x', 'y', source=source)

# Set up table
data_table = dict(
        dates=[date_value],
        hour=[time_value.hour],
        minute=[time_value.minute],
        day_of_week = [get_day_of_week(date_value)]
    )
source_table = ColumnDataSource(data_table)

columns = [
        TableColumn(field="dates", title="Date", formatter=DateFormatter()),
        TableColumn(field="hour", title="Hour"),
        TableColumn(field="minute", title="Minute"),
        TableColumn(field="day_of_week", title="Day"),
    ]
data_table = DataTable(source=source_table, columns=columns, width=600, height=50)

# Set up callbacks
def update_data(my_date,my_time):

    date_value = dt.date(2015,1,my_date)
    time_value = scale_time(my_time)
    # Set up data
    # .loc replaces .ix, which was removed in pandas 1.0; build the row mask once
    mask = (small_df['Date'] == date_value) & (small_df['Time'] == time_value)
    x = small_df.loc[mask,'index']
    y = small_df.loc[mask,'AvgSpeed']
    
    data_table = dict(
        dates=[date_value],
        hour=[time_value.hour],
        minute=[time_value.minute],
        day_of_week = [get_day_of_week(date_value)]
    )
    
    source.data = dict(x=x, y=y)
    source_table.data = data_table
    
    push_notebook()

show(column(plot,data_table), notebook_handle=True)

In [15]:
# my_date is the day of January 2015 (1-31); my_time is the 5-minute slot of the day (0-287)
interact(update_data, my_date=(1, 31), my_time=(0, 287))

A Few Interesting Jams (slider values, with the corresponding clock time):
* date = 17, time = 101 (08:25)
* date = 7, time = 188 (15:40)
* date = 22, time = 177 (14:45)
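To inspect one of these jams without the interactive plot, the same filter used in `update_data` can be wrapped in a plain function. This is a minimal sketch: `snapshot` is a hypothetical helper, and `demo_df` holds made-up speeds shaped like `small_df` (columns `Date`, `Time`, `index`, `AvgSpeed`).

```python
import datetime as dt
import pandas as pd

def snapshot(df, day, slot):
    """Return station index and AvgSpeed for one day of January 2015 and one 5-minute slot."""
    date_value = dt.date(2015, 1, day)
    start = dt.datetime.combine(date_value, dt.time(0, 0))
    time_value = (start + dt.timedelta(minutes=slot * 5)).time()
    mask = (df["Date"] == date_value) & (df["Time"] == time_value)
    return df.loc[mask, ["index", "AvgSpeed"]].sort_values("index")

# Made-up data: 3 stations at two consecutive 5-minute slots.
timestamps = pd.to_datetime(["2015-01-17 08:25"] * 3 + ["2015-01-17 08:30"] * 3)
demo_df = pd.DataFrame({
    "Timestamp": timestamps,
    "index": [0, 1, 2, 0, 1, 2],
    "AvgSpeed": [64.0, 31.0, 62.0, 40.0, 28.0, 63.0],
})
demo_df["Time"] = demo_df["Timestamp"].dt.time
demo_df["Date"] = demo_df["Timestamp"].dt.date

jam_view = snapshot(demo_df, day=17, slot=101)
```

Calling `snapshot(small_df, day=17, slot=101)` on the real data would return the points the scatter plot shows for the first jam in the list.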