# WA Crash Viz and Analysis
### Final presentation & demo
#### by Katharine Chen, Tianqi Fang, Yutong Liu, Shuyi Yin

## Problem Background

+ A variety of factors (environmental, physical, etc.) contribute to a road’s overall safety;<br>
+ Professional and non-professional users alike need user-friendly interfaces to visualize and understand crash data;

## Our vision

Ideally, we want to develop a **website or app interface** that <br>
facilitates visualization and analysis of crashes for all levels of users;

## Data

We look at highway crashes in WA from **2013 to 2017**:
+ Crashes ([HSIS](https://www.hsisinfo.org/index.cfm)):<br>
    - accident: **<u>case num</u>**, **<u>road id</u>**, **<u>milepost</u>**, location, time, type, severity;
    - occupant: **<u>case num</u>**, condition;
    - vehicle: **<u>case num</u>**, make, condition;
+ Roadway info ([HSIS](https://www.hsisinfo.org/index.cfm)):<br>
    - curve: **<u>road id</u>**, **<u>milepost</u>**, curvature;
    - grade: **<u>road id</u>**, **<u>milepost</u>**, grade;
    - road: **<u>road id</u>**, **<u>milepost</u>**, traffic, num of lanes, widths;
+ Coordinates conversion ([NOAA](https://www.ngs.noaa.gov/TOOLS/spc.shtml)):
    - state plane coordinates to lat/lng;

Our database diagram is online [here](https://dbdiagram.io/embed/5dee85d7edf08a25543ee300).
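
The underlined fields above act as join keys between the tables. The following is a minimal sketch of how such tables could be combined with pandas, not the project's actual data preparation code. The toy values and the `make` column are made up for illustration, and the exact match on milepost is an assumption: real roadway records may need range matching instead.

```python
import pandas as pd

# Toy tables; all values are invented for illustration only.
accident = pd.DataFrame({
    "CASENO": [1, 2, 3],
    "rd_inv": ["005", "005", "090"],
    "milepost": [10.5, 22.0, 3.2],
})
vehicle = pd.DataFrame({
    "CASENO": [1, 1, 3],
    "make": ["A", "B", "C"],
})
curve = pd.DataFrame({
    "rd_inv": ["005", "090"],
    "milepost": [10.5, 3.2],
    "deg_curv": [5.0, 0.0],
})

# Count vehicles per crash, then attach the count via the case number.
veh_count = vehicle.groupby("CASENO").size().rename("veh_count").reset_index()
merged = accident.merge(veh_count, on="CASENO", how="left")

# Attach curvature via road id + milepost (exact match in this sketch).
merged = merged.merge(curve, on=["rd_inv", "milepost"], how="left")
```

A left join keeps every crash record even when no vehicle or curve record matches, so missing roadway attributes show up as `NaN` rather than silently dropping crashes.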

In [1]:
import pandas as pd
df2013 = pd.read_csv('../../data/crash-merged/2013.csv')

In [2]:
df2013.head()

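| | CASENO | FORM_REPT_NO | rd_inv | milepost | RTE_NBR | lat | lon | MONTH | DAYMTH | WEEKDAY | ... | weather | rur_urb | REPORT | veh_count | COUNTY | AADT | mvmt | deg_curv | dir_grad | pct_grad |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2013000001 | E218403 | 290 | 3.29 | 290 | 47.671826 | -117.344316 | 1 | 3 | 4 | ... | 1 | U | 1 | 2 | 32.0 | 6669.0 | 0.02 | 5.0 | - | 0.05 |
| 1 | 2013000003 | E218519 | 3 | 52.82 | 3 | 47.759766 | -122.655264 | 1 | 2 | 3 | ... | 1 | U | 3 | 1 | 18.0 | 30916.0 | 0.9 | 0.0 | + | 0.5 |
| 2 | 2013000005 | E218367 | 5 | 192.42 | 5 | 47.960145 | -122.199315 | 1 | 2 | 3 | ... | 1 | U | 2 | 2 | 31.0 | 185099.0 | 1.35 | 0.0 | - | 0.44 |
| 3 | 2013000008 | E219313 | 5 | 124.06 | 5 | 47.131128 | -122.535729 | 1 | 3 | 4 | ... | 3 | U | 1 | 2 | 27.0 | 143406.0 | 14.13 | 0.0 | + | 0.11 |
| 4 | 2013000009 | E219663 | 900 | 2.34 | 900 | 47.483053 | -122.249014 | 1 | 9 | 3 | ... | 2 | U | 1 | 2 | 17.0 | 18407.0 | 0.47 | 0.0 | + | 6.62 |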


In [3]:
df2013.columns

Index(['CASENO', 'FORM_REPT_NO', 'rd_inv', 'milepost', 'RTE_NBR', 'lat', 'lon',
       'MONTH', 'DAYMTH', 'WEEKDAY', 'RDSURF', 'LIGHT', 'weather', 'rur_urb',
       'REPORT', 'veh_count', 'COUNTY', 'AADT', 'mvmt', 'deg_curv', 'dir_grad',
       'pct_grad'],
      dtype='object')

## Use cases

With our **crash4viz** interface,
+ the average driver may consult the map and analysis report before traveling;
+ DOT planners, police officers and other professionals may look deeper into contributing factors;
+ non-programmers can visualize past crashes by selecting ***county, weather, road, vehicle***, etc.;
+ an example selection: **rainy day, steep downhill curved road, old car, little traffic, young driver**;

## Demo !!!

```bash
python interface.py
```

## Design

+ Data manager: <br>
    + Provides an interface to the dataset by querying it with the user's input. For example, “2017 + rainy + King County” fed to the data manager returns crash records that
        - happened in 2017;
        - happened in rainy weather;
        - took place in King County;
+ Analysis manager: <br>
    - receives data from the **data manager** and is activated by the "generate report" button;
    - conducts descriptive analysis;
    - fits statistical models to the data and generates a factor importance summary;

+ Visualization manager:
    - is the most important component of our package;
    - receives filtered data from data manager;
    - renders HTML maps;
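
As a rough illustration of the query step described for the data manager, the sketch below filters a table on user-selected values and then runs a simple descriptive count of the kind the analysis manager might start from. `filter_crashes` is a hypothetical helper, not the project's actual API; the `year` column and the numeric codes are placeholders rather than real HSIS codes.

```python
import pandas as pd

def filter_crashes(df, **criteria):
    """Return the rows of df that match every column=value pair.

    Hypothetical helper for illustration only.
    A value of None means "no filter on this column".
    """
    mask = pd.Series(True, index=df.index)
    for column, value in criteria.items():
        if value is not None:
            mask &= df[column] == value
    return df[mask]

# Toy records; the codes below are placeholders, not real HSIS codes.
crashes = pd.DataFrame({
    "year": [2017, 2017, 2016, 2017],
    "weather": [2, 1, 2, 2],
    "COUNTY": [17, 17, 17, 31],
})

# "2017 + rainy + King County" style query, using the placeholder codes.
selected = filter_crashes(crashes, year=2017, weather=2, COUNTY=17)

# A simple descriptive step: number of 2017 crashes per weather code.
counts = filter_crashes(crashes, year=2017).groupby("weather").size()
```

Passing the selections as keyword arguments keeps the filter independent of any particular column, so new selection fields would not require changes to the helper.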

## [Project structure](https://syin3.github.io/crash4viz/)

### Core scripts

+ ``interface.py`` is the top-level entry point and calls all the modules it needs;
+ Mapping and ML functions are in ``./crash4viz`` folder;
+ Testing functions are in ``./crash4viz/tests`` folder;
+ Data preparation scripts are in ``./crash4viz/dataprep`` folder;

### Data
+ Raw and processed data are stored in the ``./data`` folder for reference;

## Valuable lessons

+ Design together and assign tasks early;
+ Communicate constantly about the inputs and outputs of different components;
+ Build clean early;

## Future work

+ Add a time filtering function;
+ Apply ML to filtered data;
+ Write a website version with Django;
+ Better plots, cleaner code;