## unit_statuses.csv

A list of all stored unit statuses. A unit status is an escalator or elevator state with a start time and an end time. A unit has exactly one unit status at any given time.

Note that the data may be stale for stretches of time, either because the WMATA API itself served stale data or because data collection in the DC Metro Metrics application was down. Escalator and elevator statuses change constantly, so if you see no statuses for 3 or more consecutive hours, the data is likely stale during that stretch. There is an outage of more than 7 days in July 2014, when WMATA made unannounced, backwards-incompatible changes to the WMATA API that broke the DC Metro Metrics app.
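
The 3-hour rule of thumb can be checked by sorting the status start times across all units and looking for long gaps between consecutive ones. The following is a minimal sketch, not part of the original notebook: the function name `find_stale_gaps`, the `min_gap` parameter, and the sample timestamps are all hypothetical. It also does not distinguish between hours when Metrorail is open and hours when it is closed.

```python
import pandas as pd

def find_stale_gaps(times, min_gap="3h"):
    """Return one row per gap of at least `min_gap` between consecutive start times."""
    # Parse to UTC datetimes and sort so that diff() compares neighbours in time.
    ordered = pd.Series(pd.to_datetime(times, utc=True)).sort_values().reset_index(drop=True)
    gaps = ordered.diff()
    is_long = gaps >= pd.Timedelta(min_gap)  # the first row has no predecessor, so it is never flagged
    return pd.DataFrame({
        "gap_start": ordered.shift(1)[is_long],
        "gap_end": ordered[is_long],
        "gap_length": gaps[is_long],
    }).reset_index(drop=True)

# Made-up start times: a 1-hour gap followed by a 5-hour gap.
sample_times = [
    "2014-07-01T00:00:00+00:00",
    "2014-07-01T01:00:00+00:00",
    "2014-07-01T06:00:00+00:00",
]
stale = find_stale_gaps(sample_times)
```

With the real file, the call would presumably be `find_stale_gaps(data["time"])`.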

When an escalator or elevator appears in the WMATA API list of outages for the first time, it is given an initial OPERATIONAL status, followed by the outage status.

 - *unit_id*: The unit_id of the escalator or elevator
 - *time*: The start time of the status in UTC.
 - *end_time*: The end time of the status in UTC. If NA, the status is still active.
 - *metro_open_time*: The number of seconds for which Metrorail was open during the duration of the status.
 - *update_type*: The type of the update. Should be one of: "Break", "Fix", "Off", "On", "Update". These categorize the type of state changes:

   - Break means the unit has transitioned from a non-broken state to a broken state. (e.g. Operational -> Service Call)
   - Fix means the unit has transitioned from a broken status to an operational status. (e.g. Major Repair -> Operational)
   - Off means the unit has been turned off, but is not broken. (e.g. Operational -> Walker or Operational -> Preventive Maintenance Inspection)
   - On means the unit has been turned back on, but was not broken. (e.g. Preventive Maintenance Inspection -> Operational)
   - Update means the unit is still broken or off but has changed states. (e.g. Service Call -> Minor Repair)

 - *symptom_description*: The description of the unit state. WMATA changed the descriptions in July 2014, so there is some duplication here ("MAJOR REPAIR" and "Major Repair", "CALLBACK/REPAIR" and "Service Call"). I can't make a one-to-one correspondence between the old symptom descriptions and the new ones, so I didn't bother cleaning this up.

 - *symptom_category*: The type of the symptom description. Is one of: "ON", "BROKEN", "OFF", "REHAB", "INSPECTION"
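
Since `time` and `end_time` are UTC timestamps and a missing `end_time` marks a status that is still active, one way to work with them is to parse both columns and derive a duration. This is only a sketch under assumptions: the two sample rows are made up, and the column names `is_active` and `duration_seconds` are hypothetical additions, not columns of the file.

```python
import io
import pandas as pd

# Two made-up rows using a subset of the documented columns.
# The second row has NA as its end_time, meaning the status is still active.
sample_csv = io.StringIO(
    "unit_id,time,end_time\n"
    "A03N01ESCALATOR,2014-08-01T12:00:00+00:00,2014-08-01T14:30:00+00:00\n"
    "A03N01ESCALATOR,2014-08-01T14:30:00+00:00,NA\n"
)
statuses = pd.read_csv(sample_csv)  # read_csv treats "NA" as missing by default

# Parse both columns as timezone-aware UTC datetimes; missing values become NaT.
statuses["time"] = pd.to_datetime(statuses["time"], utc=True)
statuses["end_time"] = pd.to_datetime(statuses["end_time"], utc=True)

# A status with no end time is still active, and its duration is left undefined (NaN).
statuses["is_active"] = statuses["end_time"].isna()
statuses["duration_seconds"] = (statuses["end_time"] - statuses["time"]).dt.total_seconds()
```

Note that this wall-clock duration is not the same thing as `metro_open_time`, which counts only the seconds during which Metrorail was open.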


In [2]:
import pandas as pd

In [3]:
data = pd.read_csv('data/unit_statuses.csv')

In [4]:
data.head()

Unnamed: 0,unit_id,time,end_time,metro_open_time,update_type,symptom_description,symptom_category
0,F06N02ESCALATOR,2013-06-01T16:38:21.475000+00:00,2013-06-01T16:38:21.476000+00:00,0.001,Update,OPERATIONAL,ON
1,C06X04ESCALATOR,2013-06-01T16:38:21.475000+00:00,2013-06-01T16:38:21.476000+00:00,0.001,Update,OPERATIONAL,ON
2,B05X01ESCALATOR,2013-06-01T16:38:21.475000+00:00,2013-06-01T16:38:21.476000+00:00,0.001,Update,OPERATIONAL,ON
3,K02X02ESCALATOR,2013-06-01T16:38:21.475000+00:00,2013-06-01T16:38:21.476000+00:00,0.001,Update,OPERATIONAL,ON
4,A03N01ESCALATOR,2013-06-01T16:38:21.475000+00:00,2013-06-01T16:38:21.476000+00:00,0.001,Update,OPERATIONAL,ON


# What I want to do
- split up the unit_id 
    - make a new column called 'unit' and have it be labeled either escalator or elevator
- break up the timestamps into separate columns
    - 'full_date', 'year', 'month', 'day',
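
A possible way to do both steps with pandas is sketched below. It assumes every `unit_id` ends in either `ESCALATOR` or `ELEVATOR`; only the `ESCALATOR` suffix appears in the rows shown above, so the `ELEVATOR` suffix is a guess. The small frame reuses two rows from the `head()` output so that the example runs on its own.

```python
import pandas as pd

# Two rows copied from the head() output above, limited to the columns used here.
data = pd.DataFrame({
    "unit_id": ["F06N02ESCALATOR", "A03N01ESCALATOR"],
    "time": ["2013-06-01T16:38:21.475000+00:00", "2013-06-01T16:38:21.475000+00:00"],
})

# Pull the trailing unit type out of unit_id (assumed suffixes: ESCALATOR or ELEVATOR).
# Ids that match neither suffix end up as NaN.
data["unit"] = data["unit_id"].str.extract(r"(ESCALATOR|ELEVATOR)$", expand=False).str.lower()

# Parse the start time once, then split it into its parts.
start = pd.to_datetime(data["time"], utc=True)
data["full_date"] = start.dt.date
data["year"] = start.dt.year
data["month"] = start.dt.month
data["day"] = start.dt.day
```

All date parts here are taken in UTC, so a status that starts late in the evening local time may land on the next calendar day.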