# OpenTransit Data Set Description

In [1]:
import json

import pandas as pd

In [2]:
def load_json(filename='sample_routes_data_15s.json'):
    """Read one of the OpenTransit JSON files and return the parsed data."""
    with open(filename, 'r') as f:
        return json.load(f)

In [3]:
bus_locations = load_json()

In [7]:
route_1_buses = bus_locations[0]

In [8]:
route_1_buses.keys()

dict_keys(['rid', 'routeStates', 'stops'])

The JSON files `all_routes_data_15s.json`, `sample_routes_data_15s.json`, and `route_14_week_data.json` each load as a list of dicts, where each dict holds the location data for all the buses on one route. Each dict has the following keys:

In [14]:
route_1_buses['rid']

'1'

- `rid` is the route ID

In [12]:
route_1_buses['routeStates'][0]

{'vtime': '1539586814511',
 'vehicles': [{'vid': '5621',
   'lat': 37.790165,
   'lon': -122.432144,
   'heading': 255,
   'did': '1____O_F00'},
  {'vid': '5556',
   'lat': 37.795433,
   'lon': -122.396812,
   'heading': 218,
   'did': '1____O_F00'},
  {'vid': '5610',
   'lat': 37.7837559,
   'lon': -122.488441,
   'heading': 75,
   'did': '1____I_F00'},
  {'vid': '5582',
   'lat': 37.779846,
   'lon': -122.492851,
   'heading': 269,
   'did': '1____I_F00'},
  {'vid': '5606',
   'lat': 37.793411,
   'lon': -122.413483,
   'heading': 75,
   'did': '1____I_F00'}]}

- `routeStates` is a list of "snapshots" of the buses on the route. Each element records the state of the route's buses at one point in time, and is a dict with the following keys:

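    - `vtime` is the time at which the location data was recorded. In the sample above it is stored as a string, and the value looks like a Unix timestamp in milliseconds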

    - `vehicles` is a list of dicts, one per vehicle, giving the vehicle's latitude and longitude (`lat`, `lon`), `heading`, and direction (`did`, inbound or outbound) at that time, along with a vehicle ID `vid` that identifies the vehicle
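
The nesting described above (route, then snapshot, then vehicle) can be flattened into one row per vehicle per snapshot. The sketch below is one possible way to do that, not part of the data set itself: `flatten_route_states`, `recorded_at`, and `example_route` are hypothetical names, and the conversion assumes that `vtime` is a Unix timestamp in milliseconds, which the sample value suggests but which has not been confirmed here.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def flatten_route_states(route):
    """Return one flat dict per vehicle per snapshot for a single route dict."""
    rows = []
    for state in route['routeStates']:
        # Assumption: vtime is Unix epoch milliseconds stored as a string.
        recorded_at = EPOCH + timedelta(milliseconds=int(state['vtime']))
        for vehicle in state['vehicles']:
            # Tag each vehicle record with its route ID and snapshot time.
            rows.append({'rid': route['rid'], 'recorded_at': recorded_at, **vehicle})
    return rows

# Hand-made route in the same shape as the sample output above (two vehicles only).
example_route = {
    'rid': '1',
    'routeStates': [
        {'vtime': '1539586814511',
         'vehicles': [
             {'vid': '5621', 'lat': 37.790165, 'lon': -122.432144,
              'heading': 255, 'did': '1____O_F00'},
             {'vid': '5610', 'lat': 37.7837559, 'lon': -122.488441,
              'heading': 75, 'did': '1____I_F00'},
         ]},
    ],
    'stops': [],
}

rows = flatten_route_states(example_route)
print(len(rows), rows[0]['vid'], rows[0]['recorded_at'].isoformat())
```

With the real data, `flatten_route_states(route_1_buses)` should give a list that can be passed straight to `pd.DataFrame(...)` for analysis.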

In [13]:
route_1_buses['stops'][0]

{'sid': '4015',
 'name': 'Clay St & Drumm St',
 'lat': 37.7954399,
 'lon': -122.39682}

- `stops` is a list containing information about each stop on the route. Each element holds the identifying information for one stop (`sid` and `name`) along with its geographical location (`lat` and `lon`)
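
Since each file loads as a list of route dicts with the three keys described above, a quick overview of a file can be built by looping over that list. The following is a minimal sketch under that assumption; `summarize_routes`, `stops_by_id`, and `example_routes` are hypothetical names introduced only for illustration.

```python
def summarize_routes(routes):
    """Return one summary dict per route: its ID, snapshot count, and stop count."""
    return [
        {'rid': route['rid'],
         'n_snapshots': len(route['routeStates']),
         'n_stops': len(route['stops'])}
        for route in routes
    ]

def stops_by_id(route):
    """Map each stop ID (sid) to its stop dict for quick lookup."""
    return {stop['sid']: stop for stop in route['stops']}

# Hand-made list with a single route, in the same shape as the loaded files.
example_routes = [
    {'rid': '1',
     'routeStates': [{'vtime': '1539586814511', 'vehicles': []}],
     'stops': [{'sid': '4015', 'name': 'Clay St & Drumm St',
                'lat': 37.7954399, 'lon': -122.39682}]},
]

summary = summarize_routes(example_routes)
stop_lookup = stops_by_id(example_routes[0])
print(summary)
print(stop_lookup['4015']['name'])
```

On the loaded data, the equivalent calls would be `summarize_routes(bus_locations)` and `stops_by_id(route_1_buses)`.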
    