# About

The codebase is starting to get seriously complicated. Now, before digging even deeper into trip log generation, is a good time to unit test the action log conversion.

In [1]:
import sys; sys.path.append("../src/")
from processing import fetch_archival_gtfs_realtime_data, parse_gtfs_into_action_log

In [2]:
%load_ext autoreload
%autoreload 2

Save a local copy of the archival record to use as a test fixture.

In [12]:
with open("../src/tests/data/gtfs_realtime_pull_1.dat", "wb") as f:
    f.write(fetch_archival_gtfs_realtime_data(kind='gtfs', timestamp='2014-09-18-09-01', raw=True))

Read it back in and parse it into a `FeedMessage`.

In [14]:
from google.transit import gtfs_realtime_pb2

with open("../src/tests/data/gtfs_realtime_pull_1.dat", "rb") as f:
    gtfs_r0 = gtfs_realtime_pb2.FeedMessage()
    gtfs_r0.ParseFromString(f.read())

In [19]:
from processing import parse_message_into_action_log

Case 1: a train that is `STOPPED_AT` the first stop remaining in its trip sequence, with a bunch more stops to come.

In [20]:
parse_message_into_action_log(gtfs_r0.entity[0], gtfs_r0.entity[1], None)

Unnamed: 0,trip_id,route_id,action,stop_id,time_assigned,information_time
0,047600_1..S02R,1,STOPPED_AT,137S,1411045000.0,
1,047600_1..S02R,1,EXPECTED_TO_ARRIVE_AT,138S,1411045000.0,
2,047600_1..S02R,1,EXPECTED_TO_DEPART_AT,138S,1411045000.0,
3,047600_1..S02R,1,EXPECTED_TO_ARRIVE_AT,139S,1411045000.0,
4,047600_1..S02R,1,EXPECTED_TO_DEPART_AT,139S,1411045000.0,
5,047600_1..S02R,1,EXPECTED_TO_ARRIVE_AT,140S,1411045000.0,


In [21]:
gtfs_r0.entity[0]

id: "000001"
trip_update {
  trip {
    trip_id: "047600_1..S02R"
    start_date: "20140918"
    route_id: "1"
  }
  stop_time_update {
    arrival {
      time: 1411044718
    }
    departure {
      time: 1411044838
    }
    stop_id: "137S"
  }
  stop_time_update {
    arrival {
      time: 1411044928
    }
    departure {
      time: 1411044928
    }
    stop_id: "138S"
  }
  stop_time_update {
    arrival {
      time: 1411045018
    }
    departure {
      time: 1411045078
    }
    stop_id: "139S"
  }
  stop_time_update {
    arrival {
      time: 1411045228
    }
    stop_id: "140S"
  }
}
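
The `time` fields in the message are POSIX timestamps (seconds since the Unix epoch), which is how the GTFS-realtime specification defines them. As a quick sanity check they can be converted to readable datetimes. The sketch below uses only the standard library and copies the four arrival times from the message above; the helper name `to_utc` is made up for illustration.

```python
from datetime import datetime, timezone

# Arrival times copied from the stop_time_update entries above, keyed by stop_id.
arrivals = {
    "137S": 1411044718,
    "138S": 1411044928,
    "139S": 1411045018,
    "140S": 1411045228,
}


def to_utc(epoch_seconds):
    """Convert a POSIX timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


for stop_id, t in arrivals.items():
    print(stop_id, to_utc(t).isoformat())
```

The first arrival comes out as 12:51:58 UTC on 2014-09-18, a few minutes before the archival pull requested earlier.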

In [17]:
gtfs_r0.entity[1]

id: "000002"
vehicle {
  trip {
    trip_id: "047600_1..S02R"
    start_date: "20140918"
    route_id: "1"
  }
  current_stop_sequence: 35
  current_status: STOPPED_AT
  timestamp: 1411044718
  stop_id: "137S"
}
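
To make the expected output for Case 1 concrete, here is a minimal sketch of the rule the Case 1 table appears to follow: a `STOPPED_AT` vehicle contributes a single `STOPPED_AT` row for its current stop, and every other stop contributes one row per `arrival` or `departure` field that is present. This is not the `parse_message_into_action_log` implementation. Plain dicts stand in for the protobuf messages, `sketch_action_log` is a hypothetical name, and stamping the `STOPPED_AT` row with the vehicle timestamp is an assumption.

```python
def sketch_action_log(trip_update, vehicle=None):
    """Return (action, stop_id, time) rows for one trip.

    `trip_update` and `vehicle` are plain dicts standing in for the
    protobuf messages; this is an illustration, not the library code.
    """
    rows = []
    for stop in trip_update["stop_time_update"]:
        stopped_here = (
            vehicle is not None
            and vehicle["current_status"] == "STOPPED_AT"
            and vehicle["stop_id"] == stop["stop_id"]
        )
        if stopped_here:
            # Assumption: use the vehicle timestamp for the STOPPED_AT row
            # (in this message it equals the listed arrival time).
            rows.append(("STOPPED_AT", stop["stop_id"], vehicle["timestamp"]))
            continue
        if "arrival" in stop:
            rows.append(("EXPECTED_TO_ARRIVE_AT", stop["stop_id"], stop["arrival"]))
        if "departure" in stop:
            rows.append(("EXPECTED_TO_DEPART_AT", stop["stop_id"], stop["departure"]))
    return rows


# Values copied from gtfs_r0.entity[0] and gtfs_r0.entity[1] above.
trip_update = {"stop_time_update": [
    {"stop_id": "137S", "arrival": 1411044718, "departure": 1411044838},
    {"stop_id": "138S", "arrival": 1411044928, "departure": 1411044928},
    {"stop_id": "139S", "arrival": 1411045018, "departure": 1411045078},
    {"stop_id": "140S", "arrival": 1411045228},
]}
vehicle = {"current_status": "STOPPED_AT", "stop_id": "137S", "timestamp": 1411044718}

for row in sketch_action_log(trip_update, vehicle):
    print(row)
```

The six rows carry the same actions and stops as the Case 1 table. With `vehicle=None`, the first stop would fall through to the per-field arrival and departure rows instead.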

Case 2: a train that is `IN_TRANSIT_TO` the first stop remaining in its trip sequence, with a bunch more stops to come.

In [33]:
parse_message_into_action_log(gtfs_r0.entity[24], gtfs_r0.entity[25], None)

Unnamed: 0,trip_id,route_id,action,stop_id,time_assigned,information_time
0,050500_1..N02R,1,EXPECTED_TO_ARRIVE_AT,119N,1411045000.0,
1,050500_1..N02R,1,EXPECTED_TO_DEPART_AT,119N,1411045000.0,
2,050500_1..N02R,1,EXPECTED_TO_ARRIVE_AT,118N,1411045000.0,
3,050500_1..N02R,1,EXPECTED_TO_DEPART_AT,118N,1411045000.0,
4,050500_1..N02R,1,EXPECTED_TO_ARRIVE_AT,117N,1411045000.0,
5,050500_1..N02R,1,EXPECTED_TO_DEPART_AT,117N,1411045000.0,
6,050500_1..N02R,1,EXPECTED_TO_ARRIVE_AT,116N,1411046000.0,
7,050500_1..N02R,1,EXPECTED_TO_DEPART_AT,116N,1411046000.0,
8,050500_1..N02R,1,EXPECTED_TO_ARRIVE_AT,115N,1411046000.0,
9,050500_1..N02R,1,EXPECTED_TO_DEPART_AT,115N,1411046000.0,
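
Since the point of this notebook is to pin these cases down as unit tests, one possible shape for a Case 2 test is sketched below. It hard-codes the first four `(action, stop_id)` pairs from the output above. In a real test the `actual` rows would come from calling `parse_message_into_action_log` on the saved feed rather than from a copy of the expected value. The class and method names are hypothetical.

```python
import unittest

# (action, stop_id) pairs copied from the first four rows of the Case 2 output.
CASE_2_ROWS = [
    ("EXPECTED_TO_ARRIVE_AT", "119N"),
    ("EXPECTED_TO_DEPART_AT", "119N"),
    ("EXPECTED_TO_ARRIVE_AT", "118N"),
    ("EXPECTED_TO_DEPART_AT", "118N"),
]


class TestInTransitTrain(unittest.TestCase):
    def setUp(self):
        # Placeholder: a real test would build this from the parser's output.
        self.actual = list(CASE_2_ROWS)

    def test_no_stopped_at_row(self):
        # A train still in transit should not be logged as stopped anywhere.
        self.assertNotIn("STOPPED_AT", [action for action, _ in self.actual])

    def test_arrival_precedes_departure_at_each_stop(self):
        # Rows come in (arrive, depart) pairs that share a stop_id.
        pairs = zip(self.actual[0::2], self.actual[1::2])
        for (a_action, a_stop), (d_action, d_stop) in pairs:
            self.assertEqual(a_action, "EXPECTED_TO_ARRIVE_AT")
            self.assertEqual(d_action, "EXPECTED_TO_DEPART_AT")
            self.assertEqual(a_stop, d_stop)


# Run the tests explicitly so this also works inside a notebook cell.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestInTransitTrain)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Checking structural properties (no `STOPPED_AT` row, arrival before departure) rather than exact timestamps keeps the test readable and focused on the case being described.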


Case 3: a train that has not yet departed from the first stop in its sequence (and hence doesn't have a vehicle update yet).

In [53]:
parse_message_into_action_log(gtfs_r0.entity[108], None, None)

Unnamed: 0,trip_id,route_id,action,stop_id,time_assigned,information_time
0,056600_5..N69R,5,EXPECTED_TO_DEPART_AT,420N,1411047000.0,
1,056600_5..N69R,5,EXPECTED_TO_ARRIVE_AT,419N,1411047000.0,
2,056600_5..N69R,5,EXPECTED_TO_DEPART_AT,419N,1411047000.0,
3,056600_5..N69R,5,EXPECTED_TO_ARRIVE_AT,418N,1411047000.0,
4,056600_5..N69R,5,EXPECTED_TO_DEPART_AT,418N,1411047000.0,
5,056600_5..N69R,5,EXPECTED_TO_ARRIVE_AT,640N,1411047000.0,
6,056600_5..N69R,5,EXPECTED_TO_DEPART_AT,640N,1411047000.0,
7,056600_5..N69R,5,EXPECTED_TO_ARRIVE_AT,635N,1411047000.0,
8,056600_5..N69R,5,EXPECTED_TO_DEPART_AT,635N,1411047000.0,
9,056600_5..N69R,5,EXPECTED_TO_ARRIVE_AT,631N,1411048000.0,


Case 4: a train is `IN_TRANSIT_TO` the final stop in its trip sequence.

Case 5: a train is `STOPPED_AT` the final stop in its trip sequence.

Case 6: a train has, for some reason, a departure assigned to the final stop on its line.

This is a bad case that occurs at least once in the data, and finding an example means pulling up another archival record.

In [67]:
with open("../src/tests/data/gtfs_realtime_pull_2.dat", "wb") as f:
    f.write(fetch_archival_gtfs_realtime_data(kind='gtfs', timestamp='2014-09-17-09-36', raw=True))

In [68]:
from google.transit import gtfs_realtime_pb2

with open("../src/tests/data/gtfs_realtime_pull_2.dat", "rb") as f:
    gtfs_r1 = gtfs_realtime_pb2.FeedMessage()
    gtfs_r1.ParseFromString(f.read())

In [76]:
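# Find the first trip update whose last stop carries a departure time.
for i, message in enumerate(gtfs_r1.entity):
    # Entities without a trip update (e.g. vehicle updates) have an empty route_id here.
    if message.trip_update.trip.route_id != '':
        # An unset departure sub-message renders as an empty string.
        if str(message.trip_update.stop_time_update[-1].departure) != '':
            print(i)
            break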

207


Here we go, the bad egg:

In [77]:
gtfs_r1.entity[207]

id: "000208"
trip_update {
  trip {
    trip_id: "049150_4..S40R"
    start_date: "20140917"
    route_id: "4"
  }
  stop_time_update {
    arrival {
      time: 1410960934
    }
    departure {
      time: 1410960934
    }
    stop_id: "250S"
  }
}

In [78]:
gtfs_r1.entity[208]

id: "000209"
vehicle {
  trip {
    trip_id: "049150_4..S40R"
    start_date: "20140917"
    route_id: "4"
  }
  current_stop_sequence: 27
  current_status: IN_TRANSIT_TO
  timestamp: 1410960882
  stop_id: "250S"
}

In [79]:
parse_message_into_action_log(gtfs_r1.entity[207], gtfs_r1.entity[208], None)

Unnamed: 0,trip_id,route_id,action,stop_id,time_assigned,information_time
0,049150_4..S40R,4,EXPECTED_TO_ARRIVE_AT,250S,1410961000.0,
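
The output above has an arrival row for `250S` but no departure row, so the parser appears to discard the departure at the terminal stop. A minimal sketch of such a guard follows, under the assumption that the last `stop_time_update` is the final stop of the trip. Plain dicts stand in for the protobuf messages, and `drop_terminal_departure` is a hypothetical helper, not a function from `processing`.

```python
def drop_terminal_departure(stop_time_updates):
    """Return a new list in which the last stop has no departure."""
    if not stop_time_updates:
        return []
    *earlier, last = stop_time_updates
    # Rebuild the last stop without its departure; earlier stops pass through untouched.
    cleaned_last = {k: v for k, v in last.items() if k != "departure"}
    return [*earlier, cleaned_last]


# Values copied from gtfs_r1.entity[207] above.
stops = [{"stop_id": "250S", "arrival": 1410960934, "departure": 1410960934}]
print(drop_terminal_departure(stops))
```

The input list is left unmodified, so the raw message stays available for debugging.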


From before: pull one archival feed every five minutes across the hour, starting at 09:01 on 2014-09-18.

In [3]:
gtfs_r = dict()

for n in range(0, 60, 5):
    print(n + 1)
    gtfs_r[n] = fetch_archival_gtfs_realtime_data(kind='gtfs', timestamp='2014-09-18-09-' + str(1 + n).zfill(2))
    
print("Done!")

1
6
11
16
21
26
31
36
41
46
51
56
Done!
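
The timestamp strings in the loop above are assembled by string concatenation, which works within a single hour but would need extra care across hour or day boundaries. One alternative is to step a `datetime` and format it. The format string below is inferred from the timestamps used in this notebook, and `archival_timestamps` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def archival_timestamps(start, count, step_minutes=5):
    """Yield 'YYYY-MM-DD-HH-MM' strings, `step_minutes` apart, starting at `start`."""
    for i in range(count):
        moment = start + timedelta(minutes=i * step_minutes)
        yield moment.strftime("%Y-%m-%d-%H-%M")


stamps = list(archival_timestamps(datetime(2014, 9, 18, 9, 1), count=12))
print(stamps[0], stamps[-1])
```

Each string could then be passed as the `timestamp` argument in place of the concatenated value.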
