# Exploring the TTC Subway Real-time API
The API we're pulling data from is the one behind the TTC's [Next Train Arrivals](http://www.ttc.ca/Subway/next_train_arrivals.jsp) page. A bit of exploration in your browser's developer console shows that the page refreshes with data from a request to http://www.ttc.ca/Subway/loadNtas.action

In [5]:
import requests #to handle http requests to the API
from datetime import datetime
import pytz

In [7]:
stationid = 3 
#We'll find out the full range of possible stations further down.
lineid = 1 
#Possible values are 1, 2, and 4
request_time = datetime.now(pytz.timezone('Canada/Eastern')) 
#The time of the request. Is it possible to get historical data?
#The request time must be passed as Unix epoch time: milliseconds since Jan 1st, 1970
request_epoch = int(request_time.timestamp()*1000) 
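
As a quick sanity check on the epoch conversion, the sketch below turns a timezone-aware datetime into milliseconds and back again. The helper names `to_epoch_ms` and `from_epoch_ms` are made up for illustration (they are not part of the TTC API or of `requests`), and the sample timestamp is an arbitrary UTC value.

```python
from datetime import datetime, timezone

def to_epoch_ms(dt):
    """Milliseconds since 1970-01-01 UTC for a timezone-aware datetime."""
    return int(dt.timestamp() * 1000)

def from_epoch_ms(ms):
    """Inverse of to_epoch_ms, returned as a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# Arbitrary sample value, chosen only to exercise the round trip
sample = datetime(2017, 1, 10, 20, 45, 49, tzinfo=timezone.utc)
sample_ms = to_epoch_ms(sample)
print(sample_ms)
print(from_epoch_ms(sample_ms).isoformat())
```

Converting the value back is a cheap way to catch a seconds-versus-milliseconds mix-up before sending the request.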

In [12]:
# The url for the request
baseurl = "http://www.ttc.ca/Subway/loadNtas.action"

In [17]:
# Our query parameters for this API request
payload = {"subwayLine":lineid,
           "stationId":stationid,
           "searchCriteria":'', #The value in the search box
           #it has to be included otherwise the query fails
           "_":request_epoch} #Great job naming variables...
r = requests.get(baseurl, params = payload)

So now we've just received our first response from the API, and it is stored in the `requests` response object `r`. From previous examination of the API we know that the response is in JSON format, so the code below prints the parsed response so we can have a look at its fields.

In [19]:
r.json() 

{'allStations': 'success',
 'data': None,
 'defaultDirection': [['YKD1', 'Southbound<br/> To Union', 'YUS'],
  ['YKD2', 'Northbound<br/> To Downsview', 'YUS']],
 'limit': 3,
 'ntasData': [{'createDate': '2017-01-10T20:45:49',
   'id': 11790245361,
   'stationDirectionText': 'Southbound<br/> To Union',
   'stationId': 'YKD1',
   'subwayLine': 'YUS',
   'systemMessageType': 'Normal',
   'timeInt': 1.8108022857142858,
   'timeString': '01.81',
   'trainDirection': 'North',
   'trainId': 354,
   'trainMessage': 'Arriving'},
  {'createDate': '2017-01-10T20:45:49',
   'id': 11790245362,
   'stationDirectionText': 'Southbound<br/> To Union',
   'stationId': 'YKD1',
   'subwayLine': 'YUS',
   'systemMessageType': 'Normal',
   'timeInt': 7.033191146666667,
   'timeString': '07.03',
   'trainDirection': 'North',
   'trainId': 159,
   'trainMessage': 'Arriving'},
  {'createDate': '2017-01-10T20:45:49',
   'id': 11790245363,
   'stationDirectionText': 'Southbound<br/> To Union',
   'stationId': 'Y
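
The output above is cut off, but the records it does show suggest how arrivals could be pulled out of a response. The sketch below is one way to do that, assuming every entry in `ntasData` carries the `trainId`, `timeInt`, and `stationDirectionText` keys seen above, and assuming `timeInt` is the number of minutes until arrival (the API does not say so explicitly). `summarize_arrivals` and `sample_response` are hypothetical names, and the sample is a trimmed copy of the first two records.

```python
def summarize_arrivals(response):
    """Return (train id, time, direction) tuples from one API response."""
    arrivals = []
    # 'ntasData' may be missing or None; treat both cases as no arrivals
    for record in response.get('ntasData') or []:
        # The direction text embeds an HTML line break; strip it out
        direction = record['stationDirectionText'].replace('<br/>', '')
        # Assumption: timeInt is minutes until arrival; round for display
        arrivals.append((record['trainId'], round(record['timeInt'], 1), direction))
    return arrivals

# Trimmed copy of the first two records from the response shown above
sample_response = {
    'ntasData': [
        {'stationDirectionText': 'Southbound<br/> To Union',
         'timeInt': 1.8108022857142858,
         'trainId': 354},
        {'stationDirectionText': 'Southbound<br/> To Union',
         'timeInt': 7.033191146666667,
         'trainId': 159},
    ]
}

for arrival in summarize_arrivals(sample_response):
    print(arrival)
```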

## Building a scraping script
By opening up the inspector tools in the browser, we can see the full list of station ids by hovering over the `Select a subway station` dropdown list.
![](img/line1_stations.png)  
Line 1 stations are numbered 1-32, Line 2 stations 33-63, and Line 4 stations 64-68.
Thus we can construct a dictionary that will represent every possible API call:

In [1]:
lines = {1: range(1, 33), #range() excludes its end value, so it must be 1 greater than the last id
         2: range(33, 64),
         4: range(64, 69)}

In [None]:
def get_API_response(line_id, station_id, epoch):
    payload = {"subwayLine":line_id,
               "stationId":station_id,
               "searchCriteria":'',
               "_":epoch}
    r = requests.get(baseurl, params = payload) #baseurl was defined in an earlier cell
    return r.json()

def query_all_stations(epoch):
    data = {}
    for line_id, stations in lines.items():
        data[line_id] = {} #one inner dictionary per line, keyed by station id
        for station_id in stations:
            data[line_id][station_id] = get_API_response(line_id, station_id, epoch)
    return data
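
The result of a full scrape is easier to store or analyse as flat rows than as nested dictionaries. The sketch below shows one possible way to flatten it, assuming the result is a dictionary keyed by line id and then by station id, with each value shaped like the response shown earlier. `flatten_results` and `fake_data` are hypothetical names, and `fake_data` is a made-up stand-in so the example runs without calling the API.

```python
def flatten_results(data):
    """Turn {line_id: {station_id: response}} into a flat list of row dicts."""
    rows = []
    for line_id, stations in data.items():
        for station_id, response in stations.items():
            # A response with no arrivals may hold None instead of a list
            for record in response.get('ntasData') or []:
                rows.append({'line_id': line_id,
                             'station_id': station_id,
                             'train_id': record.get('trainId'),
                             'time_int': record.get('timeInt'),
                             'create_date': record.get('createDate')})
    return rows

# Made-up stand-in for the output of a full scrape (not real API data)
fake_data = {
    1: {3: {'ntasData': [{'trainId': 354,
                          'timeInt': 1.81,
                          'createDate': '2017-01-10T20:45:49'}]},
        4: {'ntasData': None}},
}

rows = flatten_results(fake_data)
print(rows)
```

Each row keeps the line and station ids used in the request, so rows from different stations stay distinguishable after flattening.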