In [1]:
import requests as req
# import json

In [2]:
# I'm going to front-load all of my
# query strings in case I need to adjust
# them later.  You don't have to do it this way.

In [3]:
# Start with some API ID information for WMATA

apiKEY = "api_key"
apiVALUE = "b23a33173a6f44059e3f7a6c286cb881"

# Organize this data as a Dictionary -- a singleton
# dictionary -- with only one KV Pair within.

papersPlease = {apiKEY: apiVALUE}

# Note again:  No quotes --> variables
# Whereas:     quotes --> words, text, quotes, strings

In [4]:
# From the WMATA Developer page, this address
# is the "endpoint" for our Bus Stop Timer query.

# JSON was built initially for JavaScript, but it
# can be used anywhere.  Note that the word "json"
# appears in the address itself, which implies
# that this endpoint will RETURN data in a JSON-friendly
# format.  It would have been best for me to include
# a URL to take me right to the website where this
# is documented.

BusPredictorURL = "https://api.wmata.com/NextBusService.svc/json/jPredictions"

In [5]:
# The Bus Stop ID query

# I'm building this, piece by piece, in order
# to make it easier to adjust later.  Plus,
# it makes it easier to understand how it works.

# CarBarn hardcoded right now to eliminate potential
# problems in getting this data.  ALWAYS WORK TO 
# ELIMINATE AS MANY VARIABLES AS POSSIBLE.

baseURL = "https://api.wmata.com/Bus.svc/json/jStops"
latQuery = "?Lat="
lngQuery = "&Lon="
radQuery = "&Radius="
myLat = "38.905833" # CarBarn, 
myLng = "-77.069828" # courtesy Google
myRad = "501"

In [6]:
# Assemble the Bus Stop ID query
# Worth observing that the way this comes together is a little
# odd:  Some data is commonly accessed in the form of a tuple,
# or an ordered, finite set.  The most common are called two-tuples,
# or ordered pairs:  Think, for example, of X and Y, or of Vectors
# in Physics (magnitude and heading).  These are commonly delivered
# *as pairs*:  You wouldn't (typically) bother saying "x=5, y=12."
# You'd just say (5,12).  Same deal, typically, with Latitude and
# Longitude:  e.g., (38.073, -77.07) is, as they say, easy on the eyes.
# Stretching it out like WMATA does in their API is a bit odd.

geoDataQuery = (latQuery + myLat) + (lngQuery + myLng) + (radQuery + myRad)
totalRequestLive = baseURL + geoDataQuery
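
Gluing the query string together by hand shows every moving part, but the standard library can do the same assembly from a dictionary. A minimal sketch, assuming the same three parameters as above; `geo_params`, `query_string`, and `full_url` are names I made up for illustration.

```python
from urllib.parse import urlencode

base_url = "https://api.wmata.com/Bus.svc/json/jStops"

# Same values as the cell above, gathered into one dictionary.
geo_params = {"Lat": "38.905833", "Lon": "-77.069828", "Radius": "501"}

# urlencode joins the pairs with "&" and escapes anything that needs it.
query_string = urlencode(geo_params)
full_url = base_url + "?" + query_string

print(full_url)
```

Requests can also take a dictionary like this directly through the `params` argument of `req.get()`, which saves the manual "?" step.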

In [7]:
# Again, we don't have to break things down
# to steps that are this discrete, but it
# does make it easier to follow along.  There's
# nothing wrong with making things as easy for
# others to read as possible.

# This is where the actual QUERY gets built and
# sent to the servers.  For an idea of how much
# of an improvement this is over former methods,
# look up the built-in urllib.request module for
# Python.  (Requests itself is built on urllib3.)

AllStops = req.get(totalRequestLive, headers = papersPlease)

In [8]:
# Did it work?  Look for code 200
# This won't stay here, but it is convenient
# feedback for now.

AllStops.status_code

200
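
A 200 is the answer we want, but the other numbers carry meaning too. Here is a small sketch that turns a status code into a readable label using only the standard library. `describe_status` is a hypothetical helper of my own, not something Requests provides.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short, human-readable label for an HTTP status code."""
    status = HTTPStatus(code)
    # Codes in the 200s mean the request succeeded.
    outcome = "success" if 200 <= code < 300 else "problem"
    return f"{code} {status.phrase} ({outcome})"


print(describe_status(200))
print(describe_status(401))
print(describe_status(404))
```

In a live notebook you could pass `AllStops.status_code` to it. Requests responses also have a `raise_for_status()` method that raises an exception on error codes, if you would rather fail loudly.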

In [9]:
# Now extract the section called "Stops" (which
# is really almost the whole thing).  NB that
# I'm using the .json() method of Requests objects.
# IMPORTANT to understand that this method
# doesn't turn something INTO JSON, but it
# CHANGES JSON into PYTHON-friendly
# dictionaries and lists.  Despite the name,
# BusStopDict is a LIST of dictionaries, one per stop.

AllStopsDictionary = AllStops.json()
BusStopDict = AllStopsDictionary["Stops"]
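
To see that conversion without touching the network, here is a sketch using the standard `json` library on a hand-copied string. The sample is a trimmed version of one stop from this notebook's output, so treat it as an assumption about the shape, not a full response.

```python
import json

# A trimmed, hand-copied sample in the shape of the jStops response.
raw_text = (
    '{"Stops": [{"StopID": "1001345", '
    '"Name": "PROSPECT ST NW + 36TH ST NW", '
    '"Routes": ["G2", "G2v1"]}]}'
)

# json.loads turns JSON text into Python objects.
parsed = json.loads(raw_text)
stops = parsed["Stops"]

print(type(raw_text).__name__)   # the raw JSON is just text
print(type(parsed).__name__)     # the outer object becomes a dictionary
print(type(stops).__name__)      # "Stops" holds a list of dictionaries
print(stops[0]["Name"])
```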

In [10]:
for each_entry in BusStopDict:
    print(each_entry)

{'StopID': '1001345', 'Name': 'PROSPECT ST NW + 36TH ST NW', 'Lon': -77.069828, 'Lat': 38.905833, 'Routes': ['G2', 'G2v1']}
{'StopID': '1001354', 'Name': '35TH ST NW + N ST NW', 'Lon': -77.06888, 'Lat': 38.906649, 'Routes': ['G2', 'G2v1']}
{'StopID': '1001315', 'Name': 'M ST NW + 34TH ST NW', 'Lon': -77.067573, 'Lat': 38.904949, 'Routes': ['38B', '38Bv1', '38Bv2', 'D5']}
{'StopID': '1001370', 'Name': '37TH ST NW + O ST NW', 'Lon': -77.071671, 'Lat': 38.907395, 'Routes': ['G2', 'G2v1']}
{'StopID': '1001385', 'Name': 'O ST NW + 34TH ST NW', 'Lon': -77.067972, 'Lat': 38.9077, 'Routes': ['G2', 'G2v1']}
{'StopID': '1001398', 'Name': 'P ST NW + 35TH ST NW', 'Lon': -77.068904, 'Lat': 38.908794, 'Routes': ['G2']}
{'StopID': '1003205', 'Name': 'M ST NW + 33RD ST NW', 'Lon': -77.065862, 'Lat': 38.905208, 'Routes': ['38B', 'D5']}
{'StopID': '1001318', 'Name': 'M ST NW + POTOMAC ST NW', 'Lon': -77.065469, 'Lat': 38.905035, 'Routes': ['38B', '38Bv1', '38Bv2', 'D5']}
{'StopID': '1001401', 'Name': 'P

Data is typically packaged in a manner that requires as little effort as possible on the part of the producer and distributor of that data.  Which means, in order to maintain the delicate balance of the Universe, we are going to have to do a lot of work to make sense of that data.

Still, you may ask, why so much work?  In part, because I'm stretching the process out overmuch in order to expose actions that are typically concealed or invisible.  But the fact is that JSON was not built for Python, and though it translates, we do have to jump through some hoops.

In this case specifically, here's the structure of the JSON (or pseudo-JSON) that the .json() method of a Requests response delivers for a bus prediction query:


1.  A Dictionary
    1.  contains a pair of KVPs:
        1. (A KVP - StopName, value is String)
        2. (A KVP - Predictions, value is a List)
            1.  List contains n Dictionaries
                1.  Each dict contains the KVPs we want
                    1. RouteID
                    2. DirectionText
                    3. Minutes

Not that great.  But the good news is that once you work through it, it tends to remain fairly reliable.  This approach, for instance, is better *by far* than scraping data.
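
The nesting is easier to see with a hand-built sample. This sketch assumes the outer object is a dictionary with `StopName` and `Predictions` keys, which is how the code in this notebook indexes it. The values are copied from this notebook's own output, not fetched live, and `summarize` is a hypothetical helper.

```python
# Hand-built sample; values copied from this notebook's output.
sample_response = {
    "StopName": "PROSPECT ST NW + 36TH ST NW",
    "Predictions": [
        {"RouteID": "G2",
         "DirectionText": "East to Ledroit Park - Howard University",
         "Minutes": 9},
        {"RouteID": "G2",
         "DirectionText": "East to Ledroit Park - Howard University",
         "Minutes": 38},
    ],
}


def summarize(response):
    """Build one readable line per predicted bus."""
    lines = []
    # .get() with a default keeps this safe when no buses are predicted.
    for bus in response.get("Predictions", []):
        lines.append(
            f'{bus["Minutes"]} mins: {bus["RouteID"]} '
            f'{response["StopName"]} {bus["DirectionText"]}'
        )
    return lines


for line in summarize(sample_response):
    print(line)
```

Looping over the list, instead of grabbing element `[0]`, also means a stop with no predictions simply produces no lines.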

In [11]:
'''
for each_entry in BusStopDict:
    ID = each_entry["StopID"]
    BusPredictorRequest = BusPredictorURL + "?StopID=" + ID
    BusETA = req.get(BusPredictorRequest, headers = papersPlease)

    # Two different ways of accomplishing this
    # One uses the json library, the other
    # Depends on the .json() method in Requests library
    # jsonData = json.loads(BusETA.content.decode('utf-8'))

    
    jsonData = BusETA.json()
    #    print(jsonData)
    jsonPredictions = jsonData['Predictions']
    #    print(jsonPredictions)
    jsonMainList = jsonPredictions[0]
    #    print(jsonMainList)
    tempRouteID = jsonMainList['RouteID']
    tempNarr = jsonMainList['DirectionText']
    tempMins = jsonMainList['Minutes']
    print(tempRouteID, tempNarr, tempMins)
    '''


Now that we've got the data reliably divvied up, let's store that information in a few lists -- just like we did in the last few projects -- in order to organize it for our dashboard.

In [12]:
# As long as the number of stops isn't too
# high, we still want to use the BusStop list
# as a convenient set of starting points.

# But let's start by zeroing out the lists
# with each bus' data.  Note that I'm using
# square brackets [] -- this is a list,
# not a tuple, not a dictionary.

# Also:  We zero it out only once -- we want
# these lists to accumulate values.

busStopID = []
busStopName = []

busRouteID = []
busRouteName = []
busRouteDir = []
busRouteMins = []
busID = []

for each_Stop in BusStopDict:
    
    stopID = each_Stop["StopID"]
    stopName = each_Stop["Name"]
    
    BusPredictorRequest = BusPredictorURL + "?StopID=" + stopID
    BusETA = req.get(BusPredictorRequest, headers = papersPlease)
    
    jsonData = BusETA.json()
    jsonPredictions = jsonData['Predictions']
    
    # NOW we have to iterate over every
    # dictionary hidden in that one list.
    # If there's only one dictionary, this
    # still works just fine.
    
    tempBusStopID = stopID
    tempBusStopName = stopName
    
    for each_Bus in jsonPredictions:
        tempRouteID = each_Bus['RouteID']
        tempDir = each_Bus['DirectionText']
        tempMins = each_Bus['Minutes']
        # tempID = each_Bus['']
        
        # add in data (often repeated)
        # that reminds us which stop we were
        # asking about...
        busStopID.append(tempBusStopID)
        busStopName.append(tempBusStopName)
        
        # now save this data at PEAK FRESHNESS
        
        busRouteID.append(tempRouteID)
        # busRouteName.append(tempRouteName)
        busRouteDir.append(tempDir)
        busRouteMins.append(str(tempMins))
        # busID.append(str(tempID))
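
Parallel lists like these only make sense when read together, position by position. A minimal sketch of one way to do that with `zip`, then sort by arrival time. The `sample_*` lists are hand-copied from this notebook's output so the example runs on its own, and it assumes the minutes are always whole numbers stored as strings, as they are above.

```python
# Small hand-copied sample in the shape of the parallel lists above.
sample_mins = ["9", "38", "13", "0"]
sample_routes = ["G2", "G2", "38B", "38B"]
sample_stops = [
    "PROSPECT ST NW + 36TH ST NW",
    "PROSPECT ST NW + 36TH ST NW",
    "M ST NW + 34TH ST NW",
    "M ST NW + 33RD ST NW",
]
sample_dirs = [
    "East to Ledroit Park - Howard University",
    "East to Ledroit Park - Howard University",
    "East to Farragut Square",
    "West to Ballston",
]

# zip pairs up the i-th item of every list into one tuple per bus.
rows = list(zip(sample_mins, sample_routes, sample_stops, sample_dirs))

# The minutes were saved as strings, so convert to int before sorting;
# otherwise "13" would sort ahead of "9".
rows_by_eta = sorted(rows, key=lambda row: int(row[0]))

for mins, route, stop, direction in rows_by_eta:
    print(mins, "mins:", route, stop, direction)
```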

In [13]:
n = len(busRouteID)
for i in range(0,n):
    print(
    busRouteMins[i], "mins:",
    busRouteID[i],
    busStopName[i],
    busRouteDir[i]
    )
print("-------------")

9 mins: G2 PROSPECT ST NW + 36TH ST NW East to Ledroit Park - Howard University
38 mins: G2 PROSPECT ST NW + 36TH ST NW East to Ledroit Park - Howard University
9 mins: G2 35TH ST NW + N ST NW East to Ledroit Park - Howard University
39 mins: G2 35TH ST NW + N ST NW East to Ledroit Park - Howard University
13 mins: 38B M ST NW + 34TH ST NW East to Farragut Square
7 mins: G2 37TH ST NW + O ST NW East to Ledroit Park - Howard University
37 mins: G2 37TH ST NW + O ST NW East to Ledroit Park - Howard University
10 mins: G2 O ST NW + 34TH ST NW East to Ledroit Park - Howard University
39 mins: G2 O ST NW + 34TH ST NW East to Ledroit Park - Howard University
3 mins: G2 P ST NW + 35TH ST NW West to Georgetown University
23 mins: G2 P ST NW + 35TH ST NW West to Georgetown University
0 mins: 38B M ST NW + 33RD ST NW West to Ballston
13 mins: 38B M ST NW + 33RD ST NW West to Ballston
31 mins: 38B M ST NW + 33RD ST NW West to Ballston
0 mins: 38B M ST NW + POTOMAC ST NW East to Farragut Square
14