This notebook tries out a new approach to registering GAP species for processing in the Taxa Information Registry (TIR). It works toward the notion of putting messages on a message queue, something we're working on but don't have ready yet. The process examines GAP species records and sets up one or more API endpoint records in the TIR caches. Those endpoints are then processed by a separate notebook.
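
To make the message queue idea a little more concrete, here is a minimal sketch of how an API endpoint record might travel as a message. It is only an illustration under assumptions: Python's standard-library `queue.Queue` stands in for whatever message broker we end up using, and the names `work_queue`, `enqueue_lookup`, and `drain` are hypothetical and not part of the bis package.

```python
import json
import queue

# In-process stand-in for a real message broker (an assumption for illustration).
work_queue = queue.Queue()


def enqueue_lookup(origin_collection, origin_id, search_url):
    """Serialize one endpoint record as JSON and put it on the queue."""
    message = {
        "originCollection": origin_collection,
        # Stringify the ID so values such as ObjectId survive JSON encoding.
        "originID": str(origin_id),
        "searchURL": search_url,
    }
    work_queue.put(json.dumps(message))


def drain(q):
    """Pull every waiting message off the queue and decode it."""
    messages = []
    while not q.empty():
        messages.append(json.loads(q.get()))
    return messages


# Placeholder values; the URL is not a real TESS endpoint.
enqueue_lookup("gapspecies", "abc123", "https://example.org/search?q=placeholder")
received = drain(work_queue)
```

A separate worker could then read each message and call the endpoint, which is roughly the role the second notebook plays today.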

In [1]:
from IPython.display import display
from bis2 import mlab
from bis import itis
from bis import worms
from bis import tess
from bis import natureserve

Here we set up a connection to the MLab instance of a MongoDB database we are experimenting with as our sandbox. We will eventually move this to a production instance. The data for the TIR is arranged into different collections of documents in the "bis" database.

In [2]:
bisDB = mlab.getDB("bis")
gapspecies = bisDB["gapspecies"]
itiscache = bisDB["itiscache"]
wormscache = bisDB["wormscache"]
tesscache = bisDB["tesscache"]
natureservecache = bisDB["natureservecache"]

Here we loop through the GAP species documents that do not yet have a tessCacheID (an identifier pointing to the document in the tesscache collection where FWS listing information is cached). For each one, we create a tesscache document that records where the request came from and the query URL to call, then write the new document's ID back to the GAP species document as its tessCacheID. This sets up the later process that retrieves any available FWS listing information for a GAP species and caches it for our use. The query URL comes from a function in the bis.tess module, which searches by ITIS TSN when the GAP document has one and by scientific name otherwise.

In [3]:
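# Register a TESS cache document for each GAP species that lacks one.
for gapDoc in gapspecies.find({"tessCacheID": {"$exists": False}}):
    tessDoc = {}
    tessDoc["originCollection"] = "gapspecies"
    tessDoc["originID"] = gapDoc["_id"]
    # Prefer the ITIS TSN when available; otherwise search by scientific name.
    if "ITIS_TSN" in gapDoc:
        tessDoc["searchURL"] = tess.getTESSSearchURL("TSN", gapDoc["ITIS_TSN"])
    else:
        tessDoc["searchURL"] = tess.getTESSSearchURL("SCINAME", gapDoc["scientificname"])
    display(tessDoc)
    # Insert the cache document, then store its ID back on the GAP species document.
    tessCacheID = tesscache.insert_one(tessDoc).inserted_id
    print(gapspecies.update_one({"_id": gapDoc["_id"]}, {"$set": {"tessCacheID": tessCacheID}}, upsert=False))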
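
The document-building step in the loop above can be tried without touching the database. The sketch below is a dry run under assumptions: `fake_search_url` is a hypothetical stand-in for `tess.getTESSSearchURL` (the real URL format is not shown in this notebook, so the base URL is a placeholder), `build_tess_doc` is a hypothetical helper, and the sample records use made-up identifiers.

```python
from urllib.parse import urlencode


def fake_search_url(query_type, criteria):
    """Hypothetical stand-in for tess.getTESSSearchURL; placeholder URL format."""
    return "https://example.org/tess?" + urlencode({"type": query_type, "q": criteria})


def build_tess_doc(gap_doc, url_builder=fake_search_url):
    """Build the cache document for one GAP species record (no database access)."""
    tess_doc = {
        "originCollection": "gapspecies",
        "originID": gap_doc["_id"],
    }
    # Same branching as the notebook cell: TSN first, scientific name as fallback.
    if "ITIS_TSN" in gap_doc:
        tess_doc["searchURL"] = url_builder("TSN", gap_doc["ITIS_TSN"])
    else:
        tess_doc["searchURL"] = url_builder("SCINAME", gap_doc["scientificname"])
    return tess_doc


# Made-up sample records; the _id and ITIS_TSN values are illustrative only.
sample_records = [
    {"_id": 1, "ITIS_TSN": 123456, "scientificname": "Ursus americanus"},
    {"_id": 2, "scientificname": "Lynx rufus"},
]
tess_docs = [build_tess_doc(rec) for rec in sample_records]
```

Keeping the document construction in a function separate from the insert and update calls makes it possible to check the branching logic on a few sample records before running against the sandbox.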
    

Note: This is still a rough process that needs more work. I ran similar processes at earlier times to seed the processors for ITIS, WoRMS, and NatureServe (our three other current TIR caches). I'm working toward a solution that will run periodically and refresh itself, but I haven't yet designed exactly how that will work.