The original records, compiled by Ecosystems and provided in a spreadsheet from Steve Hilburger, included hyperlinks to the FWS Environmental Conservation Online System (ECOS) for most of the species. Each link contains the "SPCODE" identifier from that system. This identifier is different from the SPCODE or ENTITY_ID that is available in other parts of ECOS, and there does not appear to be a public API that keys on it. The web links lead to public landing pages for the species, which hold a collection of useful information that we may parse out for later analysis. For now, we use the species scientific name to find the species in ECOS TESS and bring back its information for later use.

In [18]:
from bis2 import dd
from IPython.display import display
from datetime import datetime
from bis import tess

In [10]:
bisDB = dd.getDB("bis")
esaWPSpecies = bisDB["FWS ESA Work Plan Species"]

An ITIS TSN assigned to an ECOS species record is a reliable identifier for retrieving data from the TESS system. This code block queries TESS by TSN for every record that has a TSN but no TESS data yet, and displays any record for which the query returns no result.

In [51]:
# Records that have a scraped TSN but no TESS data yet
for record in esaWPSpecies.find({"$and":[{"TESS":{"$exists":False}},{"ECOS Scrape.TSN":{"$exists":True}}]}):
    tessData = tess.queryTESS("TSN",record["ECOS Scrape"]["TSN"])
    if tessData["result"] is not False:
        esaWPSpecies.update_one({"_id":record["_id"]},{"$set":{"TESS":tessData}})
    else:
        # No TESS match on TSN; show the record for review
        display(record)
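
The same "no TESS data yet, but this identifier is present" filter recurs in the following steps, so it could be built by a small helper. This is only a sketch: `missing_tess_filter` is a hypothetical name, not part of the `bis` packages. It relies on MongoDB treating multiple top-level keys in a filter document as an implicit AND, so the explicit `$and` list is not required.

```python
def missing_tess_filter(identifier_field):
    """Build a filter for records with no TESS data that do have the
    given field under "ECOS Scrape" (hypothetical helper)."""
    return {
        # Top-level keys in a MongoDB filter are ANDed implicitly
        "TESS": {"$exists": False},
        f"ECOS Scrape.{identifier_field}": {"$exists": True},
    }


tsn_filter = missing_tess_filter("TSN")
spcode_filter = missing_tess_filter("SPCODE")
```

Under that assumption, `esaWPSpecies.find(tsn_filter)` would select the same records as the query above.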

In a few cases, the ECOS scrape did not yield an ITIS TSN, but we do have an SPCODE identifier taken from the URL in the link. We can use it to query TESS in those cases. This code block is meant to run after the attempt to get TESS data via TSN.

In [53]:
# Records still without TESS data that have an SPCODE from the ECOS link
for record in esaWPSpecies.find({"$and":[{"TESS":{"$exists":False}},{"ECOS Scrape.SPCODE":{"$exists":True}}]}):
    tessData = tess.queryTESS("SPCODE",record["ECOS Scrape"]["SPCODE"])
    if tessData["result"] is not False:
        esaWPSpecies.update_one({"_id":record["_id"]},{"$set":{"TESS":tessData}})
    else:
        # No TESS match on SPCODE; show the record for review
        display(record)
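
For reference, pulling an SPCODE out of a link might look like the sketch below. The scrape step itself is not shown in this notebook, so everything here is an assumption: the function name `spcode_from_url` is hypothetical, the query parameter is assumed to be called `spcode`, and the example URL is a made-up placeholder rather than a real ECOS address.

```python
from urllib.parse import urlparse, parse_qs


def spcode_from_url(url):
    """Return the value of an assumed 'spcode' query parameter, or None."""
    query = parse_qs(urlparse(url).query)
    for key, values in query.items():
        # Match the parameter name case-insensitively
        if key.lower() == "spcode" and values:
            return values[0]
    return None


# Placeholder URL for illustration only
example_url = "https://example.org/speciesProfile?spcode=X123"
example_spcode = spcode_from_url(example_url)
```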

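In cases where we still don't have any TESS data after trying the ITIS TSN and SPCODE identifiers, we can try the scientific name to see if there is anything in the system. Note that the query below only picks up records with a scraped scientific name and no SPCODE; records whose SPCODE lookup failed fall through to the final step. If nothing is found, then there must be some reason that FWS has not entered information for a particular petition into their core system.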

At this point, I also check to see if the scientific name we scraped from a linked ECOS web page matches the scientific name from the FWS pre-listing plan spreadsheet. If it doesn't match, I put a note in the processing metadata indicating that there is an issue we may want to investigate further. Depending on who established a link to ECOS in the spreadsheet, it may just be that we resolved some taxonomic issue with what was originally submitted by a petitioner.

In [56]:
for record in esaWPSpecies.find({"$and":[{"TESS":{"$exists":False}},{"ECOS Scrape.SPCODE":{"$exists":False}},{"ECOS Scrape.Scientific Name":{"$exists":True}}]}):
    if record["Submitted Data"]["Scientific Name"] != record["ECOS Scrape"]["Scientific Name"]:
        processingMetadata = record["Processing Metadata"]
        processingMetadata["ECOS Match Annotation"] = "Scientific name from spreadsheet didn't match with referenced ECOS record"
        esaWPSpecies.update_one({"_id":record["_id"]},{"$set":{"Processing Metadata":processingMetadata}})

    tessData = tess.queryTESS("SCINAME",record["ECOS Scrape"]["Scientific Name"])
    if tessData["result"] is not False:
        esaWPSpecies.update_one({"_id":record["_id"]},{"$set":{"TESS":tessData}})
    else:
        print("No TESS record found on scientific name search")
        display(record)
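
The comparison above is an exact string match, so differences in case or spacing alone would be flagged as mismatches. If that turns out to produce noise, a looser comparison could be sketched as follows. Both function names are hypothetical, and whether normalization is appropriate for these data is an assumption we have not checked.

```python
def normalize_name(name):
    """Collapse runs of whitespace and ignore case (hypothetical helper)."""
    return " ".join(name.split()).casefold()


def names_match(submitted, scraped):
    """True when two scientific names agree apart from case and spacing."""
    return normalize_name(submitted) == normalize_name(scraped)
```

A genuine taxonomic difference, such as a changed genus, would still be reported as a mismatch.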

At this point, there are a number of records that did not include any ECOS link to follow and scrape, and for which we have not been able to retrieve any information from TESS. For these, I use the originally submitted scientific name to run a search with the TESS API and see if we find any results. This exhausts our options for finding a link to TESS without also running ITIS processes that might turn up a TSN to search with, so we insert a TESS result for every remaining record, which indicates that no result was found if that's the case.

In [58]:
for record in esaWPSpecies.find({"TESS":{"$exists":False}}):
    # Store whatever TESS returns, including a "no result" response
    tessData = tess.queryTESS("SCINAME",record["Submitted Data"]["Scientific Name"])
    esaWPSpecies.update_one({"_id":record["_id"]},{"$set":{"TESS":tessData}})
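
Taken together, the steps above form a fallback chain: TSN, then SPCODE, then the scraped scientific name, then the submitted scientific name. A simplified sketch of that chain as one function is shown below. It is not what the cells above do exactly (they run as separate passes, and the scraped-name pass skips records that have an SPCODE). The names `first_tess_match` and `fake_query` are hypothetical, and `fake_query` merely stands in for `tess.queryTESS`, assuming only that the response is a dict whose `"result"` is `False` when nothing is found.

```python
def first_tess_match(record, query):
    """Try each available identifier in order and return the first
    response whose "result" is not False; otherwise return the last
    response, or None if the record had no identifiers at all."""
    scrape = record.get("ECOS Scrape", {})
    submitted = record.get("Submitted Data", {})
    attempts = [
        ("TSN", scrape.get("TSN")),
        ("SPCODE", scrape.get("SPCODE")),
        ("SCINAME", scrape.get("Scientific Name")),
        ("SCINAME", submitted.get("Scientific Name")),
    ]
    response = None
    for query_type, value in attempts:
        if value is None:
            continue  # identifier not available for this record
        response = query(query_type, value)
        if response["result"] is not False:
            return response
    return response


def fake_query(query_type, value):
    """Stand-in for tess.queryTESS with made-up data, for illustration."""
    if (query_type, value) == ("SCINAME", "Examplea demonstrata"):
        return {"result": True, "matched_on": query_type}
    return {"result": False}


demo_record = {
    "ECOS Scrape": {"TSN": "000"},
    "Submitted Data": {"Scientific Name": "Examplea demonstrata"},
}
demo_response = first_tess_match(demo_record, fake_query)
```

Here the TSN lookup fails in the stub, so the chain falls through to the submitted scientific name.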
