We periodically cache information about taxa of interest from the USFWS Threatened and Endangered Species System (TESS) in the Taxonomic Information Registry. This information is displayed in places like the State Wildlife Action Plan (SWAP) online app and the National Biogeographic Map.

TESS data are stored as JSON "documents" in a jsonb column of the "tir" data store in the experimental GC2 platform we are working with. This notebook shows a couple of methods for retrieving TESS information via the GC2 API for use in web applications.

There is a TESS record for every taxon that is registered in the TIR. When the TESS query (based on ITIS TSN) returned no information, the record contains a result=>none key/value and a cacheDate indicating the date/time the query was run.
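
As a minimal sketch of handling the no-information case, the helper below treats a record as usable only when its result value is true, as in the populated records shown in this notebook. The function names are hypothetical, and the fallback between dateCached (the key seen in populated records) and cacheDate (the name used in the text above) is an assumption about how empty records are keyed.

```python
def has_tess_info(tess_record):
    """Return True only when the cached record holds data from TESS."""
    # Populated records in this notebook carry "result": true; anything else
    # (missing key, "none", False) is treated as "no TESS information".
    return tess_record.get("result") is True


def cached_on(tess_record):
    """Return the cache timestamp under either key name (assumed)."""
    return tess_record.get("dateCached") or tess_record.get("cacheDate")


# Illustrative records only, not real TESS output
listed = {"result": True, "STATUS": "E", "dateCached": "2017-07-19T19:49:53.748069"}
empty = {"result": "none", "cacheDate": "2017-07-19T19:49:53.748069"}

for record in (listed, empty):
    print(has_tess_info(record), cached_on(record))
```

Checking the flag first keeps an application from trying to read listingStatus on a record that has none, while still letting it show when the lookup was last attempted.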

The following queries, based on the SQL API, are some options for working with the JSON structures. The data have also been piped to ElasticSearch in the GC2 instance, but there they come back as strings that need to be parsed. There should be only a single result for any given TSN or any originally submitted scientific name from the SGCN. However, the database does not strictly enforce a single record for any of these key identifying features, so you might harden your code by checking that only one record came back and handling any duplicates.
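
One way to do that hardening is sketched below. It assumes the response shape used by the query cells in this notebook (a features list whose items hold properties.tess). The function name is hypothetical, and preferring the most recently cached record when duplicates appear is only one possible policy.

```python
import json


def pick_tess_record(response):
    """Return one TESS record (as a dict) from a GC2 SQL API response, or None."""
    records = []
    for feature in response.get("features", []):
        tess = feature["properties"]["tess"]
        # The API may hand the jsonb value back as a string; parse it if so.
        records.append(json.loads(tess) if isinstance(tess, str) else tess)

    if not records:
        return None
    if len(records) > 1:
        # Assumed policy for duplicates: keep the most recently cached record.
        # ISO 8601 timestamps sort correctly as plain strings.
        records.sort(key=lambda rec: rec.get("dateCached", ""), reverse=True)
    return records[0]


# Illustrative response with a duplicate, not real API output
fake_response = {"features": [
    {"properties": {"tess": '{"TSN": "1", "dateCached": "2017-07-07T10:00:00"}'}},
    {"properties": {"tess": '{"TSN": "1", "dateCached": "2017-07-19T10:00:00"}'}},
]}
print(pick_tess_record(fake_response)["dateCached"])
```

Returning None for an empty result lets the calling code distinguish "taxon not registered" from "registered but no TESS data".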

In [9]:
import json
import requests
from IPython.display import display

In [6]:
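# Query based on ITIS TSN
tsn = 201922

# TSN is stored as a string inside the tess JSON document, so compare as text
q = "SELECT tess FROM tir.tir WHERE tess->>'TSN' = '" + str(tsn) + "'"

# Pass the SQL through params so that requests URL-encodes it
r = requests.get("https://gc2.datadistillery.org/api/v1/sql/bcb", params={"q": q}).json()

display(r["features"][0]["properties"]["tess"])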


'{"DPS": "0", "TSN": "201922", "FAMILY": "Cyprinidae", "SPCODE": "E00R", "STATUS": "E", "result": true, "COMNAME": "Pahranagat roundtail chub", "COUNTRY": "1", "INVNAME": "Chub, Pahranagat roundtail", "SCINAME": "Gila robusta jordani", "VIPCODE": "V01", "criteria": "Gila robusta jordani", "ENTITY_ID": "226", "queryType": "SCINAME", "dateCached": "2017-07-19T19:49:53.748069", "LEAD_AGENCY": "1", "LEAD_REGION": "8", "listingStatus": [{"STATUS": "Endangered", "POP_DESC": "Wherever found", "POP_ABBREV": "Wherever found", "LISTING_DATE": "1970-10-13"}], "REFUGE_OCCURRENCE": "Pahranagat National Wildlife Refuge"}'

The data we've cached in the TIR from TESS and other sources are now stored in JSONB structures, which the GC2 API returns as strings (from either SQL or ElasticSearch). Those strings may need to be converted to an actual data structure before they can be worked with easily. In Python, for instance, we'd use json.loads to load the string as a dictionary for processing.

In [10]:
tessDict = json.loads(r["features"][0]["properties"]["tess"])

print (type(tessDict))

display (tessDict)

<class 'dict'>


{'COMNAME': 'Popolo',
 'COUNTRY': '1',
 'DPS': '0',
 'ENTITY_ID': '6870',
 'FAMILY': 'Solanaceae',
 'INVNAME': 'Popolo',
 'LEAD_AGENCY': '1',
 'LEAD_REGION': '1',
 'REFUGE_OCCURRENCE': 'Papahanaumokuakea Marine National Monument, Hawaiian Islands National Wildlife Refuge',
 'SCINAME': 'Solanum nelsonii',
 'SPCODE': 'Q21R',
 'STATUS': 'E',
 'TSN': '30483',
 'VIPCODE': 'P01',
 'criteria': '30483',
 'dateCached': '2017-07-07T10:44:41.421099',
 'listingStatus': [{'LISTING_DATE': '2016-10-31',
   'POP_ABBREV': 'Wherever found',
   'POP_DESC': 'Wherever found',
   'STATUS': 'Endangered'}],
 'queryType': 'TSN',
 'result': True}

In [8]:
# Query based on a submitted SGCN scientific name
scientificname = "Solanum nelsonii"

# The originally submitted name lives in the registration JSON document
q = "SELECT tess FROM tir.tir WHERE registration->>'scientificname' = '" + scientificname + "'"

# Pass the SQL through params so that requests URL-encodes it
r = requests.get("https://gc2.datadistillery.org/api/v1/sql/bcb", params={"q": q}).json()

display(r["features"][0]["properties"]["tess"])

'{"DPS": "0", "TSN": "30483", "FAMILY": "Solanaceae", "SPCODE": "Q21R", "STATUS": "E", "result": true, "COMNAME": "Popolo", "COUNTRY": "1", "INVNAME": "Popolo", "SCINAME": "Solanum nelsonii", "VIPCODE": "P01", "criteria": "30483", "ENTITY_ID": "6870", "queryType": "TSN", "dateCached": "2017-07-07T10:44:41.421099", "LEAD_AGENCY": "1", "LEAD_REGION": "1", "listingStatus": [{"STATUS": "Endangered", "POP_DESC": "Wherever found", "POP_ABBREV": "Wherever found", "LISTING_DATE": "2016-10-31"}], "REFUGE_OCCURRENCE": "Papahanaumokuakea Marine National Monument, Hawaiian Islands National Wildlife Refuge"}'
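
Because the scientific name is spliced directly into the SQL text, a name containing a single quote would break the query. The sketch below doubles embedded single quotes, which is the standard SQL way to escape them inside a string literal. The helper names are hypothetical, and this is a convenience for well-behaved input rather than a complete defense against hostile input.

```python
def sql_literal(value):
    """Wrap a value as a SQL string literal, doubling any single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def tess_query_by_name(scientific_name):
    """Build the SELECT used above for a submitted SGCN scientific name."""
    return (
        "SELECT tess FROM tir.tir "
        "WHERE registration->>'scientificname' = " + sql_literal(scientific_name)
    )


print(tess_query_by_name("Solanum nelsonii"))
```

The resulting string can be sent to the SQL API as the q parameter in the same way as the queries above.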

You can also pull TESS and other buckets of information as JSON from the ElasticSearch index on tir. The following block queries the tess property in the tir ES index for the term "endangered" (dangerously nonspecific), then loops through the hits to display the listing status of each returned record from the tess cache. You should probably always show the date the information was cached from TESS so that users know how current the information they are viewing in our systems is. This is a Python-specific way to work with this information source, but similar methods should work in JavaScript or any other language.
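
The hand-encoded URL in the next cell places two "query" keys in one JSON object. Most JSON parsers keep only the last of a repeated key, so the SGCN source filter may be silently dropped. A hedged alternative is to build the query as a Python dict, combine both conditions with an Elasticsearch bool/must clause, and let the standard library handle the encoding. The function name is hypothetical, and whether the GC2 endpoint passes a bool query through unchanged is an assumption that has not been verified here.

```python
import json
from urllib.parse import quote

ES_BASE = "https://gc2.datadistillery.org/api/v1/elasticsearch/search/bcb/tir/tir"


def build_es_url(term, source="SGCN", size=25):
    """Build a search URL that requires both the source and the tess term to match."""
    query = {
        "query": {
            "bool": {
                "must": [
                    {"match": {"properties.source": source}},
                    {"match": {"properties.tess": term}},
                ]
            }
        }
    }
    # json.dumps produces valid JSON; quote percent-encodes it for the URL
    return ES_BASE + "?size=" + str(size) + "&q=" + quote(json.dumps(query))


print(build_es_url("Endangered"))
```

Building the query from a dict also makes it easier to swap in a more specific search term than "Endangered".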

In [20]:
esURL = "https://gc2.datadistillery.org/api/v1/elasticsearch/search/bcb/tir/tir?size=25&q={%22query%22:%20{%22match%22:%20{%22properties.source%22:%20%22SGCN%22}},%22query%22:%20{%22match%22:%20{%22properties.tess%22:%20%22Endangered%22}}}"

esData = requests.get(esURL).json()

for hit in esData["hits"]["hits"]:
    print(hit["_source"]["properties"]["scientificname"])
    # The tess property comes back as a JSON string, so parse it first
    _tessData = json.loads(hit["_source"]["properties"]["tess"])
    print("Date Cached from TESS: ", _tessData["dateCached"])
    # A taxon can carry several listings, e.g. one per population segment
    for listingStatus in _tessData["listingStatus"]:
        print(listingStatus["STATUS"], listingStatus["LISTING_DATE"], listingStatus["POP_DESC"])
    print("--------")
    
    

Lepidochelys olivacea
Date Cached from TESS:  2017-07-07T14:02:02.580433
Endangered 1978-07-28 Breeding colony populations on Pacific coast of Mexico
Threatened 1978-07-28 Wherever found, except when listed as endangered under 50 CFR 224.101
--------
Acipenser oxyrinchus oxyrinchus
Date Cached from TESS:  2017-07-19T14:17:30.621301
Endangered 2012-02-06 New York Bight DPS - See 50 CFR 224.101
Threatened 2012-02-06 Gulf of Maine DPS - See 50 CFR 223.102
Endangered 2012-02-06 South Atlantic DPS - See 50 CFR 224.101
Endangered 2012-02-06 Carolina DPS - See 50 CFR 224.101
Endangered 2012-02-06 Chesapeake Bay DPS - See 50 CFR 224.101
--------
Lepidochelys olivacea
Date Cached from TESS:  2017-07-26T16:10:48.109678
Endangered 1978-07-28 Breeding colony populations on Pacific coast of Mexico
Threatened 1978-07-28 Wherever found, except when listed as endangered under 50 CFR 224.101
--------
Rana muscosa
Date Cached from TESS:  2017-07-19T13:20:01.745286
Endangered 2014-06-30 U.S.A., northern 