The "ESA WP Species TIR Processing" notebook ran the basic processing of the work plan species, using Taxa Information Registry (TIR) functions to assemble data for further exploration and analysis. These data were placed into a MongoDB collection as part of an experimental platform we work with on the ESIP Testbed. They are currently accessible through a low-level API that requires authentication and can be called from Python, R, or other code. Once we determine which aspects of these data are worth pursuing, we will build a higher-level API and features that simplify access for other users, including making data public or providing simple authentication methods.

This notebook runs through a series of sections that attempt to tease out potentially interesting dynamics from the data for further use in conversations with USGS Ecosystems Mission Area folks and FWS personnel involved in this work.

In [9]:
from pybis import db
from IPython.display import display
import pandas as pd

bisDB = db.Db.connect_mongodb("bis")
esaWPSpecies = bisDB["FWS_Work_Plan_Species"]

# Names and Taxonomy
We started with only the original names from a supplied spreadsheet, which we believe began with the common names in the [National Listing Work Plan](https://www.fws.gov/endangered/esa-library/pdf/Listing%207-Year%20Workplan%20Sept%202016.pdf) and then had scientific names added. Most of these records also had links to the USFWS Ecological Conservation Online System (ECOS), which let us further reduce ambiguity about the species being identified. We were able to pick up ITIS Taxonomic Serial Numbers (TSNs) from the referenced ECOS pages, and from these we tapped into species data from the USFWS Threatened and Endangered Species System (TESS) and consulted ITIS itself. Through all of these steps we continued to build more data into the system that can then be evaluated for various uses.

The code cell below selects records with more than one unique scientific name (the `Synthesis.Unique Scientific Names` array) and prints a summary of the names and their sources for further examination. A few of these are anomalies caused by slight string mismatches; they are harmless and could have been handled earlier in processing. Notable things for future examination include:

* Most of these are cases where ITIS appears to disagree with the supplied scientific name. They show up as more than one ITIS name, followed by an "invalid" or "unaccepted" flag and a TSN that matches the TSN picked up from ECOS. In other words, the FWS system holds a TSN that ITIS considers invalid for some reason. There may be a good argument for the FWS usage in these cases, but they could be noted for FWS biologists to consider if they haven't already done so. Among the more notable cases are those where two differently named species resolve to the same taxon according to ITIS. This may already be well known to the relevant FWS biologists and is likely a part of the petition review process.
* In a few cases only a genus name was supplied, and this synthesis exercise did not handle those as elegantly as it could have.
* In a few other cases an exact name match between sources turned up a slight but insignificant difference that has no meaningful impact.

We likely need to add some additional annotation to the data based on a human review of these dynamics to help determine which name we want to use in connecting to other data systems. For now, we will try first to use the originally supplied scientific name in pulling other data in and then look at other names when that fails. The intent will be to remain true to what the original data supplied.
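
The name fallback described above could look something like the following. This is a minimal sketch, not code from the Taxa Information Registry: the helper names `candidate_names` and `first_match`, the `lookup` callable, and the sample record are all hypothetical, and the only things taken from our data are the `Submitted Data` and `Synthesis.Unique Scientific Names` keys.

```python
def candidate_names(record):
    """Return names to try, submitted name first, without duplicates."""
    submitted = (record.get("Submitted Data") or {}).get("Scientific Name")
    others = (record.get("Synthesis") or {}).get("Unique Scientific Names") or []
    ordered = []
    for name in [submitted, *others]:
        if name and name not in ordered:
            ordered.append(name)
    return ordered


def first_match(record, lookup):
    """Try each candidate name against `lookup`, a hypothetical callable
    that returns a result for a name or None when nothing is found."""
    for name in candidate_names(record):
        result = lookup(name)
        if result is not None:
            return name, result
    return None, None


# Hypothetical record and lookup table, for illustration only
sample_record = {
    "Submitted Data": {"Scientific Name": "Genus submittedus"},
    "Synthesis": {"Unique Scientific Names": ["Genus submittedus", "Genus alternatus"]},
}
known_names = {"Genus alternatus": {"source": "hypothetical"}}

matched_name, matched_result = first_match(sample_record, known_names.get)
print(candidate_names(sample_record))
print(matched_name)
```

Returning the name that actually matched, alongside the result, keeps a record of when we had to depart from the originally supplied name.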

In [None]:
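# Records with more than one unique scientific name (array index 1 exists)
for record in esaWPSpecies.find({"Synthesis.Unique Scientific Names.1":{"$exists":True}}):
    print ("Submitted Name:", record["Submitted Data"]["Scientific Name"])
    # ECOS and TESS data are not present on every record; skip them when missing
    try:
        print ("ECOS Name:", record["ECOS Scrape"]["Scientific Name"])
        print ("ECOS TSN:", record["ECOS Scrape"]["TSN"])
    except (KeyError, TypeError):
        pass
    try:
        print ("TESS Name:", record["TESS"]["SCINAME"])
    except (KeyError, TypeError):
        pass
    # A record may carry several ITIS entries (for example, valid and invalid names)
    for itisRecord in record.get("ITIS", []):
        print ("ITIS:", itisRecord["nameWInd"], itisRecord["usage"], itisRecord["tsn"])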
    print ("----------")
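
The ITIS disagreements noted in the list above could be flagged automatically rather than spotted by eye. The following is a rough sketch under some assumptions: it relies on the `ITIS` entries carrying `usage` and `tsn` and on `ECOS Scrape` carrying `TSN`, as in the cell above; it treats any `usage` value other than "valid" or "accepted" as a flag; and the function name `flag_itis_disagreements` and the sample record are hypothetical.

```python
# Usage values treated as "no disagreement" (an assumption about ITIS usage strings)
ACCEPTED_USAGE = {"valid", "accepted"}


def flag_itis_disagreements(record):
    """Return ITIS entries that are not valid/accepted yet share the ECOS TSN."""
    ecos_tsn = (record.get("ECOS Scrape") or {}).get("TSN")
    if ecos_tsn is None:
        return []
    flagged = []
    for itis in record.get("ITIS") or []:
        usage = str(itis.get("usage", "")).strip().lower()
        # Compare TSNs as strings in case one source stored them as integers
        if usage not in ACCEPTED_USAGE and str(itis.get("tsn")) == str(ecos_tsn):
            flagged.append(itis)
    return flagged


# Hypothetical record, for illustration only
sample_record = {
    "ECOS Scrape": {"Scientific Name": "Genus exampleus", "TSN": "100001"},
    "ITIS": [
        {"nameWInd": "Genus exampleus", "usage": "invalid", "tsn": "100001"},
        {"nameWInd": "Genus acceptedus", "usage": "valid", "tsn": "100002"},
    ],
}

flagged_sample = flag_itis_disagreements(sample_record)
print([entry["nameWInd"] for entry in flagged_sample])
```

A list like this could feed the human review and annotation step without anyone having to scan the full printed summary.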

# State Species of Greatest Conservation Need
One of the data sources we compared with is the synthesis of SGCN species that our program pulls together. This report shows a quick summary of the species that were found in the state SGCN synthesis. What might be interesting here is to think about cases where states that are part of the species range have not determined that the species should be on their list of conservation needs. If that kind of comparative analysis is interesting, we can work that into the process.

In [None]:
for record in esaWPSpecies.find({"SGCN":{"$exists":True}},{"Submitted Data":1,"SGCN.State List Summary":1}):
    print ("Scientific Name:", record["Submitted Data"]["Scientific Name"])
    print ("Common Name:", record["Submitted Data"]["Common Name"])
    print ("Lead FWS Region:", record["Submitted Data"]["Lead FWS Region"])
    print ("Species Range (FWS):", record["Submitted Data"]["Species Range"])
    print ("ECOS Link:", record["Submitted Data"]["Species Record Reference"])
    for sgcnSummary in record["SGCN"]["State List Summary"]:
        print (next(iter(sgcnSummary.keys())), "SGCN Summary Data")
        display (next(iter(sgcnSummary.values())))
    print ("----------")
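
If the comparative analysis mentioned above turns out to be interesting, it could start from something like this sketch. It takes plain lists of state names as input because the layout of the `State List Summary` values is not shown here; pulling those lists out of the stored documents is left out. The function name `range_states_without_sgcn` and the sample lists are hypothetical.

```python
def range_states_without_sgcn(range_states, sgcn_states):
    """Return range states that do not list the species as an SGCN.

    Names are compared case-insensitively after trimming whitespace.
    """
    listed = {state.strip().lower() for state in sgcn_states}
    missing = {
        state.strip()
        for state in range_states
        if state.strip().lower() not in listed
    }
    return sorted(missing)


# Hypothetical inputs, for illustration only
sample_range = ["Kentucky", "Ohio", "Indiana"]
sample_sgcn = ["ohio", "West Virginia"]

sample_gaps = range_states_without_sgcn(sample_range, sample_sgcn)
print(sample_gaps)
```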
    

# Species Range
The data from the FWS 7-year work plan include a list of states considered part of the range of each species. It's not completely clear at this point how those states are designated, but we have a number of other sources to consult about what the range of a species might include.

## BISON Occurrence Data
The following code block compares the list of states in the FWS range with the list of states that returned BISON records for the species. This section produces three potentially interesting lists:

* States included in the FWS range that do have occurrence records in BISON
* States from the FWS range that do not have any occurrence records in BISON
* Additional states where the species has occurrence records in BISON but that are not included in the FWS range

Interpreting these dynamics would require quite a bit of additional exploration of things like the type of occurrence record (fossils, museum specimens, and so on are all included here), the date of last occurrence, and other attributes in the BISON data.

In [None]:
for record in esaWPSpecies.find({},{"Submitted Data":1,"Synthesis":1}):
    print ("Scientific Name:", record["Submitted Data"]["Scientific Name"])
    print ("Common Name:", record["Submitted Data"]["Common Name"])
    print ("Lead FWS Region:", record["Submitted Data"]["Lead FWS Region"])
    print ("ECOS Link:", record["Submitted Data"]["Species Record Reference"])

    # Compare the FWS range states against the states with BISON occurrence records
    fwsRange = record["Synthesis"]["FWS Range List"]
    bisonStates = record["Synthesis"]["States with BISON Occurrence Data"]

    statesAligningWithBISON = [s for s in fwsRange if s in bisonStates]
    if len(statesAligningWithBISON) > 0:
        print ("States where FWS range of states align with BISON occurrence data")
        display (statesAligningWithBISON)

    fwsStatesWithNoBISONRecords = [s for s in fwsRange if s not in bisonStates]
    if len(fwsStatesWithNoBISONRecords) > 0:
        print ("States where FWS range of states had no BISON occurrence data")
        display (fwsStatesWithNoBISONRecords)

    additionalStatesFromBISON = sorted(set(bisonStates) - set(fwsRange))
    if len(additionalStatesFromBISON) > 0:
        print ("Additional states where BISON records occurrence data")
        display (additionalStatesFromBISON)
    print ("----------")
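
As a starting point for the additional exploration mentioned above (record type and date of last occurrence), here is a minimal sketch that summarizes occurrence records by state. It works on plain dictionaries, and the keys `state`, `basisOfRecord`, and `year`, along with the function name `summarize_by_state`, are assumptions for illustration; they would need to be mapped to whatever the BISON responses actually contain.

```python
from collections import defaultdict


def summarize_by_state(occurrences, excluded_types=("fossil",)):
    """Count retained records per state and track the most recent year.

    Records whose type is in `excluded_types` are dropped before counting.
    """
    excluded = {record_type.lower() for record_type in excluded_types}
    summary = defaultdict(lambda: {"count": 0, "latest_year": None})
    for occurrence in occurrences:
        if str(occurrence.get("basisOfRecord", "")).lower() in excluded:
            continue
        state = occurrence.get("state")
        if not state:
            continue
        entry = summary[state]
        entry["count"] += 1
        year = occurrence.get("year")
        if year is not None and (entry["latest_year"] is None or year > entry["latest_year"]):
            entry["latest_year"] = year
    return dict(summary)


# Hypothetical occurrence records, for illustration only
sample_occurrences = [
    {"state": "Ohio", "basisOfRecord": "observation", "year": 2012},
    {"state": "Ohio", "basisOfRecord": "specimen", "year": 1975},
    {"state": "Kentucky", "basisOfRecord": "fossil", "year": None},
]

sample_summary = summarize_by_state(sample_occurrences)
print(sample_summary)
```

With a summary like this, a state that appears in BISON only through fossil records or very old specimens could be set apart from one with recent observations.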


## GAP Species Habitat Maps
The Gap Analysis Project (GAP) in Core Science Analytics, Synthesis and Library has produced species distribution maps for the terrestrial vertebrate species whose range falls within the conterminous United States (CONUS). These are 30m raster products based on the GAP Land Cover data from 2001. Many of these distribution maps have been online for years, particularly those that started as regional products. The full set of maps is currently in USGS data review and will be released soon.

The following code block lists the FWS work plan species where GAP models exist.

In [10]:
for record in esaWPSpecies.find({"GAP":{"$exists":True}},{"Submitted Data":1,"GAP":1}):
    print (record["Submitted Data"]["Scientific Name"], "--", record["Submitted Data"]["Common Name"], "-- GAP Species Code:", record["GAP"]["gap_speciescode"])

Ambystoma barbouri -- streamside salamander -- GAP Species Code: aSTRSx
Batrachoseps campi -- Inyo Mountains slender salamander -- GAP Species Code: aINSAx
Batrachoseps minor -- lesser slender salamander -- GAP Species Code: aLSSAx
Batrachoseps relictus -- relictual slender salamander -- GAP Species Code: aRSSAx
Batrachoseps robustus -- Kern Plateau salamander -- GAP Species Code: aKPSAx
Batrachoseps simatus -- Kern Canyon slender salamander -- GAP Species Code: aKCSSx
Bufo microscaphus microscaphus -- Arizona toad -- GAP Species Code: aAZTOx
Cryptobranchus alleganiensis -- hellbender -- GAP Species Code: aHELLx
Eurycea latitans -- Cascade Caverns salamander -- GAP Species Code: aCCSAx
Eurycea nana -- Texas salamander -- GAP Species Code: aSMSAx
Eurycea robusta -- Blanco blind salamander -- GAP Species Code: aBLSAx
Eurycea sp. -- Comal Springs salamander -- GAP Species Code: aBLSAx
Eurycea tridentifera -- Comal blind salamander -- GAP Species Code: aCBSAx
Eurycea tynerensis -- Oklahoma

# Next Steps
I've just begun working through the various things we can do with the data we've already assembled for the FWS ESA work plan species in the Taxa Information Registry. There are other sources to tap into as well, such as the IUCN Red List. I'm checking in this code for now but will come back to it as time allows. We'll also have to see what kind of appetite there is for digging into this further with Ecosystems folks.

For now, here is a quick example of a full species document in its raw form from the database.

In [None]:
display(esaWPSpecies.find_one())