The work plan species are connected to the State Species of Greatest Conservation Need (SGCN). The USGS builds and maintains a synthesis of the state species lists, linking species names to taxonomic authorities (ITIS and WoRMS) to produce a synthesized National List for each decadal reporting period, with periodic updates during the intervening years. This notebook uses the sgcn module in the bispy package to search the National List SGCN API, then caches the summarized National List records that come back. Each record includes the list of states that have the species in their conservation planning process.

In [1]:
import requests
import json
import bispy
from IPython.display import display
from joblib import Parallel, delayed

sgcn = bispy.sgcn.Search()
bis_utils = bispy.bis.Utils()

In [2]:
# Load the cached workplan species
with open("cache/workplan_species.json", "r") as f:
    workplan_species = json.load(f)

In [3]:
# Prepare the list of names to look up, recording which field of the prepared workplan species data each name came from
lookup_name_list = [{"name source": "Lookup Name", "name": r["Lookup Name"]} for r in workplan_species]
lookup_name_list.extend({"name source": "Valid ITIS Scientific Name", "name": r["Valid ITIS Scientific Name"]} for r in workplan_species if "Valid ITIS Scientific Name" in r)
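
To make the shape of the lookup list concrete, here is a minimal sketch run on two made-up records. The species names and the `build_lookup_names` helper are hypothetical; only the two field names come from the notebook's data.

```python
# Hypothetical records; only the field names mirror the notebook's data
workplan_species = [
    {"Lookup Name": "Examplea alpha", "Valid ITIS Scientific Name": "Examplea alphus"},
    {"Lookup Name": "Examplea beta"},  # no ITIS match for this record
]


def build_lookup_names(records):
    """Return one entry per name to search, tagged with the field it came from."""
    names = [{"name source": "Lookup Name", "name": r["Lookup Name"]} for r in records]
    names.extend(
        {"name source": "Valid ITIS Scientific Name", "name": r["Valid ITIS Scientific Name"]}
        for r in records
        if "Valid ITIS Scientific Name" in r
    )
    return names


lookup_name_list = build_lookup_names(workplan_species)
for entry in lookup_name_list:
    print(entry["name source"], "->", entry["name"])
```

Every record contributes its lookup name, and records with an ITIS linkage contribute a second entry, so the list can be longer than the species list itself.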

In [4]:
# Use joblib to run the SGCN searches in parallel, one request per scientific name
sgcn_cache = Parallel(n_jobs=8)(delayed(sgcn.search)(r["name"]) for r in lookup_name_list)
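
The same fan-out pattern can be sketched with only the standard library. This swaps joblib for `concurrent.futures.ThreadPoolExecutor`, which is a reasonable fit for I/O-bound HTTP requests. It is not what the notebook itself runs. `fake_search` and the example.org URL are hypothetical stand-ins for `sgcn.search`, so no network call is made.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_search(name):
    """Hypothetical stand-in for sgcn.search; returns a record of similar shape."""
    known = {"Examplea alpha"}
    return {
        "search_api": "https://example.org/nationallist?scientificname=" + name,
        "sgcn_species": {"scientificname": name} if name in known else None,
    }


names = ["Examplea alpha", "Examplea beta", "Examplea gamma"]

# executor.map yields results in the same order as the inputs
with ThreadPoolExecutor(max_workers=8) as executor:
    results = list(executor.map(fake_search, names))

print([r["sgcn_species"] is not None for r in results])
# → [True, False, False]
```

Because results come back in input order, each result lines up with the name that produced it.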

In [5]:
# Keep only the results where an SGCN species was found, and record which name field produced the match
sgcn_cache_filtered = []
for record in [s for s in sgcn_cache if s["sgcn_species"] is not None]:
    # Recover the searched name from the end of the search URL, then look up its source
    searched_name = record["search_api"].split("=")[-1]
    record["Workplan Species Name Source"] = next((r["name source"] for r in lookup_name_list if r["name"] == searched_name), None)
    sgcn_cache_filtered.append(record)
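
The tagging step can also be sketched with a dictionary built once up front, which avoids rescanning the lookup list for every record. The data below is made up, and the example.org URLs are placeholders rather than the real API endpoint. Using `setdefault` keeps the first source seen for a name, which matches the first-match behavior of `next(...)`.

```python
# Hypothetical inputs shaped like the notebook's variables
lookup_name_list = [
    {"name source": "Lookup Name", "name": "Examplea alpha"},
    {"name source": "Lookup Name", "name": "Examplea beta"},
    {"name source": "Valid ITIS Scientific Name", "name": "Examplea alphus"},
    {"name source": "Valid ITIS Scientific Name", "name": "Examplea beta"},
]
base = "https://example.org/nationallist?scientificname="
sgcn_cache = [
    {"search_api": base + "Examplea alpha", "sgcn_species": None},
    {"search_api": base + "Examplea beta", "sgcn_species": {"gid": 1}},
    {"search_api": base + "Examplea alphus", "sgcn_species": {"gid": 2}},
]

# First occurrence wins, so a name present in both fields is tagged "Lookup Name"
source_by_name = {}
for entry in lookup_name_list:
    source_by_name.setdefault(entry["name"], entry["name source"])

tagged = []
for record in sgcn_cache:
    if record["sgcn_species"] is None:
        continue  # no SGCN match for this name
    searched_name = record["search_api"].split("=")[-1]
    tagged.append({**record, "Workplan Species Name Source": source_by_name.get(searched_name)})

for record in tagged:
    print(record["sgcn_species"]["gid"], record["Workplan Species Name Source"])
```

A name that appears as both a lookup name and a valid ITIS name is attributed to the lookup name, so the ITIS count covers only matches that the original name would have missed.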

In [6]:
# Cache the array of retrieved documents and return/display a random sample for verification
display(bis_utils.doc_cache("cache/sgcn.json", sgcn_cache_filtered))

{'Doc Cache File': 'cache/sgcn.json',
 'Document Number 161': {'Workplan Species Name Source': 'Lookup Name',
  'search_api': 'https://sciencebase.usgs.gov/staging/bis/api/v1/swap/nationallist?scientificname=Eumeces egregius insularis',
  'search_date': '2019-07-07T15:13:06.913082',
  'sgcn_species': {'acceptedauthorityurl': 'https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=208886',
   'commonname': 'Cedar Key Mole Skink',
   'gid': 9580,
   'matchmethod': 'Exact Match',
   'scientificname': 'Eumeces egregius insularis',
   'sgcn2005': 1,
   'sgcn2015': 0,
   'statelist_2005': 'Florida',
   'statelist_2015': '',
   'taxonomicgroup': 'Reptiles',
   'taxonomicrank': 'Subspecies'}},
 'Number of Documents in Cache': 308}
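
The `statelist_2005` and `statelist_2015` fields carry the states for each reporting period as a single string. A minimal sketch for comparing the two periods follows. The sample shows only a single state, so the comma separator for multi-state values is an assumption, and `split_states` is a hypothetical helper rather than part of bispy.

```python
def split_states(statelist, sep=","):
    """Split a state list string into sorted names; an empty string gives [].

    The separator is an assumption: the sample record holds only one state.
    """
    return sorted(s.strip() for s in statelist.split(sep) if s.strip())


# Field values taken from the sample record above
record = {"statelist_2005": "Florida", "statelist_2015": ""}

states_2005 = set(split_states(record["statelist_2005"]))
states_2015 = set(split_states(record["statelist_2015"]))

print("Listed in 2005 but not 2015:", sorted(states_2005 - states_2015))
# → Listed in 2005 but not 2015: ['Florida']
```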

# Note
As a point of reference, establishing a linkage to a valid ITIS record let us find a number of additional SGCN matches. The following code block reports the total number of SGCN matches (a high number, as we would expect for species tied to state conservation need) and the number of those matches found through the valid ITIS name rather than the original lookup name.

In [8]:
print("Number of SGCN Species Matches:", len(sgcn_cache_filtered))
print("Number found with ITIS valid name:", len([r for r in sgcn_cache_filtered if r["Workplan Species Name Source"] != "Lookup Name"]))

Number of SGCN Species Matches: 308
Number found with ITIS valid name: 15
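
The same breakdown can be produced for every name source in one pass with `collections.Counter`. This is a sketch on made-up records that carry only the tallied field; the counts are illustrative and are not the notebook's results.

```python
from collections import Counter

# Hypothetical records carrying only the field that is tallied
sgcn_cache_filtered = [
    {"Workplan Species Name Source": "Lookup Name"},
    {"Workplan Species Name Source": "Lookup Name"},
    {"Workplan Species Name Source": "Valid ITIS Scientific Name"},
]

counts = Counter(r["Workplan Species Name Source"] for r in sgcn_cache_filtered)
for source, n in counts.most_common():
    print(f"{source}: {n}")
```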
