The work plan species are connected to the State Species of Greatest Conservation Need (SGCN). The USGS builds and maintains a synthesis of the state species lists, linking species names to taxonomic authorities (ITIS and WoRMS) to produce a synthesized National List for each decadal reporting period, with periodic updates in the intervening years. This notebook uses the sgcn module in the bispy package to search the National List SGCN API. It retrieves and caches the summarized National List records, which include the list of states that have each species in their conservation planning process.
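
For orientation, the `api` field in the cached output further down shows the kind of request made for each name. The sketch below only builds such a URL and does not call the service. The helper name `national_list_url` is hypothetical and not part of bispy, and percent-encoding the space is an assumption about how the name should be escaped.

```python
from urllib.parse import quote, urlencode

# Endpoint copied from the 'api' field in this notebook's cached output.
NATIONAL_LIST_URL = "https://api.sciencebase.gov/bis-api/api/v1/swap/nationallist"


def national_list_url(scientific_name):
    """Hypothetical helper: build the query URL for one scientific name."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    query = urlencode({"scientificname": scientific_name}, quote_via=quote)
    return f"{NATIONAL_LIST_URL}?{query}"


print(national_list_url("Clemmys guttata"))
```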

In [1]:
import requests
import bispy
from IPython.display import display
from joblib import Parallel, delayed
from collections import Counter
import jsonschema

sgcn = bispy.sgcn.Search()
bis_utils = bispy.bis.Utils()

import helperfunctions

ModuleNotFoundError: No module named 'yaml'

In [2]:
name_list = helperfunctions.workplan_species()

In [3]:
%%time
# Use joblib to run multiple requests for SGCN records in parallel via scientific names
sgcn_results = Parallel(n_jobs=8)(delayed(sgcn.search)(name, name_source) for name, name_source in name_list)


CPU times: user 1.01 s, sys: 106 ms, total: 1.12 s
Wall time: 14.9 s
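
These are HTTP requests, so much of the wall time is likely spent waiting on the network. As a rough sketch of the same fan-out pattern using only the standard library, a thread pool can stand in for joblib. This is a swapped-in technique, not what the notebook runs. `fake_search` is a hypothetical stand-in for `sgcn.search`, and the second name in the list is a placeholder.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_search(name, name_source):
    """Hypothetical stand-in for sgcn.search; sleeps to mimic network latency."""
    time.sleep(0.05)
    return {"parameters": {"Scientific Name": name, "Name Source": name_source}}


# Same (name, name_source) tuple shape that the notebook iterates over.
name_list = [
    ("Clemmys guttata", "Lookup Name"),
    ("Genus species", "Valid ITIS Scientific Name"),  # placeholder entry
]

# pool.map returns results in the same order as the inputs.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(lambda pair: fake_search(*pair), name_list))

print(len(results), "records retrieved")
```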


In [4]:
# Cache the array of retrieved documents and return/display a random sample for verification
display(bis_utils.doc_cache("cache/sgcn.json", sgcn_results))

{'Doc Cache File': 'cache/sgcn.json',
 'Number of Documents in Cache': 387,
 'Document Number 370': {'processing_metadata': {'status': 'success',
   'date_processed': '2019-07-24T21:45:12.311662',
   'status_message': 'Name Match',
   'api': 'https://api.sciencebase.gov/bis-api/api/v1/swap/nationallist?scientificname=Clemmys guttata'},
  'parameters': {'Scientific Name': 'Clemmys guttata',
   'Name Source': 'Lookup Name'},
  'sgcn_species': {'statelist_2005': 'Connecticut,Delaware,Florida,Georgia,Illinois,Indiana,Maine,Maryland,Massachusetts,Michigan,New Hampshire,New Jersey,New York,North Carolina,Ohio,Pennsylvania,Rhode Island,South Carolina,Vermont,Virginia,West Virginia',
   'statelist_2015': 'Connecticut,Delaware,District of Columbia,Florida,Georgia,Illinois,Indiana,Maine,Maryland,Massachusetts,Michigan,New Hampshire,New Jersey,New York,North Carolina,Ohio,Pennsylvania,Rhode Island,South Carolina,Vermont,Virginia,West Virginia',
   'scientificname': 'Clemmys guttata',
   'commonna
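
The `statelist_2005` and `statelist_2015` values in the sample record are comma-separated strings. A minimal sketch of comparing the two reporting periods follows, using the values copied from the record above. The helper `split_states` is hypothetical, and the sketch assumes state names never contain commas.

```python
def split_states(statelist):
    """Hypothetical helper: turn a comma-separated state string into a list."""
    return [state.strip() for state in statelist.split(",") if state.strip()]


# Values copied from the Clemmys guttata record shown in the cached sample.
record = {
    "statelist_2005": "Connecticut,Delaware,Florida,Georgia,Illinois,Indiana,Maine,Maryland,Massachusetts,Michigan,New Hampshire,New Jersey,New York,North Carolina,Ohio,Pennsylvania,Rhode Island,South Carolina,Vermont,Virginia,West Virginia",
    "statelist_2015": "Connecticut,Delaware,District of Columbia,Florida,Georgia,Illinois,Indiana,Maine,Maryland,Massachusetts,Michigan,New Hampshire,New Jersey,New York,North Carolina,Ohio,Pennsylvania,Rhode Island,South Carolina,Vermont,Virginia,West Virginia",
}

states_2005 = set(split_states(record["statelist_2005"]))
states_2015 = set(split_states(record["statelist_2015"]))

added = sorted(states_2015 - states_2005)    # listed in 2015 but not in 2005
dropped = sorted(states_2005 - states_2015)  # listed in 2005 but not in 2015

print(f"2005: {len(states_2005)} states, 2015: {len(states_2015)} states")
print("Added in 2015:", added)
print("Dropped since 2005:", dropped)
```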

# Schema Validation

In [6]:
sgcn_schema = helperfunctions.load_schema('sgcn')
display(sgcn_schema)

jsonschema.validate(sgcn_results, sgcn_schema)

{'definitions': {'items': {'$id': '#/items',
   'type': ['object', 'array'],
   'title': 'Generic container for items in a dataset',
   'description': 'A JSON array or object property containing one or more items in a dataset or data structure within a dataset.'},
  'doi': {'$id': '#doi',
   'type': ['string', 'null'],
   'title': 'Digital Object Identifier',
   'description': 'A digital object identifier for or associated with a record. May be in the form of an HTTP url or a standalone identifier.',
   'examples': ['http://dx.doi.org/10.2305/IUCN.UK.2004.RLTS.T59435A11941314.en',
    '10.2305/IUCN.UK.2004.RLTS.T59435A11941314.en']},
  'resolvable_identifier': {'$id': '#resolvable_identifier',
   'type': 'string',
   'title': 'Resolvable Identifier',
   'description': 'Some form of resolvable identifier for a record that returns a response when accessed over an included protocol such as HTTP. May or may not provide for content negotiation.',
   'examples': ['https://www.iucnredlist.org

# Interesting Facets
As a point of reference, establishing a linkage to a valid ITIS record let us find a number of additional matches to SGCN species. The following code blocks give a quick count of the SGCN matches found (a high number, as we would expect for work plan species) and of how many were found only because the ITIS information was included in the name lookup.

Looking at the overall results, 74 of the 387 names in the work plan species list are not found on any current state list of Species of Greatest Conservation Need.

In [10]:
# Count successful matches by the name source that produced them
Counter(
    spp["parameters"]["Name Source"]
    for spp in sgcn_results
    if spp["processing_metadata"]["status"] == "success"
)

Counter({'Lookup Name': 304, 'Valid ITIS Scientific Name': 9})

In [9]:
Counter(spp["processing_metadata"]["status"] for spp in sgcn_results)

Counter({'success': 313, 'failure': 74})
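
The two tallies above can be turned into rates for quick reporting. This is a small sketch that hard-codes the counts printed in this notebook rather than recomputing them from `sgcn_results`. The variable names are illustrative.

```python
from collections import Counter

# Counts copied from the notebook output above.
status_counts = Counter({"success": 313, "failure": 74})
source_counts = Counter({"Lookup Name": 304, "Valid ITIS Scientific Name": 9})

total = sum(status_counts.values())
match_rate = status_counts["success"] / total
# Share of successful matches that came from the ITIS-derived name.
itis_share = source_counts["Valid ITIS Scientific Name"] / status_counts["success"]

print(f"{total} names searched")
print(f"{match_rate:.1%} matched an SGCN National List record")
print(f"{itis_share:.1%} of matches came from the valid ITIS scientific name")
```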