The work plan species have a useful connection to the State Species of Greatest Conservation Need (SGCN). The USGS builds and maintains a synthesis of state SGCN lists, linking species names to taxonomic authorities (ITIS and WoRMS) to produce a synthesized National List for each decadal reporting period, with periodic updates in the intervening years. This notebook uses the sgcn module in the bispy package to search the National List SGCN API. It retrieves and caches the summarized National List records, which include the states that have each species in their conservation planning process.

In [1]:
import requests
import bispy
from IPython.display import display
from joblib import Parallel, delayed
from collections import Counter
import jsonschema

sgcn = bispy.sgcn.Search()
bis_utils = bispy.bis.Utils()

import helperfunctions

In [2]:
name_list = helperfunctions.workplan_species()

In [3]:
%%time
# Use joblib to run multiple requests for SGCN records in parallel via scientific names
sgcn_results = Parallel(n_jobs=8)(delayed(sgcn.search)(name, name_source) for name, name_source in name_list)


CPU times: user 1.03 s, sys: 115 ms, total: 1.14 s
Wall time: 39.2 s
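
The gap between CPU time (about 1 s) and wall time (about 39 s) suggests the cell spends most of its time waiting on the API rather than computing. As a rough illustration of the same fan-out pattern using only the standard library, the sketch below swaps joblib for `concurrent.futures.ThreadPoolExecutor`. `fake_search`, `search_pair`, and the sample `name_list` are hypothetical stand-ins for `sgcn.search` and `helperfunctions.workplan_species()`; the only thing taken from the notebook is that `name_list` yields `(name, name_source)` pairs.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_search(name, name_source):
    """Hypothetical stand-in for sgcn.search; the real method calls the SGCN API."""
    return {
        "parameters": {"Scientific Name": name, "Name Source": name_source},
        "processing_metadata": {"status": "success"},
    }


# Hypothetical sample; the notebook builds this with helperfunctions.workplan_species().
name_list = [
    ("Cicindela marginipennis", "Lookup Name"),
    ("Placeholder name", "Valid ITIS Scientific Name"),
]


def search_pair(pair):
    """Unpack one (name, name_source) pair and run the search."""
    name, name_source = pair
    return fake_search(name, name_source)


# Threads are a reasonable fit for I/O-bound requests; map() keeps input order.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(search_pair, name_list))

print(len(results))
```

Because `map()` returns results in input order, each result lines up with the entry in `name_list` that produced it.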


In [4]:
# Cache the array of retrieved documents and return/display a random sample for verification
display(bis_utils.doc_cache("../cache/sgcn.json", sgcn_results))

{'Doc Cache File': '../cache/sgcn.json',
 'Number of Documents in Cache': 386,
 'Document Number 244': {'processing_metadata': {'status': 'success',
   'date_processed': '2019-09-16T18:08:56.798238',
   'status_message': 'Name Match',
   'api': 'https://api.sciencebase.gov/bis-api/api/v1/swap/nationallist?scientificname=Cicindela marginipennis'},
  'parameters': {'Scientific Name': 'Cicindela marginipennis',
   'Name Source': 'Lookup Name'},
  'data': {'statelist_2005': 'Indiana,Massachusetts,New Hampshire,New York,Pennsylvania,Vermont,West Virginia',
   'statelist_2015': 'Maine,Massachusetts,New Hampshire,New Jersey,New York,Pennsylvania,Vermont,West Virginia',
   'scientificname': 'Cicindela marginipennis',
   'commonname': 'Cobblestone Tiger Beetle',
   'taxonomicgroup': 'Insects',
   'taxonomicrank': 'Species',
   'matchmethod': 'Exact Match',
   'acceptedauthorityurl': 'https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=697708'}}}
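
The `statelist_2005` and `statelist_2015` properties are comma-delimited strings, so comparing the two reporting periods takes a little parsing. The following is a minimal sketch using the values from the sample record above; `split_states` is a hypothetical helper, and it assumes both properties are present, which may not hold for every record.

```python
# Values copied from the sample record shown above (Cicindela marginipennis).
record_data = {
    "statelist_2005": "Indiana,Massachusetts,New Hampshire,New York,Pennsylvania,Vermont,West Virginia",
    "statelist_2015": "Maine,Massachusetts,New Hampshire,New Jersey,New York,Pennsylvania,Vermont,West Virginia",
}


def split_states(value):
    """Hypothetical helper: turn a comma-delimited string into a set of state names."""
    return {state.strip() for state in value.split(",") if state.strip()}


states_2005 = split_states(record_data["statelist_2005"])
states_2015 = split_states(record_data["statelist_2015"])

# Set differences show which states added or dropped the species between periods.
added = sorted(states_2015 - states_2005)
dropped = sorted(states_2005 - states_2015)

print("Added in 2015:", added)
print("Dropped since 2005:", dropped)
```

For this record, Maine and New Jersey appear only in the 2015 list and Indiana appears only in the 2005 list.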

# Schema Validation

In [10]:
sgcn_schema = helperfunctions.load_schema('sgcn')
display(sgcn_schema)

jsonschema.validate(sgcn_results, sgcn_schema)

{'definitions': {'items': {'$id': '#/items',
   'type': ['object', 'array'],
   'title': 'Generic container for items in a dataset',
   'description': 'A JSON array or object property containing one or more items in a dataset or data structure within a dataset.'}},
 '$schema': 'http://json-schema.org/draft-07/schema#',
 '$id': 'http://data.usgs.gov/property_registry/',
 'type': 'array',
 'title': 'SGCN Species on the FWS Listing Workplan',
 'description': 'This dataset contains the results of a search for Workplan Species that are on a US State Species of Greatest Conservation Need list. It uses a synthesis of SGCN species maintained by the USGS and provided via an API.',
 'items': {'$ref': '#/definitions/items',
  'required': ['processing_metadata'],
  'properties': {'processing_metadata': {'$ref': 'common_properties.json#/definitions/processing_metadata'},
   'parameters': {'$ref': 'common_properties.json#/definitions/parameters'},
   'data': {'$ref': 'common_properties.json#/definit

# Interesting Facets
As a point of reference, establishing a linkage to a valid ITIS record let us find a number of additional matches to SGCN species. The following code blocks give the number of SGCN matches found (a high proportion of workplan species are linked to state species of conservation need, as we would expect) and the number found only because the ITIS information was included in the name lookup.

Looking at the overall results, 70 names in the workplan species list (the 'failure' count in the last output below) are not found on any current state list of species of greatest conservation need.

In [11]:
# Count successful matches by the name source that produced them
Counter(r["parameters"]["Name Source"] for r in sgcn_results if r["processing_metadata"]["status"] == "success")

Counter({'Lookup Name': 301, 'Valid ITIS Scientific Name': 15})

In [12]:
# Tally the overall match status across all workplan species lookups
Counter(spp["processing_metadata"]["status"] for spp in sgcn_results)

Counter({'success': 316, 'failure': 70})
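
From the counts above, 316 of 386 lookups (about 82%) matched an SGCN record. To see which names did not match, one option is to pull the scientific names out of the failed records. The sketch below runs on a small hypothetical `sample_results` list shaped like `sgcn_results`; it assumes failed records still carry a `parameters` block with a `Scientific Name`, which the notebook output does not confirm.

```python
from collections import Counter

# Hypothetical records mimicking the structure of sgcn_results.
sample_results = [
    {
        "parameters": {"Scientific Name": "Cicindela marginipennis", "Name Source": "Lookup Name"},
        "processing_metadata": {"status": "success"},
    },
    {
        "parameters": {"Scientific Name": "Placeholder name A", "Name Source": "Lookup Name"},
        "processing_metadata": {"status": "failure"},
    },
    {
        "parameters": {"Scientific Name": "Placeholder name B", "Name Source": "Valid ITIS Scientific Name"},
        "processing_metadata": {"status": "success"},
    },
]

status_counts = Counter(r["processing_metadata"]["status"] for r in sample_results)

# Names with no SGCN match (assumes failed records keep their parameters).
unmatched = [
    r["parameters"]["Scientific Name"]
    for r in sample_results
    if r["processing_metadata"]["status"] == "failure"
]

match_rate = status_counts["success"] / len(sample_results)

print(unmatched)
print(f"Match rate: {match_rate:.1%}")
```

Swapping `sample_results` for `sgcn_results` would give the actual list of unmatched workplan names, provided the assumption about failed records holds.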