NatureServe Explorer holds a number of useful fields of information about sensitive species. The bispy package includes a module and set of functions for interacting with the NatureServe API. This notebook searches NatureServe by scientific name and caches the retrieved species documents for use in further evaluation. The NatureServe API is dated, and its XML response is awkward to work with, so the bispy function reformats each response into a Python dictionary that can be serialized as JSON for ease of processing.
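To illustrate the kind of XML-to-dictionary reformatting described above, here is a minimal sketch using only the standard library. It is not bispy's implementation. It assumes the convention visible in the cached documents, where XML attributes become `@`-prefixed keys and element text sits under `#text`. The XML fragment is hypothetical and ignores the namespaces a real response would likely carry.

```python
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert an XML element to a dict: attributes become '@name' keys,
    text next to attributes or children goes under '#text', and repeated
    child tags are collected into lists."""
    node = {f"@{key}": value for key, value in element.attrib.items()}
    for child in element:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tag: promote the existing value to a list
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (element.text or "").strip()
    if not node:
        # Leaf element without attributes: return the bare text
        return text or None
    if text:
        node["#text"] = text
    return node


# Hypothetical XML fragment, shaped like fields in the sample document
sample_xml = """
<species uid="ELEMENT_NATIONAL.2.243144" type="Plant">
  <jurisdictionNationName code="US">UNITED STATES</jurisdictionNationName>
  <roundedNationalConservationStatus>N1</roundedNationalConservationStatus>
</species>
"""

species = element_to_dict(ET.fromstring(sample_xml))
print(species)
```

Once a record is in this form, it can be written out with `json.dump` and queried with ordinary dictionary access.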


# Data Management Considerations
We currently work only with the public, open API for NatureServe Explorer. We found it more reliable than the services that require an access key, and it contains enough information to start working with what NatureServe has to offer. Private API routes that require an API key also exist. In either case, we comply with the posted [usage requirements](https://services.natureserve.org/idd/developer/license.jsp). Our team has discussed our particular use of NatureServe data with NatureServe staff and management and verified that we are using the information appropriately, including storing relevant "documents" associated with species in our lists, sharing selected information via our APIs, and displaying it in web apps. The main concern raised in those discussions was that we always include the date a NatureServe species record was last reviewed, so that users are aware when they may be looking at outdated information.
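One way to keep a date attached to every status we share is to pull both values from the cached document together. The sketch below assumes the `@lastChangedDate` attribute on `nationalConservationStatus` is a suitable date to surface. The actual "last reviewed" date may live in a different field of the record, and the helper names here are hypothetical.

```python
def status_with_date(doc):
    """Return (status, date) from a cached document, or None when the
    document has no national conservation status."""
    species = doc.get("NatureServe Species") or {}
    status = species.get("nationalConservationStatus")
    if not isinstance(status, dict):
        return None
    return status.get("#text"), status.get("@lastChangedDate")


def format_status(doc):
    """Build a display string that never shows a status without a date note."""
    result = status_with_date(doc)
    if result is None:
        return "No national conservation status available"
    status, changed = result
    return f"{status} (status last changed {changed or 'date unknown'})"


# Trimmed copy of fields from a cached document
sample_doc = {
    "NatureServe Species": {
        "nationalConservationStatus": {
            "@lastChangedDate": "1997-03-03",
            "#text": "N1",
        }
    }
}

print(format_status(sample_doc))
```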

In [1]:
from collections import Counter

import jsonschema
import requests
from IPython.display import display
from joblib import Parallel, delayed

import bispy
import helperfunctions

natureserve = bispy.natureserve.Natureserve()
bis_utils = bispy.bis.Utils()

In [2]:
name_list = helperfunctions.workplan_species()

In [4]:
%%time
# Use joblib to run the searches in parallel, one request per (name, name source) pair
natureserve_results = Parallel(n_jobs=8)(
    delayed(natureserve.search)(name, name_source)
    for name, name_source in name_list
)


CPU times: user 1.34 s, sys: 136 ms, total: 1.48 s
Wall time: 5min 49s
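The timing above shows about 1.5 s of CPU time against nearly six minutes of wall time, so the work is almost entirely waiting on the network. The sketch below is a standard-library alternative to the joblib call, not what this notebook runs. It uses a thread pool and wraps each search so that one failed request becomes an error record instead of aborting the batch. The `"error"` status value and the `fake_search` stand-in are assumptions for illustration and are not part of bispy.

```python
from concurrent.futures import ThreadPoolExecutor


def safe_search(search, name, name_source):
    """Call search(name, name_source); on failure return an error record
    instead of raising, so the rest of the batch still completes."""
    try:
        return search(name, name_source)
    except Exception as exc:  # broad on purpose: any failure becomes a record
        return {
            "processing_metadata": {"status": "error", "status_message": str(exc)},
            "parameters": {"Scientific Name": name, "Name Source": name_source},
        }


def search_all(search, names, max_workers=8):
    """Run the searches concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(safe_search, search, n, s) for n, s in names]
        return [future.result() for future in futures]


def fake_search(name, name_source):
    """Hypothetical stand-in for natureserve.search, for demonstration only."""
    if not name:
        raise ValueError("empty name")
    return {
        "processing_metadata": {"status": "success"},
        "parameters": {"Scientific Name": name, "Name Source": name_source},
    }


results = search_all(
    fake_search,
    [("Oncidium undulatum", "Lookup Name"), ("", "Lookup Name")],
)
print([r["processing_metadata"]["status"] for r in results])
```

In the notebook itself, the same wrapper could in principle be passed to `delayed` in place of `natureserve.search`.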


In [6]:
# Cache the array of retrieved documents and return/display a random sample for verification
display(bis_utils.doc_cache("../cache/natureserve.json", natureserve_results))


{'Doc Cache File': 'cache/natureserve.json',
 'Number of Documents in Cache': 387,
 'Document Number 174': {'processing_metadata': {'status': 'success',
   'date_processed': '2019-07-26T17:40:04.793861',
   'status_message': 'Single Match',
   'api': 'https://services.natureserve.org/idd/rest/v1/nationalSpecies/summary/nameSearch?nationCode=US&name=Oncidium undulatum'},
  'parameters': {'Scientific Name': 'Oncidium undulatum',
   'Name Source': 'Lookup Name'},
  'NatureServe Species': {'@uid': 'ELEMENT_NATIONAL.2.243144',
   '@type': 'Plant',
   'jurisdictionNationName': {'@code': 'US', '#text': 'UNITED STATES'},
   'nationalScientificName': {'unformattedName': 'Oncidium undulatum',
    'formattedName': '<i>Oncidium undulatum</i>',
    'nomenclaturalAuthor': '(Sw.) Salisb.'},
   'nationalConservationStatus': {'@lastChangedDate': '1997-03-03',
    '#text': 'N1'},
   'roundedNationalConservationStatus': 'N1',
   'natureServeGlobalConcept': {'@uid': 'ELEMENT_GLOBAL.2.147659',
    'classif
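
The caching step can be pictured with a small standard-library sketch: write the list of documents to a JSON file, then return a summary that includes one randomly chosen document for a visual check. This is an assumption about the general shape of `doc_cache` based on the output above, not bispy's actual code, and `cache_documents` is a hypothetical name.

```python
import json
import os
import random
import tempfile


def cache_documents(path, documents):
    """Write documents to a JSON file and return a summary that includes
    one random sample document (when any exist)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(documents, handle)
    summary = {
        "Doc Cache File": path,
        "Number of Documents in Cache": len(documents),
    }
    if documents:
        index = random.randrange(len(documents))
        summary[f"Document Number {index}"] = documents[index]
    return summary


documents = [
    {"parameters": {"Scientific Name": "Oncidium undulatum"}},
    {"parameters": {"Scientific Name": "Example species"}},  # made-up entry
]

with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "natureserve.json")
    summary = cache_documents(cache_path, documents)
    # Read the file back to confirm the round trip
    with open(cache_path, encoding="utf-8") as handle:
        reloaded = json.load(handle)

print(summary["Number of Documents in Cache"])
```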

In [8]:
natureserve_schema = helperfunctions.load_schema('natureserve')
display(natureserve_schema)

jsonschema.validate(natureserve_results, natureserve_schema)

{'definitions': {'items': {'$id': '#items',
   'type': ['object', 'array'],
   'title': 'Generic container for items in a dataset',
   'description': 'A JSON array or object property containing one or more items in a dataset or data structure within a dataset.'}},
 '$schema': 'http://json-schema.org/draft-07/schema#',
 '$id': 'http://data.usgs.gov/property_registry/',
 'type': 'array',
 'title': 'NatureServe Species Summary Collection',
 'description': 'A dataset containing the results of a search for NatureServe Species against their public API. At this point, the schema is essentially the raw results from the search with a little bit of processing metadata and transformation from XML to JSON. Additional details for the schema will be provided in later versions of the process.',
 'items': {'$ref': '#/definitions/items',
  'properties': {'processing_metadata': {'$ref': 'common_properties.json#/definitions/processing_metadata'},
   'parameters': {'$ref': 'common_properties.json#/definit

# Interesting Facets
In principle, the work conducted by TNC and NatureServe with the State Heritage Societies should mean that most of the species in the workplan list match a record in the NatureServe species database. That is what we find here: 356 of the 363 species are matched, 7 of them by using the valid ITIS scientific name. We also show the distribution of current national conservation status values across those species.

In [9]:
Counter(spp["NatureServe Species"]["roundedNationalConservationStatus"] for spp in natureserve_results)

Counter({'N3': 99,
         'N5': 7,
         'N2': 104,
         'N1': 153,
         'N4B': 2,
         'N4': 10,
         'N3B,N3N': 1,
         'NNRN': 1,
         'NNR': 8,
         'N4B,N4N': 1,
         'NU': 1})
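
The tally above indexes straight into `"NatureServe Species"`, which would raise a `KeyError` for any document that did not match a record. The sketch below is a defensive variant that counts such documents under a placeholder label instead. The `"No Match"` label and the sample documents are assumptions for illustration.

```python
from collections import Counter


def tally_status(documents):
    """Count rounded national status values; documents without a matched
    species are counted under 'No Match' instead of raising KeyError."""
    return Counter(
        (doc.get("NatureServe Species") or {}).get(
            "roundedNationalConservationStatus", "No Match"
        )
        for doc in documents
    )


# Made-up documents: two matches and one failed search
sample_docs = [
    {"NatureServe Species": {"roundedNationalConservationStatus": "N1"}},
    {"NatureServe Species": {"roundedNationalConservationStatus": "N1"}},
    {"processing_metadata": {"status": "failure"}},
]

status_counts = tally_status(sample_docs)
print(status_counts)
```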

In [12]:
# Count successful matches by the name source that produced them
Counter(r["parameters"]["Name Source"] for r in natureserve_results if r["processing_metadata"]["status"] == "success")

Counter({'Lookup Name': 349, 'Valid ITIS Scientific Name': 7})