To provide access to the new form of the USNVC, we are experimenting with a new high-level REST API that provides the functionality needed to query and interact with the classification in various ways. This notebook walks through some query patterns, currently in fairly raw form, so we can begin testing and refining toward what we want in the API.

Currently, the very basic API is built as a Flask app running on a somewhat unstable virtual cloud machine via C9.io, so use it at your own risk: it may not always be available, but I'll try to kick the tires every once in a while :-). I have only exposed very basic, raw query and field filtering capability via the "q" and "fields" parameters at this point (see examples below). We will refine this to limit how the system can be queried and to make it simpler for non-expert users. That's the feedback I am looking for at this point: which fields in the document structure to query, how to structure queries, and so on.

The Python code below simply uses the requests package to issue HTTP requests to the REST API and a display helper to make the output more legible. These same URLs can be executed in any browser or other code platform. The results are only available in the raw JSON document format coming back from the database at this point. We will use the API to drive a replacement for the web app at usnvc.org, but we should also examine whether or not the API itself needs to return the data in any other format.

In [1]:
import requests
from IPython.display import display

# Instructions

I'm fiddling with how to have the API be essentially self-documenting and "followable" with a set of access points and instructions for access.

In [2]:
display(requests.get("https://bis-skybristol.c9users.io/bis/api/v0.1/usnvc").json())

{'units': {'Description': 'Retrieve USNVC unit data',
  'Identifiers': 'Include either the element_global_id or the UUID (USGS-assigned ID) after units/ to retrieve a single unit',
  'Search Parameters': [{'q': {'Description': 'Must be supplied as a valid JSON string that executes a find operation against the database in raw form.',
     'Link': 'https://docs.mongodb.com/manual/tutorial/query-documents/'}},
   {'fields': {'Description': 'Must be supplied as a valid JSON string that sets the fields to either return or suppress in the output.',
     'Link': 'https://docs.mongodb.com/manual/tutorial/project-fields-from-query-results/'}}]}}

# All Units

A simple request to the /units endpoint returns a set of unfiltered unit records, limited to 10 by default. The limit parameter can be set to a higher number if desired, but be prepared for some load time if you ask for too many.

In [3]:
display(requests.get("https://bis-skybristol.c9users.io/bis/api/v0.1/usnvc/units").json())

{'nextlink': {'rel': 'next',
  'url': 'https://bis-skybristol.c9users.io/bis/api/v0.1/usnvc/units?q=%7B%7D&skip=10&limit=10'},
 'total': 8246,
 'units': [{'Authorship': {'Concept Author': 'Hierarchy Revisions Working Group, Federal Geographic Data Committee (Faber-Langendoen et al. 2014)',
    'Description Author': 'Hierarchy Revisions Working Group',
    'Version Date': '8/2/2016'},
   'Concept History': {},
   'Confidence Level': {'Confidence Level': 'Moderate'},
   'Conservation Status': {'Global Rank': 'GNR',
    'Global Rank Review Date': '3/3/2011'},
   'Distribution': {'Geographic Range': 'Climate zones? Bailey (1989) Domains: Dry, Humid Tropical, Humid Temperate Domain, Subarctic Division of Polar Domain, and Mountain Divisions of Dry Domain. Less common in other divisions of Polar or Dry domains.'},
   'Environment': {'Environmental Description': '<i>Climate:</i> Climates range from humid tropical to boreal and subalpine, with fairly moderate moisture and temperature condition
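
The nextlink entry in the response gives the URL for the next page of results, so walking through the full set of records is a matter of following those links until they run out. Here is a minimal sketch of that loop. The `iter_units` name, the `fetch` callable, and the `max_pages` cap are all hypothetical, and the sketch assumes the last page simply leaves out nextlink (the small result set in the regex example further down comes back without one, which fits that assumption but doesn't prove it).

```python
def iter_units(start_url, fetch, max_pages=None):
    """Yield unit records one at a time by following 'nextlink' URLs.

    fetch is any callable that takes a URL and returns the parsed JSON
    response as a dict, e.g. lambda url: requests.get(url).json().
    """
    url = start_url
    pages = 0
    while url and (max_pages is None or pages < max_pages):
        page = fetch(url)
        for unit in page.get("units", []):
            yield unit
        # Assumption: a response with no 'nextlink' is the final page.
        url = page.get("nextlink", {}).get("url")
        pages += 1
```

With the requests import from above, something like `iter_units(url, lambda u: requests.get(u).json(), max_pages=3)` would stop after three pages, which is a sensible guard given the load time on larger requests.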

# Single unit by URI

You'll see in the unit data that there is a field for "uri." This is dynamically generated using the REST API path and a new type of universal ID we built into the load process. This ID helps us maintain absolute integrity of the data system on our end. We will likely use it internally for reference within our applications, but I also built the API to respond to a request for a single unit by element_global_id, the unique identifier provided by the NatureServe Biotics data dump. Either the internal ID or the element_global_id can be added to the end of the /units/ endpoint to retrieve a single record.

In [4]:
display(requests.get("https://bis-skybristol.c9users.io/bis/api/v0.1/usnvc/units/860211").json())

{'unit': {'Authorship': {'Concept Author': 'Hierarchy Revisions Working Group, Federal Geographic Data Committee (Faber-Langendoen et al. 2012)',
   'Description Author': 'Hierarchy Revisions Working Group',
   'Version Date': '10/17/2014'},
  'Concept History': {},
  'Confidence Level': {'Confidence Level': 'Moderate'},
  'Conservation Status': {'Global Rank': 'GNR',
   'Global Rank Review Date': '3/3/2011'},
  'Distribution': {'Geographic Range': 'In non-tropical regions, this type is most common in the Bailey (1989) steppe divisions of the Dry Domain, the subarctic divisions of the Polar Domain, and is less common in other divisions of Polar or Dry domains. In the tropics, this type is uncommon in Humid Tropical and Humid Temperate domains, but common in Semi-humid Tropical domains (savannas).'},
  'Environment': {'Environmental Description': '<i>Climate:</i> Shrublands and grasslands occur in the following Trewartha Climatic zones: Aw = Tropical wet-dry; Am = Tropical wet-dry ("mon
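
Since each unit comes back as a nested document, pulling a single value out means stepping down through a couple of levels, and not every unit necessarily has every section filled in. The following is a rough sketch of a helper for that. The `get_path` function is hypothetical (it is not part of the API), and it assumes field names never contain a period. The sample document is a cut-down copy of the response above.

```python
def get_path(document, dotted_path, default=None):
    """Walk a nested dict using a period-separated path of keys."""
    current = document
    for key in dotted_path.split("."):
        # Stop and return the default as soon as a level is missing.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# A few fields copied from the single-unit response shown above.
sample = {
    "unit": {
        "Confidence Level": {"Confidence Level": "Moderate"},
        "Conservation Status": {
            "Global Rank": "GNR",
            "Global Rank Review Date": "3/3/2011",
        },
    }
}

rank = get_path(sample, "unit.Conservation Status.Global Rank")
nations = get_path(sample, "unit.Distribution.Nations", default={})
```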

# Queries

Queries are currently very raw but totally wide open if you know how to put together the syntax. The data sit in a MongoDB database and are accessed via pymongo, a low-level API for working with the data. The q parameter requires a valid JSON construct that tells the API where to query and what to look for using the find method from pymongo. The following query looks for units at the Formation hierarchy level by explicitly targeting that part of the document, which looks something like this:
```
{
    "Hierarchy":
    {
        "hierarchylevel":"Formation"
    }
}
```
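
To reach a nested field like this one, the query uses MongoDB dot notation ("Hierarchy.hierarchylevel") rather than repeating the nested structure, because querying on the embedded document itself only matches when the whole sub-document is exactly equal. Hand-encoding the quotes as %22 gets tedious, so here is a small sketch that builds the same kind of URL from a Python dict. The `build_units_url` helper and the `BASE_URL` constant are hypothetical names for illustration. The resulting URL is percent-encoded more heavily than a hand-written one, but the q value decodes to the same JSON.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://bis-skybristol.c9users.io/bis/api/v0.1/usnvc/units"


def build_units_url(query, limit=None):
    """Serialize a query dict to JSON and attach it as the q parameter."""
    params = {"q": json.dumps(query)}
    if limit is not None:
        params["limit"] = limit
    # urlencode takes care of escaping quotes, braces, and spaces.
    return BASE_URL + "?" + urlencode(params)


url = build_units_url({"Hierarchy.hierarchylevel": "Formation"}, limit=2)
```

The resulting `url` can be passed straight to `requests.get` in place of a hand-written string.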

In [5]:
display(requests.get("https://bis-skybristol.c9users.io/bis/api/v0.1/usnvc/units?q={%22Hierarchy.hierarchylevel%22:%22Formation%22}&limit=2").json())

{'nextlink': {'rel': 'next',
  'url': 'https://bis-skybristol.c9users.io/bis/api/v0.1/usnvc/units?q=%7Bu%27Hierarchy.hierarchylevel%27%3A+u%27Formation%27%7D&skip=2&limit=2'},
 'total': 36,
 'units': [{'Authorship': {'Concept Author': 'Hierarchy Revisions Working Group, Federal Geographic Data Committee (Faber-Langendoen et al. 2014)',
    'Description Author': 'D. Faber-Langendoen',
    'Version Date': '10/17/2014'},
   'Concept History': {},
   'Confidence Level': {'Confidence Level': 'Moderate'},
   'Conservation Status': {'Global Rank': 'GNR',
    'Global Rank Review Date': '3/3/2011'},
   'Distribution': {'Geographic Range': 'Warm Temperate Forest &amp; Woodland is found in the Mediterranean Basin and Mediterranean and warm-temperate regions in North America (California, Southeastern Coastal Plain), Chile, South Africa, Australia, India and Southeast Asia.',
    'Nations': {'Nation Info': [{'Abbreviation': 'CA',
       'Info API': 'https://restcountries.eu/rest/v2/alpha/CA',
     

More complex queries within the data are a little clunky at this point because we have not introduced full-text indexing, which we will probably do by adding Elasticsearch to the architecture and driving most or all of the API from that form of the data. The following example looks for the term "conifer forests" within the Physiognomy and Structure text using a simple regular expression query, which is currently the only way to match something other than the full value of a field.

This query also introduces the "fields" parameter to limit the returned parts of the documents for faster processing. The fields parameter also requires a JSON construct, in which you specify either the fields you want with a value of 1 or the fields you want to exclude with a value of 0. Here, I ask for both the Overview and Vegetation parts of the documents.

In [6]:
display(requests.get("https://bis-skybristol.c9users.io/bis/api/v0.1/usnvc/units?q={%22Vegetation.Physiognomy%20and%20Structure%22:{%22$regex%22:%22conifer%20forests%22}}&fields={%22Overview%22:1,%22Vegetation%22:1}").json())

{'total': 4,
 'units': [{'Overview': {'Classification Comments': 'There are a number of classification issues pertaining to groups within this macrogroup, which may eventually result in conceptual changes to either this macrogroup, or to other related macrogroups. Below are some of the relevant comments. In addition, this description certainly needs review and additions for interior British Columbia or southern Alberta characteristics.<br /><br />How to treat <i>Pinus flexilis</i> in the Rocky Mountains is still somewhat uncertain. For now, there are three groups which have limber pine as a component. The limber pine group included in this macrogroup is composed predominantly of limber pine or juniper that is elevationally below the zone of continuous lower montane forests found in the main Rocky Mountain cordillera. The associations placed in this group are restricted to foothill settings on rock outcrops, or to escarpments in the Great Plains, and in Montana, these are limestone outc
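
One thing to keep in mind with the regular expression query above is that MongoDB regex matching is case sensitive unless the "i" option is supplied. Here is a minimal sketch of building the q and fields values for a case-insensitive version of the same search. The `regex_search_params` helper is a hypothetical name, and I'm assuming the API passes `$options` through to the database untouched, which should hold as long as q is executed in raw form.

```python
import json


def regex_search_params(field, pattern, include=(), ignore_case=False):
    """Build the q and fields parameters for a regex search on one field."""
    condition = {"$regex": pattern}
    if ignore_case:
        # "i" is MongoDB's case-insensitive matching option.
        condition["$options"] = "i"
    params = {"q": json.dumps({field: condition})}
    if include:
        # A value of 1 asks for that part of the document to be returned.
        params["fields"] = json.dumps({name: 1 for name in include})
    return params


params = regex_search_params(
    "Vegetation.Physiognomy and Structure",
    "conifer forests",
    include=["Overview", "Vegetation"],
    ignore_case=True,
)
```

Passing the result as `requests.get(units_url, params=params)`, where `units_url` is the /units address used above, lets requests handle the URL encoding.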

That's pretty much the extent of the API at this point. From here, we need to work on the following:

* Define the actual query patterns we want to support through the high-level API and build those in as direct query parameters. For example, we can abstract a "name" search to work across multiple fields containing names. We can add multiple fields to a more open text search or group fields together into higher-level search terms.
* Optimize searches on distribution information such that we accept a variety of input parameters such as state names, state abbreviations, FIPS codes, and other ways of designating a geographic range of interest.
* Work out an optimal method of returning the USNVC hierarchy for browsing and treeview visualization. This will be its own type of API call. We may also revisit the individual unit hierarchy to put that into an actual tree structure within the documents if that makes better sense.