For the species identified as having GAP habitat maps available, this notebook adds a view of GAP habitat metrics to the cache of information. These metrics may help determine where else a species has range within states. The process uses three inputs: the 2018 US state boundaries from the Census Bureau, the GAP range bounding box for each species (generated previously when caching basic GAP species data from the GAP range services), and a set of pre-calculated GAP metrics that report acres within specific protection status designations in each state. For each species, it first finds the states that intersect the range bounding box, retrieves the full set of state metrics for all GAP species, and then filters the results to the individual species (the API does not currently offer a query parameter for this). The results are then cached for further reference.
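
The steps described above can be sketched in plain Python. This is only an illustration of the idea, not the bispy implementation: the helper names `bbox_intersects` and `species_metrics_for_states`, the `(minx, miny, maxx, maxy)` bounding-box layout, and all sample coordinates and records (including the species code `aXXXXx`) are hypothetical. The sketch also compares against state bounding boxes, whereas the notebook works with the Census state geometries.

```python
def bbox_intersects(a, b):
    """Return True when two (minx, miny, maxx, maxy) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def species_metrics_for_states(species_code, range_bbox, state_bboxes, metrics_by_state):
    # Step 1: find states whose (hypothetical) bounding box intersects the range box
    candidate_states = [
        name for name, bbox in state_bboxes.items()
        if bbox_intersects(range_bbox, bbox)
    ]
    # Steps 2 and 3: take every species record for those states,
    # then keep only the records for the species of interest
    state_metrics = [
        record
        for state in candidate_states
        for record in metrics_by_state.get(state, [])
        if record["sppcode"] == species_code
    ]
    return {"GAP_SpeciesCode": species_code, "State Metrics": state_metrics}


# Made-up, approximate boxes and records for illustration only
state_bboxes = {
    "Florida": (-87.6, 24.5, -80.0, 31.0),
    "Texas": (-106.6, 25.8, -93.5, 36.5),
}
metrics_by_state = {
    "Florida": [
        {"sppcode": "rRRCSx", "state_name": "Florida"},
        {"sppcode": "aXXXXx", "state_name": "Florida"},
    ],
    "Texas": [{"sppcode": "aSMSAx", "state_name": "Texas"}],
}

result = species_metrics_for_states(
    "rRRCSx", (-81.0, 25.0, -80.1, 26.0), state_bboxes, metrics_by_state
)
```

Filtering after retrieval mirrors the constraint noted above: without a species filter on the service side, the client has to discard the records it does not need.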

In [9]:
import json

import bispy
import geopandas as gpd
import jsonschema
import requests
from IPython.display import display
from joblib import Parallel, delayed

import helperfunctions

gap = bispy.gap.Gap()
bis_utils = bispy.bis.Utils()

In [2]:
us_states = gpd.read_file("https://www2.census.gov/geo/tiger/TIGER2018/STATE/tl_2018_us_state.zip")

In [3]:
# Open the file back up and verify
with open("../cache/gap.json", "r") as f:
    gap_cache = json.load(f)

In [4]:
# Tease out the unique GAP species that we were able to find in the entire set of searched names
gap_species = [
    json.loads(s)
    for s in {json.dumps(r) for r in gap_cache if "GAP Species" in r}
]
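
Dictionaries are not hashable, so they cannot be placed in a set directly. Serializing each record to a JSON string first is one way around that. The sketch below uses made-up records to show the pattern and one caveat: two records with the same content but a different key order serialize to different strings unless `sort_keys=True` is passed.

```python
import json

# Hypothetical records for illustration only
records = [
    {"name": "Eurycea nana", "GAP Species": {"GAP_SpeciesCode": "aSMSAx"}},
    # Same content as the first record, but keys inserted in a different order
    {"GAP Species": {"GAP_SpeciesCode": "aSMSAx"}, "name": "Eurycea nana"},
    {"name": "No match"},
]

# Keep only records that carry a "GAP Species" entry
with_gap = [r for r in records if "GAP Species" in r]

# Without sort_keys, differing key order yields distinct strings
unique_plain = {json.dumps(r) for r in with_gap}

# With sort_keys, equivalent records collapse to a single string
unique_sorted = {json.dumps(r, sort_keys=True) for r in with_gap}

# Turn the unique strings back into dictionaries
unique_records = [json.loads(s) for s in unique_sorted]
```

If the cached records were all written by the same code path, key order is likely consistent and either form behaves the same; `sort_keys=True` simply removes that assumption.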

In [5]:
%%time
# Use joblib to request GAP state metrics for each species in parallel,
# passing the state boundaries, GAP species code, and range bounding box
gap_metrics = Parallel(n_jobs=8)(
    delayed(gap.gap_metrics_species)(
        us_states,
        spp["GAP Species"]["GAP_SpeciesCode"],
        spp["GAP Species"]["Range Bounding Box"],
    )
    for spp in gap_species
)


CPU times: user 3.87 s, sys: 1.65 s, total: 5.52 s
Wall time: 2min 31s
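
Since each call retrieves metrics from an API, the work is likely dominated by waiting on network responses rather than computation. As a rough illustration of the same fan-out pattern, the sketch below swaps joblib for the standard library's `concurrent.futures` thread pool. The `fetch_metrics` function is a hypothetical stand-in that sleeps instead of calling the service; it is not part of bispy.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_metrics(species_code):
    # Hypothetical stand-in for a network request: sleep briefly, return a stub
    time.sleep(0.05)
    return {"GAP_SpeciesCode": species_code, "State Metrics": []}


species_codes = ["rRRCSx", "aNRWAx", "aSMSAx"]

# Run up to 8 requests at a time; map() returns results in input order
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(fetch_metrics, species_codes))
```

Threads suit workloads that mostly wait on I/O. joblib's default process-based workers, as used in the notebook, also work for this case at the cost of some startup and serialization overhead.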


In [11]:
json.dumps(gap_metrics[:2])

'[{"GAP_SpeciesCode": "rRRCSx", "State Metrics": [{"state_fipscode": "12", "state_name": "Florida", "taxa": "R", "sppcode": "rRRCSx", "spp_comname": "Rim Rock Crowned Snake", "spp_sciname": "Tantilla oolitica", "gapstat1ac": 22639.3502175342, "gapstat2ac": 16029.5530917033, "gapstat3ac": 86915.4629228064, "gapstat4ac": 123776.9633437956, "gapstat12ac": 38668.9033092375, "gapstat123ac": 125584.3662320439, "totalac": 249361.3295758395, "gapstat1perc": 9.07893387320458, "gapstat2perc": 6.42824335231504, "gapstat3perc": 34.8552291851541, "gapstat4perc": 49.6375935893262, "gapstat12perc": 15.5071772255196, "gapstat123perc": 50.3624064106738, "gapstat12group": "10-17", "gapstat123group": ">50", "id": 6180}]}, {"GAP_SpeciesCode": "aNRWAx", "State Metrics": [{"state_fipscode": "37", "state_name": "North Carolina", "taxa": "A", "sppcode": "aNRWAx", "spp_comname": "Neuse River Waterdog", "spp_sciname": "Necturus lewisi", "gapstat1ac": 17.1244029033, "gapstat2ac": 1105.302369213, "gapstat3ac": 71

In [7]:
# Cache the array of retrieved documents and return/display a random sample for verification
display(bis_utils.doc_cache("../cache/gap_metrics.json", gap_metrics))

{'Doc Cache File': 'cache/gap_metrics.json',
 'Number of Documents in Cache': 63,
 'Document Number 47': {'GAP_SpeciesCode': 'aSMSAx',
  'State Metrics': [{'state_fipscode': '48',
    'state_name': 'Texas',
    'taxa': 'A',
    'sppcode': 'aSMSAx',
    'spp_comname': 'San Marcos Salamander',
    'spp_sciname': 'Eurycea nana',
    'gapstat1ac': 0.0,
    'gapstat2ac': 142.1103046131,
    'gapstat3ac': 344.712006495,
    'gapstat4ac': 14308.2170076573,
    'gapstat12ac': 142.1103046131,
    'gapstat123ac': 486.8223111081,
    'totalac': 14795.0393187654,
    'gapstat1perc': 0.0,
    'gapstat2perc': 0.960526711360972,
    'gapstat3perc': 2.32991612301957,
    'gapstat4perc': 96.7095571656195,
    'gapstat12perc': 0.960526711360972,
    'gapstat123perc': 3.29044283438054,
    'gapstat12group': '<1',
    'gapstat123group': '1-10',
    'id': 20161}]}}
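
The sample document above suggests how the metric fields relate: the combined-status acreages appear to be sums of the individual status acreages, and the percentages appear to be relative to `totalac`. These relationships are inferred from the values shown here, not taken from GAP documentation, so the following is only a sanity-check sketch using the San Marcos Salamander record.

```python
import math

# Values copied from the sample document above
m = {
    "gapstat1ac": 0.0,
    "gapstat2ac": 142.1103046131,
    "gapstat3ac": 344.712006495,
    "gapstat4ac": 14308.2170076573,
    "gapstat12ac": 142.1103046131,
    "gapstat123ac": 486.8223111081,
    "totalac": 14795.0393187654,
    "gapstat2perc": 0.960526711360972,
    "gapstat123perc": 3.29044283438054,
}


def close(a, b):
    # Allow for rounding in the published values
    return math.isclose(a, b, rel_tol=1e-6)


# Assumed relationships between fields (inferred from the sample, not documented)
checks = {
    "status 1+2 acres": close(m["gapstat1ac"] + m["gapstat2ac"], m["gapstat12ac"]),
    "status 1+2+3 acres": close(m["gapstat12ac"] + m["gapstat3ac"], m["gapstat123ac"]),
    "total acres": close(m["gapstat123ac"] + m["gapstat4ac"], m["totalac"]),
    "status 2 percent": close(100 * m["gapstat2ac"] / m["totalac"], m["gapstat2perc"]),
    "status 1-3 percent": close(100 * m["gapstat123ac"] / m["totalac"], m["gapstat123perc"]),
}
```

Running the same checks across every cached document would be one way to spot records where the service returned incomplete or inconsistent values.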

# Schema Validation
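
The validation in this section relies on `jsonschema` and a schema loaded through `helperfunctions.load_schema`. In JSON Schema, the `required` keyword constrains objects only, and whether the schema also applies it to each item of the array is not visible in the truncated output here. As a complementary, stdlib-only illustration, the sketch below checks each cached document for the two required keys named in the schema. The `missing_required` helper and the sample documents are hypothetical.

```python
# Keys listed under 'required' in the schema output
REQUIRED_KEYS = ("GAP_SpeciesCode", "State Metrics")


def missing_required(documents, required=REQUIRED_KEYS):
    """Map document index to the list of required keys that document lacks."""
    problems = {}
    for i, doc in enumerate(documents):
        # A non-dict entry is treated as missing every required key
        missing = [k for k in required if not isinstance(doc, dict) or k not in doc]
        if missing:
            problems[i] = missing
    return problems


# Made-up documents: the second one lacks "State Metrics"
sample_docs = [
    {"GAP_SpeciesCode": "aSMSAx", "State Metrics": []},
    {"GAP_SpeciesCode": "aNRWAx"},
]
problems = missing_required(sample_docs)
```

This is not a substitute for schema validation; it only covers key presence and ignores types and nested structure.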

In [12]:
gap_metrics_schema = helperfunctions.load_schema('gap_metrics')
display(gap_metrics_schema)

jsonschema.validate(gap_metrics, gap_metrics_schema)

{'definitions': {'items': {'$id': '#items',
   'type': ['object', 'array'],
   'title': 'Generic container for items in a dataset',
   'description': 'A JSON array or object property containing one or more items in a dataset or data structure within a dataset.'}},
 '$schema': 'http://json-schema.org/draft-07/schema#',
 '$id': 'http://data.usgs.gov/property_registry/',
 'type': 'array',
 'title': 'Cache of GAP Metrics for US States',
 'description': 'This dataset contains a set of pre-calculated protection status metrics pulled from an API for a collection of GAP species. Metrics are for US states, showing the calculated acres and percentage land cover with from the 3-meter GAP species habitat maps.',
 'required': ['GAP_SpeciesCode', 'State Metrics'],
 'properties': {'GAP_SpeciesCode': {'$id': '#/properties/GAP_SpeciesCode',
   'type': 'string',
   'title': 'GAP Species Code',
   'description': 'The unique, persistent identifier for a GAP species.',
   'examples': ['aNRWAx']},
  'State 