## Using the BISON API
The USGS provides an API for accessing species observation data: https://bison.usgs.gov/doc/api.jsp

This API is much better documented than the NWIS API, and we'll use it to dig a bit deeper into how the `requests` package can facilitate data access via APIs.

* We'll begin by replicating the example API call they show on their web page:<br> 
[https://bison.usgs.gov/api/search.json?species=Bison bison&type=scientific_name&start=0&count=1](
https://bison.usgs.gov/api/search.json?species=Bison%20bison&type=scientific_name&start=0&count=1)

In [1]:
#First, import the wonderful requests module
import requests

* Now, we'll deconstruct the example URL into the service URL and its parameters, saving the parameters as a dictionary. Note we are providing just a few of the parameters available through the [API](https://bison.usgs.gov/doc/api.jsp#opensearch). We could add more search criteria if we wanted, but for now we just want to grab the first 500 Bison records, so we set `count` to 500 rather than the 1 used in the example URL.

In [2]:
# Split the request into two components: the service URL and the request parameters
url = 'http://bison.usgs.gov/api/search.json'
params = {'species':'Bison bison',
          'type':'scientific_name',
          'start':'0',
          'count':'500'
         }

* With the components set as variables, we use the `requests.get()` function to send our request off to the server at the address provided, storing the server's response as a variable called `response`.

In [3]:
#Send the request to the server and store the response as a variable
response = requests.get(url, params=params)
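
To see how the two components combine into a single request URL, the sketch below builds the query string with the standard library's `urllib.parse.urlencode`. This is an illustration, not the internals of `requests`; the exact escaping that `requests` applies (for example, to the space in `Bison bison`) may differ in detail.

```python
from urllib.parse import urlencode

url = 'http://bison.usgs.gov/api/search.json'
params = {'species': 'Bison bison',
          'type': 'scientific_name',
          'start': '0',
          'count': '500'}

# Encode the dictionary as a query string and append it to the service URL
full_url = url + '?' + urlencode(params)
print(full_url)
```

After a real request, `response.url` holds the URL that was actually sent, which is handy for checking that the parameters were passed as intended.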

* This response object contains a number of properties and methods. Let's have a look at the response in raw text format.

In [4]:
#View the response in text format
print(response.text)

{"occurrences":{"legend":{"fossil":1354,"observation":2247,"centroid":1,"specimen":1017}},"total":4618,"searchTime":184,"offset":0,"data":[{"geo":"Yes","provider":"iNaturalist.org","name":"Bison bison","decimalLongitude":"-112.97224426269531","decimalLatitude":"53.55277633666992","occurrenceID":"1453380369","common_name":"bison, American bison, American Bison, Bisonte americano","basis":"Observation"},{"geo":"Yes","provider":"iNaturalist.org","name":"Bison bison","decimalLongitude":"-110.76473236083984","decimalLatitude":"43.64556884765625","occurrenceID":"1024331790","common_name":"bison, American bison, American Bison, Bisonte americano","basis":"Observation"},{"geo":"Yes","provider":"Museum of Comparative Zoology, Harvard University","name":"Bison bison","decimalLongitude":"-84.72786712646484","decimalLatitude":"38.96986389160156","occurrenceID":"2461345968","common_name":"bison, American bison, American Bison, Bisonte americano","basis":"Fossil"},{"geo":"Yes","provider":"iNaturalis

**Yikes**, that's much less readable than the NWIS output!

Well, that's because the response from the BISON server is in **JSON** format. JSON, short for *JavaScript Object Notation*, is a text format that stores information in `key`:`value` pairs, *much like a Python dictionary*. The response is still raw text, but the `json()` method of the `requests` response object converts it into a Python dictionary.

In [5]:
#Convert the JSON response to a Python dictionary
data = response.json()
type(data)

dict
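
The same conversion can be tried without a network connection. The sketch below uses the standard library's `json` module on a shortened copy of the response text shown above (first record only, with `common_name` left out for brevity). `json.loads` does the text-to-dictionary step, and `json.dumps` with `indent` prints the result in a more readable layout.

```python
import json

# A shortened copy of the response text shown above (first record only)
sample_text = ('{"total":4618,"offset":0,"data":[{"geo":"Yes",'
               '"provider":"iNaturalist.org","name":"Bison bison",'
               '"decimalLongitude":"-112.97224426269531",'
               '"decimalLatitude":"53.55277633666992",'
               '"occurrenceID":"1453380369","basis":"Observation"}]}')

# json.loads turns JSON text into Python dictionaries and lists
sample = json.loads(sample_text)
print(type(sample))

# json.dumps with indent re-serializes it in a readable layout
print(json.dumps(sample, indent=2))
```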

* OK, if it's a dictionary, what are its keys?

In [6]:
#List the keys in the returned JSON object
data.keys()

dict_keys(['occurrences', 'total', 'searchTime', 'offset', 'data', 'species', 'eezs', 'itemsPerPage', 'counties', 'type', 'georeferenced', 'states'])
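
The `total`, `offset`, and `itemsPerPage` keys suggest the results are paged. The sketch below only builds the parameter dictionaries for successive pages and sends nothing. It assumes that `start` is a zero-based record offset and uses the `total` of 4618 reported in the response above; neither assumption is confirmed here, so check the API documentation before relying on it.

```python
total = 4618      # value of the 'total' key in the response shown earlier
page_size = 500   # the 'count' value used in our request

# One parameter dictionary per page, stepping the assumed offset by page_size
page_params = [{'species': 'Bison bison',
                'type': 'scientific_name',
                'start': str(start),
                'count': str(page_size)}
               for start in range(0, total, page_size)]

print(len(page_params), 'requests would be needed')
print(page_params[-1]['start'])
```

Each dictionary could then be passed to `requests.get(url, params=...)` in a loop to collect all the records.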

* What are the values linked with the 'data' key?

In [7]:
#Show the value associated with the `data` key
data['data']

[{'geo': 'Yes',
  'provider': 'iNaturalist.org',
  'name': 'Bison bison',
  'decimalLongitude': '-112.97224426269531',
  'decimalLatitude': '53.55277633666992',
  'occurrenceID': '1453380369',
  'common_name': 'bison, American bison, American Bison, Bisonte americano',
  'basis': 'Observation'},
 {'geo': 'Yes',
  'provider': 'iNaturalist.org',
  'name': 'Bison bison',
  'decimalLongitude': '-110.76473236083984',
  'decimalLatitude': '43.64556884765625',
  'occurrenceID': '1024331790',
  'common_name': 'bison, American bison, American Bison, Bisonte americano',
  'basis': 'Observation'},
 {'geo': 'Yes',
  'provider': 'Museum of Comparative Zoology, Harvard University',
  'name': 'Bison bison',
  'decimalLongitude': '-84.72786712646484',
  'decimalLatitude': '38.96986389160156',
  'occurrenceID': '2461345968',
  'common_name': 'bison, American bison, American Bison, Bisonte americano',
  'basis': 'Fossil'},
 {'geo': 'Yes',
  'provider': 'iNaturalist.org',
  'name': 'Bison bison',
  'deci

* Oh, it's a list of occurrences! Let's examine the first one...

In [8]:
#Display the first "data" value
data['data'][0]

{'geo': 'Yes',
 'provider': 'iNaturalist.org',
 'name': 'Bison bison',
 'decimalLongitude': '-112.97224426269531',
 'decimalLatitude': '53.55277633666992',
 'occurrenceID': '1453380369',
 'common_name': 'bison, American bison, American Bison, Bisonte americano',
 'basis': 'Observation'}

* We see it's a dictionary too! Let's list the `decimalLatitude` item value...

In [9]:
#We can get the latitude of the record from its `decimalLatitude` key
data['data'][0]['decimalLatitude']

'53.55277633666992'

► **So** we see the Bison observations are stored as a list of dictionaries, which is accessed through the `data` key of the results dictionary generated from the JSON response to our API request. (Phew!)

* With a bit more code we can loop through all the data records and print out the lat and long coordinates...

In [10]:
#Loop through each observation and print the lat and long values
for observation in data['data']:
    if observation['geo'] == 'Yes':
        print(observation['decimalLatitude'], observation['decimalLongitude'])

53.55277633666992 -112.97224426269531
43.64556884765625 -110.76473236083984
38.96986389160156 -84.72786712646484
43.714542388916016 -103.40538024902344
38.33570098876953 -111.41018676757812
44.643890380859375 -110.45471954345703
64.08333587646484 -145.6666717529297
46.11265182495117 -101.04034423828125
44.91789245605469 -110.6800537109375
34.731868743896484 -98.71973419189453
38.911399841308594 -101.59110260009766
38.911399841308594 -101.59110260009766
33.68899917602539 -101.99800109863281
33.68899917602539 -101.99800109863281
44.4246711730957 -110.9105453491211
44.64444351196289 -110.92009735107422
40.859981536865234 -112.13763427734375
44.433589935302734 -110.9223403930664
38.96986389160156 -84.72786712646484
44.905643463134766 -110.14917755126953
50.8294792175293 -100.19963073730469
53.59470748901367 -112.8346939086914
50.83022689819336 -100.2008056640625
59.795589447021484 -127.46961975097656
50.767276763916016 -100.2241439819336
59.346961975097656 -125.9528579711914
50.80156326293

<details>
    <summary>
► If the above throws an error, can you debug it? HINT: the `geo` tag indicates whether coordinate info exists for the record...
    </summary>
    <pre><code>
#Loop through each observation and print the lat and long values
for observation in data['data']:
    if observation['geo'] == 'Yes':
        print(observation['decimalLatitude'], observation['decimalLongitude'])
    </code></pre>
</details>

In [11]:
#Loop through each observation and print the lat and long values
for observation in data['data']:
    if observation['geo'] == 'Yes':
        print(observation['decimalLatitude'], observation['decimalLongitude'])

53.55277633666992 -112.97224426269531
43.64556884765625 -110.76473236083984
38.96986389160156 -84.72786712646484
43.714542388916016 -103.40538024902344
38.33570098876953 -111.41018676757812
44.643890380859375 -110.45471954345703
64.08333587646484 -145.6666717529297
46.11265182495117 -101.04034423828125
44.91789245605469 -110.6800537109375
34.731868743896484 -98.71973419189453
38.911399841308594 -101.59110260009766
38.911399841308594 -101.59110260009766
33.68899917602539 -101.99800109863281
33.68899917602539 -101.99800109863281
44.4246711730957 -110.9105453491211
44.64444351196289 -110.92009735107422
40.859981536865234 -112.13763427734375
44.433589935302734 -110.9223403930664
38.96986389160156 -84.72786712646484
44.905643463134766 -110.14917755126953
50.8294792175293 -100.19963073730469
53.59470748901367 -112.8346939086914
50.83022689819336 -100.2008056640625
59.795589447021484 -127.46961975097656
50.767276763916016 -100.2241439819336
59.346961975097656 -125.9528579711914
50.80156326293
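
A more defensive version of the loop is sketched below. It uses the first two records shown earlier plus a made-up third record (hypothetical, for illustration only) that has `geo` set to `'No'` and no coordinate keys, which is the situation the hint describes. What values `geo` takes other than `'Yes'` is an assumption.

```python
records = [
    {'geo': 'Yes', 'decimalLatitude': '53.55277633666992',
     'decimalLongitude': '-112.97224426269531'},
    {'geo': 'Yes', 'decimalLatitude': '43.64556884765625',
     'decimalLongitude': '-110.76473236083984'},
    {'geo': 'No'},  # hypothetical record without coordinates
]

# Keep only georeferenced records and convert the coordinate strings to floats
coords = [(float(rec['decimalLatitude']), float(rec['decimalLongitude']))
          for rec in records
          if rec.get('geo') == 'Yes']

for lat, lon in coords:
    print(lat, lon)
```

Using `rec.get('geo')` rather than `rec['geo']` means a record missing the `geo` key entirely is skipped instead of raising a `KeyError`.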

### Using Pandas to streamline the process...
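Pandas can create a dataframe directly from a list of dictionaries, such as the one stored under the `data` key. Each dictionary becomes a row and each key becomes a column.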

In [12]:
import pandas as pd
df = pd.DataFrame(data['data'])
df.head()

| | geo | provider | name | decimalLongitude | decimalLatitude | occurrenceID | common_name | basis |
|---|---|---|---|---|---|---|---|---|
| 0 | Yes | iNaturalist.org | Bison bison | -112.97224426269533 | 53.55277633666992 | 1453380369 | bison, American bison, American Bison, Bisonte... | Observation |
| 1 | Yes | iNaturalist.org | Bison bison | -110.76473236083984 | 43.64556884765625 | 1024331790 | bison, American bison, American Bison, Bisonte... | Observation |
| 2 | Yes | Museum of Comparative Zoology, Harvard University | Bison bison | -84.72786712646484 | 38.96986389160156 | 2461345968 | bison, American bison, American Bison, Bisonte... | Fossil |
| 3 | Yes | iNaturalist.org | Bison bison | -103.40538024902344 | 43.714542388916016 | 891750841 | bison, American bison, American Bison, Bisonte... | Observation |
| 4 | Yes | iNaturalist.org | Bison bison | -111.41018676757812 | 38.33570098876953 | 2574088021 | bison, American bison, American Bison, Bisonte... | Observation |


So now we can use our pandas know-how to do some nifty analyses, including subsetting records for a specific provider.
* First we'll get a list of unique providers found in the data

In [13]:
#Generate a list of providers
df.provider.unique()

array(['iNaturalist.org',
       'Museum of Comparative Zoology, Harvard University',
       'University of Wyoming Museum of Vertebrates',
       'University of Alaska Museum of the North',
       'California State University, Chico Vertebrate Museum',
       'University of Arkansas Collections Facility, UAFMC',
       'National Museum of Natural History, Smithsonian Institution',
       'Cornell Lab of Ornithology', 'Canadian Museum of Nature',
       'Utah Museum of Natural History',
       'James R. Slater Museum of Natural History', 'Field Museum',
       'Yale University Peabody Museum',
       'Museum of Texas Tech University (TTU)',
       'University of Alberta Museums', 'Royal Ontario Museum',
       'naturgucker.de', 'European Molecular Biology Laboratory (EMBL)',
       'University of British Columbia',
       "Muséum d'Histoire Naturelle de Bourges", 'BISON',
       'Texas A&M University Biodiversity Research and Teaching Collections',
       'Natural History Museum, Unive

* Now, we'll subset the rows for one of those providers, `iNaturalist.org`...

In [14]:
df.query("provider == 'iNaturalist.org'")

| | geo | provider | name | decimalLongitude | decimalLatitude | occurrenceID | common_name | basis |
|---|---|---|---|---|---|---|---|---|
| 0 | Yes | iNaturalist.org | Bison bison | -112.97224426269531 | 53.55277633666992 | 1453380369 | bison, American bison, American Bison, Bisonte... | Observation |
| 1 | Yes | iNaturalist.org | Bison bison | -110.76473236083984 | 43.64556884765625 | 1024331790 | bison, American bison, American Bison, Bisonte... | Observation |
| 3 | Yes | iNaturalist.org | Bison bison | -103.40538024902344 | 43.714542388916016 | 891750841 | bison, American bison, American Bison, Bisonte... | Observation |
| 4 | Yes | iNaturalist.org | Bison bison | -111.41018676757812 | 38.33570098876953 | 2574088021 | bison, American bison, American Bison, Bisonte... | Observation |
| 10 | Yes | iNaturalist.org | Bison bison | -110.6800537109375 | 44.91789245605469 | 2251905047 | bison, American bison, American Bison, Bisonte... | Observation |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 495 | Yes | iNaturalist.org | Bison bison | -110.65933227539062 | 43.740753173828125 | 2244225920 | bison, American bison, American Bison, Bisonte... | Observation |
| 496 | Yes | iNaturalist.org | Bison bison | -112.39334869384766 | 41.077693939208984 | 1306572700 | bison, American bison, American Bison, Bisonte... | Observation |
| 497 | Yes | iNaturalist.org | Bison bison | -100.03150177001953 | 42.73929214477539 | 2252065444 | bison, American bison, American Bison, Bisonte... | Observation |
| 498 | Yes | iNaturalist.org | Bison bison | -110.60688018798828 | 44.82806396484375 | 2611072032 | bison, American bison, American Bison, Bisonte... | Observation |
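
`query()` is one of several ways to subset rows. The sketch below uses a three-row stand-in for `df` (provider and basis values taken from the output above) to show the equivalent boolean-mask form, plus `value_counts()` for a quick per-provider tally. The name `df_small` is introduced here for illustration only.

```python
import pandas as pd

# Small stand-in for the full dataframe
df_small = pd.DataFrame({
    'provider': ['iNaturalist.org',
                 'iNaturalist.org',
                 'Museum of Comparative Zoology, Harvard University'],
    'basis': ['Observation', 'Observation', 'Fossil'],
})

# Boolean mask: selects the same rows as
# df_small.query("provider == 'iNaturalist.org'")
inat = df_small[df_small['provider'] == 'iNaturalist.org']
print(len(inat))

# Number of records per provider
provider_counts = df_small['provider'].value_counts()
print(provider_counts)
```

The mask form is often easier to use when the value being matched is held in a variable or contains quote characters.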


In [15]:
df.dtypes

geo                 object
provider            object
name                object
decimalLongitude    object
decimalLatitude     object
occurrenceID        object
common_name         object
basis               object
dtype: object
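
Every column has the `object` dtype, which means the coordinates are stored as strings rather than numbers. They would need converting before any arithmetic or mapping. A minimal sketch using `pd.to_numeric` on a two-row stand-in for `df` (the name `df_small` is introduced here for illustration):

```python
import pandas as pd

# Two-row stand-in for the full dataframe, coordinates as strings
df_small = pd.DataFrame([
    {'decimalLatitude': '53.55277633666992',
     'decimalLongitude': '-112.97224426269531'},
    {'decimalLatitude': '43.64556884765625',
     'decimalLongitude': '-110.76473236083984'},
])

# Convert each coordinate column from strings to floats;
# errors='coerce' turns unparseable values into NaN instead of raising
for col in ['decimalLatitude', 'decimalLongitude']:
    df_small[col] = pd.to_numeric(df_small[col], errors='coerce')

print(df_small.dtypes)
```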

## Exercise:
* Extract the first 500 red wolf (*"Canis rufus"*) records from the BISON API. 
* Can you create a table listing the records collected by the `University of Kansas Biodiversity Institute`?
* *Challenge*: Can you create a table listing all the records collected in North Carolina?

In [16]:
#Grab first 500 red wolf records
urlCr = 'http://bison.usgs.gov/api/search.json'
paramsCr = {'species':'Canis rufus',
          'type':'scientific_name',
          'start':'0',
          'count':'500'
         }
#Send the request to the server and store the response as a variable
responseCr = requests.get(urlCr, params=paramsCr)
#Convert the JSON response to a Python dictionary
dataCr = responseCr.json()

In [17]:
# convert to a df
dfCr = pd.DataFrame(dataCr['data'])
# grab records of interest
dfCr.query("provider == 'University of Kansas Biodiversity Institute'")

| | geo | provider | name | occurrenceID | common_name | basis | decimalLongitude | decimalLatitude |
|---|---|---|---|---|---|---|---|---|
| 30 | Yes | University of Kansas Biodiversity Institute | Canis rufus | 686333137 | Red Wolf | Specimen | -96.5749969482422 | 29.5310001373291 |
| 54 | Yes | University of Kansas Biodiversity Institute | Canis rufus | 686389327 | Red Wolf | Specimen | -95.1777572631836 | 29.249670028686523 |
| 67 | Yes | University of Kansas Biodiversity Institute | Canis rufus | 686356790 | Red Wolf | Specimen | -97.074951171875 | 27.816740036010746 |
| 428 | Yes | University of Kansas Biodiversity Institute | Canis rufus | 686333147 | Red Wolf | Specimen | -96.5749969482422 | 29.5310001373291 |
| 430 | Yes | University of Kansas Biodiversity Institute | Canis rufus | 686389328 | Red Wolf | Specimen | -95.1777572631836 | 29.249670028686523 |
| 472 | Yes | University of Kansas Biodiversity Institute | Canis rufus | 686354334 | Red Wolf | Specimen | -96.6449966430664 | 29.59499931335449 |


In [25]:
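# Check the response keys for state information
dataCr.keys()
# The 'states' entry holds per-state summary counts, not individual records
dataCr['states']
#dfCrSt = pd.DataFrame(dataCr['states'])
# not there yet!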

{'extent': {'miny': 24.396308000324215,
  'minx': -124.84897399961686,
  'maxy': 49.38435799996468,
  'maxx': -74.98628199981302},
 'total': 17,
 'data': {'North Carolina': {'total': '56', 'fips': '37'},
  'Indiana': {'total': '1', 'fips': '18'},
  'Oklahoma': {'total': '1', 'fips': '40'},
  'Tennessee': {'total': '1', 'fips': '47'},
  'Minnesota': {'total': '1', 'fips': '27'},
  'California': {'total': '3', 'fips': '06'},
  'Florida': {'total': '17', 'fips': '12'},
  'Alabama': {'total': '1', 'fips': '01'},
  'Arkansas': {'total': '5', 'fips': '05'},
  'Washington': {'total': '2', 'fips': '53'},
  'Mississippi': {'total': '3', 'fips': '28'},
  'New Mexico': {'total': '1', 'fips': '35'},
  'Texas': {'total': '86', 'fips': '48'},
  'Illinois': {'total': '2', 'fips': '17'},
  'Missouri': {'total': '4', 'fips': '29'},
  'Louisiana': {'total': '46', 'fips': '22'},
  'Maryland': {'total': '1', 'fips': '24'}},
 'legend': {'0': '0 - 5',
  '3': '15 - 20',
  '15': '75 - 86',
  '8': '40 - 45',
 
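
The `states` entry holds per-state record counts rather than the records themselves, so it summarizes where the red wolf records come from but does not by itself answer the challenge. The sketch below ranks a few of the states shown above; note that the totals arrive as strings and need converting before they can be sorted numerically.

```python
# A subset of dataCr['states']['data'] copied from the output above
state_counts = {'North Carolina': {'total': '56', 'fips': '37'},
                'Florida': {'total': '17', 'fips': '12'},
                'Texas': {'total': '86', 'fips': '48'},
                'Louisiana': {'total': '46', 'fips': '22'}}

# Convert the string totals to integers and sort from most to fewest records
ranked = sorted(((name, int(info['total']))
                 for name, info in state_counts.items()),
                key=lambda pair: pair[1], reverse=True)

for name, total in ranked:
    print(name, total)
```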