### NEON API Example

The National Ecological Observatory Network (NEON) continuously collects data across Long Term Ecological Research (LTER) sites and distributes these data via its website. The website is not convenient for grabbing data in bulk, since each download takes a lot of clicks and scrolls. However, NEON also provides an API, so we can write a script to collect the data instead.

This API, however, is less simple than many because its responses contain lots of nested attributes. For example, if you want to grab all the active ground beetle data, you first need to query the API for a list of all LTER sites that have beetle data; the response includes (among other things) a list of URLs pointing to the actual data. You then have to make another call to each of those URLs to retrieve the data you want.

But this serves as a good example showing that, despite its complexity, scripts can be written to navigate even convoluted APIs. 

Source: http://data.neonscience.org/data-api
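
The nesting described above can be sketched without touching the network. The payload below is a hand-made stand-in, not real NEON output: the key names (`data`, `siteCodes`, `siteCode`, `availableDataUrls`) are the ones used in the cells that follow, while the site codes and `example.org` URLs are placeholders. The helper `iter_data_urls` is a hypothetical name introduced only for this illustration.

```python
# Hypothetical payload mimicking the nesting used in this example:
# data -> siteCodes -> availableDataUrls. All values are placeholders.
sample_response = {
    "data": [
        {
            "productName": "Example product",
            "siteCodes": [
                {
                    "siteCode": "SITE_A",
                    "availableDataUrls": [
                        "https://example.org/data/SITE_A/2016-01",
                        "https://example.org/data/SITE_A/2016-02",
                    ],
                },
                {
                    "siteCode": "SITE_B",
                    "availableDataUrls": [
                        "https://example.org/data/SITE_B/2016-01",
                    ],
                },
            ],
        }
    ]
}


def iter_data_urls(json_obj):
    """Yield every URL found under data -> siteCodes -> availableDataUrls."""
    for product in json_obj["data"]:
        for site in product["siteCodes"]:
            for data_url in site["availableDataUrls"]:
                yield data_url


urls = list(iter_data_urls(sample_response))
print(len(urls))  # 3
```

Each URL produced this way would be the target of the second call that retrieves the actual data.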

In [1]:
import requests

In [2]:
#Build the URL
url = 'http://data.neonscience.org/api/v0/products'
params = {'data':{'productStatus':'ACTIVE',
                  'productName':'Ground beetles sampled from pitfall traps'
                 }
         }

In [None]:
#Send the request, get the response
response = requests.get(url, params=params)

In [None]:
#Interpret the response as a JSON object & reveal its top-level keys
jsonObj = response.json()
jsonObj.keys()

In [None]:
#Retrieve this 'data' collection; how many objects are returned?
data = jsonObj['data']
len(data)

In [None]:
#Inspect one of the data objects; what are its properties (i.e. its keys)?
data[0].keys()

In [None]:
#Examine the 'siteCodes' object (for the first data object)
# How many site codes in the first data item?
allSiteCodes = data[0]['siteCodes']
len(allSiteCodes)

In [None]:
#What properties (i.e. keys) does the siteCodes object have?
siteCodes = allSiteCodes[0]
siteCodes.keys()

In [None]:
#What is 'siteCode' (of the first <siteCode> object in the first <data> object)?
siteCode = siteCodes['siteCode']
siteCode

In [None]:
#How about we loop through ALL 184 <data> objects and for each
# loop through all the <siteCode> objects and print its siteCode
for item in data:
    for siteCode in item['siteCodes']:
        print(siteCode['siteCode'])
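
Because the same site can appear under many data objects, a nested loop like this one will likely print some site codes more than once. A minimal sketch of collecting the distinct codes instead is shown here; `unique_site_codes` and the sample list are hypothetical, and the sample stands in for the `data` list so the snippet runs on its own.

```python
def unique_site_codes(products):
    """Return the distinct siteCode values across all products, sorted."""
    codes = set()
    for product in products:
        for site in product["siteCodes"]:
            codes.add(site["siteCode"])
    return sorted(codes)


# Placeholder records with the same shape as the items in `data`
sample_products = [
    {"siteCodes": [{"siteCode": "SITE_B"}, {"siteCode": "SITE_A"}]},
    {"siteCodes": [{"siteCode": "SITE_A"}]},
]

print(unique_site_codes(sample_products))  # ['SITE_A', 'SITE_B']
```

In the notebook, calling `unique_site_codes(data)` should give the de-duplicated list for the real response.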
        

In [None]:
#Or, with a minor tweak, we can reveal all the data urls...
# for each <siteCodes> collection included in each <data> object
for item in data:
    for siteCode in item['siteCodes']:
        #Use a new name here so the 'url' variable defined earlier is not overwritten
        for data_url in siteCode['availableDataUrls']:
            #Here we print the URL, 
            #but we could write code to download the content at each URL
            print(data_url)
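
Instead of printing, the URLs could be gathered into a dictionary keyed by site code, so that a later step downloads only the sites of interest. The following is a sketch under the same assumptions about the response shape; `urls_by_site` is a hypothetical helper, and the sample records and `example.org` URLs are placeholders rather than real NEON values.

```python
from collections import defaultdict


def urls_by_site(products):
    """Map each siteCode to the list of all data URLs reported for it."""
    grouped = defaultdict(list)
    for product in products:
        for site in product["siteCodes"]:
            grouped[site["siteCode"]].extend(site["availableDataUrls"])
    return dict(grouped)


# Placeholder records with the same shape as the items in `data`
sample_products = [
    {"siteCodes": [
        {"siteCode": "SITE_A",
         "availableDataUrls": ["https://example.org/a/1"]},
        {"siteCode": "SITE_B",
         "availableDataUrls": ["https://example.org/b/1"]},
    ]},
    {"siteCodes": [
        {"siteCode": "SITE_A",
         "availableDataUrls": ["https://example.org/a/2"]},
    ]},
]

grouped = urls_by_site(sample_products)
print(grouped["SITE_A"])
```

With the real response, `urls_by_site(data)` would give one entry per site, and each URL in an entry could then be passed to `requests.get` to fetch its content.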
        