# Following the NEON API tutorial ([link](https://www.neonscience.org/resources/learning-hub/tutorials/neon-api-intro-requests-py))

We will be hitting the following endpoints:

* `sites/`
* `products/`
* `data/`

Other endpoints can be seen through the [REST API Explorer](https://data.neonscience.org/data-api/explorer/).

Each request URL combines an endpoint with a target. The target is a value or series of values that identifies the specific data product, site, location, or data files we are looking up.
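
The endpoint-plus-target pattern can be sketched as a small helper. `build_url` is a hypothetical name introduced here for illustration; it is not part of the NEON API or of `requests`, and it simply joins path segments onto the server URL used later in this notebook. The site code, product code, and month in the example are the ones that appear further down.

```python
# Hypothetical helper (an assumption for illustration, not part of the tutorial):
# joins an endpoint and one or more target values onto the server URL.
SERVER_URL = "http://data.neonscience.org/api/v0/"


def build_url(endpoint, *targets, server_url=SERVER_URL):
    """Return server_url + endpoint + targets, separated by single slashes."""
    parts = [server_url.rstrip("/"), endpoint.strip("/")]
    parts.extend(str(target).strip("/") for target in targets)
    return "/".join(parts)


print(build_url("sites", "TEAK"))
print(build_url("data", "DP3.30015.001", "TEAK", "2018-06"))
```

Stripping slashes before joining means `"sites"` and `"sites/"` produce the same URL, which avoids accidental double slashes.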

In [None]:
# imports
import os
import requests
import json
import itertools
import rasterio as rio
import matplotlib.pyplot as plt
from rasterio.plot import show
import rasterio
import rasterio.features
import rasterio.warp

## Defining server url:

In [None]:
# Server URL
# more info on server url and how it influences the base url: https://swagger.io/docs/specification/api-host-and-base-path/
server_url = "http://data.neonscience.org/api/v0/"

# Querying at the site level:
> NEON manages 81 different sites across the United States and Puerto Rico. These sites are separated into two main groups, terrestrial and aquatic, and the aquatic sites are further subdivided into lakes, rivers, and wadeable streams. Each of these site types has different instrumentation and observation configurations, so not every data product is available at every site. We can start by asking what kinds of data products are available for a given site. This is done by using the `sites/` endpoint in the API; this endpoint is used for getting information about specific NEON field sites. In this example we will query which data products are available at the Lower Teakettle (TEAK) site.

In [None]:

# Site Code for Lower Teakettle 
sitecode = "TEAK"
# Define the url, using the sites/ endpoint
url = server_url + "sites/" + sitecode
print(url)

## Making a call to the url and investigating the response:

In [None]:
# Request the url
site_request = requests.get(url)

In [None]:
#looking at what's under the hood for the site request:
site_request.__dict__

In [None]:
# Parse the JSON body of the response into a Python dict
site_json = site_request.json()
print(site_json)
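
The request-then-parse steps above do not check whether the request succeeded. A minimal sketch that adds a timeout and an HTTP status check is shown here. `fetch_json` and its `get` parameter are hypothetical names introduced for illustration; the `get` parameter exists only so the function can be exercised without network access, and it defaults to `requests.get`.

```python
def fetch_json(url, get=None, timeout=30):
    """Request `url` and return the decoded JSON body, raising on HTTP errors.

    `get` is an assumption of this sketch: any callable with the same
    calling convention as requests.get (used here to allow offline testing).
    """
    if get is None:
        import requests  # imported lazily so the helper can be tested offline
        get = requests.get
    response = get(url, timeout=timeout)
    response.raise_for_status()  # raises an exception for 4xx/5xx responses
    return response.json()


# Usage in this notebook (requires network access):
# site_json = fetch_json(url)
```

Failing early on a bad status code avoids a confusing `KeyError` later when the expected `'data'` key is missing from an error response.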

In [None]:
# Use the 'keys' method to view the component of the uppermost json dictionary
print(site_json.keys())
"""
    This output shows that the entire API response is contained within a single dict called 'data'. 
    In order to access any of the information contained within this highest-level 'data' dict, 
    we will need to reference that dictionary directly. Let's view the different keys that are available within 
    'data':
"""
#accessing the data component:
data_keys = site_json['data'].keys()
print(data_keys)

> At the highest level, the JSON object is a dictionary containing a single element with the label 'data'. This 'data' element in turn contains a dictionary with elements containing various pieces of information about the site. When we want to know what elements a dict contains, we can use the .keys() method to list the keys to each element in that dict.
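
When exploring an unfamiliar response, it can help to see the type of each value alongside its key. The sketch below uses a hand-written stand-in for a parsed response; `summarize` is a hypothetical helper, and apart from `'data'` and `'dataProducts'` the field names and all values are placeholders rather than real NEON output.

```python
def summarize(d):
    """Map each key of a dict to the type name of its value."""
    return {key: type(value).__name__ for key, value in d.items()}


# Stand-in for a parsed API response; placeholder values, not real NEON data.
example_json = {
    "data": {
        "placeholderField": "some text",
        "dataProducts": [{"dataProductCode": "DP3.30015.001"}],
    }
}

print(list(example_json.keys()))        # ['data']
print(summarize(example_json["data"]))  # {'placeholderField': 'str', 'dataProducts': 'list'}
```

Applied to the real response, `summarize(site_json['data'])` would show at a glance which elements are plain values and which are nested lists or dicts.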

In [None]:
#looking at the first 12 components of the 'data' component in more detail: 
dict(itertools.islice(site_json['data'].items(),12))

> The last piece of information in the 'data' dictionary is stored within the 'dataProducts' key. The 'dataProducts' element is a list of dictionaries, one for each type of NEON data product available at the site; each of these dictionaries has the same keys, but different values. Let's look at the JSON for the third-to-last entry ("[-3]") in the list of data products:

In [None]:
#View a data product dictionary
site_json['data']['dataProducts'][-3]

### Looking at all the product codes in this site query:

In [None]:
#View product code and name for every available data product
for product in site_json['data']['dataProducts']:
    print(product['dataProductCode'],product['dataProductTitle'])

Next, we look for the availability of Ecosystem structure (DP3.30015.001). This is the Canopy Height Model (CHM), one of the data products generated by NEON's Airborne Observation Platform (AOP).

In [None]:
#Set the Ecosystem structure (CHM) data product
product_code = 'DP3.30015.001'

Each data product entry has a list of the months for which data of that type was collected and is available at the site, and a corresponding list of the URLs we would pass to the API to get that product's data for each month.

In [None]:
#Get available months of Ecosystem structure data products for TEAK site
#Loop through the 'dataProducts' list items (each one is a dictionary) at the site
for product in site_json['data']['dataProducts']: 
    #if a list item's 'dataProductCode' dict element equals the product code string
    if(product['dataProductCode'] == product_code): 
        #print the available months
        print('Available Months: ',product['availableMonths'])
        print('URLs for each Month:')
        #print the available URLs
        #(named month_url so the site 'url' defined earlier is not overwritten)
        for month_url in product['availableDataUrls']:
            print(month_url)
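
Because the months and URLs are stored as two corresponding lists, they can be paired into a dict for direct lookup by month. This is a minimal sketch that assumes the two lists are in the same order; `available_data_urls` is a hypothetical helper, and the example months and URLs are placeholders rather than real API output.

```python
def available_data_urls(site_json, product_code):
    """Map each available month to its data URL for one product at a site.

    Assumes 'availableMonths' and 'availableDataUrls' are parallel lists.
    Returns an empty dict if the product is not listed for the site.
    """
    for product in site_json["data"]["dataProducts"]:
        if product["dataProductCode"] == product_code:
            return dict(zip(product["availableMonths"], product["availableDataUrls"]))
    return {}


# Placeholder data shaped like a sites/ response; not real NEON output.
example_site_json = {"data": {"dataProducts": [
    {"dataProductCode": "DP3.30015.001",
     "availableMonths": ["2017-06", "2018-06"],
     "availableDataUrls": ["https://example.invalid/2017-06",
                           "https://example.invalid/2018-06"]},
]}}

urls_by_month = available_data_urls(example_site_json, "DP3.30015.001")
print(urls_by_month.get("2018-06"))
```

With the real response, `available_data_urls(site_json, product_code)` would give the same information as the loop above in a form that is easy to index.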

# Querying at Product level: 

In [None]:
#Make request:
product_request = requests.get(server_url+'products/'+product_code)
product_json = product_request.json()
print(product_json)

The response has a similar structure: a single top-level `data` key holding a nested dict.

In [None]:
#Print keys for product data dictionary
print(product_json['data'].keys())

In [None]:
# looking at the type of information within: 
#Print code, name, and abstract of data product
print("Product Code queried:", product_json['data']['productCode'])
print("Product Name:", product_json['data']['productName'],'\n')
print("Product Abstract:", product_json['data']['productAbstract'])

Use case:
> To look up the availability of the data product, we want the `siteCodes` element. This is a list with an entry for each site where the data product is available. Each site entry is a dict whose elements include the site code, a list of months for which data is available, and a list of the API request URLs to request data for that site for a given month.

In [None]:
#View keys of one site dictionary
print(product_json['data']['siteCodes'][0].keys())

> We can look up the availability of data at a particular site and get a URL to request data for a specific month. We saw that Lower Teakettle (TEAK) has the data product we want for June 2018; we can get the URL needed to request that data by creating a nested for loop to go through the site and month lists.

Note that we have already defined `sitecode` for the Lower Teakettle site we are interested in:

In [None]:
#View available months and corresponding API urls, then save desired URL
data_url = None #Stays None if the desired month is not available
for site in product_json['data']['siteCodes']:
    if(site['siteCode'] == sitecode):
        #Loop through the paired lists of months and URLs
        for month, month_url in zip(site['availableMonths'],site['availableDataUrls']):
            print(month,month_url)
            if(month == '2018-06'): #If data is available for the desired month, save the URL
                data_url = month_url
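
The nested loop can also be wrapped in a function that returns the URL, or `None` when the site or month is missing, so later cells can check the result before making a request. `find_data_url` is a hypothetical name, and the example data is a placeholder shaped like a `products/` response.

```python
def find_data_url(product_json, site_code, month):
    """Return the data URL for `site_code` and `month`, or None if unavailable."""
    for site in product_json["data"]["siteCodes"]:
        if site["siteCode"] != site_code:
            continue
        # The month and URL lists are assumed to be in matching order.
        for available_month, url in zip(site["availableMonths"], site["availableDataUrls"]):
            if available_month == month:
                return url
    return None


# Placeholder data shaped like a products/ response; not real NEON output.
example_product_json = {"data": {"siteCodes": [
    {"siteCode": "TEAK",
     "availableMonths": ["2018-06"],
     "availableDataUrls": ["https://example.invalid/TEAK/2018-06"]},
]}}

print(find_data_url(example_product_json, "TEAK", "2018-06"))
print(find_data_url(example_product_json, "TEAK", "1999-01"))
```

Returning `None` makes the "month not available" case explicit instead of surfacing later as an undefined variable.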

# Data File Querying:
> We now know that the CHM data product is available for 2018-06 at the Lower Teakettle site. Using the server url, site code, product code, and a year-month argument, we can make a request to the `data/` endpoint of the NEON API. This will allow us to see what CHM data files can be obtained for 2018-06 at the Lower Teakettle site, and to learn the locations of these files as URLs.


In [None]:
#Make Request
data_request = requests.get(server_url+'data/'+product_code+'/'+sitecode+'/'+'2018-06')
data_json = data_request.json()
print(data_json)

Alternatively we could use one of the "Available Data URLs" from a sites/ or products/ request, like the data_url we saved earlier.

In [None]:
#Make request with saved url
data_request = requests.get(data_url)
data_json = data_request.json()

In [None]:
#Print dict key for 'data' element of data JSON
print(data_json['data'].keys())

As with the sites JSON content, the uppermost level of a data request JSON object is a dictionary whose only member has the `data` key; this member in turn is a dictionary with the product code, the sitecode, the month, and a list of the available data files.

The `files` list is a list of Python dictionaries, one for each file available based on our query. The dictionary for each file includes the `name` of the file, `size` of the file in bytes, a `crc32c` checksum code, and the `url` of the file - clicking on this url will download the file.

In [None]:
#View keys and values in first file dict
for key in data_json['data']['files'][0].keys(): #Loop through keys of the data file dict
    print(key,':\t', data_json['data']['files'][0][key])

In [None]:
#Display the names of the first 10 files
for file in data_json['data']['files'][:10]:
    print(file['name'])

In [None]:
#pull out information on only the CHM raster (tif) files:
for file in data_json['data']['files'][:20]:
    if 'CHM.tif' in file['name']:
        print(file['name'])
        print(file['url'])
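
The cell above only scans the first 20 entries and matches `'CHM.tif'` anywhere in the name. A small sketch that filters the whole list by filename ending and totals the reported sizes is shown below. `select_files` is a hypothetical helper, the example entries are placeholders, and `int()` is applied to `size` on the assumption that it may arrive as either a number or a numeric string.

```python
def select_files(files, suffix):
    """Return the file dicts whose 'name' ends with `suffix`."""
    return [f for f in files if f["name"].endswith(suffix)]


# Placeholder entries shaped like data_json['data']['files']; not real NEON data.
example_files = [
    {"name": "tile_A_CHM.tif", "size": "100", "url": "https://example.invalid/a"},
    {"name": "tile_A_CHM.tfw", "size": "10", "url": "https://example.invalid/b"},
    {"name": "tile_B_CHM.tif", "size": "250", "url": "https://example.invalid/c"},
]

chm_files = select_files(example_files, "CHM.tif")
total_bytes = sum(int(f["size"]) for f in chm_files)
print([f["name"] for f in chm_files], total_bytes)
```

With the real response this would be called as `select_files(data_json['data']['files'], 'CHM.tif')`.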

Click the link for NEON_D17_TEAK_DP3_313000_4098000_CHM.tif, download the file, and place it in this directory.

NOTE: the filename is hard-coded, so if the API response changes, the code below will stop working.
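
As an alternative to downloading by hand, the file could be fetched programmatically from its entry in the `files` list. This is a sketch only: `download_file` is a hypothetical helper, it uses the standard library's `urllib.request` rather than `requests`, and it saves the file under the entry's `name` and checks the result against the entry's `size` in bytes.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(file_entry, dest_dir="."):
    """Save file_entry['url'] as dest_dir/file_entry['name'] and return the path."""
    dest = Path(dest_dir) / file_entry["name"]
    with urllib.request.urlopen(file_entry["url"]) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)  # stream to disk in chunks
    # Compare against the size (in bytes) reported in the file entry, if present.
    expected = file_entry.get("size")
    if expected is not None and dest.stat().st_size != int(expected):
        raise IOError(f"{dest} is {dest.stat().st_size} bytes, expected {expected}")
    return dest


# Usage in this notebook (requires network access):
# wanted = 'NEON_D17_TEAK_DP3_313000_4098000_CHM.tif'
# entry = next(f for f in data_json['data']['files'] if f['name'] == wanted)
# download_file(entry)
```

The destination filename comes from the `name` field rather than from the URL, since the URL is not guaranteed to end in the filename.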

In [None]:
#using rasterio library:
chm_tif = os.path.join(os.getcwd(), 'NEON_D17_TEAK_DP3_313000_4098000_CHM.tif')
#chm_tif = 'NEON_D17_TEAK_DP3_313000_4098000_CHM.tif'
print(chm_tif)
chm = rio.open(chm_tif)

In [None]:
#Configure the plot
fig, ax = plt.subplots(1,1, figsize=(5,5));

#Don't use scientific notation for the y axis label
ax.get_yaxis().get_major_formatter().set_scientific(False)

#Display the CHM
show((chm, 1), ax=ax, cmap='Greens', title='NEON_D17_TEAK_DP3_313000_4098000_CHM.tif');

In [None]:
with rasterio.open('NEON_D17_TEAK_DP3_313000_4098000_CHM.tif') as dataset:

    # Read the dataset's valid data mask as a ndarray.
    mask = dataset.dataset_mask()

    # Extract feature shapes and values from the array.
    for geom, val in rasterio.features.shapes(
            mask, transform=dataset.transform):

        # Transform shapes from the dataset's own coordinate
        # reference system to CRS84 (EPSG:4326).
        geom = rasterio.warp.transform_geom(
            dataset.crs, 'EPSG:4326', geom, precision=6)

        # Print GeoJSON shapes to stdout.
        print(geom)