### Retrieving Data from the API

We need a way to retrieve data from the API. We first import Python's `requests` library, which lets us make HTTP requests, and the `pprint` function from the standard `pprint` module to make the output more readable. Finally, we define the API endpoint we'll be interacting with by setting `API_ENDPOINT` to `https://testing-api.water.noaa.gov/hefs` (no trailing slash, because the paths appended later begin with `/`). Here's how we do it:

In [None]:
# start by importing the requests library
import requests
# import the pprint library to make the output more readable
from pprint import pprint
# define the api-endpoint
API_ENDPOINT = "https://testing-api.water.noaa.gov/hefs"

### Most Recent Ensemble Forecast for a Location and Plotting

In other cases, you might want to retrieve and plot the most recent ensemble data. The API can sort results by a field: add an `ordering=` query parameter whose value is the name of the field to sort by.

For example, to retrieve the latest QINE ensemble forecast for location id PGRC2, specify `location_id=PGRC2`, `parameter_id=QINE`, `ordering=-start_date_date`, and `limit=1`. The negative sign in front of `start_date_date` tells the API to sort the results by `start_date_date` in descending order, so the newest forecast comes first:

`/v1/headers/?location_id=PGRC2&parameter_id=QINE&ordering=-start_date_date&limit=1`
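
The query string above can also be assembled from a dictionary rather than typed by hand. The sketch below uses only the standard library's `urllib.parse.urlencode`; the helper name `build_headers_uri` is hypothetical and not part of the API.

```python
from urllib.parse import urlencode

# same base URL as defined earlier in this notebook
API_ENDPOINT = "https://testing-api.water.noaa.gov/hefs"

def build_headers_uri(**filters):
    """Hypothetical helper: build a /v1/headers/ URI from keyword filters."""
    return f"{API_ENDPOINT}/v1/headers/?{urlencode(filters)}"

# the filters are encoded in the order they are passed
uri = build_headers_uri(
    location_id="PGRC2",
    parameter_id="QINE",
    ordering="-start_date_date",
    limit=1,
)
print(uri)
```

`requests` performs the same encoding if you pass the dictionary through its `params=` argument instead of building the string yourself.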


In [None]:
# create a headers request with location_id, parameter_id, limit, and ordering filters
uri = API_ENDPOINT + "/v1/headers/?location_id=PGRC2&parameter_id=QINE&limit=1&ordering=-start_date_date"
# get the response
response = requests.request("GET", uri)
# get most recent start date
startDate = response.json()['results'][0]['start_date_date']
# print the response
pprint(response.json())
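
Indexing `['results'][0]` raises an `IndexError` when no headers match the filters. A minimal defensive sketch, assuming only the `results` list and `start_date_date` field used above; `latest_start_date` is a hypothetical helper, and the sample payloads (including the date format) are made up for illustration:

```python
def latest_start_date(payload):
    """Return start_date_date of the first result, or None if there are none."""
    results = payload.get("results") or []
    if not results:
        return None
    return results[0].get("start_date_date")

# illustrative payloads, not real API output
sample = {"count": 1, "results": [{"start_date_date": "2024-01-01"}]}
empty = {"count": 0, "results": []}

print(latest_start_date(sample))
print(latest_start_date(empty))
```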

Now, with the most recent forecast `startDate`, we retrieve all of the headers for that date. We also record the total number of forecasts, which is used later when plotting.

For example, to retrieve the headers of the newest data series, format the URI like this:

`/v1/headers/?location_id=PGRC2&parameter_id=QINE&forecast_date_date={startDate}`

where `startDate` is the variable set in the previous cell.

In [None]:
# create a headers request with location_id, parameter_id, and forecast_date_date filters
uri = API_ENDPOINT + f"/v1/headers/?location_id=PGRC2&parameter_id=QINE&forecast_date_date={startDate}"
# get the response
response = requests.request("GET", uri)
# store the total number of matching headers
count = response.json()["count"]
# print the response
pprint(response.json())

Next, with the most recent forecast startDate and total number of forecasts available, we will retrieve all of the ensemble data for that start date. This is the data that we will plot next.

We use the `limit` parameter to set the number of series returned to the `count` found above. To retrieve the newest data series ensembles, format the URI like this:

`/v1/ensembles/?location_id=PGRC2&parameter_id=QINE&start_date_date={startDate}&limit={count}`

where `count` is the total number of forecasts in the ensemble.

In [None]:
# create an ensembles request with location_id, parameter_id, start_date_date, and limit filters
uri = API_ENDPOINT + f"/v1/ensembles/?location_id=PGRC2&parameter_id=QINE&start_date_date={startDate}&limit={count}"
# get the response
response = requests.request("GET", uri)
# print the response
pprint(response.json())
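
Because `limit` is set to `count`, a single request should return every series. A quick sanity check is sketched below, assuming the ensembles response carries the same `count` and `results` fields as the headers response; `is_complete` is a hypothetical helper and the payloads are made up for illustration:

```python
def is_complete(payload):
    """Return True when the page holds as many results as the reported count."""
    return len(payload.get("results") or []) == payload.get("count")

# made-up payloads for illustration, not real API output
full = {"count": 2, "results": [{"events": []}, {"events": []}]}
partial = {"count": 3, "results": [{"events": []}]}

print(is_complete(full))
print(is_complete(partial))
```

If the check fails, the plot would silently omit some ensemble members, so it is worth running before graphing.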

Finally, we use the response data to plot the ensemble. To do this, we parse the response data and plot each forecast's times and corresponding values on a single graph. The Python library matplotlib handles the plotting.

In [None]:
import matplotlib.pyplot as plt
from datetime import datetime

def graph_ensemble_data(response_data):
    """Graphs ensemble data from an API response.

    Args:
      response_data: A dictionary containing the API response data.
    """
    # nothing to plot if the response has no results
    if not response_data.get('results'):
        return
    # iterate through each forecast series in the ensemble API response
    for result in response_data['results']:
        # skip series that have no events
        if not result.get('events'):
            continue
        # initialize times and values lists
        times = []
        values = []
        # iterate through the events (time steps) of this forecast
        for event in result['events']:
            # combine date and time, formatted as Year-Month-Day Hour:Minutes:Seconds
            datetime_str = f"{event['date']} {event['time']}"
            datetime_obj = datetime.strptime(datetime_str, "%Y-%m-%d %H:%M:%S")
            # add the time and value to their respective lists
            times.append(datetime_obj)
            values.append(event['value'])
        # plot this forecast once
        plt.plot(times, values)
    # set x graph label
    plt.xlabel('Time')
    # set y graph label
    plt.ylabel('Value')
    # set graph title
    plt.title('Ensemble Data')
    # rotate x text vertically for better visibility
    plt.xticks(rotation='vertical')
    # display the actual plot
    plt.show()

# call graphing function
graph_ensemble_data(response.json())
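
The parsing step can be separated from the plotting so it is easy to check on its own. The sketch below assumes each event has the `date`, `time`, and `value` keys read by the plotting code; `events_to_series` is a hypothetical helper and the sample events are made up for illustration:

```python
from datetime import datetime

def events_to_series(events):
    """Convert a list of event dicts into parallel lists of datetimes and values."""
    times, values = [], []
    for event in events:
        # combine the date and time strings into a single datetime
        stamp = datetime.strptime(
            f"{event['date']} {event['time']}", "%Y-%m-%d %H:%M:%S"
        )
        times.append(stamp)
        values.append(event["value"])
    return times, values

# made-up events in the shape the plotting code expects
sample_events = [
    {"date": "2024-01-01", "time": "00:00:00", "value": 10.5},
    {"date": "2024-01-01", "time": "06:00:00", "value": 12.0},
]
times, values = events_to_series(sample_events)
print(times, values)
```

The returned lists can be passed straight to `plt.plot(times, values)`.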


## Summary

In this notebook, we learned how to use the HEFS API to retrieve data from the HEFS data store. We also learned how to filter results on specific fields such as `location_id` and `parameter_id`. You can filter on any available field by appending the field name and value to the URI as a query parameter.

These techniques can be used to retrieve, filter, and paginate data from all resources of the HEFS data store. The one exception is the GraphQL interface, which we cover next.