# GraphQL (Preview, Limited Functionality)

NOTE: GraphQL will be part of the 1.1 release anticipated in Q1 2025

The HEFS API provides a GraphQL endpoint for querying data.

GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.

This notebook demonstrates how to use the HEFS GraphQL endpoint to query data. We will follow the same path as the prior notebook.

We use the same setup code as in the REST_DEMO notebook, so run the following cell first.

**Remember to set the API_ENDPOINT to the correct value for your environment.**

In [9]:
# start by importing the requests library
import requests
# import the pprint library to make the output more readable
from pprint import pprint
# define the api-endpoint
# Note: API Endpoint will change to https://api.water.noaa.gov/hefs/ after testing is complete
API_ENDPOINT = "https://testing-api.water.noaa.gov/hefs/"

## Simple Query

## Understanding GraphQL and Making Queries

GraphQL is a query language for your API that enables clients to request exactly the data they need. Unlike REST APIs, where you have to hit multiple endpoints to fetch different pieces of data, GraphQL allows you to get all your data in a single request. This makes data retrieval more efficient and flexible.

Notice that the queries use the terms `edges` and `node`, which should be familiar to those who have worked with graph or network structures. These names come from the schema rather than from GraphQL itself, which lets an API define both its types and its queries, so another schema could use different terms.
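
Since every example in this notebook posts a query to the same endpoint, the request pattern can be wrapped in a small helper. The following is a minimal sketch, not part of the HEFS API: `run_query` and `GraphQLError` are hypothetical names, and the error check assumes the server follows the common GraphQL convention of reporting failures in a top-level `errors` list. The `post` argument is any callable with the same signature as `requests.post`, which keeps the helper easy to exercise offline.

```python
class GraphQLError(RuntimeError):
    """Raised when the response body carries a GraphQL `errors` list (hypothetical)."""


def run_query(post, uri, query, timeout=10):
    """POST a GraphQL query and return the decoded JSON body.

    `post` is a callable compatible with `requests.post`.
    """
    response = post(uri, json={"query": query}, timeout=timeout)
    # Surface HTTP-level failures (4xx/5xx) before decoding the body
    response.raise_for_status()
    body = response.json()
    # Assumption: query failures are reported under a top-level "errors" key
    if isinstance(body, dict) and body.get("errors"):
        raise GraphQLError(body["errors"])
    return body
```

Once `uri` and `graphql_query` are defined as in the cells that follow, a call would look like `run_query(requests.post, uri, graphql_query)`.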

### Retrieving Series Data

Here we get the series list from the API. This is the GraphQL equivalent of the `/series/` REST endpoint.

In [10]:
# Define the GraphQL query as a Python string
graphql_query = '''
{
  seriesList {
    edges {
      node {
        id
        type
        locationId
        parameterId
        ensembleId
        ensembleMemberIndex
        timeStepUnit
        timeStepMultiplier
        startDateDate
        startDateTime
        endDateDate
        endDateTime
        forecastDateDate
        forecastDateTime
        missVal
        stationName
        lat
        lon
        x
        y
        z
        units
        creationDate
        creationTime
      }
    }
  }
}
'''
# Create a dictionary to send the query
payload = {"query": graphql_query}

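# Build the GraphQL URI (strip the trailing slash so the path has no double "//")
uri = API_ENDPOINT.rstrip("/") + "/v1/graphql/"

# Perform a POST request to get the response
# (pass headers={...} here if your environment requires custom headers)
response = requests.post(uri, json=payload, timeout=10)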

# Print the response
pprint(response.json())

{'seriesList': {'edges': []}}
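
The `edges`/`node` nesting shown in the response can be flattened into a plain list of records before further processing. This is a small sketch: `nodes_from_response` is a hypothetical helper, and it accepts both the shape printed above (with `seriesList` at the top level) and the standard GraphQL envelope in which results sit under a `data` key, since which one you see may depend on the server. The sample values are made up for illustration and are not real HEFS output.

```python
def nodes_from_response(body, field="seriesList"):
    """Return the list of node dicts for `field`, or [] when nothing matched."""
    # Assumption: some servers wrap results in a top-level "data" key
    root = body.get("data") or body
    edges = (root.get(field) or {}).get("edges") or []
    return [edge.get("node", {}) for edge in edges]


# Hand-written bodies (illustrative values only)
sample = {"seriesList": {"edges": [{"node": {"id": "a1"}}, {"node": {"id": "b2"}}]}}
print(nodes_from_response(sample))                          # [{'id': 'a1'}, {'id': 'b2'}]
print(nodes_from_response({"seriesList": {"edges": []}}))   # []
```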


## Limiting Number of Results

You can limit the number of results returned with the `first` argument. For example, `first: 2` begins with the first item of the list and returns 2 results.

This is equivalent to the REST query `/v1/series/?limit=2&offset=0` 

In [7]:
# Define the GraphQL query as a Python string
graphql_query = '''
{
  seriesList(first: 2) {
    edges {
      node {
        id
        type
        locationId
        parameterId
        ensembleId
        ensembleMemberIndex
        timeStepUnit
        timeStepMultiplier
        startDateDate
        startDateTime
        endDateDate
        endDateTime
        forecastDateDate
        forecastDateTime
        missVal
        stationName
        lat
        lon
        x
        y
        z
        units
        creationDate
        creationTime
      }
    }
  }
}
'''
# Create a dictionary to send the query
payload = {"query": graphql_query}

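# Build the GraphQL URI (strip the trailing slash so the path has no double "//")
uri = API_ENDPOINT.rstrip("/") + "/v1/graphql/"

# Perform a POST request to get the response
# (pass headers={...} here if your environment requires custom headers)
response = requests.post(uri, json=payload, timeout=10)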

# Print the response
pprint(response.json())

{'seriesList': {'edges': []}}


Here is an example of how you would extract the last id from the JSON above (we will use it soon).

In [8]:
# Extract the last ID, guarding against an empty result list
edges = response.json().get('seriesList', {}).get('edges', [])
last_id = edges[-1].get('node', {}).get('id') if edges else None

# Print the last ID
print("Last ID:", last_id)

Last ID: None

## Pagination

To paginate, use `first: 2, after: <<uid>>`, which returns the next 2 results after the record identified by `<<uid>>`.

This is equivalent to the REST query `/v1/series/?limit=2&offset=2`.

There is nothing magical about the 2 here; we are limiting the results for notebook readability.

Notice the difference between GraphQL and REST here. REST uses a numeric offset, while GraphQL uses the id of the last record. This is because `after` is really a cursor: it is always a string, and it "points" to a record. Hence we pass the id of the last record we received.

This, along with the use of camelCase in place of underscores, harks back to the purpose of GraphQL: get just the JSON you need, and no more.


In [10]:
import requests
from pprint import pprint

# we are using the last_id from the last cell, so run it first

# Use the extracted last ID as the "after" parameter in the second query
graphql_query = f'''
{{
  seriesList(first: 2, after: "{last_id}") {{
    edges {{
      node {{
        id
        type
        locationId
        parameterId
        ensembleId
        ensembleMemberIndex
        timeStepUnit
        timeStepMultiplier
        startDateDate
        startDateTime
        endDateDate
        endDateTime
        forecastDateDate
        forecastDateTime
        missVal
        stationName
        lat
        lon
        x
        y
        z
        units
        creationDate
        creationTime
      }}
    }}
  }}
}}
'''

# Create a dictionary to send the second query
second_query_payload = {"query": graphql_query}

# Perform a POST request to get the response for the second query
response = requests.post(uri, json=second_query_payload, timeout=10)

# Print the response
pprint(response.json())
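
The cursor pattern described in this section generalizes to a loop that keeps requesting pages until one comes back short. The sketch below assumes, as this notebook does, that the `id` of the last node can be used as the `after` cursor. `build_page_query`, `iter_series_ids`, and `fetch` are hypothetical names; `fetch` stands in for whatever function posts the query and returns the decoded JSON. The demonstration runs against an in-memory fake with made-up ids instead of the live endpoint.

```python
import re


def build_page_query(first, after=None):
    """Build a seriesList query for one page, requesting only the id field."""
    args = f"first: {int(first)}"
    if after is not None:
        args += f', after: "{after}"'
    return f"{{ seriesList({args}) {{ edges {{ node {{ id }} }} }} }}"


def iter_series_ids(fetch, page_size=2):
    """Yield ids page by page; `fetch(query)` must return the decoded JSON body."""
    after = None
    while True:
        body = fetch(build_page_query(page_size, after))
        edges = body.get("seriesList", {}).get("edges", [])
        for edge in edges:
            yield edge["node"]["id"]
        if len(edges) < page_size:
            break  # a short (or empty) page means nothing is left
        after = edges[-1]["node"]["id"]  # assumption: the id doubles as the cursor


# In-memory stand-in for the server (made-up ids, for illustration only)
_ALL_IDS = ["s1", "s2", "s3", "s4", "s5"]


def fake_fetch(query):
    first = int(re.search(r"first: (\d+)", query).group(1))
    match = re.search(r'after: "([^"]+)"', query)
    start = _ALL_IDS.index(match.group(1)) + 1 if match else 0
    page = _ALL_IDS[start:start + first]
    return {"seriesList": {"edges": [{"node": {"id": i}} for i in page]}}


print(list(iter_series_ids(fake_fetch)))  # ['s1', 's2', 's3', 's4', 's5']
```

Stopping on a short page costs one extra request when the total is an exact multiple of the page size, but it avoids relying on any page metadata that the server may or may not expose.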

## Filtering Data with GraphQL

In GraphQL, you can retrieve a subset of the data series by passing filter arguments directly in the query itself.
Here, we filter on location ids and parameter ids.

The REST equivalent would be `/v1/series/?parameter_id=QINE&location_id=PPPN4`, but it returns far more unwanted data for the client to filter and manage.

In [7]:
# Define the GraphQL query as a Python string
graphql_query = '''
{
  seriesList(locationIds: ["PPPN4"], parameterIds: ["QINE"]) {
    edges {
      node {
        locationId
        parameterId
        stationName
      }
    }
  }
}
'''
# Create a dictionary to send the query
payload = {"query": graphql_query}

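# Build the GraphQL URI (strip the trailing slash so the path has no double "//")
uri = API_ENDPOINT.rstrip("/") + "/v1/graphql/"

# Perform a POST request to get the response
# (pass headers={...} here if your environment requires custom headers)
response = requests.post(uri, json=payload, timeout=10)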

# Print the response
pprint(response.json())

{'seriesList': {'edges': []}}


## Filtering Data Based on Multiple Parameters

In some use cases, you might want to retrieve data that matches more than one value of the same parameter, for example several locations at once. The REST equivalent of the query below is:

`/v1/series/?location_id[]=MILN4&location_id[]=MILN5`

In [8]:
# Define the GraphQL query as a Python string
graphql_query = '''
{
  seriesList(locationIds: ["MILN4", "MILN5"]) { 
    edges {
      node {
        type
        locationId
        parameterId
        ensembleId
        ensembleMemberIndex
        timeStepUnit
        timeStepMultiplier
        startDateDate
        startDateTime
        endDateDate
        endDateTime
        forecastDateDate
        forecastDateTime
        missVal
        stationName
        lat
        lon
        x
        y
        z
        units
        creationDate
        creationTime
      }
    }
  }
}
'''
# Create a dictionary to send the query
payload = {"query": graphql_query}

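# Build the GraphQL URI (strip the trailing slash so the path has no double "//")
uri = API_ENDPOINT.rstrip("/") + "/v1/graphql/"

# Perform a POST request to get the response
# (pass headers={...} here if your environment requires custom headers)
response = requests.post(uri, json=payload, timeout=10)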

# Print the response
pprint(response.json())

{'seriesList': {'edges': []}}
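
Filter arguments like the ones used in the last two queries can also be assembled from Python lists, which is convenient when the locations or parameters come from earlier processing. A minimal sketch follows: `build_filter_query` is a hypothetical helper, and the argument names `locationIds` and `parameterIds` are taken from the queries in this notebook. It renders each list with `json.dumps`, on the assumption that the ids are simple strings, for which a JSON array is also a valid GraphQL list literal.

```python
import json


def build_filter_query(fields, location_ids=None, parameter_ids=None):
    """Build a seriesList query string with optional list filters."""
    args = []
    if location_ids:
        args.append(f"locationIds: {json.dumps(list(location_ids))}")
    if parameter_ids:
        args.append(f"parameterIds: {json.dumps(list(parameter_ids))}")
    # Omit the parentheses entirely when no filter is given
    arg_text = f"({', '.join(args)})" if args else ""
    field_text = " ".join(fields)
    return f"{{ seriesList{arg_text} {{ edges {{ node {{ {field_text} }} }} }} }}"


query = build_filter_query(
    ["locationId", "parameterId", "stationName"],
    location_ids=["MILN4", "MILN5"],
    parameter_ids=["QINE"],
)
print(query)
```

The resulting string can be sent as the `query` value of the payload, exactly like the hand-written queries in the cells above.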


# End Notes

This concludes this notebook. We covered only simple queries, but this should be enough to get started with the capabilities of the GraphQL interface.

The next notebook picks up where this one leaves off, covers GraphQL in more detail, and presents relay queries.