# Querying Metadata with GraphQL

In this notebook we explore searching for metadata using the GraphQL API. The GraphQL API provides a way to programmatically extract a JSON representation of the metadata. Its advantage over REST is that the client states exactly which fields it wants returned, which avoids both under-fetching and over-fetching data.

First we load some Python dependencies that we will use in this notebook and set the variable `ENDPOINT_URL` to the location of the GraphQL API.

In [1]:
import yaml
import requests
from string import Template

def load_config() -> dict:
    """Helper function to load the configuration file"""
    with open('config.yml') as file_handle:
        config = yaml.safe_load(file_handle)
    return config

config = load_config()
ENDPOINT_URL = config['graphql_api']

## Querying Shots with GraphQL

With GraphQL you can query exactly what you want, rather than having to receive the whole table from the database. This is useful when the table has many columns but you are interested in just a subset of them.

The GraphQL endpoint is located at `/graphql`. You can find the documentation and an interactive query explorer at the URL below:

In [2]:
print(f"GraphQL API endpoint: {ENDPOINT_URL}")

GraphQL API endpoint: https://ada-sam-app.oxfordfun.com/graphql


Unlike the REST API, which uses HTTP `GET` requests to return data, GraphQL uses HTTP `POST`: the query is sent to the API in the body of the request.
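
To make the shape of that request concrete, the sketch below builds (but does not send) a GraphQL `POST` request using only the Python standard library. The helper name `build_graphql_request` and the `example.org` URL are placeholders for illustration and are not part of the API; the rest of this notebook uses `requests.post(..., json=...)`, which produces an equivalent JSON body.

```python
import json
import urllib.request


def build_graphql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build (without sending) an HTTP POST request carrying a GraphQL query."""
    # The query travels as a JSON document with a single "query" key.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder endpoint, used only to show the request shape.
example_query = "query { all_shots { shots { shot_id } } }"
request = build_graphql_request("https://example.org/graphql", example_query)
print(request.get_method())
print(request.data.decode("utf-8"))
```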

Here is a simple example of getting some shot data from the GraphQL API. We need to explicitly state what information we want the API to return. Here we are asking for:
 - the shot ID
 - the timestamp at which the shot was taken
 - the preshot description
 - the divertor configuration

In [3]:
# Write our GraphQL query.
query = """
query {
    all_shots  {
        shots {
            shot_id
            timestamp
            preshot_description
            divertor_config
        }
    }
}
"""
# Query the API and get a JSON response
response = requests.post(ENDPOINT_URL, json={'query': query})
result = response.json()
result['data']['all_shots']['shots'][:3]

[{'shot_id': 11695,
  'timestamp': '2004-12-13T11:54:00',
  'preshot_description': '\n0.1T TF SHOT\n',
  'divertor_config': 'conventional'},
 {'shot_id': 11696,
  'timestamp': '2004-12-13T12:07:00',
  'preshot_description': '\nSTANDARD 0.3T TF SHOT\n',
  'divertor_config': 'conventional'},
 {'shot_id': 11697,
  'timestamp': '2004-12-13T12:19:00',
  'preshot_description': '\nRAISE TO 0.5T\n',
  'divertor_config': 'conventional'}]
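
The cell above indexes `result['data']` directly. Under the GraphQL specification, a response may also carry a top-level `errors` list, for example when a field name is misspelled, and in that case `data` can be missing or `null`. A minimal sketch of a guard is shown below; `extract_data` is a hypothetical helper name, and the sample payloads are made up for illustration rather than taken from this API.

```python
def extract_data(payload: dict) -> dict:
    """Return the `data` part of a GraphQL response, raising if errors are reported."""
    errors = payload.get("errors")
    if errors:
        # Each error is normally a mapping with a "message" key; fall back to str().
        messages = "; ".join(
            e["message"] if isinstance(e, dict) and "message" in e else str(e)
            for e in errors
        )
        raise RuntimeError(f"GraphQL query failed: {messages}")
    return payload["data"]


# Illustrative payloads (not real API output).
ok_payload = {"data": {"all_shots": {"shots": [{"shot_id": 11695}]}}}
shots = extract_data(ok_payload)["all_shots"]["shots"]

bad_payload = {"data": None, "errors": [{"message": "Cannot query field 'foo'"}]}
try:
    extract_data(bad_payload)
except RuntimeError as exc:
    error_text = str(exc)
```

In the notebook this could be used as `extract_data(response.json())['all_shots']['shots'][:3]`.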

### Searching & Filtering Data
We can also supply arguments to a GraphQL query, such as limiting the number of returned values or filtering by value. Here we limit the result to the first 3 shots and select only shots from the M9 campaign.

In [4]:
# Write our GraphQL query.
query = """
query {
    all_shots (limit: 3, where: {campaign: {eq: "M9"}}) {
        shots {
            shot_id
            timestamp
            preshot_description
            divertor_config
        }
    }
}
"""
# Query the API and get a JSON response
response = requests.post(ENDPOINT_URL, json={'query': query})
response.json()

{'data': {'all_shots': {'shots': [{'shot_id': 28390,
     'timestamp': '2012-03-06T14:47:00',
     'preshot_description': '\nBC5, 300 ms, 3 V. D2 plenum 1536 mbar. For reference.\n',
     'divertor_config': 'conventional'},
    {'shot_id': 28391,
     'timestamp': '2012-03-06T14:52:00',
     'preshot_description': '\nBC5, 300 ms, 5 V. D2 plenum 1536 mbar. For reference.\n',
     'divertor_config': 'conventional'},
    {'shot_id': 28392,
     'timestamp': '2012-03-06T15:03:00',
     'preshot_description': '\nHL11, 300 ms, 2 V. He plenum 1047.\n',
     'divertor_config': 'conventional'}]}}}
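
When the field list or the filter changes often, it can be convenient to assemble the query string from Python values instead of editing it by hand. The sketch below is one way to do that for `all_shots`. The helper `build_shots_query` and its `where` convention (column name mapped to an `(operator, value)` pair) are assumptions made for this example, and `json.dumps` is used as an approximation of GraphQL literal quoting. Only the `eq` and `contains` operators seen in this notebook are known to work.

```python
import json


def build_shots_query(fields, limit=None, where=None):
    """Assemble an `all_shots` query string from a field list and optional arguments."""
    args = []
    if limit is not None:
        args.append(f"limit: {int(limit)}")
    if where:
        # e.g. {"campaign": ("eq", "M9")} becomes campaign: {eq: "M9"}
        conditions = ", ".join(
            f"{column}: {{{op}: {json.dumps(value)}}}"
            for column, (op, value) in where.items()
        )
        args.append(f"where: {{{conditions}}}")
    arg_text = f" ({', '.join(args)})" if args else ""
    field_text = "\n            ".join(fields)
    return f"""query {{
    all_shots{arg_text} {{
        shots {{
            {field_text}
        }}
    }}
}}"""


query = build_shots_query(
    ["shot_id", "timestamp"],
    limit=3,
    where={"campaign": ("eq", "M9")},
)
print(query)
```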

### Nested Queries

One feature that makes GraphQL much more powerful than REST is that you can perform nested queries to gather related subsets of the data in a single request. For example, here we query for datasets with "AMC" in the name and, for each one, request information about the shots associated with that dataset.

In [5]:
# Write our GraphQL query.
query = """
query {
    all_signal_datasets (limit: 3, where: {name: {contains: "AMC"}}) {
        signal_datasets {
          name
          url
          shots (limit:3) {
            shot_id
            timestamp
            divertor_config
          }
        }
    }
}
"""
# Query the API and get a JSON response
response = requests.post(ENDPOINT_URL, json={'query': query})
response.json()

{'data': {'all_signal_datasets': {'signal_datasets': [{'name': 'AMC_EFPS_CURRENT',
     'url': 's3://mast/AMC_EFPS_CURRENT.zarr',
     'shots': [{'shot_id': 28412,
       'timestamp': '2013-03-06T15:13:00',
       'divertor_config': 'conventional'},
      {'shot_id': 28415,
       'timestamp': '2013-03-07T15:11:00',
       'divertor_config': 'conventional'},
      {'shot_id': 28416,
       'timestamp': '2013-03-07T15:27:00',
       'divertor_config': 'conventional'}]},
    {'name': 'AMC_ERROR_FIELD_02',
     'url': 's3://mast/AMC_ERROR_FIELD_02.zarr',
     'shots': [{'shot_id': 28785,
       'timestamp': '2013-05-21T16:00:00',
       'divertor_config': 'conventional'},
      {'shot_id': 28786,
       'timestamp': '2013-05-21T16:16:00',
       'divertor_config': 'conventional'},
      {'shot_id': 28787,
       'timestamp': '2013-05-21T16:32:00',
       'divertor_config': 'conventional'}]},
    {'name': 'AMC_ERROR_FIELD_05',
     'url': 's3://mast/AMC_ERROR_FIELD_05.zarr',
     'shots': 
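
A nested response like the one above is often easier to work with once it is flattened into one row per (dataset, shot) pair. The following is a minimal sketch of that step; `flatten_datasets` is a hypothetical helper, and the sample payload is a shortened copy of the first dataset in the output above.

```python
def flatten_datasets(payload: dict) -> list:
    """Turn a nested all_signal_datasets payload into one flat dict per shot."""
    rows = []
    for dataset in payload["data"]["all_signal_datasets"]["signal_datasets"]:
        # A dataset without shots contributes no rows.
        for shot in dataset.get("shots") or []:
            rows.append({"name": dataset["name"], "url": dataset["url"], **shot})
    return rows


# Shortened copy of the first dataset from the response above.
sample = {
    "data": {
        "all_signal_datasets": {
            "signal_datasets": [
                {
                    "name": "AMC_EFPS_CURRENT",
                    "url": "s3://mast/AMC_EFPS_CURRENT.zarr",
                    "shots": [
                        {"shot_id": 28412, "timestamp": "2013-03-06T15:13:00"},
                        {"shot_id": 28415, "timestamp": "2013-03-07T15:11:00"},
                    ],
                }
            ]
        }
    }
}
rows = flatten_datasets(sample)
for row in rows:
    print(row)
```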

### Pagination in GraphQL

GraphQL queries are paginated. You can access further entries by requesting the page metadata (`page_meta`) and passing its `next_cursor` value back as the `cursor` argument of the next query. Here's an example of getting paginated entries:

In [6]:
def do_query(cursor: str = None):
    query = """
    query {
        all_shots (limit: 3, where: {campaign: {contains: "M9"}}, ${cursor}) {
            shots {
                shot_id
                timestamp
                preshot_description
                divertor_config
            }
            page_meta {
              next_cursor
              total_items
              total_pages
            }
        }
    }
    """
    # Fill in the cursor argument; for the first page it is left empty.
    template = Template(query)
    query = template.substitute(cursor=f'cursor: "{cursor}"' if cursor is not None else "")
    return requests.post(ENDPOINT_URL, json={'query': query})


def iterate_responses():
    """Yield one decoded JSON payload per page until `next_cursor` is None."""
    cursor = None
    while True:
        response = do_query(cursor)
        payload = response.json() 
        yield payload
        cursor = payload['data']['all_shots']['page_meta']['next_cursor']
        if cursor is None:
            return

responses = iterate_responses()
print(next(responses))
print(next(responses))
print(next(responses))

{'data': {'all_shots': {'shots': [{'shot_id': 28390, 'timestamp': '2012-03-06T14:47:00', 'preshot_description': '\nBC5, 300 ms, 3 V. D2 plenum 1536 mbar. For reference.\n', 'divertor_config': 'conventional'}, {'shot_id': 28391, 'timestamp': '2012-03-06T14:52:00', 'preshot_description': '\nBC5, 300 ms, 5 V. D2 plenum 1536 mbar. For reference.\n', 'divertor_config': 'conventional'}, {'shot_id': 28392, 'timestamp': '2012-03-06T15:03:00', 'preshot_description': '\nHL11, 300 ms, 2 V. He plenum 1047.\n', 'divertor_config': 'conventional'}], 'page_meta': {'next_cursor': 'MjgzOTI=', 'total_items': 2084, 'total_pages': 695}}}}
{'data': {'all_shots': {'shots': [{'shot_id': 28393, 'timestamp': '2012-03-06T15:09:00', 'preshot_description': '\nHL11, 300 ms, 3 V. He plenum 1047.\n', 'divertor_config': 'conventional'}, {'shot_id': 28394, 'timestamp': '2012-03-06T15:13:00', 'preshot_description': '\nHL11, 300 ms, 5 V. He plenum 1047.\n', 'divertor_config': 'conventional'}, {'shot_id': 28395, 'timestam
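
The first response above reports 2084 items across 695 pages at 3 items per page, so looping until `next_cursor` is `None` means several hundred requests. A capped collector is a cautious alternative. The sketch below assumes a caller-supplied `fetch_page` callable that takes a cursor and returns a decoded payload; `collect_shots`, `max_pages`, and the fake pages are illustrative names and data, not part of the API.

```python
def collect_shots(fetch_page, max_pages=5):
    """Gather shots from successive pages, stopping at the last page or at max_pages."""
    shots = []
    cursor = None
    for _ in range(max_pages):
        payload = fetch_page(cursor)
        result = payload["data"]["all_shots"]
        shots.extend(result["shots"])
        cursor = result["page_meta"]["next_cursor"]
        if cursor is None:
            break
    return shots


# Stand-in for the API: two fake pages keyed by the cursor that requests them.
fake_pages = {
    None: {"data": {"all_shots": {
        "shots": [{"shot_id": 1}, {"shot_id": 2}],
        "page_meta": {"next_cursor": "a"}}}},
    "a": {"data": {"all_shots": {
        "shots": [{"shot_id": 3}],
        "page_meta": {"next_cursor": None}}}},
}
all_shots = collect_shots(fake_pages.get)
first_page_only = collect_shots(fake_pages.get, max_pages=1)
print([shot["shot_id"] for shot in all_shots])
```

Against the real endpoint, the `do_query` function defined in the cell above could be plugged in as `collect_shots(lambda cursor: do_query(cursor).json(), max_pages=10)`.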