# Querying Metadata with the REST API

In this notebook we explore how to search for metadata using the REST API. The REST API provides a way to programmatically extract a JSON representation of the metadata.

First we load the Python dependencies used in this notebook and set the variable `ENDPOINT_URL` to the location of the REST API.

In [62]:
import pandas as pd
import requests
import dask
import dask.dataframe as dd
import yaml

with open('config.yml') as file_handle:
    config = yaml.safe_load(file_handle)

ENDPOINT_URL = config['rest_api']

Below we print the location of the endpoint we will query in this notebook:

In [63]:
print(f"REST API Endpoint: {ENDPOINT_URL}")

REST API Endpoint: http://209.97.185.27:8081/json


## Querying Shots with the REST API

We're going to use the Python `requests` library to query metadata from the database. All we need to do to get a result is send an HTTP GET request to the appropriate endpoint. For example, to get information about different experimental shots we can query the `/json/shots` endpoint.

In [64]:
response = requests.get(f'{ENDPOINT_URL}/shots')
result = response.json()
print(f"Query returned status code: {response.status_code}")

Query returned status code: 200


The shots endpoint returns a JSON payload with a list of shots. Let's look at the first element from the payload:

In [65]:
result[0]

{'shot_id': 30471,
 'timestamp': '2013-09-27T15:20:00',
 'preshot_description': '\nThe last plasma:\nConvert to i/b Helios 1724 mbar fill and repeat\nStop beam at 140ms, beam at 70kV. \n',
 'postshot_description': '\nGood shot.\n',
 'campaign': 'M9',
 'reference_shot': 30470,
 'scenario': 2,
 'heating': 'SS Beam',
 'pellets': False,
 'rmp_coil': False,
 'current_range': '700 kA',
 'divertor_config': 'Conventional',
 'plasma_shape': 'Lower Single Null',
 'facility': 'MAST',
 'cpf_abort': 0.0,
 'cpf_amin_ipmax': 0.6082594266596257,
 'cpf_amin_max': 0.6102043,
 'cpf_amin_truby': 0.0,
 'cpf_area_ipmax': 1.9033160214221327,
 'cpf_area_max': 1.965805,
 'cpf_area_truby': 0.0,
 'cpf_bepmhd_ipmax': 0.34328417278133033,
 'cpf_bepmhd_max': 0.39305434,
 'cpf_bepmhd_truby': 0.0,
 'cpf_betmhd_ipmax': 7.1242606250376665,
 'cpf_betmhd_max': 7.4911194,
 'cpf_betmhd_truby': 0.0,
 'cpf_bt_ipmax': -0.2851174433375451,
 'cpf_bt_max': -0.28102276,
 'cpf_bt_truby': 0.0,
 'cpf_c2ratio': 0.0,
 'cpf_creation': 

Each item in the list is a JSON object containing the metadata that matched our query. In this case, each item describes a different MAST shot and holds many fields, for example the shot ID, the campaign the shot was part of, and the pre- and post-shot descriptions written by investigators.

For more information on what the API returns, see the endpoint documentation:
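
Once decoded, each item is a plain Python `dict`, so individual fields can be read by key. The sketch below is only an illustration: `record` is a hand-copied subset of the first item shown above rather than a live response, and the helper name `summarise_shot` is hypothetical.

```python
from datetime import datetime

# Hand-copied subset of the first item returned by the shots endpoint above.
record = {
    "shot_id": 30471,
    "timestamp": "2013-09-27T15:20:00",
    "campaign": "M9",
    "reference_shot": 30470,
    "heating": "SS Beam",
    "postshot_description": "\nGood shot.\n",
}


def summarise_shot(item):
    """Return a small dict with a few commonly used fields of one shot."""
    return {
        "shot_id": item["shot_id"],
        "campaign": item["campaign"],
        # Timestamps are ISO 8601 strings, so they can be parsed directly.
        "timestamp": datetime.fromisoformat(item["timestamp"]),
        # Descriptions carry leading/trailing newlines; strip them for display.
        "postshot_description": item["postshot_description"].strip(),
    }


summary = summarise_shot(record)
print(summary)
```

The same function could be applied to every element of `result` with a list comprehension.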

https://mast-app.site/redoc


Of course, we can read this JSON data directly into common Python data analysis packages. For example, we can create a `pandas` dataframe directly from the endpoint:

In [66]:
df = pd.read_json(f'{ENDPOINT_URL}/shots')
df.head()

Unnamed: 0,shot_id,timestamp,preshot_description,postshot_description,campaign,reference_shot,scenario,heating,pellets,rmp_coil,...,cpf_tvol_max,cpf_twmhd_max,cpf_useful,cpf_vol_ipmax,cpf_vol_max,cpf_vol_truby,cpf_wmhd_ipmax,cpf_wmhd_max,cpf_wmhd_truby,cpf_zmag_efit
0,30471,2013-09-27 15:20:00,\nThe last plasma:\nConvert to i/b Helios 1724...,\nGood shot.\n,M9,30470,2.0,SS Beam,False,0.0,...,0.125,0.135,1,8.817559,9.283702,0.0,38063.58238,40906.09,0.0,0.01434
1,30470,2013-09-27 15:03:00,\nRepeat last using hydrogen in outboard and c...,\nNo HF gas.\n,M9,30467,2.0,SS Beam,False,,...,0.135,0.105,0,9.687049,10.055509,0.0,17290.432865,22310.516,0.0,0.015164
2,30469,2013-09-27 14:39:00,\nRepeat with increased beam power (74 kV)\ncH...,\nGood shot. Modes present.\n,M9,30467,3.0,SS Beam,False,0.0,...,0.14,0.145,1,8.98873,9.047923,0.0,47466.249616,49115.805,0.0,0.015299
3,30468,2013-09-27 14:21:00,\nRepeat with new neutron camera position.\ncH...,\nGood beam.\nGood repeat.\n,M9,30467,2.0,SS Beam,False,0.0,...,0.14,0.13,1,9.102411,9.107017,0.0,48516.962675,49382.133,0.0,0.012445
4,30467,2013-09-27 14:03:00,\nRepeat with new neutron camera position.\ncH...,\nTwo times lower DD neutron rate than referen...,M9,30459,3.0,SS Beam,False,0.0,...,0.125,0.18,1,9.029202,9.046394,0.0,49469.122469,52653.445,0.0,0.013202


### Searching & Filtering Data

All REST API endpoints can take query parameters to filter the data returned. For example, we can return all shots from the `M9` campaign by adding `?campaign=M9` to the query string.

In [67]:
df = pd.read_json(f"{ENDPOINT_URL}/shots?campaign=M9")
df.head()

Unnamed: 0,shot_id,timestamp,preshot_description,postshot_description,campaign,reference_shot,scenario,heating,pellets,rmp_coil,...,cpf_tvol_max,cpf_twmhd_max,cpf_useful,cpf_vol_ipmax,cpf_vol_max,cpf_vol_truby,cpf_wmhd_ipmax,cpf_wmhd_max,cpf_wmhd_truby,cpf_zmag_efit
0,30471,2013-09-27 15:20:00,\nThe last plasma:\nConvert to i/b Helios 1724...,\nGood shot.\n,M9,30470,2.0,SS Beam,False,0.0,...,0.125,0.135,1,8.817559,9.283702,0.0,38063.58238,40906.09,0.0,0.01434
1,30470,2013-09-27 15:03:00,\nRepeat last using hydrogen in outboard and c...,\nNo HF gas.\n,M9,30467,2.0,SS Beam,False,,...,0.135,0.105,0,9.687049,10.055509,0.0,17290.432865,22310.516,0.0,0.015164
2,30469,2013-09-27 14:39:00,\nRepeat with increased beam power (74 kV)\ncH...,\nGood shot. Modes present.\n,M9,30467,3.0,SS Beam,False,0.0,...,0.14,0.145,1,8.98873,9.047923,0.0,47466.249616,49115.805,0.0,0.015299
3,30468,2013-09-27 14:21:00,\nRepeat with new neutron camera position.\ncH...,\nGood beam.\nGood repeat.\n,M9,30467,2.0,SS Beam,False,0.0,...,0.14,0.13,1,9.102411,9.107017,0.0,48516.962675,49382.133,0.0,0.012445
4,30467,2013-09-27 14:03:00,\nRepeat with new neutron camera position.\ncH...,\nTwo times lower DD neutron rate than referen...,M9,30459,3.0,SS Beam,False,0.0,...,0.125,0.18,1,9.029202,9.046394,0.0,49469.122469,52653.445,0.0,0.013202
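
When several filters are combined, or a value contains characters that need escaping, writing the query string by hand becomes error-prone. The sketch below builds it with the standard library's `urlencode` instead. The helper name `build_query_url` and the placeholder base URL are hypothetical, and only the `campaign` filter is confirmed by this notebook; the endpoint documentation should say which other fields can be filtered.

```python
from urllib.parse import urlencode


def build_query_url(endpoint_url, resource, **filters):
    """Join the endpoint, a resource name and optional filters into one URL."""
    base = f"{endpoint_url.rstrip('/')}/{resource}"
    if not filters:
        return base
    # urlencode escapes special characters and joins the pairs with '&'.
    return f"{base}?{urlencode(filters)}"


# Placeholder base URL; in this notebook ENDPOINT_URL would be used instead.
url = build_query_url("http://example.invalid/json", "shots", campaign="M9")
print(url)
```

The resulting string can be passed to `pd.read_json` or `requests.get` in place of a hand-written URL. `requests.get` also accepts a `params` dictionary, which performs the same encoding.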


### Pagination

The REST API responses are _paginated_, meaning that only a subset of the matching items is returned with each query. Pagination limits the amount of data returned by a single request, which prevents any one user from overloading the server with huge data requests.

Pagination information is included in the headers of the response, with the page links following [RFC 8288](https://datatracker.ietf.org/doc/html/rfc8288). The headers contain the following properties:

 - `x-total-count` - this is the total number of items in the database which matched your query.
 - `x-total-pages` - this is the total number of pages with the given page size.
 - `link` - contains links to the next and previous pages for this query.

You can control the pagination using the following options as query arguments:
 - `page=xx` to set the page of items to view
 - `per_page=xx` to set the number of results returned by each page.
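
The relationship between these options and the `x-total-pages` header is simple arithmetic: the number of pages is the total item count divided by `per_page`, rounded up. The sketch below generates one URL per page from those two numbers. The helper name `page_urls` and the placeholder URL are hypothetical, and the sketch assumes that pages are numbered from 1 and that the base URL has no query string yet, neither of which this notebook confirms.

```python
import math
from urllib.parse import urlencode


def page_urls(base_url, total_count, per_page):
    """Yield the URL of every page for a query with `total_count` matches."""
    total_pages = math.ceil(total_count / per_page)
    # Assumption: the first page is page=1.
    for page in range(1, total_pages + 1):
        yield f"{base_url}?{urlencode({'page': page, 'per_page': per_page})}"


urls = list(page_urls("http://example.invalid/json/shots", 18370, 2))
print(len(urls))  # 18370 items at 2 per page -> 9185 pages
print(urls[0])
```

In practice `total_count` would be read from the `x-total-count` header of a first request rather than hard-coded.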

In [68]:
response = requests.get(f'{ENDPOINT_URL}/shots?page=10&per_page=2')
result = response.json()
headers = response.headers

print("Total number of pages", headers['x-total-pages'])
print("Total number of items in the database", headers['x-total-count'])
print("Links to the next and previous pages", headers['link'])

df = pd.DataFrame(result)
df

Total number of pages 9185
Total number of items in the database 18370
Links to the next and previous pages <http://209.97.185.27:8081/json/shots?page=11&per_page=2>; rel="next"<http://209.97.185.27:8081/json/shots?page=9&per_page=2>; rel="previous"


Unnamed: 0,shot_id,timestamp,preshot_description,postshot_description,campaign,reference_shot,scenario,heating,pellets,rmp_coil,...,cpf_tvol_max,cpf_twmhd_max,cpf_useful,cpf_vol_ipmax,cpf_vol_max,cpf_vol_truby,cpf_wmhd_ipmax,cpf_wmhd_max,cpf_wmhd_truby,cpf_zmag_efit
0,30451,2013-09-26T16:58:00,\nRepeat with TF settings as in 29341. Very lo...,"\nOk. Dies very early, but late enough for the...",M9,30377,2,SS Beam,False,False,...,0.13,0.12,1.0,7.022583,7.216327,0.0,25568.796591,25959.92,0.0,-0.000501
1,30450,2013-09-26T16:37:00,\nReload 30377 and TF settings as in 29301.\n,"\nOk, although shorter than reference, probabl...",M9,30377,2,SS Beam,False,False,...,0.165,0.185,1.0,7.812098,9.001694,0.0,47538.293348,63824.93,0.0,0.005763
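
To walk through the results page by page, the `link` header has to be split into its individual URLs. The sketch below does this with a regular expression rather than by splitting on commas, because the header printed above has no separator between its two entries. The helper name `parse_link_header` is hypothetical, and the pattern only handles the `<url>; rel="name"` form seen in this notebook, not every variant that RFC 8288 allows.

```python
import re

# Matches one '<url>; rel="name"' entry of a Link header.
LINK_PATTERN = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def parse_link_header(header_value):
    """Map each rel name (for example 'next') to its URL."""
    return {rel: url for url, rel in LINK_PATTERN.findall(header_value)}


# Header value copied from the output above.
link = (
    '<http://209.97.185.27:8081/json/shots?page=11&per_page=2>; rel="next"'
    '<http://209.97.185.27:8081/json/shots?page=9&per_page=2>; rel="previous"'
)
links = parse_link_header(link)
print(links["next"])
print(links.get("previous"))
```

Requesting `links.get("next")` repeatedly until it returns `None` would visit the remaining pages in order, assuming the last page omits the `next` entry.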
