# Interacting with the ESCALATE REST API

This is a tutorial for using the ESCALATE REST API.

It presumes only knowledge of basic Python.

It also presumes you have a local instance of ESCALATE running at http://localhost:8000

In [None]:
import json
from pprint import pprint

import requests  # requests library will send and receive data from the escalate server
from requests.api import post

import pandas as pd

## Quick intro to REST APIs

All you need to know about REST is that:
* It is a convention (an architectural style built on HTTP) for exchanging data between a client (e.g. me) and a server (e.g. ESCALATE)
* Data formats are human *and* machine readable (typically JSON or XML)
* There are two main HTTP 'verbs' or functions: 
  1. GET data from the server
  2. POST data to the server  
     (there are others, e.g. PUT and PATCH, but we won't use these in this tutorial)
* The Python `requests` library implements the HTTP verbs for interacting with servers, including with REST APIs
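The two verbs differ mainly in where the data travels: a GET puts its parameters in the URL, while a POST carries them in a request body. As a minimal sketch, the standard-library `urllib.request.Request` can build one of each for inspection. It is used here instead of `requests` only so that nothing is sent and no server is needed. The endpoint names come from later in this tutorial, and the payload value is a placeholder.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

base_url = 'http://localhost:8000/api'  # same local dev server assumed throughout

# GET: the query travels in the URL itself
query = urlencode({'description': 'Ethylammonium Iodide'})
get_req = Request(f'{base_url}/material/?{query}')

# POST: the data travels in the request body (here JSON-encoded bytes)
body = json.dumps({'value': '1.0'}).encode('utf-8')  # placeholder payload
post_req = Request(f'{base_url}/materialproperty/',
                   data=body,
                   headers={'Content-Type': 'application/json'},
                   method='POST')

# Nothing has been sent; we are only inspecting the request objects
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url, post_req.data)
```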

### Example: GET from the PubChem API

* [PubChem](https://pubchem.ncbi.nlm.nih.gov/) is a great way to get chemical informatics data from a vast repository
   - e.g. just search for any compound by name
   - Has both a graphical web interface and a REST API
* (There are other computational chemistry REST APIs, including [Open Chemistry](https://doi.org/10.1186/s13321-017-0241-z), and [AFLOW-ML](https://doi.org/10.1016/j.commatsci.2018.03.075))


In [None]:
# dict mapping compound name to PubChemID
PubChemIDs = {
    'Methane': '297',
    'Benzene': '241',
    'Ethylammonium Iodide': '11116533'
}

The URL below is the compound 'API endpoint'. Sending it a request that lists PubChem CIDs returns the requested properties of those compounds.

In [None]:
response = requests.get(('https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound'          # What type of entity? Compound
                         f'/cid/{",".join(PubChemIDs.values())}/'                      # Which compounds? These
                         'property/MolecularFormula,MolecularWeight,CanonicalSMILES/'  # Which properties? 
                         'JSON'))                                                      # Which format?
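The URL in the cell above has four parts: entity type, identifiers, properties, and output format. A small helper makes that structure explicit. This is only a sketch: `pubchem_property_url` is a hypothetical name, not part of PubChem or `requests`, and it only builds the string without sending anything.

```python
PUG_REST = 'https://pubchem.ncbi.nlm.nih.gov/rest/pug'

def pubchem_property_url(cids, properties, output='JSON'):
    """Build a PUG REST URL for `properties` of the compounds in `cids`."""
    cid_part = ','.join(str(cid) for cid in cids)   # which compounds
    prop_part = ','.join(properties)                # which properties
    return f'{PUG_REST}/compound/cid/{cid_part}/property/{prop_part}/{output}'

url = pubchem_property_url(['297', '241'],
                           ['MolecularFormula', 'MolecularWeight'])
print(url)
```

The resulting string could be passed straight to `requests.get`.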

In [None]:
response

In [None]:
response.json()
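The parsed JSON is a nested dictionary, so pulling values out is ordinary dict and list indexing. The sketch below uses a hand-written sample in the shape this tutorial relies on later (`PropertyTable` → `Properties` → one dict per compound). The sample is illustrative rather than captured output, and the presence of a `CID` key in each entry is an assumption.

```python
# Hand-written sample in the shape of a PubChem property response;
# it is illustrative, not captured output.
sample = {
    'PropertyTable': {
        'Properties': [
            {'CID': 297, 'MolecularFormula': 'CH4', 'MolecularWeight': '16.043'},
            {'CID': 241, 'MolecularFormula': 'C6H6', 'MolecularWeight': '78.11'},
        ]
    }
}

def molecular_weights(payload):
    """Map each CID (as a string) to its molecular weight as a float."""
    rows = payload['PropertyTable']['Properties']
    return {str(row['CID']): float(row['MolecularWeight']) for row in rows}

weights = molecular_weights(sample)
print(weights)
```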

## The ESCALATE REST API 

* Send data back and forth between ML, chemists, and laboratories

### Interactive List of all available ESCALATE endpoints

Simply navigating to http://localhost:8000/api will show you a browsable list of all available API endpoints. 

We can also view that list programmatically: 

In [None]:
base_url = 'http://localhost:8000/api'  # local dev server
response = requests.get(base_url)
response.json()

### Logging into ESCALATE


To be able to view and post to all endpoints, I will create a login session that is managed by a token.

In [None]:
# demo login credentials
login_data = {
    'username': 'mtynes',
    'password': 'hello1world2'
}

r_login = requests.post(f'{base_url}/login', data=login_data)
token = r_login.json()['token']
token

In [None]:
token_header = {'Authorization': f'Token {token}'}

content_type_header = {'content-type': 'application/json'} # for most requests we'll want this header

This token lets the server verify my identity for the rest of this session.

### Simple helper functions for GET/POST

These functions will do some minimal URL generation and JSON response parsing on top of what is done by the `requests` library

In [None]:
def post_data(endpoint, data=None, headers=None):
    """POST `data` to `endpoint` in ESCALATE API using `headers`
    
    return: dict (the new resource) on success, requests.Response on failure
    """
    if data is None:  # avoid mutable default arguments
        data = {}
    if headers is None:
        headers = {**token_header, **content_type_header}
    r = requests.post(f'{base_url}/{endpoint}/', 
                      data=json.dumps(data), 
                      headers=headers)
    print(r)
    if r.ok: 
        print('POST: OK, returning new resource dict')
        return r.json()
    print('POST: FAILED, returning response object')
    return r


def get_data(endpoint, data=None, headers=None):
    """Make GET request with `data` to `endpoint` in ESCALATE API using `headers`
    
    return: dict (one result), list of dicts (many results), 
            or requests.Response (request failed or nothing matched)
    """
    if headers is None:
        headers = token_header
    r = requests.get(f'{base_url}/{endpoint}/', params=data, headers=headers)
    print(r)
    if not r.ok:
        print('GET: FAILED, returning response object')
        return r
    print('GET: OK')

    resp_json = r.json()

    # handle cases: one vs many results
    if not isinstance(resp_json, dict) or resp_json.get('count') is None: # edge case: template edit
        return resp_json
    if resp_json['count'] == 1: 
        print('Found one resource, returning dict')
        return resp_json['results'][0]
    if resp_json['count'] > 1: 
        print(f"Found {resp_json['count']} resources, returning list of dicts")
        return resp_json['results']
    print('GET: OK but found no matching resources, returning response object')
    return r
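The one-versus-many handling in `get_data` can be tried without a server by isolating it in a pure function. The sketch below is an illustration, not part of ESCALATE: `unwrap_results` is a hypothetical name, the payloads are hand-written in the `count`/`results` shape that `get_data` checks for, and unlike `get_data` it returns `None` when nothing matches.

```python
def unwrap_results(resp_json):
    """Return a dict for one result, a list for many, None for zero.

    Payloads without a 'count' key are returned unchanged.
    """
    if not isinstance(resp_json, dict) or 'count' not in resp_json:
        return resp_json
    if resp_json['count'] == 0:
        return None
    if resp_json['count'] == 1:
        return resp_json['results'][0]
    return resp_json['results']

# Hand-written payloads in the paginated shape that get_data checks for
one = {'count': 1, 'results': [{'description': 'dispense'}]}
many = {'count': 2, 'results': [{'description': 'a'}, {'description': 'b'}]}
none = {'count': 0, 'results': []}

print(unwrap_results(one))
print(unwrap_results(many))
print(unwrap_results(none))
```

Because the return type depends on the number of matches, callers that expect a list should be prepared to receive a single dict instead.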

### GET all of the materials defined in ESCALATE

In [None]:
r = get_data(endpoint='material')

Showing all of these materials would probably take up too much space in this notebook, so let's look at the first one.

Every sample of lead iodide shares the same chemical formula, and so on. A material model captures what is the same across all such samples, as opposed to a particular vial of GBL, which has a volume, contaminants, dates, and masses. Those details belong to the object, while the shared characteristics belong to the model. Models do not contain provenance; they describe only intensive properties.

In [None]:
r[0]

These are the fields available for materials. Notably, we can associate a material with arbitrary properties and material types.

### POST a material property

#### Current property definitions in ESCALATE

ESCALATE supports user-defined property definitions. These are the ones we're using:

In [None]:
r = get_data('propertydef', 
             {'fields': ['description']} # we can select 'columns' with the fields parameter    
            )
r

#### Molecular Weight

In [None]:
mw_property_def = get_data('propertydef', 
                           {'description': 'molecular-weight', # find an entity that has a particular description
                            'fields': ['url', 'description', 'val_unit']})
mw_property_def
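The dictionary passed to `get_data` ends up in the URL's query string: scalar values act as filters, and the list under `fields` selects columns. `requests` encodes a list-valued parameter as one `key=value` pair per element. The sketch below reproduces that encoding with the standard library's `urlencode(..., doseq=True)` so the result can be inspected without sending a request.

```python
from urllib.parse import urlencode

params = {'description': 'molecular-weight',
          'fields': ['url', 'description', 'val_unit']}

# doseq=True expands the list into one key=value pair per element,
# which mirrors how requests encodes list-valued params
query = urlencode(params, doseq=True)
print(query)
```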

Now we can associate an instance of this property with a material, say ethylammonium iodide (EthNH3I):

In [None]:
ethylammonium_iodide = get_data('material', 
                                {'description': 'Ethylammonium Iodide',
                                 'fields':['url', 'description', 'property']})
ethylammonium_iodide

Note the empty property list above.

Let's fill that in with the molecular weight from PubChem:

In [None]:
response = requests.get('https://pubchem.ncbi.nlm.nih.gov/rest/pug/'+\
                        f'compound/cid/{PubChemIDs.get("Ethylammonium Iodide")}/'+\
                        'property/MolecularFormula,MolecularWeight,CanonicalSMILES/JSON')
pubchem_json = response.json()
eth_mw = pubchem_json['PropertyTable']['Properties'][0]['MolecularWeight']
eth_mw

In [None]:
r = post_data('materialproperty',
                  {'material': ethylammonium_iodide['url'],
                   'property_def': mw_property_def['url'],
                   'value': f"{eth_mw}"
                  }
                 )

And we've stored it!

In [None]:
get_data('material', 
         {'description':'Ethylammonium Iodide'})


* In practice we can use this functionality to store properties from any experiment or calculation. 

* We can also store metadata about where these values came from   
  (example to come in a further tutorial on tags, notes, edocs, calculations).
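As a sketch of how the same pattern might scale to several materials, the payload construction can be factored into a helper and applied in a loop. `material_property_payload` is a hypothetical name, and the URLs below are placeholders standing in for resources that would normally come from `get_data`.

```python
def material_property_payload(material, property_def, value):
    """Build the body for a POST to the materialproperty endpoint."""
    return {
        'material': material['url'],
        'property_def': property_def['url'],
        'value': str(value),  # the tutorial sends the value as a string
    }

# Placeholder resources; real ones would come from get_data
mw_def = {'url': 'http://localhost:8000/api/propertydef/example-mw/'}
materials_and_values = [
    ({'url': 'http://localhost:8000/api/material/example-a/'}, 16.043),
    ({'url': 'http://localhost:8000/api/material/example-b/'}, 78.11),
]

payloads = [material_property_payload(material, mw_def, value)
            for material, value in materials_and_values]
for payload in payloads:
    print(payload)
```

Each payload could then be sent with `post_data('materialproperty', payload)`.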

## Action definitions

* Just as we are free to define properties, we are free to define actions
* The current definitions are what we've needed to specify human/robot instructions for current workflows

In [None]:
r = get_data('actiondef', {'fields': ['description', 'uuid', 'url']})
r

#### Zooming in on the dispense action definition

In [None]:
get_data('actiondef',               
         {'description': 'dispense',  # which action def
          'expand': 'parameter_def',  # sub dictionary to expand
          })

### Actions + Materials = Experiment Template 

* Experiment template = the form of an experiment that I wish to re-use, varying material choices and process parameters

In [None]:
experiment_templates = get_data('experimenttemplate', 
                               {'fields':['description', 'url']})

In [None]:
experiment_templates

Click on the perovskite demo link and note the two main nested fields: 
* Bill of materials = the initial materials for the perovskite workflow
* Workflow = the set of actions that combine these materials into perovskite crystal trials

In [None]:
perovskite_template = get_data('experimenttemplate',
                              {'description': 'test_wf_1', 
                              'expand': 'workflow' # expand the workflow subdictionary
                              })

In [None]:
perovskite_template

## Bill of Materials

The initial materials: think of the list of materials in the methods section of a paper.

These are the material entries in the Bill of Materials for this experiment: 

In [None]:
get_data('bommaterial', {'bom':perovskite_template['bill_of_materials'][0]['uuid'], 
                        'fields':['description']})

In [None]:
get_data('compositematerial', {'composite_description__startswith':'Stock', 
                               'fields': ['composite_description',
                                          'component_description']})

#### We could drill down further to get the concentrations, properties, etc.

### Workflows: Logical groups of actions

Each of these contains a set of parameters that we can edit

In [None]:
perovskite_demo = get_data('experimenttemplate',
         {'description': 'perovskite_demo',
          'expand': 'workflow'})
[wf['description'] for wf in perovskite_demo['workflow']]

In [None]:
dispense_solvent_wf = perovskite_demo['workflow'][3] # pull out a workflow
example_steps = dispense_solvent_wf['step'][4]     # pull out one step of that workflow
example_steps
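Indexing workflows by position (`[3]`) depends on the order the server returns them in. A hedged alternative is to look a workflow up by its description. The sketch below runs on a hand-written stand-in for an expanded template: the `workflow`, `description`, and `step` keys come from the cells above, while the workflow names and both helper functions are hypothetical.

```python
def summarize_workflows(template):
    """Return (description, number of steps) for each workflow in a template."""
    return [(wf['description'], len(wf.get('step', [])))
            for wf in template.get('workflow', [])]

def workflow_by_description(template, description):
    """Return the first workflow whose description matches exactly."""
    for wf in template['workflow']:
        if wf['description'] == description:
            return wf
    raise KeyError(description)

# Hand-written stand-in for an expanded template; names are placeholders
sample_template = {
    'description': 'example_template',
    'workflow': [
        {'description': 'Preheat', 'step': [{}, {}]},
        {'description': 'Dispense Solvent', 'step': [{}, {}, {}]},
    ],
}

print(summarize_workflows(sample_template))
print(len(workflow_by_description(sample_template, 'Dispense Solvent')['step']))
```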

## Creating a new workflow from a template

To create an experiment instance from a template, I first GET an editable copy of the template from its `create` endpoint:

In [None]:
editable_template = get_data(
    # template endpoint  /        template ID      / create an instance of this template 
    f'experimenttemplate/{perovskite_demo["uuid"]}/create'
)

In [None]:
editable_template

Parameters can also be given as arrays of values across a 96-well plate.

In [None]:
editable_template['experiment_name'] = 'test_perovskite_instance'

Once I have edited this JSON, I can POST it back to the server to create the new experiment:

In [None]:
resp = post_data(
    f'experimenttemplate/{perovskite_demo["uuid"]}/create',
    editable_template
)

In [None]:
resp

This experiment then appears in the experiment queue.   

http://localhost:8000/experiment_list

The experimentalist is notified and can then download the relevant robot input and upload observed values through forms.

## Parsing experiment results for ML

In [None]:
%%time
new_experiment_json = get_data('experiment/' + resp['new_experiment_created'].split('/')[-2], 
                                {'expand': 'workflow.step.workflow_object.action.parameter'}) # expanding deeply nested fields is somewhat slow
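The `.split('/')[-2]` expression above takes the second-to-last piece of the new experiment's URL, which works because the URL ends in a trailing slash. A slightly more defensive sketch is shown below. It assumes the identifier is a UUID, as the `uuid` fields elsewhere in this tutorial suggest; `uuid_from_url` is a hypothetical helper, and the example URL is made up.

```python
import uuid
from urllib.parse import urlparse

def uuid_from_url(url):
    """Return the last path segment of `url`, checked to be a valid UUID."""
    last_segment = urlparse(url).path.rstrip('/').split('/')[-1]
    return str(uuid.UUID(last_segment))  # raises ValueError if not a UUID

# Made-up URL in the shape of an ESCALATE resource link
example_url = 'http://localhost:8000/api/experiment/123e4567-e89b-12d3-a456-426614174000/'
print(uuid_from_url(example_url))
```

This version tolerates a missing trailing slash and fails loudly if the URL does not end in an identifier.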

In [None]:
experiment_json = [new_experiment_json]

In [None]:
experiment_json

In [None]:
def experiment_json_to_df(experiment_json):
    result = []
    for e in experiment_json: 
        for workflow in e['workflow']:
            if 'Dispense' not in workflow['description']:
                continue
            for step in workflow['step']: 
                action = step['workflow_object']['action']
                for parameter in action['parameter']:
                    result.append(
                        dict(
                            experiment_url         = e['url'],
                            experiment_id          = e['url'].split('/')[-2],
                            action_source          = action['source_material_description'],
                            action_dest            = action['destination_material_description'],
                            action_parameter       = parameter['parameter_def_description'],
                            action_parameter_value = float(parameter['parameter_val_nominal']['value']),
                            action_parameter_unit  = parameter['parameter_val_nominal']['unit']
                            )
                        )
    return pd.DataFrame(result)

In [None]:
results = experiment_json_to_df(experiment_json)

In [None]:
results

In [None]:
results = results.pivot_table(index=['experiment_id', 'action_dest'], 
                             columns=['action_source'], 
                             values='action_parameter_value')
results
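The pivot turns the long table (one row per action parameter) into a wide one: one row per `(experiment_id, action_dest)` pair and one column per `action_source`. When several rows land in the same cell, `pivot_table` combines them with the mean by default. The dependency-free sketch below reproduces that reshaping on hand-written rows; all names and values in it are placeholders.

```python
from collections import defaultdict
from statistics import mean

# Hand-written long-format rows; names and values are placeholders
rows = [
    {'experiment_id': 'exp-1', 'action_dest': 'A1',
     'action_source': 'Solvent', 'action_parameter_value': 100.0},
    {'experiment_id': 'exp-1', 'action_dest': 'A1',
     'action_source': 'Stock A', 'action_parameter_value': 50.0},
    {'experiment_id': 'exp-1', 'action_dest': 'A1',
     'action_source': 'Stock A', 'action_parameter_value': 70.0},
    {'experiment_id': 'exp-1', 'action_dest': 'A2',
     'action_source': 'Solvent', 'action_parameter_value': 80.0},
]

# Collect the values that fall into each (row index, column) cell
cells = defaultdict(list)
for row in rows:
    index = (row['experiment_id'], row['action_dest'])
    cells[index, row['action_source']].append(row['action_parameter_value'])

# Reduce each cell with the mean, which is pivot_table's default aggfunc
wide = defaultdict(dict)
for (index, column), values in cells.items():
    wide[index][column] = mean(values)

for index, columns in wide.items():
    print(index, columns)
```

Cells with no matching rows are simply absent here; in the pandas result they appear as `NaN`.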

In [None]:
crystal_scores = get_data('measure', 
                          {'measuredef':
                                (get_data('measuredef', 
                                          {'description': 'crystal_score'})['url']
                                )
                          })                        
results['crystal_score'] = crystal_scores['measure_value']['value']

In [None]:
results

### Current Limitations

* Some parts of the API are still 'high entropy' (e.g. measure)
* Ditto for some portions of the UI
* REST is slow for large transfers