# Example: Retrieving Data from a Web App's API
<h3> Imports </h3>
Let's first import the libraries we need.

In [None]:
import requests
import csv
import json
import pandas as pd

<h3> Setting up the request </h3>
Now, in order to connect securely to the API, we need an access token. I have generated one here for demo purposes. The access token must be passed to the server in the header of the HTTP request.

In [9]:
token = '796da693d7c232875342491039e21ad6e60b6715'

head = {'Authorization': 'token {}'.format(token)}
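Hard-coding a token is fine for a throwaway demo, but for anything longer-lived it is usually safer to keep it out of the notebook. Below is a minimal sketch, assuming the token is stored in an environment variable. The variable name `SURFACE_API_TOKEN` and the helper `build_auth_header` are made up for this example and are not part of the API.

```python
import os


def build_auth_header(token):
    """Build the same 'token <value>' Authorization header as above."""
    return {'Authorization': 'token {}'.format(token)}


# SURFACE_API_TOKEN is a hypothetical variable name, not something the API defines.
env_token = os.environ.get('SURFACE_API_TOKEN')

if env_token is None:
    print('SURFACE_API_TOKEN is not set; keep using the demo token above.')
else:
    # Separate name so the demo `head` defined above is left untouched.
    env_head = build_auth_header(env_token)
```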

Then we will store all the useful API endpoints as variables.

In [29]:
sample_url = 'https://demo.surface-analytics.com/api/sample/'
results_url = 'https://demo.surface-analytics.com/api/result/'
study_url = 'https://demo.surface-analytics.com/api/study/'

<h3> Making a request </h3>
Now we will make a request to one of the endpoints and see what the server returns.
In this case, we will get a list of all samples in the database.

In [4]:
endpoint = sample_url
response = requests.get(endpoint, headers=head)
samples = response.json()
print(samples)

[{'id': 1, 'label': 'MG-001-01', 'name': None, 'parent': None, 'date_created': '2021-09-21 13:40:18', 'state': 1, 'sample_form': None, 'mass': None, 'mass_unit': "('milligrams', 'Milligrams')", 'created_by': 1}, {'id': 2, 'label': 'MG-001-02', 'name': 'Annealed Pd', 'parent': 1, 'date_created': '2021-09-21 13:40:43', 'state': 4, 'sample_form': None, 'mass': None, 'mass_unit': "('milligrams', 'Milligrams')", 'created_by': 1}, {'id': 3, 'label': 'MG-001-03', 'name': 'Activated Pd', 'parent': 2, 'date_created': '2021-09-21 13:41:00', 'state': 3, 'sample_form': None, 'mass': None, 'mass_unit': "('milligrams', 'Milligrams')", 'created_by': 1}, {'id': 4, 'label': 'MG-001-04', 'name': 'Used Pd', 'parent': None, 'date_created': '2021-09-21 13:41:18', 'state': 3, 'sample_form': None, 'mass': None, 'mass_unit': "('milligrams', 'Milligrams')", 'created_by': 1}]


Great! It worked. It is not so easy to read, but what we got is a list of dictionaries containing metadata about the samples.
Let's make it look a little prettier.

In [11]:
print(json.dumps(samples, indent=2))

[
  {
    "id": 1,
    "label": "MG-001-01",
    "name": null,
    "parent": null,
    "date_created": "2021-09-21 13:40:18",
    "state": 1,
    "sample_form": null,
    "mass": null,
    "mass_unit": "('milligrams', 'Milligrams')",
    "created_by": 1
  },
  {
    "id": 2,
    "label": "MG-001-02",
    "name": "Annealed Pd",
    "parent": 1,
    "date_created": "2021-09-21 13:40:43",
    "state": 4,
    "sample_form": null,
    "mass": null,
    "mass_unit": "('milligrams', 'Milligrams')",
    "created_by": 1
  },
  {
    "id": 3,
    "label": "MG-001-03",
    "name": "Activated Pd",
    "parent": 2,
    "date_created": "2021-09-21 13:41:00",
    "state": 3,
    "sample_form": null,
    "mass": null,
    "mass_unit": "('milligrams', 'Milligrams')",
    "created_by": 1
  },
  {
    "id": 4,
    "label": "MG-001-04",
    "name": "Used Pd",
    "parent": null,
    "date_created": "2021-09-21 13:41:18",
    "state": 3,
    "sample_form": null,
    "mass": null,
    "mass_unit": "('millig

That's better.
Alternatively, we could structure it into a DataFrame.

<h3> Reformatting into a DataFrame </h3>

In [27]:
samples_df = pd.DataFrame(samples)

In [28]:
display(samples_df)

Unnamed: 0,id,label,name,parent,date_created,state,sample_form,mass,mass_unit,created_by
0,1,MG-001-01,,,2021-09-21 13:40:18,1,,,"('milligrams', 'Milligrams')",1
1,2,MG-001-02,Annealed Pd,1.0,2021-09-21 13:40:43,4,,,"('milligrams', 'Milligrams')",1
2,3,MG-001-03,Activated Pd,2.0,2021-09-21 13:41:00,3,,,"('milligrams', 'Milligrams')",1
3,4,MG-001-04,Used Pd,,2021-09-21 13:41:18,3,,,"('milligrams', 'Milligrams')",1
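
One detail worth noticing in the output: `mass_unit` arrives as the string `"('milligrams', 'Milligrams')"`, which looks like a Python tuple that was converted to text on the server. If only the unit name is needed, it could be recovered with `ast.literal_eval`. This is a sketch that assumes every value has that two-element shape; `parse_mass_unit` is a hypothetical helper name.

```python
import ast


def parse_mass_unit(raw):
    """Turn "('milligrams', 'Milligrams')" into a real two-element tuple."""
    # literal_eval only evaluates Python literals, never arbitrary code.
    value = ast.literal_eval(raw)
    if not (isinstance(value, tuple) and len(value) == 2):
        raise ValueError('unexpected mass_unit format: {!r}'.format(raw))
    return value


# Value copied from the API output above.
code, display_name = parse_mass_unit("('milligrams', 'Milligrams')")
print(code)          # milligrams
print(display_name)  # Milligrams
```

Applied to the whole column, this could look like `samples_df['mass_unit'].map(parse_mass_unit)`.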


Cool.
Now let's try getting some other data.
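
Since the same three-line request pattern is repeated for each endpoint, it could be wrapped in a small helper that also fails loudly on an HTTP error instead of trying to parse an error page as JSON. This is only a sketch: `fetch_json` is a hypothetical helper name, the 10-second timeout is an arbitrary choice, and `session` is assumed to be a `requests.Session` (or anything with a compatible `get` method).

```python
def fetch_json(session, url, headers, timeout=10):
    """GET `url` through `session` and return the decoded JSON body."""
    # A timeout keeps the notebook from hanging if the server does not answer.
    response = session.get(url, headers=headers, timeout=timeout)
    # With requests, this raises requests.HTTPError on 4xx/5xx status codes.
    response.raise_for_status()
    return response.json()
```

With the variables defined earlier, this would be called as `fetch_json(requests.Session(), sample_url, head)`.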
<h3> Nested results </h3>

In [30]:
endpoint = results_url
response = requests.get(endpoint, headers=head)
results = response.json()
print(json.dumps(results, indent=2))

[
  {
    "id": "2bfd5001-6ca6-4dbf-a6eb-a3e320f18f74",
    "file": "http://demo.surface-analytics.com/media/uploads/Untitled_11_4.jpg",
    "measurement_session": {
      "id": 3,
      "sample": {
        "id": 1,
        "label": "MG-001-01",
        "name": null,
        "parent": null,
        "date_created": "2021-09-21 13:40:18",
        "state": 1,
        "sample_form": null,
        "mass": null,
        "mass_unit": "('milligrams', 'Milligrams')",
        "created_by": 1
      },
      "study": {
        "id": 1,
        "name": "study",
        "date_created": "2021-09-21 13:40:08",
        "samples": [
          {
            "id": 1,
            "label": "MG-001-01",
            "name": null,
            "measurement_sessions": [
              {
                "id": 1,
                "method": {
                  "id": 1,
                  "name": "XPS"
                },
                "instrument": null,
                "status": "not started"
              },
      

Looks good. But in this case, the results are nested.
How will this look in our dataframe?

In [32]:
results_df = pd.DataFrame(results)
display(results_df)

Unnamed: 0,id,file,measurement_session,filename,date_created,method
0,2bfd5001-6ca6-4dbf-a6eb-a3e320f18f74,http://demo.surface-analytics.com/media/upload...,"{'id': 3, 'sample': {'id': 1, 'label': 'MG-001...",Untitled_11_4.jpg,2021-09-21 13:41:40,"{'id': 3, 'name': 'GC'}"
1,d0136a6c-21c7-453e-956c-b13c19d8aeba,http://demo.surface-analytics.com/media/upload...,"{'id': 5, 'sample': {'id': 1, 'label': 'MG-001...",Untitled_13.jpg,2021-09-21 13:41:58,"{'id': 5, 'name': 'IR'}"
2,47c8e7db-a9e1-4977-b260-dde2df2a18bd,http://demo.surface-analytics.com/media/upload...,"{'id': 4, 'sample': {'id': 1, 'label': 'MG-001...",Untitled_11_1.jpg,2021-09-21 13:42:16,"{'id': 4, 'name': 'HPLC'}"


Hmm... Well, a DataFrame is not such a good representation for nested data: each nested dictionary is squeezed into a single cell.

We could use some built-in Pandas functionality to 'flatten' the nested items.

In [34]:
flat_results_df = pd.json_normalize(results, sep='-')
display(flat_results_df)

Unnamed: 0,id,file,filename,date_created,measurement_session-id,measurement_session-sample-id,measurement_session-sample-label,measurement_session-sample-name,measurement_session-sample-parent,measurement_session-sample-date_created,...,measurement_session-method-id,measurement_session-method-name,measurement_session-instrument-name,measurement_session-instrument-id,measurement_session-instrument-method,measurement_session-instrument-laboratory,measurement_session-instrument-description,measurement_session-status,method-id,method-name
0,2bfd5001-6ca6-4dbf-a6eb-a3e320f18f74,http://demo.surface-analytics.com/media/upload...,Untitled_11_4.jpg,2021-09-21 13:41:40,3,1,MG-001-01,,,2021-09-21 13:40:18,...,3,GC,Agilent 7000D,cdec2073-dddf-487f-a88e-9911abf516ed,[3],,,in progress,3,GC
1,d0136a6c-21c7-453e-956c-b13c19d8aeba,http://demo.surface-analytics.com/media/upload...,Untitled_13.jpg,2021-09-21 13:41:58,5,1,MG-001-01,,,2021-09-21 13:40:18,...,5,IR,Alpha II,7b0f415e-5436-4dd2-8e37-556bfab1aac1,[5],,,complete,5,IR
2,47c8e7db-a9e1-4977-b260-dde2df2a18bd,http://demo.surface-analytics.com/media/upload...,Untitled_11_1.jpg,2021-09-21 13:42:16,4,1,MG-001-01,,,2021-09-21 13:40:18,...,4,HPLC,Vanquish Core,b0e210ed-495e-4e0f-84bd-78eb270fac70,[4],,,complete,4,HPLC
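
To make the column names above less mysterious, here is a rough pure-Python sketch of the idea behind the flattening: nested dictionary keys are joined with the separator, while lists are left untouched (which is why `measurement_session-instrument-method` still shows `[3]`). This illustrates the concept only and is not how Pandas implements `json_normalize`; `flatten` is a hypothetical helper.

```python
def flatten(record, sep='-', prefix=''):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = prefix + key
        if isinstance(value, dict):
            # Recurse, carrying the parent key along as a prefix.
            flat.update(flatten(value, sep=sep, prefix=full_key + sep))
        else:
            # Scalars and lists are kept as they are.
            flat[full_key] = value
    return flat


# Values copied from the first result above, trimmed to a few keys.
example = {
    'filename': 'Untitled_11_4.jpg',
    'measurement_session': {'id': 3, 'status': 'in progress'},
    'method': {'id': 3, 'name': 'GC'},
}
print(flatten(example))
```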


<h3> Studies </h3>
Let's look at some more data.

In [35]:
endpoint = study_url
response = requests.get(endpoint, headers=head)
studies = response.json()
print(json.dumps(studies, indent=2))

[
  {
    "id": 1,
    "name": "study",
    "date_created": "2021-09-21 13:40:08",
    "samples": [
      {
        "id": 1,
        "label": "MG-001-01",
        "name": null,
        "measurement_sessions": [
          {
            "id": 1,
            "method": {
              "id": 1,
              "name": "XPS"
            },
            "instrument": null,
            "status": "not started"
          },
          {
            "id": 2,
            "method": {
              "id": 2,
              "name": "XRD"
            },
            "instrument": null,
            "status": "not started"
          },
          {
            "id": 6,
            "method": {
              "id": 6,
              "name": "Raman"
            },
            "instrument": null,
            "status": "not started"
          },
          {
            "id": 7,
            "method": {
              "id": 7,
              "name": "ICP"
            },
            "instrument": null,
            "status"

Again, let's put the result in a DataFrame.

In [36]:
studies_df = pd.DataFrame(studies)
display(studies_df)

Unnamed: 0,id,name,date_created,samples,configuration
0,1,study,2021-09-21 13:40:08,"[{'id': 1, 'label': 'MG-001-01', 'name': None,...",1


Here we see only a single row because the 'study' is the highest-level object. All the other objects are nested inside it.
Let's flatten its samples.

In [41]:
flat_studies_df = pd.json_normalize(studies[0]['samples'], sep='-')
display(flat_studies_df)

Unnamed: 0,id,label,name,measurement_sessions
0,1,MG-001-01,,"[{'id': 1, 'method': {'id': 1, 'name': 'XPS'},..."
1,2,MG-001-02,Annealed Pd,"[{'id': 8, 'method': {'id': 1, 'name': 'XPS'},..."
2,3,MG-001-03,Activated Pd,"[{'id': 15, 'method': {'id': 1, 'name': 'XPS'}..."
3,4,MG-001-04,Used Pd,"[{'id': 22, 'method': {'id': 1, 'name': 'XPS'}..."
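
The `measurement_sessions` column still holds lists of dictionaries, because `json_normalize` flattens nested dictionaries but leaves lists alone unless it is told which list to expand. One way to get one row per measurement session is to use its `record_path` and `meta` arguments. The sketch below runs on a small hand-copied excerpt of the output above rather than on the live API. The `sample-` prefix is an arbitrary choice, needed here because both the sample and the session have an `id` field.

```python
import pandas as pd

# Hand-copied excerpt of studies[0]['samples'], trimmed to the fields shown above.
samples_excerpt = [
    {
        'id': 1,
        'label': 'MG-001-01',
        'measurement_sessions': [
            {'id': 1, 'method': {'id': 1, 'name': 'XPS'}},
            {'id': 2, 'method': {'id': 2, 'name': 'XRD'}},
        ],
    },
    {
        'id': 2,
        'label': 'MG-001-02',
        'measurement_sessions': [
            {'id': 8, 'method': {'id': 1, 'name': 'XPS'}},
        ],
    },
]

sessions_df = pd.json_normalize(
    samples_excerpt,
    record_path='measurement_sessions',  # expand this list: one row per session
    meta=['id', 'label'],                # repeat these sample fields on each row
    meta_prefix='sample-',               # avoids a clash between the two 'id' fields
    sep='-',
)
print(sessions_df)
```

On the real data, the same call would take `studies[0]['samples']` in place of `samples_excerpt`.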
