# PKDB-REST API
This notebook provides simple examples of querying data from PK-DB.
The API documentation is available from https://pk-db.com/api/v1/swagger/.
For running the examples set the `base_url` to the rese

In [13]:
import requests
from requests import Response
from pprint import pprint
import pandas as pd

base_url = "http://0.0.0.0:8000/api/v1"  # https://pk-db.com/api/v1

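def json_print(r: Response):
    """Pretty-print the JSON body of a response, keeping the key order."""
    payload = r.json()
    # `sort_dicts` requires Python 3.8 or newer
    pprint(payload, sort_dicts=False)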
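
The `sort_dicts` keyword was added to `pprint` in Python 3.8, so on older interpreters the call above raises a `TypeError`. A minimal sketch of a version-tolerant formatter follows; the name `format_json` is a hypothetical helper and not part of PK-DB or of this notebook.

```python
import sys
from pprint import pformat


def format_json(payload) -> str:
    """Return a pretty-printed string for an already-decoded JSON payload.

    Hypothetical helper: keeps the key order where the interpreter supports it.
    """
    if sys.version_info >= (3, 8):
        # pprint gained the sort_dicts keyword in Python 3.8
        return pformat(payload, sort_dicts=False)
    # Older interpreters always sort dictionary keys
    return pformat(payload)


print(format_json({"version": "0.9.2a3", "study_count": 507}))
```

With a response object `r`, the helper would be called as `print(format_json(r.json()))`.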

## Statistics
To get a basic overview of the content of PK-DB and the version use the `/statistics/` endpoint.

In [14]:
r = requests.get(base_url + '/statistics/')
json_print(r)

{'version': '0.9.2a3',
 'study_count': 507,
 'reference_count': 507,
 'group_count': 1447,
 'individual_count': 6212,
 'intervention_count': 1391,
 'output_count': 72068,
 'output_calculated_count': 11651,
 'timecourse_count': 3036,
 'scatter_count': 37}


## Info nodes
Meta-information in PK-DB is organized as info nodes. For a given field, an info node encodes metadata such as description, synonyms, annotations and database cross-references.

### Retrieve single info node
A single info node can be retrieved by its `sid` from the `/info_nodes/` endpoint. An overview of the available info nodes is shown in the info nodes tab at https://pk-db/curation. For instance, to query the information on the substance `caffeine` with `sid=caf`, use:

In [15]:
r = requests.get(base_url + '/info_nodes/caf/')
json_print(r)

{'sid': 'caf',
 'name': 'caffeine',
 'label': 'caffeine',
 'deprecated': False,
 'ntype': 'substance',
 'dtype': 'undefined',
 'description': 'A methylxanthine alkaloid found in the seeds, nuts, or leaves '
                'of a number of plants native to South America and East Asia '
                'that is structurally related to adenosine and acts primarily '
                'as an adenosine receptor antagonist with psychotropic and '
                'anti-inflammatory activities.',
 'synonyms': ['1,3,7-TMX',
              '1,3,7-Trimethylxanthine',
              '1,3,7-trimethyl-3,7-dihydro-1H-purine-2,6-dione',
              '1,3,7-trimethylpurine-2,6-dione',
              '1,3,7-trimethylxanthine',
              '1-methyltheobromine',
              '137MX',
              '3,7-Dihydro-1,3,7-trimethyl-1H-purin-2,6-dion',
              '3,7-Dihydro-1,3,7-trimethyl-1H-purine-2,6-dione',
              '7-methyltheophylline',
              'CAF',
              'CAFFEINE',
            

### Search info nodes
Info nodes can be searched via the `search` argument. The following query searches for `caffeine` and requests paginated results with a page size of 100. The JSON response is parsed into a pandas DataFrame, and `sid`, `name`, `label` and `description` are displayed for the first 10 results.

In [27]:
r = requests.get(base_url + '/info_nodes/?search=caffeine&page_size=100')
data = r.json()["data"]["data"]
df = pd.DataFrame.from_dict(data)
df[["sid", "name", "label", "description"]].head(10)

|   | sid | name | label | description |
|---|-----|------|-------|-------------|
| 0 | caf | caffeine | caffeine | A methylxanthine alkaloid found in the seeds, ... |
| 1 | caffeine-citrate | caffeine citrate | caffeine citrate | Commercial citrate of caffeine, though not a d... |
| 2 | caffeine-monohydrate | caffeine monohydrate | caffeine monohydrate | Caffeine monohydrate. |
| 3 | 17u | 17U | 17U | Metabolite of caffeine. |
| 4 | px | paraxanthine | paraxanthine | A dimethylxanthine having the two methyl group... |
| 5 | tp | theophylline | theophylline | A natural alkaloid derivative of xanthine isol... |
| 6 | 137mu | 137MU | 137MU | Metabolite of caffeine. |
| 7 | 137tmu | 137TMU | 137TMU | Metabolite of caffeine. |
| 8 | 13dmu | 13DMU | 13DMU | Metabolite of caffeine. |
| 9 | 13mu | 13MU | 13MU | Metabolite of caffeine. |
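
The query string in the request above is assembled by hand. As a sketch, the same URL can be built from a dictionary with the standard library, which also escapes search terms that contain spaces or special characters. The helper name `build_search_url` is hypothetical; only the `search` and `page_size` parameters are taken from the example above.

```python
from urllib.parse import urlencode


def build_search_url(base_url: str, search: str, page_size: int = 100) -> str:
    """Build an /info_nodes/ search URL (hypothetical helper)."""
    # urlencode escapes the values, e.g. a space becomes "+"
    query = urlencode({"search": search, "page_size": page_size})
    return f"{base_url}/info_nodes/?{query}"


base_url = "https://pk-db.com/api/v1"
print(build_search_url(base_url, "caffeine"))
print(build_search_url(base_url, "caffeine citrate", page_size=10))
```

With `requests`, passing the same dictionary as `params=` to `requests.get` produces an equivalent encoded query.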


### Retrieve all available info nodes
To retrieve all available info nodes use the `/info_nodes/` endpoint. The results are paginated, so retrieving every entry requires iterating over the available pages. The total number of entries is reported in the `count` field.

In [34]:
r = requests.get(base_url + '/info_nodes/')
payload = r.json()
print(f"Number of info nodes: {payload['data']['count']}")

Number of info nodes: 1030
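
A minimal sketch of iterating over the pages, assuming each page carries a `next_page_url` field (set to `None` on the last page) and the records under `data` → `data`. The names `fetch_json` and `collect_all` are hypothetical helpers; the fetch function is passed in as a parameter so the loop can be tried without a running server.

```python
from typing import Callable, List, Optional


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body."""
    import requests  # imported here so collect_all can be used offline

    r = requests.get(url)
    r.raise_for_status()
    return r.json()


def collect_all(
    url: Optional[str], get_json: Callable[[str], dict] = fetch_json
) -> List[dict]:
    """Follow next_page_url links and gather the records of all pages."""
    records: List[dict] = []
    while url:
        page = get_json(url)
        records.extend(page["data"]["data"])
        url = page.get("next_page_url")  # assumed to be None on the last page
    return records


# Offline demonstration with two fake pages instead of a live server
fake_pages = {
    "page1": {"next_page_url": "page2",
              "data": {"count": 3, "data": [{"sid": "a"}, {"sid": "b"}]}},
    "page2": {"next_page_url": None,
              "data": {"count": 3, "data": [{"sid": "c"}]}},
}
nodes = collect_all("page1", get_json=fake_pages.__getitem__)
print(len(nodes))
```

Against a running server the call would be `collect_all(base_url + '/info_nodes/')`, and the length of the result can be compared with the reported `count`.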


## Filter and search data

### Accessing the `groups`, `individuals`, `interventions`, `outputs`, `timecourses` and `scatters`


### Download data

In [None]:
r = requests.get(base_r)

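### Raw response of the `/info_nodes/` endpoint
The `/info_nodes/` endpoint returns a paginated response. The raw JSON contains the pagination fields `current_page`, `last_page`, `next_page_url` and `prev_page_url`, and a `data` object with the total `count` and the list of info nodes on the current page.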

In [11]:
r = requests.get(base_url + '/info_nodes/')
print(r.json())

{'current_page': 1, 'last_page': 52, 'next_page_url': 'http://0.0.0.0:8000/api/v1/info_nodes/?page=2', 'prev_page_url': None, 'data': {'count': 1030, 'data': [{'sid': '137mu', 'name': '137MU', 'label': '137MU', 'deprecated': False, 'ntype': 'substance', 'dtype': 'undefined', 'description': 'Metabolite of caffeine.', 'synonyms': ['1,3,7-MU'], 'parents': [], 'annotations': [], 'xrefs': [], 'measurement_type': None, 'substance': {'mass': None, 'charge': None, 'formula': None}}, {'sid': '137tmu', 'name': '137TMU', 'label': '137TMU', 'deprecated': False, 'ntype': 'substance', 'dtype': 'undefined', 'description': 'Metabolite of caffeine.', 'synonyms': [], 'parents': [], 'annotations': [], 'xrefs': [], 'measurement_type': None, 'substance': {'mass': None, 'charge': None, 'formula': None}}, {'sid': '13cco2', 'name': '13C-co2', 'label': '13C-carbon dioxide', 'deprecated': False, 'ntype': 'substance', 'dtype': 'undefined', 'description': '13C carbon dioxide is a (13)C-modified compound that is c

## Query data for single study

In [None]:
r = requests.get(base_r)