# Getting started with EpiGraphDB in Python

This notebook is provided as a brief introductory guide to working with the EpiGraphDB platform through Python. Here we will demonstrate a few basic operations that can be carried out using the platform, but for more advanced methods please refer to the [API endpoint documentation](http://docs.epigraphdb.org/api/api-endpoints/).

A Python wrapper for EpiGraphDB's API is currently in the works, but for now we will query the API directly using the `requests` library. Familiarity with this package is helpful but by no means essential.


In [31]:
import requests
from pprint import pprint

First, we will ping the API to check our connection:

In [15]:
# Store our API URL as a string for future use
API_URL = "https://api.epigraphdb.org"

# Here we use the .get() method to send a GET request to the /ping endpoint of the API
ping_response = requests.get(f"{API_URL}/ping")

# Check that the ping was successful
if ping_response.json():
    print("Connection successfully made.")
else:
    print("Connection couldn't be made.")

Connection successfully made.


### Obtaining Mendelian Randomisation causal estimates 

In this example, we will query EpiGraphDB to obtain a list of traits for which there is strong evidence of a causal effect from a given exposure. We use body mass index as the example exposure.
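As a rough illustration of what such a query looks like on the wire, the sketch below builds the request URL by hand using only the standard library. The `/mr` endpoint and the parameter names `exposure_trait` and `pval_threshold` are the ones used in the next cell. The helper name `build_query_url` is hypothetical, and the exact encoding produced by `requests` may differ in detail.

```python
from urllib.parse import urlencode

API_URL = "https://api.epigraphdb.org"


def build_query_url(endpoint, params):
    """Join the base URL, an endpoint path, and URL-encoded query parameters.

    Hypothetical helper for illustration only; it sends no request.
    """
    return f"{API_URL}{endpoint}?{urlencode(params)}"


# Same parameters as the MR query in this section
url = build_query_url(
    "/mr",
    {"exposure_trait": "Body mass index", "pval_threshold": 1e-10},
)
print(url)
```

Passing a `params` dictionary to `requests.get`, as this notebook does, avoids having to assemble and escape this string manually.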

In [47]:
# Create a dictionary for the parameters to be passed
BMI_params = {
    'exposure_trait': 'Body mass index',
    'pval_threshold': 1e-10,
}

# Send the request
BMI_response = requests.get(f"{API_URL}/mr", params=BMI_params)

# Check for a successful response status, raise an error if unsuccessful
BMI_response.raise_for_status()

# Store the results of the query, which can be obtained by calling the .json() method on the response object
BMI_result = BMI_response.json()['results']

# Convert our results from a nested dictionary to a pandas dataframe and display it below
import pandas as pd
BMI_df = pd.json_normalize(BMI_result) 
BMI_df

| | exposure.id | exposure.trait | outcome.id | outcome.trait | mr.b | mr.se | mr.pval | mr.method | mr.selection | mr.moescore |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ieu-a-2 | Body mass index | ukb-a-74 | Non-cancer illness code self-reported: diabetes | 0.034559 | 0.002418 | 0.000000e+00 | FE IVW | DF | 0.93 |
| 1 | ieu-a-2 | Body mass index | ukb-a-388 | Hip circumference | 0.724105 | 0.026588 | 0.000000e+00 | Simple median | Tophits | 0.95 |
| 2 | ieu-a-2 | Body mass index | ukb-a-382 | Waist circumference | 0.656440 | 0.024496 | 0.000000e+00 | Simple median | Tophits | 0.94 |
| 3 | ieu-a-2 | Body mass index | ukb-a-35 | Comparative height size at age 10 | 0.136684 | 0.007909 | 0.000000e+00 | FE IVW | Tophits | 0.94 |
| 4 | ieu-a-2 | Body mass index | ukb-a-34 | Comparative body size at age 10 | 0.365580 | 0.023556 | 0.000000e+00 | Simple median | HF | 0.87 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 517 | ieu-a-974 | Body mass index | ukb-a-476 | Pain type(s) experienced in last month: Knee pain | 0.052106 | 0.005613 | 7.582779e-11 | Simple mean | HF | 0.90 |
| 518 | ieu-a-785 | Body mass index | ieu-a-1037 | Difference in height between childhood and adu... | -0.520875 | 0.080135 | 8.034037e-11 | FE IVW | Tophits | 0.71 |
| 519 | ieu-a-2 | Body mass index | ieu-a-1034 | Height | 0.356558 | 0.055004 | 9.023410e-11 | FE IVW | DF + HF | 0.78 |
| 520 | ieu-a-2 | Body mass index | ukb-a-294 | Wheeze or whistling in the chest in last year | 0.052605 | 0.008118 | 9.166369e-11 | Simple median | DF | 0.89 |


In the dataframe displayed above, we can see the results of our query. We requested all traits for which an MR analysis using body mass index as the exposure variable returned a causal estimate with a p-value lower than 1e-10. 522 such traits were found, and information regarding the exposure variable, outcome variable, and MR test is recorded in the columns with names starting `exposure.`, `outcome.`, and `mr.`, respectively.
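To make the dotted column names less mysterious, here is a minimal pure-Python sketch of the kind of flattening that `pd.json_normalize` performs on each record. The nested layout of `sample` is an assumption inferred from the column names, with values copied from the first row of the table above. The helper `flatten` is hypothetical and is not how pandas implements it.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten a nested dict into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the prefix along
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# One record shaped like an entry of BMI_result (assumed structure)
sample = {
    "exposure": {"id": "ieu-a-2", "trait": "Body mass index"},
    "outcome": {
        "id": "ukb-a-74",
        "trait": "Non-cancer illness code self-reported: diabetes",
    },
    "mr": {
        "b": 0.034559,
        "se": 0.002418,
        "pval": 0.0,
        "method": "FE IVW",
        "selection": "DF",
        "moescore": 0.93,
    },
}

flat_row = flatten(sample)

# Pick out the MR-related fields by their shared prefix
mr_columns = [name for name in flat_row if name.startswith("mr.")]
print(mr_columns)
```

The same prefix idea carries over to the dataframe itself: columns can be grouped by whether their names start with `exposure.`, `outcome.`, or `mr.`.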