# How to use the EBI Metagenomics API

The EMG REST API (https://www.ebi.ac.uk/metagenomics/api/) provides an easy-to-use set of top-level resources, such as studies, samples, runs, experiment-types, biomes and annotations, that let users access metagenomics data in simple JSON format (JSON is a syntax for storing and exchanging data). Retrieving the data is as simple as sending an HTTP request.

We have utilised an interactive documentation framework (Swagger UI) to visualise and simplify interaction with the API’s resources via an HTML interface. Detailed explanations of the purpose of all resources, along with many examples, are provided to guide end-users. Documentation on how to use the endpoints is available at https://www.ebi.ac.uk/metagenomics/api/docs/.
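Because each resource is an ordinary HTTP endpoint, the data can also be requested without a dedicated client. The following is a minimal sketch using only the Python standard library. The helper names `build_url` and `fetch_json` are illustrative and not part of the API, and the sketch assumes that a plain GET on a resource URL returns a JSON body.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Versioned base URL, the same value this notebook uses as API_BASE.
API_BASE = 'https://www.ebi.ac.uk/metagenomics/api/v0.2/'


def build_url(resource, accession=None, **params):
    """Compose the URL for a resource list or for a single record."""
    path = resource if accession is None else '{}/{}'.format(resource, accession)
    url = urljoin(API_BASE, path)
    if params:
        url += '?' + urlencode(params)
    return url


def fetch_json(url, timeout=30):
    """Send a GET request and decode the body as JSON (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode('utf-8'))


# Only the URL is built here; no request is sent until fetch_json is called.
study_url = build_url('studies', 'ERP005831')
print(study_url)
```

Calling `fetch_json(study_url)` would then return the decoded document as nested dictionaries and lists. The rest of this notebook uses `jsonapi_client` instead, which wraps the same requests in Python objects.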

### Import Python modules

In [1]:
from jsonapi_client import Session, Filter

API_BASE = 'https://www.ebi.ac.uk/metagenomics/api/v0.2/'

### Get a study and list its samples with their biomes

In [2]:
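with Session(API_BASE) as s:
    # Fetch a single study by its accession
    study = s.get('studies', 'ERP005831').resource
    # Print each related sample together with its biome lineage
    for sample in study.samples:
        print(sample.accession, sample.biome.lineage)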

ERS456668 root:Environmental:Aquatic:Freshwater:Lentic:Sediment
ERS456669 root:Environmental:Terrestrial:Soil:Loam:Agricultural


### List runs

In [3]:
with Session(API_BASE) as s:
    sample = s.get('samples', 'ERS667565').resource
    for run in sample.runs:
        print(run.accession, [p.id for p in run.pipelines])

ERR867951 ['4.0']
ERR867950 ['4.0']
ERR771104 ['2.0', '4.0']
ERR771104 ['2.0', '4.0']
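The listing above contains the accession ERR771104 twice. If one line per run is preferred, the pairs can be merged before printing. This is a small sketch that works on plain `(accession, pipeline ids)` tuples; the helper name `pipelines_by_run` is hypothetical and the input is copied from the output above.

```python
def pipelines_by_run(pairs):
    """Merge (run accession, pipeline ids) pairs, keeping first-seen order."""
    merged = {}
    for accession, versions in pairs:
        seen = merged.setdefault(accession, [])
        for version in versions:
            if version not in seen:
                seen.append(version)
    return merged


# Pairs copied from the cell output above.
pairs = [
    ('ERR867951', ['4.0']),
    ('ERR867950', ['4.0']),
    ('ERR771104', ['2.0', '4.0']),
    ('ERR771104', ['2.0', '4.0']),
]

merged = pipelines_by_run(pairs)
for accession, versions in merged.items():
    print(accession, versions)
```

Inside the session, the same helper could be fed with `((run.accession, [p.id for p in run.pipelines]) for run in sample.runs)`.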


### List sample metadata

In [4]:
with Session(API_BASE) as s:
    sample = s.get('samples', 'ERS488919').resource
    for m in sample.metadata:
        print(m.var_name, m.var_value, m.unit)

temperature 26.812225 &deg;C
project name Tara Oceans expedition (2009-2013) None
geographic location (depth) 25 m
environmental package water None
instrument model Illumina HiSeq 2000 None
ENA checklist ENA TARA (ERC000030) None
latitude end 18.5679 DD
longitude end 66.4581 DD
marine region n/a None
protocol label BACT_NUC-DNA(100L)_W1.6-20 None
sampling campaign TARA_20100309Z None
sampling platform SV Tara None
chlorophyll sensor 0.17638 mg Chl/m3
citation tbd None
event date/time end 2010-03-18T12:32 None
event date/time start 2010-03-18T11:33 None
event label TARA_20100318T1133Z_039_EVENT_PUMP None
further details tbd None
last update date 2014-05-01Z None
nitrate sensor -0.611888 &micro;mol/L
oxygen sensor 192.95875 &micro;mol/Kg
salinity sensor 36.332317 psu
sample collection device PUMP (High Volume Peristaltic Pump) with ECOTriplet None
sample status This version can be used to provide data discovery services None
sampling station TARA_039 None
size fraction lower threshold 1.
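Some units in the output above arrive as HTML entities (`&deg;C`, `&micro;mol/L`). The sketch below shows one way to collect the metadata into a dictionary and decode those entities with the standard library. `Meta` is a hypothetical stand-in for the objects yielded by `sample.metadata`, and the values are stored as strings here only for illustration.

```python
from collections import namedtuple
from html import unescape

# Hypothetical stand-in exposing the three attributes printed above.
Meta = namedtuple('Meta', ['var_name', 'var_value', 'unit'])


def metadata_to_dict(records):
    """Map each variable name to a (value, unit) pair with decoded units."""
    result = {}
    for record in records:
        unit = unescape(record.unit) if record.unit else None
        result[record.var_name] = (record.var_value, unit)
    return result


# A few records copied from the cell output above.
records = [
    Meta('temperature', '26.812225', '&deg;C'),
    Meta('geographic location (depth)', '25', 'm'),
    Meta('nitrate sensor', '-0.611888', '&micro;mol/L'),
    Meta('sampling platform', 'SV Tara', None),
]

lookup = metadata_to_dict(records)
print(lookup['temperature'])
```

Within the session, `metadata_to_dict(sample.metadata)` should work the same way, provided the metadata objects expose `var_name`, `var_value` and `unit` as in the cell above.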

### List functional annotations

In [5]:
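# urlencode builds the query string for the filtered search further below;
# the fallback import keeps the cell working on both Python 2 and Python 3.
try:
    from urllib import urlencode
except ImportError:
    from urllib.parse import urlencode
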
with Session(API_BASE) as s:
    sample = s.get('samples', 'ERS238574').resource
    for run in sample.runs:
        for analysis in run.analysis:
            for ann in analysis.go_slim:
                print(ann.accession, ann.lineage, ann.description)

GO:0000156 molecular_function two-component response regulator activity
GO:0000160 biological_process phosphorelay signal transduction system
GO:0000166 molecular_function nucleotide binding
GO:0000746 biological_process conjugation
GO:0000902 biological_process cell morphogenesis
GO:0000988 molecular_function protein binding transcription factor activity
GO:0001071 molecular_function nucleic acid binding transcription factor activity
GO:0002376 biological_process immune system process
GO:0003676 molecular_function nucleic acid binding
GO:0003774 molecular_function motor activity
GO:0003924 molecular_function GTPase activity
GO:0004386 molecular_function helicase activity
GO:0004518 molecular_function nuclease activity
GO:0004672 molecular_function protein kinase activity
GO:0004812 molecular_function aminoacyl-tRNA ligase activity
GO:0004872 molecular_function receptor activity
GO:0005102 molecular_function receptor binding
GO:0005215 molecular_function transporter activity
GO:0005618
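A long annotation listing is often easier to read once summarised. As a rough sketch, the annotations can be counted per GO lineage with `collections.Counter`. `Annotation` is a hypothetical stand-in for the objects in `analysis.go_slim`, filled with the first four rows of the output above.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in exposing the three attributes printed above.
Annotation = namedtuple('Annotation', ['accession', 'lineage', 'description'])


def count_by_lineage(annotations):
    """Count how many annotations fall under each GO lineage."""
    return Counter(a.lineage for a in annotations)


# First four rows of the cell output above.
annotations = [
    Annotation('GO:0000156', 'molecular_function',
               'two-component response regulator activity'),
    Annotation('GO:0000160', 'biological_process',
               'phosphorelay signal transduction system'),
    Annotation('GO:0000166', 'molecular_function', 'nucleotide binding'),
    Annotation('GO:0000746', 'biological_process', 'conjugation'),
]

counts = count_by_lineage(annotations)
for lineage, total in counts.items():
    print(lineage, total)
```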

### Get oceanic metagenomic samples collected at a temperature between 0°C and 10°C

In [6]:
with Session(API_BASE) as s:
    params = {
        'experiment_type': 'metagenomic',
        'metadata_key': 'temperature',
        'metadata_value_gte': 0,
        'metadata_value_lte': 10,
    }
    f = Filter(urlencode(params))
    for sample in s.iterate('biomes/root:Environmental:Aquatic:Marine:Oceanic/samples', f):
        print(sample.accession, sample.biome.lineage)

ERS1568974 root:Environmental:Aquatic:Marine:Oceanic
ERS1569005 root:Environmental:Aquatic:Marine:Oceanic
ERS1569000 root:Environmental:Aquatic:Marine:Oceanic
ERS1568998 root:Environmental:Aquatic:Marine:Oceanic
ERS1568997 root:Environmental:Aquatic:Marine:Oceanic
ERS1568968 root:Environmental:Aquatic:Marine:Oceanic
ERS1568996 root:Environmental:Aquatic:Marine:Oceanic
ERS1568992 root:Environmental:Aquatic:Marine:Oceanic
ERS1569006 root:Environmental:Aquatic:Marine:Oceanic
ERS1568999 root:Environmental:Aquatic:Marine:Oceanic
ERS1568973 root:Environmental:Aquatic:Marine:Oceanic
ERS1569002 root:Environmental:Aquatic:Marine:Oceanic
ERS1568966 root:Environmental:Aquatic:Marine:Oceanic
ERS1569001 root:Environmental:Aquatic:Marine:Oceanic
ERS1568969 root:Environmental:Aquatic:Marine:Oceanic
ERS689136 root:Environmental:Aquatic:Marine:Oceanic:Photic zone
ERS689141 root:Environmental:Aquatic:Marine:Oceanic:Photic zone
ERS689142 root:Environmental:Aquatic:Marine:Oceanic:Photic zone
ERS689356 roo
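The cell above passes `Filter` a URL-encoded query string. To see what that string looks like, or to reuse the same filter with other bounds, the parameters can be wrapped in a small function. This is only a sketch: `temperature_filter` is a hypothetical helper, and the parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode


def temperature_filter(low, high, experiment_type='metagenomic'):
    """Build the query string for samples within a temperature range."""
    params = {
        'experiment_type': experiment_type,
        'metadata_key': 'temperature',
        'metadata_value_gte': low,   # lower bound, inclusive
        'metadata_value_lte': high,  # upper bound, inclusive
    }
    return urlencode(params)


query = temperature_filter(0, 10)
print(query)
```

The resulting string can be handed to `Filter(query)` exactly as in the cell above.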

### Export to CSV

In [7]:
import csv

with open("test_export.csv", "w") as csvfile:
    with Session(API_BASE) as s:
        fieldnames = ['study', 'sample', 'biome', 'longitude', 'latitude']
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        study = s.get('studies', 'ERP005831').resource
        for sample in study.samples:
            row = {
                'study': study.accession,
                'sample': sample.accession,
                'biome': sample.biome.lineage,
                'longitude': sample.longitude,
                'latitude': sample.latitude
            }
            writer.writerow(row)
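The row-building step can be tried without network access by writing to an in-memory buffer. The sketch below uses `SimpleNamespace` objects as hypothetical stand-ins for API samples and falls back to an empty cell when a coordinate is absent. The accession and lineage come from the first example in this notebook; the coordinates are left unset because no values for them are shown here.

```python
import csv
import io
from types import SimpleNamespace

FIELDNAMES = ['study', 'sample', 'biome', 'longitude', 'latitude']


def sample_row(study_accession, sample):
    """Build one CSV row; absent coordinates become empty cells."""
    return {
        'study': study_accession,
        'sample': sample.accession,
        'biome': sample.biome.lineage,
        'longitude': getattr(sample, 'longitude', None),
        'latitude': getattr(sample, 'latitude', None),
    }


def write_rows(handle, rows):
    """Write a header line followed by the given rows."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical stand-in for a sample returned by the API.
sample = SimpleNamespace(
    accession='ERS456668',
    biome=SimpleNamespace(
        lineage='root:Environmental:Aquatic:Freshwater:Lentic:Sediment'),
)

buffer = io.StringIO()
write_rows(buffer, [sample_row('ERP005831', sample)])
print(buffer.getvalue())
```

When writing to a real file on Python 3, the `csv` module documentation recommends opening it with `newline=''` so that line endings are not translated twice.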