# REST-0 - Exercises

**REST API usage**

In [1]:
import requests, sys

server = "https://rest.ensembl.org"
ext = "/sequence/id/ENSG00000157764?"

r = requests.get(server+ext, headers={ "Content-Type" : "text/plain"})

if not r.ok:
  r.raise_for_status()
  sys.exit()


print(r.text)

CTTCCCCCAATCCCCTCAGGCTCGGCTGCGCCCGGGGCCGCGGGCCGGTACCTGAGGTGGCCCAGGCGCCCTCCGCCCGCGGCGCCGCCCGGGCCGCTCCTCCCCGCGCCCCCCGCGCCCCCCGCTCCTCCGCCTCCGCCTCCGCCTCCGCCTCCCCCAGCTCTCCGCCTCCCTTCCCCCTCCCCGCCCGACAGCGGCCGCTCGGGCCCCGGCTCTCGGTTATAAGATGGCGGCGCTGAGCGGTGGCGGTGGTGGCGGCGCGGAGCCGGGCCAGGCTCTGTTCAACGGGGACATGGAGCCCGAGGCCGGCGCCGGCGCCGGCGCCGCGGCCTCTTCGGCTGCGGACCCTGCCATTCCGGAGGAGGTGAGTGCTGGCGCCACCCTGCCGCCCTCCCGACTCCGGGCTCGGCGGCTGGCTGGTGTTTATTTTGGAAAGAGGCGGCGGTGGGGGCTTGATGCCCTCAGCCACCTTCTCGGGCCAGCTCCGCGGGCTGGGAGGTGGGCATCGCCCCCGTGTCCCTCTCCGTCATGCAGCGCCTTCCTACGTAAACACACACAATGGCCCGGGGGGTTTCCCTGGCCCCCACCCCAGATGTGGGGATTGGGGCAGCGGTGGTTGAGCGGGAGGCTATCAATAGGGGGCGAAACTCAGGGTTGGTCCGAGAAGGTCACGATTGGCTGAAGTATCCAGCTCTGCATCTCTGTGGGGTGGGGGCGGCGGCGGCCTCGACGTGGAGGATATAGGTTAGTTGCTGGGGCTGAGACAACAGCCCGAGTTACTGTCGCGTGTAATTCTTACATGGTCGTGGGGATGATGGGGCTCATCATTTCCTCTCTCCTCTCCCGGACTGCCCCCCTTCTCAGTCCGCTGCCCTTTTTCACTTTTCTATTTGGGGATTTCTCTTCACCTGTTTTACCCAGCAAATTATTTTGATTTAGTCTTTACTTTTTCAATCCTAAATCGCAGTTTCCGATGCCTTTTCTGGTCTCTGGTCCTCTG

A notebook of exercises for practicing REST API calls with the *requests* library.

# 1. First steps

1.1. Study the following REST API and answer:
* What differences do you observe between GET and POST requests?

http://rest.ensembl.org/

Example 1:

- GET archive/id/:id
  > Uses the given identifier to return its latest version


- POST archive/id
  > Retrieve the latest version for a set of identifiers

Example 2:

- GET sequence/id/:id
  > Request multiple types of sequence by stable identifier.
  > Supports feature masking and expand options

- POST sequence/id
  > Request multiple types of sequence by a stable identifier list.


A GET request encodes its variables in the URL, so with GET we can request
only one id: `GET/used:deya&passord`

A POST request encodes its variables in the message body, so they do not appear
in the URL and we can send a list of ids: `POST/` followed by `[parameters]`

> This API does not use POST for its intended purpose (creating information
  on the server), but only to respond with a representation of the resource
  (just like GET). URLs such as the following use the GET method:
  https://rest.ensembl.org/sequence/id/ENSG00000157764
  https://www.ebi.ac.uk/Tools/dbfetch/dbfetch?db=uniprot&id=WAP_RAT

> An equivalent POST request uses a URL without parameters
  (http://rest.ensembl.org/sequence/id/);
  the parameters must instead be sent in the body of the request
  ("ids":["ENSG00000157764"]).
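The difference described above can be inspected without contacting the server. The following is a minimal sketch, assuming the *requests* library is installed: it builds one GET and one POST request for the `sequence/id` endpoint with `requests.Request(...).prepare()` and prints where the identifier ends up. The variable names `get_req` and `post_req` are illustrative only.

```python
import json
import requests

server = "https://rest.ensembl.org"

# GET: the identifier is part of the URL and there is no message body.
get_req = requests.Request(
    "GET",
    f"{server}/sequence/id/ENSG00000157764",
    headers={"Content-Type": "text/plain"},
).prepare()

# POST: the URL names only the resource; the identifiers travel in the body.
post_req = requests.Request(
    "POST",
    f"{server}/sequence/id",
    headers={"Content-Type": "application/json", "Accept": "application/json"},
    data=json.dumps({"ids": ["ENSG00000157764"]}),
).prepare()

print(get_req.method, get_req.url, get_req.body)
print(post_req.method, post_req.url, post_req.body)
```

Nothing is sent here; passing a prepared request to `requests.Session().send()` would perform the actual call.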

1.2. Write a program to get the software version of the API. First, find the appropriate endpoint and study how to call it.

Use the following code skeleton:


In [10]:
import requests, sys

server = "http://rest.ensembl.org"
endpoint = "/info/software?"

r = requests.get(f"{server}{endpoint}", headers={ "Content-Type" : "application/json"})

if not r.ok:
  r.raise_for_status()
  sys.exit()

print(r.json()) 


{'release': 113}


1.3. TIP: JSON structures are difficult to read if they are not properly formatted. Use the *pprint* module for that. Repeat the previous exercise and include:

In [11]:
import pprint
import requests, sys

server = "http://rest.ensembl.org"
endpoint = "/info/software?"

r = requests.get(f"{server}{endpoint}", headers={ "Content-Type" : "application/json"})

if not r.ok:
  r.raise_for_status()
  sys.exit()

print(r.json()) 

pprint.pprint(r.json())

{'release': 113}
{'release': 113}
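Both lines of output look the same because `{'release': 113}` is short: *pprint* only breaks a structure across lines when it does not fit within its width (80 characters by default). The sketch below uses a hard-coded sample dictionary, shaped like an Ensembl archive record, so that the difference is visible without a network call.

```python
import pprint

# Sample data for illustration only (hard-coded, not fetched from the API).
record = {
    "id": "ENSG00000157764",
    "version": 14,
    "peptide": None,
    "release": "113",
    "possible_replacement": [],
    "type": "Gene",
    "is_current": "1",
    "assembly": "GRCh38",
    "latest": "ENSG00000157764.14",
}

print(record)          # everything on one long line
pprint.pprint(record)  # one key per line, keys sorted alphabetically

# pformat returns the same text as a string instead of printing it.
formatted = pprint.pformat(record)
```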


# 2. Processing responses and errors
----

2.1. The response object has a status_code attribute that can be used to check for any errors the API might have reported.

In [7]:
import requests, pprint

response = requests.get("http://api.open-notify.org/astros.json")
if (response.status_code == 200):
    print("The request was a success!")
    # Code here will only run if the request is successful
    pprint.pprint(response.json())
elif (response.status_code == 404):
    print("Result not found!")
    # Code here will react to failed requests

The request was a success!
{'message': 'success',
 'number': 12,
 'people': [{'craft': 'ISS', 'name': 'Oleg Kononenko'},
            {'craft': 'ISS', 'name': 'Nikolai Chub'},
            {'craft': 'ISS', 'name': 'Tracy Caldwell Dyson'},
            {'craft': 'ISS', 'name': 'Matthew Dominick'},
            {'craft': 'ISS', 'name': 'Michael Barratt'},
            {'craft': 'ISS', 'name': 'Jeanette Epps'},
            {'craft': 'ISS', 'name': 'Alexander Grebenkin'},
            {'craft': 'ISS', 'name': 'Butch Wilmore'},
            {'craft': 'ISS', 'name': 'Sunita Williams'},
            {'craft': 'Tiangong', 'name': 'Li Guangsu'},
            {'craft': 'Tiangong', 'name': 'Li Cong'},
            {'craft': 'Tiangong', 'name': 'Ye Guangfu'}]}


2.2. To see this in action, try removing the last letter from the URL endpoint; the API should return a 404 status code.



In [8]:
import requests, pprint

response = requests.get("http://api.open-notify.org/astros.jso")
if (response.status_code == 200):
    print("The request was a success!")
    # Code here will only run if the request is successful
    pprint.pprint(response.json())
elif (response.status_code == 404):
    print("Result not found!")
    # Code here will react to failed requests

Result not found!
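The two cells above handle only 200 and 404, so any other status code passes silently. As a rough sketch, a small helper can label whatever code comes back. `describe_status` is a hypothetical name introduced here, not part of *requests*; it relies only on the standard library's `http.HTTPStatus`.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short human-readable label for an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not one of the registered HTTP status codes.
        phrase = "Unknown status"

    if 200 <= code < 300:
        category = "success"
    elif 400 <= code < 500:
        category = "client error"
    elif 500 <= code < 600:
        category = "server error"
    else:
        category = "other"
    return f"{code} {phrase} ({category})"


for code in (200, 404, 503):
    print(describe_status(code))
```

With a real response, the call would be `describe_status(response.status_code)`.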


# 3. Using parameters
----

3.1. Write a program to get the latest version info about gene ENSG00000157764

In [None]:
import requests, sys

server = "http://rest.ensembl.org"
endpoint = "/sequence/id/"
gene = "ENSG00000157764"

r = requests.get(f"{server}{endpoint}{gene}", headers={ "Content-Type" : "text/plain"})

print(r.text) 


CTTCCCCCAATCCCCTCAGGCTCGGCTGCGCCCGGGGCCGCGGGCCGGTACCTGAGGTGGCCCAGGCGCCCTCCGCCCGCGGCGCCGCCCGGGCCGCTCCTCCCCGCGCCCCCCGCGCCCCCCGCTCCTCCGCCTCCGCCTCCGCCTCCGCCTCCCCCAGCTCTCCGCCTCCCTTCCCCCTCCCCGCCCGACAGCGGCCGCTCGGGCCCCGGCTCTCGGTTATAAGATGGCGGCGCTGAGCGGTGGCGGTGGTGGCGGCGCGGAGCCGGGCCAGGCTCTGTTCAACGGGGACATGGAGCCCGAGGCCGGCGCCGGCGCCGGCGCCGCGGCCTCTTCGGCTGCGGACCCTGCCATTCCGGAGGAGGTGAGTGCTGGCGCCACCCTGCCGCCCTCCCGACTCCGGGCTCGGCGGCTGGCTGGTGTTTATTTTGGAAAGAGGCGGCGGTGGGGGCTTGATGCCCTCAGCCACCTTCTCGGGCCAGCTCCGCGGGCTGGGAGGTGGGCATCGCCCCCGTGTCCCTCTCCGTCATGCAGCGCCTTCCTACGTAAACACACACAATGGCCCGGGGGGTTTCCCTGGCCCCCACCCCAGATGTGGGGATTGGGGCAGCGGTGGTTGAGCGGGAGGCTATCAATAGGGGGCGAAACTCAGGGTTGGTCCGAGAAGGTCACGATTGGCTGAAGTATCCAGCTCTGCATCTCTGTGGGGTGGGGGCGGCGGCGGCCTCGACGTGGAGGATATAGGTTAGTTGCTGGGGCTGAGACAACAGCCCGAGTTACTGTCGCGTGTAATTCTTACATGGTCGTGGGGATGATGGGGCTCATCATTTCCTCTCTCCTCTCCCGGACTGCCCCCCTTCTCAGTCCGCTGCCCTTTTTCACTTTTCTATTTGGGGATTTCTCTTCACCTGTTTTACCCAGCAAATTATTTTGATTTAGTCTTTACTTTTTCAATCCTAAATCGCAGTTTCCGATGCCTTTTCTGGTCTCTGGTCCTCTG

In [36]:
import requests, sys, json  # json is needed for json.dumps below
import pprint

server = "http://rest.ensembl.org"
gene = "ENSG00000157764"
get_gene_info_endpoint = f"/archive/id/{gene}"

response = requests.get(f"{server}{get_gene_info_endpoint}", 
            headers={ "Content-Type" : "application/json"})

res_dic = response.json()
result = json.dumps(res_dic, indent = 2)
print(result)

{
  "id": "ENSG00000157764",
  "version": 14,
  "peptide": null,
  "release": "113",
  "possible_replacement": [],
  "type": "Gene",
  "is_current": "1",
  "assembly": "GRCh38",
  "latest": "ENSG00000157764.14"
}


3.2. Right. Now, what about getting info for several genes? For example, the following ones:
* ENSG00000157764
* ENSG00000248378

Use ONLY one call.

In [38]:
import requests, sys
import pprint

server = "http://rest.ensembl.org"
ext = "/archive/id/"
genes = '{"id":["ENSG00000157764","ENSG00000248378"]}'
headers={ "Content-Type" : "application/json"}

response = requests.post(f"{server}{ext}", headers=headers, data=genes)

# Check the response of this request (the variable is `response`, not `r`)
if not response.ok:
  response.raise_for_status()
  sys.exit()

# res_dic = response.json()
# result = json.dumps(res_dic, indent = 2)
# print(result) 
pprint.pprint(response.json())


[{'assembly': 'GRCh38',
  'id': 'ENSG00000157764',
  'is_current': '1',
  'latest': 'ENSG00000157764.14',
  'peptide': None,
  'possible_replacement': [],
  'release': '113',
  'type': 'Gene',
  'version': 14},
 {'assembly': 'GRCh38',
  'id': 'ENSG00000248378',
  'is_current': '1',
  'latest': 'ENSG00000248378.1',
  'peptide': None,
  'possible_replacement': [],
  'release': '113',
  'type': 'Gene',
  'version': 1}]
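The POST call returns a list with one record per requested identifier. A minimal sketch of how such a list might be processed is given below; the `records` list is hard-coded sample data with only a few of the fields from the output above, and `latest_by_id` is an illustrative name.

```python
# Sample records for illustration (a subset of the fields returned by the API).
records = [
    {"id": "ENSG00000157764", "latest": "ENSG00000157764.14", "version": 14},
    {"id": "ENSG00000248378", "latest": "ENSG00000248378.1", "version": 1},
]

# Index the records by gene id so each one can be looked up directly.
latest_by_id = {record["id"]: record["latest"] for record in records}

for gene_id, latest in latest_by_id.items():
    print(f"{gene_id} -> {latest}")
```

With a live call, `records` would simply be `response.json()`.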


3.3. OK, time to raise the bar. Now, write a program to get the gene tree for the gene ENSGT00390000003602 of a cow, including its DNA sequence

In [50]:
# GET genetree/id/:id

import requests, sys
import pprint

server = "http://rest.ensembl.org"
gene = "ENSGT00390000003602"
endpoint_gene = f"/genetree/id/{gene}"

headers={ "Content-Type" : "application/json"}
parameters = {'prune_species': 'cow', 'sequence': 'cdna'}

response = requests.get(f"{server}{endpoint_gene}", 
           params=parameters, headers=headers)

# Check the response of this request (the variable is `response`, not `r`)
if not response.ok:
  response.raise_for_status()
  sys.exit()

pprint.pprint(response.json())


{'id': 'ENSGT00390000003602',
 'rooted': 1,
 'tree': {'branch_length': 0.000223,
          'confidence': {},
          'id': {'accession': 'ENSBTAG00000000988', 'source': 'EnsEMBL'},
          'sequence': {'id': [{'accession': 'ENSBTAP00000001311',
                               'source': 'EnsEMBL'}],
                       'location': '12:28615746-28668655',
                       'mol_seq': {'is_aligned': 0,
                                   'seq': 'ATGCCGATTGGATGCAAAGAGAGGCCAACTTTTTTTGACATTTTTAAGGCGCGATGCAACAAAGCAGATTTAGGACCAATAAGCCTTAATTGGTTTGAAGAACTTTCTTCAGAAGCTCCACTCTGTAATTCTGAACCTTTAGAAGAATCAGAATATAAAATCAGCAGTAATGAAACAAACCCATTTAAAACACCACAAAGGAAACCTTATCATCAGTTGGCTTCAACTCCTGTAATATTCAAAGAGCAAAGTCTAACTCTGCCACTGTACCAATCTCCTTTAAAGGAATTACATAAATTCAGATTGGATTCAGGAAAGGATATTGCCAACAGTAAACATAAAAGTTGTTGCAGGGTGAAGGCTAAAATCAATCAAGCAAATGATGTTATCAGCCCACCTCCAAATTCCTCTCTTAGTGAAAGTCCTGTTGTTCTGCGATGTACACATGTAACACCACAAAGAGAAAAGTCAGTGGTATGTGGAAGTTTATTTCATACACCAAAGCTCATAAAGGGTCAGACACCGAAACGTATTTCTGAAAGT
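The `params` dictionary used in exercise 3.3 is turned into a query string and appended to the URL. The sketch below, assuming the *requests* library is installed, prepares the same GET request without sending it and prints the final URL; `prepared` is an illustrative name.

```python
import requests

server = "http://rest.ensembl.org"
endpoint = "/genetree/id/ENSGT00390000003602"
parameters = {"prune_species": "cow", "sequence": "cdna"}

# prepare() builds the request that would be sent, including the query string.
prepared = requests.Request(
    "GET", f"{server}{endpoint}", params=parameters
).prepare()

print(prepared.url)
```

Printing the prepared URL is a quick way to check that parameter names and values are spelled as the endpoint documentation expects.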

# Solutions (do NOT open! Yet)

**Exercise 3.1**

In [24]:
# Write a program to get the latest version info about gene ENSG00000157764

import requests, sys, json
import pprint

server = "http://rest.ensembl.org"
gene = "ENSG00000157764"
get_gene_info_endpoint = f"/archive/id/{gene}"

r = requests.get(f"{server}{get_gene_info_endpoint}", headers={ "Content-Type" : "application/json"})

if not r.ok:
  r.raise_for_status()
  sys.exit()

pprint.pprint(r.json())

{'assembly': 'GRCh38',
 'id': 'ENSG00000157764',
 'is_current': '1',
 'latest': 'ENSG00000157764.14',
 'peptide': None,
 'possible_replacement': [],
 'release': '113',
 'type': 'Gene',
 'version': 14}


**Exercise 3.2**

In [39]:
# EXERCISE 3.2: Get info for several genes (ENSG00000157764 and ENSG00000248378) using only one call

import requests, sys, json
import pprint

server = "http://rest.ensembl.org"
get_genes_info_endpoint = server + "/archive/id"

headers={ "Content-Type" : "application/json", "Accept" : "application/json"}
r = requests.post(get_genes_info_endpoint, headers=headers, data='{ "id" : ["ENSG00000157764", "ENSG00000248378"] }')

if not r.ok:
  r.raise_for_status()
  sys.exit()

pprint.pprint(r.json())

[{'assembly': 'GRCh38',
  'id': 'ENSG00000157764',
  'is_current': '1',
  'latest': 'ENSG00000157764.14',
  'peptide': None,
  'possible_replacement': [],
  'release': '113',
  'type': 'Gene',
  'version': 14},
 {'assembly': 'GRCh38',
  'id': 'ENSG00000248378',
  'is_current': '1',
  'latest': 'ENSG00000248378.1',
  'peptide': None,
  'possible_replacement': [],
  'release': '113',
  'type': 'Gene',
  'version': 1}]


**Exercise 3.3**

In [49]:
import requests, sys, json
import pprint

server = "http://rest.ensembl.org"
gene = "ENSGT00390000003602"
get_gene_tree_endpoint = f"/genetree/id/{gene}"

parameters = {'prune_species': 'cow', 'sequence': 'cdna'}

r = requests.get(f"{server}{get_gene_tree_endpoint}",
                 params = parameters, headers={ "Content-Type" : "application/json"})

if not r.ok:
  r.raise_for_status()
  sys.exit()

pprint.pprint(r.json())

{'id': 'ENSGT00390000003602',
 'rooted': 1,
 'tree': {'branch_length': 0.000223,
          'confidence': {},
          'id': {'accession': 'ENSBTAG00000000988', 'source': 'EnsEMBL'},
          'sequence': {'id': [{'accession': 'ENSBTAP00000001311',
                               'source': 'EnsEMBL'}],
                       'location': '12:28615746-28668655',
                       'mol_seq': {'is_aligned': 0,
                                   'seq': 'ATGCCGATTGGATGCAAAGAGAGGCCAACTTTTTTTGACATTTTTAAGGCGCGATGCAACAAAGCAGATTTAGGACCAATAAGCCTTAATTGGTTTGAAGAACTTTCTTCAGAAGCTCCACTCTGTAATTCTGAACCTTTAGAAGAATCAGAATATAAAATCAGCAGTAATGAAACAAACCCATTTAAAACACCACAAAGGAAACCTTATCATCAGTTGGCTTCAACTCCTGTAATATTCAAAGAGCAAAGTCTAACTCTGCCACTGTACCAATCTCCTTTAAAGGAATTACATAAATTCAGATTGGATTCAGGAAAGGATATTGCCAACAGTAAACATAAAAGTTGTTGCAGGGTGAAGGCTAAAATCAATCAAGCAAATGATGTTATCAGCCCACCTCCAAATTCCTCTCTTAGTGAAAGTCCTGTTGTTCTGCGATGTACACATGTAACACCACAAAGAGAAAAGTCAGTGGTATGTGGAAGTTTATTTCATACACCAAAGCTCATAAAGGGTCAGACACCGAAACGTATTTCTGAAAGT