# Mouse Phenotype API

Example queries and simple data processing for the [mousephenotype.org](https://www.mousephenotype.org) API.

## Endpoints

- `/genes`: Gene search endpoint
- `/geneBundles`: Gene bundled data endpoint

## Example queries

### Setting things up

In [1]:
import requests
from IPython.display import HTML, display
import tabulate
# No trailing slash here: the endpoint URLs below add their own "/"
impc_api_url = "https://www.gentar.org/impc-dev-api"
impc_api_search_url = f"{impc_api_url}/genes"
impc_api_gene_bundle_url = f"{impc_api_url}/geneBundles"

### 1. Extract all measured phenotypes related to a gene
Using [Cib2 - MGI:1929293](https://www.mousephenotype.org/data/genes/MGI:1929293)
> Hint 💡: For any query that relies directly on an MGI Accession ID you can use the `/geneBundles` endpoint directly.

In [2]:
mgi_accession_id = "MGI:1929293"

# https://www.gentar.org/impc-dev-api/geneBundles/MGI:1929293
gene_bundle_url = f"{impc_api_gene_bundle_url}/{mgi_accession_id}"
gene_bundle = requests.get(gene_bundle_url).json()
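
The request above assumes that the call succeeds. A minimal sketch of a slightly more defensive version is shown below. The helper names `build_gene_bundle_url` and `fetch_gene_bundle` are hypothetical, and the 30 second timeout is an arbitrary assumption rather than a documented recommendation.

```python
IMPC_API_URL = "https://www.gentar.org/impc-dev-api"


def build_gene_bundle_url(mgi_accession_id, base_url=IMPC_API_URL):
    """Build the /geneBundles URL for one MGI accession ID (hypothetical helper)."""
    # rstrip avoids a double slash if the base URL ends with "/"
    return f"{base_url.rstrip('/')}/geneBundles/{mgi_accession_id}"


def fetch_gene_bundle(mgi_accession_id, timeout=30):
    """Fetch one gene bundle, raising on HTTP errors instead of failing later."""
    import requests  # imported here so the URL helper works without it

    response = requests.get(build_gene_bundle_url(mgi_accession_id), timeout=timeout)
    response.raise_for_status()  # surfaces 4xx/5xx responses early
    return response.json()
```

Since the fields of a bundle are not listed here, a reasonable first step is to inspect them, for example with `sorted(fetch_gene_bundle("MGI:1929293").keys())`.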

### 3. Extract all genes having a particular phenotype or a set of phenotypes (e.g. relevant to a disease)
Using increased basophil cell number ([MP:0002606](https://www.mousephenotype.org/data/phenotypes/MP:0002606)) and increased circulating cholesterol level ([MP:0005178](https://www.mousephenotype.org/data/phenotypes/MP:0005178))
> Hint 💡: For any query that needs to perform a search, you'll need to hit the `/genes` endpoint first and take the bundle URLs from the response to get the actual data.

In [3]:
target_mp_terms = ['MP:0002606', 'MP:0005178']

# Results are paginated with the page and size parameters; by default the endpoint returns the first 20 hits
gene_by_phenotypes_query = f"{impc_api_search_url}/search/findAllBySignificantMpTermIdsContains?mpTermIds={','.join(target_mp_terms)}&page=0&size=20"
genes_with_clinical_chemistry_phenotypes = requests.get(gene_by_phenotypes_query).json()
print(f"Genes with {target_mp_terms}: {genes_with_clinical_chemistry_phenotypes['page']['totalElements']}")
list_of_genes = []

for gene in genes_with_clinical_chemistry_phenotypes['_embedded']['genes']:
    gene_dict = {"gene_accession_id": gene['mgiAccessionId'], "gene_name": gene['markerName'], "gene_bundle_url": gene["_links"]["geneBundle"]['href']}
    list_of_genes.append(gene_dict)

display(HTML(tabulate.tabulate([i.values() for i in list_of_genes], headers=list_of_genes[0].keys(), tablefmt='html')))

Genes with ['MP:0002606', 'MP:0005178']: 335


gene_accession_id,gene_name,gene_bundle_url
MGI:108404,"amyloid beta (A4) precursor protein-binding, family B, member 3",http://localhost:8080/geneBundles/MGI:108404
MGI:1915282,ER membrane protein complex subunit 4,http://localhost:8080/geneBundles/MGI:1915282
MGI:3607791,RFT1 homolog,http://localhost:8080/geneBundles/MGI:3607791
MGI:2685007,F-box protein 27,http://localhost:8080/geneBundles/MGI:2685007
MGI:1890396,suppressor of variegation 3-9 2,http://localhost:8080/geneBundles/MGI:1890396
MGI:108007,"potassium inwardly-rectifying channel, subfamily J, member 9",http://localhost:8080/geneBundles/MGI:108007
MGI:109321,"laminin, alpha 4",http://localhost:8080/geneBundles/MGI:109321
MGI:1918673,ectopic P-granules autophagy protein 5 homolog (C. elegans),http://localhost:8080/geneBundles/MGI:1918673
MGI:1344412,LIM domain binding 3,http://localhost:8080/geneBundles/MGI:1344412
MGI:2447622,RNA binding motif protein 11,http://localhost:8080/geneBundles/MGI:2447622
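
The query above only reads page 0, while the response reports 335 matching genes. A minimal sketch of walking through all pages follows. It relies only on the `page.totalElements` and `_embedded.genes` fields seen in the response above; `iter_all_genes` and the `fetch_page` callable are hypothetical names introduced for illustration.

```python
import math


def iter_all_genes(fetch_page, size=20):
    """Yield every gene of a paginated /genes search, page by page.

    fetch_page(page, size) is a caller-supplied function (hypothetical) that
    returns the decoded JSON body of one page of results.
    """
    first = fetch_page(0, size)
    total = first["page"]["totalElements"]
    total_pages = math.ceil(total / size)
    # "_embedded" is read defensively in case a page carries no genes
    yield from first.get("_embedded", {}).get("genes", [])
    for page in range(1, total_pages):
        body = fetch_page(page, size)
        yield from body.get("_embedded", {}).get("genes", [])
```

With `requests`, `fetch_page` could wrap the same query string used above, replacing the fixed `page=0&size=20` with the values passed in.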


### 4. Extract all phenotypes which are present in a particular gene set (e.g. genes together in a pathway)
Using [MGI:2444773](https://www.mousephenotype.org/data/genes/MGI:2444773), [MGI:1351500](https://www.mousephenotype.org/data/genes/MGI:1351500), [MGI:2157522](https://www.mousephenotype.org/data/genes/MGI:2157522), [MGI:2141861](https://www.mousephenotype.org/data/genes/MGI:2141861), [MGI:3588194](https://www.mousephenotype.org/data/genes/MGI:3588194), [MGI:1918313](https://www.mousephenotype.org/data/genes/MGI:1918313), [MGI:2444431](https://www.mousephenotype.org/data/genes/MGI:2444431), [MGI:1913658](https://www.mousephenotype.org/data/genes/MGI:1913658), [MGI:1922354](https://www.mousephenotype.org/data/genes/MGI:1922354), [MGI:1917336](https://www.mousephenotype.org/data/genes/MGI:1917336).
> Hint 💡: The light-weight `/genes` endpoint contains all the searchable fields, if you don't need any extra data there is no need to use the heavy-weight `/geneBundles` endpoint.

In [4]:
target_genes = ['MGI:2444773', 'MGI:1351500', 'MGI:2157522', 'MGI:2141861', 'MGI:3588194', 'MGI:1918313', 'MGI:2444431', 'MGI:1913658', 'MGI:1922354', 'MGI:1917336']

genes_in_gene_list_query = f"{impc_api_search_url}/search/findAllByMgiAccessionIdIn?mgiAccessionIds={','.join(target_genes)}"

genes_in_gene_list = requests.get(genes_in_gene_list_query).json()
list_of_mp_terms_vs_gene_index = {}

for gene in genes_in_gene_list['_embedded']['genes']:
    mp_terms = gene['significantMpTermNames']
    gene_acc_id = gene["mgiAccessionId"]
    if mp_terms is None:
        continue
    for mp_term_name in mp_terms:
        if mp_term_name not in list_of_mp_terms_vs_gene_index:
            list_of_mp_terms_vs_gene_index[mp_term_name] = {"mp_term": mp_term_name, "genes": []}
        list_of_mp_terms_vs_gene_index[mp_term_name]["genes"].append(gene_acc_id)
genes_by_mp_term = list(list_of_mp_terms_vs_gene_index.values())
display(HTML(tabulate.tabulate([i.values() for i in genes_by_mp_term], headers=genes_by_mp_term[0].keys(), tablefmt='html')))

mp_term,genes
persistence of hyaloid vascular system,['MGI:1913658']
decreased bone mineral content,"['MGI:1913658', 'MGI:1913658', 'MGI:2444773', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194']"
decreased bone mineral density,"['MGI:1913658', 'MGI:2444773', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194']"
cataract,['MGI:1913658']
abnormal posterior eye segment morphology,['MGI:1913658']
abnormal eye morphology,"['MGI:1913658', 'MGI:1913658', 'MGI:1918313']"
abnormal vitreous body morphology,['MGI:1913658']
abnormal eye development,['MGI:1913658']
abnormal bone structure,"['MGI:1913658', 'MGI:1913658', 'MGI:1913658', 'MGI:2157522', 'MGI:2444773', 'MGI:2444773', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194']"
abnormal bone mineral content,"['MGI:1913658', 'MGI:1913658', 'MGI:2444773', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194', 'MGI:3588194']"
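
In the table above the same accession ID can appear several times for one term (for example `MGI:3588194` under *decreased bone mineral content*). This presumably happens because a term name occurs more than once in a gene's `significantMpTermNames` list, although that is an assumption. A minimal sketch of building the same index with unique gene IDs, using a hypothetical helper `index_genes_by_mp_term`:

```python
from collections import defaultdict


def index_genes_by_mp_term(genes):
    """Map each significant MP term name to the sorted, unique gene IDs having it."""
    index = defaultdict(set)
    for gene in genes:
        # significantMpTermNames can be None for genes without significant hits
        for mp_term_name in gene.get("significantMpTermNames") or []:
            index[mp_term_name].add(gene["mgiAccessionId"])
    return {term: sorted(ids) for term, ids in index.items()}
```

It can be called as `index_genes_by_mp_term(genes_in_gene_list['_embedded']['genes'])`.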


### 7. Extract images with a particular phenotype or a set of phenotypes
> Warning ⚠️: The IMPC data has no direct relationship between images and phenotypes, but it is possible to get all the images for every gene that has a significant hit for a given phenotype.
>
> Hint 💡: The images live inside each individual gene bundle under the field `geneImages`. The easiest way to query images by phenotype is to hit the `/genes` endpoint first to find the matching genes, and then fetch each gene's bundle.

Using *abnormal femur morphology* ([MP:0000559](https://www.mousephenotype.org/data/phenotypes/MP:0000559)) and *abnormal digit morphology* ([MP:0002110](https://www.mousephenotype.org/data/phenotypes/MP:0002110))

#### First let's get the genes from the light-weight `/genes` endpoint:

In [5]:
target_mp_terms = ['MP:0002110', 'MP:0000559']

# Results are paginated with the page and size parameters; by default the endpoint returns the first 20 hits
gene_by_phenotypes_query = f"{impc_api_search_url}/search/findAllBySignificantMpTermIdsContains?mpTermIds={','.join(target_mp_terms)}&page=0&size=20"
genes_with_morphology_mps = requests.get(gene_by_phenotypes_query).json()
print(f"Genes with {target_mp_terms}: {genes_with_morphology_mps['page']['totalElements']}")
list_of_gene_bundle_urls = [gene["_links"]["geneBundle"]['href'] for gene in genes_with_morphology_mps['_embedded']['genes']]

Genes with ['MP:0002110', 'MP:0000559']: 113
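
Each URL in `list_of_gene_bundle_urls` needs its own request to the heavy-weight `/geneBundles` endpoint. If fetching them one at a time is too slow, a small thread pool is one option. The sketch below is an illustration only: `fetch_all` is a hypothetical helper, and whether the API tolerates parallel requests is not documented here, so the worker count is kept deliberately low.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch_json, max_workers=4):
    """Fetch every URL with a small thread pool, keeping the input order.

    fetch_json(url) is a caller-supplied function that returns the decoded
    JSON body for one URL.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of the input URLs
        return list(pool.map(fetch_json, urls))
```

For example, `gene_bundles = fetch_all(list_of_gene_bundle_urls, lambda url: requests.get(url).json())`.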


#### Now let's get the bundles from the heavy-weight `/geneBundles` endpoint (this may take a while):

In [6]:
gene_bundles = []
for gene_bundle_url in list_of_gene_bundle_urls:
    gene_bundle = requests.get(gene_bundle_url).json()
    gene_bundles.append(gene_bundle)

images_with_morphology_mps = []

# Use only the first 20 bundles and keep only the display fields of each image
display_fields = ['geneSymbol', 'parameterName', 'biologicalSampleGroup', 'colonyId', 'zygosity', 'sex', 'downloadUrl', 'externalSampleId', 'thumbnailUrl']
for gene_bundle in gene_bundles[:20]:
    if gene_bundle["geneImages"] is not None:
        images = gene_bundle["geneImages"]
        for image in images:
            display_image = {k:v for k,v in image.items() if k in display_fields}
            images_with_morphology_mps.append(display_image)

images_table = []

# Display just the first 20 images
for i in images_with_morphology_mps[:20]:
    row = [f"<img src='{i['thumbnailUrl']}' />"] + list(i.values())
    images_table.append(row)

display(HTML(tabulate.tabulate(images_table, headers=["thumbnail"] + list(images_with_morphology_mps[0].keys()), tablefmt='unsafehtml')))

thumbnail,externalSampleId,geneSymbol,biologicalSampleGroup,sex,colonyId,zygosity,parameterName,downloadUrl,thumbnailUrl
,30418302,Chpf,experimental,female,H5694-HEPD0515_3_B08-1,homozygote,XRay Images Whole Body Dorso Ventral,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/470489,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/470489
,30410884,Chpf,experimental,male,H5694-HEPD0515_3_B08-1,homozygote,XRay Images Whole Body Dorso Ventral,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/470498,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/470498
,30410861,Chpf,experimental,female,H5694-HEPD0515_3_B08-1,homozygote,XRay Images Whole Body Lateral Orientation,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/445280,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/445280
,30410882,Chpf,experimental,male,H5694-HEPD0515_3_B08-1,homozygote,XRay Images Whole Body Dorso Ventral,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/470499,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/470499
,30418302,Chpf,experimental,female,H5694-HEPD0515_3_B08-1,homozygote,XRay Images Whole Body Lateral Orientation,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/445276,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/445276
,30410879,Chpf,experimental,female,H5694-HEPD0515_3_B08-1,homozygote,XRay Images Whole Body Lateral Orientation,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/445284,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/445284
,30410862,Chpf,experimental,male,H5694-HEPD0515_3_B08-1,homozygote,XRay Images Whole Body Lateral Orientation,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/444957,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/444957
,30410860,Chpf,experimental,female,H5694-HEPD0515_3_B08-1,homozygote,XRay Images Whole Body Dorso Ventral,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/470490,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/470490
,30410879,Chpf,experimental,female,H5694-HEPD0515_3_B08-1,homozygote,XRay Images Whole Body Dorso Ventral,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/470500,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/470500
,30418293,Chpf,experimental,female,H5694-HEPD0515_3_B08-1,homozygote,XRay Images Whole Body Lateral Orientation,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/445285,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/445285
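
The `downloadUrl` and `thumbnailUrl` values above are protocol-relative (they start with `//`), so they need a scheme before they can be passed to `requests` or opened outside a web page. The sketch below collects the images in the same way as the cell above and prefixes such URLs with `https:`. The helper names are hypothetical, and serving over HTTPS is an assumption about the image host.

```python
def absolute_url(url, scheme="https"):
    """Turn a protocol-relative URL ("//host/path") into an absolute one."""
    return f"{scheme}:{url}" if url.startswith("//") else url


def collect_images(gene_bundles, fields):
    """Return one dict per image, restricted to the given fields."""
    images = []
    for bundle in gene_bundles:
        # geneImages may be missing or None for genes without images
        for image in bundle.get("geneImages") or []:
            row = {k: v for k, v in image.items() if k in fields}
            for key in ("downloadUrl", "thumbnailUrl"):
                if row.get(key):
                    row[key] = absolute_url(row[key])
            images.append(row)
    return images
```

It can be called as `collect_images(gene_bundles[:20], display_fields)`.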
