# Mouse Phenotype API

Example queries and simple data processing for the [mousephenotype.org](https://www.mousephenotype.org) API.

## Endpoints

- `/genes`: Gene search endpoint
- `/geneBundles`: Gene bundled data endpoint
- `/statisticalResults`: Statistical results data
- `/observations`: Raw experimental data

## Example queries

### Setting things up

In [1]:
import requests
from IPython.display import HTML, display
import tabulate
from tqdm.notebook import trange

# No trailing slash here: every endpoint URL below adds its own "/"
impc_api_url = "https://www.gentar.org/impc-dev-api"
impc_api_search_url = f"{impc_api_url}/genes"
impc_api_gene_bundle_url = f"{impc_api_url}/geneBundles"
impc_api_statistical_results_url = f"{impc_api_url}/statisticalResults"
impc_api_observations_url = f"{impc_api_url}/observations"
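
The search queries later in this notebook are assembled by hand with f-strings. As a minimal sketch, the same URLs could be built with `urllib.parse`, which keeps slash handling and parameter encoding in one place. `build_search_url` is a hypothetical helper name, not part of the API; the base URL and the finder name are the ones used in this notebook.

```python
from urllib.parse import urlencode

IMPC_API_URL = "https://www.gentar.org/impc-dev-api"  # base URL used in this notebook


def build_search_url(endpoint, finder, **params):
    """Hypothetical helper: build '<base>/<endpoint>/search/<finder>?<params>'."""
    base = IMPC_API_URL.rstrip("/")
    # safe=":," keeps MGI/MP identifiers and comma-separated lists readable
    query = urlencode(params, safe=":,")
    url = f"{base}/{endpoint.strip('/')}/search/{finder}"
    return f"{url}?{query}" if query else url


example_url = build_search_url(
    "genes",
    "findAllBySignificantMpTermIdsContains",
    mpTermIds=",".join(["MP:0002606", "MP:0005178"]),
    page=0,
    size=20,
)
print(example_url)
```

The resulting string can be passed to `requests.get` exactly like the hand-written queries.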

### 1. Extract all measured phenotypes related to this gene
Using [Cib2 - MGI:1929293](https://www.mousephenotype.org/data/genes/MGI:1929293)
> Hint 💡: For any query that relies directly on an MGI Accession ID you can use the `/geneBundles` endpoint directly.

In [2]:
mgi_accession_id = "MGI:1929293"

# https://www.gentar.org/impc-dev-api/geneBundles/MGI:1929293
gene_bundle_url = f"{impc_api_gene_bundle_url}/{mgi_accession_id}"
gene_bundle =  requests.get(gene_bundle_url).json()
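
The cell above only downloads the bundle. One possible way to inspect it, sketched here under assumptions, is to group the significant phenotype associations by procedure. The fields `genePhenotypeAssociations` and `procedureStableId` are the ones used in the X-ray example later in this notebook; the sample bundle is made up so the snippet runs offline, and `associations_by_procedure` is a hypothetical helper.

```python
from collections import defaultdict


def associations_by_procedure(gene_bundle):
    """Group a bundle's phenotype associations by procedure stable id."""
    grouped = defaultdict(list)
    # The field may be missing or None for genes without significant hits
    for hit in gene_bundle.get("genePhenotypeAssociations") or []:
        procedure = hit.get("procedureStableId")
        # Assumption: the id may arrive as a string or as a list of strings
        procedure_ids = procedure if isinstance(procedure, list) else [procedure]
        for procedure_id in procedure_ids:
            grouped[procedure_id].append(hit)
    return dict(grouped)


# Made-up stand-in for requests.get(gene_bundle_url).json()
sample_bundle = {
    "mgiAccessionId": "MGI:1929293",
    "genePhenotypeAssociations": [
        {"procedureStableId": ["IMPC_ABR_002"]},
        {"procedureStableId": "IMPC_ABR_002"},
        {"procedureStableId": "IMPC_XRY_001"},
    ],
}
grouped = associations_by_procedure(sample_bundle)
print({procedure: len(hits) for procedure, hits in grouped.items()})
```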

### 2. Extract all genes related to a phenotype **and** 3. Extract all genes having a particular phenotype or a set of phenotypes (e.g. relevant to a disease)

Using increased basophil cell number ([MP:0002606](https://www.mousephenotype.org/data/phenotypes/MP:0002606)) and increased circulating cholesterol level ([MP:0005178](https://www.mousephenotype.org/data/phenotypes/MP:0005178))
> Hint 💡: For any query that needs to perform a search, hit the `/genes` endpoint first and take the gene bundle URLs from the response to get the actual data.

In [3]:
target_mp_terms = ['MP:0002606', 'MP:0005178']

## All the data is paginated using the page and size parameters; by default the endpoint returns the first 20 hits
gene_by_phenotypes_query = f"{impc_api_search_url}/search/findAllBySignificantMpTermIdsContains?mpTermIds={','.join(target_mp_terms)}&page=0&size=20"
genes_with_clinical_chemistry_phenotypes = requests.get(gene_by_phenotypes_query).json()
print(f"Genes with {target_mp_terms}: {genes_with_clinical_chemistry_phenotypes['page']['totalElements']}")
list_of_genes = []

for gene in genes_with_clinical_chemistry_phenotypes['_embedded']['genes']:
    gene_dict = {"gene_accession_id": gene['mgiAccessionId'], "gene_name": gene['markerName'], "gene_bundle_url": gene["_links"]["geneBundle"]['href']}
    list_of_genes.append(gene_dict)

display(HTML(tabulate.tabulate([i.values() for i in list_of_genes], headers=list_of_genes[0].keys(), tablefmt='html')))

Genes with ['MP:0002606', 'MP:0005178']: 278


gene_accession_id,gene_name,gene_bundle_url
MGI:107564,Ngfi-A binding protein 1,https://www.gentar.org/impc-dev-api/geneBundles/MGI:107564
MGI:1097158,calumenin,https://www.gentar.org/impc-dev-api/geneBundles/MGI:1097158
MGI:108077,neuroplastin,https://www.gentar.org/impc-dev-api/geneBundles/MGI:108077
MGI:1915889,SET and MYND domain containing 2,https://www.gentar.org/impc-dev-api/geneBundles/MGI:1915889
MGI:104993,leptin receptor,https://www.gentar.org/impc-dev-api/geneBundles/MGI:104993
MGI:107893,"DNA segment, Chr 6, Wayne State University 163, expressed",https://www.gentar.org/impc-dev-api/geneBundles/MGI:107893
MGI:101947,heterogeneous nuclear ribonucleoprotein D,https://www.gentar.org/impc-dev-api/geneBundles/MGI:101947
MGI:102784,transforming growth factor beta 1 induced transcript 1,https://www.gentar.org/impc-dev-api/geneBundles/MGI:102784
MGI:107628,"PR domain containing 2, with ZNF domain",https://www.gentar.org/impc-dev-api/geneBundles/MGI:107628
MGI:105985,a disintegrin and metallopeptidase domain 26A (testase 3),https://www.gentar.org/impc-dev-api/geneBundles/MGI:105985
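
The query above reports 278 matching genes but only fetches the first page. A minimal sketch of walking every page follows, assuming each response carries the `_embedded` and `page.totalPages` fields seen in this notebook. `iter_all_items` and the fake pages are hypothetical and exist only so the example runs without network access.

```python
def iter_all_items(fetch_page, collection_key):
    """Yield every item from a paginated response.

    fetch_page(page_number) must return the decoded JSON of one page.
    """
    page_number = 0
    while True:
        payload = fetch_page(page_number)
        yield from payload.get("_embedded", {}).get(collection_key, [])
        page_number += 1
        if page_number >= payload["page"]["totalPages"]:
            break


# Offline stand-in for the API: three fake genes split over two pages
fake_pages = [
    {"_embedded": {"genes": [{"mgiAccessionId": "MGI:1"}, {"mgiAccessionId": "MGI:2"}]},
     "page": {"totalPages": 2}},
    {"_embedded": {"genes": [{"mgiAccessionId": "MGI:3"}]},
     "page": {"totalPages": 2}},
]
all_genes = list(iter_all_items(lambda page_number: fake_pages[page_number], "genes"))
print([gene["mgiAccessionId"] for gene in all_genes])
```

With the live API, `fetch_page` would wrap a `requests.get(...).json()` call that sets the `page` parameter of the search query.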


### 4. Extract all phenotypes which are present in a particular gene set (e.g. genes together in a pathway)
Using [MGI:2444773](https://www.mousephenotype.org/data/genes/MGI:2444773), [MGI:1351500](https://www.mousephenotype.org/data/genes/MGI:1351500), [MGI:2157522](https://www.mousephenotype.org/data/genes/MGI:2157522), [MGI:2141861](https://www.mousephenotype.org/data/genes/MGI:2141861), [MGI:3588194](https://www.mousephenotype.org/data/genes/MGI:3588194), [MGI:1918313](https://www.mousephenotype.org/data/genes/MGI:1918313), [MGI:2444431](https://www.mousephenotype.org/data/genes/MGI:2444431), [MGI:1913658](https://www.mousephenotype.org/data/genes/MGI:1913658), [MGI:1922354](https://www.mousephenotype.org/data/genes/MGI:1922354), [MGI:1917336](https://www.mousephenotype.org/data/genes/MGI:1917336).
> Hint 💡: The light-weight `/genes` endpoint contains all the searchable fields, if you don't need any extra data there is no need to use the heavy-weight `/geneBundles` endpoint.

In [4]:
target_genes = ['MGI:2444773', 'MGI:1351500', 'MGI:2157522', 'MGI:2141861', 'MGI:3588194', 'MGI:1918313', 'MGI:2444431', 'MGI:1913658', 'MGI:1922354', 'MGI:1917336']

genes_in_gene_list_query = f"{impc_api_search_url}/search/findAllByMgiAccessionIdIn?mgiAccessionIds={','.join(target_genes)}"

genes_in_gene_list = requests.get(genes_in_gene_list_query).json()
list_of_mp_terms_vs_gene_index = {}

for gene in genes_in_gene_list['_embedded']['genes']:
    mp_terms = gene['significantMpTerms']
    gene_acc_id = gene["mgiAccessionId"]
    if mp_terms is None:
        continue
    for mp_term in mp_terms:
        mp_term_name = mp_term['mpTermName']
        if mp_term_name not in list_of_mp_terms_vs_gene_index:
            list_of_mp_terms_vs_gene_index[mp_term_name] = {"mp_term": mp_term_name, "genes": []}
        list_of_mp_terms_vs_gene_index[mp_term_name]["genes"].append(gene_acc_id)
genes_by_mp_term = list(list_of_mp_terms_vs_gene_index.values())
display(HTML(tabulate.tabulate([i.values() for i in genes_by_mp_term], headers=genes_by_mp_term[0].keys(), tablefmt='html')))

mp_term,genes
decreased bone mineral density,"['MGI:2444773', 'MGI:1913658', 'MGI:3588194']"
decreased bone mineral content,"['MGI:2444773', 'MGI:1913658', 'MGI:3588194']"
decreased startle reflex,['MGI:2444431']
increased vertical activity,['MGI:2444431']
abnormal auditory brainstem response,['MGI:2444431']
abnormal startle reflex,['MGI:2444431']
decreased thigmotaxis,['MGI:2444431']
abnormal behavior,['MGI:2444431']
cataract,['MGI:1913658']
persistence of hyaloid vascular system,['MGI:1913658']
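
For a pathway-style gene set, the phenotypes shared by several genes are often the interesting ones. The following sketch filters the same kind of index down to terms seen in at least two genes. It assumes the `significantMpTerms`, `mpTermName` and `mgiAccessionId` fields used above; `shared_mp_terms` is a hypothetical helper and the sample records are trimmed to those fields.

```python
from collections import defaultdict


def shared_mp_terms(genes, min_genes=2):
    """Return MP term names significant in at least `min_genes` genes."""
    genes_by_term = defaultdict(set)
    for gene in genes:
        # significantMpTerms is None for genes without significant phenotypes
        for mp_term in gene.get("significantMpTerms") or []:
            genes_by_term[mp_term["mpTermName"]].add(gene["mgiAccessionId"])
    return {
        term: sorted(accessions)
        for term, accessions in genes_by_term.items()
        if len(accessions) >= min_genes
    }


# Trimmed records shaped like genes_in_gene_list['_embedded']['genes']
sample_genes = [
    {"mgiAccessionId": "MGI:2444773",
     "significantMpTerms": [{"mpTermName": "decreased bone mineral density"}]},
    {"mgiAccessionId": "MGI:1913658",
     "significantMpTerms": [{"mpTermName": "decreased bone mineral density"},
                            {"mpTermName": "cataract"}]},
    {"mgiAccessionId": "MGI:2157522", "significantMpTerms": None},
]
print(shared_mp_terms(sample_genes))
```

Collecting accession IDs in a set also means a gene listed twice in the input is only counted once.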


### 7. Extract images with a particular phenotype or a set of phenotypes and 8. How many images are available with a particular phenotype (Way to access: Phenotype2Gene2Images)
> Warning ⚠️: IMPC data has no direct relationship between images and phenotypes, but it is possible to get all the images for every gene that has a significant hit for a given phenotype.
>
> Hint 💡: The images live inside each individual gene bundle under the `geneImages` field. The easiest way to query images by phenotype is to first hit the `/genes` endpoint to find the genes with a significant hit for the phenotype, and then fetch the bundle of each of those genes.

Using *abnormal femur morphology* ([MP:0000559](https://www.mousephenotype.org/data/phenotypes/MP:0000559)) and *abnormal digit morphology* ([MP:0002110](https://www.mousephenotype.org/data/phenotypes/MP:0002110))

### First let's get the genes from the light-weight `/genes` endpoint:

In [5]:
target_mp_terms = ['MP:0002110', 'MP:0000559']

## All the data is paginated using the page and size parameters; by default the endpoint returns the first 20 hits
gene_by_phenotypes_query = f"{impc_api_search_url}/search/findAllBySignificantMpTermIdsContains?mpTermIds={','.join(target_mp_terms)}&page=0&size=20"
genes_with_morphology_mps = requests.get(gene_by_phenotypes_query).json()
print(f"Genes with {target_mp_terms}: {genes_with_morphology_mps['page']['totalElements']}")
list_of_gene_bundle_urls = [gene["_links"]["geneBundle"]['href'] for gene in genes_with_morphology_mps['_embedded']['genes']]

Genes with ['MP:0002110', 'MP:0000559']: 57


### Now let's get the bundles from the heavy-weight `/geneBundles` endpoint (this may take a while):

In [6]:
gene_bundles = []
for gene_bundle_url in list_of_gene_bundle_urls:
    gene_bundle = requests.get(gene_bundle_url).json()
    gene_bundles.append(gene_bundle)

images_with_morphology_mps = []

## Doing just the first 20 and filtering out fields on the images
display_fields = ['geneSymbol', 'parameterName', 'biologicalSampleGroup', 'colonyId', 'zygosity', 'sex', 'downloadUrl', 'externalSampleId', 'thumbnailUrl']
for gene_bundle in gene_bundles[:20]:
    if "geneImages" in gene_bundle and gene_bundle["geneImages"] is not None:
        images = gene_bundle["geneImages"]
        for image in images:
            display_image = {k:v for k,v in image.items() if k in display_fields}
            images_with_morphology_mps.append(display_image)

images_table = []

## Displaying just the first 20 images
for i in images_with_morphology_mps[:20]:
    row = [f"<img src='{i['thumbnailUrl']}' />"] + list(i.values())
    images_table.append(row)

display(HTML(tabulate.tabulate(images_table, headers=["thumbnail"] + list(images_with_morphology_mps[0].keys()) , tablefmt='unsafehtml')))

thumbnail,externalSampleId,geneSymbol,biologicalSampleGroup,sex,colonyId,zygosity,parameterName,downloadUrl,thumbnailUrl
,1710708,Myo10,experimental,female,MFQX,homozygote,XRay Images Whole Body Dorso Ventral,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/1060661,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/1060661
,1677319,Myo10,experimental,male,MFQX,homozygote,Images Slit Lamp,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/117662,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/117662
,1723876,Myo10,experimental,female,MFQX,homozygote,Images Ophthalmoscopy,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/117403,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/117403
,1677319,Myo10,experimental,male,MFQX,homozygote,XRay Images Forepaw,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/1139034,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/1139034
,1651302,Myo10,experimental,female,MFQX,homozygote,XRay Images Whole Body Lateral Orientation,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/1058525,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/1058525
,1710708,Myo10,experimental,female,MFQX,homozygote,XRay Images Skull Lateral Orientation,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/1059414,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/1059414
,1723876,Myo10,experimental,female,MFQX,homozygote,XRay Images Forepaw,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/1139027,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/1139027
,1710708,Myo10,experimental,female,MFQX,homozygote,XRay Images Forepaw,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/1139028,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/1139028
,1723876,Myo10,experimental,female,MFQX,homozygote,XRay Images Skull Lateral Orientation,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/1059413,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/1059413
,1710706,Myo10,experimental,female,MFQX,homozygote,XRay Images Skull Dorso Ventral Orientation,//www.ebi.ac.uk/mi/media/omero/webgateway/archived_files/download/1118437,//www.ebi.ac.uk/mi/media/omero/webgateway/render_birds_eye_view/1118437
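
Question 8 asks how many images are available. A short sketch of counting the collected image records, in total and per field, is given below. It assumes records shaped like the entries of `images_with_morphology_mps`; `count_images` is a hypothetical helper and the sample records are trimmed to two of the displayed fields.

```python
from collections import Counter


def count_images(images, field="parameterName"):
    """Count image records by one of their fields."""
    return Counter(image.get(field) for image in images)


# Trimmed records shaped like the entries of images_with_morphology_mps
sample_images = [
    {"geneSymbol": "Myo10", "parameterName": "XRay Images Forepaw"},
    {"geneSymbol": "Myo10", "parameterName": "XRay Images Forepaw"},
    {"geneSymbol": "Myo10", "parameterName": "Images Slit Lamp"},
]
print(len(sample_images))  # total number of images
print(count_images(sample_images).most_common())
print(count_images(sample_images, field="geneSymbol"))
```

Note that the counts only cover the gene bundles that were actually downloaded, so they depend on the page size used in the gene search.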


### 9. Which parameters have been measured for a particular gene

In [7]:
# Using Cib2 MGI:1929293
cib2_gene_query = f"{impc_api_search_url}/search/getGeneByMgiAccessionId?mgiAccessionId=MGI:1929293"
cib2_gene_data = requests.get(cib2_gene_query).json()
cib2_gene_parameters = cib2_gene_data["testedParameters"]
headers = {'parameterName': 'Parameter name', 'parameterStableId': 'Parameter stable id', 'pipelineName': 'Pipeline name', 'pipelineStableId': 'Pipeline stable id', 'procedureName': 'Procedure name', 'procedureStableId': 'Procedure stable id'}
print('Tested parameters for Cib2')
display(HTML(tabulate.tabulate(cib2_gene_parameters, headers=headers, tablefmt='unsafehtml')))

Tested parameters for Cib2


Parameter name,Parameter stable id,Pipeline name,Pipeline stable id,Procedure name,Procedure stable id
Ovary,IMPC_ALZ_010_001,Harwell,HRWL_001,Adult LacZ,IMPC_ALZ_001
Skull shape,IMPC_XRY_001_001,Harwell,HRWL_001,X-ray,IMPC_XRY_001
Genitalia - morphology,IMPC_CSD_073_001,Harwell,HRWL_001,Combined SHIRPA and Dysmorphology,IMPC_CSD_003
Skin texture - whole body,IMPC_CSD_062_001,Harwell,HRWL_001,Combined SHIRPA and Dysmorphology,IMPC_CSD_003
Clavicle,IMPC_XRY_007_001,Harwell,HRWL_001,X-ray,IMPC_XRY_001
Fusion of vertebrae,IMPC_XRY_019_001,Harwell,HRWL_001,X-ray,IMPC_XRY_001
XRay Images Whole Body Lateral Orientation,IMPC_XRY_048_001,Harwell,HRWL_001,X-ray,IMPC_XRY_001
Snout size,IMPC_CSD_028_001,Harwell,HRWL_001,Combined SHIRPA and Dysmorphology,IMPC_CSD_003
18kHz-evoked ABR Threshold,IMPC_ABR_008_001,Harwell,HRWL_001,Auditory Brain Stem Response,IMPC_ABR_002
Tail - presence,IMPC_CSD_001_001,Harwell,HRWL_001,Combined SHIRPA and Dysmorphology,IMPC_CSD_003


### 10. Which parameters identified a significant finding for a particular knockout

In [8]:
# Using Cib2 MGI:1929293
page_size = 100
cib2_significant_stats_query = f"{impc_api_statistical_results_url}/search/findAllByMarkerAccessionIdIsAndSignificantTrue?mgiAccessionId=MGI:1929293&size={page_size}"
cib2_significant_stats_first_page = requests.get(cib2_significant_stats_query).json()
total_pages = cib2_significant_stats_first_page["page"]["totalPages"]
cib2_significant_parameters = []

for page_number in trange(0, total_pages):
    page_query = f"{impc_api_statistical_results_url}/search/findAllByMarkerAccessionIdIsAndSignificantTrue?mgiAccessionId=MGI:1929293&page={page_number}&size={page_size}"
    page_stats = requests.get(page_query).json()["_embedded"]["statisticalResults"]
    cib2_significant_parameters += [stats_result["parameterName"] for stats_result in page_stats]

print(cib2_significant_parameters)
# cib2_significant_stats_parameters = cib2_gene_data["testedParameters"]
# headers = {'parameterName': 'Parameter name', 'parameterStableId': 'Parameter stable id', 'pipelineName': 'Pipeline name', 'pipelineStableId': 'Pipeline stable id', 'procedureName': 'Procedure name', 'procedureStableId': 'Procedure stable id'}
# print('Tested parameters for Cib2')
# display(HTML(tabulate.tabulate(cib2_gene_parameters, headers=headers, tablefmt='unsafehtml')))

['% Pre-pulse inhibition - Global', 'Total cholesterol', '12kHz-evoked ABR Threshold', '18kHz-evoked ABR Threshold', '% Pre-pulse inhibition - PPI2', 'Ears', 'Startle response', '24kHz-evoked ABR Threshold', 'Response amplitude - S', 'Limb grasp', 'Triglycerides', 'Fructosamine', 'Tremor', '6kHz-evoked ABR Threshold', '% Pre-pulse inhibition - PPI3', 'Basophil cell count', '% Pre-pulse inhibition - PPI4', 'HDL-cholesterol']


### 11. How many genes have been measured inside a particular pipeline

In [9]:
# Using IMPC_001 -> IMPC Standard Early Adult Pipeline
genes_by_tested_pipeline_query = f"{impc_api_search_url}/search/findAllByTestedPipelineId?pipelineId=IMPC_001&size=0"
impc_001_tested_genes_req = requests.get(genes_by_tested_pipeline_query, headers={"Accept": "application/json"})
print(f'Total measured genes for IMPC_001: {impc_001_tested_genes_req.json()["page"]["totalElements"]}')


Total measured genes for IMPC_001: 8075


### 12. Extract all genes and corresponding phenotypes related to a particular organ system (via top-level MP terms for organ systems; similar search as significant phenotypes)

In [10]:
# using MP:0005391 (vision/eye phenotype)
target_mp_terms = ['MP:0005391']

## All the data is paginated using the page and size parameters; by default the endpoint returns the first 20 hits
gene_by_top_level_phenotypes_query = f"{impc_api_search_url}/search/findAllBySignificantTopLevelMpTermIdsContains?mpTermIds={','.join(target_mp_terms)}&page=0&size=20"

genes_with_vision_eye_phenotypes = requests.get(gene_by_top_level_phenotypes_query).json()
print(f"Genes with {target_mp_terms}: {genes_with_vision_eye_phenotypes['page']['totalElements']}")
list_of_genes = []

for gene in genes_with_vision_eye_phenotypes['_embedded']['genes']:
    gene_dict = {"gene_accession_id": gene['mgiAccessionId'], "gene_name": gene['markerName'], "gene_bundle_url": gene["_links"]["geneBundle"]['href']}
    list_of_genes.append(gene_dict)

display(HTML(tabulate.tabulate([i.values() for i in list_of_genes], headers=list_of_genes[0].keys(), tablefmt='html')))

Genes with ['MP:0005391']: 1459


gene_accession_id,gene_name,gene_bundle_url
MGI:1889810,glycoprotein 6 (platelet),https://www.gentar.org/impc-dev-api/geneBundles/MGI:1889810
MGI:1916947,"calmodulin regulated spectrin-associated protein family, member 3",https://www.gentar.org/impc-dev-api/geneBundles/MGI:1916947
MGI:2147870,leucine-rich repeats and calponin homology (CH) domain containing 2,https://www.gentar.org/impc-dev-api/geneBundles/MGI:2147870
MGI:106913,abl interactor 2,https://www.gentar.org/impc-dev-api/geneBundles/MGI:106913
MGI:103062,signal transducer and activator of transcription 4,https://www.gentar.org/impc-dev-api/geneBundles/MGI:103062
MGI:104663,leptin,https://www.gentar.org/impc-dev-api/geneBundles/MGI:104663
MGI:105959,cytochrome c oxidase subunit 8A,https://www.gentar.org/impc-dev-api/geneBundles/MGI:105959
MGI:109362,keratin 86,https://www.gentar.org/impc-dev-api/geneBundles/MGI:109362
MGI:104589,"actin, gamma 2, smooth muscle, enteric",https://www.gentar.org/impc-dev-api/geneBundles/MGI:104589
MGI:1329035,gamma-glutamyl hydrolase,https://www.gentar.org/impc-dev-api/geneBundles/MGI:1329035


### 14. Full table of genes and all identified phenotypes

In [11]:
page_size = 1000
## Every page is requested from the same search endpoint with the same sort order, so the pages line up
complete_list_gene_base_query = f"{impc_api_search_url}/search/significantPhenotypesByGene/?size={page_size}&sort=_id&_id.dir=asc"

complete_list_gene_first_page_info = requests.get(f"{complete_list_gene_base_query}&page=0").json()

total_pages = complete_list_gene_first_page_info["page"]["totalPages"]
complete_gene_list = complete_list_gene_first_page_info["_embedded"]["genes"]

for page_number in trange(1, total_pages):
    page_query = f"{complete_list_gene_base_query}&page={page_number}"
    page_genes = requests.get(page_query).json()["_embedded"]["genes"]
    complete_gene_list += page_genes

print(len(complete_gene_list))

26726
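
The cell above ends with a list of gene records rather than a table. One way to flatten it into one row per gene and phenotype is sketched below, assuming the records expose `mgiAccessionId`, `markerName` and `significantMpTerms` as the `/genes` results do elsewhere in this notebook. `gene_phenotype_rows` is a hypothetical helper and the sample records are placeholders, not real IMPC data.

```python
import csv
import io


def gene_phenotype_rows(genes):
    """Yield one (accession id, gene name, MP term name) row per significant phenotype."""
    for gene in genes:
        # Genes without significant phenotypes contribute no rows
        for mp_term in gene.get("significantMpTerms") or []:
            yield gene["mgiAccessionId"], gene.get("markerName"), mp_term["mpTermName"]


# Placeholder records standing in for complete_gene_list
sample_genes = [
    {"mgiAccessionId": "MGI:0000001", "markerName": "example gene 1",
     "significantMpTerms": [{"mpTermName": "example phenotype A"},
                            {"mpTermName": "example phenotype B"}]},
    {"mgiAccessionId": "MGI:0000002", "markerName": "example gene 2",
     "significantMpTerms": None},
]

rows = list(gene_phenotype_rows(sample_genes))
buffer = io.StringIO()  # swap for open("genes_phenotypes.csv", "w", newline="") to write a file
writer = csv.writer(buffer)
writer.writerow(["gene_accession_id", "gene_name", "mp_term"])
writer.writerows(rows)
print(buffer.getvalue())
```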


### 15. Extract all measurements (and statistical analysis) for a particular parameter or pipeline

### 16. Extract all genes and the measured values for a particular parameter

In [12]:
target_parameter = 'IMPC_IPG_012_001'
page_size = 1000
page_limit = 2 # set to -1 to get all the pages

observations_by_parameter_query = f"{impc_api_observations_url}/search/findAllByParameterStableId?parameterStableId={target_parameter}&size={page_size}"

first_page_info = requests.get(observations_by_parameter_query).json()
total_pages = first_page_info["page"]["totalPages"]
total_items = first_page_info["page"]["totalElements"]
observations_by_parameter = first_page_info["_embedded"]["observations"]

print(f"Getting {total_items} observations")

# Never request more pages than the API reports, even if page_limit is larger
for page_number in trange(1, min(page_limit, total_pages) if page_limit > 0 else total_pages):
    page_query = f"{impc_api_observations_url}/search/findAllByParameterStableId?parameterStableId={target_parameter}&page={page_number}&size={page_size}"
    page_observations = requests.get(page_query).json()["_embedded"]["observations"]
    observations_by_parameter += page_observations

print(len(observations_by_parameter))

# Returns the observations with the schema described in https://github.com/mpi2/impc-etl/wiki/Observations-Output-Schema
# You can get the genes by getting all the values for the field gene_accession_id




Getting 83907 observations


2000
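
As the comment in the cell notes, the genes can be read from the `gene_accession_id` field of each observation. A minimal sketch of grouping measured values by gene follows. The value field name `data_point` is an assumption and should be checked against the observations schema linked above; `values_by_gene` is a hypothetical helper and the sample records are made up.

```python
from collections import defaultdict


def values_by_gene(observations, gene_field="gene_accession_id", value_field="data_point"):
    """Group measured values by gene accession id."""
    grouped = defaultdict(list)
    for observation in observations:
        gene = observation.get(gene_field)
        if gene is None:
            # Assumption: records without a gene id are skipped here
            continue
        grouped[gene].append(observation.get(value_field))
    return dict(grouped)


# Made-up records; only the two fields used above are included
sample_observations = [
    {"gene_accession_id": "MGI:0000001", "data_point": 1.5},
    {"gene_accession_id": "MGI:0000001", "data_point": 2.0},
    {"gene_accession_id": "MGI:0000002", "data_point": 0.7},
    {"gene_accession_id": None, "data_point": 1.1},
]
grouped_values = values_by_gene(sample_observations)
print(sorted(grouped_values))
print(grouped_values["MGI:0000001"])
```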


### 17. Extract all X-Ray images, the related phenotypes and other metadata
> Warning ⚠️: IMPC data has no direct relationship between images and phenotypes, but it is possible to get all the images for every gene that has a significant hit for a given phenotype.
>
> Hint 💡: The images live inside each individual gene bundle under the `geneImages` field. The easiest way to query images by phenotype is to first hit the `/genes` endpoint to search for genes tested with a specific procedure (IMPC_XRY_001 - X-ray) and then get the gene bundles to find the images and phenotypes for those X-ray tested genes. You could also filter the significant phenotype associations by procedure.

In [13]:
target_procedure = 'IMPC_XRY_001'

## All the data is paginated using the page and size parameters; no size is set here, so only the default first 20 genes are fetched
gene_by_tested_procedure_query = f"{impc_api_search_url}/search/findAllByTestedProcedureId?procedureId={target_procedure}"
gene_by_tested_procedure = requests.get(gene_by_tested_procedure_query).json()
print(f"Genes with {target_procedure} tested procedure: {gene_by_tested_procedure['page']['totalElements']}")
list_of_gene_bundle_urls = [gene["_links"]["geneBundle"]['href'] for gene in gene_by_tested_procedure['_embedded']['genes']]

gene_bundles = []
for gene_bundle_url in list_of_gene_bundle_urls:
    gene_bundle = requests.get(gene_bundle_url).json()
    gene_bundles.append(gene_bundle)

images_by_gene_and_procedure = {}
significant_phenotypes_by_gene_and_procedure = {}

for gene_bundle in gene_bundles:
    gene_acc = gene_bundle["mgiAccessionId"]
    
    if gene_acc not in images_by_gene_and_procedure:
        images_by_gene_and_procedure[gene_acc] = []
    # geneImages can be missing or None for genes without images
    images_by_gene_and_procedure[gene_acc] += [image_dict for image_dict in (gene_bundle.get("geneImages") or []) if image_dict["procedureStableId"] == target_procedure]
    
    if gene_acc not in significant_phenotypes_by_gene_and_procedure:
        significant_phenotypes_by_gene_and_procedure[gene_acc] = []
    if gene_bundle["genePhenotypeAssociations"] is not None:
        significant_phenotypes_by_gene_and_procedure[gene_acc] += [gp_hit for gp_hit in gene_bundle["genePhenotypeAssociations"] if gp_hit["procedureStableId"] == [target_procedure]]

print(images_by_gene_and_procedure)
print(significant_phenotypes_by_gene_and_procedure)


Genes with IMPC_XRY_001 tested procedure: 4953
{'MGI:104562': [{'observationId': 'f619221140fdf59882e500482de68607', 'downloadFilePath': 'https://images.mousephenotype.org/src/3/14/11173/19/562/20031/1669664.dcm', 'phenotypingCenter': 'HMGU', 'pipelineStableId': 'HMGU_001', 'procedureStableId': 'IMPC_XRY_001', 'parameterStableId': 'IMPC_XRY_034_001', 'datasourceName': 'IMPC', 'experimentSourceFile': 'GMC/HMGU_001_GMC.2022-02-04.250.experiment.xml', 'specimenSourceFile': 'GMC/HMGU_001_GMC.2022-02-04.14.specimen.xml', 'experimentId': 'd79763f07dd23832cae3394b36628fb2', 'specimenId': '1b6a6e3faff557c9fd929a6f216d314b', 'alleleAccessionId': 'MGI:6120686', 'projectName': 'Helmholtz GMC', 'strainAccessionId': 'MGI:2683688', 'litterId': '67852', 'phenotypingCons': 'Helmholtz GMC', 'externalSampleId': '30419098', 'developmentalStageName': 'postnatal', 'developmentalStageAcc': 'EFO:0002948', 'ageInDays': 98, 'dateOfBirth': '2017-06-21T00:00:00Z', 'metadata': ['Experimenter ID (analysis) = 194',