## MedGen Mode of Inheritance (MOI) Concept Selection

Select the MedGen concepts (CUIs) that correspond to HPO mode-of-inheritance terms, using the MedGen-to-HPO mapping file.

### Group HPO MOI Concepts

- HPO MOI: https://hpo.jax.org/app/browse/term/HP:0000005 ("hierarchy" on the left).
- OLS: https://www.ebi.ac.uk/ols/ontologies/hp/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FHP_0000005&viewMode=All&siblings=true
- Descendants (OLS API): http://www.ebi.ac.uk/ols/api/ontologies/hp/descendants?id=HP:0000005&size=500

**NOTE**: In the future, use this instead: https://colab.research.google.com/drive/1TWyVw164bkY4fjodgQ49rZdvTmy9wVyN?usp=sharing

In [20]:
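import requests
import json
# Fetch all descendants of HP:0000005 ("Mode of inheritance") from the OLS API
url = 'http://www.ebi.ac.uk/ols/api/ontologies/hp/descendants?id=HP:0000005&size=500'
resp = requests.get(url)
resp.raise_for_status()  # fail early on an HTTP error
r = resp.json()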

In [21]:
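# Group HPO IDs by whether the term label mentions "dominant" or "recessive"
moi_concepts = {}
for t in r['_embedded']['terms']:
    print(t['iri'], t['label'])
    label = t['label'].lower()
    # Substring match: "Semidominant" and X-linked terms are grouped as well
    if 'dominant' in label:
        typ = 'dominant'
    elif 'recessive' in label:
        typ = 'recessive'
    else:
        continue  # skip terms that are neither dominant nor recessive
    # Convert the IRI suffix (e.g. HP_0000006) to a CURIE (HP:0000006)
    moi_concepts.setdefault(typ, []).append(t['iri'].split('/')[-1].replace('_', ':'))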

http://purl.obolibrary.org/obo/HP_0000006 Autosomal dominant inheritance
http://purl.obolibrary.org/obo/HP_0012275 Autosomal dominant inheritance with maternal imprinting
http://purl.obolibrary.org/obo/HP_0012274 Autosomal dominant inheritance with paternal imprinting
http://purl.obolibrary.org/obo/HP_0001470 Sex-limited autosomal dominant
http://purl.obolibrary.org/obo/HP_0001475 Male-limited autosomal dominant
http://purl.obolibrary.org/obo/HP_0001444 Autosomal dominant somatic cell mutation
http://purl.obolibrary.org/obo/HP_0001452 Autosomal dominant contiguous gene syndrome
http://purl.obolibrary.org/obo/HP_0025352 Autosomal dominant germline de novo mutation
http://purl.obolibrary.org/obo/HP_0000007 Autosomal recessive inheritance
http://purl.obolibrary.org/obo/HP_0031362 Sex-limited autosomal recessive inheritance
http://purl.obolibrary.org/obo/HP_0032113 Semidominant mode of inheritance
http://purl.obolibrary.org/obo/HP_0010985 Gonosomal inheritance
http://purl.obolibrary.org/ob
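
The loop above turns each term IRI into an HPO identifier by taking the last path segment and replacing the underscore. A minimal standalone sketch of that conversion follows; the helper name `iri_to_curie` is hypothetical and not part of the notebook, and it assumes every IRI ends in a `PREFIX_NUMBER` segment like the ones printed above.

```python
def iri_to_curie(iri):
    """Convert an OBO PURL such as .../obo/HP_0000006 to the CURIE HP:0000006."""
    # Last path segment, e.g. "HP_0000006"
    local_id = iri.rsplit('/', 1)[-1]
    # Split on the first underscore only, so the numeric part is kept intact
    prefix, _, number = local_id.partition('_')
    return f'{prefix}:{number}'


# Two IRIs taken from the printed output above
iris = [
    'http://purl.obolibrary.org/obo/HP_0000006',
    'http://purl.obolibrary.org/obo/HP_0000007',
]
curies = [iri_to_curie(iri) for iri in iris]
print(curies)
```

Splitting on the first underscore only is a small design choice: it behaves the same as `replace('_', ':')` for HPO IRIs but would not rewrite any later underscores.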

In [22]:
moi_concepts

{'dominant': ['HP:0000006',
  'HP:0012275',
  'HP:0012274',
  'HP:0001470',
  'HP:0001475',
  'HP:0001444',
  'HP:0001452',
  'HP:0025352',
  'HP:0032113',
  'HP:0001423'],
 'recessive': ['HP:0000007', 'HP:0031362', 'HP:0001419']}
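
Because the grouping is a plain substring match on the label, the `dominant` list also contains "Semidominant mode of inheritance" (HP:0032113) and X-linked dominant inheritance (HP:0001423). The sketch below reproduces the grouping on a handful of labels from this notebook and adds an `autosomal_only` switch. That switch and the function name `classify_moi` are assumptions for illustration only; the notebook itself does not filter this way.

```python
from collections import defaultdict


def classify_moi(label, autosomal_only=False):
    """Return 'dominant', 'recessive', or None for an HPO inheritance label."""
    text = label.lower()
    # Hypothetical stricter mode: ignore labels that do not mention "autosomal"
    if autosomal_only and 'autosomal' not in text:
        return None
    if 'dominant' in text:
        return 'dominant'
    if 'recessive' in text:
        return 'recessive'
    return None


# HPO IDs and labels that appear in this notebook's outputs
labels = {
    'HP:0000006': 'Autosomal dominant inheritance',
    'HP:0000007': 'Autosomal recessive inheritance',
    'HP:0032113': 'Semidominant mode of inheritance',
    'HP:0001423': 'X-linked dominant inheritance',
    'HP:0001419': 'X-linked recessive inheritance',
    'HP:0010985': 'Gonosomal inheritance',
}

groups = defaultdict(list)
for hpo_id, label in labels.items():
    typ = classify_moi(label)
    if typ is not None:
        groups[typ].append(hpo_id)

print(dict(groups))
```

With the default settings this matches the notebook's behavior; passing `autosomal_only=True` would leave out the semidominant and X-linked terms, which may or may not be what a downstream analysis wants.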

### Extract MOI from MedGen

In [23]:
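url = 'https://ftp.ncbi.nlm.nih.gov/pub/medgen/MedGen_HPO_Mapping.txt.gz'
# Note: with -P, wget does not overwrite an existing file; it saves the new copy with a numeric suffix (".1")
!wget -P /tmp $url
!gzip -dc /tmp/MedGen_HPO_Mapping.txt.gz | head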

--2020-06-19 11:56:21--  https://ftp.ncbi.nlm.nih.gov/pub/medgen/MedGen_HPO_Mapping.txt.gz
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 130.14.250.10, 2607:f220:41e:250::10
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 288969 (282K) [application/x-gzip]
Saving to: ‘/tmp/MedGen_HPO_Mapping.txt.gz.1’


2020-06-19 11:56:21 (1.43 MB/s) - ‘/tmp/MedGen_HPO_Mapping.txt.gz.1’ saved [288969/288969]

#CUI|SDUI|HpoStr|MedGenStr|MedGenStr_SAB|STY|
C0444868|HP:0000001|All|All|HPO|Quantitative Concept|
C4025901|HP:0000002|Abnormality of body height|Abnormality of body height|GTR|Finding|
C3714581|HP:0000003|Multicystic kidney dysplasia|Multicystic kidney dysplasia|GTR|Disease or Syndrome|
C1708511|HP:0000005|Mode of inheritance|Mode of inheritance|HPO|Genetic Function|
C0443147|HP:0000006|Autosomal dominant inheritance|Autosomal dominant inheritance|GTR|Intellectual Product|
C0443147|HP:00
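
The header line starts with `#` and every line ends with a trailing `|`, so a naive split yields one extra empty field per row. The following sketch parses a few lines copied from the output above with the standard `csv` module; the in-memory `sample` string stands in for the real file, and the variable names are illustrative assumptions rather than anything the notebook defines.

```python
import csv
import io

# Lines copied from the mapping file preview above
sample = (
    "#CUI|SDUI|HpoStr|MedGenStr|MedGenStr_SAB|STY|\n"
    "C1708511|HP:0000005|Mode of inheritance|Mode of inheritance|HPO|Genetic Function|\n"
    "C0443147|HP:0000006|Autosomal dominant inheritance|Autosomal dominant inheritance|GTR|Intellectual Product|\n"
)

reader = csv.reader(io.StringIO(sample), delimiter='|', quoting=csv.QUOTE_NONE)

# Strip the leading '#' and drop the empty name produced by the trailing '|'
header = [name.lstrip('#') for name in next(reader) if name]

# zip() stops at the shorter sequence, so the trailing empty field is ignored
records = [dict(zip(header, row)) for row in reader]

print(header)
print(records[1]['SDUI'], records[1]['CUI'])
```

For the downloaded file, `gzip.open(path, 'rt')` could take the place of `io.StringIO(sample)`.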

In [24]:
import pandas as pd
# The file is pipe-delimited with a trailing '|', which yields an empty last column ("Unnamed: 6")
df = pd.read_csv('/tmp/MedGen_HPO_Mapping.txt.gz', sep='|')
df.head()

| | #CUI | SDUI | HpoStr | MedGenStr | MedGenStr_SAB | STY | Unnamed: 6 |
|---|---|---|---|---|---|---|---|
| 0 | C0444868 | HP:0000001 | All | All | HPO | Quantitative Concept | |
| 1 | C4025901 | HP:0000002 | Abnormality of body height | Abnormality of body height | GTR | Finding | |
| 2 | C3714581 | HP:0000003 | Multicystic kidney dysplasia | Multicystic kidney dysplasia | GTR | Disease or Syndrome | |
| 3 | C1708511 | HP:0000005 | Mode of inheritance | Mode of inheritance | HPO | Genetic Function | |
| 4 | C0443147 | HP:0000006 | Autosomal dominant inheritance | Autosomal dominant inheritance | GTR | Intellectual Product | |


In [25]:
# Flatten the grouped HPO IDs and keep only the MedGen rows that map to them
cids = [v for vs in moi_concepts.values() for v in vs]
dfc = df[df['SDUI'].isin(cids)]
dfc

| | #CUI | SDUI | HpoStr | MedGenStr | MedGenStr_SAB | STY | Unnamed: 6 |
|---|---|---|---|---|---|---|---|
| 4 | C0443147 | HP:0000006 | Autosomal dominant inheritance | Autosomal dominant inheritance | GTR | Intellectual Product | |
| 5 | C0443147 | HP:0000006 | Autosomal dominant inheritance | Autosomal dominant inheritance | GTR | Genetic Function | |
| 6 | C0441748 | HP:0000007 | Autosomal recessive inheritance | Autosomal recessive inheritance | HPO | Intellectual Product | |
| 7 | C0441748 | HP:0000007 | Autosomal recessive inheritance | Autosomal recessive inheritance | HPO | Genetic Function | |
| 1187 | C1845977 | HP:0001419 | X-linked recessive inheritance | X-linked recessive inheritance | GTR | Finding | |
| 1189 | C1847879 | HP:0001423 | X-linked dominant inheritance | X-linked dominant inheritance | HPO | Finding | |
| 1204 | C4025781 | HP:0001444 | Autosomal dominant somatic cell mutation | Autosomal dominant somatic cell mutation | HPO | Genetic Function | |
| 1209 | C4025777 | HP:0001452 | Autosomal dominant contiguous gene syndrome | Autosomal dominant contiguous gene syndrome | HPO | Disease or Syndrome | |
| 1221 | C4025767 | HP:0001470 | Sex-limited autosomal dominant | Sex-limited autosomal dominant | HPO | Genetic Function | |
| 1225 | C4025764 | HP:0001475 | Male-limited autosomal dominant | Male-limited autosomal dominant | HPO | Genetic Function | |


In [26]:
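# Every selected HPO ID must have at least one MedGen row; an ID can have
# several rows because MedGen lists one row per semantic type (STY)
assert dfc['SDUI'].nunique() == len(cids)
len(dfc), dfc['SDUI'].nunique(), len(cids)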

(15, 13, 13)
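
The filtered frame has more rows than unique HPO IDs because the same CUI/HPO pair is repeated once per semantic type. One possible follow-up, sketched here on a few rows copied from the filtered table, is to collapse those repeats into a single HPO-to-CUI lookup and to list unmapped IDs explicitly instead of relying on an assertion. The names `sample`, `hpo_to_cui`, and `wanted` are hypothetical; the notebook does not define them.

```python
import pandas as pd

# Rows copied from the filtered table (one row per semantic type)
rows = [
    ('C0443147', 'HP:0000006', 'Intellectual Product'),
    ('C0443147', 'HP:0000006', 'Genetic Function'),
    ('C0441748', 'HP:0000007', 'Intellectual Product'),
    ('C0441748', 'HP:0000007', 'Genetic Function'),
    ('C1845977', 'HP:0001419', 'Finding'),
]
sample = pd.DataFrame(rows, columns=['#CUI', 'SDUI', 'STY'])

# Drop the semantic-type repeats, keeping one row per (CUI, HPO ID) pair
pairs = sample[['#CUI', 'SDUI']].drop_duplicates()

# Hypothetical lookup table: HPO ID -> MedGen CUI
hpo_to_cui = dict(zip(pairs['SDUI'], pairs['#CUI']))

# List requested IDs that have no mapping in this sample
wanted = ['HP:0000006', 'HP:0000007', 'HP:0001419', 'HP:0001423']
missing = sorted(set(wanted) - set(hpo_to_cui))

print(hpo_to_cui)
print(missing)
```

In this sample HP:0001423 is reported as missing only because its row was left out of `rows`; on the full `dfc` frame the same steps would apply to the `#CUI` and `SDUI` columns.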