# HMGCR Yogev et al. (2023)

Data derived from [Yogev Y, et al. (2023) Limb girdle muscular disease caused by HMGCR mutation and statin myopathy treatable with mevalonolactone. Proc Natl Acad Sci U S A.;120(7):e2217831120. PMID:36745799](https://pubmed.ncbi.nlm.nih.gov/36745799/)

Six individuals of a single consanguineous Bedouin kindred (Fig. 1A) were affected with apparently autosomal recessive progressive limb girdle muscle disease. The disease initially manifested during the fourth decade of life with pain on exertion, followed by muscle fatigue and weakness, affecting mostly the proximal and axial muscles, and culminating with involvement of respiratory muscles. 

g.5:75359992G>A (GRCh38/hg38); NM_000859.3:c.2465G>A; p.(G822D) in HMGCR

In [1]:
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from IPython.display import HTML, display
from pyphetools.creation import *
from pyphetools.visualization import *
from pyphetools.validation import *
import importlib.metadata
__version__ = importlib.metadata.version("pyphetools")
print(f"Using pyphetools version {__version__}")

Using pyphetools version 0.9.78




In [2]:
PMID="PMID:36745799"
title = "Limb girdle muscular disease caused by HMGCR mutation and statin myopathy treatable with mevalonolactone"
cite = Citation(pmid=PMID, title=title)
parser = HpoParser(hpo_json_file="../hp.json")
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata = MetaData(created_by="ORCID:0000-0002-0736-9199", citation=cite)
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")

HPO version 2024-04-04


In [3]:
df = pd.read_excel("input/HMGCR_Yogev_HMGCR.xlsx")
dft = df.transpose()
dft.columns = dft.iloc[0]
dft.drop(dft.index[0], inplace=True)
dft['individual_id'] = dft.index  # Set the new column 'patient_id' to be identical to the contents of the index
dft.head() # check the transposed table

INDIVIDUAL,SEX,AGE AT EXAMINATION,AGE AT ONSET,PROXIMAL STRENGTH UPPER LIMB,PROXIMAL STRENGTH LOWER LIMB,ATROPHY UPPER LIMB,ATROPHY LOWER LIMB,DEEP TENDON REFLEXES,PAIN ON EXERTION,AMBULATORY,...,TRIGLYCERIDES (RECOMMENDED<150MG/DL),HDL (RECOMMENDED >60MG/DL),LDL (RECOMMENDED <100MG/DL),VLDL,FASTING BLOOD SUGAR,ABNORMAL BRAIN IMAGING,MYOPATHIC CHANGERS IN EMG,"MUSCLE BIOPSY-NORMAL DYSTROPHIN, NADH, SDH, COX, ATPASES, ELECTRON MICROSCOPY",COMORBIDITIES,individual_id
V:2,F,49,P31Y,0/5,0/5,Marked,Marked,Absent,+,-,...,87.0,49,80.0,17,390,-,+,Mild type 2 fiber deficiency,"Insulin-dependent diabetes mellitus, onset age 19",V:2
V:5,M,58,P39Y,3/5,2/5,Marked,Marked,Diminished,+,-,...,123.0,49,87.0,25,123,-,+,+,"COPD, diastolic dysfunction, ICRBBB, lymphocytosis",V:5
V:8,M,37,P24Y,5/5,5/5,-,-,+,+,+,...,95.5,38,77.0,19,127,,,,-,V:8
V:9,M,42,P33Y,5/5,4/5,-,-,+,+,+,...,108.0,45,67.0,22,111,-,,,ICRBBB,V:9
V:12,F,51,P31Y,2/5,2/5,Evident,Evident,Diminished,+,-,...,149.0,55,82.5,30,124,-,+,+,single kidney,V:12


In [4]:
generator = SimpleColumnMapperGenerator(df=dft, observed="+", excluded="-", hpo_cr=hpo_cr)
column_mapper_list = generator.try_mapping_columns()
display(HTML(generator.to_html()))

Result,Columns
Mapped,RESPIRATORY DIFFICULTIES; DYSPHAGIA; ABNORMAL BRAIN IMAGING
Unmapped,"SEX; AGE AT EXAMINATION; AGE AT ONSET; PROXIMAL STRENGTH UPPER LIMB; PROXIMAL STRENGTH LOWER LIMB; ATROPHY UPPER LIMB; ATROPHY LOWER LIMB; DEEP TENDON REFLEXES; PAIN ON EXERTION; AMBULATORY; ABULATORY MOBILITY RESTRICTION; ECHOCARDIOGRAPHY; CPK (reference 20-180 U/L); MAXIMAL TROPONIN T (0-14NG/L); CREATININE; AST (REFERENCE 0-35 U/L); ALT (REFERENCE 0-45 U/L); ALKALINE PHOSPHATASE (REFERENCE 30-120 U/L); TOTAL CHOLESTEROL (RECOMMENDED <200 MG/DL); TRIGLYCERIDES (RECOMMENDED<150MG/DL); HDL (RECOMMENDED >60MG/DL); LDL (RECOMMENDED <100MG/DL); VLDL; FASTING BLOOD SUGAR; MYOPATHIC CHANGERS IN EMG; MUSCLE BIOPSY-NORMAL DYSTROPHIN, NADH, SDH, COX, ATPASES, ELECTRON MICROSCOPY; COMORBIDITIES; individual_id"


In [5]:
#res = OptionColumnMapper.autoformat(df=dft, concept_recognizer=hpo_cr, omit_columns=generator.get_mapped_columns())
#print(res)

In [6]:
proximal_strength_upper_limb_d = {'0/5': 'Proximal muscle weakness in upper limbs',
 '3/5': 'Proximal muscle weakness in upper limbs',
 '2/5': 'Proximal muscle weakness in upper limbs'}
excluded = {'5/5': 'Proximal muscle weakness in upper limbs',}
proximalMapper = OptionColumnMapper(column_name="PROXIMAL STRENGTH UPPER LIMB", 
                                    concept_recognizer=hpo_cr, 
                                    option_d=proximal_strength_upper_limb_d,
                                   excluded_d=excluded)
column_mapper_list.append(proximalMapper)
proximalMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Proximal muscle weakness in upper limbs (HP:0008997) (observed),4
1,Proximal muscle weakness in upper limbs (HP:0008997) (excluded),2


In [7]:
proximal_strength_lower_limb_d = {'0/5': 'Proximal muscle weakness in lower limbs',
 '2/5': 'Proximal muscle weakness in lower limbs',
 '4/5': 'Proximal muscle weakness in lower limbs'}
excluded = { '5/5': 'Proximal muscle weakness in lower limbs',}
proximal_strength_lower_limbMapper = OptionColumnMapper(column_name="PROXIMAL STRENGTH LOWER LIMB", 
                                                        concept_recognizer=hpo_cr, 
                                                        option_d=proximal_strength_lower_limb_d,
                                                       excluded_d=excluded)
column_mapper_list.append(proximal_strength_lower_limbMapper)
proximal_strength_lower_limbMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Proximal muscle weakness in lower limbs (HP:0008994) (observed),5
1,Proximal muscle weakness in lower limbs (HP:0008994) (excluded),1


In [8]:
atrophy_upper_limb_d = {'Marked': 'Upper limb amyotrophy',
 'Evident': 'Upper limb amyotrophy'}
atrophy_upper_limbMapper = OptionColumnMapper(column_name="ATROPHY UPPER LIMB", concept_recognizer=hpo_cr, option_d=atrophy_upper_limb_d)
column_mapper_list.append(atrophy_upper_limbMapper)
atrophy_upper_limbMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Upper limb amyotrophy (HP:0009129) (observed),4


In [9]:
atrophy_lower_limb_d = {'Marked': 'Lower limb amyotrophy',
 'Evident': 'Lower limb amyotrophy'}
atrophy_lower_limbMapper = OptionColumnMapper(column_name="ATROPHY LOWER LIMB", concept_recognizer=hpo_cr, option_d=atrophy_lower_limb_d)
column_mapper_list.append(atrophy_lower_limbMapper)
atrophy_lower_limbMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Lower limb amyotrophy (HP:0007210) (observed),3


In [10]:
deep_tendon_reflexes_d = {'Absent': 'Areflexia',
 'Diminished': 'Hyporeflexia'}
deep_tendon_reflexesMapper = OptionColumnMapper(column_name="DEEP TENDON REFLEXES", concept_recognizer=hpo_cr, option_d=deep_tendon_reflexes_d)
column_mapper_list.append(deep_tendon_reflexesMapper)
deep_tendon_reflexesMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Areflexia (HP:0001284) (observed),1
1,Hyporeflexia (HP:0001265) (observed),3


In [11]:
pain_on_exertionMapper = SimpleColumnMapper(column_name="PAIN ON EXERTION", 
                                            hpo_id="HP:0003738",
                                            hpo_label="Exercise-induced myalgia",
                                            observed="+",
                                            excluded="-")
column_mapper_list.append(pain_on_exertionMapper)
pain_on_exertionMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""+"" -> HP: Exercise-induced myalgia (HP:0003738) (observed)",6


In [12]:
# dft["AMBULATORY"] "+" means can still ambulate, "-" is loss
# Loss of ambulation HP:0002505
ambulatoryMapper = SimpleColumnMapper(column_name="AMBULATORY", 
                                            hpo_id="HP:0002505",
                                            hpo_label="Loss of ambulation",
                                            observed="+",
                                            excluded="-")
column_mapper_list.append(ambulatoryMapper)
ambulatoryMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""-"" -> HP: Loss of ambulation (HP:0002505) (excluded)",3
1,"original value: ""+"" -> HP: Loss of ambulation (HP:0002505) (observed)",3


In [13]:
respiratory_d = {'ventilated through tracheostomy': 'Respiratory failure requiring assisted ventilation',
                             '+':"Respiratory insufficiency"}
excluded = {"-" :"Respiratory insufficiency"}
respiratoryMapper = OptionColumnMapper(column_name="RESPIRATORY DIFFICULTIES", concept_recognizer=hpo_cr, option_d=respiratory_d, excluded_d=excluded)
column_mapper_list.append(respiratoryMapper)
respiratoryMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Respiratory failure requiring assisted ventilation (HP:0004887) (observed),1
1,Respiratory insufficiency (HP:0002093) (observed),2
2,Respiratory insufficiency (HP:0002093) (excluded),3


In [14]:
dysphagiaMapper = SimpleColumnMapper(column_name="DYSPHAGIA", 
                                            hpo_id="HP:0002015",
                                            hpo_label="Dysphagia",
                                            observed="+",
                                            excluded="-")
column_mapper_list.append(dysphagiaMapper)
dysphagiaMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""-"" -> HP: Dysphagia (HP:0002015) (excluded)",6


In [15]:
echocardiography_d = {
 'Mild diastolic dysfunction': 'Left ventricular diastolic dysfunction',
}
excluded = {'Normal': 'Left ventricular diastolic dysfunction'}
echocardiographyMapper = OptionColumnMapper(column_name="ECHOCARDIOGRAPHY", 
                                            concept_recognizer=hpo_cr, 
                                            option_d=echocardiography_d, 
                                            excluded_d=excluded)
column_mapper_list.append(echocardiographyMapper)
echocardiographyMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Left ventricular diastolic dysfunction (HP:0025168) (excluded),3
1,Left ventricular diastolic dysfunction (HP:0025168) (observed),1


In [16]:
#Elevated circulating creatine kinase concentration HP:0003236
# Values were'1501' '9065' '477''542' '3797'

cpk = Thresholder.creatine_kinase_blood(unit="U/L", high_thresh=180, low_thresh=20)
cpk_Mapper = ThresholdedColumnMapper(column_name="CPK (reference 20-180 U/L)", 
                                    thresholder=cpk)
column_mapper_list.append(cpk_Mapper)
cpk_Mapper.preview_column(dft)

Unnamed: 0,mapping: 20.0-180.0 U/L,count
0,Abnormal circulating creatine kinase concentration (HP:0040081): excluded,1
1,Elevated circulating creatine kinase concentration (HP:0003236): observed,5


In [17]:
#Increased circulating troponin T concentration HP:0410174
#values were '32.06' '18.59' '64.82''23.39'

troponin_t = Thresholder.troponin_t_blood(unit="ng/L", low_thresh=0, high_thresh=14)
maximal_troponin_tMapper = ThresholdedColumnMapper(column_name="MAXIMAL TROPONIN T (0-14NG/L)",  thresholder=troponin_t)
column_mapper_list.append(maximal_troponin_tMapper)
maximal_troponin_tMapper.preview_column(dft)

Unnamed: 0,mapping: 0.0-14.0 ng/L,count
0,Increased circulating troponin T concentration (HP:0410174): observed,4
1,Increased circulating troponin T concentration (HP:0410174): not measured,2


In [18]:
ast = Thresholder.AST_blood(unit="U/L", low_thresh=0, high_thresh=35)
astMapper = ThresholdedColumnMapper(column_name="AST (REFERENCE 0-35 U/L)", thresholder=ast)
column_mapper_list.append(astMapper)
astMapper.preview_column(dft)

Unnamed: 0,mapping: 0.0-35.0 U/L,count
0,Elevated circulating aspartate aminotransferase concentration (HP:0031956): excluded,2
1,Elevated circulating aspartate aminotransferase concentration (HP:0031956): observed,4


In [19]:
alt = Thresholder.ALT_blood(unit="U/L", low_thresh=0, high_thresh=45)
altMapper = ThresholdedColumnMapper(column_name="ALT (REFERENCE 0-45 U/L)", thresholder=alt)
column_mapper_list.append(altMapper)
altMapper.preview_column(dft)

Unnamed: 0,mapping: 0.0-45.0 U/L,count
0,Elevated circulating alanine aminotransferase concentration (HP:0031964): excluded,3
1,Elevated circulating alanine aminotransferase concentration (HP:0031964): observed,3


In [20]:
#Elevated circulating alkaline phosphatase concentration
ap = Thresholder.alkaline_phophatase_blood(unit="U/L", low_thresh=30, high_thresh=120)
apMapper = ThresholdedColumnMapper(column_name="ALKALINE PHOSPHATASE (REFERENCE 30-120 U/L)", thresholder=ap)
column_mapper_list.append(apMapper)
apMapper.preview_column(dft)

Unnamed: 0,mapping: 30.0-120.0 U/L,count
0,Elevated circulating alkaline phosphatase concentration (HP:0003155): observed,1
1,Abnormality of alkaline phosphatase level (HP:0004379): excluded,5


In [21]:
cholesterol = Thresholder.total_cholesterol_blood(unit="mg/dl", high_thresh=200)
total_cholesterolMapper = ThresholdedColumnMapper(column_name="TOTAL CHOLESTEROL (RECOMMENDED <200 MG/DL)", thresholder=cholesterol)
column_mapper_list.append(total_cholesterolMapper)
total_cholesterolMapper.preview_column(dft)

Unnamed: 0,mapping: None-200.0 mg/dl,count
0,Hypercholesterolemia (HP:0003124): not measured,6


In [22]:
dft["TOTAL CHOLESTEROL (RECOMMENDED <200 MG/DL)"]

V:2     146
V:5     159
V:8     128
V:9     136
V:12    171
V:13    128
Name: TOTAL CHOLESTEROL (RECOMMENDED <200 MG/DL), dtype: object

# Variant
NM_000859.3:c.2465G>A; p.(G822D) in HMGCR

In [23]:
HMGCR_transcript = "NM_000859.3"
vvalidator = VariantValidator(genome_build="hg38", transcript=HMGCR_transcript)
var = vvalidator.encode_hgvs("c.2465G>A")
var.set_homozygous()

https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_000859.3%3Ac.2465G>A/NM_000859.3?content-type=application%2Fjson


In [24]:
ageMapper = AgeColumnMapper.iso8601(column_name="AGE AT ONSET")
#ageMapper.preview_column(dft)
ageExamMapper = AgeColumnMapper.by_year(column_name="AGE AT EXAMINATION")
#ageExamMapper.preview_column(dft)

In [25]:
sexMapper = SexColumnMapper(column_name='SEX', male_symbol="M", female_symbol="F")
#sexMapper.preview_column(dft)

In [26]:
encoder = CohortEncoder(df=dft, 
                        hpo_cr=hpo_cr,
                        column_mapper_list=column_mapper_list, 
                        individual_column_name="individual_id",
                        sexmapper=sexMapper,
                        age_of_onset_mapper=ageMapper,
                        age_at_last_encounter_mapper=ageExamMapper,
                        metadata=metadata)
LGMDR28 = Disease(disease_id='OMIM:620375', disease_label='Muscular dystrophy, limb-girdle, autosomal recessive 28')
encoder.set_disease(LGMDR28)

In [27]:
individuals = encoder.get_individuals()
for i in individuals:
    i.add_variant(var)

In [28]:
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.BI_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_summary_html()))

Level,Error category,Count
INFORMATION,NOT_MEASURED,10


In [29]:
individuals = cvalidator.get_error_free_individual_list()
table = PhenopacketTable(individual_list=individuals, metadata=metadata)
display(HTML(table.to_html()))

Individual,Disease,Genotype,Phenotypic features
V:2 (FEMALE; P49Y),"Muscular dystrophy, limb-girdle, autosomal recessive 28 (OMIM:620375)",NM_000859.3:c.2465G>A (homozygous),Proximal muscle weakness in upper limbs (HP:0008997); Proximal muscle weakness in lower limbs (HP:0008994); Upper limb amyotrophy (HP:0009129); Lower limb amyotrophy (HP:0007210); Areflexia (HP:0001284); Exercise-induced myalgia (HP:0003738); Respiratory failure requiring assisted ventilation (HP:0004887); Increased circulating troponin T concentration (HP:0410174); Elevated circulating alkaline phosphatase concentration (HP:0003155); excluded: Loss of ambulation (HP:0002505); excluded: Elevated circulating alanine aminotransferase concentration (HP:0031964); excluded: Abnormal circulating creatine kinase concentration (HP:0040081); excluded: Brain imaging abnormality (HP:0410263); excluded: Elevated circulating aspartate aminotransferase concentration (HP:0031956); excluded: Left ventricular diastolic dysfunction (HP:0025168); excluded: Dysphagia (HP:0002015)
V:5 (MALE; P58Y),"Muscular dystrophy, limb-girdle, autosomal recessive 28 (OMIM:620375)",NM_000859.3:c.2465G>A (homozygous),Respiratory distress (HP:0002098); Proximal muscle weakness in upper limbs (HP:0008997); Proximal muscle weakness in lower limbs (HP:0008994); Upper limb amyotrophy (HP:0009129); Lower limb amyotrophy (HP:0007210); Hyporeflexia (HP:0001265); Exercise-induced myalgia (HP:0003738); Respiratory insufficiency (HP:0002093); Left ventricular diastolic dysfunction (HP:0025168); Elevated circulating creatine kinase concentration (HP:0003236); Increased circulating troponin T concentration (HP:0410174); Elevated circulating aspartate aminotransferase concentration (HP:0031956); Elevated circulating alanine aminotransferase concentration (HP:0031964); excluded: Loss of ambulation (HP:0002505); excluded: Brain imaging abnormality (HP:0410263); excluded: Abnormality of alkaline phosphatase level (HP:0004379); excluded: Dysphagia (HP:0002015)
V:8 (MALE; P37Y),"Muscular dystrophy, limb-girdle, autosomal recessive 28 (OMIM:620375)",NM_000859.3:c.2465G>A (homozygous),Exercise-induced myalgia (HP:0003738); Loss of ambulation (HP:0002505); Elevated circulating creatine kinase concentration (HP:0003236); Elevated circulating aspartate aminotransferase concentration (HP:0031956); Elevated circulating alanine aminotransferase concentration (HP:0031964); excluded: Respiratory distress (HP:0002098); excluded: Proximal muscle weakness in upper limbs (HP:0008997); excluded: Left ventricular diastolic dysfunction (HP:0025168); excluded: Proximal muscle weakness in lower limbs (HP:0008994); excluded: Abnormality of alkaline phosphatase level (HP:0004379); excluded: Respiratory insufficiency (HP:0002093); excluded: Dysphagia (HP:0002015)
V:9 (MALE; P42Y),"Muscular dystrophy, limb-girdle, autosomal recessive 28 (OMIM:620375)",NM_000859.3:c.2465G>A (homozygous),Proximal muscle weakness in lower limbs (HP:0008994); Exercise-induced myalgia (HP:0003738); Loss of ambulation (HP:0002505); Elevated circulating creatine kinase concentration (HP:0003236); Increased circulating troponin T concentration (HP:0410174); excluded: Respiratory distress (HP:0002098); excluded: Elevated circulating alanine aminotransferase concentration (HP:0031964); excluded: Brain imaging abnormality (HP:0410263); excluded: Elevated circulating aspartate aminotransferase concentration (HP:0031956); excluded: Proximal muscle weakness in upper limbs (HP:0008997); excluded: Left ventricular diastolic dysfunction (HP:0025168); excluded: Abnormality of alkaline phosphatase level (HP:0004379); excluded: Respiratory insufficiency (HP:0002093); excluded: Dysphagia (HP:0002015)
V:12 (FEMALE; P51Y),"Muscular dystrophy, limb-girdle, autosomal recessive 28 (OMIM:620375)",NM_000859.3:c.2465G>A (homozygous),Respiratory distress (HP:0002098); Proximal muscle weakness in upper limbs (HP:0008997); Proximal muscle weakness in lower limbs (HP:0008994); Upper limb amyotrophy (HP:0009129); Lower limb amyotrophy (HP:0007210); Hyporeflexia (HP:0001265); Exercise-induced myalgia (HP:0003738); Respiratory insufficiency (HP:0002093); Elevated circulating creatine kinase concentration (HP:0003236); Increased circulating troponin T concentration (HP:0410174); Elevated circulating aspartate aminotransferase concentration (HP:0031956); excluded: Loss of ambulation (HP:0002505); excluded: Elevated circulating alanine aminotransferase concentration (HP:0031964); excluded: Brain imaging abnormality (HP:0410263); excluded: Abnormality of alkaline phosphatase level (HP:0004379); excluded: Dysphagia (HP:0002015)
V:13 (MALE; P41Y),"Muscular dystrophy, limb-girdle, autosomal recessive 28 (OMIM:620375)",NM_000859.3:c.2465G>A (homozygous),Proximal muscle weakness in upper limbs (HP:0008997); Proximal muscle weakness in lower limbs (HP:0008994); Upper limb amyotrophy (HP:0009129); Hyporeflexia (HP:0001265); Exercise-induced myalgia (HP:0003738); Loss of ambulation (HP:0002505); Elevated circulating creatine kinase concentration (HP:0003236); Elevated circulating aspartate aminotransferase concentration (HP:0031956); Elevated circulating alanine aminotransferase concentration (HP:0031964); excluded: Respiratory distress (HP:0002098); excluded: Brain imaging abnormality (HP:0410263); excluded: Abnormality of alkaline phosphatase level (HP:0004379); excluded: Respiratory insufficiency (HP:0002093); excluded: Dysphagia (HP:0002015)


In [30]:
Individual.output_individuals_as_phenopackets(individuals, metadata)

We output 6 GA4GH phenopackets to the directory phenopackets
