# LYN

Data from [de Jesus AA, et al. (2023) Constitutively active Lyn kinase causes a cutaneous small vessel vasculitis and liver fibrosis syndrome. Nat Commun;14:1502. PMID:36932076](https://pubmed.ncbi.nlm.nih.gov/36932076/).

In [1]:
import pandas as pd
from IPython.display import display, HTML
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from collections import defaultdict
from pyphetools.creation import *
from pyphetools.visualization import IndividualTable, QcVisualizer
from pyphetools.validation import *
import pyphetools
print(f"Using pyphetools version {pyphetools.__version__}")

Using pyphetools version 0.9.39


In [2]:
# de Jesus AA (2023)
PMID="PMID:36932076"
title = "Constitutively active Lyn kinase causes a cutaneous small vessel vasculitis and liver fibrosis syndrome"
cite = Citation(pmid=PMID, title=title)
parser = HpoParser(hpo_json_file="../hp.json")
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata = MetaData(created_by="ORCID:0000-0002-0736-9199", citation=cite)
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")

HPO version 2024-01-16


In [3]:
df = pd.read_excel("input/dejesus_2023_LYN.xlsx")
dft = df.transpose()
dft.columns = dft.iloc[0]
dft.drop(dft.index[0], inplace=True)
dft['individual_id'] = dft.index  # Copy the index (patient labels) into a new column 'individual_id'
dft.head() # check the transposed table

Clinical and laboratory features,Age of disease onset,Sex,Age at diagnosis,Presenting Symptoms,Hydrops fetalis,Liver fibrosis,Other clinical manifestations,ESR (mm/1h),CRP (mg/L),SAA (mg/L),...,NK lymphocytes (abs #),IgG,IgA,IgM,Skin biopsies,Elastography (max),Infections,Treatment,NM_002350.4,individual_id
Patient 1,1st day of life,female,2 years 6 months old,"Purpuric rash, hepatosplenomegaly, fever, thrombocytopenia at birth","Yes, intra-utero platelet and PRBC transfusion at 29 weeks of GA",Yes,"Recurrent parotitis, abdominal pain, periorbital edema and erythema, conjunctivitis, epididymitis, headaches, arthralgias, oral ulcers, fatigue, GVHD-like colitis",64,14–86.5,ND,...,nl,nl,nl,low,Small vessel vasculitis with neutrophilic infiltrate and destruction of dermal vessel walls,6.7 kPa,"Enteropathogenic E. coli, Salmonella sp, Toxocara canisb","Poor response to IVIG, IVMP and oral prednisolone, partial response to dasatinib monotherapy, partial response to etanercept monotherapy, good response to dasatinib and etanercept combination therapy",c.1522T>C,Patient 1
Patient 2,1st day of life,male,15 years-old,"Mild purpuric rash at birth, fever, and generalized severe purpuric rash at the age of 3 months",No,No,"Recurrent abdominal pain, periorbital edema and erythema, conjunctivitis, epididymitis, headaches, arthralgias, oral ulcers, fatigue, GVHD-like colitis",ND,46–166,182–984,...,nl,nl,nl,low,Perivascular neutrophilic dermal infiltrate,"ND, normal LFTs",Post-streptococcal glomerulonephritis,"No response to anakinra and tocilizumab, partial response to colchicine and good response to etanercept and colchicine",c.1524C>G,Patient 2
Patient 3,1st day of life,male,4 months old,"Hepatosplenomegaly, thrombocytopenia, and discrete purpuric rash at birth","No, had congenital hydrocele",Yes,"Intrauterine growth restriction, failure to thrive, transient periorbital erythema, jaundice, direct hyperbilirubinemia",ND,6.5–107.6,ND,...,high,nl,low,nl,Small vessel vasculitis with neutrophilic infiltrate and destruction of dermal vessel walls,18.7 kPa,"Late neonatal sepsis, COVID-19 (asymptomatic)","Partial improvement of CRP and thrombocytopenia, resolution of direct hyperbilirubinemia, return of normal growth rate, and persistence of liver fibrosis on etanercept therapy",c.1523A>T,Patient 3
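
The transposition above turns a table with one feature per row into one row per patient. The following is a minimal sketch of the same idea in plain Python, using a small excerpt of the values shown in the output; the helper name `transpose_table` is an assumption made for illustration and is not part of pandas or pyphetools.

```python
def transpose_table(rows):
    """Turn a feature-per-row table into one dict per patient.

    rows[0] is the header (a label column followed by the patient ids);
    every later row holds a feature name followed by one value per patient.
    """
    header, *feature_rows = rows
    patient_ids = header[1:]
    records = []
    for col, patient_id in enumerate(patient_ids, start=1):
        record = {row[0]: row[col] for row in feature_rows}
        # Mirrors dft['individual_id'] = dft.index in the notebook cell.
        record["individual_id"] = patient_id
        records.append(record)
    return records


# Excerpt of the input table (features in rows, patients in columns).
excerpt = [
    ["Clinical and laboratory features", "Patient 1", "Patient 2", "Patient 3"],
    ["Sex", "female", "male", "male"],
    ["Liver fibrosis", "Yes", "No", "Yes"],
    ["NM_002350.4", "c.1522T>C", "c.1524C>G", "c.1523A>T"],
]

records = transpose_table(excerpt)
for record in records:
    print(record)
```

Each resulting record corresponds to one row of `dft`, which is the layout the column mappers below expect.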


In [4]:
#res = OptionColumnMapper.autoformat(df=dft, concept_recognizer=hpo_cr, )
#print(res)
column_mapper_list = list()

In [5]:
presenting_symptoms_d = {'Purpuric rash': 'Purpura',
                    'hepatosplenomegaly': 'Hepatosplenomegaly',
                    'fever': 'Fever',
                    'thrombocytopenia at birth': 'Thrombocytopenia',
                    'Mild purpuric rash at birth': 'Purpura',
                    'and generalized severe purpuric rash at the age of 3 months': 'Purpura',
                    'Hepatosplenomegaly': 'Hepatosplenomegaly',
                    'thrombocytopenia': 'Thrombocytopenia',
                    'and discrete purpuric rash at birth': 'Purpura'}
presenting_symptomsMapper = OptionColumnMapper(column_name='Presenting Symptoms',
                                               concept_recognizer=hpo_cr, option_d=presenting_symptoms_d)
column_mapper_list.append(presenting_symptomsMapper)
presenting_symptomsMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Purpura (HP:0000979) (observed),4
1,Hepatosplenomegaly (HP:0001433) (observed),2
2,Fever (HP:0001945) (observed),2
3,Thrombocytopenia (HP:0001873) (observed),2
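
The counts in the preview can be reproduced by hand. The sketch below assumes that each cell is split on commas and that each trimmed item is looked up in the option dictionary; the actual matching rules of `OptionColumnMapper` may differ, and `map_cell` is a hypothetical helper written only for this illustration.

```python
from collections import Counter

# Same dictionary as in the cell above (cell text -> HPO label).
option_d = {
    "Purpuric rash": "Purpura",
    "hepatosplenomegaly": "Hepatosplenomegaly",
    "fever": "Fever",
    "thrombocytopenia at birth": "Thrombocytopenia",
    "Mild purpuric rash at birth": "Purpura",
    "and generalized severe purpuric rash at the age of 3 months": "Purpura",
    "Hepatosplenomegaly": "Hepatosplenomegaly",
    "thrombocytopenia": "Thrombocytopenia",
    "and discrete purpuric rash at birth": "Purpura",
}


def map_cell(cell, option_d):
    """Split a cell on commas and return the label of each recognized item."""
    labels = []
    for item in cell.split(","):
        label = option_d.get(item.strip())
        if label is not None:  # unrecognized items are skipped
            labels.append(label)
    return labels


# 'Presenting Symptoms' cells of Patients 1 to 3, copied from the table above.
cells = [
    "Purpuric rash, hepatosplenomegaly, fever, thrombocytopenia at birth",
    "Mild purpuric rash at birth, fever, and generalized severe purpuric rash at the age of 3 months",
    "Hepatosplenomegaly, thrombocytopenia, and discrete purpuric rash at birth",
]

counts = Counter(label for cell in cells for label in map_cell(cell, option_d))
print(counts)
```

Under this assumption Purpura is counted four times because Patient 2 contributes two matching items, which agrees with the preview.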


In [6]:
hydrops_d = {'Yes': 'Hydrops fetalis',
 #'intra-utero platelet and PRBC transfusion at 29 weeks of\xa0GA': 'PLACEHOLDER',
 'had congenital hydrocele': 'Congenital hydrocele'}
excluded = {'No': 'Hydrops fetalis',}
hydropsMapper = OptionColumnMapper(column_name='Hydrops fetalis',
                                    concept_recognizer=hpo_cr, option_d=hydrops_d, excluded_d=excluded)
column_mapper_list.append(hydropsMapper)
hydropsMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hydrops fetalis (HP:0001789) (observed),1
1,Hydrops fetalis (HP:0001789) (excluded),1
2,Congenital hydrocele (HP:4000037) (observed),1


In [7]:
liver_d = {'Yes': 'Hepatic fibrosis',}
excluded = {'No': 'Hepatic fibrosis',}
liverMapper = OptionColumnMapper(column_name='Liver fibrosis',concept_recognizer=hpo_cr, option_d=liver_d, excluded_d=excluded)
column_mapper_list.append(liverMapper)
liverMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Hepatic fibrosis (HP:0001395) (observed),2
1,Hepatic fibrosis (HP:0001395) (excluded),1


In [8]:
other_d = {'Recurrent parotitis': 'Parotitis',
 'abdominal pain': 'Abdominal pain',
 'periorbital edema and erythema': 'Periorbital edema',
 'conjunctivitis': 'Conjunctivitis',
 'epididymitis': 'Epididymitis',
 'headaches': 'Headache',
 'arthralgias': 'Arthralgia',
 'oral ulcers': 'Oral ulcer',
 'fatigue': 'Fatigue',
 'GVHD-like colitis': 'Colitis',
 'Recurrent abdominal pain': 'Abdominal pain',
 'Intrauterine growth restriction': 'Intrauterine growth retardation',
 'failure to thrive': 'Failure to thrive',
 'transient periorbital erythema': 'Erythema',
 'jaundice': 'Jaundice',
 'direct hyperbilirubinemia': 'Conjugated hyperbilirubinemia'}
otherMapper = OptionColumnMapper(column_name='Other clinical manifestations', concept_recognizer=hpo_cr, option_d=other_d)
column_mapper_list.append(otherMapper)
otherMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Parotitis (HP:0011850) (observed),1
1,Abdominal pain (HP:0002027) (observed),2
2,Periorbital edema (HP:0100539) (observed),2
3,Conjunctivitis (HP:0000509) (observed),2
4,Epididymitis (HP:0000031) (observed),2
5,Headache (HP:0002315) (observed),2
6,Arthralgia (HP:0002829) (observed),2
7,Oral ulcer (HP:0000155) (observed),2
8,Fatigue (HP:0012378) (observed),2
9,Colitis (HP:0002583) (observed),2


In [9]:
# Adult ESR reference ranges: male <50 years, ≤15 mm/h; female <50 years, ≤20 mm/h; male >50 years, ≤20 mm/h.
# The reported value of 64 mm/h exceeds all of these limits and is therefore coded as elevated.
esr_d = {'64': 'Elevated erythrocyte sedimentation rate',}
esr_Mapper = OptionColumnMapper(column_name='ESR (mm/1h)',concept_recognizer=hpo_cr, option_d=esr_d)
column_mapper_list.append(esr_Mapper)
esr_Mapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Elevated erythrocyte sedimentation rate (HP:0003565) (observed),1
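
The reasoning in the ESR comment can be written out as a small check. This is only a sketch of that reasoning: it encodes the adult limits quoted in the comment and nothing else, treats age 50 as belonging to the older group (an assumption, since the comment does not say), and applies adult limits to a child, as the notebook does. Paediatric reference ranges differ. The function names are hypothetical.

```python
def esr_upper_limit(sex, age_years):
    """Return the ESR upper limit in mm/h quoted in the comment, or None."""
    if sex == "male":
        # Assumption: age exactly 50 is treated like the >50 group.
        return 15 if age_years < 50 else 20
    if sex == "female" and age_years < 50:
        return 20
    return None  # no limit is given in the comment for this group


def is_esr_elevated(value_mm_per_h, sex, age_years):
    """True if the value exceeds the quoted limit for this sex and age."""
    limit = esr_upper_limit(sex, age_years)
    if limit is None:
        raise ValueError(f"no reference limit available for {sex}, {age_years} years")
    return value_mm_per_h > limit


# Patient 1: female, diagnosed at 2 years 6 months, ESR 64 mm/h.
print(is_esr_elevated(64, "female", 2.5))
```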


In [10]:
# Elevated circulating C-reactive protein concentration HP:0011227
crp_d = {'14–86.5': 'Elevated circulating C-reactive protein concentration',
 '46–166': 'Elevated circulating C-reactive protein concentration',
 '6.5–107.6': 'Elevated circulating C-reactive protein concentration'}
crpMapper = OptionColumnMapper(column_name='CRP (mg/L)',concept_recognizer=hpo_cr, option_d=crp_d)
column_mapper_list.append(crpMapper)
crpMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Elevated circulating C-reactive protein concentration (HP:0011227) (observed),3


In [11]:
# Serum amyloid A (SAA): Patient 2 had elevated values (182–984 mg/L),
# but an HPO term is still needed, so this column is not mapped.

In [12]:
cbc_d = {'Mild anemia': 'Anemia',
 'mild leukocytosis': 'Leukocytosis',
 'moderate to severe thrombocytopenia': 'Thrombocytopenia',
 'Mild leukocytosis': 'Leukocytosis',
 'moderate leukocytosis': 'Leukocytosis'}
cbcMapper = OptionColumnMapper(column_name='CBC',concept_recognizer=hpo_cr, option_d=cbc_d)
column_mapper_list.append(cbcMapper)
cbcMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Anemia (HP:0001903) (observed),2
1,Leukocytosis (HP:0001974) (observed),3
2,Thrombocytopenia (HP:0001873) (observed),2


In [13]:
lfts_d = {'Increased ALT': 'Elevated circulating alanine aminotransferase concentration',
 'Increased AST': 'Elevated circulating aspartate aminotransferase concentration',
 'Increased GGT': 'Elevated gamma-glutamyltransferase level',
 }
excluded = {'Normal ALT': 'Elevated circulating alanine aminotransferase concentration',
 'Normal AST': 'Elevated circulating aspartate aminotransferase concentration',
 'Normal GGT': 'Elevated gamma-glutamyltransferase level'}
lftsMapper = OptionColumnMapper(column_name='LFTs',concept_recognizer=hpo_cr, option_d=lfts_d, excluded_d=excluded)
column_mapper_list.append(lftsMapper)
lftsMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Elevated circulating alanine aminotransferase concentration (HP:0031964) (observed),2
1,Elevated circulating aspartate aminotransferase concentration (HP:0031956) (observed),2
2,Elevated gamma-glutamyltransferase level (HP:0030948) (observed),2


In [14]:
autoAB_d = {'Positive ANA': 'Antinuclear antibody positivity',
 'anti-Sm': 'Anti-Sm antibody positivity',
# 'anti-SSA': 'PLACEHOLDER',
# 'ACL IgG': 'PLACEHOLDER',
 'LAC': 'Lupus anticoagulant',
 'anti-mitochondrial': 'Antimitochondrial antibody positivity',
 'RF': 'Rheumatoid factor positive',
 'anti-TPO': 'Anti-thyroid peroxidase antibody positivity',
 #'transient positivity for ANA on research testing once': 'PLACEHOLDER',
 #'Borderline anti-PR3': 'PLACEHOLDER'
}
autoAbMapper = OptionColumnMapper(column_name='Autoantibodies',concept_recognizer=hpo_cr, option_d=autoAB_d)
column_mapper_list.append(autoAbMapper)
autoAbMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Antinuclear antibody positivity (HP:0003493) (observed),1
1,Anti-Sm antibody positivity (HP:0033040) (observed),1
2,Lupus anticoagulant (HP:0025343) (observed),1
3,Antimitochondrial antibody positivity (HP:0030167) (observed),1
4,Rheumatoid factor positive (HP:0002923) (observed),1
5,Anti-thyroid peroxidase antibody positivity (HP:0025379) (observed),1


In [15]:
#cd4_lymphocytes_(abs)_d = {'high': 'PLACEHOLDER',
# 'nl': 'PLACEHOLDER'}
#cd4_lymphocytes_(abs_#)Mapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=cd4_lymphocytes_(abs_#)_d)
#cd4_lymphocytes_(abs_#)Mapper.preview_column(df['CD4 lymphocytes (abs #)'])
#column_mapper_d['CD4 lymphocytes (abs #)'] = cd4_lymphocytes_(abs_#)Mapper

In [16]:
#cd8_lymphocytes_(abs)_d = {'high': 'PLACEHOLDER',
# 'nl': 'PLACEHOLDER'}
#cd8_lymphocytes_(abs_#)Mapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=cd8_lymphocytes_(abs_#)_d)
#cd8_lymphocytes_(abs_#)Mapper.preview_column(df['CD8 lymphocytes (abs #)'])
#column_mapper_d['CD8 lymphocytes (abs #)'] = cd8_lymphocytes_(abs_#)Mapper

In [17]:
b_lymphMapper = SimpleColumnMapper(column_name='B lymphocytes (abs #)',
                                    hpo_id="HP:0005404", hpo_label="Increased B cell count", 
                                   observed="high", excluded="nl")
column_mapper_list.append(b_lymphMapper)
b_lymphMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""nl"" -> HP: Increased B cell count (HP:0005404) (excluded)",2
1,"original value: ""high"" -> HP: Increased B cell count (HP:0005404) (observed)",1
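
A simple mapper of this kind assigns one fixed HPO term to a column and decides per cell whether the term is observed, excluded, or not annotated. The sketch below shows that decision in plain Python; `map_simple` is a hypothetical helper, not the pyphetools implementation, and the cell values are those implied by the preview and the final table (Patient 3 "high", the others "nl").

```python
def map_simple(cell, observed="high", excluded="nl"):
    """Return 'observed', 'excluded', or None (no annotation) for one cell."""
    if cell == observed:
        return "observed"
    if cell == excluded:
        return "excluded"
    return None  # e.g. 'ND' or any other value leaves the term unannotated


# 'B lymphocytes (abs #)' values per patient.
b_cells = {"Patient 1": "nl", "Patient 2": "nl", "Patient 3": "high"}

for patient, cell in b_cells.items():
    status = map_simple(cell)
    print(f"{patient}: Increased B cell count (HP:0005404) -> {status}")
```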


In [18]:
#nk_lymphocytes_(abs_#)_d = {'nl': 'PLACEHOLDER',
# 'high': 'PLACEHOLDER'}
#nk_lymphocytes_(abs_#)Mapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=nk_lymphocytes_(abs_#)_d)
#nk_lymphocytes_(abs_#)Mapper.preview_column(df['NK lymphocytes (abs #)'])
#column_mapper_d['NK lymphocytes (abs #)'] = nk_lymphocytes_(abs_#)Mapper

In [19]:
iggMapper = SimpleColumnMapper(column_name='IgG',hpo_id="HP:0410242", hpo_label="Abnormal circulating IgG level", observed="high", excluded="nl")
column_mapper_list.append(iggMapper)
iggMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""nl"" -> HP: Abnormal circulating IgG level (HP:0410242) (excluded)",3


In [20]:
igaMapper = SimpleColumnMapper(column_name='IgA',
                               hpo_id="HP:0002720", hpo_label="Decreased circulating IgA level", observed="low", excluded="nl")
column_mapper_list.append(igaMapper)
igaMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""nl"" -> HP: Decreased circulating IgA level (HP:0002720) (excluded)",2
1,"original value: ""low"" -> HP: Decreased circulating IgA level (HP:0002720) (observed)",1


In [21]:
igmMapper = SimpleColumnMapper(column_name='IgM',hpo_id="HP:0002850", hpo_label="Decreased circulating total IgM", observed="low", excluded="nl")
column_mapper_list.append(igmMapper)
igmMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,"original value: ""low"" -> HP: Decreased circulating total IgM (HP:0002850) (observed)",2
1,"original value: ""nl"" -> HP: Decreased circulating total IgM (HP:0002850) (excluded)",1


In [22]:
skin_d = {'Small vessel vasculitis with neutrophilic infiltrate and destruction of dermal vessel walls': 'Small vessel vasculitis',
 #'Perivascular neutrophilic dermal infiltrate': 'PLACEHOLDER'
                  }
skinMapper = OptionColumnMapper(column_name='Skin biopsies',concept_recognizer=hpo_cr, option_d=skin_d)
column_mapper_list.append(skinMapper)
skinMapper.preview_column(dft)

Unnamed: 0,mapping,count
0,Small vessel vasculitis (HP:0011944) (observed),2


In [23]:
ageMapper = AgeColumnMapper.custom_dictionary(column_name="Age of disease onset", string_to_iso_d={'1st day of life': 'P1D'})
#ageMapper.preview_column(dft["Age of disease onset"])
sexMapper = SexColumnMapper(column_name="Sex", male_symbol="male", female_symbol="female")

In [24]:
vmanager = VariantManager(df=dft, 
                          allele_1_column_name="NM_002350.4",
                          gene_symbol="LYN", 
                          individual_column_name="individual_id",
                          transcript="NM_002350.4")
varMapper = VariantColumnMapper(variant_column_name="NM_002350.4",variant_d=vmanager.get_variant_d(), default_genotype="heterozygous")
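
Before variant strings are handed to a variant manager, it can help to check that they are well formed. The following sketch parses simple coding-sequence substitutions such as `c.1522T>C` with a regular expression and derives the codon number from the position (HGVS numbers `c.1` from the A of the start codon). The helper `parse_substitution` is hypothetical, it handles only single-nucleotide substitutions, and it is not how pyphetools validates variants.

```python
import re

# Matches simple coding substitutions only, e.g. "c.1522T>C".
SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")


def parse_substitution(hgvs):
    """Parse 'c.<pos><ref>><alt>' and return position, ref, alt and codon."""
    match = SUBSTITUTION.match(hgvs)
    if match is None:
        raise ValueError(f"not a simple coding substitution: {hgvs!r}")
    position = int(match.group(1))
    return {
        "position": position,
        "ref": match.group(2),
        "alt": match.group(3),
        "codon": (position - 1) // 3 + 1,  # 1-based codon index
    }


# Variants from the 'NM_002350.4' column of the table above.
for hgvs in ["c.1522T>C", "c.1524C>G", "c.1523A>T"]:
    print(hgvs, parse_substitution(hgvs))
```

All three substitutions fall within the same codon, which is consistent with the positions being adjacent.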

In [25]:
disease = Disease(disease_id="OMIM:620376", disease_label="Autoinflammatory disease, systemic, with vasculitis")
encoder = CohortEncoder(df=dft,
                       hpo_cr=hpo_cr, 
                        agemapper=ageMapper,
                        sexmapper=sexMapper,
                        column_mapper_list=column_mapper_list,
                        individual_column_name="individual_id",
                        variant_mapper=varMapper,
                       metadata=metadata)
encoder.set_disease(disease)

In [26]:
individuals = encoder.get_individuals()

In [27]:
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.MONO_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_summary_html()))

Level,Error category,Count
WARNING,REDUNDANT,2
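
The sketch below illustrates one common meaning of a redundancy warning, assuming that REDUNDANT refers to a term annotated together with one of its more specific descendants; the exact rule used by `CohortValidator` is not shown in this notebook. The hierarchy is a toy example with made-up identifiers, not real HPO terms, and both helper functions are hypothetical.

```python
# Hypothetical toy hierarchy (child -> parent); these are not real HPO terms.
PARENT = {"T:3": "T:2", "T:2": "T:1", "T:4": "T:1"}


def ancestors(term, parent=PARENT):
    """Return all ancestors of a term by walking up the child -> parent map."""
    found = set()
    while term in parent:
        term = parent[term]
        found.add(term)
    return found


def remove_redundant(terms, parent=PARENT):
    """Drop every term that is an ancestor of another annotated term."""
    implied = set()
    for term in terms:
        implied |= ancestors(term, parent)
    return [term for term in terms if term not in implied]


# T:1 is implied by both T:3 and T:4, so only the specific terms remain.
print(remove_redundant(["T:1", "T:3", "T:4"]))
```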


In [28]:
individuals = cvalidator.get_error_free_individual_list()
table = IndividualTable(individuals)
display(HTML(table.to_html()))

Individual,Disease,Genotype,Phenotypic features
Patient 1 (FEMALE; P1D),"Autoinflammatory disease, systemic, with vasculitis (OMIM:620376)",NM_002350.4:c.1522T>C (heterozygous),Small vessel vasculitis (HP:0011944); Hepatic fibrosis (HP:0001395); Abdominal pain (HP:0002027); Parotitis (HP:0011850); Elevated gamma-glutamyltransferase level (HP:0030948); Elevated erythrocyte sedimentation rate (HP:0003565); Arthralgia (HP:0002829); Elevated circulating C-reactive protein concentration (HP:0011227); Colitis (HP:0002583); Leukocytosis (HP:0001974); Fatigue (HP:0012378); Anti-Sm antibody positivity (HP:0033040); Hepatosplenomegaly (HP:0001433); Anemia (HP:0001903); Antimitochondrial antibody positivity (HP:0030167); Anti-thyroid peroxidase antibody positivity (HP:0025379); Decreased circulating total IgM (HP:0002850); Lupus anticoagulant (HP:0025343); Conjunctivitis (HP:0000509); Thrombocytopenia (HP:0001873); Periorbital edema (HP:0100539); Elevated circulating alanine aminotransferase concentration (HP:0031964); Oral ulcer (HP:0000155); Fever (HP:0001945); Hydrops fetalis (HP:0001789); Elevated circulating aspartate aminotransferase concentration (HP:0031956); Rheumatoid factor positive (HP:0002923); Purpura (HP:0000979); Epididymitis (HP:0000031); Headache (HP:0002315); excluded: Increased B cell count (HP:0005404); excluded: Abnormal circulating IgG level (HP:0410242); excluded: Decreased circulating IgA level (HP:0002720)
Patient 2 (MALE; P1D),"Autoinflammatory disease, systemic, with vasculitis (OMIM:620376)",NM_002350.4:c.1524C>G (heterozygous),Oral ulcer (HP:0000155); Arthralgia (HP:0002829); Elevated circulating C-reactive protein concentration (HP:0011227); Purpura (HP:0000979); Colitis (HP:0002583); Decreased circulating total IgM (HP:0002850); Conjunctivitis (HP:0000509); Abdominal pain (HP:0002027); Epididymitis (HP:0000031); Leukocytosis (HP:0001974); Fatigue (HP:0012378); Fever (HP:0001945); Headache (HP:0002315); Periorbital edema (HP:0100539); excluded: Hydrops fetalis (HP:0001789); excluded: Hepatic fibrosis (HP:0001395); excluded: Increased B cell count (HP:0005404); excluded: Abnormal circulating IgG level (HP:0410242); excluded: Decreased circulating IgA level (HP:0002720)
Patient 3 (MALE; P1D),"Autoinflammatory disease, systemic, with vasculitis (OMIM:620376)",NM_002350.4:c.1523A>T (heterozygous),Small vessel vasculitis (HP:0011944); Hepatic fibrosis (HP:0001395); Decreased circulating IgA level (HP:0002720); Elevated gamma-glutamyltransferase level (HP:0030948); Jaundice (HP:0000952); Elevated circulating C-reactive protein concentration (HP:0011227); Intrauterine growth retardation (HP:0001511); Hepatosplenomegaly (HP:0001433); Anemia (HP:0001903); Failure to thrive (HP:0001508); Increased B cell count (HP:0005404); Thrombocytopenia (HP:0001873); Elevated circulating alanine aminotransferase concentration (HP:0031964); Congenital hydrocele (HP:4000037); Elevated circulating aspartate aminotransferase concentration (HP:0031956); Purpura (HP:0000979); Conjugated hyperbilirubinemia (HP:0002908); Erythema (HP:0010783); excluded: Abnormal circulating IgG level (HP:0410242); excluded: Decreased circulating total IgM (HP:0002850)


In [29]:
Individual.output_individuals_as_phenopackets(individual_list=individuals, metadata=metadata)

We output 3 GA4GH phenopackets to the directory phenopackets
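
To check the exported files, the phenopackets can be read back as JSON. This is a minimal sketch that assumes the files are written as `*.json` in the `phenopackets` directory and follow the GA4GH Phenopacket Schema field names `id`, `phenotypicFeatures`, and `excluded`; the function name `summarize_phenopackets` is hypothetical.

```python
import json
from pathlib import Path


def summarize_phenopackets(directory):
    """Return {phenopacket id: (n observed, n excluded)} for each *.json file."""
    root = Path(directory)
    if not root.is_dir():
        return {}  # nothing to summarize
    summary = {}
    for path in sorted(root.glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            packet = json.load(handle)
        features = packet.get("phenotypicFeatures", [])
        n_excluded = sum(1 for feature in features if feature.get("excluded", False))
        # Fall back to the file name if the packet has no 'id' field.
        packet_id = packet.get("id", path.stem)
        summary[packet_id] = (len(features) - n_excluded, n_excluded)
    return summary


print(summarize_phenopackets("phenopackets"))
```

The observed and excluded counts per file can then be compared with the individual table shown above.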
