# LYN

Data from [de Jesus AA, et al. (2023) Constitutively active Lyn kinase causes a cutaneous small vessel vasculitis and liver fibrosis syndrome. PMID:36932076](https://pubmed.ncbi.nlm.nih.gov/36932076/).

In [1]:
import pandas as pd
from IPython.display import display, HTML
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from collections import defaultdict
from pyphetools.creation import *
from pyphetools.visualization import IndividualTable, QcVisualizer
from pyphetools.validation import *
import pyphetools
print(f"Using pyphetools version {pyphetools.__version__}")

Using pyphetools version 0.9.22


In [2]:
# de Jesus AA (2023)
PMID="PMID:36932076"
title = "Constitutively active Lyn kinase causes a cutaneous small vessel vasculitis and liver fibrosis syndrome"
cite = Citation(pmid=PMID, title=title)
parser = HpoParser(hpo_json_file="../hp.json")
hpo_cr = parser.get_hpo_concept_recognizer()
hpo_version = parser.get_version()
hpo_ontology = parser.get_ontology()
metadata = MetaData(created_by="ORCID:0000-0002-0736-9199", citation=cite)
metadata.default_versions_with_hpo(version=hpo_version)
print(f"HPO version {hpo_version}")

HPO version 2023-10-09


In [3]:
df = pd.read_excel("input/louvrier_2023_LYN.xlsx")
dft = df.transpose()
dft.columns = dft.iloc[0]
dft.drop(dft.index[0], inplace=True)
dft['individual_id'] = dft.index  # Set the new column 'individual_id' to be identical to the contents of the index
dft.head() # check the transposed table

Clinical and laboratory features,Age of disease onset,Sex,Age at diagnosis,Presenting Symptoms,Hydrops fetalis,Liver fibrosis,Other clinical manifestations,ESR (mm/1h),CRP (mg/L),SAA (mg/L),...,NK lymphocytes (abs #),IgG,IgA,IgM,Skin biopsies,Elastography (max),Infections,Treatment,NM_002350.4,individual_id
Patient 1,1st day of life,female,2 years 6 months old,"Purpuric rash, hepatosplenomegaly, fever, thrombocytopenia at birth","Yes, intra-utero platelet and PRBC transfusion at 29 weeks of GA",Yes,"Recurrent parotitis, abdominal pain, periorbital edema and erythema, conjunctivitis, epididymitis, headaches, arthralgias, oral ulcers, fatigue, GVHD-like colitis",64,14–86.5,ND,...,nl,nl,nl,low,Small vessel vasculitis with neutrophilic infiltrate and destruction of dermal vessel walls,6.7 kPa,"Enteropathogenic E. coli, Salmonella sp, Toxocara canisb","Poor response to IVIG, IVMP and oral prednisolone, partial response to dasatinib monotherapy, partial response to etanercept monotherapy, good response to dasatinib and etanercept combination therapy",c.1522T>C,Patient 1
Patient 2,1st day of life,male,15 years-old,"Mild purpuric rash at birth, fever, and generalized severe purpuric rash at the age of 3 months",No,No,"Recurrent abdominal pain, periorbital edema and erythema, conjunctivitis, epididymitis, headaches, arthralgias, oral ulcers, fatigue, GVHD-like colitis",ND,46–166,182–984,...,nl,nl,nl,low,Perivascular neutrophilic dermal infiltrate,"ND, normal LFTs",Post-streptococcal glomerulonephritis,"No response to anakinra and tocilizumab, partial response to colchicine and good response to etanercept and colchicine",c.1524C>G,Patient 2
Patient 3,1st day of life,male,4 months old,"Hepatosplenomegaly, thrombocytopenia, and discrete purpuric rash at birth","No, had congenital hydrocele",Yes,"Intrauterine growth restriction, failure to thrive, transient periorbital erythema, jaundice, direct hyperbilirubinemia",ND,6.5–107.6,ND,...,high,nl,low,nl,Small vessel vasculitis with neutrophilic infiltrate and destruction of dermal vessel walls,18.7 kPa,"Late neonatal sepsis, COVID-19 (asymptomatic)","Partial improvement of CRP and thrombocytopenia, resolution of direct hyperbilirubinemia, return of normal growth rate, and persistence of liver fibrosis on etanercept therapy",c.1523A>T,Patient 3
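
The cell above reshapes a sheet that stores one patient per column into a table with one patient per row. A dependency-free sketch of the same reshaping is shown here on a two-feature excerpt of the table; `transpose_sheet` is a hypothetical helper for illustration and is not part of pandas or pyphetools.

```python
def transpose_sheet(rows):
    """Turn a feature-per-row sheet into a list of per-patient dicts.

    rows[0] is the header row: a label for the feature column followed by
    the patient identifiers. Every later row holds a feature name followed
    by one value per patient.
    """
    header, *feature_rows = rows
    patient_ids = header[1:]
    records = []
    for col, patient_id in enumerate(patient_ids, start=1):
        record = {row[0]: row[col] for row in feature_rows}
        # Mirrors dft['individual_id'] = dft.index in the notebook cell
        record["individual_id"] = patient_id
        records.append(record)
    return records


sheet = [
    ["Clinical and laboratory features", "Patient 1", "Patient 2", "Patient 3"],
    ["Sex", "female", "male", "male"],
    ["Liver fibrosis", "Yes", "No", "Yes"],
]
for record in transpose_sheet(sheet):
    print(record)
```

Each resulting record corresponds to one row of `dft`, with the former column header carried along as `individual_id`.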


In [4]:
#res = OptionColumnMapper.autoformat(df=dft, concept_recognizer=hpo_cr, )
#print(res)
column_mapper_d = {}

In [5]:
presenting_symptoms_d = {'Purpuric rash': 'Purpura',
                    'hepatosplenomegaly': 'Hepatosplenomegaly',
                    'fever': 'Fever',
                    'thrombocytopenia at birth': 'Thrombocytopenia',
                    'Mild purpuric rash at birth': 'Purpura',
                    'and generalized severe purpuric rash at the age of 3 months': 'Purpura',
                    'Hepatosplenomegaly': 'Hepatosplenomegaly',
                    'thrombocytopenia': 'Thrombocytopenia',
                    'and discrete purpuric rash at birth': 'Purpura'}
presenting_symptomsMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=presenting_symptoms_d)
presenting_symptomsMapper.preview_column(dft['Presenting Symptoms'])
column_mapper_d['Presenting Symptoms'] = presenting_symptomsMapper
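
The option dictionary above pairs each free-text fragment of a cell with an HPO label. The following is a simplified sketch of that idea, assuming cells are comma-separated and fragments must match exactly; `map_cell` is a hypothetical helper and does not reproduce how `OptionColumnMapper` is implemented.

```python
def map_cell(cell, option_d):
    """Split a comma-separated cell and look each fragment up in option_d.

    Returns (matched_labels, unmatched_fragments). Duplicate labels are
    dropped while keeping first-seen order.
    """
    matched, unmatched = [], []
    for item in (part.strip() for part in cell.split(",")):
        if not item:
            continue
        label = option_d.get(item)
        if label is None:
            unmatched.append(item)
        elif label not in matched:
            matched.append(label)
    return matched, unmatched


option_d = {
    "Mild purpuric rash at birth": "Purpura",
    "fever": "Fever",
    "and generalized severe purpuric rash at the age of 3 months": "Purpura",
}
cell = ("Mild purpuric rash at birth, fever, "
        "and generalized severe purpuric rash at the age of 3 months")
print(map_cell(cell, option_d))
```

Returning the unmatched fragments separately makes it easy to spot text that still needs a dictionary entry.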

In [6]:
hydrops_fetalis_d = {'Yes': 'Hydrops fetalis',
 #'intra-utero platelet and PRBC transfusion at 29 weeks of\xa0GA': 'PLACEHOLDER',
 'had congenital hydrocele': 'Congenital hydrocele'}
excluded = {'No': 'Hydrops fetalis',}
hydrops_fetalisMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=hydrops_fetalis_d, excluded_d=excluded)
hydrops_fetalisMapper.preview_column(dft['Hydrops fetalis'])
column_mapper_d['Hydrops fetalis'] = hydrops_fetalisMapper

In [7]:
liver_fibrosis_d = {'Yes': 'Hepatic fibrosis',}
excluded = {'No': 'Hepatic fibrosis',}
liver_fibrosisMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=liver_fibrosis_d, excluded_d=excluded)
liver_fibrosisMapper.preview_column(dft['Liver fibrosis'])
column_mapper_d['Liver fibrosis'] = liver_fibrosisMapper

In [8]:
other_clinical_manifestations_d = {'Recurrent parotitis': 'Parotitis',
 'abdominal pain': 'Abdominal pain',
 'periorbital edema and erythema': 'Periorbital edema',
 'conjunctivitis': 'Conjunctivitis',
 'epididymitis': 'Epididymitis',
 'headaches': 'Headache',
 'arthralgias': 'Arthralgia',
 'oral ulcers': 'Oral ulcer',
 'fatigue': 'Fatigue',
 'GVHD-like colitis': 'Colitis',
 'Recurrent abdominal pain': 'Abdominal pain',
 'Intrauterine growth restriction': 'Intrauterine growth retardation',
 'failure to thrive': 'Failure to thrive',
 'transient periorbital erythema': 'Erythema',
 'jaundice': 'Jaundice',
 'direct hyperbilirubinemia': 'Conjugated hyperbilirubinemia'}
other_clinical_manifestationsMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=other_clinical_manifestations_d)
other_clinical_manifestationsMapper.preview_column(dft['Other clinical manifestations'])
column_mapper_d['Other clinical manifestations'] = other_clinical_manifestationsMapper

In [9]:
# ESR reference limits: male <50 years old: ≤15 mm/hr; female <50 years old: ≤20 mm/hr; male >50 years old: ≤20 mm/hr.
# Patient 1's value of 64 mm/hr is therefore elevated.
esr_d = {'64': 'Elevated erythrocyte sedimentation rate',}
esr_Mapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=esr_d)
esr_Mapper.preview_column(dft['ESR (mm/1h)'])
column_mapper_d['ESR (mm/1h)'] = esr_Mapper
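
The reasoning in the comment above can be written down as a small check. This sketch encodes only the three reference limits quoted in the comment and returns `None` for any group the comment does not cover; both function names are hypothetical.

```python
def esr_upper_limit(sex, age_years):
    """Upper ESR reference limit in mm/hr, or None if the group is not covered."""
    if sex == "male" and age_years < 50:
        return 15
    if sex == "female" and age_years < 50:
        return 20
    if sex == "male" and age_years > 50:
        return 20
    return None  # no limit was quoted for this group


def is_esr_elevated(value, sex, age_years):
    """True/False when a limit is known, None otherwise."""
    limit = esr_upper_limit(sex, age_years)
    if limit is None:
        return None
    return value > limit


# Patient 1: female, diagnosed at 2 years 6 months, ESR 64 mm/hr
print(is_esr_elevated(64, "female", 2.5))
```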

In [10]:
# Elevated circulating C-reactive protein concentration HP:0011227
crp_d = {'14–86.5': 'Elevated circulating C-reactive protein concentration',
 '46–166': 'Elevated circulating C-reactive protein concentration',
 '6.5–107.6': 'Elevated circulating C-reactive protein concentration'}
crpMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=crp_d)
crpMapper.preview_column(dft['CRP (mg/L)'])
column_mapper_d['CRP (mg/L)'] = crpMapper
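
The CRP cells hold ranges such as `14–86.5` (written with an en dash) rather than single values, which is why each range string is mapped by hand above. A minimal sketch of parsing such ranges programmatically follows. The helper names are hypothetical, and the 10 mg/L cutoff is an assumption chosen only for illustration; the reporting laboratory's reference range should be used in practice.

```python
import re


def parse_range(text):
    """Parse 'low–high' (en dash or hyphen) or a single number.

    Returns (low, high) as floats, or None for non-numeric cells like 'ND'.
    """
    parts = re.split(r"\s*[–-]\s*", text.strip())
    try:
        values = [float(part) for part in parts]
    except ValueError:
        return None
    if len(values) == 1:
        return (values[0], values[0])
    if len(values) == 2:
        return (values[0], values[1])
    return None


def exceeds_limit(text, upper_limit):
    """True if the highest reported value is above upper_limit, None if unparseable."""
    parsed = parse_range(text)
    if parsed is None:
        return None
    return parsed[1] > upper_limit


ASSUMED_CRP_LIMIT = 10.0  # mg/L, illustrative assumption only
for cell in ["14–86.5", "46–166", "6.5–107.6", "ND"]:
    print(cell, exceeds_limit(cell, ASSUMED_CRP_LIMIT))
```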

In [11]:
# Serum amyloid A (SAA) is elevated in Patient 2 (182–984 mg/L), but an HPO term is still needed
#saa_d = {'182–984': 'PLACEHOLDER'}
#saaMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=saa_d)
#saaMapper.preview_column(dft['SAA (mg/L)'])
#column_mapper_d['SAA (mg/L)'] = saaMapper

In [12]:
cbc_d = {'Mild anemia': 'Anemia',
 'mild leukocytosis': 'Leukocytosis',
 'moderate to severe thrombocytopenia': 'Thrombocytopenia',
 'Mild leukocytosis': 'Leukocytosis',
 'moderate leukocytosis': 'Leukocytosis'}
cbcMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=cbc_d)
cbcMapper.preview_column(dft['CBC'])
column_mapper_d['CBC'] = cbcMapper

In [13]:
lfts_d = {'Increased ALT': 'Elevated circulating alanine aminotransferase concentration',
 'Increased AST': 'Elevated circulating aspartate aminotransferase concentration',
 'Increased GGT': 'Elevated gamma-glutamyltransferase level',
 }
excluded = {'Normal ALT': 'Elevated circulating alanine aminotransferase concentration',
 'Normal AST': 'Elevated circulating aspartate aminotransferase concentration',
 'Normal GGT': 'Elevated gamma-glutamyltransferase level'}
lftsMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=lfts_d, excluded_d=excluded)
lftsMapper.preview_column(dft['LFTs'])
#column_mapper_d['LFTs'] = lftsMapper

Unnamed: 0,original text,terms
0,"Increased ALT, Increased AST, Increased GGT",HP:0031964 (Elevated circulating alanine aminotransferase concentration/observed); HP:0031956 (Elevated circulating aspartate aminotransferase concentration/observed); HP:0030948 (Elevated gamma-glutamyltransferase level/observed)
1,"Normal ALT, Normal AST, Normal GGT",
2,"Increased ALT, Increased AST, Increased GGT",HP:0031964 (Elevated circulating alanine aminotransferase concentration/observed); HP:0031956 (Elevated circulating aspartate aminotransferase concentration/observed); HP:0030948 (Elevated gamma-glutamyltransferase level/observed)


In [14]:
autoantibodies_d = {'Positive ANA': 'Antinuclear antibody positivity',
 'anti-Sm': 'Anti-Sm antibody positivity',
# 'anti-SSA': 'PLACEHOLDER',
# 'ACL IgG': 'PLACEHOLDER',
 'LAC': 'Lupus anticoagulant',
 'anti-mitochondrial': 'Antimitochondrial antibody positivity',
 'RF': 'Rheumatoid factor positive',
 'anti-TPO': 'Anti-thyroid peroxidase antibody positivity',
 #'transient positivity for ANA on research testing once': 'PLACEHOLDER',
 #'Borderline anti-PR3': 'PLACEHOLDER'
}
autoantibodiesMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=autoantibodies_d)
autoantibodiesMapper.preview_column(dft['Autoantibodies'])
column_mapper_d['Autoantibodies'] = autoantibodiesMapper

In [15]:
#cd4_lymphocytes_d = {'high': 'PLACEHOLDER',
# 'nl': 'PLACEHOLDER'}
#cd4_lymphocytesMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=cd4_lymphocytes_d)
#cd4_lymphocytesMapper.preview_column(dft['CD4 lymphocytes (abs #)'])
#column_mapper_d['CD4 lymphocytes (abs #)'] = cd4_lymphocytesMapper

In [16]:
#cd8_lymphocytes_d = {'high': 'PLACEHOLDER',
# 'nl': 'PLACEHOLDER'}
#cd8_lymphocytesMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=cd8_lymphocytes_d)
#cd8_lymphocytesMapper.preview_column(dft['CD8 lymphocytes (abs #)'])
#column_mapper_d['CD8 lymphocytes (abs #)'] = cd8_lymphocytesMapper

In [17]:
b_lymphocytes_absMapper = SimpleColumnMapper(hpo_id="HP:0005404", hpo_label="Increased B cell count", observed="high", excluded="nl")
b_lymphocytes_absMapper.preview_column(dft['B lymphocytes (abs #)'])
column_mapper_d['B lymphocytes (abs #)'] = b_lymphocytes_absMapper

In [18]:
#nk_lymphocytes_d = {'nl': 'PLACEHOLDER',
# 'high': 'PLACEHOLDER'}
#nk_lymphocytesMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=nk_lymphocytes_d)
#nk_lymphocytesMapper.preview_column(dft['NK lymphocytes (abs #)'])
#column_mapper_d['NK lymphocytes (abs #)'] = nk_lymphocytesMapper

In [19]:
iggMapper = SimpleColumnMapper(hpo_id="HP:0410242", hpo_label="Abnormal circulating IgG level", observed="high", excluded="nl")
iggMapper.preview_column(dft['IgG'])
column_mapper_d['IgG'] = iggMapper

In [20]:
igaMapper = SimpleColumnMapper(hpo_id="HP:0002720", hpo_label="Decreased circulating IgA level", observed="low", excluded="nl")
igaMapper.preview_column(dft['IgA'])
column_mapper_d['IgA'] = igaMapper

In [21]:
igmMapper = SimpleColumnMapper(hpo_id="HP:0002850", hpo_label="Decreased circulating total IgM", observed="low", excluded="nl")
igmMapper.preview_column(dft['IgM'])
column_mapper_d['IgM'] = igmMapper
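
The immunoglobulin and B cell columns each hold one of a few symbols, so a single HPO term with an observed symbol and an excluded symbol is enough. The sketch below shows that three-state logic on the IgM values from the table; `map_simple` is a hypothetical helper and is not the `SimpleColumnMapper` implementation.

```python
def map_simple(cell, observed_symbol, excluded_symbol):
    """Classify a cell as 'observed', 'excluded', or None (no annotation)."""
    value = cell.strip()
    if value == observed_symbol:
        return "observed"
    if value == excluded_symbol:
        return "excluded"
    return None  # e.g. 'ND' or any other symbol yields no annotation


# IgM column for Patients 1-3, mapped to Decreased circulating total IgM (HP:0002850)
igm_cells = {"Patient 1": "low", "Patient 2": "low", "Patient 3": "nl"}
for patient, cell in igm_cells.items():
    print(patient, map_simple(cell, observed_symbol="low", excluded_symbol="nl"))
```

With this logic a value of `high` in a column whose term describes a decrease produces no annotation at all, which is worth keeping in mind when choosing the term for each column.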

In [22]:
skin_biopsies_d = {'Small vessel vasculitis with neutrophilic infiltrate and destruction of dermal vessel walls': 'Small vessel vasculitis',
 #'Perivascular neutrophilic dermal infiltrate': 'PLACEHOLDER'
                  }
skin_biopsiesMapper = OptionColumnMapper(concept_recognizer=hpo_cr, option_d=skin_biopsies_d)
skin_biopsiesMapper.preview_column(dft['Skin biopsies'])
column_mapper_d['Skin biopsies'] = skin_biopsiesMapper

In [23]:
ageMapper = AgeColumnMapper.custom_dictionary(column_name="Age of disease onset", string_to_iso_d={'1st day of life': 'P1D'})
#ageMapper.preview_column(dft["Age of disease onset"])
sexMapper = SexColumnMapper(column_name="Sex", male_symbol="male", female_symbol="female")
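
The age mapper above relies on a custom dictionary because `1st day of life` is not a plain number-plus-unit string. As a sketch of how free-text ages can be turned into ISO 8601 durations, the hypothetical `to_iso8601` helper below handles simple year, month, and day phrases (such as those in the "Age at diagnosis" column) and falls back to a custom dictionary for everything else. It is an illustration, not the `AgeColumnMapper` implementation.

```python
import re

# Units are checked in ISO 8601 order: years, months, days
_UNITS = (("year", "Y"), ("month", "M"), ("day", "D"))


def to_iso8601(text, custom=None):
    """Convert strings like '2 years 6 months old' to 'P2Y6M'.

    Entries in `custom` take precedence. Returns None if nothing matches.
    """
    custom = custom or {}
    if text in custom:
        return custom[text]
    duration = ""
    for unit, code in _UNITS:
        match = re.search(rf"(\d+)\s*{unit}s?\b", text)
        if match:
            duration += f"{int(match.group(1))}{code}"
    return f"P{duration}" if duration else None


custom = {"1st day of life": "P1D"}
for text in ["2 years 6 months old", "15 years-old", "4 months old", "1st day of life"]:
    print(text, "->", to_iso8601(text, custom))
```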

In [24]:
vmanager = VariantManager(df=dft, 
                          allele_1_column_name="NM_002350.4",
                          cohort_name="LYN", 
                          individual_column_name="individual_id",
                          transcript="NM_002350.4")
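
The variant column holds coding-DNA changes that are combined with the transcript identifier to give full HGVS expressions such as `NM_002350.4:c.1522T>C`. The sketch below performs only a local syntax check for simple substitutions before building that string. The `to_hgvs` helper is hypothetical; it does not validate against a reference sequence and is not how `VariantManager` works.

```python
import re

# Matches simple coding substitutions only, e.g. c.1522T>C
_SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")


def to_hgvs(transcript, allele):
    """Return 'transcript:allele' after a basic syntax check of the allele."""
    allele = allele.strip()
    match = _SUBSTITUTION.match(allele)
    if match is None:
        raise ValueError(f"Not a simple c. substitution: {allele!r}")
    if match.group(2) == match.group(3):
        raise ValueError(f"Reference and alternate base are identical: {allele!r}")
    return f"{transcript}:{allele}"


for allele in ["c.1522T>C", "c.1524C>G", "c.1523A>T"]:
    print(to_hgvs("NM_002350.4", allele))
```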

In [25]:
disease = Disease(disease_id="OMIM:620376", disease_label="Autoinflammatory disease, systemic, with vasculitis")
encoder = CohortEncoder(df=dft,
                        hpo_cr=hpo_cr,
                        agemapper=ageMapper,
                        sexmapper=sexMapper,
                        column_mapper_d=column_mapper_d,
                        individual_column_name="individual_id",
                        metadata=metadata)
encoder.set_disease(disease)

In [26]:
individuals = encoder.get_individuals()
vmanager.add_variants_to_individuals(individual_list=individuals)

In [27]:
cvalidator = CohortValidator(cohort=individuals, ontology=hpo_ontology, min_hpo=1, allelic_requirement=AllelicRequirement.MONO_ALLELIC)
qc = QcVisualizer(cohort_validator=cvalidator)
display(HTML(qc.to_summary_html()))

Level,Error category,Count
WARNING,REDUNDANT,2


In [28]:
individuals = cvalidator.get_error_free_individual_list()
table = IndividualTable(individuals)
display(HTML(table.to_html()))

Individual,Disease,Genotype,Phenotypic features
Patient 1 (FEMALE; P1D),"Autoinflammatory disease, systemic, with vasculitis (OMIM:620376)",NM_002350.4:c.1522T>C (heterozygous),Epididymitis (HP:0000031); Anti-Sm antibody positivity (HP:0033040); Lupus anticoagulant (HP:0025343); Small vessel vasculitis (HP:0011944); Arthralgia (HP:0002829); Fever (HP:0001945); Parotitis (HP:0011850); Headache (HP:0002315); Oral ulcer (HP:0000155); Purpura (HP:0000979); Anemia (HP:0001903); Fatigue (HP:0012378); Hydrops fetalis (HP:0001789); Hepatic fibrosis (HP:0001395); Conjunctivitis (HP:0000509); Antimitochondrial antibody positivity (HP:0030167); Decreased circulating total IgM (HP:0002850); Anti-thyroid peroxidase antibody positivity (HP:0025379); Elevated circulating C-reactive protein concentration (HP:0011227); Periorbital edema (HP:0100539); Colitis (HP:0002583); Abdominal pain (HP:0002027); Thrombocytopenia (HP:0001873); Elevated erythrocyte sedimentation rate (HP:0003565); Rheumatoid factor positive (HP:0002923); Leukocytosis (HP:0001974); Hepatosplenomegaly (HP:0001433); excluded: Increased B cell count (HP:0005404); excluded: Abnormal circulating IgG level (HP:0410242); excluded: Decreased circulating IgA level (HP:0002720)
Patient 2 (MALE; P1D),"Autoinflammatory disease, systemic, with vasculitis (OMIM:620376)",NM_002350.4:c.1524C>G (heterozygous),Fatigue (HP:0012378); Headache (HP:0002315); Epididymitis (HP:0000031); Oral ulcer (HP:0000155); Purpura (HP:0000979); Conjunctivitis (HP:0000509); Arthralgia (HP:0002829); Leukocytosis (HP:0001974); Decreased circulating total IgM (HP:0002850); Fever (HP:0001945); Elevated circulating C-reactive protein concentration (HP:0011227); Periorbital edema (HP:0100539); Colitis (HP:0002583); Abdominal pain (HP:0002027); excluded: Hydrops fetalis (HP:0001789); excluded: Hepatic fibrosis (HP:0001395); excluded: Increased B cell count (HP:0005404); excluded: Abnormal circulating IgG level (HP:0410242); excluded: Decreased circulating IgA level (HP:0002720)
Patient 3 (MALE; P1D),"Autoinflammatory disease, systemic, with vasculitis (OMIM:620376)",NM_002350.4:c.1523A>T (heterozygous),Thrombocytopenia (HP:0001873); Congenital hydrocele (HP:4000037); Purpura (HP:0000979); Hepatic fibrosis (HP:0001395); Failure to thrive (HP:0001508); Conjugated hyperbilirubinemia (HP:0002908); Intrauterine growth retardation (HP:0001511); Decreased circulating IgA level (HP:0002720); Jaundice (HP:0000952); Increased B cell count (HP:0005404); Small vessel vasculitis (HP:0011944); Anemia (HP:0001903); Hepatosplenomegaly (HP:0001433); Elevated circulating C-reactive protein concentration (HP:0011227); Erythema (HP:0010783); excluded: Abnormal circulating IgG level (HP:0410242); excluded: Decreased circulating total IgM (HP:0002850)


In [29]:
Individual.output_individuals_as_phenopackets(individual_list=individuals, metadata=metadata)

We output 3 GA4GH phenopackets to the directory phenopackets
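
To give a feel for what the exported files contain, the sketch below builds a heavily simplified phenopacket-like dictionary for two of Patient 2's features and prints it as JSON. The field names follow the GA4GH Phenopacket Schema as far as this sketch assumes; the `make_phenopacket` helper and the identifier format are hypothetical, and the metadata, disease, and variant interpretation sections written by pyphetools are omitted.

```python
import json


def make_phenopacket(individual_id, sex, observed, excluded):
    """Build a minimal phenopacket-like dict from (HPO id, label) pairs."""
    features = [{"type": {"id": hpo_id, "label": label}}
                for hpo_id, label in observed]
    features += [{"type": {"id": hpo_id, "label": label}, "excluded": True}
                 for hpo_id, label in excluded]
    return {
        "id": individual_id.replace(" ", "_"),  # hypothetical identifier format
        "subject": {"id": individual_id, "sex": sex},
        "phenotypicFeatures": features,
    }


packet = make_phenopacket(
    individual_id="Patient 2",
    sex="MALE",
    observed=[("HP:0001945", "Fever")],
    excluded=[("HP:0001395", "Hepatic fibrosis")],
)
print(json.dumps(packet, indent=2))
```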
