<H1>ANKH...</H1>
<p>This notebook imports data from <a href="https://pubmed.ncbi.nlm.nih.gov/22647861/" target="__blank">Gruber BL, et al. Novel ANKH amino terminus mutation (Pro5Ser) associated with early-onset calcium pyrophosphate disease with associated phosphaturia. J Clin Rheumatol. 2012 Jun;18(4):192-5.</a></p>

In [3]:
import phenopackets as php
from google.protobuf.json_format import MessageToDict, MessageToJson
from google.protobuf.json_format import Parse, ParseDict
import pandas as pd
pd.set_option('display.max_colwidth', None) # show entire column contents, important!
from collections import defaultdict
import os
import sys
from pyphetools import *

In [4]:
vignette="""
A 41-year-old woman presented as an outpatient with a history of onset of 
bilateral ankle swelling and pain dating back to age 20. The inflammatory attacks were self-limited, lasting
days to weeks and recurring every few months.
"""

<h3>Set up pyphetools</h3>

In [5]:
parser = HpoParser()
hpo_cr = parser.get_hpo_concept_recognizer()
pmid = "PMID:22647861"
age = "P20Y"
encoder = CaseEncoder(concept_recognizer=hpo_cr, pmid=pmid, age_at_last_exam=age)

In [6]:
d1={'ankle swelling and pain': 'Ankle pain'}
results = encoder.add_vignette(vignette=vignette, custom_d=d1)
results

   id          label       observed  measured
0  HP:0030840  Ankle pain  True      True
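
The custom dictionary passed as `custom_d` maps a phrase from the vignette to the HPO label it should produce. The block below is a minimal sketch of that idea using plain substring matching. It is not the pyphetools implementation, and the names `custom_map` and `find_custom_labels` are hypothetical.

```python
import re

# Hypothetical phrase-to-label dictionary, mirroring d1 in the cell above
custom_map = {"ankle swelling and pain": "Ankle pain"}

def find_custom_labels(text, phrase_to_label):
    """Return the labels whose phrase occurs in text, ignoring case and line breaks."""
    # Collapse all whitespace runs so that phrases split across lines still match
    normalized = re.sub(r"\s+", " ", text).lower()
    return [label for phrase, label in phrase_to_label.items()
            if phrase.lower() in normalized]

sample = """history of onset of
bilateral ankle swelling and pain dating back to age 20."""
print(find_custom_labels(sample, custom_map))  # ['Ankle pain']
```

Collapsing whitespace first matters for vignettes pasted from PDFs, where a phrase can be split across a line break.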


In [7]:
vignette2="""
 Subsequently, both knees became involved,
and arthroscopy was performed at age 22, revealing extensive CC intraoperatively, with
CPP crystals documented. Symptoms progressed despite treatment with colchicine and
nonsteroidal anti-inflammatory drugs. Additional joint involvement included shoulders,
elbows, and wrists with multiple attacks every few weeks. She also began to experience
severe low back pain. Over the years, attacks became less self-limited and a chronic,
polyarticular arthropathy evolved."""

In [8]:
d2 = {'ankle swelling and pain': 'Ankle pain',
      'low back pain': 'Low back pain',
      'extensive CC intraoperatively': 'Polyarticular chondrocalcinosis',
      'Polyarticular arthropathy': 'Polyarticular arthropathy'}
results = encoder.add_vignette(vignette=vignette2, custom_d=d2)
results

   id          label                            observed  measured
0  HP:0003419  Low back pain                    True      True
1  HP:0005017  Polyarticular chondrocalcinosis  True      True
2  HP:0005195  Polyarticular arthropathy        True      True


In [9]:
vignette3="""
On presentation, examination revealed multiple swollen, warm, and markedly tender joints
—including bilateral wrists, right shoulder with minimal range of motion, and bilateral ankle
swelling. She was wheelchair bound, unable to ambulate or bear weight. An ankle aspiration
revealed CPP crystals via compensated polarized microscopic analysis.
Laboratory tests revealed normal complete blood cell count and metabolic panel including
serum alkaline phosphatase, calcium (8.8 mg/dL), magnesium (1.9 mg/dL), urate (1.6 mg/dL), 
copper (122 μg/dL), ceruloplasmin (36 mg/dL), ferritin (31 ng/mL), 25-hydroxy
vitamin D (28 ng/mL), and thyroid function tests. Serum phosphorus was decreased (1.9
mg/dL) and intact parathyroid hormone was elevated (93 pg/mL; reference value, <65 pg/mL),
indicating mild secondary hyperparathyroidism. The level of fibroblast growth
factor-23 was not elevated (100 RU/mL; reference value, <180 RU/mL). The 24-hour urine
collection revealed elevated calcium (437 mg/24 hr; reference value, <275 mg/24 hr) and
phosphorus (1730 mg/24 hr; reference value, <1300 mg/24 hr) with normal creatinine (1860
mg/24 hr) and uric acid (600 mg/24 hr) excretion, with a calculated creatinine clearance of
110 mL/min.
"""
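
The vignette quotes an upper reference limit for several measurements, so the elevated ones can be checked with a simple comparison. This is a small sketch that uses only the values and limits stated in the text above. The name `elevated` is hypothetical. Serum phosphorus is left out because the vignette reports it as decreased without giving a reference range.

```python
# (analyte, measured value, upper reference limit, unit), as quoted in the vignette
measurements = [
    ("intact parathyroid hormone", 93, 65, "pg/mL"),
    ("fibroblast growth factor-23", 100, 180, "RU/mL"),
    ("24-hour urine calcium", 437, 275, "mg/24 hr"),
    ("24-hour urine phosphorus", 1730, 1300, "mg/24 hr"),
]

def elevated(rows):
    """Return analytes at or above their upper limit (limits are quoted as '<limit')."""
    return [name for name, value, upper, _unit in rows if value >= upper]

print(elevated(measurements))
# ['intact parathyroid hormone', '24-hour urine calcium', '24-hour urine phosphorus']
```

The three flagged analytes correspond to the elevated parathyroid hormone, hypercalciuria, and hyperphosphaturia described in the case report.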

In [10]:
## Add labs one by one
# The labs were drawn at presentation, when the patient was 41 years old
lab_age = "P41Y"
lab_results = []
# Hypophosphatemia HP:0002148
r = encoder.add_term(label="Hypophosphatemia", custom_age=lab_age)
lab_results.append(r)
r = encoder.add_term(label="Elevated circulating parathyroid hormone level", custom_age=lab_age)
lab_results.append(r)
r = encoder.add_term(label="Hypercalciuria", custom_age=lab_age)
lab_results.append(r)
r = encoder.add_term(label="Hyperphosphaturia", custom_age=lab_age)
lab_results.append(r)
# lab_results  # uncomment to display

In [11]:
vignette5="""Radiographs of multiple regions revealed extensive CC of small and large joints within the
articular cartilage and fibrocartilage, as well as along tendon insertions in the feet (Fig. 1). In
addition, calcifications were observed throughout the spine within intervertebral disks (Fig.
1).
Treatment was instituted with intra-articular steroids and oral prednisone for acute attacks
and a combination of hydroxychloroquine, magnesium oxide, probenecid, and colchicine for
long-term maintenance therapy. After several weeks, acute synovitis markedly improved,
and for the past 12 months, only 2 minor exacerbations occurred, which responded rapidly
to oral prednisolone treatment. Interestingly, we noted that abnormalities of serum
phosphate, 25-OH vitamin D, and parathyroid hormone levels resolved during treatment.
She has been able to continue her occupation as a teacher. Recently, radiographs (knees,
hands, wrists) of the patient’s father reveal marked and extensive CC with degenerative joint
disease consistent with CPPD"""

In [12]:
genome = 'hg38'
default_genotype = 'heterozygous'
transcript = 'NM_054027.4'  # not mentioned in the article; assumed correct since there are no alternative start codons
varValidator = VariantValidator(genome_build=genome, transcript=transcript)
var = varValidator.encode_hgvs(hgvs="c.13 C>T")
var.to_string()

https://rest.variantvalidator.org/VariantValidator/variantvalidator/hg38/NM_054027.4%3Ac.13 C>T/NM_054027.4


'chr5:14871435G>A'
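
Two quick sanity checks can be run on this result. Coding position 13 should fall in codon 5, matching the Pro5Ser change named in the article title: proline codons begin with CC and serine codons with TC, so a C>T change at the first base of a proline codon yields serine. The genomic change G>A is the complement of C>T, which is what one would expect if the transcript lies on the reverse strand; that strand orientation is inferred from the output here, not stated in the source. The helper names below are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def codon_position(cds_pos):
    """Map a 1-based coding position to (codon number, 1-based position in the codon)."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1

def complement_change(ref, alt):
    """Return the same substitution as read on the opposite strand."""
    return COMPLEMENT[ref], COMPLEMENT[alt]

print(codon_position(13))           # (5, 1): first base of codon 5
print(complement_change("C", "T"))  # ('G', 'A')
```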

In [13]:
individual_id = "41-year-old woman"
sex = "FEMALE"
age = "P41Y" 
disease_id = "OMIM:118600"
disease_label = "Chondrocalcinosis 2"
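
The age strings used in this notebook, such as `P20Y` and `P41Y`, are ISO 8601 durations. The sketch below parses the year, month, and day components with a regular expression so that a malformed age is caught before it reaches the encoder. It handles only this simple subset of ISO 8601 (no weeks or time parts), and `parse_age` is a hypothetical helper, not part of pyphetools.

```python
import re

# Matches durations such as P41Y, P1Y8M, or P3M10D (years, months, days only)
_AGE_PATTERN = re.compile(r"^P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)D)?$")

def parse_age(duration):
    """Return (years, months, days) for a simple ISO 8601 duration string."""
    match = _AGE_PATTERN.match(duration)
    # Reject strings that do not match, or a bare "P" with no components
    if match is None or not any(match.groups()):
        raise ValueError(f"Not a supported ISO 8601 age: {duration!r}")
    years, months, days = (int(g) if g else 0 for g in match.groups())
    return years, months, days

print(parse_age("P41Y"))   # (41, 0, 0)
print(parse_age("P1Y8M"))  # (1, 8, 0)
```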

In [14]:
metadata = MetaData(created_by="ORCID:0000-0002-0736-9199")
metadata.geno()
metadata.hpo(version='v2022-12-15')
metadata.hgnc(version='06/01/23')
metadata.omim(version='January 4, 2023')

In [15]:
ppacket = encoder.get_phenopacket(individual_id=individual_id, sex=sex, age=age, disease_id=disease_id, 
                                  disease_label=disease_label, variants=var, metadata=metadata.to_ga4gh())

In [16]:
json_string = MessageToJson(ppacket)
print(json_string)

{
  "id": "PMID_22647861_41-year-old_woman",
  "subject": {
    "id": "41-year-old woman",
    "timeAtLastEncounter": {
      "age": {
        "iso8601duration": "P41Y"
      }
    },
    "sex": "FEMALE"
  },
  "phenotypicFeatures": [
    {
      "type": {
        "id": "HP:0030840",
        "label": "Ankle pain"
      },
      "onset": {
        "age": {
          "iso8601duration": "P20Y"
        }
      }
    },
    {
      "type": {
        "id": "HP:0003419",
        "label": "Low back pain"
      },
      "onset": {
        "age": {
          "iso8601duration": "P20Y"
        }
      }
    },
    {
      "type": {
        "id": "HP:0005017",
        "label": "Polyarticular chondrocalcinosis"
      },
      "onset": {
        "age": {
          "iso8601duration": "P20Y"
        }
      }
    },
    {
      "type": {
        "id": "HP:0005195",
        "label": "Polyarticular arthropathy"
      },
      "onset": {
        "age": {
          "iso8601duration": "P20Y"
        }
     

In [17]:
output_directory = "ANKH_phenopackets"
encoder.output_phenopacket(outdir=output_directory, phenopacket=ppacket)

Wrote phenopacket to ANKH_phenopackets/PMID_22647861_41-year-old_woman.json
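
To confirm what was written, the JSON can be read back with the standard library and the phenotypic features listed. The following is a sketch under the assumption that the file has the same layout as the JSON printed earlier (`phenotypicFeatures`, `type`, `onset`). The function name `list_features` and the inline `example` string are illustrative only; for the real file, pass the contents of the path reported above.

```python
import json

def list_features(phenopacket_json):
    """Return (HPO id, label, onset age) tuples from a phenopacket JSON string."""
    packet = json.loads(phenopacket_json)
    rows = []
    for feature in packet.get("phenotypicFeatures", []):
        term = feature["type"]
        # Onset is optional, so fall back to None when it is absent
        onset = feature.get("onset", {}).get("age", {}).get("iso8601duration")
        rows.append((term["id"], term["label"], onset))
    return rows

# A one-feature fragment copied from the printed phenopacket, for illustration
example = """{
  "id": "PMID_22647861_41-year-old_woman",
  "phenotypicFeatures": [
    {"type": {"id": "HP:0030840", "label": "Ankle pain"},
     "onset": {"age": {"iso8601duration": "P20Y"}}}
  ]
}"""
print(list_features(example))  # [('HP:0030840', 'Ankle pain', 'P20Y')]
```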
