
Scalability


Overview


Background

Problem: Computational phenotype definitions may lack scalability because creating them is a time-consuming, iterative process that requires both domain expertise and external validation.

Solution: PhenKnowVec implements differentially private embedding methods, which convert large, complex, heterogeneous data into scalable, compressed vectors without semantic information loss.
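As an illustration of the general idea (not PhenKnowVec's actual implementation), the Python sketch below applies the standard Gaussian mechanism to a patient-level embedding: each code vector's contribution is clipped, the vectors are summed, and calibrated noise is added. All names and parameters (dp_patient_embedding, epsilon, delta, clip_norm) are hypothetical.

```python
import numpy as np

def dp_patient_embedding(code_vectors, epsilon=1.0, delta=1e-5, clip_norm=1.0):
    """Aggregate a patient's clinical-code vectors into one embedding and add
    Gaussian noise calibrated to (epsilon, delta)-differential privacy.

    code_vectors: (n_codes, dim) array; each row is assumed to be the
    embedding of one clinical or ontology code observed for the patient.
    """
    # Clip each code vector so any single code contributes at most clip_norm.
    norms = np.linalg.norm(code_vectors, axis=1, keepdims=True)
    clipped = code_vectors * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # Sum the clipped vectors; the L2 sensitivity of this sum is clip_norm.
    summed = clipped.sum(axis=0)

    # Gaussian-mechanism noise scale for (epsilon, delta)-DP.
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return summed + np.random.normal(0.0, sigma, size=summed.shape)
```

Clipping bounds each code's sensitivity so the noise can be calibrated; a smaller epsilon gives stronger privacy at the cost of a noisier vector.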

Experiment: For each phenotype, patient-level embeddings will be created for each of the cohorts that were derived using the clinical and ontology code sets from the translatability experiments. Two types of patient-level embeddings will be built (a rough sketch of both variants follows the list):

  • Embeddings that include only the clinical codes explicitly outlined by the phenotype definition.
  • Embeddings that include all available patient data.
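The wiki does not include the code used to build these embeddings; the following minimal Python sketch, with hypothetical names (build_patient_embeddings, patient_codes, code_vectors, definition_codes), shows one plausible way to produce both variants by averaging pre-trained code vectors per patient.

```python
import numpy as np

def build_patient_embeddings(patient_codes, code_vectors, definition_codes=None):
    """Average code vectors per patient to form patient-level embeddings.

    patient_codes: dict mapping patient id -> iterable of clinical/ontology codes.
    code_vectors: dict mapping code -> 1-D numpy array.
    definition_codes: if given, restrict to codes named in the phenotype
    definition; otherwise all available codes contribute.
    """
    embeddings = {}
    for patient, codes in patient_codes.items():
        if definition_codes is not None:
            codes = [c for c in codes if c in definition_codes]
        vectors = [code_vectors[c] for c in codes if c in code_vectors]
        if vectors:
            embeddings[patient] = np.mean(vectors, axis=0)
    return embeddings

# Variant 1: only codes explicitly named in the phenotype definition.
# definition_only = build_patient_embeddings(patient_codes, code_vectors, definition_codes)
# Variant 2: all available data for each patient.
# all_data = build_patient_embeddings(patient_codes, code_vectors)
```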

For all comparisons, the best-performing clinical and ontology code sets (without descendants) from the translatability experiments will be used as gold standards.
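How the embedding-derived cohorts are scored against these gold standards is not spelled out here; one simple, hypothetical way to compare a derived cohort against a gold-standard cohort is set overlap over patient ids, as in the Python sketch below (cohort_overlap and its arguments are illustrative, not part of PhenKnowVec).

```python
def cohort_overlap(predicted, gold_standard):
    """Compare an embedding-derived cohort against a gold-standard cohort;
    returns precision, recall, and F1 over patient ids."""
    predicted, gold_standard = set(predicted), set(gold_standard)
    tp = len(predicted & gold_standard)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold_standard) if gold_standard else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1
```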

The results from these experiments are organized by phenotype and listed below.


Results

  • ADHD
  • Appendicitis
  • Crohn's Disease
  • Hypothyroidism
  • Peanut Allergy
  • Sickle Cell Disease
  • Sleep Apnea
  • Steroid-Induced Osteonecrosis
  • Systemic Lupus Erythematosus