[Under construction] An R package for evaluating phenotype algorithms.

title: PheValuator
author: Joel Swerdel
date: August 13, 2019
output: html_document

PheValuator

An R package for evaluating phenotype algorithms.

Introduction

The goal of PheValuator is to produce a large cohort of subjects each with a predicted probability for a specified health outcome of interest (HOI). This is achieved by developing a diagnostic predictive model for the HOI using the PatientLevelPrediction (PLP) R package and applying the model to a large, randomly selected population. These subjects can be used to test one or more phenotype algorithms.

Process Steps

The first step in the process, developing the evaluation cohort, is shown below:

The model is created using a cohort of subjects with a very high likelihood of having the HOI. These "noisy" positives ("noisy" in that they are very likely positive for the HOI but not a true gold standard) are called the "xSpec" (extremely specific) cohort. This cohort will be the Outcome (O) cohort in the PLP model. There are several ways to create this cohort, but the simplest is to select subjects who have multiple condition codes for the HOI in their patient record. A typical threshold might be 5 or more condition codes for acute HOIs, such as myocardial infarction, or 10 or more condition codes for chronic HOIs, such as diabetes mellitus.

We also define a noisy negatives cohort by taking a random sample of the subjects in the database who have no evidence of the HOI. These subjects are identified by first creating a very sensitive cohort, in most cases subjects with 1 or more condition codes for the HOI, and excluding everyone in it from the noisy negatives cohort. The xSpec cohort and the noisy negatives cohort are combined to form the Target (T) cohort for the PLP model.

We then create a diagnostic predictive model with LASSO regularized regression using all the data in each subject's record. The data that inform this model are extracted using the FeatureExtraction package and include conditions, drug exposures, procedures, and measurements. The developed model has a set of features with beta coefficients that can be used to discriminate between those with the HOI and those without.
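The cohort definitions described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy data, the variable names (`hoiCodes`, `xSpecIds`, `noisyNegativeIds`, `targetCohort`), and the sample size are all hypothetical, and a real study would define these cohorts against the database rather than an in-memory table.

```r
# Toy condition-occurrence table: one row per recorded HOI condition code.
# (Hypothetical data; a real study would query the patient records.)
set.seed(42)
allSubjects <- 1:20
hoiCodes <- data.frame(
  subjectId = c(rep(1, 6), rep(2, 5), rep(3, 2), 4, rep(5, 7))
)

# Number of HOI condition codes per subject (0 for subjects with none).
codeCounts <- as.integer(table(factor(hoiCodes$subjectId, levels = allSubjects)))

# xSpec: extremely specific cohort, here 5 or more codes (acute HOI example).
xSpecIds <- allSubjects[codeCounts >= 5]

# Very sensitive cohort: 1 or more codes. Everyone in it is excluded
# from the pool of candidate noisy negatives.
sensitiveIds <- allSubjects[codeCounts >= 1]
negativePool <- setdiff(allSubjects, sensitiveIds)

# Noisy negatives: a random sample of subjects with no evidence of the HOI.
noisyNegativeIds <- sample(negativePool, size = 10)

# Target (T) cohort = xSpec + noisy negatives; the outcome (O) flag marks xSpec members.
targetCohort <- data.frame(
  subjectId = c(xSpecIds, noisyNegativeIds),
  outcome = c(rep(1L, length(xSpecIds)), rep(0L, length(noisyNegativeIds)))
)
```

Subjects 3 and 4 have some evidence of the HOI but too little to qualify for xSpec, so they land in neither group. This is the point of using a very sensitive exclusion cohort: ambiguous subjects are kept out of model training.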

We next create an "evaluation" cohort: a large group of randomly selected subjects used to evaluate the phenotype algorithms (PAs). The subjects are selected by pulling up to 1,000,000 subjects from the dataset. We extract the same covariates that we extracted from the T cohort in the model creation phase. We use the PLP function applyModel to apply the model to this large cohort, producing a probability of the HOI for each subject in the evaluation cohort. The subjects in this cohort, with their associated probabilities of the HOI, are used as a "gold" standard for the HOI. We save this output for use in the next step of the process.
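To make concrete what applying the model produces, here is a minimal base R sketch of scoring subjects with a fitted logistic model. It does not call PLP or applyModel. The intercept, the beta coefficients, the feature names, and the covariate values are all made up for illustration.

```r
# Hypothetical fitted model: intercept plus beta coefficients for the
# features that survived LASSO regularization (all values are made up).
intercept <- -3
betas <- c(hoiDiagnosis = 2.5, relatedDrug = 1.0, relatedProcedure = 0.5)

# Binary covariates for a tiny "evaluation cohort" (rows are subjects).
# A real evaluation cohort would hold up to 1,000,000 subjects.
covariates <- matrix(
  c(1, 1, 1,
    1, 0, 0,
    0, 1, 0,
    0, 0, 0),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, names(betas))
)

# Linear predictor, then the inverse logit to get the predicted
# probability of the HOI for each subject.
linearPredictor <- intercept + as.vector(covariates %*% betas)
evaluationCohort <- data.frame(
  subjectId = seq_len(nrow(covariates)),
  probability = plogis(linearPredictor)
)
```

Each row of `evaluationCohort` pairs a subject with a probability between 0 and 1, which is the form of output the evaluation step consumes.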

The second step in the process, evaluating the PAs, is shown below:

The next step in the process tests the PA(s). Phenotype algorithms are created based upon the needs of the research to be performed. Every subject in the evaluation cohort should be eligible to be included in the cohort developed from this algorithm. The figure describes how the predicted probabilities for subjects either included or excluded from the phenotype algorithm cohort are used to evaluate the PA. To fully evaluate a PA, you need to estimate the sensitivity, specificity, and positive and negative predictive values. These values are estimated through calculations involving subjects that are True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN). These statistics are generated using the predicted probabilities. Examples of the calculations are shown in the diagram. The formulas for the final calculations are also displayed.
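The calculation described above can be sketched in base R as follows. This is one reading of the expected-value approach, not the package's code: each subject contributes their predicted probability fractionally to the confusion matrix. The probabilities, the inclusion flags, and the variable names are hypothetical.

```r
# Predicted probability of the HOI for each subject in the evaluation cohort
# (made-up values), and whether the phenotype algorithm (PA) included them.
probability <- c(0.9, 0.8, 0.3, 0.1, 0.7, 0.05)
inPaCohort  <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)

# Each subject contributes fractionally to the confusion matrix:
# included subjects add p to TP and (1 - p) to FP;
# excluded subjects add p to FN and (1 - p) to TN.
tp <- sum(probability[inPaCohort])
fp <- sum(1 - probability[inPaCohort])
fn <- sum(probability[!inPaCohort])
tn <- sum(1 - probability[!inPaCohort])

# Performance characteristics from the expected counts.
sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
ppv <- tp / (tp + fp)
npv <- tn / (tn + fn)
```

Because every subject adds a total of 1 across the four cells, the expected counts always sum to the size of the evaluation cohort.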

The results from the evaluation for Opioid Abuse are shown below:

The diagram shows the complete performance evaluation of 5 PAs using the Expected Value approach described above, in which each subject's predicted probability is summed into the TP, FP, TN, and FN values. The full table created by the function also includes the performance characteristics at the prediction thresholds specified when running the function.
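For contrast with the Expected Value approach, a threshold-based evaluation can be sketched as below. This assumes that a subject is counted as a true case when their predicted probability is at or above the threshold. That rule, the helper `evaluateAtThreshold`, the thresholds, and the data are hypothetical and are not taken from the package.

```r
# Same made-up evaluation cohort: predicted probabilities and PA membership.
probability <- c(0.9, 0.8, 0.3, 0.1, 0.7, 0.05)
inPaCohort  <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)

# Hypothetical helper: dichotomize the probabilities at a threshold,
# then count whole subjects in each cell of the confusion matrix.
evaluateAtThreshold <- function(threshold) {
  isCase <- probability >= threshold
  tp <- sum(inPaCohort & isCase)
  fp <- sum(inPaCohort & !isCase)
  fn <- sum(!inPaCohort & isCase)
  tn <- sum(!inPaCohort & !isCase)
  data.frame(threshold, tp, fp, fn, tn,
             sensitivity = tp / (tp + fn),
             specificity = tn / (tn + fp),
             ppv = tp / (tp + fp),
             npv = tn / (tn + fn))
}

# One row of performance characteristics per threshold.
thresholdResults <- do.call(rbind, lapply(c(0.25, 0.5, 0.75), evaluateAtThreshold))
```

Unlike the expected-value counts, these counts are whole numbers and change with the threshold, which is why results at several thresholds are worth comparing side by side.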

Technology

PheValuator is an R package.

System Requirements

Requires R (version 3.3.0 or higher). Installation on Windows requires RTools.

Installation

  1. On Windows, make sure RTools is installed.
  2. The DatabaseConnector and SqlRender packages require Java. Java can be downloaded from http://www.java.com. Once Java is installed, make sure it is on the system path: under environment variables in the control panel, check that the location of the jvm.dll file has been added to the path.
  3. In R, use the following commands to download and install PheValuator:
install.packages("drat")
drat::addRepo("OHDSI")
install.packages("PheValuator")

User Documentation

Please read the main vignette for the package:

Getting Involved

We would like you to get involved in the development of this package through pull requests to our development branch.

Support

License

PheValuator is licensed under Apache License 2.0

Development

PheValuator is being developed in RStudio.

Development status

Beta

Acknowledgements

  • The package is maintained by Joel Swerdel and has been developed with major contributions from Jenna Reps, Peter Rijnbeek, Martijn Schuemie, Patrick Ryan, and Marc Suchard.