Applying Machine Learning Ras, NF1, and TP53 Classifiers to PDX model gene expression

Applying Machine Learning Classifiers to Pediatric Patient Derived Xenograft Expression Data

Gregory Way, Jo Lynne Harenza, John Maris, 2018


Here, we apply three classifiers (Ras activation, NF1 inactivation, and TP53 inactivation) to Target Patient-Derived Xenograft (PDX) RNAseq data. The classifiers were previously trained on data from The Cancer Genome Atlas (TCGA) PanCanAtlas Project (Way et al. 2018; Knijnenburg et al. 2018).

Computational Environment

We use conda as an environment manager. To reproduce the computational environment used in this pipeline, run:

# Using conda version >4.5
conda env create --force --file environment.yml

conda activate expression-classification


The following notebooks describe the analysis pipeline:

| Notebook | Description |
| :------- | :---------- |
| 1.apply-classifier.ipynb | Apply the classifiers trained previously on the input data |
| 2.evaluate-classifier.ipynb | Investigate and evaluate the prediction performance and score distribution for input data |
| 3.explore-variants.ipynb | Explore the classifier predictions across genes, variants, and outliers |
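
To make the "apply" step more concrete, here is a minimal sketch of scoring new samples with a previously trained linear classifier. It is not the code in `1.apply-classifier.ipynb`. It assumes the classifier can be represented as gene-wise weights plus an intercept, and that expression is standardized per gene before scoring. The function names, sample names, gene names, and all numbers are hypothetical.

```python
import math
from statistics import mean, pstdev


def zscore_by_gene(expression):
    """Scale each gene to mean 0 and unit variance across samples."""
    genes = list(next(iter(expression.values())))
    scaled = {sample: {} for sample in expression}
    for gene in genes:
        values = [expression[s][gene] for s in expression]
        mu, sd = mean(values), pstdev(values)
        for s in expression:
            # A gene with no variance carries no signal, so map it to 0.
            scaled[s][gene] = (expression[s][gene] - mu) / sd if sd > 0 else 0.0
    return scaled


def apply_classifier(expression, coefficients, intercept=0.0):
    """Return a score in (0, 1) per sample from a linear classifier."""
    scaled = zscore_by_gene(expression)
    scores = {}
    for sample, genes in scaled.items():
        # Genes missing from the input contribute nothing to the score.
        logit = intercept + sum(
            weight * genes.get(gene, 0.0) for gene, weight in coefficients.items()
        )
        scores[sample] = 1.0 / (1.0 + math.exp(-logit))
    return scores


# Toy input: hypothetical sample names, expression values, and weights.
expression = {
    "pdx_a": {"GENE1": 8.0, "GENE2": 1.0},
    "pdx_b": {"GENE1": 2.0, "GENE2": 5.0},
    "pdx_c": {"GENE1": 5.0, "GENE2": 3.0},
}
coefficients = {"GENE1": 1.5, "GENE2": -0.5, "GENE3": 0.7}

scores = apply_classifier(expression, coefficients)
for sample, score in sorted(scores.items()):
    print(f"{sample}\t{score:.3f}")
```

In this toy data, `pdx_a` scores above 0.5, `pdx_b` below it, and `pdx_c` (which sits at the mean for both genes) at exactly 0.5. Scores like these are what a later evaluation step would compare against known alteration status.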

To rerun all scripts, perform the following:

# First, download the gene expression and alterations data

# Make sure to activate the conda environment
conda activate expression-classification

# Run the pipeline to extract results, figures, and convert notebooks for easy viewing