MPRA

Please see the updated repository for this project at https://github.com/kundajelab/MPRA-DragoNN/.

This project applies convolutional neural networks to predict the output of massively parallel reporter assays (MPRAs), with the aim of systematically decoding regulatory sequence patterns and identifying disease-causing noncoding variants.

Most of the code for training and interpreting the models is in Jupyter notebooks (run with Python 2.7) in deeplearn/scripts/. The final model was trained in Keras 1.2.2 and can be loaded with the load_model or model_from_json methods using the weights file (deeplearn/model_files/sharpr_znormed_jul23/record_13_model_bgGhy_modelWeights.h5) and/or the architecture file (deeplearn/model_files/sharpr_znormed_jul23/record_13_model_bgGhy_modelJson.json).
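As a minimal sketch of the model_from_json route, the snippet below rebuilds the architecture from the JSON file and then loads the weights file into it. The helper name `load_sharpr_model` is hypothetical and not part of the repository. The sketch assumes it is run from the repository root with the pinned Keras version installed, and that the architecture needs no custom layers (if it does, `model_from_json` would need a `custom_objects` argument).

```python
import os

# Paths as given above, relative to the repository root.
MODEL_DIR = os.path.join("deeplearn", "model_files", "sharpr_znormed_jul23")
ARCH_PATH = os.path.join(MODEL_DIR, "record_13_model_bgGhy_modelJson.json")
WEIGHTS_PATH = os.path.join(MODEL_DIR, "record_13_model_bgGhy_modelWeights.h5")


def load_sharpr_model(arch_path=ARCH_PATH, weights_path=WEIGHTS_PATH):
    """Rebuild the architecture from JSON, then load the trained weights.

    Hypothetical helper for illustration; not part of the repository.
    """
    # Imported lazily so the path constants are usable without Keras installed.
    from keras.models import model_from_json

    with open(arch_path) as handle:
        model = model_from_json(handle.read())
    model.load_weights(weights_path)
    return model


if __name__ == "__main__":
    if os.path.exists(ARCH_PATH) and os.path.exists(WEIGHTS_PATH):
        model = load_sharpr_model()
        model.summary()
    else:
        print("Model files not found; run this from the repository root.")
```

Splitting the architecture and weights this way means only the two files listed above are needed; no full saved-model file is required.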

Dependencies:

numpy, scipy, pandas, seaborn, etc.
theano (0.9.0)
keras (1.2.2)
deeplift (0.5.2)
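
Because the model files depend on these specific versions, it can help to check the environment before opening the notebooks. The following is a rough sketch, not a script from the repository: the helper name `find_mismatches` is hypothetical, and it assumes each package exposes a `__version__` attribute (a package that does not is reported with `None`).

```python
import importlib

# Pinned versions taken from the dependency list above.
PINNED = {"theano": "0.9.0", "keras": "1.2.2", "deeplift": "0.5.2"}


def find_mismatches(pinned=PINNED):
    """Return {package: (expected, found)} for packages that differ from the pins.

    `found` is None when the package is missing or exposes no __version__.
    Hypothetical helper for illustration; not part of the repository.
    """
    mismatches = {}
    for name, expected in sorted(pinned.items()):
        try:
            module = importlib.import_module(name)
        except ImportError:
            found = None
        else:
            found = getattr(module, "__version__", None)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `find_mismatches()` with no arguments checks the three pinned packages; an empty dictionary means the installed versions match the list.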

Here are descriptions of the relevant Jupyter notebooks in deeplearn/scripts/:

Sharpr Model Interpretation.ipynb: Scatter plots for replicates (Figure 1C), prediction performances (Figure 2A), performances for specific chromatin states (Figure 2B), exploration of regulatory motifs and grammars learned by the model.
GBM Performance Benchmarking.ipynb: Training and performance testing of gradient boosting tree models (Figure S2).
Sharpr DeepLIFT Scoring Validation.ipynb: CENTIPEDE TFBS validation (Figure 2C-D), DeepLIFT motif scores, comparison to Sharpr (Figure 3).
Regulatory Grammar Discovery with Sharpr Models.ipynb: Exploration of predictive TF motif PWMs by comparing to DeepLIFT score profiles (Figure 4).
Variant Scoring.ipynb: Evaluation of SNPpet ISM scores for variant prioritization (Figures 5 and 6, Supp. Figures).
ggplot2 visualizations.ipynb: R notebook to generate several of the manuscript's figures.

Some code written to process the Sharpr-MPRA and SuRE-seq datasets is available in the folders "sharpr" and "sureseq" respectively. These scripts process the data and produce structured input/output matrices that are used to train the deep learning models.
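
To illustrate what a structured input matrix for a sequence model can look like, here is a minimal sketch of one-hot encoding a DNA sequence with numpy. This is an assumption about the general approach, not the repository's own code: the function name `one_hot_encode`, the A, C, G, T column order, and the (length, 4) layout are all hypothetical, and the scripts in "sharpr" and "sureseq" may use a different layout.

```python
import numpy as np

# Assumed column order; the repository's scripts may order bases differently.
BASES = "ACGT"
BASE_INDEX = {base: i for i, base in enumerate(BASES)}


def one_hot_encode(sequence):
    """Encode a DNA string as a (length, 4) float array in A, C, G, T order.

    Bases outside ACGT (for example N) are left as all-zero rows.
    Hypothetical helper for illustration; not part of the repository.
    """
    encoded = np.zeros((len(sequence), len(BASES)), dtype=np.float32)
    for position, base in enumerate(sequence.upper()):
        index = BASE_INDEX.get(base)
        if index is not None:
            encoded[position, index] = 1.0
    return encoded
```

Stacking the encoded sequences with `np.stack` would give a single (samples, length, 4) array, provided all sequences have the same length.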

Feel free to contact Rajiv Movva (rmovva at mit dot edu) with any questions.
