Prediction of variant Effect on Percent Spliced In
Authors: Robert Wang, Yaqiong Wang, Zhiqiang Hu
PEPSI is a tool for predicting the impact of coding and noncoding variation on pre-mRNA splicing, based on sequence conservation, RNA secondary structure, and regulatory sequence elements.
Please contact Robert Wang for feedback, questions, or bugs.
PEPSI requires Python 2 (>= 2.7.9).
PEPSI also requires R (>= 3.2.2), which is available from the R website.
Additional R packages needed are:
GNU parallel is available from its website. Download and install the latest version, and make sure that the path to the 'parallel' executable is included in your 'PATH' environment variable.
tabix can be downloaded from its website. Follow the installation guidelines on that page to install the latest version, and make sure that the path to the 'tabix' executable is included in your 'PATH' environment variable.
RNAplfold is part of the ViennaRNA suite, which can be downloaded from its website. Download the latest release of the ViennaRNA suite, follow the installation instructions, and make sure that the path to the 'RNAplfold' executable is included in your 'PATH' environment variable.
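Before installing PEPSI, it can help to confirm that the prerequisites above are visible from your shell. The following is a minimal sketch, not part of PEPSI: the helper names `check_tools` and `version_ge` are hypothetical, the version comparison assumes a `sort` that supports `-V` (for example GNU coreutils), and it assumes Python 2 is installed under the name `python2`.

```shell
# Hypothetical helpers for checking PEPSI's prerequisites; not part of PEPSI.

# check_tools NAME...: report whether each executable is on PATH.
# Returns 1 if any tool is missing, 0 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# version_ge A B: succeed when version A is at least version B.
# Assumes a sort that supports -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

check_tools python2 R parallel tabix RNAplfold ||
    echo "Install the missing tools or add them to PATH before running PEPSI."

# Compare the installed versions against the stated minimums.
if command -v python2 >/dev/null 2>&1; then
    # Python 2 prints its version to stderr, e.g. "Python 2.7.18".
    py_version=$(python2 --version 2>&1 | awk '{print $2}')
    if version_ge "$py_version" 2.7.9; then
        echo "Python $py_version: ok"
    else
        echo "Python $py_version is older than 2.7.9"
    fi
fi
if command -v R >/dev/null 2>&1; then
    # The first line of `R --version` looks like "R version 3.6.3 (...)".
    r_version=$(R --version | awk 'NR == 1 {print $3}')
    if version_ge "$r_version" 3.2.2; then
        echo "R $r_version: ok"
    else
        echo "R $r_version is older than 3.2.2"
    fi
fi
```

This check does not cover the additional R packages, which would need to be verified from within R.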
PEPSI can be downloaded by cloning its repository with git:
git clone https://github.com/rwang916/PEPSI.git
Once downloaded, run the following commands to download the files needed to run PEPSI:
cd PEPSI/src
bash vs_build.sh
To run PEPSI, use the following command:
bash vs_main.sh <path_to_training_set> <path_to_test_set> <number_of_threads> <path_to_training_set_sequences> <path_to_test_set_sequences>
For example:
bash vs_main.sh ../data/vexseq/HepG2_delta_PSI_CAGI_training.tsv ../data/vexseq/HepG2_delta_PSI_CAGI_test.tsv 20 ../data/vexseq/vs.sequences.tsv ../data/vexseq/vs.sequences.tsv
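Because vs_main.sh takes five positional arguments, a mistyped path or thread count is easy to miss. The sketch below shows one way to validate the arguments before launching a run. The function `validate_pepsi_args` is a hypothetical helper, not part of PEPSI, and it only checks what the usage line above implies: four readable input files and a positive integer thread count.

```shell
# Hypothetical helper (not part of PEPSI): validate the five vs_main.sh arguments
# in the order training set, test set, threads, training sequences, test sequences.
validate_pepsi_args() {
    if [ "$#" -ne 5 ]; then
        echo "expected 5 arguments, got $#" >&2
        return 1
    fi
    # Argument 3 is the thread count; it must consist of digits only.
    case "$3" in
        ''|*[!0-9]*)
            echo "thread count must be a positive integer: $3" >&2
            return 1
            ;;
    esac
    if [ "$3" -lt 1 ]; then
        echo "thread count must be at least 1: $3" >&2
        return 1
    fi
    # The remaining arguments are input files that must be readable.
    for f in "$1" "$2" "$4" "$5"; do
        if [ ! -r "$f" ]; then
            echo "cannot read input file: $f" >&2
            return 1
        fi
    done
    return 0
}

# Example use with the files from the command above (left commented out):
# validate_pepsi_args ../data/vexseq/HepG2_delta_PSI_CAGI_training.tsv \
#     ../data/vexseq/HepG2_delta_PSI_CAGI_test.tsv 20 \
#     ../data/vexseq/vs.sequences.tsv ../data/vexseq/vs.sequences.tsv &&
#     echo "arguments look valid"
```

If the check succeeds, the same five arguments can be passed unchanged to vs_main.sh.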