Overview

This repository tests the performance of rnabridge-align and rnabridge-denovo. Here we provide scripts to download datasets, run these tools and reproduce the results and figures in the manuscript.

The pipeline involves the following five steps:

1. Download necessary datasets (data directory).
2. Download and/or compile necessary programs (programs directory).
3. Run the methods and produce results regarding rnabridge-align (align directory).
4. Run the methods and produce results regarding rnabridge-denovo (denovo directory).
5. Summarize results and produce figures (plots directory).
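The order of these steps can be sketched as a small driver script. This is only an illustration: the script names are the ones mentioned in this README, while `DRY_RUN`, `run_step`, and `pipeline` are hypothetical helpers that are not part of the repository. The denovo step is left out because this README does not name its scripts, and the programs step is manual.

```shell
#!/usr/bin/env bash
# Sketch only: run the pipeline steps in order.
# DRY_RUN, run_step and pipeline are hypothetical names, not part of the repo.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run_step() {
    # $1 = directory, $2 = script to run from inside that directory
    local dir="$1" script="$2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "(cd $dir && $script)"
    else
        (cd "$dir" && "$script")
    fi
}

pipeline() {
    # Step 1: annotations only; reads and alignments come from Data Commons.
    run_step data ./download.annotation.sh || return 1
    # Step 2 (programs) is done by hand; step 4 (denovo) is not covered here.
    # Step 3: rnabridge-align experiments, then collect accuracies.
    run_step align ./run.simulation80.sh || return 1
    run_step align ./run.encode10.sh || return 1
    run_step align ./collect.sh || return 1
    # Step 5: figures.
    run_step plots ./build.figures.sh || return 1
}
```

Calling `pipeline` with the default `DRY_RUN=1` only prints the commands, which is a cheap way to check the order before committing to a long run.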

Datasets

We evaluate the tools on two datasets, namely simulation80 and encode10. We also need the reference annotation files for evaluating reference-based transcript assembly. In the data directory, we provide metadata for these datasets, along with scripts to download them.

simulation80

The data was simulated with Flux-Simulator. We varied two parameters: the average fragment length (300 and 500) and the read length (75 and 100). For each of the four combinations, we simulated 20 samples, giving 80 samples in total. The reads, ground-truth transcripts, and alignments (using STAR) can be downloaded through Penn State Data Commons (https://doi.org/10.26208/b01x-aq20).
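To make the size of the parameter grid concrete, the sketch below enumerates every combination of fragment length, read length, and sample index. The function name and the label format are hypothetical; the actual file names in the download may differ.

```shell
# Sketch: enumerate the simulation80 parameter grid (2 x 2 x 20 = 80 samples).
# list_simulation80 and the label format are illustrative assumptions only.
list_simulation80() {
    local frag rlen i
    for frag in 300 500; do              # average fragment length
        for rlen in 75 100; do           # read length
            for i in $(seq 1 20); do     # 20 samples per combination
                echo "frag${frag}.read${rlen}.sample${i}"
            done
        done
    done
}
```

Piping the output through `grep -c .` counts the labels and should confirm the total of 80.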

encode10

This dataset contains 10 human RNA-seq samples downloaded from ENCODE. It has also been used in scalloptest. All of these samples were sequenced with strand-specific, paired-end protocols. We aligned each of the 10 samples with two RNA-seq aligners, STAR and HISAT2. You may download all of these read alignments via Penn State Data Commons (https://doi.org/10.26208/8c06-w247).

annotations

Use the following script in data to download annotations:

./download.annotation.sh

The downloaded files will appear under data/ensembl.

Programs

Our experiments (used in the manuscript) involve the following programs:

Program	Version	Description
rnabridge-align	v1.0.1	bridging RNA-seq alignments
Scallop	v0.10.5	transcript assembler
StringTie	v2.1.4	transcript assembler
gffcompare	v0.11.2	evaluate assembled transcripts
gtfcuff		a set of utilities for processing RNA-seq data

You need to download and/or compile them, and then link them into the programs directory. Make sure that the program names are in lowercase (i.e., stringtie, scallop, and gffcompare) in the programs directory.
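One way to do the linking is with a symbolic link whose name is forced to lowercase. The helper below is a minimal sketch under that assumption; `link_program` is a hypothetical function, not something the repository provides, and the path in the usage note is a placeholder.

```shell
# Sketch: link a compiled binary into a programs directory under a lowercase name.
# link_program is a hypothetical helper, not part of the repository.
link_program() {
    # $1 = path to the binary, $2 = programs directory
    local src_dir name dest_dir="$2"
    # Resolve an absolute path so the symlink stays valid from any directory.
    src_dir="$(cd "$(dirname "$1")" && pwd)" || return 1
    # The link name is the binary's base name converted to lowercase.
    name="$(basename "$1" | tr '[:upper:]' '[:lower:]')"
    mkdir -p "$dest_dir"
    ln -sf "$src_dir/$(basename "$1")" "$dest_dir/$name"
}
```

For example, `link_program /path/to/StringTie programs` would create `programs/stringtie`, where `/path/to/StringTie` stands for wherever the binary was built.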

Generate Results for Evaluating rnabridge-align

Once the datasets and programs are available, use the following scripts in align to run the experiments:

./run.simulation80.sh
./run.encode10.sh

You can modify each of these scripts to run with different parameters. For each run, you need to specify a run-id, which is used later when collecting the results.

After the experiments finish running, the following script collects the accuracies:

./collect.sh

This will write the results to a directory named results.RUN-ID, which can be used directly by the scripts that generate the figures (below).
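Before moving on to the figures, it can help to confirm that the results directory for a given run-id actually exists. The sketch below assumes the results.RUN-ID naming described above; `check_results` and its optional base-directory argument are hypothetical and not part of the repository.

```shell
# Sketch: check that results.RUN-ID exists for a given run-id.
# check_results is a hypothetical helper; the base directory defaults to ".".
check_results() {
    # $1 = run-id, $2 = directory expected to contain results.RUN-ID (optional)
    local run_id="$1" base="${2:-.}"
    local dir="$base/results.$run_id"
    if [ -d "$dir" ]; then
        echo "$dir"                       # print the path on success
    else
        echo "missing: $dir" >&2          # report the problem on stderr
        return 1
    fi
}
```

For instance, `check_results D400 align` would print the directory path if a run with run-id D400 has been collected under align, and return a non-zero status otherwise.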

Analyze Results and Reproduce Figures

Once the results have been generated, use the following script in plots to reproduce the figures:

./build.figures.sh

You may need to install the R package tikzDevice. You may also need to modify these scripts to match the run-id(s) you specified. The results used in the manuscript (run-id = D400) have been updated in this repo (including the GTEx dataset), so running the above script directly can generate all figures used in the manuscript.

Shao-Group/rnabridge-test
