Panagiotis Papastamoulis edited this page Jan 28, 2015 · 20 revisions

Welcome to the BitSeqVB_benchmarking wiki!

These scripts can be used to replicate the simulation analysis presented in Section 3.1 (inference accuracy on synthetic data) of the BitSeqVB manuscript [1].

The following software is required:

The gcc compiler (release 4.8.2 or higher) should also be available on your machine.
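A quick way to check the compiler requirement is sketched below. It is not part of the repository scripts: the helper names (`parse_version`, `version_ok`, `gcc_version`) are hypothetical, and the sketch assumes `gcc` is on the `PATH` and understands the `-dumpversion` option.

```python
import re
import subprocess

MIN_GCC = (4, 8, 2)  # minimum release stated above


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def version_ok(found, minimum=MIN_GCC):
    """Compare two versions after padding the shorter tuple with zeros."""
    width = max(len(found), len(minimum))
    padded_found = found + (0,) * (width - len(found))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_found >= padded_minimum


def gcc_version():
    """Ask the gcc on PATH for its version (raises if gcc is missing or fails)."""
    result = subprocess.run(
        ["gcc", "-dumpversion"], capture_output=True, text=True, check=True
    )
    return parse_version(result.stdout)
```

Calling `version_ok(gcc_version())` returns `True` when the installed compiler meets the requirement. Recent gcc releases may print only the major version, which is why the comparison pads with zeros.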


This analysis is based on the UCSC/hg19 reference annotation (download link, ~21 GB). After downloading the annotation, follow the instructions in the simulationScripts/README file. The main jobscript is written in the commented file, consisting of the following steps:

  1. Choose dataset (4 simulation scenarios)
  2. Generate RPK values
  3. Simulate fastq files with spanki
  4. Align reads with bowtie
  5. Align reads with tophat
  6. Run BitSeqMCMC
  7. Run BitSeqVB
  8. Run Casper
  9. Run Cufflinks
  10. Run RSEM
  11. Run Sailfish
  12. Run Tigar2
  13. Run eXpress
  14. Produce graphs
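When planning how to divide the work, it can help to list the work units explicitly. The sketch below is only an illustration and is not taken from the jobscript: it assumes the four scenarios can be labelled 1 to 4 and that each quantification method (steps 6 to 13) can run independently once the reads for a scenario have been simulated and aligned.

```python
from itertools import product

# Assumption: the four simulation scenarios are labelled 1..4 here;
# the repository may name them differently.
SCENARIOS = [1, 2, 3, 4]

# Quantification methods from steps 6-13 of the list above.
METHODS = [
    "BitSeqMCMC", "BitSeqVB", "Casper", "Cufflinks",
    "RSEM", "Sailfish", "Tigar2", "eXpress",
]


def enumerate_jobs(scenarios=SCENARIOS, methods=METHODS):
    """Return one (scenario, method) pair per candidate work unit."""
    return list(product(scenarios, methods))


jobs = enumerate_jobs()
for scenario, method in jobs:
    print(f"scenario {scenario}: {method}")
```

Under these assumptions there are 4 x 8 = 32 candidate units. Steps 2 to 5 would still need to finish for a scenario before its quantification units start.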

For a reasonable computing time the user should split the jobscript into parallel ones according to the instructions given in

Warning: big data files will be generated

This downstream analysis was run under the Linux operating system on the High Performance Computing cluster (CSF) at the University of Manchester. The user has to make sure that at least 2.5 TB of free disk space is available.
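The disk space requirement can be checked before starting. The following is a minimal sketch using Python's standard library. It assumes the analysis will write to the current directory and reads the requirement as decimal terabytes; the helper name `has_enough_space` is hypothetical.

```python
import shutil

# Assumption: the 2.5 TB requirement is interpreted as decimal terabytes.
REQUIRED_BYTES = int(2.5 * 10**12)


def has_enough_space(path=".", required=REQUIRED_BYTES):
    """Return True if the filesystem holding path has at least `required` bytes free."""
    return shutil.disk_usage(path).free >= required


free_tb = shutil.disk_usage(".").free / 10**12
print(f"free space: {free_tb:.2f} TB; enough for the analysis: {has_enough_space()}")
```

Pass the directory where the output will be written as `path` if it differs from the current one.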


  1. J Hensman, P Papastamoulis, P Glaus, A Honkela, M Rattray (2014). Fast and accurate approximate inference of transcript expression from RNA-seq data. arXiv preprint arXiv:1412.5995