Ideas for a BioRuby Plugin to handle Next Generation Sequencing data, in particular RNA-seq data.
Git Repository is: https://github.com/helios/bioruby-ngs
The bio-ngs plugin will be used as a container for other NGS plugins that provide specific wrappers or bindings to existing tools. Here is a first list:
- bio-bwa Burrows-Wheeler Aligner
- bio-picard Picard
- Picard comprises Java-based command-line utilities that manipulate SAM files, and a Java API (SAM-JDK) for creating new programs that read and write SAM files.
- bio-samtools SAM (Sequence Alignment/Map)
- SAM (Sequence Alignment/Map) format is a generic format for storing large nucleotide sequence alignments.
- bio-qseq TODO: convert qseq files to FASTQ format.
The plugin will also include graphics libraries such as Rubyvis (http://rubyvis.rubyforge.org/) to generate reports on data quality, mapping results and other related statistics.
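For the bio-qseq item in the list above, the conversion could be sketched roughly as follows. This is only an illustration, not code from the plugin: the method name `qseq_to_fastq` is hypothetical, and the sketch assumes the usual 11 tab-separated qseq columns, with `.` marking an unknown base and Phred+64 encoded qualities that are shifted to the Phred+33 offset commonly used in FASTQ.

```ruby
# Sketch only: convert one qseq line into a four-line FASTQ record.
# Assumptions (not taken from the bio-ngs code base):
#   * 11 tab-separated columns: machine, run, lane, tile, x, y,
#     index, read number, sequence, quality, filter
#   * "." in the sequence marks an unknown base
#   * input qualities use offset 64, output qualities use offset 33
def qseq_to_fastq(line, qseq_offset: 64, fastq_offset: 33)
  fields = line.chomp.split("\t", -1)
  unless fields.size == 11
    raise ArgumentError, "expected 11 qseq fields, got #{fields.size}"
  end

  machine, run, lane, tile, x, y, index, read, seq, qual, _filter = fields

  # Read name built from the positional columns (naming scheme is illustrative).
  name = format("%s_%s:%s:%s:%s:%s#%s/%s",
                machine, run, lane, tile, x, y, index, read)

  bases = seq.tr(".", "N")
  quals = qual.each_char
              .map { |c| (c.ord - qseq_offset + fastq_offset).chr }
              .join

  "@#{name}\n#{bases}\n+\n#{quals}\n"
end
```

A whole file could then be converted by mapping this method over `File.foreach(path)`, which reads line by line and so avoids loading a large qseq file into memory.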
The main idea is to wrap standard NGS tools in Ruby and, where possible, to provide direct bindings to these tools.
This could be done, for example, for Picard via JRuby and for SAMtools using samtools-ruby (https://github.com/homonecloco/samtools-ruby). Every option needs to be tested to ensure good performance when handling large datasets.
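Where a direct binding is not available, wrapping a command-line tool could look something like the sketch below. The `CommandWrapper` class is hypothetical and is not part of bio-ngs; it only shows one way to turn a Ruby hash of options into an argument vector and run it with `Open3.capture3` from the standard library.

```ruby
require "open3"

# Hypothetical helper, for illustration only: wraps an external program
# and exposes its subcommands through plain Ruby calls.
class CommandWrapper
  def initialize(program)
    @program = program
  end

  # Build the argument vector without running anything.
  # options maps a flag to its value; true emits the bare flag,
  # nil or false skips the flag entirely.
  def build(subcommand, options = {}, *inputs)
    argv = [@program, subcommand]
    options.each do |flag, value|
      next if value.nil? || value == false
      argv << flag.to_s
      argv << value.to_s unless value == true
    end
    argv + inputs.map(&:to_s)
  end

  # Run the command and return its standard output.
  # Passing an array avoids shell interpolation of file names.
  def run(subcommand, options = {}, *inputs)
    stdout, stderr, status =
      Open3.capture3(*build(subcommand, options, *inputs))
    raise "#{@program} #{subcommand} failed: #{stderr}" unless status.success?
    stdout
  end
end

samtools = CommandWrapper.new("samtools")
argv = samtools.build("view", { "-b" => true, "-o" => "out.bam" }, "in.sam")
# argv is ["samtools", "view", "-b", "-o", "out.bam", "in.sam"]
```

Keeping `build` separate from `run` means the generated command line can be checked in unit tests on machines where the wrapped tool is not installed.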
bio-samtools is a Ruby binding to the popular SAMtools library. It provides access to individual read alignments as well as BAM files, reference sequences and pileup information.
Source code is available on GitHub at https://github.com/helios/bioruby-samtools.
Tutorial is available here: Bio-samtools
- create a BWA shared library for Linux and Mac OS X: DONE
- create a BioRuby plugin with binding to BWA: DONE. The code is available at https://github.com/fstrozzi/bioruby-bwa
- perform a real test to check the Ruby binding: DONE. Details available at https://github.com/fstrozzi/bioruby-bwa/wiki
- run a test phase to check if pre-compiled shared libraries work fine everywhere: TODO
see also Workflows
Using Rake or Thor to run NGS analyses
The bio-ngs plugin will implement a flexible Rake task system similar to the one in Rails, where custom tasks can be defined according to specific needs. Alternatively, Thor (https://github.com/wycats/thor) could be used instead of Rake.
This will allow bio-ngs users to run NGS analyses and pipelines directly using Rake and the functionality provided by BioRuby and the other Bio* plugins.
Please add the people involved in this topic.
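As a minimal sketch of how such a task system might be laid out, the example below chains three tasks in an `ngs` namespace. The task names and their bodies are placeholders invented for illustration; a real pipeline would call the relevant plugins inside each task.

```ruby
require "rake"
extend Rake::DSL # makes task/namespace available outside a Rakefile

steps_run = [] # records execution order, for illustration only

namespace :ngs do
  desc "Convert qseq files to FASTQ (placeholder body)"
  task :convert do
    steps_run << :convert
  end

  desc "Align reads to a reference (placeholder body)"
  task align: :convert do
    steps_run << :align
  end

  desc "Generate a data quality report (placeholder body)"
  task report: :align do
    steps_run << :report
  end
end

# Invoking the last task runs its prerequisites first.
Rake::Task["ngs:report"].invoke
```

Placed in a Rakefile (without the `require`, `extend` and `invoke` lines), the same definitions would be run from the shell with `rake ngs:report`, and Rake would resolve the prerequisite chain in the same order.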
bio-samtools: Raoul Bonnal
bio-bwa: Francesco Strozzi