RandAL - Randomized Sequence Aligner

RandAL is a tool for aligning DNA sequences to reference genomes. It is based on a new randomized algorithm and is distinguished by high performance across a wide range of read lengths and base error rates.

RandAL is implemented in C++; the FM-index code is adapted from an external library (http://code.google.com/p/fmindex-plus-plus). The tool has been tested on Debian Squeeze 6.0.6 and Mac OS 10.7.5.

More details about RandAL can be found in our paper: Nam S. Vo, Quang Tran, Nobal Niraula, Vinhthuy Phan. "RandAL: a randomized approach to aligning DNA sequences to reference genomes." BMC Genomics 2014, 15(Suppl 5):S2, doi:10.1186/1471-2164-15-S5-S2 (http://www.biomedcentral.com/1471-2164/15/S5/S2).

Directory organization:

./src: source code of RandAL.

See MANUAL for detailed information on how to use RandAL.
See also LICENSE, VERSION, and CHANGELOG for other information.

./data: several datasets for testing.

./scripts: scripts to support running and testing RandAL.

Sample datasets for testing RandAL:

./data/genomes: stores two reference genomes.

./data/genomes/Staphylococcus.fasta (bacterium, http://www.ebi.ac.uk/ena/data/view/Taxon:663951)
./data/genomes/Drosophila melanogaster chromosome 3R.fasta (eukaryote, http://www.ebi.ac.uk/ena/data/view/AE014297)

The genomes are taken from EBI (http://www.ebi.ac.uk). Users can also find reference genomes at NCBI (http://www.ncbi.nlm.nih.gov).

./data/reads: stores simulated reads for the genomes above.

./data/reads/Staphylococcus: simulated reads for Staphylococcus.
./data/reads/Drosophila3R: simulated reads for Drosophila melanogaster chromosome 3R.

The reads were generated with the wgsim simulator (https://github.com/lh3/wgsim): 100,000 reads with lengths from 35 bp to 400 bp, using the default settings. See https://github.com/lh3/wgsim for detailed information on how to generate simulated reads and evaluate alignment results.
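As a rough illustration of the read generation step, the sketch below builds wgsim command lines for several read lengths. The flags -N (number of read pairs), -1 and -2 (lengths of the first and second read) are wgsim options; the file names, the chosen lengths, and the helper wgsim_command are hypothetical, and this is not necessarily how the bundled reads were produced.

```python
def wgsim_command(ref_fasta, read_len, n_pairs=100000, out_prefix="reads"):
    """Build a wgsim command (as an argument list) for one read length.

    Output file names are hypothetical; all other wgsim settings
    are left at their defaults.
    """
    return [
        "wgsim",
        "-N", str(n_pairs),   # number of read pairs to simulate
        "-1", str(read_len),  # length of the first read
        "-2", str(read_len),  # length of the second read
        ref_fasta,
        f"{out_prefix}_{read_len}_1.fq",
        f"{out_prefix}_{read_len}_2.fq",
    ]


if __name__ == "__main__":
    # Print one command per read length; run them in a shell or via subprocess.
    for length in (35, 100, 400):
        print(" ".join(wgsim_command("Staphylococcus.fasta", length)))
```

The commands are only printed here, so the sketch can be inspected without wgsim installed.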

Supporting scripts to run and evaluate RandAL:

./scripts/do_exps.py: Python script to test RandAL with multiple simulated datasets.

Usage: python do_exps.py -r ref -l read_len -e error_rate -o result_file_name
Example: python do_exps.py -r Stap -r Dros -l 35 -l 51 -e 0.02 -e 0.04 -o overall_results.txt
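The example above repeats the -r, -l, and -e options. A minimal sketch of how such a command line could be parsed in Python is shown below, using argparse with action="append". It assumes that every combination of reference, read length, and error rate is run; the actual behavior of do_exps.py may differ, and the names parse_args and experiment_grid are hypothetical.

```python
import argparse
from itertools import product


def parse_args(argv):
    """Parse a do_exps.py-style command line (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of a do_exps.py-style CLI")
    # action="append" collects every occurrence of a repeated option into a list.
    parser.add_argument("-r", dest="refs", action="append", required=True)
    parser.add_argument("-l", dest="read_lens", action="append", type=int, required=True)
    parser.add_argument("-e", dest="error_rates", action="append", type=float, required=True)
    parser.add_argument("-o", dest="result_file", required=True)
    return parser.parse_args(argv)


def experiment_grid(args):
    """Assumption: one experiment per (reference, read length, error rate)."""
    return list(product(args.refs, args.read_lens, args.error_rates))


if __name__ == "__main__":
    example = "-r Stap -r Dros -l 35 -l 51 -e 0.02 -e 0.04 -o overall_results.txt"
    for ref, read_len, error_rate in experiment_grid(parse_args(example.split())):
        print(ref, read_len, error_rate)
```

Under that assumption, the example command would expand to eight experiments (two references, two read lengths, two error rates).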

./scripts/wgsim_eval.pl: Perl script to evaluate SAM output and print the results to the screen. Original code from https://github.com/lh3/wgsim.

./scripts/wgsim_eval_tofile.pl: Perl script to evaluate SAM output and write the results (including incorrectly mapped reads) to files; used by do_exps.py. Modified from wgsim_eval.pl.
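To show the idea behind this kind of evaluation, here is a simplified Python sketch. wgsim encodes the true origin of each read pair in the read name (contig, start, end, followed by error counts and a hexadecimal counter), so a SAM record can be checked against it. The sketch is a stand-in, not a port of wgsim_eval.pl: the correctness criterion, the tolerance of 5 bases, and the function names are assumptions made for illustration.

```python
import re

# wgsim read names end with: _<start>_<end>_<e:s:i>_<e:s:i>_<hex counter>[/mate]
# The contig name may itself contain underscores, so it is matched greedily.
WGSIM_NAME = re.compile(
    r"^(?P<contig>.+)_(?P<start>\d+)_(?P<end>\d+)"
    r"_\d+:\d+:\d+_\d+:\d+:\d+_[0-9a-f]+(?:/[12])?$"
)


def true_interval(read_name):
    """Return (contig, start, end) encoded in a wgsim read name."""
    match = WGSIM_NAME.match(read_name)
    if match is None:
        raise ValueError(f"not a wgsim read name: {read_name!r}")
    return match.group("contig"), int(match.group("start")), int(match.group("end"))


def is_correct(sam_line, tolerance=5):
    """Simplified check: mapped to the true contig, with the reported
    leftmost position inside the true fragment (plus/minus tolerance)."""
    fields = sam_line.rstrip("\n").split("\t")
    qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
    if flag & 0x4:  # SAM flag bit 0x4 marks an unmapped read
        return False
    contig, start, end = true_interval(qname)
    return rname == contig and start - tolerance <= pos <= end + tolerance
```

A stricter evaluation would also take the strand and the mate number into account when choosing which end of the fragment to compare against.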

Contact:

nsvo1@memphis.edu

qmtran@memphis.edu

vphan@memphis.edu
