Sambamba

Sambamba is a high-performance, robust tool (and library), written in the D programming language, for working with SAM and BAM files. Current functionality is an important subset of samtools functionality, including view, index, sort, markdup, and depth. Most tools support piping: just specify /dev/stdin or /dev/stdout as filenames.
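As a minimal sketch of the piping convention, the helper below reads SAM from standard input and writes a BAM file. The function name `sam_stream_to_bam` and the file names are hypothetical, and the `view` options used here (`-S` for SAM input, `-f bam` for the output format, `-o` for the output file) are worth checking against the manual pages for your version.

```shell
# Hypothetical helper: convert a SAM stream on stdin into a BAM file.
# Usage: <producer> | sam_stream_to_bam out.bam
sam_stream_to_bam() {
    bam_out="$1"
    # /dev/stdin stands in for the input filename
    sambamba view -S -f bam -o "$bam_out" /dev/stdin
}

# Example (assumes alignments.sam exists, or any tool emitting SAM on stdout):
#   cat alignments.sam | sam_stream_to_bam alignments.bam
```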

For almost 5 years the main advantage over samtools was parallelized BAM reading. In March 2017, samtools 1.4 was released and reached parity on this. That said, we still have quite a few interesting features to offer:

  • faster sort (no benchmarks yet, sorry)
  • automatic index creation when writing any coordinate-sorted file
  • view -L <bed file> utilizes BAM index to skip unrelated chunks
  • depth measures base, sliding-window, or region coverage
    • Chanjo builds upon this and gets you to exon/gene levels of abstraction
  • markdup, a fast implementation of the Picard algorithm
  • slice quickly extracts a region into a new file, tweaking only first/last chunks
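A few of these features can be sketched as command lines. This is a dry-run illustration, not taken from the manual: the `run` wrapper, the `DRY_RUN` variable, and all file and region names are hypothetical, and the options should be verified against the manual pages before use.

```shell
# Hypothetical dry-run wrapper: prints each command; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

bam=sample.bam      # placeholder: a coordinate-sorted, indexed BAM file
bed=targets.bed     # placeholder: a BED file of regions of interest

# Keep only reads overlapping the BED regions, using the index to skip chunks
run sambamba view -f bam -L "$bed" -o targets.bam "$bam"
# Per-region coverage, written to standard output
run sambamba depth region -L "$bed" "$bam"
# Extract one region into a new BAM file
run sambamba slice -o region.bam "$bam" chr1:1000000-2000000
# Mark duplicates (input file, then output file)
run sambamba markdup "$bam" dedup.bam
```

With the default `DRY_RUN=1` the script only echoes the commands, which makes it safe to adapt the file names before running anything.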

Sambamba is free and open source software, licensed under GPLv2+. See the online manual pages to learn more about what is available and how to use it.

For more information on Sambamba you can contact Artem Tarasov and Pjotr Prins.

Binaries

With Conda use the bioconda channel.

A GNU Guix package for sambamba is available.

Debian: coming soon.

Users of Homebrew can also use the formula from homebrew-science.

For those not in the mood to learn or install new package managers, there are of course GitHub releases.

Compiling Sambamba

The preferred method for compiling Sambamba is with the LDC compiler which targets LLVM.

Compiling for Linux

The LDC compiler's GitHub repository also provides binary images. The currently preferred release for sambamba is LDC (the LLVM D compiler) >= 1.1.0. After installing LDC:

    git clone --recursive https://github.com/lomereiter/sambamba.git
    cd sambamba
    git clone https://github.com/dlang/undeaD
    make sambamba-ldmd2-64

Installing LDC only means unpacking an archive and setting some environment variables. For example, to unpack into $HOME, with $ver set to the LDC release number you want:

    cd
    wget https://github.com/ldc-developers/ldc/releases/download/v$ver/ldc2-$ver-linux-x86_64.tar.xz
    tar xJf ldc2-$ver-linux-x86_64.tar.xz
    export PATH=~/ldc2-$ver-linux-x86_64/bin/:$PATH
    export LIBRARY_PATH=~/ldc2-$ver-linux-x86_64/lib/
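After setting the variables, it can help to confirm that the compiler is actually visible before running make. The following is a small sketch; `check_tool` is a hypothetical helper, and it assumes the LDC archive provides both `ldc2` and the `ldmd2` wrapper in its bin/ directory.

```shell
# Hypothetical helper: report whether a tool is reachable through PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found in PATH" >&2
        return 1
    fi
}

# ldc2 is the compiler driver; ldmd2 is its DMD-compatible wrapper,
# which the sambamba-ldmd2-64 make target relies on.
for tool in ldc2 ldmd2; do
    check_tool "$tool" || echo "hint: re-check the PATH export for LDC"
done
```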

GNU Guix

The LDC compiler needed to build sambamba is also available in GNU Guix:

    guix package -i ldc

Compiling for Mac OS X

    brew install ldc
    git clone --recursive https://github.com/lomereiter/sambamba.git
    cd sambamba
    git clone https://github.com/dlang/undeaD
    make sambamba-ldmd2-64

Troubleshooting

In case of crashes it is helpful to have GDB stack traces (the bt command). To get a full stack trace for all threads:

    thread apply all backtrace full

Note that GDB should be told to ignore the signals used by the D garbage collector:

    handle SIGUSR1 SIGUSR2 nostop noprint

A relocatable binary install of sambamba with debug information can be fetched, verified, and unpacked with:

    wget http://biogems.info/contrib/genenetwork/s7l4l5jnrwvvyr3pva242yakvmbfpm06-sambamba-0.6.6-pre3-6ae174b-debug-x86_64.tar.bz2
    md5sum *sambamba*.tar.bz2   # expected: ca64fd6f2fa2ba901937afc6b189e98d
    mkdir tmp
    cd tmp
    tar xvjf ../*sambamba*.tar.bz2

Then run the contained install.sh script with the target directory as its argument:

    ./install.sh ~/sambamba-test

Run sambamba in gdb with:

    gdb --args ~/sambamba-test/sambamba-*/bin/sambamba view --throw-error
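The GDB steps in this section can also be combined into one non-interactive run, which is convenient when attaching a trace to a bug report. This is a sketch only: `sambamba_backtrace` is a hypothetical helper name, and it relies on GDB's standard `-batch`, `-ex`, and `--args` options.

```shell
# Hypothetical helper: run a command under GDB in batch mode and print
# a full backtrace for all threads if it stops (for example on a crash).
# Usage: sambamba_backtrace <program> [arguments...]
sambamba_backtrace() {
    gdb -batch \
        -ex 'handle SIGUSR1 SIGUSR2 nostop noprint' \
        -ex 'run' \
        -ex 'thread apply all backtrace full' \
        --args "$@"
}

# Example (paths are placeholders):
#   sambamba_backtrace ~/sambamba-test/sambamba-*/bin/sambamba view input.bam > trace.txt 2>&1
```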

Development

Sambamba development and the issue tracker are on GitHub. Developer documentation can be found in the source code and the development documentation.

Copyright

Sambamba is distributed under the GNU General Public License v2+.

Citation

If you are using Sambamba in your research, please cite the following article:

A. Tarasov, A. J. Vilella, E. Cuppen, I. J. Nijman, and P. Prins. Sambamba: fast processing of NGS alignment formats. Bioinformatics, 2015.