Sparsity Exploiting K-mer based estimation of bacterial community composition
Latest commit 4ac3f2a Apr 9, 2015 @dkoslicki fixed missing taxa

SEK

SEK is a sparsity-exploiting, k-mer-based tool for estimating bacterial community composition.

What does this repository contain?

This repository is a Julia implementation of the SEK algorithm. For a Matlab implementation, see this website.

Requirements

  • Mac OS X 10.6.8 or GNU/Linux
  • At least 4 GB of RAM (a hard requirement)
  • gcc that supports OpenMP
  • dna_utils must be installed

Mac Requirements

  • Mac OS X 10.6.8 (what we have tested)
  • GCC 4.7 or newer (GCC 4.2, the default installation, did not work)
  • OpenMP libraries (libgomp, usually comes with gcc)

Linux Requirements

  • GCC 4.7 or newer
  • OpenMP libraries (libgomp, usually comes with gcc)
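
The compiler requirements above can be checked before installing anything. The following is a minimal sketch, not part of SEK: the names GCC_BIN, version_ge and check_gcc are hypothetical, and it assumes the compiler is invoked as gcc. It only confirms that the compiler accepts -fopenmp, not that libgomp links correctly.

```shell
#!/bin/sh
# Assumption: the compiler is called "gcc"; override GCC_BIN otherwise.
GCC_BIN=${GCC_BIN:-gcc}

# Return 0 if version $1 >= version $2 (major.minor only).
version_ge() {
    have_major=${1%%.*}
    need_major=${2%%.*}
    case $1 in
        *.*) have_minor=${1#*.}; have_minor=${have_minor%%.*} ;;
        *)   have_minor=0 ;;
    esac
    case $2 in
        *.*) need_minor=${2#*.}; need_minor=${need_minor%%.*} ;;
        *)   need_minor=0 ;;
    esac
    if [ "$have_major" -ne "$need_major" ]; then
        [ "$have_major" -gt "$need_major" ]
    else
        [ "$have_minor" -ge "$need_minor" ]
    fi
}

# Report whether the compiler is GCC 4.7+ and accepts -fopenmp.
check_gcc() {
    if ! command -v "$GCC_BIN" >/dev/null 2>&1; then
        echo "$GCC_BIN not found on PATH"
        return 1
    fi
    found=$("$GCC_BIN" -dumpversion)
    if ! version_ge "$found" 4.7; then
        echo "$GCC_BIN $found is older than 4.7"
        return 1
    fi
    # With -fopenmp enabled, GCC predefines the _OPENMP macro.
    if "$GCC_BIN" -fopenmp -dM -E -x c /dev/null 2>/dev/null | grep -q '_OPENMP'; then
        echo "$GCC_BIN $found with OpenMP: OK"
    else
        echo "$GCC_BIN $found does not accept -fopenmp"
        return 1
    fi
}
```

After sourcing the file, run check_gcc. On a Mac where gcc still points at the default 4.2 toolchain, it should report the version as too old.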

Installation

After cloning and installing the dna_utils repository, just clone this repository. Because the code consists of Julia scripts, no compilation is necessary.

Usage

The code only works on FASTA files (not FASTQ or any other format). Here's an example:

julia SEK.jl -i /path/to/FASTA.fa -o /path/to/Output.tsv

Other options are available; run julia SEK.jl -h to list them.
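
Because only FASTA input is accepted, it can help to check the input before starting a run. The wrapper below is a rough sketch and not part of SEK: guess_format and run_sek are hypothetical helper names, and the check is a heuristic that looks only at the first character of the first non-blank line ('>' for FASTA, '@' for FASTQ). It assumes it is run from the directory containing SEK.jl.

```shell
#!/bin/sh
# Print "fasta", "fastq", or "unknown" for the file given as $1.
guess_format() {
    # First character of the first non-blank line.
    first=$(awk 'NF { print substr($0, 1, 1); exit }' "$1")
    case $first in
        '>') echo fasta ;;
        '@') echo fastq ;;
        *)   echo unknown ;;
    esac
}

# Run SEK only if the input looks like FASTA.
# $1: input file, $2: output file (same -i / -o options as above).
run_sek() {
    format=$(guess_format "$1")
    if [ "$format" != fasta ]; then
        echo "refusing to run: $1 looks like $format, not FASTA" >&2
        return 1
    fi
    julia SEK.jl -i "$1" -o "$2"
}
```

For example, run_sek /path/to/FASTA.fa /path/to/Output.tsv behaves like the command above but stops early on a FASTQ file.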

The output format is consistent with the CAMI challenge and is similar to the output produced by MetaPhlAn.

Further Notes

If dna_utils installs its executable in a non-standard location, pass that location with the option -k /path/to/./kmer_counts_per_sequence
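
One way to decide whether -k is needed is sketched below. It is an illustration only: kmer_flag is a hypothetical helper, and it assumes that a "standard location" means the executable is reachable through PATH, which this README does not state explicitly.

```shell
#!/bin/sh
# Print the extra SEK arguments needed to locate kmer_counts_per_sequence.
# $1 (optional): explicit path to the executable.
kmer_flag() {
    if [ -n "$1" ] && [ -x "$1" ]; then
        # An explicit, executable path was given: pass it through -k.
        printf '%s\n' "-k $1"
    elif command -v kmer_counts_per_sequence >/dev/null 2>&1; then
        # Found on PATH (assumed to be the standard case): no -k needed.
        :
    else
        echo "kmer_counts_per_sequence not found; install dna_utils or give its path" >&2
        return 1
    fi
}
```

The result can be spliced into the command line, for example julia SEK.jl -i in.fa -o out.tsv $(kmer_flag /opt/dna_utils/kmer_counts_per_sequence), where /opt/dna_utils is a made-up location. The unquoted substitution splits on whitespace, so this form does not handle paths that contain spaces.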

Make sure your BLAS installation matches the architecture of your hardware; a mismatch can significantly increase computation time. We recommend using OpenBLAS.