libra
Compute the Similarity between Metagenomic Samples
Download a pre-built binary (compiled with Java 7, including dependencies):
For old releases, check out the release page:
Most users do not need to build a binary from source. Use pre-built binaries.
To build from source, use the Ant build system.
Run the following to build without dependencies:
ant
Run the following to build with dependencies (recommended):
ant allinone
The built jar package will be located in the /dist directory.
Preprocessing FASTA/FASTQ files
hadoop jar libra-all.jar preprocess -k 20 -t 8 -o /index_dir /source_dir
Preprocessing Options
- k : k-mer size
- t : number of tasks (reducers). 1 by default.
- s : minimum size of a file group in bytes. 10GB by default. A separate index file is created for each file group.
- g : maximum number of groups. 20 by default. If the number of groups produced by the "-s" option exceeds this value, groups are combined.
- f : k-mer filter algorithm. NONE | STDDEV (one standard deviation) | STDDEV2 (two standard deviations) | NOTUNIQUE (default)
- o : output directory
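For orientation, the -k option sets the length of the substrings (k-mers) that are counted in each sample. The following is a minimal Python sketch of k-mer counting on a single sequence. It is not Libra's Hadoop implementation: the function name count_kmers and the choice to skip k-mers containing characters other than A, C, G, T are assumptions made for illustration, and reverse complements are not merged.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping k-mer in a DNA sequence.

    Assumption for this sketch: k-mers containing anything other than
    A, C, G or T (for example N) are skipped.
    """
    seq = sequence.upper()
    valid = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:
            counts[kmer] += 1
    return counts


# Toy example with k = 3; the command above uses k = 20 on real reads.
counts = count_kmers("ACGTACGTNACG", k=3)
for kmer, n in sorted(counts.items()):
    print(kmer, n)
```

A larger k makes each k-mer more specific to its source genome, at the cost of a larger index.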
Scoring
hadoop jar libra-all.jar core -w LOGARITHM -o /score_dir /index_dir
Scoring Options
- s : scoring algorithm. COSINESIMILARITY (default) | BRAYCURTIS | JENSENSHANNON
- m : run mode. MAP (default) | REDUCE
- w : weighting algorithm. LOGARITHM (default) | BOOLEAN | NATURAL
- o : output directory
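To give a feel for what the weighting and scoring options mean, here is a small Python sketch that applies a weighting to k-mer counts and then compares two samples. It uses the textbook definitions of cosine similarity, Bray-Curtis dissimilarity, and Jensen-Shannon divergence, which may differ in detail from what Libra computes. In particular, the mapping of LOGARITHM to 1 + ln(count), of NATURAL to the raw count, and of BOOLEAN to presence/absence is an assumption, and all function names are hypothetical.

```python
import math


def weight(count, scheme="LOGARITHM"):
    """Weight a raw k-mer count (assumed meaning of each scheme)."""
    if count <= 0:
        return 0.0
    if scheme == "BOOLEAN":
        return 1.0                      # presence / absence
    if scheme == "NATURAL":
        return float(count)             # raw count
    if scheme == "LOGARITHM":
        return 1.0 + math.log(count)    # dampened count
    raise ValueError("unknown weighting scheme: " + scheme)


def weighted(counts, scheme):
    return {kmer: weight(c, scheme) for kmer, c in counts.items()}


def cosine_similarity(a, b, scheme="LOGARITHM"):
    """1.0 for identical profiles, 0.0 when no k-mer is shared."""
    wa, wb = weighted(a, scheme), weighted(b, scheme)
    dot = sum(wa[k] * wb[k] for k in wa.keys() & wb.keys())
    norm_a = math.sqrt(sum(v * v for v in wa.values()))
    norm_b = math.sqrt(sum(v * v for v in wb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def bray_curtis_dissimilarity(a, b, scheme="LOGARITHM"):
    """0.0 for identical profiles, 1.0 when no k-mer is shared."""
    wa, wb = weighted(a, scheme), weighted(b, scheme)
    keys = wa.keys() | wb.keys()
    diff = sum(abs(wa.get(k, 0.0) - wb.get(k, 0.0)) for k in keys)
    total = sum(wa.values()) + sum(wb.values())
    return diff / total if total else 0.0


def jensen_shannon_divergence(a, b, scheme="LOGARITHM"):
    """Base-2 divergence in [0, 1] between normalized profiles."""
    wa, wb = weighted(a, scheme), weighted(b, scheme)
    sum_a, sum_b = sum(wa.values()), sum(wb.values())
    if sum_a == 0.0 or sum_b == 0.0:
        raise ValueError("both samples need at least one k-mer")
    divergence = 0.0
    for k in wa.keys() | wb.keys():
        p = wa.get(k, 0.0) / sum_a
        q = wb.get(k, 0.0) / sum_b
        m = (p + q) / 2.0
        if p > 0.0:
            divergence += 0.5 * p * math.log2(p / m)
        if q > 0.0:
            divergence += 0.5 * q * math.log2(q / m)
    return divergence


# Toy k-mer count tables for two samples (made-up values).
sample_a = {"ACG": 5, "CGT": 1}
sample_b = {"ACG": 1}
print(cosine_similarity(sample_a, sample_b, "BOOLEAN"))
print(bray_curtis_dissimilarity(sample_a, sample_b, "NATURAL"))
```

Logarithmic weighting reduces the influence of very abundant k-mers, while boolean weighting compares samples on shared k-mer content only.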