k-mer Counter based on Multiple Burst Trees (multi-threaded)

abdullah009/kcmbt_mt

KCMBT: A very fast k-mer counter

Copyright 2015 Abdullah-Al Mamun

abdullah.am.cs (at) engr.uconn.edu

KCMBT is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
any later version.

KCMBT is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with KCMBT. If not, see <http://www.gnu.org/licenses/>.

What is KCMBT

KCMBT (k-mer Counter based on Multiple Burst Trees) is a very fast multi-threaded k-mer counting algorithm. It stores k-mers in cache-efficient burst tries. According to the author's experimental results, it outperforms the other well-known k-mer counting algorithms.
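For readers new to the problem, the following is a minimal sketch of what k-mer counting computes. It is not KCMBT's burst-trie implementation; it uses a plain hash map, and the function name count_kmers is made up for this illustration.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every length-k substring of `sequence` (naive hash-map approach)."""
    counts = Counter()
    # Slide a window of width k across the sequence, one base at a time.
    for i in range(len(sequence) - k + 1):
        counts[sequence[i:i + k]] += 1
    return counts


# Hypothetical toy input; real inputs are reads taken from FASTQ files.
counts = count_kmers("ACGTACGT", 4)
for kmer, count in sorted(counts.items()):
    print(kmer, count)
```

A hash map like this becomes memory-hungry and cache-unfriendly on real sequencing data, which is the cost that specialized structures such as burst tries aim to reduce.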

Compilation

To compile the source code, type

make

This creates three executables in the bin directory:

  • kcmbt generates binary files containing the k-mers with their counts
  • kcmbt_dump produces a human-readable file of the k-mers with their counts
  • kcmbt_query copies the k-mer list for a specified range to a user-defined output file

Usage

Run ./bin/kcmbt first, then ./bin/kcmbt_dump. kcmbt_dump takes the files generated by kcmbt as input and outputs a human-readable list of k-mers with their counts.

To run kcmbt, use

./bin/kcmbt -k <k-mer length> -i <@file_listing_fastq_files or fastq_file> -t <number_of_threads>
Parameters:
	-k:	k-mer length (10 <= k <= 32, default 28) 
	-i:	input file in fastq format (start with @ if the file contains a list of fastq files)
	-t:	number of threads (please use 2^x threads, x = 0, 1, 2, 3, ..)

Example: ./bin/kcmbt -k 28 -i srr.fastq -t 4
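
The parameter constraints above (10 <= k <= 32, and a power-of-two thread count) can be checked before launching a long run. The sketch below is a hypothetical wrapper, not part of KCMBT; the helper names is_power_of_two and build_kcmbt_command are invented here, and only the -k, -i and -t flags documented above are used.

```python
def is_power_of_two(n):
    """Return True if n is 1, 2, 4, 8, ... (exactly one bit set)."""
    return n > 0 and (n & (n - 1)) == 0


def build_kcmbt_command(k, input_path, threads):
    """Validate the documented constraints and return the argument list."""
    if not 10 <= k <= 32:
        raise ValueError("k must satisfy 10 <= k <= 32")
    if not is_power_of_two(threads):
        raise ValueError("number of threads should be a power of two")
    # Path to the binary assumes the current directory is the repository root.
    return ["./bin/kcmbt", "-k", str(k), "-i", input_path, "-t", str(threads)]


command = build_kcmbt_command(28, "srr.fastq", 4)
print(" ".join(command))
```

The returned list can be passed to subprocess.run(command, check=True) to launch the run from Python.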

To run kcmbt_dump, use

./bin/kcmbt_dump number_of_threads_used_in_kcmbt

Example: ./bin/kcmbt_dump 4

kcmbt_dump creates kmer_list.txt, which contains the human-readable list of k-mers with their counts. Because this file may be huge, kcmbt_query can be used to extract only a specific range of the k-mer list.

To run kcmbt_query, use

./bin/kcmbt_query out_file_name begin_count end_count

Example: ./bin/kcmbt_query out 2 50
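
To illustrate the kind of range selection kcmbt_query performs, here is a minimal sketch that filters k-mers by count. It rests on several assumptions not stated in this README: that the range refers to k-mer counts, that each line of the human-readable list has the form "kmer count" separated by whitespace, and that both bounds are inclusive. The function name filter_by_count is hypothetical.

```python
def filter_by_count(lines, begin_count, end_count):
    """Yield (kmer, count) pairs with begin_count <= count <= end_count.

    Assumes each line looks like "ACGT 12"; this format is a guess, not
    something documented by KCMBT.
    """
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        kmer, count = fields[0], int(fields[1])
        if begin_count <= count <= end_count:
            yield kmer, count


# Made-up sample lines standing in for the contents of kmer_list.txt.
sample = ["ACGT 1", "CGTA 2", "GTAC 50", "TACG 51"]
selected = list(filter_by_count(sample, 2, 50))
print(selected)
```

Because the function is a generator, it can also be fed an open file object line by line, so a very large list never needs to be held in memory at once.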

Contact

For questions, suggestions, bugs, and other issues, please contact:

Abdullah-Al Mamun
abdullah.am.cs (at) engr.uconn.edu
