A biogem for counting small kmers for fingerprinting nucleotide sequences
bio-kmer_counter

bio-kmer_counter is a simple biogem for fingerprinting nucleotide sequences by counting the occurrences of particular kmers in the sequence. The methodology is not new; for references, see Teeling et al. 2004. The default parameters are derived from the methods section of Dick et al. 2009.

This methodology is quite different to that of other software that counts kmer content with longer kmers, e.g. khmer. Here only small kmers are intended (e.g. 1mer or 4mer).

Note: this software is under active development!

Installation

    gem install bio-kmer_counter

Usage

To analyse a FASTA file containing one or more sequences for 4-mer (tetranucleotide) content, reporting a separate fingerprint for each 5kb window of each sequence, plus the leftover part if it is longer than 2kb:

    kmer_counter.rb <fasta_file> >tetranucleotide_content.csv

The fingerprints are reported as fractions between 0 and 1, not as percentages. From there it is up to you how to use the fingerprints, sorry.
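
To make the windowing and the 0 to 1 scaling concrete, here is a minimal Ruby sketch of the idea. It is not the gem's own code: the method names (`all_kmers`, `fingerprint`, `windows`) and constants are hypothetical. It also rests on assumptions: it counts forward-strand kmers only, skips kmers containing anything other than A, C, G or T, and keeps a sequence shorter than one window if it is longer than 2kb. The actual script may behave differently on each of these points.

```ruby
# Illustrative sketch only; names and details are assumptions, not the gem's API.
WINDOW_SIZE  = 5000 # window length described above (5kb)
MIN_LEFTOVER = 2000 # a trailing piece is kept only if longer than this (2kb)
K            = 4    # tetranucleotides

# All 4**k possible kmers in a fixed order: AAAA, AAAC, AAAG, ...
def all_kmers(k = K)
  %w[A C G T].repeated_permutation(k).map(&:join)
end

# Fraction (between 0 and 1) of each kmer among the countable kmers of a window.
def fingerprint(window, k = K)
  counts = Hash.new(0)
  seq = window.upcase
  (0..seq.length - k).each do |i|
    kmer = seq[i, k]
    # Skip kmers with ambiguous bases such as N (an assumption of this sketch).
    counts[kmer] += 1 if kmer.match?(/\A[ACGT]+\z/)
  end
  total = counts.values.sum
  all_kmers(k).map { |kmer| total.zero? ? 0.0 : counts[kmer].to_f / total }
end

# Split a sequence into full windows, plus the leftover if it is long enough.
def windows(sequence, size = WINDOW_SIZE, min_leftover = MIN_LEFTOVER)
  chunks = sequence.chars.each_slice(size).map(&:join)
  chunks.select { |chunk| chunk.length == size || chunk.length > min_leftover }
end

# Toy 8,000 bp sequence: one full 5kb window plus a 3kb leftover.
toy = "ACGT" * 2000
windows(toy).each_with_index do |window, index|
  fractions = fingerprint(window)
  puts "window #{index}: #{window.length} bp, #{fractions.count(&:positive?)} distinct kmers seen"
end
```

Each fingerprint here is a 256-element array whose entries sum to 1 whenever the window contains at least one countable kmer.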

Project home page

For information on the source tree, documentation, examples, issues and how to contribute, see

http://github.com/wwood/bioruby-kmer_counter

The BioRuby community is on IRC server: irc.freenode.org, channel: #bioruby.

Cite

This software is currently unpublished, so please just cite the homepage (thanks!).

Please also cite the tools upon which it is based, one of:

Copyright

Copyright (c) 2012 Ben J Woodcroft. See LICENSE.txt for further details.