Skip to content
This repository

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Rolling Hash C++ Library

branch: master

first commit

latest commit 8d9d8e395f
Daniel Lemire authored March 11, 2011
Octocat-spinner-32 Makefile first commit March 11, 2011
Octocat-spinner-32 README first commit March 11, 2011
Octocat-spinner-32 characterhash.h first commit March 11, 2011
Octocat-spinner-32 cyclichash.h first commit March 11, 2011
Octocat-spinner-32 generalhash.h first commit March 11, 2011
Octocat-spinner-32 mersennetwister.h first commit March 11, 2011
Octocat-spinner-32 rabinkarphash.h first commit March 11, 2011
Octocat-spinner-32 speedtesting.cpp first commit March 11, 2011
Octocat-spinner-32 threewisehash.h first commit March 11, 2011
Octocat-spinner-32 unit.cpp first commit March 11, 2011
Octocat-spinner-32 ztimer.h first commit March 11, 2011
README
Testing n-gram hash functions
Daniel Lemire

License: Apache 2.0


== What is this? ==

This is a set of C++ classes implementing various recursive n-gram hashing techniques, also called rolling hashing (http://en.wikipedia.org/wiki/Rolling_hash), including:

*) Randomized Karp-Rabin (sometimes called Rabin-Karp)
*) Hashing by Cyclic Polynomials (also known as Buzhash)
*) Hashing by Irreducible Polynomials


== Code sample ==

const uint n(3);//hash all sequences of 3 characters
const uint L(7); // you need 7 bits
CyclicHash hf(n,L );
for(uint32 k = 0; k<n;++k) {
          chartype c = ... ; // grab some character
          hf.eat(c); // feed it to the hasher
}
while(...) { // go over your string
   chartype c = ... ;// point to the next character
   chartype out = ...; // character we want to forget
   hf.update(out,c); // update hash value
   hf.hashvalue; // at all times, this contains the hash value
}

== Requirements ==

GNU GCC.

== What should I do after I download it?==


type:

make

then

./unit

then

./speedtesting




== Reference ==

Daniel Lemire, Owen Kaser: Recursive n-gram hashing is pairwise independent, at best, Computer Speech & Language, Volume 24, Issue 4, October 2010, Pages 698-710 http://arxiv.org/abs/0705.4676
Something went wrong with that request. Please try again.