jeblad/hash-pearson


hash-pearson

stability-wip GitHub issues

Scope

Pearson non-cryptographic hash function

From the paper Fast Hashing of Variable-Length Text Strings (archived) by Peter K. Pearson with additional edits by jeblad (Dec 13, 2025).

The core strength of the Pearson hash algorithm is fast execution on processors with integer registers. The implementation requires only a few instructions and a lookup table containing a permutation of the values from 0 to one less than the size of the table. It produces a hash value that depends strongly on every byte of the input.
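As an illustration of the paragraph above, here is a minimal sketch of the classic 8-bit Pearson hash. It is not the code in hash-pearson.h: the names pearson_init_table and pearson_hash8 are hypothetical, and the table is filled by a simple seeded shuffle chosen only to obtain some permutation of 0 to 255.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of hash-pearson.h): fill the table with a
   permutation of 0..255 using a Fisher-Yates shuffle driven by a small
   linear congruential generator. Any permutation would do. */
static void pearson_init_table(uint8_t table[256], uint32_t seed)
{
    for (int i = 0; i < 256; i++)
        table[i] = (uint8_t)i;

    for (int i = 255; i > 0; i--) {
        seed = seed * 1664525u + 1013904223u;
        int j = (int)((seed >> 16) % (uint32_t)(i + 1));
        uint8_t tmp = table[i];
        table[i] = table[j];
        table[j] = tmp;
    }
}

/* Classic Pearson hash: for each input byte, XOR it into the running hash
   and replace the result with a table lookup. */
static uint8_t pearson_hash8(const uint8_t table[256],
                             const void *data, size_t len)
{
    const uint8_t *bytes = data;
    uint8_t h = 0;

    for (size_t i = 0; i < len; i++)
        h = table[h ^ bytes[i]];

    return h;
}
```

A caller would fill the table once with pearson_init_table and then reuse it for every call to pearson_hash8. Because the table is a permutation, two inputs that differ in a single byte can never produce the same hash value.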

The original version uses a lookup table of 256 values, which makes it particularly efficient on processors with 8-bit registers because the table index uses the full range of a register. This version is adapted to use larger registers and a lookup table of variable size.
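One way such an adaptation could look is sketched below. This is an assumption, not the approach taken in hash-pearson.h: it takes the table size to be a power of two between 256 and 65536, stores entries as uint16_t, and uses the hypothetical names wide_init_table and wide_hash.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill table[0..size-1] with a permutation of
   0..size-1. Assumes size is a power of two, 256 <= size <= 65536. */
static void wide_init_table(uint16_t *table, size_t size, uint32_t seed)
{
    for (size_t i = 0; i < size; i++)
        table[i] = (uint16_t)i;

    /* Fisher-Yates shuffle driven by a small linear congruential generator. */
    for (size_t i = size - 1; i > 0; i--) {
        seed = seed * 1664525u + 1013904223u;
        size_t j = (size_t)(seed >> 16) % (i + 1);
        uint16_t tmp = table[i];
        table[i] = table[j];
        table[j] = tmp;
    }
}

/* Pearson-style hash with a wider state. The mask keeps the index inside
   the table; with size >= 256 no bits of the input byte are discarded. */
static uint16_t wide_hash(const uint16_t *table, size_t size,
                          const void *data, size_t len)
{
    const uint8_t *bytes = data;
    size_t mask = size - 1;
    uint16_t h = 0;

    for (size_t i = 0; i < len; i++)
        h = table[(size_t)(h ^ bytes[i]) & mask];

    return h;
}
```

With a 1024-entry table, for example, the hash value ranges over 0 to 1023 instead of 0 to 255, at the cost of a larger table in memory.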

The Pearson hash is particularly well suited to cases that need a uniformly distributed hash value together with repeatable results.

The current implementation is meant for a specific use case, and might not be generally usable.

Usage

The hash-pearson library consists of a single header file. Simply grab the file and put it wherever it is needed, or add the repo as a submodule.

wget https://raw.githubusercontent.com/jeblad/hash-pearson/refs/heads/main/hash-pearson.h

or

git submodule add git@github.com:jeblad/hash-pearson.git path-to-submodule

The path-to-submodule would typically be something like include/hash-pearson if you're in the project folder.

If you add hash-pearson as a submodule, remember to pull in updated versions when needed, for example with git submodule update --remote path-to-submodule.
