Stable hash functions for Postgres
C
Switch branches/tags
Nothing to show
Pull request Compare This branch is 2 commits ahead, 29 commits behind markokr:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
debian
expected
sql
.gitignore
COPYRIGHT
Makefile
README.asciidoc
crc32.c
hashlib.control
hashlib.sql.in
inthash.c
lookup2.c
lookup3.c
murmur3.c
pghashlib.c
pghashlib.h
pgsql84.c
uninstall_hashlib.sql

README.asciidoc

pghashlib

The goal of pghashlib is to provide stable hashes for PostgeSQL, "stable" in the sense that their implementation does not change, they are independent from PostgeSQL version.

Some string hashes may be dependent on CPU architecture, so that they return different return on big-endian architecture from little-endian architecture. It you want to be architectures-independent, then use algorithms that don’t depend on endianess.

Installation

You need PostgreSQL developent environment. Then simply:

 $ make
 $ make install
 $ psql -d ... -f hashlib.sql

Functions

hash_string

hash_string(data text,  algo text [, initval int4]) returns int4
hash_string(data bytea, algo text [, initval int4]) returns int4

CPU-independent algorigthms:

  • lookup2 - Jenkins lookup2, used as PostgreSQL hashtext() in versions up to 8.3

  • lookup3le - Jenkins lookup3/little-endian

  • lookup3be - Jenkins lookup3/big-endian

  • crc32 - CRC32 [cpu-independent]

CPU-dependent algorigthms:

  • lookup3 - Jenkins lookup3

  • pgsql84 - Hacked version of lookup3, used PostgreSQL 8.4+

  • murmur3 - MurmurHash v3

CRC32 supports piecemeal hashing:

   hash_string('abcd', 'crc32') == hash_string('cd', 'crc32',
					       hash_string('ab', 'crc32'))

This is not supported by other algorithms.

For others, initval just allows tuning of the startup parameters. Currently lookup2 default initval is not original Jenkins one, but from PostgreSQL 8.3.

hash_int4

hash_int4(val int4) returns int4

Reversible integer hash.

Supported aldorithms:

  • jenkins - Jenkins integer hash with 6 shifts

  • wang - Thomas Wang’s hash32shift()

  • wang2 - Thomas Wang’s hash32shiftmult()

hash_int8

hash_int8(val int8) returns int8

Reversible integer hash.

Supported algorithms:

  • wang - Thomas Wang’s hash64shift()

  • wang8to4 - Thomas Wang’s hash6432shift(), creates 32bit hash from 64bit integer. The result can be safely cast to int4.