The goal of pghashlib is to provide stable hashes for PostgeSQL, "stable" in the sense that their implementation does not change, they are independent from PostgeSQL version.
Some string hashes may be dependent on CPU architecture, so that they return different return on big-endian architecture from little-endian architecture. It you want to be architectures-independent, then use algorithms that don’t depend on endianess.
You need PostgreSQL developent environment. Then simply:
$ make $ make install $ psql -d ... -f hashlib.sql
hash_string(data text, algo text [, initval int4]) returns int4 hash_string(data bytea, algo text [, initval int4]) returns int4
CPU-independent algorigthms:
-
lookup2
- Jenkins lookup2, used as PostgreSQL hashtext() in versions up to 8.3 -
lookup3le
- Jenkins lookup3/little-endian -
lookup3be
- Jenkins lookup3/big-endian -
crc32
- CRC32 [cpu-independent]
CPU-dependent algorigthms:
-
lookup3
- Jenkins lookup3 -
pgsql84
- Hacked version of lookup3, used PostgreSQL 8.4+ -
murmur3
- MurmurHash v3
CRC32 supports piecemeal hashing:
hash_string('abcd', 'crc32') == hash_string('cd', 'crc32', hash_string('ab', 'crc32'))
This is not supported by other algorithms.
For others, initval
just allows tuning of the startup
parameters. Currently lookup2
default initval is not
original Jenkins one, but from PostgreSQL 8.3.
hash_int4(val int4) returns int4
Reversible integer hash.
Supported aldorithms:
-
jenkins
- Jenkins integer hash with 6 shifts -
wang
- Thomas Wang’s hash32shift() -
wang2
- Thomas Wang’s hash32shiftmult()