As discussed over at Skeeto's Hash Prospector, a CRC32c-MUL-CRC32c scheme seemingly forms a low-bias permutation on 32-bit values. Embedding it into a Pearson scheme as used with Pearson B. Hashing, we get a new hashing function. Following the naming series, let's call it Pearson C. Hashing.
Hope is that the CRC32c instruction as found at Intel comptaible CPU's starting with SSE 4.2 support will accelerate hashing a lot.
We use an all-same constant version as it has a comparable bias to an all-different constant version. This way, we might save some cycles loading constants from cache or memory.
uint32_t cmc (uint32_t x) {
x = _mm_crc32_u32(x, 0x941325ab);
x *= 0x941325ab;
x = _mm_crc32_u32(x, 0x941325ab);
return x;
}
This permutation passes SmallCrunch (15/15), masters Crunch (140/144) and does well on BigCrush (141/160).
========= Summary results of SmallCrush =========
Version: TestU01 1.2.3
Generator: CRC32-MUL-CRC32
Number of statistics: 15
Total CPU time: 00:00:04.42
All tests were passed
========= Summary results of Crush =========
Version: TestU01 1.2.3
Generator: CRC32-MUL-CRC32
Number of statistics: 144
Total CPU time: 00:19:25.30
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
41 MaxOft, t = 5 1 - 3.2e-7
42 MaxOft, t = 10 1 - 1.7e-13
43 MaxOft, t = 20 1 - 8.4e-13
44 MaxOft, t = 30 1 - 5.1e-13
----------------------------------------------
All other tests were passed
========= Summary results of BigCrush =========
Version: TestU01 1.2.3
Generator: CRC32-MUL-CRC32
Number of statistics: 160
Total CPU time: 02:27:12.07
The following tests gave p-values outside [0.001, 0.9990]:
(eps means a value < 1.0e-300):
(eps1 means a value < 1.0e-15):
Test p-value
----------------------------------------------
30 CouponCollector, r = 0 4.0e-11
31 CouponCollector, r = 10 1.1e-6
32 CouponCollector, r = 20 7.8e-10
33 CouponCollector, r = 27 1.4e-13
34 Gap, r = 0 eps
35 Gap, r = 25 eps
36 Gap, r = 0 eps
37 Gap, r = 20 eps
41 Permutation, t = 5 eps
43 Permutation, t = 10 eps
46 MaxOft, t = 8 1 - eps1
47 MaxOft, t = 16 1 - eps1
48 MaxOft, t = 24 1 - eps1
49 MaxOft, t = 32 1 - eps1
58 AppearanceSpacings, r = 27 5.6e-7
65 SumCollector eps
89 PeriodsInStrings, r = 20 0.9995
101 Run of bits, r = 0 2.7e-6
102 Run of bits, r = 27 8.0e-4
----------------------------------------------
All other tests were passed
A few tests are still missing such as the comprehensive evaluation of the whole scheme (not just the permutation) by SMHasher – to follow soon. Same for the full C implementation.