small hash #102

Open
rurban opened this Issue Dec 30, 2015 · 5 comments

@rurban
Member
rurban commented Dec 30, 2015 edited

Optimize hashes with <= 3-5 keys to a simple array of keys and values with linear lookup.

HvSMALL(hv) / XHvSMALL(xhv) either checks HvMAX < 7 or checks a flag. If it is a flag, the very first HE* entry needs to be a non-pointer tag (& 0x1).
We'd need a flag for inlined HEs with overlong keys, to skip the HvSMALL optimizations for such long keys.
We cannot use the hv_aux-based HvFLAGS with normal HvSMALL hashes, esp. when inlined.

The best would be a he-array-like inlined len/char*/flags/val array, to be cache-conscious (as in #24, feature/gh24-he-array). The len really should be run-length encoded, and the flags needed for the hash cmp then need to come first.
However, at first we start with simple HE* arrays (an array of pointers, not values).
The last array element needs to be a NULL sentinel, so we cannot use all 7 HE* slots, only 6.
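
A minimal sketch of the pointer-array variant described above, for illustration only. `small_ent`, `small_lookup`, `small_append` and `SMALL_SLOTS` are hypothetical names, and `small_ent` is a simplified stand-in, not Perl's real HE struct:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SMALL_SLOTS 7   /* 6 usable entries + 1 NULL sentinel */

/* Hypothetical stand-in for a hash entry; not Perl's real HE. */
typedef struct small_ent {
    const char *key;
    size_t      len;
    void       *val;
} small_ent;

/* Linear lookup: walk the pointer array until the NULL sentinel.
 * No hash value is computed at all. */
static small_ent *small_lookup(small_ent *const *slots,
                               const char *key, size_t len)
{
    for (; *slots != NULL; slots++) {
        small_ent *e = *slots;
        if (e->len == len && memcmp(e->key, key, len) == 0)
            return e;
    }
    return NULL;
}

/* Append an entry. Returns 0 when all 6 usable slots are taken, in
 * which case the caller would have to convert to a hashed table. */
static int small_append(small_ent **slots, small_ent *e)
{
    size_t i = 0;
    assert(e != NULL);
    while (slots[i] != NULL)
        i++;
    if (i >= SMALL_SLOTS - 1)   /* the last slot must stay NULL */
        return 0;
    slots[i] = e;
    return 1;
}
```

The array is expected to start out zeroed (as with calloc), so the sentinel is always present and the scan needs no separate length field.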

But there are many more simple hash optims, which we do first.

  • extract uncommon magical code from hv_common
  • add __builtin_ctz support (count trailing zeros) and use it instead of division on DO_HSPLIT (done with the builtin-ctz branch)
  • pre-extend hashes as in aassign with av_extend, when the number of keys is on the stack. This speeds up all the big hash inits (e.g. warnings.pm), needing no costly series of splits during initialization.
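
A rough sketch of two of the ideas in the list above, under stated assumptions: that the division in question is by a power-of-two bucket count, and that a table starts at 8 buckets. The function names are hypothetical and this is not the code from the builtin-ctz branch. `__builtin_ctz` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of a power-of-two bucket count via count-trailing-zeros.
 * __builtin_ctz is undefined for 0, hence the precondition. */
static unsigned log2_pow2(uint32_t buckets)
{
    assert(buckets != 0 && (buckets & (buckets - 1)) == 0);
    return (unsigned)__builtin_ctz(buckets);
}

/* keys / buckets computed with a shift instead of a division. */
static uint32_t keys_per_bucket(uint32_t keys, uint32_t buckets)
{
    return keys >> log2_pow2(buckets);
}

/* Pre-extend: smallest power-of-two bucket count >= nkeys, starting
 * from an assumed initial size of 8, so that filling the table with
 * a known number of keys needs no intermediate splits. */
static uint32_t presize_buckets(uint32_t nkeys)
{
    uint32_t n = 8;
    while (n < nkeys && n < (UINT32_C(1) << 31))
        n <<= 1;
    return n;
}
```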
@rurban rurban added the enhancement label Dec 30, 2015
@rurban rurban self-assigned this Dec 30, 2015
@wollmers
Contributor

I estimate you can use linear or serial (unsorted) lookup up to 100 keys or even more, depending on benchmarks.

In my port of LCS::BV (https://github.com/wollmers/LCS-BV/tree/master/lib/LCS) from Perl to C I began with Bob Jenkins' hash and ended the tuning using VLAs (variable length arrays) on the stack, with the array filled serially (\0 terminated). See llcs_seq_a() (https://github.com/wollmers/c-lcs-bv/blob/master/lcstest.c#L105) and the hash_setpos() and hash_getpos() it uses. With Bob Jenkins' hash I get 250 kHz (cases per second) on an i5@1500, with serial VLAs 7.5 MHz, thus a factor of 30x. The calloc variant llcs_seq() (https://github.com/wollmers/c-lcs-bv/blob/master/lcstest.c#L144) comes in at 4 MHz.

Of course in my example I can benefit from the known restrictions: maximum size, immutable key strings, typed values (uint_64).
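
To illustrate the kind of serially filled, \0-terminated stack table meant here, a small sketch. It is not the code from lcstest.c: `serial_ent`, `serial_setpos`, `serial_getpos` and `positions_of` are hypothetical names, and it assumes single-char keys, no '\0' inside the input and at most 64 positions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One table slot: a key char and a bit mask of positions. */
typedef struct { char key; uint64_t bits; } serial_ent;

/* Record position pos for key: scan to the key or to the first free
 * slot (key == '\0'), whichever comes first. */
static void serial_setpos(serial_ent *tab, char key, unsigned pos)
{
    while (tab->key != '\0' && tab->key != key)
        tab++;
    tab->key = key;
    tab->bits |= UINT64_C(1) << pos;
}

/* Return the bit mask stored for key, or 0 if the key is absent. */
static uint64_t serial_getpos(const serial_ent *tab, char key)
{
    while (tab->key != '\0' && tab->key != key)
        tab++;
    return tab->key == key ? tab->bits : 0;
}

/* Build the table for s in a VLA on the stack and query one char. */
static uint64_t positions_of(const char *s, size_t len, char c)
{
    assert(len <= 64);
    serial_ent tab[len + 1];        /* +1 keeps a '\0' terminator slot */
    memset(tab, 0, sizeof tab);
    for (size_t i = 0; i < len; i++)
        serial_setpos(tab, s[i], (unsigned)i);
    return serial_getpos(tab, c);
}
```

For "banana", the mask for 'a' has bits 1, 3 and 5 set.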

@rurban
Member
rurban commented Apr 27, 2016

So many? I thought I only wanted to fill one cache line, so just very few keys. But I'll benchmark it soon, when I have more time. Other languages tested 3-5, if I remember correctly.

@wollmers
Contributor

You should trust only numbers you benchmarked yourself;-)

Hash is said to have complexity O(1). But as always it is O(1*k), where k is the implementation factor.

Serial has O((n/2)*k). A break-even point of n=4 between hash and serial would need k_hash = 2 * k_serial, i.e. the hash algorithm may execute only twice as many instructions as one iteration of the serial loop. My serial loop has 3 instructions (C operators) including conditions. So for a break-even at n=4, the hash function (locating the entry in the array) could use only 6 instructions.

I didn't optimize for cache friendliness directly. Serial just maps a nearly unbounded (sparse) alphabet to a minimal (non-sparse) one and nearly keeps the order of filling, which is memory and cache friendly. Hash algorithms (unless perfect hashes) map sparse to not so sparse, but still sparse.
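
The break-even estimate above, written out. The cost symbols C and the break-even size n* are introduced here only for illustration; the numbers are the ones from the comment.

```latex
\begin{align*}
C_{\text{hash}} &= k_{\text{hash}} \\
C_{\text{serial}}(n) &= \tfrac{n}{2}\, k_{\text{serial}} \\
C_{\text{hash}} = C_{\text{serial}}(n^{*})
  &\iff n^{*} = \frac{2\, k_{\text{hash}}}{k_{\text{serial}}} \\
n^{*} = 4 &\implies k_{\text{hash}} = 2\, k_{\text{serial}} = 2 \cdot 3 = 6
\end{align*}
```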

@rurban rurban added a commit that referenced this issue Jul 14, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
Avoid the hash calculation for a small number of keys.
calloc the first 7 words of HvARRAY. If we add an entry beyond the 6th,
we need to split it, as the 7th, the last, is needed as the NULL sentinel.

See #102
9fba8c2
@rurban rurban added a commit that referenced this issue Jul 14, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
25d7dd4
@rurban rurban added a commit that referenced this issue Jul 17, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
b5150a4
@rurban
Member
rurban commented Jul 17, 2016

I went with 7 because this is the initial calloced size. But it doesn't work yet, so I cannot benchmark it.

@rurban rurban added a commit that referenced this issue Jul 19, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
bb8d710
@rurban rurban added a commit that referenced this issue Jul 25, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
c68effb
@rurban rurban added a commit that referenced this issue Jul 26, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
33f998d
@rurban rurban added a commit that referenced this issue Jul 27, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
fc59b93
@rurban rurban added a commit that referenced this issue Jul 28, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
55550ea
@rurban rurban added a commit that referenced this issue Jul 28, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
4e4864a
@rurban rurban added a commit that referenced this issue Jul 31, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
eb54469
@rurban rurban added a commit that referenced this issue Aug 2, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
170ce7d
@rurban rurban added a commit that referenced this issue Aug 7, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
838347f
@rurban rurban added a commit that referenced this issue Aug 8, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
3543a9f
@rurban rurban added a commit that referenced this issue Aug 8, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
e41755a
@rurban rurban added a commit that referenced this issue Aug 10, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
baf94f4
@rurban rurban added a commit that referenced this issue Aug 10, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
Avoid the hash calculation for a small number of keys.
calloc the first 7 words of HvARRAY. If we add an entry beyond the 6th,
we need to split it, as the 7th, the last, is needed as the NULL sentinel.

On splitting a small hash, we need to allocate a fresh array to move
the hashed entries to. This can be optimized further. (alloc 2x)

On inserting a new entry at the 7th slot, we can avoid a split when placeholders exist:
just replace one then.

See #102
c584d28
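
A hedged sketch of the insert rule from this commit message: with 6 entries in place, a new entry either reuses a placeholder slot or forces a split. `ent`, `PLACEHOLDER`, `small_insert` and the result codes are hypothetical names, and the placeholder is modelled as a tagged value pointer rather than Perl's actual placeholder SV.

```c
#include <assert.h>
#include <stddef.h>

#define SMALL_SLOTS 7   /* 6 usable entries + 1 NULL sentinel */

/* Hypothetical stand-in for a hash entry; not Perl's real HE. */
typedef struct ent { const char *key; void *val; } ent;

/* Assumed marker for a deleted-but-kept entry (a "placeholder"). */
static char placeholder_tag;
#define PLACEHOLDER ((void *)&placeholder_tag)

typedef enum { INS_OK, INS_NEEDS_SPLIT } ins_result;

static ins_result small_insert(ent **slots, ent *e)
{
    size_t i, reuse = SMALL_SLOTS;   /* SMALL_SLOTS means "none found" */

    assert(e != NULL);
    /* Find the first free slot, remembering the first placeholder. */
    for (i = 0; i < SMALL_SLOTS - 1 && slots[i] != NULL; i++) {
        if (reuse == SMALL_SLOTS && slots[i]->val == PLACEHOLDER)
            reuse = i;
    }
    if (i < SMALL_SLOTS - 1) {       /* room left before the sentinel */
        slots[i] = e;
        return INS_OK;
    }
    if (reuse < SMALL_SLOTS) {       /* full, but a placeholder exists */
        slots[reuse] = e;
        return INS_OK;
    }
    return INS_NEEDS_SPLIT;          /* caller must build a hashed table */
}
```

The sketch does not free the replaced placeholder entry or check for an existing entry with the same key; a real implementation would have to do both.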
@rurban rurban added a commit that referenced this issue Aug 10, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys WIP
d64eb2b
@rurban rurban added a commit that referenced this issue Aug 11, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
Avoid the hash calculation for a small number of keys.
calloc the first 7 words of HvARRAY. If we add an entry beyond the 6th,
we need to split it, as the 7th, the last, is needed as the NULL sentinel.

On splitting a small hash, we need to allocate a fresh array to move
the hashed entries to. This can be optimized further. (alloc 2x)

On inserting a new entry at the 7th slot, we can avoid a split when placeholders exist:
just replace one then.

See #102

WIP: the standard operations work, but write_buildcustomize.pl fails.
a5fa7db
@rurban rurban added a commit that referenced this issue Aug 12, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
Avoid the hash calculation for a small number of keys.
calloc the first 7 words of HvARRAY. If we add an entry beyond the 6th,
we need to split it, as the 7th, the last, is needed as the NULL sentinel.

On splitting a small hash, we need to allocate a fresh array to move
the hashed entries to. This can be optimized further. (alloc 2x)

On inserting a new entry at the 7th slot, we can avoid a split when placeholders exist:
just replace one then.

See #102

WIP: the standard operations work, but use constant fails.
a127894
@rurban rurban added a commit that referenced this issue Aug 12, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
2b0dd27
@rurban rurban added a commit that referenced this issue Aug 13, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
fb8da52
@rurban rurban added a commit that referenced this issue Aug 13, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
a3e564d
@rurban rurban added a commit that referenced this issue Aug 13, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
11715ce
@rurban rurban added a commit that referenced this issue Aug 14, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
b47195e
@rurban rurban added a commit that referenced this issue Aug 14, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
2269bea
@rurban rurban added a commit that referenced this issue Aug 14, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
99f4334
@rurban rurban added a commit that referenced this issue Aug 14, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
c23cadb
@rurban rurban added a commit that referenced this issue Aug 14, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
fbf9772
@rurban rurban added this to the v5.26.0 milestone Aug 15, 2016
@rurban rurban added a commit that referenced this issue Aug 15, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
3447b9f
@rurban
Member
rurban commented Aug 19, 2016 edited

Merged the first 3 parts (ctz, hv_common_magical, pre-extend).
In the end it was much faster than expected. On my linux gcc-6.1/i5-2300 it was 8-14% faster in perlbench-run, on my darwin gcc-6-lto/i7-4650U 6% faster.

Having the magical code separated and abstracted away will also help in future hash rewrites.

@rurban rurban closed this Aug 19, 2016
@rurban rurban reopened this Aug 19, 2016
@rurban rurban added a commit that referenced this issue Aug 19, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
Calculate hashes on demand, but do not store them in a HEK,
to make the HEK shorter and fit more entries into a cache line.
HEK_HASH(hek) is now invalid and gone.
Use the new HeHASH_calc(he), HEK_HASH_calc(hek), SvSHARED_HASH_calc(sv)
instead.

Using 4 tests in the hot hash loop also does not make much sense,
when checking the length and the string is enough to weed out
collisions.
This strategy, recomputing the hash when needed, is so far 1-7% slower,
but we hope to get back up to speed with the HeARRAY patch. See below.

The end goal is to get rid of linked lists and store the collisions
inlined in consecutive memory, in a HekARRAY. (len,cmp-flags,char*,other-flags,val)
Measurements in "Cache-Conscious Collision Resolution in String Hash Tables"
by Nikolas Askitis and Justin Zobel, Melbourne 2005, show that this is the
fastest strategy for Open Hashing (chained) tables.
See GH #24 and GH #102

The next idea is to use MSB varint encoding of the str length in a HEK,
because our strings are usually short: len < 63 fits into one byte.
We can then merge it with the cmp-flags, the flags only needed for comparison.
See https://techoverflow.net/blog/2013/01/25/efficiently-encoding-variable-length-integers-in-cc/
or just <63 one byte, >63 MSB: I32 len.
Note that the 1st MSB bit is already taken for UTF8.
a07ab71
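
One possible reading of the length-encoding idea in this commit message, sketched with an assumed bit layout: top bit reserved for the UTF8 flag, the next bit marking "a 4-byte length follows", and the low 6 bits holding short lengths. The layout, the macro names and `len_encode`/`len_decode` are assumptions for illustration, not the HEK format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_UTF8      0x80u  /* top bit: reserved for the UTF8 flag     */
#define HDR_LONG      0x40u  /* next bit: a 4-byte length follows       */
#define HDR_SHORT_MAX 0x3Fu  /* lengths 0..63 fit into the low 6 bits   */

/* Write the length header into buf (needs room for up to 5 bytes).
 * Returns the number of bytes written. */
static size_t len_encode(uint8_t *buf, uint32_t len, int utf8)
{
    uint8_t flags = utf8 ? HDR_UTF8 : 0;

    assert(buf != NULL);
    if (len <= HDR_SHORT_MAX) {
        buf[0] = (uint8_t)(flags | len);
        return 1;
    }
    buf[0] = (uint8_t)(flags | HDR_LONG);
    memcpy(buf + 1, &len, sizeof len);   /* native byte order */
    return 1 + sizeof len;
}

/* Read the header back. Returns the number of bytes consumed. */
static size_t len_decode(const uint8_t *buf, uint32_t *len, int *utf8)
{
    *utf8 = (buf[0] & HDR_UTF8) != 0;
    if (!(buf[0] & HDR_LONG)) {
        *len = buf[0] & HDR_SHORT_MAX;
        return 1;
    }
    memcpy(len, buf + 1, sizeof *len);
    return 1 + sizeof *len;
}
```

With this layout the common short key costs one header byte, and the comparison flag and length can be tested together with a single byte compare.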
@rurban rurban added a commit that referenced this issue Aug 19, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
766f869
@rurban rurban added a commit that referenced this issue Aug 19, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
Avoid the hash calculation for a small number of keys.
calloc the first 7 words of HvARRAY. If we add an entry beyond the 6th,
we need to split it, as the 7th, the last, is needed as the NULL sentinel.

On splitting a small hash, we need to allocate a fresh array to move
the hashed entries to. This can be optimized further. (alloc 2x)

On inserting a new entry at the 7th slot, we can avoid a split when placeholders exist:
just replace one then.

See #102

WIP: the standard operations work, but use constant fails.
Currently 13% slower.
9dc063f
@rurban rurban referenced this issue Aug 19, 2016
Open

new hash table #24

@rurban rurban added a commit that referenced this issue Aug 22, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
dc79321
@rurban rurban added a commit that referenced this issue Aug 22, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
180d416
@rurban rurban added a commit that referenced this issue Aug 24, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
f365064
@rurban rurban added a commit that referenced this issue Aug 24, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
1256edd
@rurban rurban added a commit that referenced this issue Aug 24, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
3d3b0e0
@rurban rurban added a commit that referenced this issue Aug 24, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
3612789
@rurban rurban added a commit that referenced this issue Aug 25, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
7b97b9e
@rurban rurban added a commit that referenced this issue Aug 25, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
70cdf01
@rurban rurban added a commit that referenced this issue Aug 26, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
f8a4e39
@rurban rurban added a commit that referenced this issue Aug 26, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
Avoid the hash calculation for a small number of keys.
calloc the first 7 words of HvARRAY. If we add one more entry to a hash
with 6 entries, we need to split it, as the 7th, the last, is needed as NULL sentinel.

When splitting a small hash, we need to allocate a fresh array to move
the hashed entries to. This can be optimized further on. (alloc 2x)

When inserting a new entry at the 7th slot, we can avoid a split when
placeholders exist: just replace one of them.
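
The scheme above can be sketched roughly as follows. This is an illustration under assumptions, not the cperl code: `small_hash_t`, `entry_t`, `small_find`, `small_store` and `small_delete` are hypothetical names, values are plain ints, keys are not copied, and a placeholder is modelled as a flag on the entry. The split into a real hash table is left out; `small_store` only reports that one would be needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SMALL_SLOTS 7              /* 6 usable entries + NULL sentinel */

typedef struct {
    const char *key;               /* borrowed, not copied */
    size_t len;
    int val;
    int placeholder;               /* deleted entry, slot may be reused */
} entry_t;

typedef struct {
    entry_t *slot[SMALL_SLOTS];    /* must start zeroed, as with calloc */
} small_hash_t;

/* Linear scan up to the NULL sentinel, no hash value is computed. */
static entry_t *small_find(const small_hash_t *h, const char *key, size_t len)
{
    size_t i;
    assert(h && key);
    for (i = 0; h->slot[i]; i++) {
        entry_t *e = h->slot[i];
        if (!e->placeholder && e->len == len && memcmp(e->key, key, len) == 0)
            return e;
    }
    return NULL;
}

/* Returns 1 on success, 0 if the entry could not be stored here
   (table full, so a split would be needed, or out of memory). */
static int small_store(small_hash_t *h, const char *key, size_t len, int val)
{
    entry_t *e = small_find(h, key, len);
    size_t i;
    if (e) {                       /* existing key: overwrite the value */
        e->val = val;
        return 1;
    }
    for (i = 0; h->slot[i]; i++) {
        if (h->slot[i]->placeholder) {
            e = h->slot[i];        /* reuse a placeholder, no split */
            break;
        }
    }
    if (!e) {
        if (i == SMALL_SLOTS - 1)  /* only the sentinel slot is left */
            return 0;
        e = calloc(1, sizeof *e);
        if (!e)
            return 0;
        h->slot[i] = e;
    }
    e->key = key;
    e->len = len;
    e->val = val;
    e->placeholder = 0;
    return 1;
}

/* Marks the entry as a placeholder instead of compacting the array. */
static int small_delete(small_hash_t *h, const char *key, size_t len)
{
    entry_t *e = small_find(h, key, len);
    if (!e)
        return 0;
    e->placeholder = 1;
    return 1;
}
```

The last slot is never written, so every scan stops at the sentinel without needing a separate count.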

See #102

WIP: the standard operations work, but use constant fails.
Currently 13% slower.
0038f47
@rurban rurban added a commit that referenced this issue Aug 27, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
e72d1ae
@rurban rurban added a commit that referenced this issue Aug 27, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
3e3e04c
@rurban rurban added a commit that referenced this issue Aug 27, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
6b68f2c
@rurban rurban added a commit that referenced this issue Aug 27, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
ae51ced
@rurban rurban added a commit that referenced this issue Aug 28, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
5195c1a
@rurban rurban added a commit that referenced this issue Aug 28, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
1002352
@rurban rurban added a commit that referenced this issue Aug 30, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
3146a3c
@rurban rurban added a commit that referenced this issue Aug 30, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
988c1a2
@rurban rurban added a commit that referenced this issue Sep 6, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
22c0514
@rurban rurban added a commit that referenced this issue Sep 6, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
95a30d8
@rurban rurban added a commit that referenced this issue Sep 8, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
c6caac7
@rurban rurban added a commit that referenced this issue Sep 8, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
c85baf1
@rurban rurban added a commit that referenced this issue Sep 13, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
f667273
@rurban rurban added a commit that referenced this issue Sep 13, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
004d8b6
@rurban rurban added a commit that referenced this issue Sep 15, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
70aa7c8
@rurban rurban added a commit that referenced this issue Sep 15, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
19888a0
@rurban rurban added a commit that referenced this issue Sep 16, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
ef69c07
@rurban rurban added a commit that referenced this issue Sep 16, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
1f3e498
@rurban rurban added a commit that referenced this issue Sep 20, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
963290b
@rurban rurban added a commit that referenced this issue Sep 20, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
7e906ee
@rurban rurban added a commit that referenced this issue Sep 20, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
c7bdd3c
@rurban rurban added a commit that referenced this issue Sep 21, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
8fb48b6
@rurban rurban added a commit that referenced this issue Sep 21, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
264edb0
@rurban rurban added a commit that referenced this issue Sep 22, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
cd527ab
@rurban rurban added a commit that referenced this issue Sep 23, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
ffcf9a5
@rurban rurban added a commit that referenced this issue Sep 23, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
92aa780
@rurban rurban added a commit that referenced this issue Sep 23, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
e2f320d
@rurban rurban added a commit that referenced this issue Sep 25, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
8bb7970
@rurban rurban added a commit that referenced this issue Sep 25, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
b558bc3
@rurban rurban added a commit that referenced this issue Sep 25, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
70ba5db
@rurban rurban added a commit that referenced this issue Sep 25, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
2373d41
@rurban rurban added a commit that referenced this issue Sep 25, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
95e831c
@rurban rurban added a commit that referenced this issue Sep 25, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
9f4ce5a
@rurban rurban added a commit that referenced this issue Sep 26, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
deff4ce
@rurban rurban added a commit that referenced this issue Sep 26, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
7286666
@rurban rurban added a commit that referenced this issue Sep 26, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
060cddc
@rurban rurban added a commit that referenced this issue Sep 26, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
205cc47
@rurban rurban added a commit that referenced this issue Sep 29, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
e86d844
@rurban rurban added a commit that referenced this issue Sep 29, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
eb824f3
@rurban rurban added a commit that referenced this issue Sep 30, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
7e347e7
@rurban rurban added a commit that referenced this issue Sep 30, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
39acbc4
@rurban rurban added a commit that referenced this issue Oct 1, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
8137082
@rurban rurban added a commit that referenced this issue Oct 1, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
e9fa676
@rurban rurban added a commit that referenced this issue Oct 2, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
d2399a4
@rurban rurban added a commit that referenced this issue Oct 2, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
053513d
@rurban rurban added a commit that referenced this issue Oct 2, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
62b709c
@rurban rurban added a commit that referenced this issue Nov 5, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
9dd440d
@rurban rurban added a commit that referenced this issue Nov 5, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
4c9d217
@rurban rurban added a commit that referenced this issue Nov 7, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
7bce180
@rurban rurban added a commit that referenced this issue Nov 7, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
e2f54c8
@rurban rurban added a commit that referenced this issue Nov 11, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
0f1d971
@rurban rurban added a commit that referenced this issue Nov 11, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
69c6c3a
@rurban rurban added a commit that referenced this issue Nov 15, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
4c5a4f6
@rurban rurban added a commit that referenced this issue Nov 15, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
29fcd60
@rurban rurban added a commit that referenced this issue Nov 15, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
4f14ed1
@rurban rurban added a commit that referenced this issue Nov 15, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
918c110
@rurban rurban added a commit that referenced this issue Nov 18, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
213291e
@rurban rurban added a commit that referenced this issue Nov 18, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
54deefa
@rurban rurban added a commit that referenced this issue Nov 20, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
Calculate hashes on demand, but do not store them in a HEK,
to make the HEK shorter and fit more entries into a cache line.
HEK_HASH(hek) is now invalid and gone.
Use the new HeHASH_calc(he), HEK_HASH_calc(hek), SvSHARED_HASH_calc(sv)
instead.

Using 4 tests in the hot hash loop also does not make much sense,
when checking the length and the string is enough to weed out
collisions.
This strategy, recomputing the hash when needed, is so far 1-7% slower,
but we hope to get up to speed with the HeARRAY patch. See below.
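
As a rough illustration of the idea (not the actual patch), a key record without a cached hash, an on-demand hash function, and a comparison that checks only length and bytes might look like this. `demo_hek`, `demo_hek_hash_calc` and `demo_hek_eq` are hypothetical names, and 32-bit FNV-1a is used purely as a stand-in, not as the hash function perl uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key record: no hash field, so it is 4 bytes shorter. */
typedef struct demo_hek {
    uint32_t    len;
    const char *str;
} demo_hek;

/* Recompute the hash whenever it is needed (e.g. on a split).
 * Stand-in: 32-bit FNV-1a. */
static uint32_t demo_hek_hash_calc(const demo_hek *hek) {
    uint32_t h = 2166136261u;          /* FNV offset basis */
    uint32_t i;
    for (i = 0; i < hek->len; i++) {
        h ^= (unsigned char)hek->str[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Collision check in the hot loop: length first, then the bytes.
 * No stored hash is compared. */
static int demo_hek_eq(const demo_hek *hek, const char *key, uint32_t len) {
    return hek->len == len && memcmp(hek->str, key, len) == 0;
}
```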

The end goal is to get rid of linked lists and store the collisions
inline in consecutive memory, in a HekARRAY (len, cmp-flags, char*, other-flags, val).
Measurements in "Cache-Conscious Collision Resolution in String Hash Tables"
by Nikolas Askitis and Justin Zobel, Melbourne 2005 show that this is the
fastest strategy for Open Hashing (chained) tables.
See GH #24 and GH #102

The next idea is to use MSB varint encoding of the string length in a HEK,
because our strings are usually short: a len < 63 fits into one byte.
We can then merge it with the cmp-flags, the flags only needed for comparison.
See https://techoverflow.net/blog/2013/01/25/efficiently-encoding-variable-length-integers-in-cc/
or just use one byte for len < 63, and an MSB marker followed by an I32 len for len > 63.
Note that the 1st MSB bit is already taken for UTF8.
defcbf5
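
One possible reading of the simpler length encoding mentioned in the HeArray commit message (one byte for short keys, a marker plus a 32-bit length otherwise) is sketched below. This is not a true multi-byte varint, and the bit layout is an assumption: bit 7 carries the UTF8 flag, bit 6 marks a long length, and the low 6 bits hold short lengths. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LEN_UTF8 0x80u   /* bit 7: already taken for UTF8 */
#define LEN_LONG 0x40u   /* bit 6: a 32-bit length follows (assumed layout) */
#define LEN_MASK 0x3Fu   /* low 6 bits: the length itself when short */

/* Write the merged flags/length header into buf (up to 5 bytes).
 * Returns the number of bytes written. */
static size_t len_encode(uint8_t *buf, uint32_t len, int utf8) {
    uint8_t flags = utf8 ? LEN_UTF8 : 0;
    if (len <= LEN_MASK) {
        buf[0] = (uint8_t)(flags | len);
        return 1;
    }
    buf[0] = (uint8_t)(flags | LEN_LONG);
    memcpy(buf + 1, &len, sizeof len);   /* native byte order, in-memory only */
    return 1 + sizeof len;
}

/* Read the header back. Returns the number of bytes consumed. */
static size_t len_decode(const uint8_t *buf, uint32_t *len, int *utf8) {
    *utf8 = (buf[0] & LEN_UTF8) != 0;
    if (!(buf[0] & LEN_LONG)) {
        *len = buf[0] & LEN_MASK;
        return 1;
    }
    memcpy(len, buf + 1, sizeof *len);
    return 1 + sizeof *len;
}
```

With this layout a short key costs a single header byte, so the length and the comparison flags can be tested with one load in the lookup loop.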
@rurban rurban added a commit that referenced this issue Nov 20, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
317868a
@rurban rurban added a commit that referenced this issue Nov 20, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
4008f1f
@rurban rurban added a commit that referenced this issue Nov 20, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
9daf4d5
@rurban rurban added a commit that referenced this issue Nov 22, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
fbac0f3
@rurban rurban added a commit that referenced this issue Nov 22, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
1636c2f
@rurban rurban added a commit that referenced this issue Nov 22, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
a02e407
@rurban rurban added a commit that referenced this issue Nov 22, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
af9a1fc
@rurban rurban added a commit that referenced this issue Nov 23, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
a12098a
@rurban rurban added a commit that referenced this issue Nov 23, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
508508f
@rurban rurban added a commit that referenced this issue Nov 24, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
Calculate hashes on demand, but do not store them in a HEK,
to make the HEK shorter and fit more entries into a cache line.
HEK_HASH(hek) is now invalid and gone.
Use the new HeHASH_calc(he), HEK_HASH_calc(hek), SvSHARED_HASH_calc(sv)
instead.
See http://www.ilikebigbits.com/blog/2016/8/28/designing-a-fast-hash-table
for benchmarks (HashCache).

Using 4 tests in the hot hash loop also does not make much sense,
when checking the length and the string is enough to weed out
collisions.
This strategy, recomputing the hash when needed, is so far 1-7% slower,
but we hope to get up to speed with the HeARRAY patch. See below.

The end goal is to get rid of linked lists and store the collisions
inline in consecutive memory, in a HekARRAY (len, cmp-flags, char*, other-flags, val).
Measurements in "Cache-Conscious Collision Resolution in String Hash Tables"
by Nikolas Askitis and Justin Zobel, Melbourne 2005 show that this is the
fastest strategy for Open Hashing (chained) tables.
See GH #24 and GH #102

The next idea is to use MSB varint encoding of the string length in a HEK,
because our strings are usually short: a len < 63 fits into one byte.
We can then merge it with the cmp-flags, the flags only needed for comparison.
See https://techoverflow.net/blog/2013/01/25/efficiently-encoding-variable-length-integers-in-cc/
or just use one byte for len < 63, and an MSB marker followed by an I32 len for len > 63.
Note that the 1st MSB bit is already taken for UTF8.
8348465
@rurban rurban added a commit that referenced this issue Nov 27, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
c149a40
@rurban rurban added a commit that referenced this issue Nov 27, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
a44fa29
@rurban rurban added a commit that referenced this issue Nov 27, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
f94fbe9
@rurban rurban added a commit that referenced this issue Nov 27, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
a725cf1
@rurban rurban added a commit that referenced this issue Nov 27, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
f49687b
@rurban rurban added a commit that referenced this issue Nov 27, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
0af99e1
@rurban rurban added a commit that referenced this issue Nov 30, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
39cc665
@rurban rurban added a commit that referenced this issue Nov 30, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
31defea
@rurban rurban added the in progress label Dec 1, 2016
@rurban rurban added a commit that referenced this issue Dec 1, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
b9ef437
@rurban rurban added a commit that referenced this issue Dec 1, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
a3b4c57
@rurban rurban added a commit that referenced this issue Dec 4, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
fd6d376
@rurban rurban added a commit that referenced this issue Dec 4, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
cd25966
@rurban rurban added a commit that referenced this issue Dec 4, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
87134f0
@rurban rurban added a commit that referenced this issue Dec 4, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
419b9d1
@rurban rurban added a commit that referenced this issue Dec 10, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
afe78b5
@rurban rurban added a commit that referenced this issue Dec 10, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
8a81fcd
@rurban rurban added a commit that referenced this issue Dec 11, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
2897e29
@rurban rurban added a commit that referenced this issue Dec 11, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
08d7cc6
@rurban rurban added a commit that referenced this issue Dec 31, 2016
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
7f764c7
@rurban rurban added a commit that referenced this issue Dec 31, 2016
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
Calculate hashes on demand, but do not store them in a HEK,
to make the HEK shorter and fit more entries into a cache line.
HEK_HASH(hek) is now invalid and gone.
Use the new HeHASH_calc(he), HEK_HASH_calc(hek), SvSHARED_HASH_calc(sv)
instead.
See http://www.ilikebigbits.com/blog/2016/8/28/designing-a-fast-hash-table
for benchmarks (HashCache).

Using 4 tests in the hot hash loop also does not make much sense,
when checking the length and the string is enough to weed out
collisions.
This strategy, recomputing the hash when needed, is so far 1-7% slower,
but we hope to get up to speed with the HeARRAY patch. See below.

The end goal is to get rid of linked lists and store the collisions
inlined in consecutive memory, in a HekARRAY. (len,cmp-flags,char*,other-flags,val)
Measurements in "Cache-Conscious Collision Resolution in String Hash Tables"
by Nikolas Askitis and Justin Zobel, Melbourne 2005, show that this is the
fastest strategy for open hashing (chained) tables.
See GH #24 and GH #102.

The next idea is to use MSB varint encoding of the string length in a HEK.
Our strings are usually short, so a len < 63 fits into one byte.
We can then merge it with the cmp-flags, i.e. the flags only needed for comparison.
See https://techoverflow.net/blog/2013/01/25/efficiently-encoding-variable-length-integers-in-cc/
or simply: < 63 in one byte, > 63 with the MSB set plus an I32 len.
Note that the first MSB is already taken for UTF8.
8b3bab3
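To illustrate the idea of storing collisions inline in consecutive memory and comparing only length and bytes (no stored hash), here is a small standalone sketch. The packed layout `[len][key bytes][val]`, the fixed 256-byte buffer and the names `bucket`, `bucket_append` and `bucket_find` are assumptions for the example; the real HekARRAY layout also carries flags and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bucket: entries packed back to back as
 *   [len: 1 byte][key: len bytes][val: sizeof(int) bytes]
 * No per-entry hash and no next pointer is stored. */
typedef struct {
    uint8_t buf[256];
    size_t  used;                 /* bytes of buf currently occupied */
} bucket;

/* Append an entry; returns 0 when the bucket is full. */
static int bucket_append(bucket *b, const char *key, uint8_t len, int val)
{
    size_t need = 1 + (size_t)len + sizeof val;
    uint8_t *p;
    assert(b != NULL && key != NULL);
    if (b->used + need > sizeof b->buf)
        return 0;
    p = b->buf + b->used;
    *p++ = len;
    memcpy(p, key, len);
    memcpy(p + len, &val, sizeof val);
    b->used += need;
    return 1;
}

/* Scan the bucket front to back: the length check comes first,
 * then the bytes are compared. Returns 1 and sets *val on a match. */
static int bucket_find(const bucket *b, const char *key, uint8_t len, int *val)
{
    const uint8_t *p, *end;
    assert(b != NULL && key != NULL && val != NULL);
    p = b->buf;
    end = b->buf + b->used;
    while (p < end) {
        uint8_t klen = *p++;
        if (klen == len && memcmp(p, key, len) == 0) {
            memcpy(val, p + klen, sizeof *val);
            return 1;
        }
        p += klen + sizeof *val;  /* skip key and value of this entry */
    }
    return 0;
}
```

Because all colliding keys sit in one contiguous buffer, a lookup walks memory sequentially instead of chasing a pointer per entry, which is the effect the cited measurements are about.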
@rurban rurban added a commit that referenced this issue Jan 1, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
c62d919
@rurban rurban added a commit that referenced this issue Jan 1, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
00b2c31
@rurban rurban added a commit that referenced this issue Jan 1, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
9d6f3ee
@rurban rurban added a commit that referenced this issue Jan 1, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
ab7b411
@rurban rurban added a commit that referenced this issue Jan 2, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
a819b5b
@rurban rurban added a commit that referenced this issue Jan 2, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
9759238
@rurban rurban added a commit that referenced this issue Jan 2, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
4ded4cf
@rurban rurban added a commit that referenced this issue Jan 2, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
3af89a9
@rurban rurban added a commit that referenced this issue Jan 4, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
ddbbe14
@rurban rurban added a commit that referenced this issue Jan 4, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
15f1c7e
@rurban rurban added a commit that referenced this issue Jan 4, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
5e49e63
@rurban rurban added a commit that referenced this issue Jan 4, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
07ce7b9
@rurban rurban added a commit that referenced this issue Jan 4, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
f2f22de
@rurban rurban added a commit that referenced this issue Jan 4, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
7e9c64e
@rurban rurban added a commit that referenced this issue Jan 7, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
fadde7d
@rurban rurban added a commit that referenced this issue Jan 7, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
6c023d2
@rurban rurban added a commit that referenced this issue Jan 9, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
7b08554
@rurban rurban added a commit that referenced this issue Jan 9, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
44cd0cf
@rurban rurban added a commit that referenced this issue Jan 13, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
bd64bb7
@rurban rurban added a commit that referenced this issue Jan 13, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
26287a2
@rurban rurban added a commit that referenced this issue Jan 14, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
e3fa602
@rurban rurban added a commit that referenced this issue Jan 14, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
4cbd931
@rurban rurban added a commit that referenced this issue Jan 15, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
55fbf48
@rurban rurban added a commit that referenced this issue Jan 15, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
d63101f
@rurban rurban added a commit that referenced this issue Jan 15, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
8d2b2ce
@rurban rurban added a commit that referenced this issue Jan 15, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
713bbb8
@rurban rurban added a commit that referenced this issue Jan 17, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
fd6cd4a
@rurban rurban added a commit that referenced this issue Jan 17, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
Calculate hashes on demand, but not store it in a HEK
to make HEK shorter to fill more entries into a cache line.
HEK_HASH(hek) is now invalid and gone.
Use the new HeHASH_calc(he), HEK_HASH_calc(hek), SvSHARED_HASH_calc(sv)
instead.
See http://www.ilikebigbits.com/blog/2016/8/28/designing-a-fast-hash-table
for benchmarks (HashCache).

And using 4 tests in the hot hash loop also makes not much sense,
when checking the length and the string is enough to weed out
collisions.
This strategy, recomputing the hash wehen needed, is so far 1-7% slower,
but we hope to get to speed with the HeARRAY patch. See below.

The endgoal is to get rid of linked lists and store the collisions
inlined in consecutive memory, in a HekARRAY. (len,cmp-flags,char*,other-flags,val)
Measurements in "Cache-Conscious Collision Resolution in String Hash Tables"
by Nikolas Askitis and Justin Zobel, Melbourne 2005 show that this is the
fastest strategy for Open Hashing (chained) tables.
See GH #24 and GH #102
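To illustrate the inlined collision storage, here is a rough sketch of one bucket kept in consecutive memory and scanned by comparing the length first and the bytes second, with no stored hash. The type `inl_bucket`, the functions `inl_init`, `inl_append` and `inl_find`, the fixed 256-byte buffer and the layout are assumptions for illustration only. The flag fields from the HekARRAY tuple are left out, and empty keys are not supported because a 0 length byte ends the bucket.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical inlined bucket: entries are packed back to back as
 *   [len: 1 byte][key: len bytes][val: sizeof(void*) bytes]
 * and the bucket ends with a 0 length byte. Not cperl's actual layout. */
typedef struct {
    unsigned char buf[256];
    size_t        used;          /* bytes used, excluding the terminator */
} inl_bucket;

static void inl_init(inl_bucket *b)
{
    b->used = 0;
    b->buf[0] = 0;
}

/* Append an entry; returns 0 if it does not fit or the key is empty/too long. */
static int inl_append(inl_bucket *b, const char *key, size_t len, void *val)
{
    size_t need = 1 + len + sizeof(void *);
    unsigned char *p;

    if (len == 0 || len > 255 || b->used + need + 1 > sizeof(b->buf))
        return 0;
    p = b->buf + b->used;
    *p++ = (unsigned char)len;
    memcpy(p, key, len);
    p += len;
    memcpy(p, &val, sizeof val); /* memcpy avoids unaligned pointer stores */
    p += sizeof val;
    *p = 0;                      /* terminator */
    b->used += need;
    return 1;
}

/* Walk the bucket sequentially; compare the length first, then the bytes. */
static void *inl_find(const inl_bucket *b, const char *key, size_t len)
{
    const unsigned char *p = b->buf;

    while (*p != 0) {
        size_t elen = *p++;
        if (elen == len && memcmp(p, key, len) == 0) {
            void *val;
            memcpy(&val, p + elen, sizeof val);
            return val;
        }
        p += elen + sizeof(void *);
    }
    return NULL;
}
```

Because all colliding keys sit in one contiguous buffer, a lookup touches consecutive cache lines instead of chasing a pointer per entry.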

The next idea is to use MSB varint encoding for the string length in a HEK,
because our strings are usually short: len < 63 fits into one byte.
We can then merge it with the cmp-flags, the flags needed only for comparison.
See https://techoverflow.net/blog/2013/01/25/efficiently-encoding-variable-length-integers-in-cc/
or simply use one byte for len < 63, and for len > 63 set the MSB and store an I32 len.
Note that the first (most significant) bit is already taken for UTF8.
0a2df9f
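The simple variant of the length encoding could look roughly like the sketch below. The bit assignment is an assumption, not taken from the patch: the top bit carries the UTF8 flag, the next bit marks a long key, and the low six bits hold a short length. The names `HDR_UTF8`, `HDR_LONG`, `HDR_LEN`, `len_encode` and `len_decode` are hypothetical, and an unsigned 32-bit length in native byte order stands in for the I32.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HDR_UTF8  0x80u   /* top bit: already taken for the UTF8 flag */
#define HDR_LONG  0x40u   /* assumed marker: a 4-byte length follows */
#define HDR_LEN   0x3Fu   /* low six bits: short length */

/* Writes the header for a key of length len; returns bytes written (1 or 5). */
static size_t len_encode(unsigned char *out, uint32_t len, int is_utf8)
{
    unsigned char flags = is_utf8 ? HDR_UTF8 : 0;

    if (len < 63) {                        /* short key: length and flag in one byte */
        out[0] = (unsigned char)(flags | len);
        return 1;
    }
    out[0] = (unsigned char)(flags | HDR_LONG);
    memcpy(out + 1, &len, sizeof len);     /* native byte order */
    return 1 + sizeof len;
}

/* Reads a header; returns bytes consumed (1 or 5). */
static size_t len_decode(const unsigned char *in, uint32_t *len, int *is_utf8)
{
    *is_utf8 = (in[0] & HDR_UTF8) != 0;
    if (!(in[0] & HDR_LONG)) {
        *len = in[0] & HDR_LEN;
        return 1;
    }
    memcpy(len, in + 1, sizeof *len);
    return 1 + sizeof *len;
}
```

With this layout the common short key costs a single header byte, and the comparison loop can test length and UTF8 flag with one byte compare.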
@rurban rurban added a commit that referenced this issue Jan 24, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
ee067b1
@rurban rurban added a commit that referenced this issue Jan 24, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
3e9cce1
@rurban rurban added a commit that referenced this issue Jan 25, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
da240e4
@rurban rurban added a commit that referenced this issue Jan 25, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
1498e2c
@rurban rurban added a commit that referenced this issue Jan 27, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
d15044f
@rurban rurban added a commit that referenced this issue Jan 27, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
2b8554e
@rurban rurban added a commit that referenced this issue Jan 30, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
c17345b
@rurban rurban added a commit that referenced this issue Jan 30, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
1b91d7f
@rurban rurban added a commit that referenced this issue Jan 31, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
3ed9519
@rurban rurban added a commit that referenced this issue Jan 31, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
d5006a7
@rurban rurban added a commit that referenced this issue Feb 4, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
c90f3f1
@rurban rurban added a commit that referenced this issue Feb 4, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
8a6ac96
@rurban rurban added a commit that referenced this issue Feb 8, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
87670f7
@rurban rurban added a commit that referenced this issue Feb 8, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
69de073
@rurban rurban added a commit that referenced this issue Feb 8, 2017
@rurban rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
b536ae0
@rurban rurban added a commit that referenced this issue Feb 8, 2017
@rurban rurban HeArray: remove hek_hash and refcounted_he_hash
4a4d419
@rurban rurban added a commit that referenced this issue Feb 12, 2017
@rurban @rurban rurban + rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
acbcab9
@rurban rurban added a commit that referenced this issue Feb 12, 2017
@rurban @rurban rurban + rurban HeArray: remove hek_hash and refcounted_he_hash
4801ae8
@rurban rurban added a commit that referenced this issue Feb 16, 2017
@rurban @rurban rurban + rurban HeArray: remove hek_hash and refcounted_he_hash
bb2d92e
@rurban rurban added a commit that referenced this issue Feb 16, 2017
@rurban @rurban rurban + rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
a3ab2d7
@rurban rurban added a commit that referenced this issue Feb 16, 2017
@rurban @rurban rurban + rurban HeArray: remove hek_hash and refcounted_he_hash
176cb01
@rurban rurban added a commit that referenced this issue Feb 16, 2017
@rurban @rurban rurban + rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
1b4d793
@rurban rurban added a commit that referenced this issue Feb 18, 2017
@rurban @rurban rurban + rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
f966bca
@rurban rurban added a commit that referenced this issue Feb 18, 2017
@rurban @rurban rurban + rurban HeArray: remove hek_hash and refcounted_he_hash
47ee106
@rurban rurban added a commit that referenced this issue Feb 19, 2017
@rurban @rurban rurban + rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
a722e3c
@rurban rurban added a commit that referenced this issue Feb 19, 2017
@rurban @rurban rurban + rurban HeArray: remove hek_hash and refcounted_he_hash
79f913e
@rurban rurban added a commit that referenced this issue Feb 19, 2017
@rurban @rurban rurban + rurban HeArray: remove hek_hash and refcounted_he_hash
0044f45
@rurban rurban added a commit that referenced this issue Feb 19, 2017
@rurban @rurban rurban + rurban smallhash: 2nd optim, linear scan if < 7 keys (WIP)
1ce5aff