Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OMT: maybe optimize integer map keys if feasible #36

Closed
fxamacker opened this issue Jul 16, 2021 · 3 comments
Closed

OMT: maybe optimize integer map keys if feasible #36

fxamacker opened this issue Jul 16, 2021 · 3 comments
Assignees

Comments

@fxamacker
Copy link
Member

We can potentially skip SipHash for integer key types to gain speed.

Maybe this optimization can be evaluated after OMT is feature complete and we have more test coverage (and fuzzing) in place.

Thanks @ramtinms for mentioning this possible optimization!

@fxamacker
Copy link
Member Author

While looking for a fast hash designed for 64-bit integer data, I stumbled across Thomas Wang's hash.

Thomas Wang's hash for 64-bit integers

uint64_t hash(uint64_t key) {
  key = (~key) + (key << 21); // key = (key << 21) - key - 1;
  key = key ^ (key >> 24);
  key = (key + (key << 3)) + (key << 8); // key * 265
  key = key ^ (key >> 14);
  key = (key + (key << 2)) + (key << 4); // key * 21
  key = key ^ (key >> 28);
  key = key + (key << 31);
  return key;
}

@fxamacker
Copy link
Member Author

fxamacker commented Sep 24, 2021

Using a non-compatible 8-byte input hasher can increase hash collisions with other input sizes.

There's no need to use a non-compatible 8-byte input hasher if we use a fast non-crypto hash for level 1.

On Intel Haswell CPUs with Go 1.16, my CircleHash functions (which pass Strict Avalanche Criterion, Bit Independence Criterion, and etc.) are able to perform faster than one call to the assignment:

foo := uint64_a % uint64_b  // slower than fast non-crypto hash function that passes SMHasher

I just need to write the README explaining why it should be used instead of current non-crypto fast hashes, add more tests, etc. before making it public (maybe this weekend).

@fxamacker fxamacker self-assigned this Sep 24, 2021
@fxamacker
Copy link
Member Author

This issue is closed by PR #145. CircleHash64 is faster than foo = uint64_a % uint64_b on Haswell CPU and within about 1 ns per hash on newer Intel CPU models (around 3.5 ns/op for short inputs).

CircleHash64 is currently written to be an easy-to-audit reference, so there's room for optimization if it becomes a bottleneck.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant