Update xxHash and hashing APIs #1905
a385e9a
to
1e81f7d
Compare
Please remove xxhash.h from LICENSE.3rdparty as well.
8b9dccd
to
1a17e6d
Compare
It turns out that we need this function to be stateless, because an algorithm that is both oneshot and incremental will always initialize its state upon construction, which is a pessimization when performing one-shot hashing.
There is a clash with this overload: template <class HashAlgorithm, class T> requires portable_hash<T, HashAlgorithm> void hash_append(HashAlgorithm& h, const T& x) noexcept { h(std::addressof(x), sizeof(x)); }
Provide a find module for xxhash
e354e1f
to
c674ec7
Compare
c674ec7
to
2e7978d
Compare
We need the as_bytes() overload. This obviates the need for the member function data(). Removing that one created a long tail of follow-ups.
The name is not very clear, and with constexpr there is now hardly any reason not to inline the actual conditional directly.
7c6b45c
to
57d1a4c
Compare
@dominiklohmann could you take a look at the current CI error? Seems to be a CMake thingy where I am a snail.
That should be easy to fix, you likely just need to install xxhash in another place in CI. Let's chat about this tomorrow.
@dominiklohmann I tried fixing this in adfbd6d by adding extra dependencies, but no luck.
Just providing the find module alongside installations of libvast does not suffice; we need to make use of it as well.
adfbd6d
to
9cd564e
Compare
496b7b8
to
8bed7a7
Compare
One deficiency of hash_append is that it doesn't support runtime seeding. This commit offers an interface for this use case, yielding the following two user-facing interfaces:

    hash<H>(x, y, z);
    seeded_hash<H>{a, b, c}(x, y, z);

Here, H is the algorithm, x, y, z are the objects to hash, and a, b, c are the seeds given to the constructor of H. The implementation chooses the optimal code path to perform oneshot or incremental hashing.
We already had a few video calls over this, and I only have some minor comments.
We need to create a follow-up story to measure the speed-up from using xxh64_3, so we can double-check it's not an accidental regression, and use the numbers for the release blog post.
8bac64b
to
2bbb7cd
Compare
📔 Description
This PR switches the default hash function from XXH64 to XXH3, making possible a 2-3x speed improvement.
📝 Checklist
- std::hash<T> specializations to use vast::hash
- vast::uhash with vast::hash where possible
- is_uniquely_represented traits (address, flow, uuid, integer)
- Provide a default specialization for std::hash<T> when T is hashable
- [...] (address hashing changed)
- address bugfix in PCAP reader

🎯 Review Instructions
Commit-by-commit.