Are there any other hashes we should be benchmarking against? #28
Comments
FarmHash - compile with SSE4.2 enabled. |
Highway Hash and Metro Hash come to mind. |
Metro hash is already in the benchmark set. Highway Hash was in the set, but it was so slow I had to pull it - it was an order of magnitude slower than all the other hashes in the set, and was dominating the amount of time it took to run the benchmark series :( I will look at adding FarmHash and CLHash. - Casey |
I just added Meow to the devel branch of t1ha's internal benchmark.
|
That's surprising, sounds like it might not be compiled with |
In my benchmark
|
That is correct, it would not have been compiled with AVX2. At the moment we generally don't allow anything past AVX. Technically we should be benchmarking only up through SSE plus AES-NI, since that is all that Meow requires. And if Highway Hash only gets into the 4 bytes/cycle range even when extended to AVX2, I don't think it's appropriate to expand the qualifications for the benchmark to AVX2 just for a hash that isn't particularly fast... if it were actually going to be a leading hash it might make sense, but requiring AVX2 for a slow hash doesn't seem interesting? - Casey |
I agree it's justified to omit HWH because it's in a different class (aiming for robustness vs. malicious inputs).
Surprisingly, AESNI is only guaranteed for Skylake and later (https://reviews.llvm.org/rL341862). For example, here's a Haswell without AESNI: https://ark.intel.com/products/76294/Intel-Core-i3-4102E-Processor-3M-Cache-1_60-GHz (FYI we also use AES in a 2048-bit cryptographic permutation serving as an RNG: http://github.com/google/randen) |
Well, if it's unlikely that AES-NI would exist without AVX-2, then I'd be OK raising the benchmark to AVX2... the goal is just to compare apples to apples :) - Casey |
That is not quite accurate. There are a lot of Intel CPUs with AES-NI but without AVX/AVX2. These are usually lower-end all-in-one box PCs or mobile/tablet parts. Whether this is the target audience of Meow hash... not for me to say. Example: I have a small-box PC with a Pentium N4200, which is the Goldmont microarchitecture (chronologically it comes after Skylake). It has every SSE level up to 4.2, and it also has the AES-NI and SHA1/SHA256 instruction sets. But no AVX and no AVX2. |
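For anyone who wants to check which of these combinations a given machine falls into, here is a minimal sketch that reads the flags Linux reports in `/proc/cpuinfo`. The flag names (`aes`, `avx`, `avx2`, `sse4_2`, `sha_ni`) are the ones the Linux kernel uses; the function names, tier labels, and sample text are made up for illustration, and the sample is not a dump from any real CPU.

```python
# Sketch: classify a CPU by the ISA flags Linux reports in /proc/cpuinfo.
# Function names, tier labels, and SAMPLE are illustrative assumptions.

def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def benchmark_tier(flags):
    """Name the highest instruction-set tier (hypothetical labels) the CPU offers."""
    if "aes" not in flags:
        return "no AES-NI"
    if "avx2" in flags:
        return "AES-NI + AVX2"
    if "avx" in flags:
        return "AES-NI + AVX"
    return "AES-NI + SSE only"


# Made-up excerpt resembling a part with AES-NI and SHA but no AVX.
SAMPLE = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 aes sha_ni\n"

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on non-Linux systems
    print(benchmark_tier(parse_cpu_flags(text)))
```

This only reports what the kernel advertises. It does not prove that a particular hash build actually uses those instructions.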
I added Highway Hash back in, and it is still excruciatingly slow even with /arch:AVX2, so I think the problem may be that I only have the C implementation (there are no intrinsics anywhere in it :( ). I will look for an optimized version. - Casey |
Grab the official C++ code from https://github.com/google/highwayhash |
@dgryski That is what I grabbed - I just took the "C" directory from there and compiled it. I'm guessing that is not what they intend, but I haven't looked through the giant other source directory to find out where the actual optimized hash lives. If someone wants to send me what the proper include is to actually just compile the hash function, I'll stick it in place of the C one :) - Casey |
The C directory is a reference implementation with no vector instructions. You need the |
That's not going to work, then. Every other hash we use is ~4 files, tops. If someone wants to make a simple h/cpp pair of it, I'll add it to the benchmark, otherwise it's a no-go. I'm not going to quintuple (or more??) the size of the Meow benchmark build just for one (not particularly fast) hash. - Casey |
Fair enough. |
Besides the aforementioned Highway Hash problem, v0.3 will come with the other hashes requested. I should note, however, that although I included CLHash, it really should be disqualified, because it cannot hash things with a seed at reasonable speeds. Each time you change the seed, you must go through a lengthy separate call that generates a table. So its performance is apples to oranges as opposed to all the other hashes in the suite, which are expected to take a different seed on every call. - Casey |
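To make the apples-to-oranges point concrete, here is a rough model of what a fixed per-seed setup cost does to effective throughput when the seed changes on every call. All numbers (8 bytes/cycle steady state, 10,000 cycles of setup) are hypothetical placeholders, not measurements of CLHash or any other hash in the suite, and the function name is invented for this sketch.

```python
# Sketch: how a per-seed setup cost changes effective throughput.
# All numbers are hypothetical; none are measurements from this thread.

def effective_bytes_per_cycle(input_bytes, steady_bytes_per_cycle, setup_cycles):
    """Throughput when every call pays setup_cycles before hashing input_bytes."""
    hashing_cycles = input_bytes / steady_bytes_per_cycle
    return input_bytes / (setup_cycles + hashing_cycles)


if __name__ == "__main__":
    for size in (64, 1024, 1 << 20):
        # "reseeded" pays the setup on every call; "reused" never pays it.
        reseeded = effective_bytes_per_cycle(size, 8.0, 10_000)
        reused = effective_bytes_per_cycle(size, 8.0, 0)
        print(f"{size:>8} bytes: reseeded {reseeded:.3f} B/cycle, "
              f"reused {reused:.3f} B/cycle")
```

Under this model the setup cost dominates for small inputs and only amortizes away on very large ones, which is why a benchmark that reuses one seed and a benchmark that reseeds every call are measuring different things.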
I used the ones I could find that were supposed to be speed-oriented. But I may have missed some.
- Casey