Are there any other hashes we should be benchmarking against? #28

Closed
cmuratori opened this issue Oct 28, 2018 · 18 comments

Comments

@cmuratori
Owner

I used the ones I could find that were supposed to be speed-oriented. But I may have missed some.

- Casey

@mmozeiko
Contributor

FarmHash - compile with SSE4.2 enabled.

@dgryski

dgryski commented Oct 29, 2018

Highway Hash and Metro Hash come to mind.

@dgryski

dgryski commented Oct 29, 2018

@cmuratori
Owner Author

Metro hash is already in the benchmark set. Highway Hash was in the set, but it was so slow I had to pull it - it was an order of magnitude slower than all the other hashes in the set, and was dominating the amount of time it took to run the benchmark series :(

I will look at adding FarmHash and CLHash.

- Casey

@erthink

erthink commented Oct 30, 2018

I just added Meow to the devel branch of t1ha's internal benchmark.
So far only MeowHash1(), built with GCC's -maes flag.

Preparing to benchmarking...
 - running on CPU#3
 - use RDPMC_40000001 as clock source for benchmarking
 - assume it cheap and stable
 - measure granularity and overhead: 53 cycles, 0.0188679 iteration/cycle

Bench for tiny keys (7 bytes):
t1ha2_atonce            :     17.300 cycle/hash,  2.471 cycle/byte,  0.405 byte/cycle,  1.214 Gb/s @3GHz 
t1ha0                   :     16.125 cycle/hash,  2.304 cycle/byte,  0.434 byte/cycle,  1.302 Gb/s @3GHz 
meow1_aes               :     96.312 cycle/hash, 13.759 cycle/byte,  0.073 byte/cycle,  0.218 Gb/s @3GHz 
xxhash64                :     26.219 cycle/hash,  3.746 cycle/byte,  0.267 byte/cycle,  0.801 Gb/s @3GHz 
StadtX                  :     19.302 cycle/hash,  2.757 cycle/byte,  0.363 byte/cycle,  1.088 Gb/s @3GHz 
HighwayHash64_avx2      :     56.750 cycle/hash,  8.107 cycle/byte,  0.123 byte/cycle,  0.370 Gb/s @3GHz 

Bench for large keys (16384 bytes):
t1ha2_atonce            :   3547.000 cycle/hash,  0.216 cycle/byte,  4.619 byte/cycle, 13.857 Gb/s @3GHz 
t1ha0                   :   1308.000 cycle/hash,  0.080 cycle/byte, 12.526 byte/cycle, 37.578 Gb/s @3GHz 
meow1_aes               :   1139.000 cycle/hash,  0.070 cycle/byte, 14.385 byte/cycle, 43.154 Gb/s @3GHz 
xxhash64                :   4119.000 cycle/hash,  0.251 cycle/byte,  3.978 byte/cycle, 11.933 Gb/s @3GHz 
StadtX                  :   3670.000 cycle/hash,  0.224 cycle/byte,  4.464 byte/cycle, 13.393 Gb/s @3GHz 
HighwayHash64_avx2      :   4405.000 cycle/hash,  0.269 cycle/byte,  3.719 byte/cycle, 11.158 Gb/s @3GHz 
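
For readers checking the columns above: they appear to all derive from the measured cycles per hash and the key size. The sketch below recomputes the meow1_aes large-key row under that assumption. The function name `bench_metrics` is hypothetical and not part of t1ha's benchmark, and the "Gb/s" column is treated as gigabytes per second at the 3 GHz clock the output itself assumes.

```python
def bench_metrics(cycles_per_hash, key_bytes, clock_ghz=3.0):
    """Derive the per-byte columns from one cycles-per-hash measurement.

    Hypothetical helper for illustration; not taken from t1ha's benchmark code.
    """
    cycles_per_byte = cycles_per_hash / key_bytes
    bytes_per_cycle = key_bytes / cycles_per_hash
    # bytes/cycle * (clock_ghz * 1e9 cycles/s) / 1e9 = gigabytes per second
    gbytes_per_sec = bytes_per_cycle * clock_ghz
    return cycles_per_byte, bytes_per_cycle, gbytes_per_sec


# meow1_aes, large keys: 1139 cycles for a 16384-byte key
cpb, bpc, gbs = bench_metrics(1139.0, 16384)
print(f"{cpb:.3f} cycle/byte, {bpc:.3f} byte/cycle, {gbs:.3f} GB/s @3GHz")
# prints "0.070 cycle/byte, 14.385 byte/cycle, 43.154 GB/s @3GHz"
```

The same arithmetic reproduces the tiny-key row (96.312 cycles over 7 bytes gives 13.759 cycle/byte), which shows that fixed per-call overhead dominates for short keys.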

@jan-wassenberg

> Highway Hash was in the set, but it was so slow I had to pull it

That's surprising, sounds like it might not be compiled with -mavx2 - should be around 0.25 cpb (0.269 in the above results).

@erthink

erthink commented Oct 30, 2018

> > Highway Hash was in the set, but it was so slow I had to pull it
>
> That's surprising, sounds like it might not be compiled with -mavx2 - should be around 0.25 cpb (0.269 in the above results).

In my benchmark, HWH is compiled with -mavx2.

Intel(R) Core(TM) i7-4600U CPU @ 2.10GHz
gcc version 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)

@cmuratori
Owner Author

That is correct, it would not have been compiled with AVX2. At the moment we generally don't allow anything past AVX. Technically we should be benchmarking only up through SSE AES-NI, since that is all that Meow requires. And if Highway Hash only gets into the 4 bytes/cycle range even when extended to AVX2, I don't think it's appropriate to expand the benchmark's qualifications to AVX2 just for a hash that isn't particularly fast. If it were actually going to be a leading hash it might make sense, but requiring AVX2 for a slow hash doesn't seem interesting?

- Casey

@jan-wassenberg

I agree it's justified to omit HWH because it's in a different class (aiming for robustness vs. malicious inputs).

> we should be benchmarking only up through SSE AES-NI, since that is all that Meow requires.

Surprisingly, AESNI is only guaranteed for Skylake and later (https://reviews.llvm.org/rL341862). For example, here's a Haswell without AESNI: https://ark.intel.com/products/76294/Intel-Core-i3-4102E-Processor-3M-Cache-1_60-GHz

(FYI we also use AES in a 2048-bit cryptographic permutation serving as an RNG: http://github.com/google/randen)

@cmuratori
Owner Author

Well, if it's unlikely that AES-NI would exist without AVX2, then I'd be OK raising the benchmark to AVX2... the goal is just to compare apples to apples :)

- Casey

@mmozeiko
Contributor

That is not quite accurate. There are a lot of Intel CPUs with AES-NI but without AVX/AVX2. These are usually found in lower-end all-in-one box PCs or in mobile/tablet devices. Whether that is the target audience of Meow hash... not for me to say.

Example - I have a small box PC with a Pentium N4200, which is the Goldmont microarchitecture (chronologically comes after Skylake). It has all the SSE levels up to 4.2, and it also has the AES-NI and SHA1/SHA256 instruction sets. But no AVX and no AVX2.
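
Since AES-NI and AVX2 turn out to be independent features, a benchmark runner would have to check each requirement separately rather than infer one from the other. The following is only a sketch of that check: it assumes Linux-style feature flag names as they appear in /proc/cpuinfo (`aes`, `avx`, `avx2`), and the two flag lists are abbreviated, hypothetical examples rather than dumps from real hardware. The names `supports`, `MEOW_REQUIRES` and `AVX2_BUILD_REQUIRES` are made up for illustration.

```python
MEOW_REQUIRES = {"aes"}          # per the thread: SSE plus AES-NI
AVX2_BUILD_REQUIRES = {"avx2"}   # what an AVX2-only hash build would need


def supports(flags_line, required):
    """Return True if every required flag name appears in the flag list."""
    return required <= set(flags_line.split())


def read_linux_cpu_flags(path="/proc/cpuinfo"):
    """Best-effort read of the first x86 'flags' line; empty string if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return line.split(":", 1)[1]
    except OSError:
        pass
    return ""


# Hypothetical, abbreviated flag lists (not captured from real machines)
aes_without_avx = "sse sse2 ssse3 sse4_1 sse4_2 aes sha_ni"
avx2_without_aes = "sse sse2 ssse3 sse4_1 sse4_2 avx avx2"

for name, flags in [("aes_without_avx", aes_without_avx),
                    ("avx2_without_aes", avx2_without_aes)]:
    print(name,
          "meow:", supports(flags, MEOW_REQUIRES),
          "avx2 build:", supports(flags, AVX2_BUILD_REQUIRES))
```

Passing `read_linux_cpu_flags()` as the first argument to `supports` would apply the same check to the current machine on x86 Linux.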

@cmuratori
Owner Author

I added Highway Hash back in, and it is still excruciatingly slow even with /arch:AVX2. I think the problem may be that I only have the C implementation (there are no intrinsics anywhere in it :( ). I will look for an optimized version.

- Casey

@dgryski

dgryski commented Oct 31, 2018

Grab the official C++ code from https://github.com/google/highwayhash

@cmuratori
Owner Author

@dgryski That is what I grabbed - I just took the "C" directory from there and compiled it. I'm guessing that is not what they intend, but I haven't looked through the giant other source directory to find out where the actual optimized hash lives.

If someone wants to send me what the proper include is to actually just compile the hash function, I'll stick it in place of the C one :)

- Casey

@dgryski

dgryski commented Nov 1, 2018

The C directory is a reference implementation with no vector instructions. You need the highwayhash directory with all the C++ code. The README lays out what you need to do. Basically: run make, then include the header you want and link against libhighwayhash.

@cmuratori
Owner Author

cmuratori commented Nov 1, 2018

That's not going to work, then. Every other hash we use is ~4 files, tops. If someone wants to make a simple h/cpp pair of it, I'll add it to the benchmark, otherwise it's a no-go. I'm not going to quintuple (or more??) the size of the Meow benchmark build just for one (not particularly fast) hash.

- Casey

@jan-wassenberg

Fair enough.

@cmuratori
Owner Author

Besides the aforementioned Highway Hash problem, v0.3 will come with the other hashes requested. I should note, however, that although I included CLHash, it really should be disqualified, because it cannot hash things with a seed at reasonable speeds. Each time you change the seed, you must go through a lengthy separate call that generates a table. So its performance is apples to oranges as opposed to all the other hashes in the suite, which are expected to take a different seed on every call.

- Casey
