Optimize SipHash using sun.misc.Unsafe #1681

Merged
headius merged 3 commits into jruby:jruby-1_7 from grddev/unsafe-siphash-opt on Nov 2, 2014

4 participants
@grddev
Contributor

grddev commented May 5, 2014

As SipHash is designed to consume its input 8 bytes at a time, a key part of the algorithm is reading eight bytes at once. Unfortunately, there is no way to read eight bytes at once directly from a byte array without resorting to Unsafe. This PR implements the unsafe read, falling back to the original slow implementation when Unsafe cannot be loaded (or when the native byte order is not little-endian, as SipHash requires).
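
For readers who have not seen the pattern, here is a minimal sketch of what such a reader with a fallback could look like. It is not the code from this PR: the class names, the `create()` factory and the load-time sanity check are assumptions made for illustration. It also assumes a JDK where `sun.misc.Unsafe` is still reachable through reflection and a platform that tolerates unaligned reads.

```java
import java.lang.reflect.Field;
import java.nio.ByteOrder;

// Illustrative sketch only: class and member names are hypothetical,
// not the ones used in this pull request.
abstract class LongReader {
    /** Reads src[offset..offset+7] as one little-endian long. */
    abstract long getLong(byte[] src, int offset);

    /** Picks the Unsafe-backed reader when it is usable, else the portable one. */
    static LongReader create() {
        LongReader fallback = new FallbackLongReader();
        if (ByteOrder.nativeOrder() != ByteOrder.LITTLE_ENDIAN) {
            return fallback; // a native read would yield the wrong byte order
        }
        try {
            Field field = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            LongReader fast = new UnsafeLongReader((sun.misc.Unsafe) field.get(null));
            // Sanity check: both readers must agree before the fast one is trusted.
            byte[] probe = {1, 2, 3, 4, 5, 6, 7, 8};
            return fast.getLong(probe, 0) == fallback.getLong(probe, 0) ? fast : fallback;
        } catch (Throwable t) {
            return fallback; // Unsafe missing, inaccessible, or disabled
        }
    }
}

final class FallbackLongReader extends LongReader {
    @Override
    long getLong(byte[] src, int offset) {
        long value = 0;
        for (int i = 7; i >= 0; i--) {
            // Mask to avoid sign extension, then accumulate as a long.
            value = (value << 8) | (src[offset + i] & 0xFFL);
        }
        return value;
    }
}

final class UnsafeLongReader extends LongReader {
    private final sun.misc.Unsafe unsafe;
    private final long base;

    UnsafeLongReader(sun.misc.Unsafe unsafe) {
        this.unsafe = unsafe;
        this.base = unsafe.arrayBaseOffset(byte[].class);
    }

    @Override
    long getLong(byte[] src, int offset) {
        // Unsafe performs no bounds checking, so check explicitly.
        if (offset < 0 || offset > src.length - 8) {
            throw new ArrayIndexOutOfBoundsException(offset);
        }
        return unsafe.getLong(src, base + offset);
    }
}
```

A caller would obtain the reader once, for example `LongReader reader = LongReader.create();`, and then call `reader.getLong(bytes, i)` inside the hashing loop.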

The following benchmark, which hashes strings of increasing length, shows that the unsafe implementation is roughly 25% faster for very long strings and about the same speed for short strings.

require 'benchmark'

N = 30_000_000

puts RUBY_DESCRIPTION
Benchmark.bmbm do |x|
  [0,1,8,16,32,64,1024,65536].each do |len|
    string = 'x'*len
    n = N/(0.25*len+10) + N/20_000
    x.report(len.to_s) { n.to_i.times { string.hash } }
  end
end

jruby 1.7.12 (1.9.3p392) 2014-04-15 643e292 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]

            user     system      total        real
0       0.650000   0.000000   0.650000 (  0.653000)
1       0.630000   0.000000   0.630000 (  0.626000)
8       0.600000   0.010000   0.610000 (  0.599000)
16      0.590000   0.000000   0.590000 (  0.584000)
32      0.570000   0.000000   0.570000 (  0.573000)
64      0.550000   0.000000   0.550000 (  0.541000)
1024    0.470000   0.000000   0.470000 (  0.471000)
65536   0.780000   0.000000   0.780000 (  0.783000)

jruby 1.7.13-SNAPSHOT (1.9.3p392) 2014-05-01 c02fcd7 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]

            user     system      total        real
0       0.650000   0.000000   0.650000 (  0.643000)
1       0.660000   0.000000   0.660000 (  0.654000)
8       0.610000   0.010000   0.620000 (  0.608000)
16      0.570000   0.000000   0.570000 (  0.559000)
32      0.530000   0.000000   0.530000 (  0.527000)
64      0.440000   0.000000   0.440000 (  0.442000)
1024    0.350000   0.000000   0.350000 (  0.342000)
65536   0.580000   0.000000   0.580000 (  0.583000)

Running the same benchmark against PerlHash, we can see that the unsafe implementation of SipHash is about the same speed for strings around length 64, and for longer strings the unsafe implementation of SipHash is significantly faster than PerlHash:

            user     system      total        real
0       0.460000   0.000000   0.460000 (  0.452000)
1       0.420000   0.000000   0.420000 (  0.416000)
8       0.440000   0.000000   0.440000 (  0.444000)
16      0.430000   0.010000   0.440000 (  0.422000)
32      0.410000   0.000000   0.410000 (  0.414000)
64      0.470000   0.000000   0.470000 (  0.462000)
1024    0.420000   0.000000   0.420000 (  0.425000)
65536   0.750000   0.000000   0.750000 (  0.750000)

Even though this implementation of SipHash is faster than the previous one, I don't think it is fast enough to replace PerlHash entirely, as most Hash keys are probably short strings. Keep in mind, though, that PerlHash's 20% advantage on short strings amounts to less than 100 ns per invocation, whereas SipHash's 20% advantage on long strings amounts to a much longer time.
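
To put rough numbers on the per-invocation claim, the sketch below converts the "real" column of the tables above into time per call. The iteration formula is taken from the benchmark script; the class and method names are hypothetical, and the figures include the overhead of the benchmark loop itself.

```java
// Per-call cost derived from the benchmark tables above (hypothetical helper names).
final class PerCall {
    static final int N = 30_000_000;

    /** Iteration count the Ruby benchmark uses for a given string length. */
    static long iterations(int len) {
        // Mirrors `N/(0.25*len+10) + N/20_000` followed by `to_i`.
        return (long) (N / (0.25 * len + 10) + N / 20_000);
    }

    /** Nanoseconds per call for a measured wall-clock time in seconds. */
    static double nanosPerCall(double seconds, int len) {
        return seconds * 1e9 / iterations(len);
    }

    public static void main(String[] args) {
        // "real" column: unsafe SipHash first, PerlHash second.
        System.out.printf("len 0:     %.1f ns vs %.1f ns%n",
                nanosPerCall(0.643, 0), nanosPerCall(0.452, 0));
        System.out.printf("len 65536: %.1f ns vs %.1f ns%n",
                nanosPerCall(0.583, 65536), nanosPerCall(0.750, 65536));
    }
}
```

By this estimate the gap at length 0 is about 64 ns per call in PerlHash's favour, while at length 65536 it is about 50 µs per call in SipHash's favour.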

grddev added some commits May 5, 2014

Avoid manual unroll of non-hot SipHash loops
As these loops will be executed only once for every #hash invocation,
it would make sense to defer the decision to unroll the loops to the
runtime.
Hoist SipHashInline range checks
In principle, this should allow the JIT compiler to remove all range
checks within the loop. I haven't had time to verify this though.
Use Unsafe to read a long at a time
While one could wish that JIT compilation optimised the eight sequential
byte reads into a single long read, it in fact does not.

This implementation should fall back to the slow implementation in a
context where Unsafe fails to load, but I haven't figured out how to
test that properly.
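
As an illustration of the range-check hoisting mentioned in the second commit, here is a small sketch. It is not the SipHashInline code: `xorWords` is a made-up stand-in for the hashing loop, and whether a given JIT actually eliminates the per-access checks afterwards is, as the commit message says, unverified.

```java
// Hypothetical illustration of hoisting a range check; not the SipHashInline code.
final class HoistedCheck {
    /** XORs the complete 8-byte little-endian words in src[offset, offset + length). */
    static long xorWords(byte[] src, int offset, int length) {
        // Single up-front check covering every index the loop will touch.
        if (offset < 0 || length < 0 || length > src.length - offset) {
            throw new ArrayIndexOutOfBoundsException();
        }
        int end = offset + (length & ~7); // one past the final complete word
        long acc = 0;
        for (int i = offset; i < end; i += 8) {
            acc ^= (src[i] & 0xFFL)
                 | (src[i + 1] & 0xFFL) << 8
                 | (src[i + 2] & 0xFFL) << 16
                 | (src[i + 3] & 0xFFL) << 24
                 | (src[i + 4] & 0xFFL) << 32
                 | (src[i + 5] & 0xFFL) << 40
                 | (src[i + 6] & 0xFFL) << 48
                 | (src[i + 7] & 0xFFL) << 56;
        }
        return acc;
    }
}
```

The idea is that once `offset` and `length` are validated against `src.length` before the loop, every index inside the loop is provably in bounds, which gives the compiler a chance to drop the implicit checks.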
private static final class FallbackLongReader extends LongReader {
    @Override
    public long getLong(byte[] src, int offset) {
        return (long) src[offset++] |

@dkarpenko

dkarpenko Jun 5, 2014

You can just assign byte value to variable of type long:

        byte byteValue = 127;
        long longValue = byteValue;

@grddev

grddev Jun 8, 2014

Author Contributor

@dkarpenko, while I did not write this code (you can see the same code in the original), I think the reason for the long type conversion is that the shifts should be applied to longs, as they would otherwise produce ints if I'm not mistaken.

@headius

headius Nov 2, 2014

Member

@grddev I think you're right about that. Bit twiddling can be a little tweaky with Java's automatic type promotions.
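
A small sketch of the promotion behaviour discussed in this thread, with hypothetical method names. A `byte` operand of a shift is promoted to `int`, and an `int` shift only uses the low five bits of the shift distance, so shifting by 40 without a cast actually shifts by 8. A plain `(long)` cast also sign-extends negative bytes, which is why a mask is commonly applied; the truncated snippet above does not show whether the original code masks.

```java
// Hypothetical demo of Java's numeric promotion in shifts.
final class ShiftPromotion {
    /** byte is promoted to int; the distance 40 is reduced to 40 & 31 = 8. */
    static long withoutCast(byte b) {
        return b << 40;
    }

    /** The cast makes this a 64-bit shift, but negative bytes are sign-extended. */
    static long withCast(byte b) {
        return (long) b << 40;
    }

    /** Masking with a long constant both widens and clears the sign extension. */
    static long masked(byte b) {
        return (b & 0xFFL) << 40;
    }

    public static void main(String[] args) {
        System.out.println(withoutCast((byte) 1)); // shifted by 8, not 40
        System.out.println(withCast((byte) 1));    // shifted by 40
        System.out.println(Long.toHexString(withCast((byte) 0x80)));
        System.out.println(Long.toHexString(masked((byte) 0x80)));
    }
}
```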

@headius

Member

headius commented Jun 8, 2014

Hey, great find. If I remember right, I did a quick experiment with Unsafe when we put SipHash into the codebase but never actually landed it. Your patches look good; we'll review and get them merged. I'm not sure we'll want to make SipHash the default, since as you say PerlHash is still faster for small strings (and I wouldn't expect people to be hashing really big strings), but it will be excellent to get SipHash closer to raw native performance. Thank you!

@nahi

Member

nahi commented Jun 8, 2014

Here's my attempt from 2012. I added some uncommitted sources (Murmur, Perl, etc.) so that Benchmark.java should run.
https://github.com/nahi/siphash-java-inline/tree/master/perf

I don't remember the details, but the Unsafe version was slower. Mine calls Long.reverseBytes(), so the optimization may not have been enough.
https://github.com/nahi/siphash-java-inline/blob/master/perf/SipHashInlineTry.java#L18-22

@grddev

Contributor Author

grddev commented Jun 9, 2014

@nahi, I tried the benchmark with an added Long.reverseBytes() in the reader, and here are the results:

jruby 1.7.13-SNAPSHOT (1.9.3p392) 2014-05-01 c02fcd7 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]

            user     system      total        real
0       0.680000   0.000000   0.680000 (  0.673000)
1       0.690000   0.000000   0.690000 (  0.687000)
8       0.660000   0.000000   0.660000 (  0.651000)
16      0.620000   0.010000   0.630000 (  0.625000)
32      0.590000   0.000000   0.590000 (  0.582000)
1024    0.470000   0.000000   0.470000 (  0.478000)
65536   0.810000   0.000000   0.810000 (  0.804000)

This is indeed not faster than the 'safe' version, so the unsafe version is only worthwhile when the native byte order already matches the little-endian order SipHash needs.
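
For context, this is roughly what the extra step amounts to. The sketch below uses `ByteBuffer` only to simulate both byte orders portably; it is not the approach benchmarked above, and the class and method names are made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo: a big-endian read followed by a byte swap yields the
// little-endian value that SipHash expects.
final class EndianDemo {
    /** What a big-endian host would have to do after a native 8-byte read. */
    static long viaReverse(byte[] src, int offset) {
        long bigEndian = ByteBuffer.wrap(src).order(ByteOrder.BIG_ENDIAN).getLong(offset);
        return Long.reverseBytes(bigEndian);
    }

    /** What a little-endian host gets from the native read directly. */
    static long direct(byte[] src, int offset) {
        return ByteBuffer.wrap(src).order(ByteOrder.LITTLE_ENDIAN).getLong(offset);
    }
}
```

Both methods return the same value; the results above suggest that, at least in this benchmark, the additional swap cancelled out the gain from the single read.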

@headius

Member

headius commented Nov 2, 2014

Going for it. Thanks!

headius added a commit that referenced this pull request Nov 2, 2014

Merge pull request #1681 from grddev/unsafe-siphash-opt
Optimize SipHash using sun.misc.Unsafe

@headius headius merged commit cb4581c into jruby:jruby-1_7 Nov 2, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed

@headius headius added this to the JRuby 1.7.17 milestone Nov 2, 2014

@headius headius self-assigned this Nov 2, 2014
