Optimize SipHash using sun.misc.Unsafe #1681

Merged
merged 3 commits into from Nov 2, 2014

@grddev
Contributor

@grddev grddev commented May 5, 2014

As SipHash is designed to consume its input 8 bytes at a time, a key part of the algorithm is reading eight bytes at once. Unfortunately, there is no way to directly read eight bytes at once from a byte array without resorting to Unsafe. This implements the unsafe operation with a fallback to the original slow implementation for cases where Unsafe cannot be loaded (or where the native byte order is not little-endian, as SipHash requires).
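A minimal sketch of the reader selection described above, not the code from this pull request. The class names (`LongReaderSketch`, `UnsafeReader`, `FallbackReader`) and the explicit bounds check are assumptions made for illustration; only `sun.misc.Unsafe`, its `theUnsafe` field, `arrayBaseOffset` and `getLong(Object, long)` are real JDK names.

```java
import java.lang.reflect.Field;
import java.nio.ByteOrder;

// Hypothetical names throughout; a sketch of the idea, not the merged patch.
abstract class LongReaderSketch {
    abstract long getLong(byte[] src, int offset);

    // Use the fast reader only when the platform is little-endian and Unsafe loads.
    static LongReaderSketch create() {
        if (ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN) {
            try {
                return new UnsafeReader();
            } catch (Throwable t) {
                // Unsafe is missing or inaccessible: fall through to the slow reader.
            }
        }
        return new FallbackReader();
    }

    // Portable reader: assembles the little-endian long one byte at a time.
    static final class FallbackReader extends LongReaderSketch {
        @Override
        long getLong(byte[] src, int offset) {
            long value = 0;
            for (int i = 0; i < 8; i++) {
                value |= (src[offset + i] & 0xffL) << (8 * i);
            }
            return value;
        }
    }

    // Fast reader: a single native-order 8-byte load straight from the array.
    static final class UnsafeReader extends LongReaderSketch {
        private final sun.misc.Unsafe unsafe;
        private final long base;

        UnsafeReader() throws Exception {
            Field field = sun.misc.Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            unsafe = (sun.misc.Unsafe) field.get(null);
            base = unsafe.arrayBaseOffset(byte[].class);
        }

        @Override
        long getLong(byte[] src, int offset) {
            // Unsafe performs no bounds checking, so this sketch checks explicitly.
            if (offset < 0 || offset > src.length - 8) {
                throw new ArrayIndexOutOfBoundsException(offset);
            }
            return unsafe.getLong(src, base + offset);
        }
    }
}
```

Both readers should return the same value for the same input on a little-endian machine; the only intended difference is that the fast path issues one load instead of eight.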

The following performance comparison, which hashes strings of increasing length, shows that the unsafe implementation is roughly 25% faster for really long strings and about the same speed for short strings.

require 'benchmark'

N = 30_000_000

puts RUBY_DESCRIPTION
Benchmark.bmbm do |x|
  [0,1,8,16,32,64,1024,65536].each do |len|
    string = 'x'*len
    n = N/(0.25*len+10) + N/20_000
    x.report(len.to_s) { n.to_i.times { string.hash } }
  end
end

jruby 1.7.12 (1.9.3p392) 2014-04-15 643e292 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]

            user     system      total        real
0       0.650000   0.000000   0.650000 (  0.653000)
1       0.630000   0.000000   0.630000 (  0.626000)
8       0.600000   0.010000   0.610000 (  0.599000)
16      0.590000   0.000000   0.590000 (  0.584000)
32      0.570000   0.000000   0.570000 (  0.573000)
64      0.550000   0.000000   0.550000 (  0.541000)
1024    0.470000   0.000000   0.470000 (  0.471000)
65536   0.780000   0.000000   0.780000 (  0.783000)

jruby 1.7.13-SNAPSHOT (1.9.3p392) 2014-05-01 c02fcd7 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]

            user     system      total        real
0       0.650000   0.000000   0.650000 (  0.643000)
1       0.660000   0.000000   0.660000 (  0.654000)
8       0.610000   0.010000   0.620000 (  0.608000)
16      0.570000   0.000000   0.570000 (  0.559000)
32      0.530000   0.000000   0.530000 (  0.527000)
64      0.440000   0.000000   0.440000 (  0.442000)
1024    0.350000   0.000000   0.350000 (  0.342000)
65536   0.580000   0.000000   0.580000 (  0.583000)

Running the same benchmark against PerlHash, we can see that the unsafe implementation of SipHash is about the same speed for strings around length 64, and for longer strings the unsafe implementation of SipHash is significantly faster than PerlHash:

            user     system      total        real
0       0.460000   0.000000   0.460000 (  0.452000)
1       0.420000   0.000000   0.420000 (  0.416000)
8       0.440000   0.000000   0.440000 (  0.444000)
16      0.430000   0.010000   0.440000 (  0.422000)
32      0.410000   0.000000   0.410000 (  0.414000)
64      0.470000   0.000000   0.470000 (  0.462000)
1024    0.420000   0.000000   0.420000 (  0.425000)
65536   0.750000   0.000000   0.750000 (  0.750000)

Even though this implementation of SipHash is faster than the previous one, I don't think it is fast enough to replace PerlHash entirely, as most Hash keys are probably short strings. Keep in mind, though, that PerlHash's 20% advantage on short strings represents less than 100ns per invocation, whereas SipHash's 20% advantage on long strings represents a much longer time.
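To make the per-invocation claim concrete, here is a rough sketch that converts the `real` column of the tables above into time per `#hash` call. It assumes the iteration count follows the formula in the benchmark script exactly; the class and method names are hypothetical, and the timings include the overhead of the benchmark loop itself, so the results are upper bounds.

```java
// Hypothetical helper; mirrors the iteration formula from the benchmark script.
final class PerCallCost {
    static final int N = 30_000_000;

    // Same arithmetic as the Ruby expression N/(0.25*len+10) + N/20_000, truncated like to_i.
    static long iterations(int len) {
        return (long) (N / (0.25 * len + 10) + N / 20_000);
    }

    // Wall-clock seconds for the whole run, converted to nanoseconds per call.
    static double nanosPerCall(double realSeconds, int len) {
        return realSeconds * 1e9 / iterations(len);
    }

    public static void main(String[] args) {
        // Real times copied from the tables above: unsafe SipHash, then PerlHash.
        System.out.printf("len 0:     %.0f ns vs %.0f ns%n",
                nanosPerCall(0.643, 0), nanosPerCall(0.452, 0));
        System.out.printf("len 65536: %.0f ns vs %.0f ns%n",
                nanosPerCall(0.583, 65536), nanosPerCall(0.750, 65536));
    }
}
```

Under these assumptions the empty string costs roughly 214 ns with the unsafe SipHash versus 151 ns with PerlHash, a gap of about 64 ns, while the 65536-byte string costs roughly 175 µs versus 225 µs, a gap of about 50 µs in SipHash's favour.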

grddev added 3 commits May 5, 2014
As these loops will be executed only once for every #hash invocation,
it would make sense to defer the decision to unroll the loops to the
runtime.

In principle, this should allow the JIT compiler to remove all range
checks within the loop. I haven't had time to verify this though.

While one could wish that JIT compilation optimised the eight sequential
byte reads into a single long read, it in fact does not.

This implementation should fall back to the slow implementation in a
context where Unsafe fails to load, but I haven't figured out how to
test that properly.
private static final class FallbackLongReader extends LongReader {
@Override
public long getLong(byte[] src, int offset) {
return (long) src[offset++] |

@dkarpenko

dkarpenko Jun 5, 2014

You can just assign byte value to variable of type long:

        byte byteValue = 127;
        long longValue = byteValue;

@grddev

grddev Jun 8, 2014
Author Contributor

@dkarpenko, while I did not write this code (you can see the same code in the original), I think the reason for the long type conversion is that the shifts should be applied to longs, as they would otherwise produce ints if I'm not mistaken.

@headius

headius Nov 2, 2014
Member

@grddev I think you're right about that. Bit twiddling can be a little tweaky with Java's automatic type promotions.
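A small sketch of the promotion rule discussed in this thread. The class and method names are made up for illustration; the behaviour shown is standard Java semantics: a `byte` operand of a shift is promoted to `int`, and an `int` shift uses only the low five bits of the shift distance.

```java
// Hypothetical demo class; not part of the patch under review.
final class ShiftPromotion {
    // The byte is promoted to int, so the shift distance is masked: 40 & 31 == 8.
    static long shiftAsInt(byte b) {
        return b << 40;
    }

    // Widening to long first makes this a real 64-bit shift by 40.
    static long shiftAsLong(byte b) {
        return (long) b << 40;
    }

    // Masking with 0xffL also discards the sign extension of negative bytes.
    static long shiftUnsigned(byte b, int bits) {
        return (b & 0xffL) << bits;
    }

    public static void main(String[] args) {
        System.out.println(shiftAsInt((byte) 1));             // 256
        System.out.println(shiftAsLong((byte) 1));            // 1099511627776
        System.out.println(shiftAsLong((byte) 0x80) < 0);     // true, sign-extended
        System.out.println(shiftUnsigned((byte) 0x80, 8));    // 32768
    }
}
```

Plain assignment of a `byte` to a `long` does work, as suggested above, but the widening has to happen before the shift for the upper bytes to land in the right place.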

@headius
Member

@headius headius commented Jun 8, 2014

Hey, great find. If I remember right, I did a quick experiment with Unsafe when we put SipHash into the codebase but never actually landed it. Your patches look good; we'll review and get this installed. I'm not sure we'll want to make SipHash the default, since as you say PerlHash is still faster for small strings (and I wouldn't expect people to be hashing really big strings), but it will be excellent to get SipHash closer to raw native performance. Thank you!

@nahi
Member

@nahi nahi commented Jun 8, 2014

Here's my attempt from 2012. I added some uncommitted sources (Murmur, Perl, etc.) so that Benchmark.java should run.
https://github.com/nahi/siphash-java-inline/tree/master/perf

I don't remember the details, but the Unsafe version was slower. Mine calls Long.reverseBytes(), so the optimization may not have been enough.
https://github.com/nahi/siphash-java-inline/blob/master/perf/SipHashInlineTry.java#L18-22

@grddev
Contributor Author

@grddev grddev commented Jun 9, 2014

@nahi, I tried the benchmark with an added Long.reverseBytes() in the reader, and here are the results:

jruby 1.7.13-SNAPSHOT (1.9.3p392) 2014-05-01 c02fcd7 on Java HotSpot(TM) 64-Bit Server VM 1.8.0_05-b13 [darwin-x86_64]

            user     system      total        real
0       0.680000   0.000000   0.680000 (  0.673000)
1       0.690000   0.000000   0.690000 (  0.687000)
8       0.660000   0.000000   0.660000 (  0.651000)
16      0.620000   0.010000   0.630000 (  0.625000)
32      0.590000   0.000000   0.590000 (  0.582000)
1024    0.470000   0.000000   0.470000 (  0.478000)
65536   0.810000   0.000000   0.810000 (  0.804000)

This is indeed no faster than the 'safe' version, so the unsafe version is only worthwhile when the native byte order matches the little-endian order SipHash needs.
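For reference, a sketch of what the extra byte swap amounts to on a platform whose native order is not little-endian. `ByteBuffer` stands in here for the native-order Unsafe read, which is an assumption made to keep the example safe to run; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; ByteBuffer replaces the Unsafe load for illustration only.
final class EndianRead {
    // Native-order 8-byte read, analogous to what a raw Unsafe load would return.
    static long readNative(byte[] src, int offset) {
        return ByteBuffer.wrap(src).order(ByteOrder.nativeOrder()).getLong(offset);
    }

    // SipHash wants little-endian words, so big-endian hosts need a byte swap.
    static long readLittleEndian(byte[] src, int offset) {
        long raw = readNative(src, offset);
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN
                ? raw
                : Long.reverseBytes(raw);
    }
}
```

The numbers above were produced by adding the swap unconditionally on a little-endian machine, which is how the cost of the additional `Long.reverseBytes()` call was measured.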

@headius
Member

@headius headius commented Nov 2, 2014

Going for it. Thanks!

headius added a commit that referenced this pull request Nov 2, 2014
Optimize SipHash using sun.misc.Unsafe
@headius headius merged commit cb4581c into jruby:jruby-1_7 Nov 2, 2014
1 check passed
@enebo
continuous-integration/travis-ci The Travis CI build passed
@headius headius added this to the JRuby 1.7.17 milestone Nov 2, 2014
@headius headius self-assigned this Nov 2, 2014
4 participants