optimize nextPowOfTwo #5347

Merged
2 commits merged into jruby:master from ahorek:pow2 on Oct 11, 2018

3 participants

ahorek commented Oct 4, 2018

Integer.numberOfLeadingZeros is intrinsified by the JIT, so it is more efficient than the original bit-smearing version.

It would also be possible to use Integer.highestOneBit, but I didn't, because it has only been optimized the same way since Java 11; see https://bugs.openjdk.java.net/browse/JDK-8199843
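For comparison, here is a minimal sketch of what the highestOneBit alternative mentioned above might look like. The class and method names are hypothetical and this is not code from the PR; it assumes inputs in the range 1 to 2^30.

```java
public class HighestOneBitSketch {
    // Hypothetical helper (not from the PR): smallest power of two >= n,
    // assuming 1 <= n <= 2^30.
    static int nextPowOfTwo(int n) {
        // Largest power of two that is <= n.
        int high = Integer.highestOneBit(n);
        // If n is already a power of two keep it, otherwise double.
        return high == n ? n : high << 1;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 5, 8, 1000}) {
            System.out.println(n + " -> " + nextPowOfTwo(n));
        }
    }
}
```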

require 'benchmark/ips'
require 'set'
Benchmark.ips do |x|
  x.report('set') { 100.times { Set[:x, :a, :h, :h, :h, :g, :h, :h, :h, :i] } }
end

patch (benchmark)

                      8.632k (± 1.9%) i/s -     43.316k in   5.020055s

master (benchmark)

                      8.384k (±10.6%) i/s -     41.477k in   5.035840s

patch (bytecode)

         0: ldc           #3                  // int -2147483648
         2: iload_0
         3: invokestatic  #4                  // Method java/lang/Integer.numberOfLeadingZeros:(I)I
         6: iushr
         7: istore_1
         8: iload_0
         9: iload_1
        10: if_icmpne     17
        13: iload_0
        14: goto          20
        17: iload_1
        18: iconst_1
        19: ishl
        20: ireturn
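
Read back into Java, the patch bytecode above corresponds roughly to the following source. This is a reconstruction from the listing, not the committed code; the class name and local variable name are assumptions.

```java
public class PatchSketch {
    // Reconstructed from the bytecode listing; names are assumptions.
    static int nextPowOfTwo(int n) {
        // Shift the sign bit right so it lines up with the highest set bit of n.
        // For n > 0 this yields the largest power of two <= n.
        int x = Integer.MIN_VALUE >>> Integer.numberOfLeadingZeros(n);
        // Already a power of two: return it unchanged. Otherwise double.
        return n == x ? n : x << 1;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 5, 16, 17}) {
            System.out.println(n + " -> " + nextPowOfTwo(n));
        }
    }
}
```

The single conditional at the end is the branch discussed later in the thread.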

master (bytecode)

         0: iload_0
         1: istore_1
         2: iinc          1, -1
         5: iload_1
         6: iload_1
         7: iconst_1
         8: ishr
         9: ior
        10: istore_1
        11: iload_1
        12: iload_1
        13: iconst_2
        14: ishr
        15: ior
        16: istore_1
        17: iload_1
        18: iload_1
        19: iconst_4
        20: ishr
        21: ior
        22: istore_1
        23: iload_1
        24: iload_1
        25: bipush        8
        27: ishr
        28: ior
        29: istore_1
        30: iload_1
        31: iload_1
        32: bipush        16
        34: ishr
        35: ior
        36: istore_1
        37: iinc          1, 1
        40: iload_1
        41: ireturn
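
The master bytecode above is the classic bit-smearing approach. A rough Java reconstruction from the listing follows; the class name and the local variable name are assumptions, and the signed shift (>>) mirrors the ishr instructions in the dump.

```java
public class MasterSketch {
    // Reconstructed from the bytecode listing; names are assumptions.
    static int nextPowOfTwo(int n) {
        int v = n;
        v--;            // so that exact powers of two map to themselves
        v |= v >> 1;    // smear the highest set bit into every lower position
        v |= v >> 2;
        v |= v >> 4;
        v |= v >> 8;
        v |= v >> 16;
        v++;            // all-ones below the top bit, plus one, is the next power of two
        return v;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 5, 16, 17}) {
            System.out.println(n + " -> " + nextPowOfTwo(n));
        }
    }
}
```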
pavel

@ahorek force-pushed the ahorek:pow2 branch from b095d86 to 36b857f on Oct 4, 2018

@ahorek changed the title from "optimalize nextPowOfTwo" to "optimize nextPowOfTwo" on Oct 5, 2018

kares commented Oct 11, 2018

heh, interesting ... maybe the branch (last line) could be eliminated as well?

ahorek commented Oct 11, 2018

I took the safe path: the branch can be eliminated only if we can ensure that the input is always positive and never 0 or 1. Let me confirm that.

ahorek commented Oct 11, 2018

         0: ldc           #3                  // int -2147483648
         2: iload_0
         3: iconst_1
         4: isub
         5: invokestatic  #4                  // Method java/lang/Integer.numberOfLeadingZeros:(I)I
         8: iushr
         9: iconst_1
        10: ishl
        11: ireturn
        8.787k (±14.2%) i/s -     43.036k in   5.046647s
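
The branch-free bytecode above corresponds roughly to the Java below. This is a reconstruction from the listing with assumed class and method names, not the committed code. The main method also illustrates why the inputs 0 and 1 were a concern: Java masks int shift distances to five bits, so a leading-zero count of 32 shifts by 0.

```java
public class BranchlessSketch {
    // Reconstructed from the bytecode listing; names are assumptions.
    // Correct for 2 <= n <= 2^30.
    static int nextPowOfTwo(int n) {
        // numberOfLeadingZeros(n - 1) locates the highest bit of n - 1;
        // the final shift doubles that power of two.
        return Integer.MIN_VALUE >>> Integer.numberOfLeadingZeros(n - 1) << 1;
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 3, 4, 5, 1000}) {
            System.out.println(n + " -> " + nextPowOfTwo(n));
        }
        // Edge cases: both evaluate to 0 rather than 1.
        System.out.println("0 -> " + nextPowOfTwo(0));
        System.out.println("1 -> " + nextPowOfTwo(1));
    }
}
```

Whether returning 0 for those inputs matters depends on the callers, which is the condition raised in the earlier comment.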

Inspired by jemalloc/jemalloc#1303

@ahorek force-pushed the ahorek:pow2 branch from 6b5af3b to f95225a on Oct 11, 2018

pavel

@ahorek force-pushed the ahorek:pow2 branch from f95225a to 21bd95e on Oct 11, 2018

@headius merged commit 825026b into jruby:master on Oct 11, 2018

1 check was pending

continuous-integration/travis-ci/pr The Travis CI build is in progress

@headius added this to the JRuby 9.2.1.0 milestone on Oct 11, 2018
