optimize nextPowOfTwo #5347

Merged
merged 2 commits into from Oct 11, 2018

@ahorek
Contributor

@ahorek ahorek commented Oct 4, 2018

Integer.numberOfLeadingZeros is a JIT intrinsic and more efficient than the original version.

It's also possible to use Integer.highestOneBit, but I didn't, because it has been optimized the same way only since Java 11; see https://bugs.openjdk.java.net/browse/JDK-8199843
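
For comparison, here is a minimal sketch of what the highestOneBit alternative might look like. It is not the code in this PR; the class name and method body are hypothetical and only illustrate the idea, assuming the goal is to round a positive int up to a power of two.

```java
// Hypothetical sketch, not the JRuby source.
final class HighestOneBitSketch {
    static int nextPowOfTwo(int i) {
        // For i > 0 this is the largest power of two <= i; it is 0 when i == 0.
        int h = Integer.highestOneBit(i);
        // Already a power of two (or 0): return as is. Otherwise double it.
        // Note: for i > 2^30 the shift overflows to Integer.MIN_VALUE.
        return i == h ? i : h << 1;
    }

    public static void main(String[] args) {
        for (int i = 0; i <= 9; i++) {
            System.out.println(i + " -> " + nextPowOfTwo(i));
        }
    }
}
```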

require 'benchmark/ips'
require 'set'
Benchmark.ips do |x|
  x.report('set') { 100.times { Set[:x, :a, :h, :h, :h, :g, :h, :h, :h, :i] } }
end

patch

                      8.632k (± 1.9%) i/s -     43.316k in   5.020055s

master

                      8.384k (±10.6%) i/s -     41.477k in   5.035840s

patch

         0: ldc           #3                  // int -2147483648
         2: iload_0
         3: invokestatic  #4                  // Method java/lang/Integer.numberOfLeadingZeros:(I)I
         6: iushr
         7: istore_1
         8: iload_0
         9: iload_1
        10: if_icmpne     17
        13: iload_0
        14: goto          20
        17: iload_1
        18: iconst_1
        19: ishl
        20: ireturn
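
Read back into Java, the patch bytecode above corresponds to roughly the following. This is a reconstruction from the listing, so the class, method, and variable names are assumptions rather than the actual JRuby source.

```java
// Reconstructed from the bytecode listing; names are hypothetical.
final class BranchyNlzSketch {
    static int nextPowOfTwo(int i) {
        // Integer.MIN_VALUE is 0x80000000. Shifting it right (unsigned) by the
        // number of leading zeros of i lands on the highest set bit of i.
        int n = Integer.MIN_VALUE >>> Integer.numberOfLeadingZeros(i);
        // If i is exactly that bit it is already a power of two;
        // otherwise move up to the next one.
        return i == n ? i : n << 1;
    }

    public static void main(String[] args) {
        for (int i = 0; i <= 9; i++) {
            System.out.println(i + " -> " + nextPowOfTwo(i));
        }
    }
}
```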

master

         0: iload_0
         1: istore_1
         2: iinc          1, -1
         5: iload_1
         6: iload_1
         7: iconst_1
         8: ishr
         9: ior
        10: istore_1
        11: iload_1
        12: iload_1
        13: iconst_2
        14: ishr
        15: ior
        16: istore_1
        17: iload_1
        18: iload_1
        19: iconst_4
        20: ishr
        21: ior
        22: istore_1
        23: iload_1
        24: iload_1
        25: bipush        8
        27: ishr
        28: ior
        29: istore_1
        30: iload_1
        31: iload_1
        32: bipush        16
        34: ishr
        35: ior
        36: istore_1
        37: iinc          1, 1
        40: iload_1
        41: ireturn
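
For reference, the master bytecode above is the classic bit-smearing approach. A Java reconstruction from the listing might look like this; the names are assumptions, not the actual JRuby source.

```java
// Reconstructed from the bytecode listing; names are hypothetical.
final class BitSmearSketch {
    static int nextPowOfTwo(int i) {
        int n = i;
        n--;            // so that exact powers of two map to themselves
        n |= n >> 1;    // copy the highest set bit into every lower position
        n |= n >> 2;
        n |= n >> 4;
        n |= n >> 8;
        n |= n >> 16;
        n++;            // all-ones below the top bit, plus one, is a power of two
        return n;
    }

    public static void main(String[] args) {
        for (int i = 0; i <= 9; i++) {
            System.out.println(i + " -> " + nextPowOfTwo(i));
        }
    }
}
```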
@ahorek ahorek changed the title from "optimalize nextPowOfTwo" to "optimize nextPowOfTwo" Oct 5, 2018
@kares
Member

@kares kares commented Oct 11, 2018

heh, interesting ... maybe the branch (last line) could be eliminated as well?

@ahorek
Contributor Author

@ahorek ahorek commented Oct 11, 2018

I took the safe path: the branch can be eliminated only if we can ensure that the input is always positive and never 0 or 1. Let me confirm that.

@ahorek
Contributor Author

@ahorek ahorek commented Oct 11, 2018

         0: ldc           #3                  // int -2147483648
         2: iload_0
         3: iconst_1
         4: isub
         5: invokestatic  #4                  // Method java/lang/Integer.numberOfLeadingZeros:(I)I
         8: iushr
         9: iconst_1
        10: ishl
        11: ireturn

        8.787k (±14.2%) i/s -     43.036k in   5.046647s
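
In Java terms, the branchless bytecode above appears to correspond to the sketch below. It is reconstructed from the listing, so the class and method names are assumptions and the real source may differ.

```java
// Reconstructed from the bytecode listing; names are hypothetical.
final class BranchlessNlzSketch {
    static int nextPowOfTwo(int i) {
        // Subtracting 1 first makes exact powers of two map to themselves,
        // which removes the need for the equality check.
        return (Integer.MIN_VALUE >>> Integer.numberOfLeadingZeros(i - 1)) << 1;
    }

    public static void main(String[] args) {
        for (int i = 0; i <= 9; i++) {
            System.out.println(i + " -> " + nextPowOfTwo(i));
        }
    }
}
```

The edge case is the input 1: `i - 1` is 0, `numberOfLeadingZeros(0)` is 32, and Java uses only the low five bits of an int shift distance, so the shift is by 0 and the final `<< 1` overflows to 0. This sketch therefore returns 0 for an input of 1 (and also for 0), which is why dropping the branch is only safe when callers never pass such small values.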

Inspiration: jemalloc/jemalloc#1303

@headius headius merged commit 825026b into jruby:master Oct 11, 2018
1 check was pending
@headius headius added this to the JRuby 9.2.1.0 milestone Oct 11, 2018