Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MINOR: Dry up RubyArray and improve performance in a few spots #4751

Merged
merged 1 commit into from Sep 5, 2017

Conversation

Projects
None yet
2 participants
@original-brownbear
Copy link
Contributor

original-brownbear commented Aug 26, 2017

Some cleanups I came across debugging RubyArray performance recently :)

  • remove dead imports from RubyArray
  • Extract the USE_PACKED_ARRAYS == true paths to a single if branch to be more JIT friendly here
    • especially isPackedArray will be nicely dead code eliminated now for the Collection case and we save an expensive size call here and there (+ bytecode becomes smaller :))
  • Make 2 methods static that had no instance reference anywhere
  • Remove dead cast
  • Removed redundant array creation for varargs method with one arguement
  • Remove dead imports from packed array types
  • Replaced slower collection to array conversion list.toArray(new IRubyObject[list.size()])) with the faster version collection.toArray(IRubyObject.NULL_ARRAY)

Just a few suggestions, let me know what you think :)

return new RubyArray(runtime, values);
}

public static RubyArray newArray(Ruby runtime, List<? extends IRubyObject> list) {

This comment has been minimized.

Copy link
@kares

kares Aug 27, 2017

Member

still'd like to have the List version around for binary compatibility - it also does not need toArray for packed

This comment has been minimized.

Copy link
@original-brownbear

original-brownbear Aug 27, 2017

Author Contributor

@kares right bringing it back sec :)

This comment has been minimized.

Copy link
@original-brownbear

original-brownbear Aug 27, 2017

Author Contributor

brought it back here https://github.com/jruby/jruby/pull/4751/files#diff-008115c4ee37b136355736c9aa675fa3R264 with nicely separate packed and not packed paths :)

@@ -4100,7 +4077,9 @@ public IRubyObject sample(ThreadContext context, IRubyObject[] args) {
if (args.length > 0) {
IRubyObject hash = TypeConverter.checkHashType(context.runtime, args[args.length - 1]);
if (!hash.isNil()) {
IRubyObject[] rets = ArgsUtil.extractKeywordArgs(context, (RubyHash) hash, new String[] { "random" });
IRubyObject[] rets = ArgsUtil.extractKeywordArgs(context, (RubyHash) hash,

This comment has been minimized.

Copy link
@kares

kares Aug 27, 2017

Member

let's keep it consistently a one liner maybe :) ?

This comment has been minimized.

Copy link
@original-brownbear

original-brownbear Aug 27, 2017

Author Contributor

@kares sure :)

@@ -4792,7 +4771,7 @@ public IRubyObject min(ThreadContext context, IRubyObject num, Block block) {
}

private static final int optimizedCmp(ThreadContext context, IRubyObject a, IRubyObject b, int token, CachingCallSite op_cmp, CallSite op_gt, CallSite op_lt) {
if (token == ((RubyBasicObject) a).getMetaClass().generation) {

This comment has been minimized.

Copy link
@kares

kares Aug 27, 2017

Member

this cast is all over the place - it used to help inlining ... without the JIT wasn't just smart-enough to get it "fast"

This comment has been minimized.

Copy link
@original-brownbear

original-brownbear Aug 27, 2017

Author Contributor

@kares ah I see, phew I'm not sure about this point, but in Java 8 this should not help anything anymore right. I'd rather expect this to potentially be harmful because it increases the size of the method and the type profiler will figure out that this is always a RubyBasicObjecteventually right? (I could investigate this a little with JitWatch I guess :))

This comment has been minimized.

Copy link
@kares

kares Aug 27, 2017

Member

kk, but Java 7 still supported.
so it might be valuable to measure using that as well, although Java 8 first I guess
(9K is already considerably slower on 7)

This comment has been minimized.

Copy link
@original-brownbear

original-brownbear Aug 27, 2017

Author Contributor

@kares wow I stand 100% corrected here, check this out:
Tried to use a slightly deeper (in the sense of inheritance) case and benchmarked this:

    @Benchmark
    @OperationsPerInvocation(EVENTS_PER_INVOCATION)
    public final void castPerformanceImpactWithout(final Blackhole blackhole) throws Exception {
        for (int i = 0; i < EVENTS_PER_INVOCATION; ++i) {
            blackhole.consume(random().isTrue());
        }
    }

    @Benchmark
    @OperationsPerInvocation(EVENTS_PER_INVOCATION)
    public final void castPerformanceImpactWith(final Blackhole blackhole) throws Exception {
        for (int i = 0; i < EVENTS_PER_INVOCATION; ++i) {
            blackhole.consume(((RubyBoolean) random()).isTrue());
        }
    }

Results are:

# Run complete. Total time: 00:01:03

Benchmark                                              Mode  Cnt      Score      Error   Units
CastExperimentBenchmark.castPerformanceImpactWith     thrpt   20  92950.858 ± 1169.586  ops/ms
CastExperimentBenchmark.castPerformanceImpactWithout  thrpt   20  69318.454 ± 2354.502  ops/ms

didn't expect this at all (same effect for JDK 7 and 8 btw.) => will revert + thanks for the very educational moment :)

This comment has been minimized.

Copy link
@original-brownbear

original-brownbear Aug 27, 2017

Author Contributor
    private static IRubyObject random() {
        return RubyUtil.RUBY.newBoolean(RANDOM.nextBoolean());
    }
    private static final Random RANDOM = new Random();

was my object generator :)

This comment has been minimized.

Copy link
@original-brownbear

original-brownbear Aug 27, 2017

Author Contributor

The overhead in these cases seems to be the keeping of the type profile itself, inlining is not affected as such and method size is even a byte larger from the cast as I expected:

   5189 2084       4       org.logstash.benchmark.CastExperimentBenchmark::castPerformanceImpactWith (28 bytes)
                              @ 9   org.logstash.benchmark.CastExperimentBenchmark::random (13 bytes)   inline (hot)
                                @ 6   java.util.Random::nextBoolean (14 bytes)   inline (hot)
                                  @ 2   java.util.Random::next (47 bytes)   inline (hot)
                                    @ 8   java.util.concurrent.atomic.AtomicLong::get (5 bytes)   inline (hot)
                                    @ 32   java.util.concurrent.atomic.AtomicLong::compareAndSet (13 bytes)   inline (hot)
                                      @ 9   sun.misc.Unsafe::compareAndSwapLong (0 bytes)   (intrinsic)
                                @ 9   org.jruby.Ruby::newBoolean (16 bytes)   inline (hot)
                              @ 15   org.jruby.RubyBasicObject::isTrue (17 bytes)   inline (hot)
                              @ 18   org.openjdk.jmh.infra.Blackhole::consume (28 bytes)   disallowed by CompilerOracle

   2917 2053       4       org.logstash.benchmark.CastExperimentBenchmark::castPerformanceImpactWithout (27 bytes)
                              @ 9   org.logstash.benchmark.CastExperimentBenchmark::random (13 bytes)   inline (hot)
                                @ 6   java.util.Random::nextBoolean (14 bytes)   inline (hot)
                                  @ 2   java.util.Random::next (47 bytes)   inline (hot)
                                    @ 8   java.util.concurrent.atomic.AtomicLong::get (5 bytes)   inline (hot)
                                    @ 32   java.util.concurrent.atomic.AtomicLong::compareAndSet (13 bytes)   inline (hot)
                                      @ 9   sun.misc.Unsafe::compareAndSwapLong (0 bytes)   (intrinsic)
                                @ 9   org.jruby.Ruby::newBoolean (16 bytes)   inline (hot)
                              @ 12   org.jruby.RubyBasicObject::isTrue (17 bytes)   inline (hot)
                              @ 12   org.jruby.RubyBasicObject::isTrue (17 bytes)   inline (hot)
                               \-> TypeProfile (140240/280503 counts) = org/jruby/RubyBoolean$False
                               \-> TypeProfile (140263/280503 counts) = org/jruby/RubyBoolean$True
                              @ 17   org.openjdk.jmh.infra.Blackhole::consume (28 bytes)   disallowed by CompilerOracle

This comment has been minimized.

Copy link
@kares

kares Aug 27, 2017

Member

thanks, and yes it surprising (and IDE gives warnings all over) also did not know about that.
... but I recall @headius lecture on some of the issues along the way of making JRuby.
think he also raised the issue to the HotSpot team - maybe for 9 thay might do smt about it.
or maybe they won't and eventually just push GraalVM's optimizations for all as Java 10 :)

@original-brownbear original-brownbear force-pushed the original-brownbear:cleanup-ruby-array branch 2 times, most recently from 03cd174 to 51cd6a2 Aug 27, 2017

@original-brownbear

This comment has been minimized.

Copy link
Contributor Author

original-brownbear commented Aug 27, 2017

@kares thanks for the swift review, addressed all comments I think/hope :)
List special case handling is back and should inline very clean now too.

@original-brownbear original-brownbear force-pushed the original-brownbear:cleanup-ruby-array branch from 51cd6a2 to 5152772 Aug 27, 2017

@original-brownbear

This comment has been minimized.

Copy link
Contributor Author

original-brownbear commented Aug 27, 2017

@kares reverted the removed cast and added some benchmarks here #4751 (comment) (just in case GH hides it now that I pushed over the change, kind of interesting imo :))

@kares

kares approved these changes Aug 27, 2017

@kares

This comment has been minimized.

Copy link
Member

kares commented Sep 5, 2017

no one complained in a week or so, thus let's ship it ...

@kares kares merged commit 3cbf5ff into jruby:master Sep 5, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@kares kares added this to the JRuby 9.2.0.0 milestone Sep 5, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.