Fannkuch-redux is less than half as fast in 9k as in 1.7 #2850

Closed
chrisseaton opened this Issue Apr 18, 2015 · 7 comments

Contributor

chrisseaton commented Apr 18, 2015

I've recently finished a new big run of benchmarks - available at http://jruby.github.io/bench9000/. As you'd expect, some benchmarks are a little faster under 9k (e.g. 1.2x, 1.8x), a few a lot faster (e.g. 8.5x), a few are a little slower, and some are worse than that.

A starting point for tackling those that are slower might be fannkuch-redux. It is less than half as fast in 9k as it is in 1.7.

The error bar represents standard deviation - so this benchmark is very stable.

[Screenshot, 2015-04-18: fannkuch-redux benchmark chart with error bars]

$ ~/.rbenv/versions/jruby-1.7.19/bin/ruby -Xcompile.invokedynamic=false ../fannkuch.rb 
0.486
0.303
0.288
0.26
0.258
0.26
0.256
0.266
0.266
0.26
0.261
0.253
$ ~/.rbenv/versions/jruby-1.7.19/bin/ruby -Xcompile.invokedynamic=true ../fannkuch.rb 
1.318
0.164
0.16
0.179
0.328
0.142
0.146
0.15
0.156
0.155
0.153
0.146
0.145
0.16
$ bin/ruby -Xcompile.invokedynamic=false ../fannkuch.rb 
0.704
0.335
0.32
0.314
0.31
0.311
0.319
0.307
0.318
0.314
$ bin/ruby -Xcompile.invokedynamic=true ../fannkuch.rb 
1.367
1.303
0.782
0.323
0.327
0.32
0.315
0.742
0.345
0.344
0.345
$ java -version
java version "1.8.0_31"
Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)
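
As a rough check on the "less than half as fast" claim, the per-iteration times from the two indy=true runs above can be reduced to a single ratio. This is only a sketch: the constant names and the `median` helper are made up for illustration, and taking the median over all printed iterations (warmup included) is an assumption, not how the bench9000 report computes its scores.

```ruby
# Per-iteration times in seconds, copied from the indy=true runs above.
JRUBY_17_INDY = [1.318, 0.164, 0.16, 0.179, 0.328, 0.142, 0.146, 0.15,
                 0.156, 0.155, 0.153, 0.146, 0.145, 0.16].freeze
JRUBY_9K_INDY = [1.367, 1.303, 0.782, 0.323, 0.327, 0.32, 0.315, 0.742,
                 0.345, 0.344, 0.345].freeze

# The median is less sensitive than the mean to a few slow warmup iterations.
def median(samples)
  sorted = samples.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

# Ratio of time per iteration: values above 2 mean 9k is less than half as fast.
slowdown = median(JRUBY_9K_INDY) / median(JRUBY_17_INDY)
puts format('9k takes %.2fx as long as 1.7 per iteration', slowdown)
```

With these numbers the ratio comes out a little above 2, which is consistent with the summary in this report.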

It's not an indy thing - it applies with or without indy. It's also not a warmup thing - the report linked above goes to great pains to ensure a benchmark is warmed up, and will run a benchmark for minutes.
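
For anyone wanting to rule out warmup on their own machine, a minimal timing loop along these lines is enough. It is a sketch only: `time_iterations` is a hypothetical helper, the workload is a stand-in rather than the fannkuch code from the gist, and the warmup cutoff of 3 iterations is arbitrary.

```ruby
# Hypothetical helper: run the given block `iterations` times and return
# each iteration's wall-clock time in seconds.
def time_iterations(iterations)
  Array.new(iterations) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
end

# Stand-in workload; the real benchmark body would go here instead.
samples = time_iterations(10) { (1..50_000).reduce(:+) }

# Treat the first few iterations as warmup and look only at the rest.
steady = samples.drop(3)
puts steady.map { |t| t.round(3) }
```

If the steady-state samples are flat and still slower than 1.7, warmup is not the explanation.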

Version of the benchmark used: https://gist.github.com/chrisseaton/8201ad900149ee821403.

It's true that fannkuch-redux is a simple, synthetic benchmark, perhaps unrepresentative of real Ruby code - but that doesn't make the fact that it's slower any better. If there are problems there, there may be problems elsewhere.

chrisseaton added this to the JRuby 9.0.0.0 milestone Apr 18, 2015

Member

headius commented Apr 20, 2015

Thanks, Chris. Given that both modes (with and without indy) are slow, I'm guessing they're bottlenecked on the same thing, and that same thing is probably something consistent like excessive allocation. We'll look into it before final.

Member

headius commented Apr 20, 2015

I notice the bench contains a lot of masgn, as in p[2], p[3] = p[3], p[2]. This was optimized in 1.7, when balanced, to just do direct assignments. In 9k we do not yet have that optimization and always construct and destructure a RubyArray. The improvement would go in IR building, most likely, and it is probably the cause of perf issues here.
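
To make the masgn point concrete, here is a sketch of what "direct assignments" means at the Ruby level for a balanced swap. It illustrates the semantics only and is not JRuby's IR or the code 1.7 actually emits; the variable names are made up.

```ruby
perm = [0, 1, 2, 3]

# Balanced multiple assignment, in the style used by the benchmark.
perm[2], perm[3] = perm[3], perm[2]

expanded = [0, 1, 2, 3]

# Hand-expanded equivalent: read every right-hand value into a temporary
# first, then assign directly, so no intermediate array is needed.
tmp_a = expanded[3]
tmp_b = expanded[2]
expanded[2] = tmp_a
expanded[3] = tmp_b

puts perm.inspect
puts expanded.inspect
```

Both arrays end up in the same state. The difference is that the expanded form never has to build and then destructure an array for the right-hand side.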

Contributor Author

chrisseaton commented Apr 20, 2015

Yeah, that does make sense. We have a kind of fork in Truffle for multiple assignment: one side does the assignment, the other constructs the actual array needed, and if you don't use the result value we only execute one side of the fork.
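
The reason the array side of that fork exists at all is that a multiple assignment is an expression whose value is the right-hand side as an array. A small sketch of the two cases follows; `swap!` is a hypothetical method used only to make the result value observable.

```ruby
# The masgn is the last expression in the method, so its value is returned.
def swap!(pair)
  pair[0], pair[1] = pair[1], pair[0]
end

pair = [1, 2]

# Result unused: only the two element assignments are observable.
pair[0], pair[1] = pair[1], pair[0]

# Result used: the caller sees the right-hand side values as an array.
result = swap!(pair)

puts result.inspect
puts pair.inspect
```

When the value is discarded, as in the statement form, an implementation is free to skip building that array.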

Contributor

subbuss commented May 5, 2015

2015-05-04T18:29:24.224-07:00: Ruby: failed to compile target script local_benches/shootout/fannkuch.jruby: failed to compile script local_benches/shootout/fannkuch.jruby
org.jruby.compiler.NotCompilableException: failed to compile script local_benches/shootout/fannkuch.jruby
        at org.jruby.ir.Compiler.execute(Compiler.java:62)
        at org.jruby.ir.Compiler.execute(Compiler.java:30)
        at org.jruby.ir.IRTranslator.execute(IRTranslator.java:42)
        at org.jruby.Ruby.tryCompile(Ruby.java:810)
        at org.jruby.Ruby.precompileCLI(Ruby.java:768)
        at org.jruby.Ruby.runNormally(Ruby.java:742)
        at org.jruby.Ruby.runFromMain(Ruby.java:575)
        at org.jruby.Main.doRunFromMain(Main.java:402)
        at org.jruby.Main.internalRun(Main.java:297)
        at org.jruby.Main.run(Main.java:226)
        at org.jruby.Main.main(Main.java:198)
Caused by: java.lang.ClassFormatError: Illegal exception table range in class file local_benches/shootout/fannkuch_dot_jruby
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
        at org.jruby.util.ClassDefiningJRubyClassLoader.defineClass(ClassDefiningJRubyClassLoader.java:56)
        at org.jruby.ir.targets.JVMVisitor.defineFromBytecode(JVMVisitor.java:87)
        at org.jruby.ir.Compiler.execute(Compiler.java:54)

In 1.7 mode, this JITs.

headius closed this in d05ce1f May 8, 2015

headius added the performance label May 8, 2015

Member

eregon commented May 11, 2015

Is this as fast as 1.7 now?

Member

enebo commented May 12, 2015

After @subbuss's three commits yesterday for #2916, this bench is faster than 1.7 with indy on by a smidgen (5-10%) and nearly 50% faster than 1.7 when indy is disabled.

Contributor Author

chrisseaton commented May 12, 2015

I'll run the full set of benchmarks again soon.
