Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bootstrap always slower than oracle JVM? #1140

Closed
rmannibucau opened this issue Apr 8, 2019 · 24 comments

Comments

5 participants
@rmannibucau
Copy link

commented Apr 8, 2019

Hi guys,

I did a small main (comparable to just calling a CLI help command, no real logic) and the execution is always slower with graalvm than with oracle. Whereas with the rc14 it was slower but close, the rc15 is really noticeably slower. Here a capture of the execution time to illustrate that statement - using time command on the exact same command line (java -cp "li/*.jar" my.Main):

Oracle 1.8u191

real	0m0,127s
user	0m0,145s
sys	0m0,028s


Graal 14

real	0m0,119s
user	0m0,184s
sys	0m0,049s


Graal 15

real	0m0,477s
user	0m0,271s
sys	0m0,027s

Since Graal rc15 uses libjvm by default I expected it to be the opposite, did I miss anything or is this something to address?

Romain

@plokhotnyuk

This comment has been minimized.

Copy link

commented Apr 8, 2019

@rmannibucau please try with -XX:-UseJVMCINativeLibrary option for RC15

@rmannibucau

This comment has been minimized.

Copy link
Author

commented Apr 8, 2019

Hi @plokhotnyuk,

thanks for the fast answer, here is what I get - does not look as fast as expected to me:

Oracle 1.8u191

real	0m0,175s
user	0m0,176s
sys	0m0,042s


Graal 14

real	0m0,228s
user	0m0,391s
sys	0m0,025s


Graal 15

real	0m0,262s
user	0m0,410s
sys	0m0,048s


Oracle 1.8u191

real	0m0,126s
user	0m0,161s
sys	0m0,019s


Graal 14

real	0m0,234s
user	0m0,377s
sys	0m0,035s


Graal 15

real	0m0,130s
user	0m0,241s
sys	0m0,020s


Oracle 1.8u191

real	0m0,137s
user	0m0,168s
sys	0m0,024s


Graal 14

real	0m0,217s
user	0m0,332s
sys	0m0,048s


Graal 15

real	0m0,218s
user	0m0,364s
sys	0m0,017s


Oracle 1.8u191

real	0m0,127s
user	0m0,149s
sys	0m0,029s


Graal 14

real	0m0,256s
user	0m0,403s
sys	0m0,038s


Graal 15

real	0m0,146s
user	0m0,248s
sys	0m0,029s


Oracle 1.8u191

real	0m0,151s
user	0m0,135s
sys	0m0,048s


Graal 14

real	0m0,249s
user	0m0,394s
sys	0m0,034s


Graal 15

real	0m0,228s
user	0m0,354s
sys	0m0,042s


Oracle 1.8u191

real	0m0,130s
user	0m0,129s
sys	0m0,056s


Graal 14

real	0m0,220s
user	0m0,352s
sys	0m0,032s


Graal 15

real	0m0,208s
user	0m0,342s
sys	0m0,026s


Oracle 1.8u191

real	0m0,121s
user	0m0,150s
sys	0m0,021s


Graal 14

real	0m0,226s
user	0m0,348s
sys	0m0,048s


Graal 15

real	0m0,125s
user	0m0,219s
sys	0m0,026s


Oracle 1.8u191

real	0m0,130s
user	0m0,165s
sys	0m0,017s


Graal 14

real	0m0,238s
user	0m0,376s
sys	0m0,029s


Graal 15

real	0m0,277s
user	0m0,423s
sys	0m0,034s


Oracle 1.8u191

real	0m0,154s
user	0m0,185s
sys	0m0,025s


Graal 14

real	0m0,313s
user	0m0,474s
sys	0m0,047s


Graal 15

real	0m0,142s
user	0m0,234s
sys	0m0,023s


Oracle 1.8u191

real	0m0,139s
user	0m0,167s
sys	0m0,029s


Graal 14

real	0m0,280s
user	0m0,425s
sys	0m0,033s


Graal 15

real	0m0,214s
user	0m0,335s
sys	0m0,043s

Here is the script i'm using:

#! /bin/bash

for i in {1..10}; do

echo "Oracle 1.8u191"
time ~/softwares/jdk1.8.0_191/bin/java -cp "dependency/*" com.github.rmannibucau.docker.compose.cli.Launcher &> /dev/null
echo
echo

echo "Graal 14"
time ~/softwares/graalvm-ce-1.0.0-rc14/bin/java -cp "dependency/*" com.github.rmannibucau.docker.compose.cli.Launcher  &> /dev/null
echo
echo

echo "Graal 15"
time ~/softwares/graalvm-ce-1.0.0-rc15/bin/java -XX:-UseJVMCINativeLibrary -cp "dependency/*" com.github.rmannibucau.docker.compose.cli.Launcher &> /dev/null
echo
echo

done
@plokhotnyuk

This comment has been minimized.

Copy link

commented Apr 8, 2019

@rmannibucau have you tried OpenJDK and GraavVM EE for a full picture?

@rmannibucau

This comment has been minimized.

Copy link
Author

commented Apr 8, 2019

@plokhotnyuk here the same run with zulu - as openjdk representant (I had it locally):

Oracle 1.8u191

real	0m0,130s
user	0m0,156s
sys	0m0,022s


Zulu 1.8u202

real	0m0,126s
user	0m0,169s
sys	0m0,021s


Graal 14

real	0m0,113s
user	0m0,206s
sys	0m0,019s


Graal 15

real	0m0,251s
user	0m0,405s
sys	0m0,037s


Oracle 1.8u191

real	0m0,130s
user	0m0,163s
sys	0m0,026s


Zulu 1.8u202

real	0m0,135s
user	0m0,191s
sys	0m0,019s


Graal 14

real	0m0,127s
user	0m0,223s
sys	0m0,024s


Graal 15

real	0m0,114s
user	0m0,197s
sys	0m0,028s

I didn't try with graalvm EE - don't think it is publicly and freely available, is it?

@plokhotnyuk

This comment has been minimized.

Copy link

commented Apr 8, 2019

@rmannibucau AFAIK it is not free for usage in production only

@rmannibucau

This comment has been minimized.

Copy link
Author

commented Apr 8, 2019

@plokhotnyuk here are the results - very close to CE:

Oracle 1.8u191

real	0m0,202s
user	0m0,218s
sys	0m0,026s


Zulu 1.8u202

real	0m0,155s
user	0m0,188s
sys	0m0,046s


Graal 14

real	0m0,140s
user	0m0,261s
sys	0m0,016s


Graal CE 15

real	0m0,292s
user	0m0,458s
sys	0m0,042s


Graal EE 15

real	0m0,165s
user	0m0,271s
sys	0m0,036s


Oracle 1.8u191

real	0m0,152s
user	0m0,185s
sys	0m0,023s


Zulu 1.8u202

real	0m0,150s
user	0m0,207s
sys	0m0,017s


Graal 14

real	0m0,178s
user	0m0,321s
sys	0m0,033s


Graal CE 15

real	0m0,159s
user	0m0,285s
sys	0m0,028s


Graal EE 15

real	0m0,155s
user	0m0,259s
sys	0m0,035s


Oracle 1.8u191

real	0m0,162s
user	0m0,200s
sys	0m0,027s


Zulu 1.8u202

real	0m0,178s
user	0m0,212s
sys	0m0,028s


Graal 14

real	0m0,274s
user	0m0,334s
sys	0m0,034s


Graal CE 15

real	0m0,158s
user	0m0,288s
sys	0m0,012s


Graal EE 15

real	0m0,216s
user	0m0,320s
sys	0m0,041s


Oracle 1.8u191

real	0m0,140s
user	0m0,153s
sys	0m0,045s


Zulu 1.8u202

real	0m0,167s
user	0m0,204s
sys	0m0,030s


Graal 14

real	0m0,141s
user	0m0,227s
sys	0m0,036s


Graal CE 15

real	0m0,270s
user	0m0,423s
sys	0m0,036s


Graal EE 15

real	0m0,337s
user	0m0,542s
sys	0m0,037s


Oracle 1.8u191

real	0m0,133s
user	0m0,179s
sys	0m0,008s


Zulu 1.8u202

real	0m0,155s
user	0m0,205s
sys	0m0,024s


Graal 14

real	0m0,343s
user	0m0,537s
sys	0m0,055s


Graal CE 15

real	0m0,276s
user	0m0,430s
sys	0m0,035s


Graal EE 15

real	0m0,172s
user	0m0,261s
sys	0m0,043s


Oracle 1.8u191

real	0m0,148s
user	0m0,196s
sys	0m0,012s


Zulu 1.8u202

real	0m0,142s
user	0m0,178s
sys	0m0,032s


Graal 14

real	0m0,134s
user	0m0,252s
sys	0m0,016s


Graal CE 15

real	0m0,253s
user	0m0,424s
sys	0m0,026s


Graal EE 15

real	0m0,307s
user	0m0,489s
sys	0m0,037s


Oracle 1.8u191

real	0m0,150s
user	0m0,192s
sys	0m0,018s


Zulu 1.8u202

real	0m0,164s
user	0m0,210s
sys	0m0,025s


Graal 14

real	0m0,347s
user	0m0,526s
sys	0m0,056s


Graal CE 15

real	0m0,153s
user	0m0,262s
sys	0m0,023s


Graal EE 15

real	0m0,329s
user	0m0,535s
sys	0m0,025s


Oracle 1.8u191

real	0m0,139s
user	0m0,173s
sys	0m0,025s


Zulu 1.8u202

real	0m0,148s
user	0m0,194s
sys	0m0,022s


Graal 14

real	0m0,264s
user	0m0,429s
sys	0m0,028s


Graal CE 15

real	0m0,131s
user	0m0,221s
sys	0m0,038s


Graal EE 15

real	0m0,334s
user	0m0,507s
sys	0m0,063s


Oracle 1.8u191

real	0m0,134s
user	0m0,170s
sys	0m0,023s


Zulu 1.8u202

real	0m0,177s
user	0m0,223s
sys	0m0,027s


Graal 14

real	0m0,427s
user	0m0,594s
sys	0m0,056s


Graal CE 15

real	0m0,323s
user	0m0,478s
sys	0m0,041s


Graal EE 15

real	0m0,153s
user	0m0,269s
sys	0m0,017s


Oracle 1.8u191

real	0m0,171s
user	0m0,224s
sys	0m0,016s


Zulu 1.8u202

real	0m0,173s
user	0m0,250s
sys	0m0,005s


Graal 14

real	0m0,139s
user	0m0,245s
sys	0m0,028s


Graal CE 15

real	0m0,138s
user	0m0,246s
sys	0m0,031s


Graal EE 15

real	0m0,320s
user	0m0,511s
sys	0m0,060s

Used command:

time ~/softwares/graalvm-ee-1.0.0-rc15/bin/java -XX:-UseJVMCINativeLibrary -cp "dependency/*" com.github.rmannibucau.docker.compose.cli.Launcher &> /dev/null
@plokhotnyuk

This comment has been minimized.

Copy link

commented Apr 8, 2019

@rmannibucau do you have a possibility to create a native image for your tool?

That can be a game changer, especially with the profile guided optimization...

@neomatrix369

This comment has been minimized.

Copy link

commented Apr 8, 2019

@rmannibucau do you have a possibility to create a native image for your tool?

That can be a game changer, especially with the profile guided optimization...

Will try this and respond asap!

@neomatrix369

This comment has been minimized.

Copy link

commented Apr 8, 2019

@plokhotnyuk here are the results - very close to CE:

Used command:

time ~/softwares/graalvm-ee-1.0.0-rc15/bin/java -XX:-UseJVMCINativeLibrary -cp "dependency/*" com.github.rmannibucau.docker.compose.cli.Launcher &> /dev/null

Great work with comparing various JDKs with GraalVM - do you have an automated way to do it? I mean scripts and the likes? Would love to try them out for my own apps.

@rmannibucau

This comment has been minimized.

Copy link
Author

commented Apr 8, 2019

@neomatrix369 I used https://github.com/rmannibucau/docosh/blob/master/graalcomparison.sh (read the small warning at the beginning, tried to not put .jar in the git repo)

@plokhotnyuk i was generating a native image, it works but it is super slow and loose portability of java which is still a nice feature. My goal with graal rc15 was to see if libjvm was a game changer which could enable to not require to native-image applications.

@plokhotnyuk

This comment has been minimized.

Copy link

commented Apr 8, 2019

@rmosaner for fast native images use --pgo-instrument and --pgo options as described on the 18th slide here or in these steps

@thomaswue

This comment has been minimized.

Copy link
Member

commented Apr 8, 2019

What is the OS/machine you are running on? Generally, the new libgraal setup should be significantly faster in startup in the first few seconds. But it could be that you are dominated by loading the binary. Can you share the program so we can test on different machines?

Generally, for CLI apps, the native image configuration is indeed the recommended one. It gives you an order of magnitude better startup/footprint.

@rmannibucau

This comment has been minimized.

Copy link
Author

commented Apr 8, 2019

My OS is an ubuntu:

Linux RMANNIBUCAU 4.18.0-16-generic #17-Ubuntu SMP Fri Feb 8 00:06:57 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

The app is available in master of previous link.

@plokhotnyuk do you think pgo was not used in libjvm generation and could optimize it more? Goal is really to reduce the need to go native making the jvm itself pre-optimized. Can also imply to hack the default classloaders to use pre-indexed classes or so maybe?

@plokhotnyuk

This comment has been minimized.

Copy link

commented Apr 8, 2019

@rmannibucau pgo wasn't used for this RC

BTW, have you tried to profile your app with async-profiler? It can build flame graphs with Java/Native/Hotspot/GC stacktraces like here.

@thomaswue

This comment has been minimized.

Copy link
Member

commented Apr 9, 2019

We have identified the issue. It is related to "shutdown hooks" delaying an exit of the VM process when a compilation is currently running. We are working on a fix that should be ready for the next release candidate. Thanks for the report!

@plokhotnyuk

This comment has been minimized.

Copy link

commented Apr 9, 2019

@rmannibucau yet another option to speed up the native image is using CPU features option, like:
-H:CPUFeatures=SSE4_1,POPCNT,AVX,AVX2,AES,CLMUL,MMX,SSE3,SSE4_2,TSC,CMOV

AFAIK There are different lists for different platforms. For AMD64 they are listed here

@thomaswue BTW, which options were used for building libgraal for RC15?

@thomaswue

This comment has been minimized.

Copy link
Member

commented Apr 23, 2019

Fix shipped in RC16.

@thomaswue thomaswue closed this Apr 23, 2019

@rmannibucau

This comment has been minimized.

Copy link
Author

commented Apr 24, 2019

Hi @thomaswue ,

rc16 is significatively better but still slower than oraclejdk or openjdk:

Oracle 1.8u191

real	0m0,141s
user	0m0,154s
sys	0m0,038s


Zulu 1.8u202

real	0m0,116s
user	0m0,163s
sys	0m0,013s


Graal 14

real	0m0,198s
user	0m0,314s
sys	0m0,034s


Graal CE 15

real	0m0,200s
user	0m0,335s
sys	0m0,027s


Graal EE 15

real	0m0,117s
user	0m0,185s
sys	0m0,041s


Graal CE 16

real	0m0,130s
user	0m0,248s
sys	0m0,021s
@thomaswue

This comment has been minimized.

Copy link
Member

commented Apr 24, 2019

The measurements you show have a high level of noise - i.e., running them multiple times will give you very different results. Can you do these measurements 100 times for each config and create some average/medium metric (potentially removing outliers)?

@rmannibucau

This comment has been minimized.

Copy link
Author

commented Apr 24, 2019

@thomaswue I did and pasted an iteration representing well the executions. real and sys times are always close - excluding singular points - but the user time is also always ~+80ms.

@thomaswue

This comment has been minimized.

Copy link
Member

commented Apr 24, 2019

It seems like your metrics indicate that we are at least same level for "real" time:
Oracle 1.8u191 real 0m0,141s
Graal CE 16 real 0m0,130s

The "user" time could depend on how many compiler threads are started at which point in time. It is the less important metric. I will check if I can confirm higher user times on my machine.

@rmannibucau

This comment has been minimized.

Copy link
Author

commented Apr 24, 2019

@thomaswue yes, rc16 is definitively better than rc15. It is still a bit disappointing to not benefit from libjvm in terms of startup times but at least not being worth would be ok.

@thomaswue

This comment has been minimized.

Copy link
Member

commented Apr 24, 2019

For substantially improved start-up time for your user application, we recommend creating a native image. Libgraal is only creating such AOT-compiled image for the compiler itself. This can improve start-up if your application benefits from faster compilation times; and it substantially improves start-up versus vs our previous versions where the compiler had to dynamically warm up in parallel to the application.

@rmannibucau

This comment has been minimized.

Copy link
Author

commented Apr 24, 2019

@thomaswue as mentionned earlier going native is not an option for the app and has a lot of pitfalls currently (not being portable anymore and requiring to cross compile vs just relying on prebuilt JVM, not supporting classpath extensibility, not supporting several libraries which rely on classpath etc...). So my goal is really to get the best of both world having an optimized boot for the JVM classes and a "normal" mode for user code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.