
Node.js vs Graal.js Performance #74

Open
weixingsun opened this issue Nov 15, 2018 · 28 comments
Assignees: wirthi
Labels: Node.js (Relevant for Graal.js' Node.js recast), performance (Performance of the engine (peak or warmup))

Comments

@weixingsun

Dude,

I came across GraalVM, had a glance at the JVM options part, and thought it was promising. But I found that the performance is much lower than the latest Node.js. Here is the result:
https://github.com/weixingsun/perf_tuning_results/blob/master/Node.js%20vs.%20GraalVM

Any idea about the difference?

@wirthi wirthi self-assigned this Nov 15, 2018
@wirthi
Member

wirthi commented Nov 15, 2018

Hi @weixingsun

thanks for your question. I am trying to understand what your benchmark script (test_graal.sh) is doing. It obviously does something, and terminates after ~130 seconds on my machine, but CPU utilization is <1% most of the time, so that does not look like a reasonable benchmark to me.

I can execute the application itself (node application.js) and benchmark it with a tool like wrk, that really stresses the fib calculation. With that I get the following numbers:

  • GraalVM: 0.93 requests/sec
  • Node.js (10.9.0): 0.80 requests/sec

On that benchmark, GraalVM even outperforms Node. But note that your fib calculation blocks the event loop, so you can only do one calculation, and thus serve only one request, at a time (exactly what you usually want to avoid when using Node.js); all requests are serialized and calculated one after the other. So you are measuring hardly any Node.js/express code: this benchmark almost exclusively measures core JavaScript via the Fibonacci calculation. In a 30-second benchmark, only 28 iterations pass through Node.js/express; the time is spent in the Fibonacci function, which is fine if you want to measure pure JavaScript core performance.

I am using wrk -t5 -c10 -d30s http://localhost:8080/fib to measure (that's my typical Node.js benchmark setting; using 5 threads and 10 connections is actually overkill on this serialized benchmark, as stated above).

Can you please help me understand what you are trying to measure with the test_graal.sh script? Maybe I am missing something.

Best,
Christian

@weixingsun
Author

@wirthi thanks for your reply. I just want to saturate a certain core in my server.
The main workload is the two simple GET methods, fib and fast, run as an iteration in parallel.
With this method I can easily see how long 100 continuous iterations take.

I can see they occupied 100% of the user cycles, which means I created a bottleneck on cpu3:
[root@dr1 cpu_bond]# mpstat -P 3 3 3
Linux 3.10.0-862.11.6.el7.x86_64 (dr1) 11/15/2018 x86_64 (112 CPU)

07:27:43 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
07:27:46 PM 3 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
07:27:49 PM 3 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
07:27:52 PM 3 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Average: 3 100.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00

@weixingsun
Author

Oops, test_graal.sh creates a bottleneck on cpu2; the log above is for test_v8.sh.

@woess
Member

woess commented Nov 15, 2018

We have two execution modes, "native" (the default) and "JVM" (see https://www.graalvm.org/docs/reference-manual/languages/js/ for more information). Setting JVM options switches to JVM mode.
Currently, Fibonacci is significantly faster in native mode; try running without the JVM options.

@weixingsun
Author

weixingsun commented Nov 15, 2018

@woess Thanks for explaining the modes, but I got 186.197s after removing all the JVM options. Which VM is underneath, Nashorn or GraalVM?

perf record gave me the following stack traces:
Samples: 1K of event 'cycles:ppp', Event count (approx.): 53465728779
Overhead Command Shared Object Symbol
2.00% node perf-29614.map [.] 0x00007fd3e71580cb
1.55% node libpolyglot.so [.] com.oracle.truffle.js.nodes.function.FunctionBodyNode.execute(com.oracle.truffle.api.frame.VirtualFrame)java.lang.Object
1.42% node libpolyglot.so [.] com.oracle.svm.core.genscavenge.GCImpl.blackenBootImageRoots()void
1.21% node perf-29614.map [.] 0x00007fd3e7158242
1.20% node perf-29614.map [.] 0x00007fd3e71588ec
1.14% node perf-29614.map [.] 0x00007fd3e7158000
0.99% node perf-29614.map [.] 0x00007fd3e71583f0
0.93% node perf-29614.map [.] 0x00007fd3e715859d
0.92% node perf-29614.map [.] 0x00007fd3e7158007
0.82% pilerThread-156 libpolyglot.so [.] org.graalvm.collections.EconomicMapImpl.grow()void
......
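As an aside on the "Nashorn or GraalVM?" question: which engine is underneath can be probed from JavaScript itself. The check below relies on the Graal global object that Graal.js installs; that detail is an assumption about Graal.js internals, while stock Node.js on V8 defines no such global:

```javascript
// Probe the JS engine: Graal.js exposes a global `Graal` object,
// plain Node.js (V8) does not. Nashorn is not involved in
// GraalVM's node at all.
const engine = (typeof Graal !== 'undefined') ? 'Graal.js' : 'V8 (or other)';
console.log('JS engine:', engine, '| node', process.version);
```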

@EdwardDrapkin

I was curious as well, so I figured I could provide a real-life benchmark: running a webpack build. This was entirely unscientific and the tests were only run once.

The results were surprising. Here are the relevant files: https://gist.github.com/EdwardDrapkin/d1b380787821462c5677323614f20146

The results wound up:

Node 11:

real	0m3.361s
user	0m4.747s
sys	0m0.396s

Graal native:

real	1m18.097s
user	2m51.988s
sys	0m13.533s

Graal JVM:

real	1m5.169s
user	5m21.155s
sys	0m4.549s

Graal JVM with --jvm.XX:+UseG1GC:

real	1m13.938s
user	6m37.463s
sys	0m4.333s

@wirthi
Member

wirthi commented Feb 24, 2019

Hi @EdwardDrapkin

thanks for sharing your benchmark. I am no expert on webpack; I guess modules/pp3/ is the actual thing you pack? You didn't provide that in your gist.

Note that, unlike the peak-performance benchmark weixingsun posted above, yours is heavy on startup: it's a tool executed once. If original Node finishes it in 3 seconds, Graal-Node.js will have a hard time keeping up. Graal-Node.js requires more time to JIT-compile the source code it gets, which makes it slower on workloads like npm, webpack, or anything similar that runs only for a short time, and only once. However, a factor of >20, as you experience it, is more than we usually see.

If I could reproduce your run fully, I'd love to look into it and see if there is anything we can optimize for.

Best,
Christian

@EdwardDrapkin

EdwardDrapkin commented Feb 25, 2019

I can't provide the actual source code we use at work, but it's a fairly straightforward React project; you'd get similar results if you copied any React project in there. I will note that I switched the TS language service in IntelliJ to use GraalVM instead of Node.js. It is exceptionally painful for a good long while, but after about an hour it feels faster. AFAIK there's no way to benchmark proprietary IntelliJ plugins, though.

@wirthi wirthi added the Node.js and performance labels Mar 8, 2019
@i-void

i-void commented Mar 21, 2019

Create a simple Nuxt.js project, selecting yarn as the default package manager, and run yarn run dev; you simply don't need any benchmark results. Graal is 3 minutes or more slower for a simple build. For complex projects with over 350 modules, the difference goes up to 10-15 minutes just for the build. That is not an acceptable range for use. It also produces errors and cannot start.

@re-thc

re-thc commented Jul 20, 2019

Is startup performance not going to be considered? Having to run both graaljs and nodejs in parallel is going to be confusing. I thought the point of Graal was to have one tool that does it all, with interop.

@wirthi
Member

wirthi commented Jul 24, 2019

Hi @hc-codersatlas

we are currently working on significant startup improvements by AOT-compiling larger parts of the Node.js codebase. This is a significant engineering effort though, so it takes a while.

Best,
Christian

@4ntoine

4ntoine commented Aug 15, 2019

Hey, i'm also interested in it.

I've just measured node.js vs. graalvm's node performance, and the latter is 10-20x slower.
Any possible reason, or optimizations turned off? I think I will be able to provide the sources for benchmarking, or do a proper benchmark (for now I just replaced calls to node with graalvm's node, without any additional arguments).

graalvm-ee-19.1.1
node.js v8.9.0

@thomaswue
Member

Is this for startup or peak performance? Can you share the workload as suggested?

@4ntoine

4ntoine commented Aug 16, 2019

Hi, Thomas. Thanks for the reply.

It's the rough execution time in millis of exactly the same code on node and graalvm's node (startup time excluded from the measurement). It includes processing of stdin, parsing (mostly string + regexp operations), object instantiation, and calling object methods with some business logic.

I think I will be able to provide the code; I will double-check it.

@4ntoine

4ntoine commented Aug 16, 2019

graalvm-bechmark.zip

Just run run.sh with node on PATH:

./run.sh

or graalvm node on PATH, eg:

PATH=/Users/asmirnov/Documents/dev/src/graalvm-ee-19.1.1/Contents/Home/bin:$PATH ./run.sh

It will clone the required JS code, prepare the data, and run the benchmark; see the actual execution time.
Let me know if you need any assistance or find the root cause.

@wirthi
Member

wirthi commented Aug 23, 2019

Hi @4ntoine

thanks for your code. I confirm we can execute it and measure performance.

Your benchmark does not consider warm-up. To mitigate that, you can put a loop around the core of your benchmark (lines 153ff in benchmark.js) and measure each iteration independently; however, that might not give exact results due to caching in the code. We are working on some micro-benchmarks to measure the performance better. But it seems we are within 2.5x of original Node if you account for the warmup.
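The loop-around-the-core idea can be sketched generically. The measure helper and the stand-in workload below are illustrative, not taken from benchmark.js:

```javascript
// Generic warmup harness: run the benchmark body repeatedly and time
// each iteration separately, so the first (interpreted or
// still-compiling) iterations can be distinguished from warmed-up,
// JIT-compiled ones.
function measure(body, iterations) {
  const times = [];
  for (let i = 0; i < iterations; i++) {
    const start = Date.now();
    body();
    times.push(Date.now() - start);
  }
  return times;
}

// Stand-in workload; replace with the real benchmark core.
const times = measure(() => {
  let s = 0;
  for (let i = 0; i < 1e6; i++) s += i;
}, 5);
times.forEach((t, i) => console.log(`${i + 1} = ${t} ms`));
```

On a JIT-compiling runtime, the first entries of the resulting array are typically much larger than the later, warmed-up ones.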

Also, note that running in JVM mode (node --jvm benchmark.js) gives a better peak performance than in native mode.

We'll get back to you once we know more. Also, improving our warmup performance is high up our list, so that should get better over the next releases.

Best,
Christian

@4ntoine

4ntoine commented Aug 27, 2019

Hey.

Thanks for the update.

caching in the code

Yup, there is some caching, and I can modify it to avoid the side effect of caching for better benchmarking.

But it seems we are within 2.5X of origin Node if you account for the warmup.

Does that mean you are targeting 2.5x worse performance compared to Node?

@thomaswue
Member

thomaswue commented Aug 27, 2019

Our target is to be at least comparable in speed, or better, for any workload. This is a long-term target, however, and we aren't there yet for Node.js applications.

@In-line

In-line commented Nov 11, 2019

Running graal/bin/node yarn start in a React project is significantly slower than with stock Node.js.

It takes around 15 minutes to start compiling, and I didn't wait beyond that.

Stock node does that in around 2 minutes.

@frank-dspeed
Contributor

I think it should be documented that node-graalvm is not as optimized at present as it could be: the startup time is higher and performance is slower for non-long-running processes. That way nobody is shocked.

@thomaswue
Member

Agreed that we should put the information about startup into the documentation. On peak performance it is not so clear, as there are also workloads where we are faster.

@Ivan-Kouznetsov

Ivan-Kouznetsov commented Aug 18, 2020

I note that there are cases where GraalVM Node.js performs slower than Node.js after many iterations of the same task, which does not appear to be caused by startup time. I created a repo that illustrates that GraalVM performs slower than Node.js at:

  1. Regex (1000 iterations of regex-redux task)
  2. JSONPath queries (1000 iterations using 2 different libraries)
  3. HTTP GET requests (10,000 iterations using lightweight library)

I hope you will find it useful: https://github.com/Ivan-Kouznetsov/graalvm-perf
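For reference, this is roughly the shape of such a micro-benchmark; the pattern, input, and iteration counts here are illustrative, not taken from the linked repo:

```javascript
// Minimal regex micro-benchmark in the style described above: run a
// small regex task many times and time the whole batch.
function regexTask(input) {
  // Collapse each run of 'a's into a single 'A', then count them.
  const collapsed = input.replace(/a+/g, 'A');
  const matches = collapsed.match(/A/g);
  return matches ? matches.length : 0;
}

const input = 'baaab'.repeat(100); // 100 separate runs of 'aaa'
const start = Date.now();
let total = 0;
for (let i = 0; i < 1000; i++) {
  total += regexTask(input);
}
console.log(total, (Date.now() - start) + ' ms'); // total: 100000
```

Note that with only 1000 iterations, a JIT-compiling runtime may spend most of the measured time still warming up, which is the point debated below.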

@frank-dspeed
Contributor

@Ivan-Kouznetsov you need to account for that: Node.js will usually be faster in those cases. But when you replace the regex with Java's regex, the JSON parser and query element with the Java ones, and the HTTP GET method with Java's, you outperform Node.js by far.

@4ntoine

4ntoine commented Oct 3, 2020

@thomaswue

Our target is to be at least comparable speed or better for any workload. This is a longterm target however and we aren't there yet for Node.js applications.

Are we there at the moment? Any benchmarks/comparisons available? Thanks

@frank-dspeed
Contributor

@4ntoine the state is still the same: everything that uses Node.js modules from node-graaljs is slower.

If you use only Java or JavaScript, it is faster.

@wirthi
Member

wirthi commented Nov 19, 2020

Hi @Ivan-Kouznetsov

thanks for your benchmarks, they provide relevant insight! And they illustrate the fundamental misconception, which is:

after many iterations of the same task

1000 iterations of something is not "many" in the JIT world. As per your documentation, (original) Node.js needs 0.120s for the full jsonpath-classic-benchmark.js. GraalVM comes from the Java world, where it takes a few hundred milliseconds just to start the JVM, let alone execute the benchmark. Thanks to native-image, we can be faster on GraalVM, but the same basic principle still applies: we need to JIT-compile the code, and that won't fully happen within 120 milliseconds.

I've hacked some proper warmup into your benchmark, like this (e.g. for jsonpath-classic-benchmark.js):

const jsonPath = require('./lib/jsonPath');
const n = process.argv[2] || 10000;

// One full run of the original benchmark: n iterations of the three
// JSONPath queries.
function test() {
  const sampleObj = {name: "john", job: {title: "developer", payscale: 3}};
  let len = 0;
  for (let i = 0; i < n; i++) {
    len += jsonPath(sampleObj, "$..name").toString().length;
    len += jsonPath(sampleObj, "$..payscale").toString().length;
    len += jsonPath(sampleObj, "$..age").toString().length;
  }
  return len;
}

// Repeat the full benchmark forever, timing each run, so JIT warmup
// becomes visible in the per-iteration times.
let i = 0;
while (true) {
  const start = Date.now();
  console.log(test());
  console.log(++i + " = " + (Date.now() - start) + " ms");
}

Basically, I am executing your full benchmark repeatedly and printing out how long each iteration takes:

GraalVM EE 20.3.0

$ node jsonpath-classic-benchmark.js 
100000
1 = 2485 ms
100000
2 = 2267 ms
100000
3 = 427 ms
100000
4 = 209 ms
100000
5 = 185 ms
100000
6 = 148 ms
100000
7 = 177 ms
100000
8 = 143 ms
100000
9 = 146 ms

compared to Node.js 12.18.0

$ ~/software/node-v12.18.0-linux-x64/bin/node jsonpath-classic-benchmark.js 
100000
1 = 253 ms
100000
2 = 221 ms
100000
3 = 217 ms
100000
4 = 197 ms
100000
5 = 199 ms
100000
6 = 208 ms
100000
7 = 207 ms
100000
8 = 197 ms
100000
9 = 213 ms

Admittedly, GraalVM's first 2 iterations are horrible. Iterations 3 and 4 are in the ballpark of V8. Starting with iteration 5, GraalVM is actually significantly (around 25%) faster than V8.

There's one more trick up our sleeve. In --jvm mode, the first iterations are even slower, and it takes longer to reach a good score. But after ~20 iterations, we are down to around 60 ms per iteration, meaning GraalVM in JVM mode takes 0.3x the time of V8 per iteration.

On jsonpath-new-benchmark.js, GraalVM and V8 are roughly on par.

On regexp-benchmark.js, our engine is around 3-4x behind. I will complain to our RegExp guy about optimizing this pattern :-)

Best,
Christian

@frank-dspeed
Contributor

@wirthi #360 (comment) maybe makes this obsolete, as these performance degradations are now much less of a problem than before. Even npm no longer freezes; the string update is a huge one, combined with the new default boot mode.

@Osiris-Team

It would be cool if the JVM cached the generated binary for each class, so that warmup would only happen once and not on every program restart.
