
GraalJS memory leak in Engine #121

Open
mdsina opened this issue Feb 26, 2019 · 39 comments
Labels: bug (Something isn't working), memory (Memory footprint, memory leaks)

mdsina commented Feb 26, 2019

Hello.
I'm building a FaaS-like service for JS scripts with a limited API, and while prototyping different approaches with Graal Polyglot I found memory leak issues.

Approaches that I've tried to implement:

  1. A pool of Contexts with a single Engine. The Source is evaluated once per Context. Each piece of JS code is wrapped in a unique namespace that is accessible as a member of the Context and executed.
     Any subsequent invocation just executes the existing member and nothing more.
  2. A single Engine with cached Sources. On each new invocation we fetch the existing Source and evaluate it in a new Context created for that invocation.
     After that we get the function member from the Context and execute it (as in the previous approach).
  3. A new Engine and a new Context for every invocation; Sources are built once.

Only the third approach has no memory leak, because the GC can reclaim everything from the heap.
The first and second approaches leak memory. It looks like the Engine has issues when executing some code: it generates a lot of data on each invocation that is never removed from the Engine.

I have a prototype that reproduces the first approach:
https://github.com/mdsina/graaljs-executor-web-service

Also I wrote a load tests that simulate real-world invocations of JS code:
https://github.com/mdsina/graaljs-executor-web-service-load-tests/tree/master
Service was executed with VM options:

-Xmx2g -XX:+UseG1GC


It looks like references to ContextJS are never removed.

@wirthi wirthi self-assigned this Feb 26, 2019

wirthi commented Feb 26, 2019

Hi @mdsina,

thanks for your question.

I've not looked at your code yet, but the generic best practice when you want code to be cached is:

  • Have one Engine that you share among all your contexts.
  • Create a fresh context for every (independent) execution, AND .close() it!
  • For (only) those sources you want to share, enable caching (public Source.Builder cached(boolean cached)). Note that you need to reuse the identical Source object for sharing to work; otherwise this can itself lead to a memory leak.

http://www.graalvm.org/docs/graalvm-as-a-platform/embed/#enable-source-caching
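Put together, these best practices look roughly like the following sketch (the JS snippet, file name, and class name are illustrative, not taken from the project in question):

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;
import org.graalvm.polyglot.Source;

public class SharedEngineExample {
    public static void main(String[] args) throws Exception {
        // One Engine, shared among all contexts.
        try (Engine engine = Engine.create()) {
            // Build the Source once, mark it as cached, and reuse the
            // identical Source object for every evaluation.
            Source source = Source.newBuilder("js", "6 * 7", "snippet.js")
                    .cached(true)
                    .build();
            for (int i = 0; i < 3; i++) {
                // Fresh context per (independent) execution, closed via try-with-resources.
                try (Context ctx = Context.newBuilder("js").engine(engine).build()) {
                    System.out.println(ctx.eval(source).asInt());
                }
            }
        }
    }
}
```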

As you don't see any memory leak in your third option, I don't think this is a memory leak in our core JS engine code. Before further investigating, please make sure you properly share the engine, close the contexts, and only put the caching flag on sources that you actually want to be cached.

Thanks,
Christian


mdsina commented Feb 26, 2019

Thanks for your answer, @wirthi
Of course I do what the documentation says in the second approach:

  1. One Engine
  2. Contexts are closed after each invocation (via a try-with-resources statement)
  3. All shared Sources are built with the cached option

One extra thing I do: when I build a Context, I put some members into the JS bindings, like require or other members that are present as singleton Java objects (my example has that).

That the last approach has no memory leak problems is obvious: the Engine evidently keeps references to some Context objects (or something else) inside itself, so if the Engine is destroyed after every invocation there is no problem.
For the second approach I took the example from the documentation: https://www.graalvm.org/docs/graalvm-as-a-platform/embed/#enable-source-caching
But I want to warm up the contexts and keep them, which is why I implemented the first approach.

@wirthi wirthi added bug Something isn't working memory Memory footprint, memory leaks labels Mar 8, 2019

wirthi commented Mar 28, 2019

Hi @mdsina

We have fixed a memory leak problem around Objects with lots of (different) properties. This fix has landed in RC14 and might fix the problem you reported.

Best,
Christian


wirthi commented May 20, 2019

Hi @mdsina

Can you confirm whether this problem is solved for you by a newer release (RC14 or later, or GraalVM 19)?

Thanks,
Christian

@blazarus

I hit what I think was the same problem on rc12 and tried upgrading to rc16 and still saw the memory leak. I haven't tried on v19 yet.


mdsina commented May 21, 2019

@wirthi
The problem is still present when warming up the Graal context with caching.


mdsina commented Jan 5, 2020

Checked this on GraalVM 19.3.0. Same result.
Also, I forgot to attach a heap dump initially. Here it is: https://drive.google.com/file/d/1pGQBM8pEFar1bXK7-3SIbOVXiCMCKgV-/view?usp=sharing

@dariol83

I believe we are both hitting #268

I am wondering if there is a release plan for GraalVM and whether this issue is going to be looked at. I can provide some help, if needed.


wirthi commented Sep 1, 2020

Hi,

what would be super-helpful for us is an example that easily reproduces the problem, ideally just one Java file without any other dependency. From what I read in your descriptions (especially in #268), the problem stems from DynamicObjects/Shapes, i.e. from objects being created on the JavaScript side. So, in theory, the problem should be reproducible by something as simple as (pseudocode):

while (true) {
    try (Context ctx = Context.create("js")) {
        ctx.eval("js", "/* create the JavaScript object here that you think should be GCed once the context is closed */");
    }
}

Best,
Christian

dariol83 commented Sep 1, 2020

Hi Christian,
I will try to come up with a simple project with a single JUnit test and try to reproduce the problem there. I will put it on Github and share the link if I succeed :)

Thanks for the help,
Dario

dariol83 commented Sep 6, 2020

Hi again,

I did my best to reproduce the leak I see in my application (#268), but without success. What you see in the attached test application is basically exactly what I do in my software, but with this test application there seems to be no leak.

The only difference is that in my application the returned object is kept in memory for some time, while in the test application the returned object is not really used. But I can't believe the problem comes from that, since these objects are typically primitive values. Also, in my original application, if I do not cache the Engine object and instead put it in a try-with-resources, I see no leak (but performance is bad), while if I cache it, I start seeing the leak... which I do not see in this test application!

In an attempt to work around the leak, I even cached the Engine object behind a SoftReference, so that under high memory pressure the GC could still reclaim it. If I do that, I still see the leak, even though I can see via VisualVM that the memory (and the number of related Shape objects) actually goes down a few times when the GC kicks in. I suspect that, since nobody actually closes the Engine, something still stays cached somewhere in the GraalVM JS implementation...

So I am actually clueless... maybe the problem is somewhere else, but I find that hard to believe when changing literally two lines of code in a single class (caching the Engine and the Source, or not) makes the problem appear or disappear.

It is sad, because GraalVM JS is the script engine with the best performance I could find out there...

TestGraalVmJsLeak.zip

horschi commented Sep 6, 2020

Hi,

I also seem to have a GraalVM/JS-related memory leak, but with different objects:

Here are the first entries of my jmap histogram:

----------------------------------------------
   1:       5380455      473480040  java.lang.reflect.Method
   2:       6100005      417736320  [C
   3:       5686429      232862248  [Ljava.lang.Object;
   4:       9428532      224668864  [Ljava.lang.Class;
   5:       4617418      184696720  java.util.LinkedHashMap$Entry
   6:       5370855      171867360  com.oracle.truffle.polyglot.HostMethodDesc$SingleMethod$MethodMHImpl
   7:       6007220      144173280  java.lang.String
   8:       5815221      139565304  sun.reflect.generics.tree.SimpleClassTypeSignature
   9:       5539169      132940056  java.util.ArrayList
  10:       5816264      106812120  [Lsun.reflect.generics.tree.TypeArgument;
  11:       3627752      101215200  [Ljava.lang.reflect.Type;
  12:       5495914       87934624  sun.reflect.generics.tree.ClassTypeSignature
  13:       1432177       57287080  sun.reflect.generics.repository.MethodRepository
  14:       1438703       46038496  sun.reflect.generics.tree.MethodTypeSignature
  15:       1438703       41626424  [Lsun.reflect.generics.tree.TypeSignature;
  16:       2153832       40182336  [Lsun.reflect.generics.tree.FieldTypeSignature;
  17:        114918       36246592  [Ljava.util.HashMap$Node;
  18:        877581       35103240  java.lang.invoke.BoundMethodHandle$Species_LL
  19:       1439608       34550592  sun.reflect.generics.factory.CoreReflectionFactory
  20:       1432184       34372416  sun.reflect.generics.scope.MethodScope
  21:       1439262       23060408  [Lsun.reflect.generics.tree.FormalTypeParameter;
  22:        377851       21159656  java.util.LinkedHashMap
  23:        717713       17225112  sun.reflect.generics.reflectiveObjects.ParameterizedTypeImpl
  24:        638514       16625136  [Lcom.oracle.truffle.polyglot.HostMethodDesc$SingleMethod;
  25:        282888       15841728  java.lang.invoke.MemberName
  26:         56240       12942152  [B
  27:        638514       10216224  com.oracle.truffle.polyglot.HostMethodDesc$OverloadedMethod
  28:        282182        9029824  java.lang.invoke.DirectMethodHandle
  29:        354505        8508120  sun.reflect.generics.tree.Wildcard
  30:        265330        8490560  sun.reflect.generics.reflectiveObjects.WildcardTypeImpl
  31:        221932        7101824  java.lang.invoke.BoundMethodHandle$Species_L
  32:        163952        6558080  java.lang.invoke.BoundMethodHandle$Species_L3
  33:         23420        5268936  [I
  34:        278113        4449808  sun.reflect.generics.tree.TypeVariableSignature
  35:        110947        4437880  java.util.WeakHashMap$Entry
  36:         55318        4425440  java.lang.reflect.Constructor
  37:        130141        4164512  java.lang.ClassValue$Entry
  38:         64822        3630032  com.oracle.js.parser.ir.IdentNode
  39:         89169        3566760  com.oracle.truffle.polyglot.HostClassDesc$Members
  40:        109761        3512352  com.oracle.truffle.polyglot.HostClassDesc
  41:         58849        2824752  java.lang.invoke.BoundMethodHandle$Species_L5
  42:         23138        2568224  java.lang.Class
  43:         25978        2493888  com.oracle.truffle.object.ShapeBasic
  44:         68815        2202080  java.util.HashMap$Node

I made sure to call close() on my Contexts (I even added a finalizer that checks whether it was called), but the JVM still goes OOM after some time.

I will try to gather more information. My idea is to check for ThreadLocals using reflection; my assumption is that something is still held in ThreadLocals. Does anyone have more/better ideas on how to approach this?

regards,
Christian

dariol83 commented Sep 6, 2020

Hi Christian,

The only thing I can say with reasonable confidence is that, if you close the Engine object, the memory is properly deallocated. Using a SoftReference to an ad-hoc class holding the reference to the Engine, and adding a finalizer that closes the Engine when the holder class is collected, helps keep the memory under control.

As I wrote, all my attempts to reproduce the behaviour in a unit test have failed... :(

Best regards,
Dario

horschi commented Sep 6, 2020

Hi,

@dariol83: Perhaps it's some kind of code cache in the engine that is growing? I tried to have changing code in my test, but also failed to reproduce the behaviour. May I ask what the objects on your heap are (e.g. as shown by jmap -histo)?

In my application, even though I do

context.close(true);
context.getEngine().close(true);

I still get a growing heap and a memory histogram like the one posted above.

regards,
Christian

dariol83 commented Sep 6, 2020

Hi Christian,

You can have a look here: #268 (comment)

Best regards,
Dario

horschi commented Sep 6, 2020

One thing I figured out:

For every transaction in my app (for which I create a Context/Engine and close it again) I get exactly one entry of com.oracle.truffle.polyglot.HostClassCache and org.graalvm.polyglot.HostAccess, which are never freed.

Edit: It turns out this is what happens when you create a new HostAccess instance for every Engine. Using a static final HostAccess fixes the leak. Strangely, this was not an issue in my test program; I was doing the same thing there, but had no leak.

@dariol83

Hi Christian,

Could you please post a snippet of code documenting exactly how you would use a cached HostAccess? So that I can test it here as well and confirm or not the finding.

Thanks a lot in advance!

horschi commented Sep 10, 2020

Hi @dariol83 ,

I simply hold the host-access as a static final variable:

private static final HostAccess HOSTACCESS = HostAccess.newBuilder().allowArrayAccess(true).allowPublicAccess(true).build();
...
context = Context.newBuilder(lang).allowHostAccess(HOSTACCESS).build();

Before I had the following, which was leading to a memory leak:

context = Context.newBuilder(lang).allowHostAccess(HostAccess.newBuilder().allowArrayAccess(true).allowPublicAccess(true).build()).build(); // this is bad

regards,
Ch
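One plausible mechanism behind this kind of leak can be modelled in plain Java. This is only an illustrative model (ConfigCacheDemo and all its names are made up, not GraalVM internals): a library-side static cache keyed by config-object identity stays at one entry when the config is reused, but grows by one entry per call when a fresh config is created each time.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative model only: a library keeps a static cache keyed by the
 * identity of the config object it is handed, so every fresh config
 * instance adds an entry that is never evicted.
 */
class ConfigCacheDemo {
    static final Map<Object, Object> LIBRARY_CACHE = new ConcurrentHashMap<>();

    static void useConfig(Object config) {
        // Derived data is cached per config instance and never removed.
        LIBRARY_CACHE.computeIfAbsent(config, c -> new Object());
    }

    public static void main(String[] args) {
        Object shared = new Object();
        for (int i = 0; i < 1000; i++) {
            useConfig(shared);          // reused instance: cache stays at 1 entry
        }
        for (int i = 0; i < 1000; i++) {
            useConfig(new Object());    // fresh instance each call: +1000 entries
        }
        System.out.println(LIBRARY_CACHE.size()); // 1001
    }
}
```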

@dariol83

Hi Christian,

I see, thank you. In my code I am using a straightforward

Context context = Context.newBuilder()
        .engine(jsEngine)
        .allowAllAccess(true)
        .build();

and I was getting the leak. I will try to use a custom cached HostAccess and see if something changes. I will report here.

Best regards,
Dario

@dariol83

I applied the change, but the leak in the ShapeBasic objects and the related other objects is still there.

horschi commented Oct 29, 2020

@dariol83: I've been keeping an eye on this for a while now. In production I seem to have the same issue (also lots of ShapeBasic objects), but only there; I was not able to reproduce it locally (even though I tried with the same data & requests). Perhaps it only happens when multiple threads are involved (which I have not tested so far)? Have you tested such scenarios, or perhaps others?


wirthi commented Nov 26, 2020

Hi @dariol83

thanks for your reproducer application (which, unfortunately, does not reproduce the problem). I had a quick look at it and can confirm I don't see any immediate problem there. But let me try a few remarks:

  • You are not explicitly closing the context. That should not be an issue: contexts are auto-closeable and you have them in a try-with-resources, so it should be fine.
  • You say you sometimes return (and store) an object. What kind of object is that, a Java object or a JavaScript object? The latter does retain a link to the context, so this might be a problem. (You can in any case not share them between threads, at least not execute them in other contexts.) I assume this is not the case and you are talking about Java objects?
  • You share the Engine in order to profit from code caching (better performance), I assume? If not, you could try not sharing the Engine and see if the problem persists.
  • You could also try, just for the purpose of hunting this bug, to close your Engine and create a new one every X iterations. That might degrade performance around that time, but if the problem remains the same, it must be somewhere else.
  • Sources: you seem to create 16 Sources overall (4 ExpressionDefinitions per thread, times 4 threads). In this example it would suffice to create only 4 (and share them). Again, this should not make a difference, just noting it. It might matter with more threads and more sources.

My best guess is around JavaScript objects and Shapes being created. Thus, to get your example into a form where it exposes the problem, can you maybe try adding the object-returning you describe, and see if that changes anything? In the current example code, you are not creating any objects dynamically.

Best,
Christian

@dariol83

Hi @wirthi

My apologies for not progressing on this investigation; I had personal issues recently, but I will start a new analysis from your suggestions:

  • "you say you return (and store) an object sometimes" -> Yes, these are typically numbers, booleans or Strings. Checking with the debugger I can see that the returned classes are java.lang.Integer, java.lang.Long and so on, so no JavaScript objects.
  • "you share the Engine in order to profit from code caching (better performance) I assume? If not, you could try not to share the Engine, and see if the problem persists" -> I have one Engine per "processing object", and the Engine is used to evaluate the same expression every time, though with different values in the bindings. Threads in a thread pool access the various objects, but each object can be accessed by only a single thread at a given point in time.
  • "you could also try, just for the purpose of hunting this bug, to close your Engine and create a new one every X iterations" -> I will try and report back, indeed. This is a clever idea: I already know that closing the Engine after each object evaluation frees the memory. In my current 'workaround' solution I put the Engine object into a wrapper whose finalize() method is overridden to close the Engine, and the wrapper is referenced via a SoftReference: when memory pressure becomes high, the SoftReference becomes eligible for GC and the finalisation of the wrapper closes the Engine. It is not elegant but, with this approach, I can tell you that I have no "leak".

I will keep digging here.

Thanks for all the valid points!

Best regards,
Dario
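The workaround described in the last bullet above can be sketched in plain Java. FakeEngine and the other names below are illustrative stand-ins, not the real polyglot Engine, and finalize() is deprecated on modern JDKs; this only shows the shape of the pattern:

```java
import java.lang.ref.SoftReference;

/** Illustrative stand-in for an expensive engine; not the real polyglot Engine. */
class FakeEngine implements AutoCloseable {
    boolean closed = false;
    @Override public void close() { closed = true; }
}

/** Wrapper whose finalizer closes the engine once the GC reclaims the wrapper. */
class EngineWrapper {
    final FakeEngine engine = new FakeEngine();
    @Override protected void finalize() { engine.close(); }
}

/** Caches the wrapper behind a SoftReference so high memory pressure can reclaim it. */
class SoftEngineCache {
    private SoftReference<EngineWrapper> ref = new SoftReference<>(null);

    EngineWrapper get() {
        EngineWrapper w = ref.get();
        if (w == null) {                  // cleared by the GC, or never created
            w = new EngineWrapper();      // recreate the engine on demand
            ref = new SoftReference<>(w);
        }
        return w;
    }
}
```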

@dariol83

Hi again,

I tried to do the following tests:

  • I am still using an Engine per expression;
  • I am using a counter X, so that each Engine is closed after X evaluations (per object)

See attached code:
[code screenshots attached]

If I set X to 1, then I see no memory leak: performance is slow, but there is no apparent leak. The allocation of Truffle objects goes up and down, and they seem to be properly collected (see the sawtooth profile, with memory around 2 GB). This is in line with the tests I already did in the past, so no surprise here.
[memory profile screenshots attached]

If I set X to 2, the application becomes memory hungry and memory usage jumps to 8 GB (the maximum).
[memory profile screenshots attached]

If I set X to 3, same thing:
[screenshot attached]

If I use 100, it gets worse, obviously:
[screenshot attached]

I don't know what I am doing wrong in my 20 lines of code, but the impression is that the Engine constructs some objects which are not disposed of at the end of the evaluation.

@xardasos

Hi,

I observed symptoms similar to the ones described by @horschi, and his solution worked in my case too (btw thanks Christian!).

I prepared a reproducer project for this memory leak
https://github.com/xardasos/GraalVMMemoryLeak.
@wirthi could you please take a look at it? On my PC I get an OOM error after around 23,000 iterations (VM options: '-Xmx512m').

Regards,
Tomasz

horschi commented Jul 12, 2021

@xardasos Glad I could help :-) Since I made the HostAccess static, things have been smooth.

I see you are using Graal version 19... in your test. Have you tried newer versions as well?
Currently I am using 20.3.1.2, which is fine. I tried 21.0.0.2, but it was leaking memory in my application.

Edit: whatever issue I am having in 21.0.0.2, I cannot reproduce it in a simple project.

@xardasos

I also tried running my test with 20.3.2 and 21.1.0 (Java 11); the leak is still there. I also checked the 20.3.1.2 version (Java 11), but the result is the same.


wirthi commented Jul 14, 2021

Hi @xardasos

thanks for your example.

Regarding the HostAccess, you answer it yourself in the code example: don't create it for each iteration. There is no need for that; if you cannot use one of the existing configurations (HostAccess.ALL, HostAccess.EXPLICIT, etc.), then create a static one and reuse it across all your engines/contexts.

You create an org.graalvm.polyglot.Engine every iteration. Again, this is not necessary; you can create just one Engine and share it across all Contexts.

Finally, is there a specific reason why you use GraalJSScriptEngine? You might have to use it if you interact with code that expects a ScriptEngine. But if that code is under your control, you are better off just using an org.graalvm.polyglot.Context directly.

Engine polyglotEngine = Engine.newBuilder().build();
for (int i = 1; i < 1000000; i++) {
    System.out.println(i);
    Context ctx = getContextBuilder().engine(polyglotEngine).build();
    ctx.eval("js", SCRIPT_BODY);
    ctx.close();
}
polyglotEngine.close();

6.6 MB heap usage after ~300,000 iterations of your loop.
[heap usage screenshot attached]

Edit: moving the Engine out of the loop and sharing it is the one change that makes the difference. Even when re-creating the HostAccess every time and using GraalJSScriptEngine, there is no leak if you reuse the Engine. Creating a new Engine every time creates the leak.

Best,
Christian

@boris-petrov

I would like to chime in with my observations. I'm using the latest Graal. My tests used to run without the JIT compiler, on the "default" settings (the interpreter), and all was fine. The latest version (from a few days ago) started printing some warnings, so I enabled JIT compilation for the tests (-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:-UseJVMCICompiler). Then I started getting OOM failures.

I use a single Engine, create a new Context with it for every execution (using a try-with-resources block to close it afterwards) and use shared Source objects. At the end of the tests the Engine is closed. Digging in, I noticed the following: even though the Engine is closed, a few threads named TruffleCompiler keep running at 100% CPU for quite a while afterwards, a couple of minutes. Memory usage spikes at that time (mostly after all tests are done). Running GC after that clears all the memory, but before that many GBs are taken.

Perhaps that's one source of the memory "leakage" that people observe. It is indeed cleared by a GC run, but why don't the compiler threads stop immediately after I close all Contexts and the Engine? Otherwise closing is kind of useless: I stop the Engine, my other tests continue, and I get an OOM because Graal is still compiling stuff I won't need while my memory is exhausted. 😄

@limanhei

I have the same observation as boris-petrov. I have a REST API that calls the Graal JS script engine. I also use a single Engine, create a new Context with it for every execution, and share the Source object. I ran a test with 100 concurrent connections to the API. Very strangely, the heap size shot up to 2.5 GB only after all the calls were complete, and after garbage collection a few minutes later the heap size shot up again, to 4 GB. As it is a REST API, absolutely no other threads should be consuming heap space. I tried the latest GraalVM 21.2.0 version; it still has this issue.

Running on Java 11 with the Graal JavaScript engine lib (https://mvnrepository.com/artifact/org.graalvm.js/js) does not have this issue, so I am using that solution for now, but I hope this issue can be resolved so that I get improved performance in my application.

limanhei commented Aug 19, 2021

is there any update on this issue please?

Tried stock Java with the upgrade-module-path method to enable the Graal compiler; still got the same memory leak. Not providing the upgrade module path avoids the memory leak, but the performance is worse than Nashorn.

@limanhei

I have to increase the memory to 6 GB to prevent an OOMKill; memory usage shoots up to 6 GB for a while and then comes down.

[memory usage screenshots attached]


wirthi commented Nov 17, 2021

Hi @limanhei

is there any update on this issue please

We will most likely only be able to help you if you provide an executable example that exhibits the behavior, ideally a minimal version containing the code around Context and Engine creation plus mock-up code. If we can run your example on our machines and see the leak, figuring out the source is trivial. Another option is a heap dump (e.g. from VisualVM); not quite as easy, but it is still possible to track the issue that way.

The only thing I can see in your heap snapshot is that the GraalVM compiler is still active (lots of org.graalvm.compiler.* entries). This means that the JIT compilation has not yet finished. Either your codebase is large enough that it just takes a while for everything to be compiled, or there could be a deoptimization loop (that would most likely be a bug on our side, where a certain pattern does not stabilize and e.g. Graal.js repeatedly sends new patterns to the compiler to optimize); that would be a peak-performance problem, but typically not a memory leak.

In general, we are not aware of any open memory leaks, so we have no clue what to investigate. As I have written above, we typically find that the Context or Engine API is used wrongly, making it harder or impossible for the garbage collector to clean up. So seeing your actual usage of the API might give us a hint about what is wrong.

Tried stock Java with the upgrade-module-path method to enable the Graal compiler; still got the same memory leak issue. Not providing the upgrade module path avoids the memory leak, but the performance is worse than Nashorn.

The best possible solution is to use a GraalVM directly; then you should not have any troubles with setting the correct paths. If that is not possible, https://www.graalvm.org/reference-manual/js/RunOnJDK/ and https://github.com/graalvm/graal-js-jdk11-maven-demo should show you how to properly set up Graal.js and the GraalVM compiler on a stock JDK installation.

Best,
Christian

limanhei commented Jun 28, 2022

Hi Christian,

The JavaScript codebase itself is 2.4 MB, consisting of 373 functions in 14,364 lines. Is that considered large? I tried to replicate the issue in my own code, but due to security policy I couldn't create a minified version of this JavaScript codebase. I hope you can provide some optimization for large codebases so that I can move my application to GraalVM. Thanks a lot for your help!

The way I eval the source

private val engine = Engine.create()
private val source: Source
private val contextBuilder = Context.newBuilder().engine(engine).allowHostAccess(HostAccess.ALL)

init {
    source = Source.newBuilder("js", bigSourceString, "all_source").buildLiteral()
}

fun execute(executionObj: Any) {
    val context = contextBuilder.build()
    context.enter()
    context.eval(source)
    val bindings = context.getBindings("js")
    functionNames.forEach {
        bindings.getMember(it).executeVoid(executionObj)
    }
    context.leave()
    context.close()
}


wirthi commented Jun 28, 2022

Hi @limanhei

that piece of code looks fine. You are using a shared Engine, you create a new Context each time, and you use a cached Source object. That should trigger source-code caching and avoid recompilation problems.

Maybe you can run your java command with this argument: -Dpolyglot.engine.TraceCompilation=true. That should produce output similar to the following:

[engine] opt done   id=24    :program                                           |Tier 2|Time   454( 327+127 )ms|AST   59|Inlined   2Y   0N|IR    643/  2011|CodeSize    6642|Timestamp 22119660537657|Src all_source:1

Each line means that a certain part of the guest-language program was compiled. This will happen a lot initially, but should happen less and less frequently. While this compilation is going on, memory usage will be higher. Once there is no more compilation, memory usage should be low (and, except for some overhead, represent whatever your application needs).

With 2.4 MB of source code, that should take a few seconds, maybe up to a minute or two, but it highly depends on the actual patterns used in your code, how much of the code is executed repeatedly, how much is only initialization code, etc. Maybe let it run until there are no more [engine] outputs printed and measure the time. That alone is an interesting measurement.

Two additional things to learn from that output:

  • If you don't see it at all, you are not using our optimizing compiler. Performance will be bad.
  • If it does not go away and the same methods keep getting compiled (you may even see repeated opt/deopt), that is a sign of trouble you should report independently. It is not a problem in itself, only if it keeps deoptimizing the same method repeatedly.

(Minor) I don't think you strictly need the context.enter() and context.leave(); those should be implicit in your example (you create and destroy the context anyway). Unless there is some multithreading going on that you don't show in your example?

Best,
Christian

limanhei commented Jun 29, 2022

Hi Christian,

I saw 1 opt failed: GraphTooBigBailoutException: Graph too big to safely compile. Node count: 400002. Limit: 400000.

The compilation only started when I stopped the load test. Except for the failed compilation above, the others are opt done. After a while the compilation still continues, and then the container gets killed.

limanhei commented Jun 29, 2022

I think the problem I encountered is that the JVMCI actually uses memory outside the JVM heap, so increasing the memory of the container without lowering MaxRAMPercentage has minimal effect on the issue. After I increased the container memory from 2 GB to 4 GB and lowered MaxRAMPercentage to 50 percent, I think it is stable. However, I am not sure how much memory is required for JVMCI, and this leaves a risk of an OOMKill. Is there any way to limit its memory usage?


wirthi commented Jun 30, 2022

Hi,

I saw 1 opt failed. GraphTooBigBailoutException: Graph too big to safely compile. Node count: 400002. Limit: 400000.

That is not critical: it means that our compiler was not able to compile that method because it became too big. That could have a performance impact if it is a frequently executed method, but it does not cause any correctness or memory issues. Can you state which method is reported to be affected? If it is from a public library (or you could share the relevant code), we might be able to look into the problem.

The compilation only started when I stopped the load test

I am not sure what exactly you mean by that, but in general, compilation of JavaScript methods should begin as soon as you execute them repeatedly. Maybe you just execute initialization code first, which does not contain any repeated function calls or loops in JavaScript?

What could also be the case here is that compilation happens on separate threads. If you saturate the available cores with other high-priority threads, the compilation threads starve and never get a chance to do their job.

... then the container got killed

Because it ran out of memory?

how much memory is required

I believe you have two crucial numbers: how much memory you need at peak while JIT compilation is happening, and how much you need to execute the application after the JIT compiler has done its job. The first number is the one you must provision for, but the second should be much lower, as the compiler and all the data and memory it requires should eventually disappear. Maybe you can tweak the number of compiler threads and thus trade compilation time for memory, if memory is critical for you?
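Capping the number of compiler threads, as suggested above, can be sketched with the polyglot API. `engine.CompilerThreads` is a Truffle engine option; the value `1` is an illustrative assumption, and on a stock JVM without the Truffle runtime the option may be rejected, so treat this as a sketch rather than a drop-in:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;

// Sketch: trade compilation speed for lower peak memory by limiting the
// Truffle compiler to a single background thread.
public final class LowMemoryEngine {
    public static void main(String[] args) {
        try (Engine engine = Engine.newBuilder()
                .allowExperimentalOptions(true) // some engine options are experimental
                .option("engine.CompilerThreads", "1")
                .build();
             Context ctx = Context.newBuilder("js").engine(engine).build()) {
            // Evaluate a trivial snippet to show the engine is usable.
            System.out.println(ctx.eval("js", "2 + 3").asInt());
        }
    }
}
```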

Again, all this is theoretical discussion. To really help you, we'd need to see the source of your application or at least have some insight into your architecture, heap dumps, compilation logs, etc.

Christian

@limanhei
Copy link

Thanks Christian!

All the JS code is developed by us; most of it is just if/then/else, nothing complicated.

We are running the application in a pod, so we need to know how much memory to allocate; otherwise we can easily run into an OOM kill. From my observation, the JIT compiler seems to use memory outside the JVM (correct me if I am wrong), so when I allocate memory to the pod I need to take this into account.

Another observation from my load test: when I run it with 1 thread, performance on GraalVM is better than on HotSpot with the Graal.js interpreter. However, with 75 client threads, HotSpot with the interpreter performs better; I am not sure whether that is because the script engine is shared.

florian-h05 added a commit to florian-h05/openhab-addons that referenced this issue Dec 2, 2022
…bject

Fixes this memory leak by making the HostAccess for the GraalJSScriptEngine available in a static final variable instead of creating it for each new engine.
Solution proposed in oracle/graaljs#121 (comment).

Sharing a single engine across all Contexts (as proposed in oracle/graaljs#121 (comment)) is not possible, because core expects a ScriptEngine.

Signed-off-by: Florian Hotze <florianh_dev@icloud.com>
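The fix referenced in this commit can be sketched as follows, under the assumption that the leak comes from building a fresh `HostAccess` (and its generated accessor classes) for every new engine. The class and method names other than the polyglot API are hypothetical, not from the openHAB code base:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.HostAccess;

// Sketch: build HostAccess once in a static final field and reuse it,
// instead of creating a new instance per engine/context.
public final class ContextFactory {
    private static final HostAccess HOST_ACCESS = HostAccess.newBuilder()
            .allowPublicAccess(true)
            .build();

    public static Context newContext() {
        return Context.newBuilder("js")
                .allowHostAccess(HOST_ACCESS) // shared, not per-instance
                .build();
    }

    public static void main(String[] args) {
        try (Context ctx = newContext()) {
            System.out.println(ctx.eval("js", "6 * 7").asInt());
        }
    }
}
```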
jlaur pushed a commit to openhab/openhab-addons that referenced this issue Dec 2, 2022
* [jsscripting] Fix memory-leak caused by com.oracle.truffle.host.HostObject

Fixes this memory leak by making the HostAccess for the GraalJSScriptEngine available in a static final variable instead of creating it for each new engine.
Solution proposed in oracle/graaljs#121 (comment).

Sharing a single engine across all Contexts (as proposed in oracle/graaljs#121 (comment)) is not possible, because core expects a ScriptEngine.

Signed-off-by: Florian Hotze <florianh_dev@icloud.com>

* [jsscripting] Update JavaDoc

Signed-off-by: Florian Hotze <florianh_dev@icloud.com>

* [jsscripting] Close `GraalJSScriptEngine` when `OpenhabGraalJSScriptEngine` is closed

My breakpoint inside the close method of GraalJSScriptEngine did not trigger until this change was made.

Signed-off-by: Florian Hotze <florianh_dev@icloud.com>
florian-h05 added a commit to florian-h05/openhab-addons that referenced this issue Dec 29, 2022
…alJSScriptEngine instances

See oracle/graaljs#121 (comment), it is not required to have one engine per GraalJSScriptEngine.

This might improve performance a bit on less powerful systems (Raspberry Pi) and decreases heap usage:
With 5 GraalJS UI scripts, heap usage is now below 100 MB. Before this change, it was over 100 MB.

Signed-off-by: Florian Hotze <florianh_dev@icloud.com>
jlaur pushed a commit to openhab/openhab-addons that referenced this issue Dec 30, 2022
* [jsscripting] Share org.graalvm.polyglot.Engine across all OpenhabGraalJSScriptEngine instances

See oracle/graaljs#121 (comment), it is not required to have one engine per GraalJSScriptEngine.

This might improve performance a bit on less powerful systems (Raspberry Pi) and decreases heap usage:
With 5 GraalJS UI scripts, heap usage is now below 100 MB. Before this change, it was over 100 MB.

* [jsscripting] Extend debug logging
* [jsscripting] Cache `@jsscripting-globals.js` across all engines

Signed-off-by: Florian Hotze <florianh_dev@icloud.com>
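The shared-Engine pattern referenced in this commit can be sketched as follows: one `Engine` caches parsed and compiled code, while each evaluation gets its own short-lived `Context`. The names below are illustrative, not from the openHAB code base:

```java
import org.graalvm.polyglot.Context;
import org.graalvm.polyglot.Engine;
import org.graalvm.polyglot.Source;

// Sketch: share a single Engine (and its code cache) across many Contexts.
public final class SharedEngineDemo {
    private static final Engine ENGINE = Engine.newBuilder().build();
    private static final Source SCRIPT =
            Source.newBuilder("js", "6 * 7", "demo.js").buildLiteral();

    static int run() {
        // Each call gets a fresh, isolated Context backed by the shared Engine.
        try (Context ctx = Context.newBuilder("js").engine(ENGINE).build()) {
            return ctx.eval(SCRIPT).asInt();
        }
    }

    public static void main(String[] args) {
        System.out.println(run() + " " + run());
    }
}
```

Because the `Source` object is reused, the second evaluation can hit the engine's code cache instead of re-parsing the script.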