Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[automation] Avoid memory leak on script loading failure #4162

Merged
merged 1 commit into from Mar 27, 2024

Conversation

florian-h05
Copy link
Contributor

Core part of the fix for openhab/openhab-addons#16462.

Core part of the fix for openhab/openhab-addons#16462.

Signed-off-by: Florian Hotze <florianh_dev@icloud.com>
@florian-h05
Copy link
Contributor Author

@J-N-K Can you please have a look at this PR? It is only a very small change.

@J-N-K J-N-K added the bug An unexpected problem or unintended behavior of the Core label Mar 26, 2024
Copy link
Member

@J-N-K J-N-K left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we run into the same issue when DEBUG is enabled? Or is the reference cleared once the log is written?

@florian-h05
Copy link
Contributor Author

I am not sure if the reference is cleared at some time, it seemed so.

But of course we do run into the same issue when debug is enabled, but this mitigates the issue a bit.

@florian-h05
Copy link
Contributor Author

The logger is the only GC root I found for most of the instances, therefore it at least partially causes the memory leak, there might be another memory leak somewhere though, but I haven’t found one.

@J-N-K
Copy link
Member

J-N-K commented Mar 27, 2024

I guess it doesn't hurt, so let's merge it.

@J-N-K J-N-K added this to the 4.2 milestone Mar 27, 2024
@J-N-K J-N-K merged commit be9cbf2 into openhab:main Mar 27, 2024
3 checks passed
@florian-h05 florian-h05 deleted the script-load-failure-memory-leak branch March 27, 2024 22:45
@dilyanpalauzov
Copy link
Contributor

Does this change imply than now on JS files with syntax errors when debugging is enabled a string object is leaked?

If it is leaked and if I were you, I would describe at https://github.com/ops4j/org.ops4j.pax.logging/issues the problem with keeping references forever, asking for advice. The leak can be in pax.logging itself, so it could be fixed there by a future release, or it could be improper use of the pax.logging API by OH, or it could be in some software used by pax.logging (like log4j).

I cannot refile this case at https://github.com/ops4j/org.ops4j.pax.logging as I neither understand how pax.logging is used in OH nor how it works, and I will not be able to understand the replies.

@florian-h05
Copy link
Contributor Author

It looks like the reference is only held for some time, let me explain.
(I have tested with the JS Scripting add-on already including the fix, but openHAB Core not including this migitation.)

I am now running the reproduce script from openhab/openhab-addons#16462 for nearly half an hour now with a sleeping time of 5 seconds. This means the script file is edited 12 times a minute (6 times with syntax errors) and 360 times in half-an-hour (180 times with syntax errors).

But I only have 12 instead of 180 instances of OpenhabGraalJSScriptEngine (correspondens to 12 script engines for abcd.js) in the heap dump, which takes a good amount of memory, but means that PAX Logging does not hold the references forever.

Does this change imply than now on JS files with syntax errors when debugging is enabled a string object is leaked?

With this PR, we only migitate but not really fix the issue in openHAB core. I don't want to remove or change the debug logging here because for debugging it is very helpful, but this little change avoids the logger being called if log level is INFO, which avoids that PAX Logging gets a reference to the exception on INFO level. Since INFO is the standard level and most users won't have debug level enabled here, it should fix the issue in the most cases.

@jimtng
Copy link
Contributor

jimtng commented Mar 31, 2024

Has anyone noticed an increasing memory usage on successive jsr223 script (js, jruby, etc) reloads, even when there were no errors?

@florian-h05
Copy link
Contributor Author

No, at least not with JS.
Please have a look at openhab/openhab-addons#16462 (comment), where I mention some of my findings.

@dilyanpalauzov
Copy link
Contributor

It looks like the reference is only held for some time, let me explain.
(I have tested with the JS Scripting add-on already including the fix, but openHAB Core not including this migitation.)

I am now running the reproduce script from openhab/openhab-addons#16462 for nearly half an hour now with a sleeping time of 5 seconds. This means the script file is edited 12 times a minute (6 times with syntax errors) and 360 times in half-an-hour (180 times with syntax errors).

My understanding is that if only the changes from openhab/openhab-addons#16462 are applied without the changes from #4162 , then no GrallVM instances are passed to ops4j.pax.logging and for this reason ops4j.pax.logging cannot hold and leak GrallJSScriptEngine objects. If I am correct, this means that instead of GraalJSScriptEngine objects String objects are leaked.

Can you please amend to the String objects containg the error message, passed to openhab-core / ops4j.pax.logging, 100k of spaces and a timestamp, then run for a while and see if the memory steadily increases?

I guess that 12 instances of OpenhabGraalJSScriptEngine mean, that the script is changed so fast, that in order to parse it that many instances are needed. Or in other words, the speed of changing the script is higher than an OpenhabGraalJSScriptEngine instance needs to parse/interpret/execute it. I suggest repeating the test with slightly larger delays, so that a single OpenhabGraalJSScriptEngine has enough time to pares the .js file, before it is changed. Then at most two OpenhabGraalJSScriptEngine instances will be running: one for parsing, and possibly one more for executing the last successfully parsed code.

@dilyanpalauzov
Copy link
Contributor

Has anyone noticed an increasing memory usage on successive jsr223 script (js, jruby, etc) reloads, even when there were no errors?

You can create a test, similar to openhab/openhab-addons#16462 but for a different language, and follow the creation and evaluation of the test results there. If you test it, you will be more certain.

@stefan-hoehn
Copy link
Contributor

@dilyanpalauzov if you think that this is the case, please don't throw in hypothetical ideas but provide hard facts with systematic test, results and proof. Thanx.

@dilyanpalauzov
Copy link
Contributor

I cannot run OH to demonstrate a memory leak of 2000bytes per iteration, after the changes are applied, as to expose it I need to run many, many, many iterations, and each iteration is very slow in my test environment. In any case, if there is still a memory leak, it is much smaller now than before.

@stefan-hoehn
Copy link
Contributor

Which is then a non-issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An unexpected problem or unintended behavior of the Core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants