Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AutoUpdate: Memory Leak #4849

Closed
melloware opened this issue May 30, 2019 · 33 comments

Comments

Projects
None yet
6 participants
@melloware
Copy link
Contributor

commented May 30, 2019

HISTORY

We recently updated an application from PF 6.1 to 6.2.20. Our application won't stay alive for more the 24 hours without going into a GC death spiral and running out of memory.

We have narrowed the leak down to leaking copies of javax.faces.component.StateHolderSaver

1) Environment

  • PrimeFaces version: 6.2.20
  • Does it work on the newest released PrimeFaces version? NO on 7.0.3
  • Does it work on the newest sources in GitHub? NO on 7.1.1-SNAPSHOT
  • Application server + version: Mojarra 2.2/2.3, Jboss EAP 7.1
  • Affected browsers: ALL

2) Steps to reproduce

Attached is a sample runner:
pf-memleak.zip

Run the application and navigate to http://localhost:8080/primefaces-test/test.xhtml and just leave the page running.

Run JVisualVM and just monitor javax.faces.component.StateHolderSaver you will see the instances climb. If you press "Perform GC" you will see the instances go down but then come back extremely fast as they all get re-instantiated. It will also garbage collect but start at a higher number each time like instances are being held onto to. After 24 hours and 1000's of users we eventually run out of all resources.

If you open 2 or 3 different browsers on your machine you will see the problem happen even faster and more aggressively.

5) Sample XHTML

	<h:form id="frmTest">
		<p:messages id="globalMessages">
			<p:autoUpdate />
		</p:messages>
		<p:poll id="pllTest" interval="1" listener="#{basicController.controllerMethod()}" update="txtTest " />
		<h:outputText id="txtTest" value="#{basicForm.value}" />
	</h:form>

6) Visual VM

image

7) Analysis

  • If you use Mojarra 2.2. or 2.3 we see the leak. Using MyFaces 2.2 and 2.3 it doesn't appear to happen but it looks like MyFaces uses different classes.
  • If we remove the <p:autoUpdate> and have the poller just do update="" our app no longer leaks these javax.faces.component.StateHolderSaver.
  • If you do "Perform GC" over and over you will see each time the StateHolderSaver garbage collect but each time they garbage collect but start at a higher number each time. Because we have say 1000-2000 users all on this page all day you can see how it can leak and not survive for more than 24 hours.
@tandraschko

This comment has been minimized.

Copy link
Member

commented May 30, 2019

I currently dont have much time but i suggest to dig deeper with yourkit. You might be able to get a free Version as Open source Guy :)

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented May 30, 2019

I also submitted through PRO account. I am digging deeper.

@tandraschko tandraschko added the defect label Jun 4, 2019

@cnsgithub

This comment has been minimized.

Copy link
Contributor

commented Jun 11, 2019

@melloware Cannot get your demo application running. Is there something special to execute?

[WARNING] Failed startup of context o.e.j.m.p.JettyWebAppContext@726934e2{/primefaces-test,[file:///XXX/pf-memleak/src/main/webapp/, jar:file:///XXX/.m2/repository/org/glassfish/javax.faces/2.2.17/javax.faces-2.2.17.jar!/META-INF/resources, jar:file://XXX/.m2/repository/org/primefaces/primefaces/7.0/primefaces-7.0.jar!/META-INF/resources, jar:file:///XXX/.m2/repository/org/omnifaces/omnifaces/2.7/omnifaces-2.7.jar!/META-INF/resources],UNAVAILABLE}{file:///XXX/pf-memleak/src/main/webapp/}
javax.servlet.UnavailableException: Class loading error for holder ListenerHolder@47be41c6{STARTING}: ${jsf-listener}
	at org.eclipse.jetty.servlet.BaseHolder.doStart(BaseHolder.java:99)
	at org.eclipse.jetty.servlet.ListenerHolder.doStart(ListenerHolder.java:111)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:356)
	at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1497)
	at org.eclipse.jetty.maven.plugin.JettyWebAppContext.startWebapp(JettyWebAppContext.java:360)
	at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1459)
	at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:852)
	at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:278)
	at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:545)
	at org.eclipse.jetty.maven.plugin.JettyWebAppContext.doStart(JettyWebAppContext.java:428)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:119)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:119)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:167)
	at org.eclipse.jetty.server.Server.start(Server.java:418)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:113)
	at org.eclipse.jetty.server.Server.doStart(Server.java:382)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
	at org.eclipse.jetty.maven.plugin.AbstractJettyMojo.startJetty(AbstractJettyMojo.java:477)
	at org.eclipse.jetty.maven.plugin.AbstractJettyMojo.execute(AbstractJettyMojo.java:343)
	at org.eclipse.jetty.maven.plugin.JettyRunMojo.execute(JettyRunMojo.java:169)
	at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
	at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
	at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
	at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
	at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
	at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
	at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
	at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
	at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
	at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
	at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
	at org.codehaus.classworlds.Launcher.main(Launcher.java:47)
@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 11, 2019

@cnsgithub Are you sure you are running "mvn clean jetty:run-exploded" ???

here is my mvn --version output

Apache Maven 3.6.0 (97c98ec64a1fdfee7767ce5ffb20918da4f719f3; 2018-10-24T14:41:47-04:00)
Maven home: C:\Tools\apache-maven-3.6.0
Java version: 1.8.0_211, vendor: Oracle Corporation, runtime: C:\Tools\jdk1.8.0_211\jre
Default locale: en_US, platform encoding: Cp1252
OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"
@cnsgithub

This comment has been minimized.

Copy link
Contributor

commented Jun 11, 2019

I just removed this from web.xml:

    <listener>
        <listener-class>${jsf-listener}</listener-class>
    </listener>

Then jetty:run is working with YJP profiling from within IntelliJ.

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 11, 2019

thanks for investigating!

@tandraschko

This comment has been minimized.

Copy link
Member

commented Jun 11, 2019

the listener will be replaced only with jetty:run-exploded, therefore jetty:run won't work out of the box

@cnsgithub

This comment has been minimized.

Copy link
Contributor

commented Jun 11, 2019

I started with -Xms 256M -Xmx 4G. In the first 12 minutes the G1 Eden Space had been growing from 5MB to 145MB, then after automatic minor GC eden space was 5MB again. The old generation remained as is. So I cannot yet reproduce this issue.

@melloware How did you configure your JVM options? When you say you're application does not survive more than 24 hours, does the JVM crash with OutOfMemoryError? Maybe GC overhead limit exceeded? Do you have -XX:+HeapDumpOnOutOfMemoryError enabled?

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 11, 2019

Yes the Garbage Collection will appear to be working but its not reclaiming EVERYTHING. After each GC the number of javax.faces.component.StateHolderSaver will climb slowly. With only 1 user logged in it will be very slow to get an OOM.

With 1000's of users and running for 24 hours the javax.faces.component.StateHolderSaver eventually fill up the entire space. Its very subtle if you only have 1 user your app would probably last for weeks or months before it ran out of space.

@cnsgithub

This comment has been minimized.

Copy link
Contributor

commented Jun 11, 2019

With 1000's of users and running for 24 hours

How can we reproduce it then? Further decreasing the poll interval isn't possible. Would it help to just include many p:poll components or do you think it's rather related to the number of sessions which are created and kept alive by thousands of users? Then we would have to create a JMeter loadtest. Or just decrease the heap to an absolute minimum to get quicker results? What about -XX:+HeapDumpOnOutOfMemoryError?

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 11, 2019

We ended up writing JMeter tests against our production application to prove it.

But just to see the leak if you open 3 browsers (IE, Chrome, Firefox) for example and all leave them on the page. And just monitor javax.faces.component.StateHolderSaver you will see it climb and climb. And if it starts at say 20 instances it will climb to 100,000 instances and then when it GC's you will see it go back to 90 instances. then repeat GC it will return back to 245 instances etc etc. So after each GC that amount of held onto instances continues to climb and you will eventually run OOM. The more users you have the faster it will happen.

@alailson

This comment has been minimized.

Copy link

commented Jun 14, 2019

I am also experiencing the same Bug.

Configurations:
Wildfly 10- Primefaces 6.0 (no memory leak)
Wildfly 14- Primefaces 6.0 (no memory leak)
Wildfly 10- Primefaces 6.2 (memory leak)
Wildfly 14- Primefaces 6.2 (memory leak)
Wildfly 10- Primefaces 7.0 (memory leak)
Wildfly 14- Primefaces 7.0 (memory leak)

@alailson

This comment has been minimized.

Copy link

commented Jun 16, 2019

I've just tested in a Apache TomEE webprofile (MyFaces) 7.1.0 and no memory leak was detected. I think it is a problem with the Mojarra implementation.

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 17, 2019

@alailson I agree I could not get it to happen with MyFaces but could in all versions of Mojarra.

@tandraschko

This comment has been minimized.

Copy link
Member

commented Jun 17, 2019

Can you try to remove:
"component.subscribeToEvent(PreRenderComponentEvent.class...." in AutoUpdateListener?
Otherwise no idea
thats the essential difference compared to the old PF without p:autoUpdate

@tandraschko

This comment has been minimized.

Copy link
Member

commented Jun 24, 2019

@melloware ? :D

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 24, 2019

Sorry I plan on testing this! Will let you know what I find out.

@Rawi01

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2019

I just tried your sample project and the only "leak" i could find were some data referenced by some session stuff. I set the session timeout to one minute and the objects were removed properly after a short time.

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 24, 2019

@Rawi01 were you using Mojarra or MyFaces? If you were using Mojarra you have to let the app run for hours. A session timeout of 1 minute does not do my users any good. They are allowed to be in as "Guest" for 8 hours without being logged out.

@Rawi01

This comment has been minimized.

Copy link
Contributor

commented Jun 25, 2019

@melloware I simply used your sample project so it should be Mojarra. I only tried to decrease the session tieout to test if it cleans up the remaining state holders after that timeout and it does as expected.
I just started a new test and will let it run the next 8 hours.

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 25, 2019

Nice and after 8 hours with a 8 hour session timeout make sure you look at javax.faces.component.StateHolderSaver instances you will see it won't garbage collect them all.

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 26, 2019

I tried removing both component.subscribeToEvent and it had no effect. Memory Leak still there in Mojarra.

@Rawi01

This comment has been minimized.

Copy link
Contributor

commented Jun 27, 2019

I tried it several times now with different settings and was unable to reproduce the issue.

  • If I open the website and close it after a few requests everything gets cleaned up instantly
  • If I request the page using curl the GC takes some time to clean them up
  • Sometimes (I could not find any pattern) this also happens using the browser
  • These remaining objects were stored in the session
    In all cases the memory cleaned up properly after some time so there must be something else that triggers your problems.
@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 27, 2019

@Rawi01 Let me re-iterate my use case. I have an app that has been in production for 5 years on PrimeFaces 5.3 then 6.1, now on 6.2 with very little changes to it other than fixing things that PrimeFaces changes between versions.

  1. 8 hour session timeout
  2. 10,000 users sitting on a public page that refreshes its data every 5 seconds using AutoUpdate.
  3. Yes the session cleans up if the session is terminated but this display cannot just kill the users session every few minutes to clean it up. A user is allowed to leave this screen up all day (think of a stock ticker).
  4. In previous versions of PrimeFaces the app worked fine when we used autoupdate="true" instead of the new p:autoUpdate
  5. 2GB of Heap Space, the app in the above scenario runs out of memory in less than 24 hours. The culprit in the heap dump is javax.faces.component.StateHolderSaver

My example zip above won't ever run out of memory because you are running with 1 user. However it shows the each that after EACH GC a few more javax.faces.component.StateHolderSaver get left behind slowly creeping that number up. Now multiply it by 10,000 users and you can see how this "leak" however slow is a leak.

We have proved it does not happen with MyFaces but does happen with Mojarra 2.2 and 2.3. but what we don't understand is why because like @tandraschko said the code is not that different from the old autoupdate="true" to the new p:autoUpdate.

NOTE: I think a JMeter test is the only way you are going to see the problem.

@Rawi01

This comment has been minimized.

Copy link
Contributor

commented Jun 28, 2019

@melloware @tandraschko I was finaly able to reproduce the issue and also found a way to fix it (at least I think I did)

On every request the autoupdate taghandler gets called:

@Override
public void apply(FaceletContext faceletContext, UIComponent parent) throws IOException, FacesException, FaceletException, ELException {
boolean disabled = false;
if (disabledAttribute != null) {
disabled = disabledAttribute.getBoolean(faceletContext);
}
AutoUpdateListener.subscribe(parent, disabled);
}

The AutoUpdateListener.subscribe method adds new PostAddToViewEvent handlers for the component. These handlers gets also saved and restored by the state saving so for every request it adds an additional one and the list grows over time. In most cases this is no problem but if there are many different sessions with a long lifetime it might end up as a huge pile of garbage. You can see that the list grows if you add a simple System.out.println("Listeners: " + parent.getListenersForEventClass(javax.faces.event.PostAddToViewEvent.class)); as last line in the apply method.

For tag handlers there is the ComponentHandler.isNew(parent)helper method to check if its the first time the component gets rendered. If you add this one before the AutoUpdateListener.subscribe it only adds the listener once and the leak is gone. Unfortunately this will most likely break the disabled attribute.

I still have no idea why this does not happen using myfaces and why only one of the events that gets added leaks.

I reproduced the issue using a single browser to open the page and stop the whole page using the debugger. I trigger a GC after that and can see an increase every time I do this. Closing the page cleans up everything thats why I struggled to get it too.

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 28, 2019

@Rawi01 Awesome work! Want to submit a PR?

tandraschko added a commit that referenced this issue Jun 29, 2019

@tandraschko

This comment has been minimized.

Copy link
Member

commented Jun 29, 2019

latenight prototype commited, please test ;) also with dynamic disabled attr.

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jun 30, 2019

@tandraschko I will test in the morning and let you know the results.

tandraschko added a commit that referenced this issue Jun 30, 2019

tandraschko added a commit that referenced this issue Jun 30, 2019

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jul 1, 2019

Great news! Tested this morning and I can see with my sample everytime I press GC i can see it return to the same number of instances. So the memory is no longer climbing it is a fixed number of instances!

@tandraschko

This comment has been minimized.

Copy link
Member

commented Jul 1, 2019

Great! Could you please test a bit with static and dynamic disabled attribute? thanks.

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jul 1, 2019

OK I am not sure disabled is working correctly. I updated my test case to dynamically control the disabled flag but it looks like it never gets disabled. Attached is my test case.
pf-memleak.zip

@tandraschko

This comment has been minimized.

Copy link
Member

commented Jul 1, 2019

I think your example is wrong. Works fine for me:

`
<h:form id="frmTest">
<p:messages id="globalMessages">
<p:autoUpdate disabled="#{basicForm.enabled}"/>
</p:messages>

            <h:outputText id="txtTest" value="#{basicForm.value}">
                <p:autoUpdate disabled="#{basicForm.enabled}"/>
            </h:outputText>

	<p:poll id="pllTest" interval="1" listener="#{basicController.controllerMethod()}" update="" />

	<p:commandButton value="Toggle Disabled" id="ajax"  action="#{basicController.toggle}" update="@form" />
</h:form>

`

@tandraschko tandraschko closed this Jul 1, 2019

@tandraschko tandraschko added this to the 7.1 milestone Jul 1, 2019

@melloware

This comment has been minimized.

Copy link
Contributor Author

commented Jul 1, 2019

DOH! Yes you are correct.

@mertsincan mertsincan added the 7.0.5 label Jul 1, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.