New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FastBoot] Memory leaks in FastBoot mode #10910

Closed
joshvfleming opened this Issue Apr 20, 2015 · 21 comments

Comments

Projects
None yet
4 participants
@joshvfleming

Preliminaries

Simply stated, our use case is a small Ember application running in a Nashorn script engine on the JVM. Nashorn is an optimizing JIT JavaScript compiler, similar to V8. The test case is a medium-complexity page consisting of an update stream, similar to the news feed on sites like LinkedIn and Facebook. The figure below shows an example rendering of the test page.

stream
Rendered stream
stream-html
Raw output

The page continues on for 10 updates. Each update consists of a profile picture, headline, brief description, comments, and -- what will become important later -- a "like" action. The data for the page is fetched from static json files.

Our test methodology is very simple: use the Ember "visit" function to render the exact same page many times in a serial loop, and record the results for each iteration. This roughly approximates the conditions that the app will experience when running on a server. The figures below show the results of a test run.

heap-leak

The first thing to note here is the runaway heap usage. The figure above shows a dramatic increase in live heap allocations as time goes by, reaching nearly 4gb by the 2000th iteration. Each rendering of the page should be entirely independent of any previous render, so there should not be significant growth in memory consumption over time. Clearly there's a leak of some kind.

render-time-leak

The figure above shows how long it takes to render the test page for each iteration of the test. Everything renders very slowly at first, since the script engine is still figuring out how to compile and optimize the code. After the 500th iteration or so, average performance is pretty good. Note the frequent and large vertical blips, however. These are caused by garbage collection pauses.

Leak 1: ActionHelper

Each item in the stream has a corresponding template that defines the layout for its view. One component of the layout for a stream item is the "like" button.

actionhelper0

The figure above shows the part of the update template containing the "like" action. When the template is rendered, the "action" helper signals that we'd like to handle interactions with the button, and we eventually wind up in Ember's "actionHelper" function.

actionhelper1

var actionId = ActionHelper.registerAction(params[0], actionOptions, hash.allowedKeys);

The "actionHelper" function receives all relevant information about the potential interaction, including the "view" in which it might happen. It then places the view into the "actionOptions" object, and passes this object to the "registerAction" function.

actionhelper2

ActionManager.registeredActions[actionId] = {

In the "registerAction" function, an event handler object is built and then placed into the global action registry object "ActionManager.registeredActions". ActionManager is not really exposed as a global, but it's always reachable from some global variable, so this registry object is never collected. And since the "handler" function for each action closes over the "options" object, which in turn contains the "view" object, the views, model objects, templates, etc. are also never collected.

We've temporarily worked around the leak by manually clearing the "registeredActions" object after each render, with something like the following:

  for (var action in ActionHelper.registeredActions) {
    delete ActionHelper.registeredActions[action];
  }

Note that we can't simply set "registeredActions" to {}, because it's aliased in other places, and doing so would only wipe out this one reference to the object.

Note: As part of a follow-up investigation, @nathanhammond found that there actually is code in Ember that's meant to clean up the "registeredActions" object here:

options.view.on('willClearRender', function() {

It seems that the code path that triggers this event is never executed in FastBoot mode, however.

Leak 2: Application Creation

The node unit tests in Ember suggest that the proper usage is to create a new Application object for each run.

app1

app = createApplication();

However, this usage leads to another leak. The figure below shows that the Application class extends the Namespace class.

app2

var Application = Namespace.extend(DeferredMixin, {

The initial part of the Namespace definition is shown in the figure below.

app3

Namespace.NAMESPACES.push(this);

Whenever an Application object is created, the "init" function is called, and pushes the created application onto the global "NAMESPACES" array. If we create a new Application object for each render, then this array grows constantly over time, along with all the objects reachable from an Application.

We've worked around this one by creating a single Application instance and reusing it for each render. This seems to work fine, but it's not clear if it's the intended usage.

Note: We did try to use Ember.run(Application, 'destroy');, but it fails here:

this.__deprecatedInstance__.destroy();

with the following error:

TypeError: Cannot read property "destroy" from undefined

It seems that this.__deprecatedInstance__ is never being set in the "visit" flow.

Conclusion

With both leaks patched, the graphs look very much better.

heap-comparison

The figure above shows a comparison of heap usage before and after the memory leaks are removed. Some amount of leakage remains that we haven't yet identified, but otherwise it's dramatically improved.

render-time-comparison

The figure above shows the impact on render times. Most of the GC noise has been eliminated, and the overall render times are slightly lower.

Update: Test App

I've create an isolated test app that demonstrates the leak on node. You can find it here:

https://github.com/joshvfleming/fastboot-leaktest

Clone the repo, and do:

npm install
bower install
ember build -prod
node scripts/test.js

Select "ember#canary" if given an option in the third step. After running the last command, note the accumulation of action handlers, and watch the memory usage in your profiler.

@chadhietala

This comment has been minimized.

Show comment
Hide comment
@chadhietala

chadhietala Apr 20, 2015

Member

Hey @joshvfleming, do you have the patch mentioned above? 👍 on the issue write up.

Member

chadhietala commented Apr 20, 2015

Hey @joshvfleming, do you have the patch mentioned above? 👍 on the issue write up.

@joshvfleming

This comment has been minimized.

Show comment
Hide comment
@joshvfleming

joshvfleming Apr 20, 2015

@chadhietala Yes, but it's a temporary workaround patch on our internal wrapper api. I'm not yet confident enough to offer an Ember patch. :)

@chadhietala Yes, but it's a temporary workaround patch on our internal wrapper api. I'm not yet confident enough to offer an Ember patch. :)

@stefanpenner

This comment has been minimized.

Show comment
Hide comment
@stefanpenner

stefanpenner Apr 20, 2015

Member

I'm not yet confident enough to offer an Ember patch. :)

I suspect @dgeb and @tomdale's work to split the registry/container/resolver and separate Application from application instance come into play here.

You did forget to mention what version of ember this report is for, could you share?

Member

stefanpenner commented Apr 20, 2015

I'm not yet confident enough to offer an Ember patch. :)

I suspect @dgeb and @tomdale's work to split the registry/container/resolver and separate Application from application instance come into play here.

You did forget to mention what version of ember this report is for, could you share?

@stefanpenner

This comment has been minimized.

Show comment
Hide comment
@stefanpenner

stefanpenner Apr 20, 2015

Member

A quick skim yields the latest work to separate, App and appInstances should not cause multiple namespaces to leak.

  1. app is created
  2. multiple appInstances are created
  3. App.destroy -> https://github.com/emberjs/ember.js/blob/master/packages/ember-runtime/lib/system/namespace.js#L55-L66

Note: We did try to use Ember.run(Application, 'destroy');, but it fails here:

this.__deprecatedInstance__.destroy();

I suspect fixing this would allow you to actually call destroy and should likely address this specific leak (may also address the second leak).

Member

stefanpenner commented Apr 20, 2015

A quick skim yields the latest work to separate, App and appInstances should not cause multiple namespaces to leak.

  1. app is created
  2. multiple appInstances are created
  3. App.destroy -> https://github.com/emberjs/ember.js/blob/master/packages/ember-runtime/lib/system/namespace.js#L55-L66

Note: We did try to use Ember.run(Application, 'destroy');, but it fails here:

this.__deprecatedInstance__.destroy();

I suspect fixing this would allow you to actually call destroy and should likely address this specific leak (may also address the second leak).

@stefanpenner

This comment has been minimized.

Show comment
Hide comment
@stefanpenner

stefanpenner Apr 20, 2015

Member

As for the actions stuff – interestingly – that global isn't even really needed in fast-boot mode, as DOM events -> actions isn't a thing for SSR. That being said, I am pretty surprised that the teardown isn't happening automatically.

This may be the result of not destroying, which would relate this back to the namespace leak itself.

Do you have a demo repo you can share of you above example?

Member

stefanpenner commented Apr 20, 2015

As for the actions stuff – interestingly – that global isn't even really needed in fast-boot mode, as DOM events -> actions isn't a thing for SSR. That being said, I am pretty surprised that the teardown isn't happening automatically.

This may be the result of not destroying, which would relate this back to the namespace leak itself.

Do you have a demo repo you can share of you above example?

@joshvfleming

This comment has been minimized.

Show comment
Hide comment
@joshvfleming

joshvfleming Apr 20, 2015

@stefanpenner Sorry, I should've mentioned that -- I'm using the latest canary build of Ember.

As for sharing a repo, I'll see if I can put together a public version of the one I'm using. In the meantime, it should happen for any sample app that has a template "action" of some kind, if you call "visit" for the same route repeatedly.

Edit: you can also reproduce the Ember.run(Application, 'destroy'); thing by adding it to one of the tests in https://github.com/emberjs/ember.js/blob/master/tests/node/app-boot-test.js

@stefanpenner Sorry, I should've mentioned that -- I'm using the latest canary build of Ember.

As for sharing a repo, I'll see if I can put together a public version of the one I'm using. In the meantime, it should happen for any sample app that has a template "action" of some kind, if you call "visit" for the same route repeatedly.

Edit: you can also reproduce the Ember.run(Application, 'destroy'); thing by adding it to one of the tests in https://github.com/emberjs/ember.js/blob/master/tests/node/app-boot-test.js

@stefanpenner

This comment has been minimized.

Show comment
Hide comment
@stefanpenner

stefanpenner Apr 20, 2015

Member

Edit: you can also reproduce the Ember.run(Application, 'destroy'); thing by adding it to one of the tests in https://github.com/emberjs/ember.js/blob/master/tests/node/app-boot-test.js

i'll look into this now.

Member

stefanpenner commented Apr 20, 2015

Edit: you can also reproduce the Ember.run(Application, 'destroy'); thing by adding it to one of the tests in https://github.com/emberjs/ember.js/blob/master/tests/node/app-boot-test.js

i'll look into this now.

@stefanpenner

This comment has been minimized.

Show comment
Hide comment
@stefanpenner

stefanpenner Apr 20, 2015

Member

@joshvfleming so I realized, nothing purges app instances after visits, so even with destroy. We likely continue to leak, as destroy is actually uncommon in the SSR mode.

some cleanup: #10912
making destroy work: #10913 as i believe it should work

  • I suspect this may not actually fix the leak the way we want, as nothing automatically prunes app instances.

I only had a few moments today to look at this, and am off to do more pairing. But, I will dig into this more once I am back from pairing.

Member

stefanpenner commented Apr 20, 2015

@joshvfleming so I realized, nothing purges app instances after visits, so even with destroy. We likely continue to leak, as destroy is actually uncommon in the SSR mode.

some cleanup: #10912
making destroy work: #10913 as i believe it should work

  • I suspect this may not actually fix the leak the way we want, as nothing automatically prunes app instances.

I only had a few moments today to look at this, and am off to do more pairing. But, I will dig into this more once I am back from pairing.

@joshvfleming

This comment has been minimized.

Show comment
Hide comment
@joshvfleming

joshvfleming Apr 20, 2015

Thanks @stefanpenner. I'm still working on a sample app that demonstrates the leak outside our environment. Should be ready by tomorrow.

Thanks @stefanpenner. I'm still working on a sample app that demonstrates the leak outside our environment. Should be ready by tomorrow.

@joshvfleming

This comment has been minimized.

Show comment
Hide comment
@joshvfleming

joshvfleming Apr 21, 2015

@stefanpenner I managed to get the test app repo up:

https://github.com/joshvfleming/fastboot-leaktest

Run npm install, bower install, ember build -prod, and then do:

node scripts/test.js

You should see the number of action handlers accumulating, and also a steady increase in memory usage.

@stefanpenner I managed to get the test app repo up:

https://github.com/joshvfleming/fastboot-leaktest

Run npm install, bower install, ember build -prod, and then do:

node scripts/test.js

You should see the number of action handlers accumulating, and also a steady increase in memory usage.

@stefanpenner

This comment has been minimized.

Show comment
Hide comment
@stefanpenner

stefanpenner Apr 21, 2015

Member

Awesome. Thanks for doing the leg work. My day is a bit busy, but I'll try to carve out time.

Member

stefanpenner commented Apr 21, 2015

Awesome. Thanks for doing the leg work. My day is a bit busy, but I'll try to carve out time.

@stefanpenner

This comment has been minimized.

Show comment
Hide comment
@stefanpenner

stefanpenner May 22, 2015

Member

I believe currently canary and fastboot are currently out of sync. Hopefully when they "work" with each other again, and now that #10913 is merged i'll take another swing at fixing this leak.

Member

stefanpenner commented May 22, 2015

I believe currently canary and fastboot are currently out of sync. Hopefully when they "work" with each other again, and now that #10913 is merged i'll take another swing at fixing this leak.

@tomdale

This comment has been minimized.

Show comment
Hide comment
@tomdale

tomdale Jun 15, 2015

Member

FastBoot should be back to good now (thanks to @rwjblue). I am working on making {{{tripleCurlies}}} work today and hopefully will have time to look into this this week.

Member

tomdale commented Jun 15, 2015

FastBoot should be back to good now (thanks to @rwjblue). I am working on making {{{tripleCurlies}}} work today and hopefully will have time to look into this this week.

@stefanpenner stefanpenner added the Bug label Jun 29, 2015

@stefanpenner

This comment has been minimized.

Show comment
Hide comment
@stefanpenner

stefanpenner Jul 14, 2015

Member

I believe several fixes have landed, that:

  1. allow destroy to be called on the AppInstance
  2. fix some unrelated to your analysis, but potentially related leaks

So, I have given your example a try, after upgrading to latest canary and ensuring the instance is destroyed the RSS levels out and stays stable.

I have submitted the adjustments here joshvfleming/fastboot-leaktest#1 by sure to run the process with --export-gc so you can see the lovely logging, that demonstrates that the RSS settles after some time.

This indicates the leak reported in this issue is fixed. I suspect additional leaks are likely around, if someone finds some more a nice reproduction would be great, and should making tracking down the issue relatively easy.

Anyways, @joshvfleming thank you for the detailed report. I suspect we can now close this issue.

Member

stefanpenner commented Jul 14, 2015

I believe several fixes have landed, that:

  1. allow destroy to be called on the AppInstance
  2. fix some unrelated to your analysis, but potentially related leaks

So, I have given your example a try, after upgrading to latest canary and ensuring the instance is destroyed the RSS levels out and stays stable.

I have submitted the adjustments here joshvfleming/fastboot-leaktest#1 by sure to run the process with --export-gc so you can see the lovely logging, that demonstrates that the RSS settles after some time.

This indicates the leak reported in this issue is fixed. I suspect additional leaks are likely around, if someone finds some more a nice reproduction would be great, and should making tracking down the issue relatively easy.

Anyways, @joshvfleming thank you for the detailed report. I suspect we can now close this issue.

@joshvfleming

This comment has been minimized.

Show comment
Hide comment
@joshvfleming

joshvfleming Jul 14, 2015

This is great, thanks @stefanpenner ! I'll check it out today.

This is great, thanks @stefanpenner ! I'll check it out today.

@joshvfleming

This comment has been minimized.

Show comment
Hide comment

Confirmed:

memory

@tomdale

This comment has been minimized.

Show comment
Hide comment
@tomdale

tomdale Jul 14, 2015

Member

Yessssss

Member

tomdale commented Jul 14, 2015

Yessssss

@joshvfleming

This comment has been minimized.

Show comment
Hide comment
@joshvfleming

joshvfleming Jul 15, 2015

@stefanpenner Whoops, it looks like there is actually a leak remaining: the NAMESPACES leak. This didn't appear in the test app previously, because it was creating and reusing only one application. I've updated it to create an application for each request.

You should see steadily increasing RSS if you run the test script now.

Edit: It occurs to me that I might be using the Application object wrong. Is the intended usage that you create a single Application and use it to service all the requests? If so there's not really a leak here.

@stefanpenner Whoops, it looks like there is actually a leak remaining: the NAMESPACES leak. This didn't appear in the test app previously, because it was creating and reusing only one application. I've updated it to create an application for each request.

You should see steadily increasing RSS if you run the test script now.

Edit: It occurs to me that I might be using the Application object wrong. Is the intended usage that you create a single Application and use it to service all the requests? If so there's not really a leak here.

@chadhietala

This comment has been minimized.

Show comment
Hide comment
@chadhietala

chadhietala Jul 15, 2015

Member

This may be related to issues with tests as well as we are getting a new instance for each acceptance test. #11748

Member

chadhietala commented Jul 15, 2015

This may be related to issues with tests as well as we are getting a new instance for each acceptance test. #11748

@stefanpenner

This comment has been minimized.

Show comment
Hide comment
@stefanpenner

stefanpenner Jul 15, 2015

Member

Is the intended usage that you create a single Application and use it to service all the requests? If so there's not really a leak here.

yes

Member

stefanpenner commented Jul 15, 2015

Is the intended usage that you create a single Application and use it to service all the requests? If so there's not really a leak here.

yes

@joshvfleming

This comment has been minimized.

Show comment
Hide comment
@joshvfleming

joshvfleming Jul 15, 2015

Cool, thanks @stefanpenner . Looks like it's all sorted out then.

Cool, thanks @stefanpenner . Looks like it's all sorted out then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment