New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recycle DOM nodes #8783

Merged
merged 21 commits into from Sep 17, 2015

Conversation

Projects
None yet
7 participants
@as-cii
Member

as-cii commented Sep 14, 2015

After meeting with the Chrome team, we have found out that our GC times were quite severe when many DOM nodes were appended and removed from the tree (e.g. a typical scenario where you can see this in practice is scrolling).

This is a chart that shows how master behaves when scrolling down for a while:

before

As you can see, the browser is triggering minor garbage collections during the scroll; at some point (which, I think, coincides with some Chrome threshold), a full GC is triggered and it takes quite a long time:

screen shot 2015-09-14 at 18 26 32

With this PR we are introducing a DOM elements pool, which efficiently allocates and deallocates nodes to minimize GC times. The result is the following:

after

screen shot 2015-09-14 at 18 25 47

馃搲 馃悗

Moreover, we were able to ditch all the code that dealt with strings that contained null bytes (e.g. #8660) because now those characters are retained in the actual text nodes.

Some other pair of 馃憖 are most welcome, especially to look out for performance regressions (it feels like innerHTML was a bit faster, but I couldn't confirm this).

Thanks!

/cc: @atom/feedback @nathansobo

as-cii added some commits Sep 14, 2015

@lee-dohm

This comment has been minimized.

Show comment
Hide comment
@lee-dohm

lee-dohm Sep 14, 2015

Member

It appears that the memory usage of the elements pool will never go down. It should reach an upper bound, of course, but I'm still concerned that someone could accidentally reach an edge case and then their editor is 馃敟 Can you describe what a "worst case scenario" would look like? I'll take a stab at it ...

It appears that this pool will reach an upper bound based on the most complex UI that has been displayed so far. So some things to watch out for might be?

  • Large screens? (Able to fit more stuff)
  • More elements tightly packed? (Minimized JavaScript perhaps?)
  • Lots of tabs open? (Or do non-visible elements get freed?)
  • Lots of panes? (Same as above)
Member

lee-dohm commented Sep 14, 2015

It appears that the memory usage of the elements pool will never go down. It should reach an upper bound, of course, but I'm still concerned that someone could accidentally reach an edge case and then their editor is 馃敟 Can you describe what a "worst case scenario" would look like? I'll take a stab at it ...

It appears that this pool will reach an upper bound based on the most complex UI that has been displayed so far. So some things to watch out for might be?

  • Large screens? (Able to fit more stuff)
  • More elements tightly packed? (Minimized JavaScript perhaps?)
  • Lots of tabs open? (Or do non-visible elements get freed?)
  • Lots of panes? (Same as above)
@maxbrunsfeld

This comment has been minimized.

Show comment
Hide comment
@maxbrunsfeld

maxbrunsfeld Sep 14, 2015

Contributor

Moreover, we were able to ditch all the code that dealt with strings that contained null bytes.

鉂わ笍 Awesome.

Some minor style suggestions:

  • I would call it DOMElementPool instead of DomElementsPool. We've been doing initialisms like DOM as uppercase in our code lately, and I think that singular noun seems to a more common convention for other pool-type objects, (e.g. "connection pool", "object pool").
  • Is DomElementsPool::free meant to be called from the outside? If not, I would probably inline at its one call-site in ::freeElementAndDescendants, to make it as clear as possible that you should always use the latter method.
Contributor

maxbrunsfeld commented Sep 14, 2015

Moreover, we were able to ditch all the code that dealt with strings that contained null bytes.

鉂わ笍 Awesome.

Some minor style suggestions:

  • I would call it DOMElementPool instead of DomElementsPool. We've been doing initialisms like DOM as uppercase in our code lately, and I think that singular noun seems to a more common convention for other pool-type objects, (e.g. "connection pool", "object pool").
  • Is DomElementsPool::free meant to be called from the outside? If not, I would probably inline at its one call-site in ::freeElementAndDescendants, to make it as clear as possible that you should always use the latter method.
@nathansobo

This comment has been minimized.

Show comment
Hide comment
@nathansobo

nathansobo Sep 14, 2015

Contributor

Love the way you structured this with the elements pool. I'm inclined to accept a tiny slowdown of line construction to avoid the big pauses. I'm going to build this and try it out, but in general I'm more worried about janky performance than slightly slower performance. But if you have ideas about how we can have the best of both worlds that would be good too.

Contributor

nathansobo commented Sep 14, 2015

Love the way you structured this with the elements pool. I'm inclined to accept a tiny slowdown of line construction to avoid the big pauses. I'm going to build this and try it out, but in general I'm more worried about janky performance than slightly slower performance. But if you have ideas about how we can have the best of both worlds that would be good too.

@as-cii

This comment has been minimized.

Show comment
Hide comment
@as-cii

as-cii Sep 14, 2015

Member

It appears that the memory usage of the elements pool will never go down. It should reach an upper bound, of course, but I'm still concerned that someone could accidentally reach an edge case and then their editor is Can you describe what a "worst case scenario" would look like? I'll take a stab at it ...

@lee-dohm: that's a really good concern but I think this won't be worse than what we have currently. Let me explain why:

  1. Large screens. Memory usage will reach an upper bound pretty quickly and, since the lines on screen will always be quite the same amount, we shouldn't waste memory. The worst case scenario here is when we haven't got much content on screen and, as a result, the pool is almost full. In practice, this could happen only at the end of a file when "scroll past end" is true.
  2. More elements tightly packed. I think the same reasoning of 1) applies here. The upper bound of the pool is the number of lines that can fit on a screen (which, roughly, is also our lower bound, except in the worst case scenario mentioned above).
  3. Lots of tabs/panes open. Currently, when we have more than one tab open, the ones that are hidden still retain all the nodes inside them and, as a result, we shouldn't see any difference in memory usage.

What do you think? Am I missing something there? On the other hand, I realized we need to make sure that the pool gets disposed properly so that its nodes can be garbage collected when the editor is destroyed.

Some minor style suggestions:

I would call it DOMElementPool instead of DomElementsPool. We've been doing initialisms like DOM as uppercase in our code lately, and I think that singular noun seems to a more common convention for other pool-type objects, (e.g. "connection pool", "object pool").
Is DomElementsPool::free meant to be called from the outside? If not, I would probably inline at its one call-site in ::freeElementAndDescendants, to make it as clear as possible that you should always use the latter method.


Love the way you structured this with the elements pool. I'm inclined to accept a tiny slowdown of line construction to avoid the big pauses. I'm going to build this and try it out, but in general I'm more worried about janky performance than slightly slower performance. But if you have ideas about how we can have the best of both worlds that would be good too.

Thanks for the suggestions guys! Eventually I went with DOMElementPool, but kept ::free as a private method (i.e. not testing it in the specs) 'cause it felt a bit cleaner code-wise.

Overall I feel pretty good about this, exception made for the lines construction routines which seems a bit slower. I'd like to hear your feedback as well after you try this out; in the meantime, I am going to research how we can minimize those times when/if we resort to a background thread.

Member

as-cii commented Sep 14, 2015

It appears that the memory usage of the elements pool will never go down. It should reach an upper bound, of course, but I'm still concerned that someone could accidentally reach an edge case and then their editor is Can you describe what a "worst case scenario" would look like? I'll take a stab at it ...

@lee-dohm: that's a really good concern but I think this won't be worse than what we have currently. Let me explain why:

  1. Large screens. Memory usage will reach an upper bound pretty quickly and, since the lines on screen will always be quite the same amount, we shouldn't waste memory. The worst case scenario here is when we haven't got much content on screen and, as a result, the pool is almost full. In practice, this could happen only at the end of a file when "scroll past end" is true.
  2. More elements tightly packed. I think the same reasoning of 1) applies here. The upper bound of the pool is the number of lines that can fit on a screen (which, roughly, is also our lower bound, except in the worst case scenario mentioned above).
  3. Lots of tabs/panes open. Currently, when we have more than one tab open, the ones that are hidden still retain all the nodes inside them and, as a result, we shouldn't see any difference in memory usage.

What do you think? Am I missing something there? On the other hand, I realized we need to make sure that the pool gets disposed properly so that its nodes can be garbage collected when the editor is destroyed.

Some minor style suggestions:

I would call it DOMElementPool instead of DomElementsPool. We've been doing initialisms like DOM as uppercase in our code lately, and I think that singular noun seems to a more common convention for other pool-type objects, (e.g. "connection pool", "object pool").
Is DomElementsPool::free meant to be called from the outside? If not, I would probably inline at its one call-site in ::freeElementAndDescendants, to make it as clear as possible that you should always use the latter method.


Love the way you structured this with the elements pool. I'm inclined to accept a tiny slowdown of line construction to avoid the big pauses. I'm going to build this and try it out, but in general I'm more worried about janky performance than slightly slower performance. But if you have ideas about how we can have the best of both worlds that would be good too.

Thanks for the suggestions guys! Eventually I went with DOMElementPool, but kept ::free as a private method (i.e. not testing it in the specs) 'cause it felt a bit cleaner code-wise.

Overall I feel pretty good about this, exception made for the lines construction routines which seems a bit slower. I'd like to hear your feedback as well after you try this out; in the meantime, I am going to research how we can minimize those times when/if we resort to a background thread.

@lee-dohm

This comment has been minimized.

Show comment
Hide comment
@lee-dohm

lee-dohm Sep 14, 2015

Member

What do you think? Am I missing something there?

I see now that you're releasing the freed elements in the build method, so that takes care of my concern 馃榾

Member

lee-dohm commented Sep 14, 2015

What do you think? Am I missing something there?

I see now that you're releasing the freed elements in the build method, so that takes care of my concern 馃榾

@abe33

This comment has been minimized.

Show comment
Hide comment
@abe33

abe33 Sep 14, 2015

Contributor

Hi there, nice work! Object pools have always been an efficient way to fix performances problem and have proven to be very effective (I'm using it for both pigments and in my csv editor package and it achieved great results).

I think, in the case of text editor tiles and lines, I can imagine a much more worst scenario even if it won't happen on each run:

When changing the font size from the settings view, if you don't type the size quickly enough you can end with lines with a height of 1px for a short moment. In that case your pool will reach an amount of elements completely meaningless after you revert to a proper font size. It'll also occurs if you unintentionally change the font size with the other available means, but I believe it will have less impact as the change should be less radical than when going from 14 to 1 and then to 12 for instance.

I frequently faced these kind of sudden burst of instances back when I was working in the flash game industry. You can have special events in a game that trigger a massive amount of particles (and we all love that 馃榿). But to avoid having a too huge memory footprint after that I was frequently capping the freed instances collection based on data collected during the testing.

Contributor

abe33 commented Sep 14, 2015

Hi there, nice work! Object pools have always been an efficient way to fix performances problem and have proven to be very effective (I'm using it for both pigments and in my csv editor package and it achieved great results).

I think, in the case of text editor tiles and lines, I can imagine a much more worst scenario even if it won't happen on each run:

When changing the font size from the settings view, if you don't type the size quickly enough you can end with lines with a height of 1px for a short moment. In that case your pool will reach an amount of elements completely meaningless after you revert to a proper font size. It'll also occurs if you unintentionally change the font size with the other available means, but I believe it will have less impact as the change should be less radical than when going from 14 to 1 and then to 12 for instance.

I frequently faced these kind of sudden burst of instances back when I was working in the flash game industry. You can have special events in a game that trigger a massive amount of particles (and we all love that 馃榿). But to avoid having a too huge memory footprint after that I was frequently capping the freed instances collection based on data collected during the testing.

@as-cii

This comment has been minimized.

Show comment
Hide comment
@as-cii

as-cii Sep 15, 2015

Member

Thank you for sharing this super useful insight! That's an edge case I hadn't think of and which we should definitely address.

I think we have a couple of ways in which this could be hypothetically fixed:

  1. Clearing out DOMElementPool when fontSize or lineHeight change. This is the most straightforward in my opinion and exploits the assumption that those parameters won't change that frequently under typical usage (and, most importantly, isn't worse than what we have currently).
  2. Setting an absolute upper bound, as @lee-dohm proposed. This needs to be configured, though, and there's the risk of choosing a value that's not quite right.
  3. Setting a relative upper bound. The idea here could be to work with ratios, configuring the pool to rebalance automatically once the ratio reaches a certain value (i.e. ratio here could be defined as freedElements / builtElements for a given series of moments t0, t1, t2...). Although this is probably the best solution possible, I'd prefer to keep the pool code simple and straight to the point unless some more complex interaction becomes necessary.

I went ahead and implemented 1) in daf4316. @abe33: your experience with games would be super helpful here, so please let me know if you think there could be anything wrong with this solution. Thanks again! 馃憤

Member

as-cii commented Sep 15, 2015

Thank you for sharing this super useful insight! That's an edge case I hadn't think of and which we should definitely address.

I think we have a couple of ways in which this could be hypothetically fixed:

  1. Clearing out DOMElementPool when fontSize or lineHeight change. This is the most straightforward in my opinion and exploits the assumption that those parameters won't change that frequently under typical usage (and, most importantly, isn't worse than what we have currently).
  2. Setting an absolute upper bound, as @lee-dohm proposed. This needs to be configured, though, and there's the risk of choosing a value that's not quite right.
  3. Setting a relative upper bound. The idea here could be to work with ratios, configuring the pool to rebalance automatically once the ratio reaches a certain value (i.e. ratio here could be defined as freedElements / builtElements for a given series of moments t0, t1, t2...). Although this is probably the best solution possible, I'd prefer to keep the pool code simple and straight to the point unless some more complex interaction becomes necessary.

I went ahead and implemented 1) in daf4316. @abe33: your experience with games would be super helpful here, so please let me know if you think there could be anything wrong with this solution. Thanks again! 馃憤

as-cii added some commits Sep 15, 2015

Merge branch 'master' into as-recycle-nodes
# Conflicts:
#	src/line-numbers-tile-component.coffee
#	src/lines-tile-component.coffee
@benogle

This comment has been minimized.

Show comment
Hide comment
@benogle

benogle Sep 15, 2015

Contributor

@as-cii 鉂わ笍 馃挌 馃挍

Contributor

benogle commented Sep 15, 2015

@as-cii 鉂わ笍 馃挌 馃挍

@abe33

This comment has been minimized.

Show comment
Hide comment
@abe33

abe33 Sep 16, 2015

Contributor

@as-cii Solution 1 seems fine too.

Solution 2 woud require some metrics to be really accurate and solution 3 seems more complex to me without having good metrics to analyze.

For instance if I remove all the code in a file we'll end with a ratio used/freed of 1/N where N is the number of lines used before the deletion, but we should not rebalance anything because this is kinda the expected result in this case, but at the same time, the same ratio could be obtained in the scenario I described above but in that case a rebalance would be required.

Contributor

abe33 commented Sep 16, 2015

@as-cii Solution 1 seems fine too.

Solution 2 woud require some metrics to be really accurate and solution 3 seems more complex to me without having good metrics to analyze.

For instance if I remove all the code in a file we'll end with a ratio used/freed of 1/N where N is the number of lines used before the deletion, but we should not rebalance anything because this is kinda the expected result in this case, but at the same time, the same ratio could be obtained in the scenario I described above but in that case a rebalance would be required.

as-cii added a commit that referenced this pull request Sep 17, 2015

@as-cii as-cii merged commit 3fe34ed into master Sep 17, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@as-cii as-cii deleted the as-recycle-nodes branch Sep 17, 2015

@as-cii

This comment has been minimized.

Show comment
Hide comment
@as-cii

as-cii Sep 17, 2015

Member

Thanks everyone! 鉂わ笍

Member

as-cii commented Sep 17, 2015

Thanks everyone! 鉂わ笍

@jeancroy

This comment has been minimized.

Show comment
Hide comment
@jeancroy

jeancroy Oct 31, 2015

Contributor

If the pool re-balance become an issue I'll throw one idea:

At each generation, free a small percent of the pool.
That way you'll avoid steady state error without having to maintain complicated statistic about ideal condition.

Suppose you free 1% of the pool per generation. Then suppose some event occurs that drastically reduce the number of node required (say snap to half of screen). After about 300 generations you'll come to about 5% of the new operating condition whatever that new condition is (0.99^300鈮0.05) .

If you sync those generation with screen refresh (say about 60 per second), it means you'll adapt in about 5 seconds to the new condition. Which is relatively short in term of wasted memory for the user, but at the same time an eternity if you compare to the 30ms timescale of the garbage collection screenshot.

Contributor

jeancroy commented Oct 31, 2015

If the pool re-balance become an issue I'll throw one idea:

At each generation, free a small percent of the pool.
That way you'll avoid steady state error without having to maintain complicated statistic about ideal condition.

Suppose you free 1% of the pool per generation. Then suppose some event occurs that drastically reduce the number of node required (say snap to half of screen). After about 300 generations you'll come to about 5% of the new operating condition whatever that new condition is (0.99^300鈮0.05) .

If you sync those generation with screen refresh (say about 60 per second), it means you'll adapt in about 5 seconds to the new condition. Which is relatively short in term of wasted memory for the user, but at the same time an eternity if you compare to the 30ms timescale of the garbage collection screenshot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment