Fix performance issues likely caused by unnecessary renders #50

taneliang · 2020-07-07T07:41:12Z

Steps

Get hold of a larger profile dataset (kind of done, I've generated a pretty large profile by clicking around the concurrent demo app. We'd ideally want to get hold of a profile from production though)
Figure out what the slowest points are
Design and implement a fix
... Further scope this task

Main user actions that cause renders

Cursor traversal over empty (i.e. no React data or flamegraph node) area
- Investigate this; are we repainting the entire canvas?
Cursor hover over React data or flamegraph node
- Currently, we're repainting the entire canvas, when the only thing that changes is the highlighted data/node.
- Potential optimizations
  - Use a second canvas, layered on top of the existing one, that only displays hovered data/nodes. Recommended by https://developer.mozilla.org/en-US/docs/Web/API/Canvas_API/Tutorial/Optimizing_canvas#Use_multiple_layered_canvases_for_complex_scenes. We need to see if there'll be any syncing issues if the canvas is scrolled when an item is hovered over.
  - Implement a rudimentary layout system that allows us to determine an area of the canvas to redraw, instead of redrawing the entire canvas. This will likely not be enough for larger profiles as the cost of rendering an entire section is pretty high. However, this will allow us to formalize and expand the scope of the rudimentary section stacking that we have now.
Scrolling/zooming/resizing/reloading.
- A rerender of the entire canvas is necessary, unless there's a way to translate the canvas buffer.
- Slow renders will mean jittery scrolling.
- Potential optimizations:
  - Unclear, but all will likely be in the individual render functions. We should look into how Chrome's performance tab is so smooth. (Update: Chrome isn't very smooth when there are many flamegraph events either) I don't think offscreen canvas will help here as almost all the computation is related to rendering the canvas.
  - Don't render React events when they're not visible.
  - Wild idea: investigate using WebGL instead of 2D canvas.

Summary of options

UIViews

Update: Implemented in #80.

Implement rudimentary layout system similar to iOS's UIViews/CALayers. This is intended to allow us to determine an area of the canvas to redraw, instead of redrawing the entire canvas.

Requirements:

Basics:
- Support bounding boxes: to prevent redrawing of the entire canvas.
- Only vertical stacked layouts required.
- Manual size overrides: to support section resizing (Add vertical resizer between canvas #19)
- Visibility detection: to provide a reusable way to avoid drawing of an invisible view.
- Hit testing and event handling: so that we can get the hovered event without recomputing the canvas's layout, which the existing getHoveredEvent current does.
- Subviews: to represent the logical sections. This will also let us support manual size overrides more smoothly.
Possible extensions:
- Horizontal layouts, which will allow us to use the view system to lay out the React data/flamechart nodes as well.
- "Scrolling" and "zooming"; translating some views' contents by a certain offset. We could implement UIScrollView if we have horizontal layouts but that sounds a bit overkill right now.
- If we implement scroll views and horizontal layouts, we can completely separate scrolling/zooming from the layout code.

Pros:

Formalizes the stacked logic that we have now.
Deduplicates the vertical drawing skipping code, allowing the rendering code to focus on drawing the data in their views.
Many potential extensions.
Allows the mouse hover code to depend on the drawing/layout code, instead of recomputing the locations of everything.
We can split this into a reusable package and open source it.

Cons:

May be overkill for our use case.
Does not improve the speed of a whole-screen render.
Does not reduce the amount of rendering required for scrolling/zooming operations. We already have some optimizations that prevents us from drawing most offscreen things; our view system will just make this optimization reusable and transparent.

Questions:

Does something like that already exist? We may be able to use that instead of rolling our own.
Can we use the browser's layout engine instead? We basically require position: sticky and scrolling, so it'll be great if we could just use the browser to do the actual layouts for us, possibly in invisible divs.

ReactART

A canvas renderer that's in the React repo.

Pros:

In the React codebase
Event handling built in
We can use React's tooling and approaches

Cons:

Support and stability are unknown

Questions:

Does clipping mean that items outside the clipping rectangle are not drawn?

Things tried:

Built a POC with ReactART.
Used this fork (Run requestAnimationFrame for no jank if a browser supports it sebmarkbage/art#24) that uses requestAnimationFrame instead of the existing setTimeout, drastically reducing the delay between React's commit phase and the commits being rendered onscreen (unlike in React DOM, ReactART's commit phase does not draw directly to the canvas; the Art library does it after or schedules another rendering task for that).
From inspecting the performance profiles, I found that ReactART's Rectangle was very slow as it drew rects using lines instead of calling the canvas context's rect method. As we are drawing a large number of rectangles, this was a significant bottleneck. I added a way to call the canvas context's rect to Art and ReactART:
1. Forked the fork of Art and implemented a Path.rect function that calls the canvas context's rect function
2. Changed ReactART's Rectangle class to call Path.rect if the radii are 0.
Use a for loop in our most common loop instead of .map.
Use React.memo on our components, especially FlamechartNode, to reduce unnecessary work during the reconciliation phase.

Resolution: Likely too slow to be suitable.

The API is very nice, and the hit testing and event handling are really convenient.

However, both React and ReactART are too slow, as we have too many items to render. This screenshot below shows a mouseover over a flamechart node in a big-ish profile with a total of 121813 nodes generated by clicking around my toy concurrent demo app. As there are too many React elements even in this toy profile, the reconciler takes a long time to do its work, even after React.memo has skipped the rendering of most elements. Additionally, when more nodes are visible, Art's canvas rendering code also takes a long time, seemingly due to its setTransform calls.

WebGL

Investigated Pixi.js and Two.js, but both are not performant enough to handle tens of thousands of rects every frame.

Optimize hot loop code

Replacing the division operation in this line with a constant reduced the runtime of renderFlamegraph by a surprising amount (eyeballed the flamegraph: it looks like that reduced the runtime by >60%). Looks like some work can be done here to optimize all the hot loops:

Tried using Prepack to lift up some of the math for us, but:
- Could not run it on the compiled bundle as Prepack was complaining that document was unknown.
- I could run it on the renderCanvas.js file, but it does not inline the constants file so it did not optimize those calculations. I did not look into this further as I'm not fully expecting Prepack to work completely and it'll take too much time.
Tried using Google Closure Compiler, but as expected there was no performance gain as it did not execute the division.

The text was updated successfully, but these errors were encountered:

taneliang · 2020-07-09T15:12:40Z

Hey @bvaughn, I have some questions related to the perf improvements:

Is there any restriction on the external libraries that can be used with our profiler, given that this code may live in the React codebase at some point? I'm thinking of exploring WebGL-based libraries as your code was already really efficient and we seem to be hitting the limit of what's possible with canvas.
Could we get hold of a sample profile from within Facebook in the next week or so? I'm trying to find out where the performance optimizations should be focused on, or if the current performance is already acceptable.
Would you happen to have any insider tricks for handling >100000 elements? I was trying out ReactART today, and the render phase is taking a significant amount of time because we have such a large number of flamechart nodes, even though I've tried using memo.

By the way, we've just been told that the MLH has scheduled a hackathon for the fellowship next week, so we likely won't be working on the profiler until the 20th.

bvaughn · 2020-07-09T15:35:24Z

Is there any restriction on the external libraries that can be used with our profiler

I believe the only restrictions would have to do with licensing. If it's a license like MIT then we should be fine. If it's a more restrictive license, or a pay-to-use license, then we would not be able to use it. If you have a particular package in mind I could take a look?

I'm thinking of exploring WebGL-based libraries as your code was already really efficient and we seem to be hitting the limit of what's possible with canvas.

Interesting. I have 0 experience with WebGL. I thought it was mostly for 3d. Can browser extensions run WebGL? I assume so but have no idea?

Could we get hold of a sample profile from within Facebook in the next week or so?

Yeah I should be able to do this in a day or so. 😄 Remind me if I forget.

Would you happen to have any insider tricks for handling >100000 elements?

By "elements" you mean... data points we render onto a canvas?

I guess the primary trick is: Don't do it 😅

I'd probably have to spend some time profiling and thinking about this. If the main problem is that we're hogging the UI thread iterating over tons of data, maybe there's a way we could front load some preprocessing to make our rendering more efficient.

Are we slowest when we're zoomed in? If so, maybe we could split the data into chunks somehow so we wouldn't have to iterate over e.g. all events but only chunks that cover the time range we're viewing.

Are we the slowest when we're zoomed out? Maybe we could preprocess into different zoom levels, and just filter out anything that would be too small to show at a certain level anyway (so we don't waste time iterating over it each time).

In other words, if the slowness is coming from our JS iterations- maybe we can reduce the work we're doing there. (In that case, I don't think WebGL would help us any.) If it's coming from the Canvas itself, then maybe we could find a way to render ~~less~~ fewer discrete things.

By the way, we've just been told that the MLH has scheduled a hackathon for the fellowship next week, so we likely won't be working on the profiler until the 20th.

Thanks for the heads up!

taneliang · 2020-07-09T16:26:31Z

If it's a license like MIT then we should be fine

Great! I haven't looked in detail into any packages just yet, but if I'll want to use one that's not using the MIT license I'll let you know.

I have 0 experience with WebGL. I thought it was mostly for 3d.

Me too, but I have a hunch it'll be faster. I'll read up further and make a proof of concept if it looks promising.

Can browser extensions run WebGL?

Great question! I didn't think of this. I'll check this first before looking any further; it'll be a real bummer if it isn't supported.

By "elements" you mean... data points we render onto a canvas?

Oh I didn't clarify this, sorry. This question was in the context of my ReactART poc that I was working on today. I hope I'm using the terms correctly, but essentially I was trying to turn your renderCanvas into React components, so I made a Flamechart component that rendered a FlamechartNode component 100k times, 1 for each flamechart node in the data. Even though FlamechartNode was wrapped in React.memo (and it's correctly memoized; memo provided a huge speedup), the poc was still spending a lot of its time in the render phase (there's a screenshot of a mouseover event profile in the issue description), so I was wondering if there were any other tricks to get React to handle so many elements. ReactART seemed promising so I was hoping we could use it, but it's fine if we can't.

Anyway, I think your response was in the context of the existing profiler code. The current profiler is slowest when zoomed out, because there are many more things to render. Here's a screenshot of a profile of our profiler:

It looks to me like most of the time is spent calling canvas context methods, which is why I'm thinking it may be a good idea to see if WebGL will be a good alternative. renderFlamechart also has a significant self time, but I'm not sure if anything can be done there. I'll look into it further.

The GPU can also tends to spend a really long time processing after a render, but I haven't looked into it. I'm also not sure if WebGL will help with this:

bvaughn · 2020-07-09T16:31:26Z

Oh I didn't clarify this, sorry. This question was in the context of my ReactART poc that I was working on today.

I wouldn't use ReactART for this. Too much overhead from creating all fo the React components. This is an area where using an imperative escape hatch (e.g. Canvas) is probably the only real solution

bvaughn · 2020-07-09T16:33:37Z

Anyway, I think your response was in the context of the existing profiler code. The current profiler is slowest when zoomed out, because there are many more things to render. Here's a screenshot of a profile of our profiler:

It looks to me like most of the time is spent calling canvas context methods, which is why I'm thinking it may be a good idea to see if WebGL will be a good alternative. renderFlamechart also has a significant self time, but I'm not sure if anything can be done there. I'll look into it further.

Gotcha. Maybe we can reduce the number of times we call Canvas methods somehow.

This commit begins a stack of PRs that optimizes our flamechart. Broadly, we need to replace the use of Speedscope's flamechart types with our own types that are optimized for our rendering code. As flamechart was always moved around together with ReactProfilerData, it makes sense to move flamechart into it. That also necessitates moving preprocessFlamechart into preprocessData. This PR works towards #50. * `yarn flow`: errors present, but no errors in affected code. * `yarn lint` * `yarn start`

… frame type Lays groundwork for #75 by reducing the flamechart data passed around the app. Was also intended to help #50 by moving some division operations from the hot render loops into preprocessFlamechart, but any performance gains are negligible. * `yarn flow`: no errors in modified code * `yarn lint` * `yarn start`: no performance difference

Allow the view system to render only affected flamechart rows. This greatly improves flamechart hover performance, which has a high impact on our app's performance as the flamechart is the slowest view to render. Related to #50. * `yarn flow`: no errors in changed code * `yarn lint` * `yarn start`: 60 FPS hovers on our Facebook.com profile

This commit begins a stack of PRs that optimizes our flamechart. Broadly, we need to replace the use of Speedscope's flamechart types with our own types that are optimized for our rendering code. As flamechart was always moved around together with ReactProfilerData, it makes sense to move flamechart into it. That also necessitates moving preprocessFlamechart into preprocessData. This PR works towards #50. * `yarn flow`: errors present, but no errors in affected code. * `yarn lint` * `yarn start`

… frame type Lays groundwork for #75 by reducing the flamechart data passed around the app. Was also intended to help #50 by moving some division operations from the hot render loops into preprocessFlamechart, but any performance gains are negligible. * `yarn flow`: no errors in modified code * `yarn lint` * `yarn start`: no performance difference

Allow the view system to render only affected flamechart rows. This greatly improves flamechart hover performance, which has a high impact on our app's performance as the flamechart is the slowest view to render. Related to #50. * `yarn flow`: no errors in changed code * `yarn lint` * `yarn start`: 60 FPS hovers on our Facebook.com profile

… frame type Lays groundwork for #75 by reducing the flamechart data passed around the app. Was also intended to help #50 by moving some division operations from the hot render loops into preprocessFlamechart, but any performance gains are negligible. * `yarn flow`: no errors in modified code * `yarn lint` * `yarn start`: no performance difference

Allow the view system to render only affected flamechart rows. This greatly improves flamechart hover performance, which has a high impact on our app's performance as the flamechart is the slowest view to render. Related to #50. * `yarn flow`: no errors in changed code * `yarn lint` * `yarn start`: 60 FPS hovers on our Facebook.com profile

… frame type Lays groundwork for #75 by reducing the flamechart data passed around the app. Was also intended to help #50 by moving some division operations from the hot render loops into preprocessFlamechart, but any performance gains are negligible. * `yarn flow`: no errors in modified code * `yarn lint` * `yarn start`: no performance difference

Allow the view system to render only affected flamechart rows. This greatly improves flamechart hover performance, which has a high impact on our app's performance as the flamechart is the slowest view to render. Related to #50. * `yarn flow`: no errors in changed code * `yarn lint` * `yarn start`: 60 FPS hovers on our Facebook.com profile

Allow the view system to render only affected flamechart rows. This greatly improves flamechart hover performance, which has a high impact on our app's performance as the flamechart is the slowest view to render. Also adds a new `ColorView` view that fills the flamechart area with a solid background color to fix these issues: 1. Black empty areas appearing below flamecharts without much stack depth 2. Lines appearing between flamechart rows. My hypothesis is that these are appearing due to subpixel rendering, even though the flamechart rows' `visibleArea`s are all integers. Related to #50. * `yarn flow`: no errors in changed code * `yarn lint` * `yarn start`: 60 FPS hovers on our Facebook.com profile * `yarn test`

taneliang · 2020-08-18T11:42:18Z

This issue has become a bit of a catch all issue for all performance issues in the app, but I think we've gotten perf to a pretty good level. 🎉 I'll close this issue and we can open more specific ones in the React repo if we have to

taneliang self-assigned this Jul 7, 2020

kartikcho mentioned this issue Jul 8, 2020

Add vertical resizer between canvas #19

Closed

5 tasks

taneliang added phase: 1 - MVP Tasks that get us to our initial MVP and removed phase: 2 - iteration Iterations on our MVP labels Jul 8, 2020

taneliang mentioned this issue Jul 9, 2020

Use integers when calling canvas draw methods #62

Closed

This was referenced Jul 28, 2020

[Optimize flamechart][1/4] Add flamechart to ReactProfilerData #90

Merged

[Optimize flamechart][2/4] Replace Speedscope types with custom stack frame type #91

Merged

taneliang mentioned this issue Jul 28, 2020

[Optimize flamechart][4/4] Split FlamechartView into row views #93

Merged

taneliang mentioned this issue Aug 6, 2020

Switch to concurrent React #122

Merged

taneliang closed this as completed Aug 18, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix performance issues likely caused by unnecessary renders #50

Fix performance issues likely caused by unnecessary renders #50

taneliang commented Jul 7, 2020 •

edited

Loading

taneliang commented Jul 9, 2020

bvaughn commented Jul 9, 2020 •

edited

Loading

taneliang commented Jul 9, 2020

bvaughn commented Jul 9, 2020

bvaughn commented Jul 9, 2020

taneliang commented Aug 18, 2020

Fix performance issues likely caused by unnecessary renders #50

Fix performance issues likely caused by unnecessary renders #50

Comments

taneliang commented Jul 7, 2020 • edited Loading

Steps

Main user actions that cause renders

Summary of options

UIViews

ReactART

WebGL

Optimize hot loop code

taneliang commented Jul 9, 2020

bvaughn commented Jul 9, 2020 • edited Loading

taneliang commented Jul 9, 2020

bvaughn commented Jul 9, 2020

bvaughn commented Jul 9, 2020

taneliang commented Aug 18, 2020

taneliang commented Jul 7, 2020 •

edited

Loading

bvaughn commented Jul 9, 2020 •

edited

Loading