perf(dashboard): Virtualization POC #21438

kgabryje · 2022-09-12T14:43:48Z

SUMMARY

This PR is a proof of concept for dashboard virtualization.
Due to a performance issues on large dashboards (60+ charts in a single tab/view), this PR introduces virtualization in an attempt to reduce the number of elements (DOM nodes) rendered at once.
In the initial state, we render only the charts in current viewport and charts max 1 view height away from current viewport.
When user scrolls down, charts that cross the 1 view height margin get rendered too.
We remove charts from DOM tree when they are at least 4 view heights away from current viewport.
Those values were chosen arbitrarily/experimentally to provide good middle-ground between memory efficiency and good user experience when scrolling up/down.

To sum it up: if chart is not rendered and it's less than 1 view height away from viewport - render it; if chart is rendered and it's more than 4 view heights away from viewport - remove it from DOM

Rerendering charts do NOT trigger new queries - we reuse data fetched on dashboard mount and only redraw the charts.

As of now, this feature is in proof-of-concept faze and should be considered as experimental. It's hidden behind a feature flag DASHBOARD_VIRTUALIZATION

Virtualization is disabled for bot workers (like Celery) to ensure that reports and alerts work correctly

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Screen.Recording.2022-10-09.at.18.45.21.mov

TESTING INSTRUCTIONS

Enable FF DASHBOARD_VIRTUALIZATION
Open a dashboard with a lot of charts
Scroll up and down - verify that only the charts in current viewport (or close to it - 100% margin) are rendered. The rest of them displays a spinner and loads only when user scroll them into view
Verify that charts render when user scrolls, but do not send new queries to the backend

ADDITIONAL INFORMATION

Has associated issue:
Required feature flags: DASHBOARD_VIRTUALIZATION
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

codecov · 2022-09-12T14:53:45Z

Codecov Report

Merging #21438 (d169d72) into master (f58227a) will decrease coverage by 0.01%.
The diff coverage is 22.72%.

❗ Current head d169d72 differs from pull request most recent head 67052ff. Consider uploading reports for the commit 67052ff to get more accurate results

@@            Coverage Diff             @@
##           master   #21438      +/-   ##
==========================================
- Coverage   66.84%   66.83%   -0.02%     
==========================================
  Files        1800     1802       +2     
  Lines       68951    68972      +21     
  Branches     7335     7342       +7     
==========================================
+ Hits        46092    46096       +4     
- Misses      20965    20980      +15     
- Partials     1894     1896       +2

Flag	Coverage Δ
hive	`52.92% <ø> (ø)`
javascript	`53.14% <22.72%> (-0.03%)`	⬇️
mysql	`78.24% <ø> (ø)`
postgres	`78.31% <ø> (ø)`
presto	`52.82% <ø> (ø)`
python	`81.45% <ø> (ø)`
sqlite	`76.79% <ø> (ø)`
unit	`51.04% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
...ackages/superset-ui-core/src/utils/featureFlags.ts	`100.00% <ø> (ø)`
.../src/dashboard/components/gridComponents/Chart.jsx	`55.04% <ø> (ø)`
...ashboard/components/gridComponents/ChartHolder.tsx	`88.88% <ø> (ø)`
...tend/src/utils/isDashboardVirtualizationEnabled.ts	`0.00% <0.00%> (ø)`
superset/config.py	`91.63% <ø> (ø)`
superset/views/base.py	`75.70% <ø> (ø)`
...nd/src/dashboard/components/gridComponents/Row.jsx	`51.21% <17.64%> (-23.79%)`	⬇️
superset-frontend/src/components/Chart/Chart.jsx	`51.66% <50.00%> (-0.88%)`	⬇️
superset-frontend/src/utils/isBot.ts	`100.00% <100.00%> (ø)`
...nts/controls/DateFilterControl/DateFilterLabel.tsx	`41.41% <0.00%> (ø)`

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

kgabryje · 2022-09-12T15:58:57Z

/testenv up FEATURE_DASHBOARD_VIRTUALIZATION

github-actions · 2022-09-12T16:01:30Z

@kgabryje Ephemeral environment spinning up at http://34.216.203.189:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

hushaoqing · 2022-09-13T07:45:14Z

Love this feature.
One suggestion: from my experience, it would be better if it didn't re-render when scrolling back.

villebro

LGTM, super nice feature! I kinda agree with @hushaoqing that leaving the charts rendered if they've already been rendered would be nicer. But since this is behind a FF we can probably iterate on these details in forthcoming PRs if there is a good reason not to make that change now.

michael-s-molina · 2022-09-13T11:59:05Z

We currently have react-virtualized and react-window in our dependencies that provide virtualization. react-window is the lighter-weight alternative. Can we use one of them for this problem?

kgabryje · 2022-09-13T13:11:58Z

Thank you for the feedback!
@hushaoqing @villebro Removing charts after scrolling them out of view was a conscious decision in order to reduce the number of nodes in DOM tree. However, I do agree that seeing the loading spinner all the time might be annoying, especially for the charts that take longer to render. It's very easy to change that behaviour, I'll do it later today. If we get some opposite feedback (charts should be removed), I can introduce another feature flag to make it configurable.

@michael-s-molina I considered using react-window, but I don't think it's applicable in this case. react-window expects a clear list of elements to render, with their heights and widths. The logic of rendering the dashboard is more complex - here we have DashboardComponents recursively rendering other DashboardComponents.
Moreover, even if we managed to restructure dashboard rendering logic to use react-window, that would mean that we not only delay drawing the chart, but also running chart queries - as it's the src/components/Chart.jsx component that handles running queries and we'd not render it until it's in view.
I think in this case the "manual" implementation of virtualization with IntersectionObserver is a better choice

michael-s-molina · 2022-09-13T13:24:09Z

@hushaoqing @villebro Removing charts after scrolling them out of view was a conscious decision in order to reduce the number of nodes in DOM tree. However, I do agree that seeing the loading spinner all the time might be annoying, especially for the charts that take longer to render. It's very easy to change that behaviour, I'll do it later today. If we get some opposite feedback (charts should be removed), I can introduce another feature flag to make it configurable.

One reason to render only visible items is memory footprint. Many virtualization libraries use that approach. That being said, may I suggest using pre-defined values for the feature flag instead of creating a new one? Like this:

DASHBOARD_VIRTUALIZATION = NONE | VIEWPORT | PAGINATED
NONE = No virtualization is applied
VIEWPORT = Only elements in the viewport are kept in the DOM
PAGINATED = Elements are added to the DOM when they enter the viewport and they are not removed

kgabryje · 2022-09-13T13:30:01Z

One reason to render only visible items is memory footprint. Many virtualization libraries use that approach. That being said, may I suggest using pre-defined values for the feature flag instead of creating a new one? Like this:

DASHBOARD_VIRTUALIZATION = NONE | VIEWPORT | PAGINATED NONE = No virtualization is applied VIEWPORT = Only elements in the viewport are kept in the DOM PAGINATED = Elements are added to the DOM when they enter the viewport and they are not removed

Do we have an established pattern for 3-state feature flags?

rusackas · 2022-09-13T14:08:43Z

Do we have an established pattern for 3-state feature flags?

I'm not aware of one, and it might be worth mentioning that a lot of companies use feature flag manager platforms that prefer to work (or only work) with booleans. Booleans feel safer.

Loving this feature, btw. I'm getting weird performance issues (i.e. the browser freezing) but i'll try it again in a bit and see if it's just a fluke.

kgabryje · 2022-09-13T14:34:10Z

I'm getting weird performance issues (i.e. the browser freezing) but i'll try it again in a bit and see if it's just a fluke.

If you're getting perf issues with virtualization, but not without it, that might mean that the intersection observer is too expensive to be put on every chart. If that's the case, we might want to put it on Row component instead and propagate the visibility prop down to charts...

kgabryje · 2022-09-13T14:36:34Z

Do we have an established pattern for 3-state feature flags?

I'm not aware of one, and it might be worth mentioning that a lot of companies use feature flag manager platforms that prefer to work (or only work) with booleans. Booleans feel safer.

I refactored it like this: DASHBOARD_VIRTUALIZATION: Literal["NONE", "VIEWPORT", "PAGINATED"] = "NONE". It's available in redux state as common.conf.DASHBOARD_VIRTUALIZATION. We use a similar pattern with another config option: DASHBOARD_AUTO_REFRESH_MODE: Literal["fetch", "force"] = "force". However, I'm also worried about potential problems with feature flag managers...

michael-s-molina · 2022-09-13T15:01:55Z

I'm not aware of one, and it might be worth mentioning that a lot of companies use feature flag manager platforms that prefer to work (or only work) with booleans. Booleans feel safer.

If that is a concern, a possible approach would be to treat it like a configuration instead of a feature flag, similar to CACHE_TYPE which defaults to null but can be overridden with different cache backends. This is not a blocker for me, I'm just trying to avoid two feature flags for the same feature.

kgabryje · 2022-09-13T15:14:25Z

@rusackas @michael-s-molina Updated the PR:

Make DASHBOARD_VIRTUALIZATION an enum config instead of a feature flag
VIEWPORT mode - remove charts that are not in view; PAGINATED - persist the already rendered charts; NONE - virtualization disabled
Move IntersectionObserver to Row. Now we'll have much fewer observers - 1 per row instead of 1 per chart

kgabryje · 2022-09-13T15:23:41Z

Unfortunately I can't start a testenv with virtualization since now it's a config instead of a feature flag, so if you want to test manually, you need to pull my branch to your local setup

stephenLYZ · 2022-09-14T04:18:43Z

Nice feature. Another case to consider is about color consistency. But this scenario is similar to the hidden tab, where the charts are lazy renderer, and this case will be addressed in another PR. cc @geido

mayurnewase · 2022-09-14T06:18:24Z

Awesome feature, tested locally.

michael-s-molina · 2022-09-14T11:45:01Z

@kgabryje @rusackas Thinking more carefully about the problem, I think we could dynamically choose the strategy (viewport or paginated) based on the number of charts that needs to be rendered. This way we would have the best of both strategies and avoid adding another feature flag or configuration.

villebro · 2022-09-14T13:39:15Z

superset-frontend/src/utils/isDashboardVirtualizationEnabled.ts

+export enum DASHBOARD_VIRTUALIZATION_MODE {
+  NONE = 'NONE',
+  VIEWPORT = 'VIEWPORT',
+  PAGINATED = 'PAGINATED',


Nit: I'm always nagging about this 😄 But I think we should stick to PascalCase enums in our TS codebase: https://www.typescriptlang.org/docs/handbook/enums.html

+1 on PascalCase. In the latest graphql-codegen CLI, it'd even force the backend enums to be PascalCase by default.

geido · 2022-09-16T11:07:11Z

Nice feature. Another case to consider is about color consistency. But this scenario is similar to the hidden tab, where the charts are lazy renderer, and this case will be addressed in another PR. cc @geido

Good catch @stephenLYZ. I think we can use the same approach we would use for the tabs. We would be catching series as they appear and either apply an existing color if that was seen before or assign a new color from the scheme and persist it.

geido

Love it! Just one preference from me, I think the options VIEWPORT and PAGINATED are not super clear at first glance. If we set the charts to load on scroll I think LAZY might be a more appropriate keyword.

michael-s-molina · 2022-09-16T11:29:11Z

Love it! Just one preference from me, I think the options VIEWPORT and PAGINATED are not super clear at first glance. If we set the charts to load on scroll I think LAZY might be a more appropriate keyword.

The problem is that both are lazy 😜. Anyway, I vote for removing the flag/configuration completely and choosing the strategy based on the number of charts that need to be rendered.

justinpark · 2022-09-16T18:53:29Z

@kgabryje Is this safe to support the screen capture from "Reports" worker?
I personally recommend using react-intersection-observer which can cover the fallback case of unsupported IntersectionObserver (which possibly issue on screen capture from the celery worker)

btw I recently introduced the similar feature (which only limits the runQuery) in #21488

kgabryje · 2022-09-19T16:05:11Z

Thank you all for feedback and suggestions!
@justinpark I'm not sure if delaying queries is a better solution. Queries are initialized in the order in which the charts are placed on the dashboard, so while we usually are exceeding the browser's simultaneous query limit, charts at the top of the page don't really need to wait for the charts at the bottom. Moreover, I've chosen to only delay drawing charts instead of drawing + running queries in order to reduce a possible "slow scroll" experience.

Great catch about the reports! Do you have any idea how to disable virtualization for celery workers?

@michael-s-molina Agreed, but we need to figure out the threshold between the 2 strategies. I was thinking about this: if the chart hasn't been in view yet, draw it when it's 1 view height away from the viewport (like we do now). If the chart has been drawn and viewed, remove it when it's 5 (arbitrarily chosen number) view heights away from the viewport. That way we won't remove charts in regular dashboards, but we will in big dashboards.
WDYT?

@villebro Good point about the Pascal case! However, if we can figure out the dynamic approach proposed by Michael we won't need that config, so I'm not changing that for now 🙂

michael-s-molina · 2022-09-19T16:33:56Z

@michael-s-molina Agreed, but we need to figure out the threshold between the 2 strategies. I was thinking about this: if the chart hasn't been in view yet, draw it when it's 1 view height away from the viewport (like we do now). If the chart has been drawn and viewed, remove it when it's 5 (arbitrarily chosen number) view heights away from the viewport. That way we won't remove charts in regular dashboards, but we will in big dashboards.
WDYT?

Sounds good. We can determine the arbitrary number by displaying a heavy viewport of charts, measuring the consumed memory, and multiplying it by what we consider acceptable.

ktmud · 2022-09-20T01:01:10Z

IIRC, react-virtualized has been discontinued by the author in favor of continued support for react-window. So that's another reason to not use react-virtualized.

ktmud · 2022-09-20T01:07:47Z

superset-frontend/src/dashboard/components/gridComponents/Row.jsx

+    if (
+      isDashboardVirtualizationEnabled(this.props.dashboardVirtualizationMode)
+    ) {
+      this.observer = new IntersectionObserver(


Should we disconnect and unobserve on unmount? Just to reduce the chance of memory leak.

…d of a feature flag

kgabryje · 2022-10-09T16:38:52Z

@michael-s-molina @justinpark @geido @villebro @hushaoqing @rusackas @ktmud Thank you all for feedback and suggestions... and your patience. I updated the PR - now the charts render when they are 1vh away from the viewport, but they're removed from DOM only if they are more than 4vh away from viewport. See the updated summary for more details

kgabryje · 2022-10-11T06:48:32Z

/testenv up FEATURE_DASHBOARD_VIRTUALIZATION

github-actions · 2022-10-11T07:33:09Z

@kgabryje Ephemeral environment spinning up at http://52.42.75.81:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

geido

Thanks for the enhancements!

github-actions · 2022-10-11T14:36:44Z

Ephemeral environment shutdown and build artifacts deleted.

kgabryje requested review from rusackas, zhaoyongjie, villebro, geido and michael-s-molina September 12, 2022 14:43

pull-request-size bot added the size/M label Sep 12, 2022

villebro approved these changes Sep 13, 2022

View reviewed changes

pull-request-size bot added size/L and removed size/M labels Sep 13, 2022

stephenLYZ approved these changes Sep 14, 2022

View reviewed changes

villebro reviewed Sep 14, 2022

View reviewed changes

geido approved these changes Sep 16, 2022

View reviewed changes

ktmud reviewed Sep 20, 2022

View reviewed changes

michael-s-molina mentioned this pull request Oct 7, 2022

feat(dashboard): menu improvements, fallback support for Drill to Detail #21351

Merged

9 tasks

kgabryje added 5 commits October 9, 2022 16:54

perf(dashboard): Virtualization POC

f2d0be3

Move IntersectionObserver to Row, make virtualization a config instea…

2c8356f

…d of a feature flag

Fix lint

a4e2ea7

Exclude bot user from virtualization

8f22d4b

Separate enabler and disabler observers

5d99cec

kgabryje force-pushed the perf/dashboard-virtualized branch from 0c7c41e to 5d99cec Compare October 9, 2022 15:48

Replace unobserve with disconnect

d169d72

fix test

67052ff

geido approved these changes Oct 11, 2022

View reviewed changes

kgabryje merged commit 406e44b into apache:master Oct 11, 2022

michael-s-molina mentioned this pull request Jan 16, 2024

chore: Sets DASHBOARD_VIRTUALIZATION feature flag to True by default #26636

Merged

9 tasks

mistercrunch added 🚢 2.1.3 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.1.0 and removed 🚢 2.1.3 labels Feb 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(dashboard): Virtualization POC #21438

perf(dashboard): Virtualization POC #21438

kgabryje commented Sep 12, 2022 •

edited

codecov bot commented Sep 12, 2022 •

edited

kgabryje commented Sep 12, 2022

github-actions bot commented Sep 12, 2022

hushaoqing commented Sep 13, 2022

villebro left a comment

michael-s-molina commented Sep 13, 2022

kgabryje commented Sep 13, 2022

michael-s-molina commented Sep 13, 2022

kgabryje commented Sep 13, 2022

rusackas commented Sep 13, 2022

kgabryje commented Sep 13, 2022

kgabryje commented Sep 13, 2022

michael-s-molina commented Sep 13, 2022

kgabryje commented Sep 13, 2022 •

edited

kgabryje commented Sep 13, 2022

stephenLYZ commented Sep 14, 2022

mayurnewase commented Sep 14, 2022

michael-s-molina commented Sep 14, 2022 •

edited

villebro Sep 14, 2022

ktmud Sep 19, 2022

geido commented Sep 16, 2022

geido left a comment

michael-s-molina commented Sep 16, 2022

justinpark commented Sep 16, 2022 •

edited

kgabryje commented Sep 19, 2022 •

edited

michael-s-molina commented Sep 19, 2022

ktmud commented Sep 20, 2022 •

edited

ktmud Sep 20, 2022

kgabryje Oct 9, 2022

kgabryje commented Oct 9, 2022 •

edited

kgabryje commented Oct 11, 2022

github-actions bot commented Oct 11, 2022

geido left a comment

github-actions bot commented Oct 11, 2022

perf(dashboard): Virtualization POC #21438

perf(dashboard): Virtualization POC #21438

Conversation

kgabryje commented Sep 12, 2022 • edited

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

codecov bot commented Sep 12, 2022 • edited

Codecov Report

kgabryje commented Sep 12, 2022

github-actions bot commented Sep 12, 2022

hushaoqing commented Sep 13, 2022

villebro left a comment

Choose a reason for hiding this comment

michael-s-molina commented Sep 13, 2022

kgabryje commented Sep 13, 2022

michael-s-molina commented Sep 13, 2022

kgabryje commented Sep 13, 2022

rusackas commented Sep 13, 2022

kgabryje commented Sep 13, 2022

kgabryje commented Sep 13, 2022

michael-s-molina commented Sep 13, 2022

kgabryje commented Sep 13, 2022 • edited

kgabryje commented Sep 13, 2022

stephenLYZ commented Sep 14, 2022

mayurnewase commented Sep 14, 2022

michael-s-molina commented Sep 14, 2022 • edited

villebro Sep 14, 2022

Choose a reason for hiding this comment

ktmud Sep 19, 2022

Choose a reason for hiding this comment

geido commented Sep 16, 2022

geido left a comment

Choose a reason for hiding this comment

michael-s-molina commented Sep 16, 2022

justinpark commented Sep 16, 2022 • edited

kgabryje commented Sep 19, 2022 • edited

michael-s-molina commented Sep 19, 2022

ktmud commented Sep 20, 2022 • edited

ktmud Sep 20, 2022

Choose a reason for hiding this comment

kgabryje Oct 9, 2022

Choose a reason for hiding this comment

kgabryje commented Oct 9, 2022 • edited

kgabryje commented Oct 11, 2022

github-actions bot commented Oct 11, 2022

geido left a comment

Choose a reason for hiding this comment

github-actions bot commented Oct 11, 2022

kgabryje commented Sep 12, 2022 •

edited

codecov bot commented Sep 12, 2022 •

edited

kgabryje commented Sep 13, 2022 •

edited

michael-s-molina commented Sep 14, 2022 •

edited

justinpark commented Sep 16, 2022 •

edited

kgabryje commented Sep 19, 2022 •

edited

ktmud commented Sep 20, 2022 •

edited

kgabryje commented Oct 9, 2022 •

edited