Bug 2053685: (Topology) Performance improvement by reducing rerenderings and deep-copy toJSON() calls #11001

Merged

Conversation

@jerolimov (Member) commented Feb 3, 2022

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=2053685

Tl;dr
The first commit adds only tests; the second commit adds caching for previously deep-cloned data.
Caching the JSON data that is already used in the topology allows the topology and many other components to skip useMemo and useEffect re-calculations and re-renderings. In the topology this matters because replacing an object in a GraphNode rerenders it.

Analysis / Root cause:
While analyzing our topology performance issues I found two main issues, which are addressed in this PR. (This is neither the first nor the last PR for topology.)

  1. The topology components rerender too much, even when only part of the data has changed.
  2. A lot of time is spent in the Immutable toJSON() method.

After further investigation I noticed that the toJSON function is called in Firehose, useK8sWatchResource and useK8sWatchResources when converting an Immutable object to JSON. This happens whenever a component that uses one of these hooks is rendered.

Hooks also rerender or recalculate too often with the same data because each created JSON object contains the same data in a new, deep-cloned object or array. Memoized components, hook dependencies, and other optimizations like the topology MobX graph tree depend on an identity check to decide whether to rerender.
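
For illustration, a hedged sketch (an assumed component, not code from this PR; import paths are approximate) of why identity matters for hook consumers:

import * as React from 'react';
import { useK8sWatchResource } from '@console/internal/components/utils/k8s-watch-hook';
import { PodKind } from '@console/internal/module/k8s';

const PodNames: React.FC<{ namespace: string }> = ({ namespace }) => {
  const [pods, loaded] = useK8sWatchResource<PodKind[]>({ kind: 'Pod', isList: true, namespace });

  // This memo only helps if `pods` keeps its identity between renders.
  // If the hook returned a fresh deep clone on every render, the memo would
  // recompute (and memoized children depending on it would rerender) every
  // time, even though the underlying data did not change.
  const podNames = React.useMemo(
    () => (loaded ? pods.map((pod) => pod.metadata?.name) : []),
    [pods, loaded],
  );

  return <>{podNames.join(', ')}</>;
};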

Solution Description:
This PR improves the performance of the topology, and of many other pages, with an "immutable toJSON" result cache in firehose.jsx and k8s-watcher.tsx.

It adds a cache that saves the converted JSON data on the immutable object itself. (Thanks Christian for this great idea.)
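
A minimal sketch of the caching idea (assumed helper and symbol names, not the exact console code): the conversion result is stored on the Immutable instance under a Symbol key, so converting the same unchanged object again returns the identical plain-JS reference instead of a new deep clone.

import { Map as ImmutableMap } from 'immutable';

const CACHE_SYMBOL = Symbol('_cachedToJSResult');

export const getCachedToJSON = (immutableData: ImmutableMap<string, any>): any => {
  // Immutable collections are never mutated in place, so a conversion cached
  // on the instance stays valid for the lifetime of that instance.
  if (!(immutableData as any)[CACHE_SYMBOL]) {
    (immutableData as any)[CACHE_SYMBOL] = immutableData.toJSON();
  }
  return (immutableData as any)[CACHE_SYMBOL];
};

Because the cache key is a Symbol, it does not collide with normal data keys and stays invisible to JSON serialization.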

♻️ To see and test what changes with this caching, I first added some tests for the Firehose component and the hooks.

  1. The first commit just adds tests for the status quo.
  2. The second commit adds the caching and updates the tests so it is easier to see which objects are now reused.

❓ Why not just remove Immutable from the Redux store?!

I think removing Immutable would be a good idea, but that is something for the future.

For the moment, adding this cache has some benefits: it is easier and safer to add, test, and verify, and I hope we can backport it. I don't expect that we would backport a rewrite of the reducers, selectors, etc.

⚠️ Potential risk: data models (Redux state) are now reused over time and by different components. Mutating such a data object would be a bad practice, similar to mutating a props object.
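
For illustration, a hedged sketch (hypothetical helper, not from this PR) of the pattern to avoid now that watch results are shared: treat the returned objects as read-only and copy before changing.

import { K8sResourceKind } from '@console/internal/module/k8s';

// `deployment` may now be the same shared, cached object that other
// components (and later renders) also receive from the watch hooks.
const withTeamLabel = (deployment: K8sResourceKind): K8sResourceKind => {
  // Bad: mutating the shared cached object in place would leak the change
  // into every other consumer, e.g.:
  // deployment.metadata.labels.team = 'dev';

  // OK: copy before changing, as you would with a props object.
  return {
    ...deployment,
    metadata: {
      ...deployment.metadata,
      labels: { ...deployment.metadata?.labels, team: 'dev' },
    },
  };
};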

Videos

Before - Load topology with many deployments (in my case 84)

before-topology-initial-load.mp4

After - Load topology with many deployments (in my case 84)

after-topology-initial-load.mp4

Before - Stay on the topology page with many deployments (here 84) and delete a random pod every 3 seconds

before-topology-deleting-pods.mp4

After - Stay on the topology page with many deployments (here 84) and delete a random pod every 3 seconds

after-topology-deleting-pods.mp4

Performance screenshots

Before - Load topology with many deployments (in my case 84)

before-topology-first-40sec

After - Load topology with many deployments (in my case 84)

after-topology-load

Before - Stay on the topology page with many deployments (here 84) and delete a random pod every 3 seconds

before-topology-updatepods-40sec

After - Stay on the topology page with many deployments (here 84) and delete a random pod every 3 seconds

todo

Screenshots / Gifs for design review:
UI isn't changed; see the videos and performance screenshots above.

Unit test coverage report:
Added a lot of tests for the existing Firehose component and the useK8sWatchResource and useK8sWatchResources hooks to check the status quo and track the differences introduced by this PR.
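
A hedged sketch of the kind of identity assertion such tests can make (using the getCachedToJSON helper sketched above, not the actual test code from this PR):

import { Map as ImmutableMap } from 'immutable';

it('returns the identical JS object for repeated conversions of the same Immutable map', () => {
  const immutablePod = ImmutableMap({ kind: 'Pod', metadata: ImmutableMap({ name: 'my-pod' }) });
  // With the cache in place, both calls return the very same reference,
  // so memoized consumers see "no change".
  expect(getCachedToJSON(immutablePod)).toBe(getCachedToJSON(immutablePod));
});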

Test setup:

  1. Create a project/namespace with a lot of Deployments (50+)
  2. Open the topology
  3. Delete some pods

You can find some scripts to create a cluster with a lot of load here: https://github.com/jerolimov/openshift/tree/master/loadtest

Browser conformance:

  • Chrome
  • Firefox
  • Safari
  • Edge

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 3, 2022
@openshift-ci openshift-ci bot added component/core Related to console core functionality component/sdk Related to console-plugin-sdk approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Feb 3, 2022
@jerolimov (Member Author) commented Feb 4, 2022

/cc @christianvogt @spadgett @jhadvig @invincibleJai
Hey, I will update the initial comment with more "numbers" and info next week. But I think this is already ready for an initial look.

@jerolimov jerolimov changed the title [WIP] Performance improvement by reducing rerenderings and deep-copy toJSON() calls [WIP] (Topology) Performance improvement by reducing rerenderings and deep-copy toJSON() calls Feb 4, 2022
@jerolimov (Member Author)

/cc @sanketpathak @jeff-phillips-18

@@ -19,7 +19,9 @@ const shallowMapEquals = (a, b) => {
return a.every((v, k) => b.get(k) === v);
};

const processReduxId = ({ k8s }, props) => {
const CACHE_SYMBOL = Symbol('_cachedToJSResult');
Contributor

It is safe to use the same symbol as the watcher. However, since Firehose is old, it would probably be more grief to share the symbol with the SDK.

Member Author

Is INTERNAL_REDUX_IMMUTABLE_TOJSON_CACHE_SYMBOL fine for you? 🤣 🤣

Member Author

I added this to the SDK because we already have an import from firehose.jsx to the SDK. I'm not sure what the "right" import direction is here...?!

Contributor

Having it in the SDK is correct. I dislike exporting such an internal thing, but I suppose it's fine since we own the console SDK and can manage it ¯\_(ツ)_/¯

@christianvogt (Contributor)

@jerolimov I agree that removing Immutable JS is something we should look at doing, but it's a much larger change. This is a good first step and proves there's a larger issue to address with Immutable JS and our Redux store.

},
})),
metadata: { resourceVersion: '123' },
};
Member

Looks like this file is a duplicate of frontend/packages/console-dynamic-plugin-sdk/src/utils/k8s/hooks/__tests__/useK8sWatchResource.data.tsx.

Member Author

Yeah I duplicated the test-data files so that they don't depend on each other.

Comment on lines +5 to +28
export const podData = {
apiVersion: 'v1',
kind: 'Pod',
metadata: {
name: 'my-pod',
namespace: 'default',
resourceVersion: '123',
},
};

export const podList = {
apiVersion: 'v1',
kind: 'PodList',
items: ['my-pod1', 'my-pod2', 'my-pod3'].map((name) => ({
apiVersion: 'v1',
kind: 'Pod',
metadata: {
name,
namespace: 'default',
resourceVersion: '123',
},
})),
metadata: { resourceVersion: '123' },
};
Member

We can reuse podList and podData from useK8sWatchResource.data.tsx. I think it is better to use podList and podData from here and delete useK8sWatchResource.data.tsx and useK8sWatchResources.data.tsx. WDYT?

Member Author

This is a different package. Yes, we already have dependencies from the public package (console/internal) to packages/console-dynamic-plugin-sdk, but I would like to keep a copy here instead of adding more dependencies between the two.

@invincibleJai (Member)

Thanks @jerolimov, I tried to verify it and the performance has improved. I verified the topology with the scenarios below:

  • with 84-120 deployments: no significant issues observed, interactions were smooth
  • with 150 deployments: okay, a bit slow at times, but interactions still worked and were not sluggish
  • with 200 deployments: the app still works but is a bit sluggish on load and during interactions

Verified on

  • Chrome Version 98.0.4758.80
  • macOS - 2.6 GHz 6-Core Intel Core i7 / 16 GB

@jerolimov jerolimov changed the title [WIP] (Topology) Performance improvement by reducing rerenderings and deep-copy toJSON() calls (Topology) Performance improvement by reducing rerenderings and deep-copy toJSON() calls Feb 11, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 11, 2022
@spadgett (Member) left a comment

/approve

I'm OK with the approach as a short-term fix we can backport. @jerolimov Let's open a follow-up issue to remove Immutable, which seems like a better long-term approach. We should try to give it priority since this change will increase the in-memory size of the k8s resource data.

@jerolimov jerolimov changed the title (Topology) Performance improvement by reducing rerenderings and deep-copy toJSON() calls Bug 2053685: (Topology) Performance improvement by reducing rerenderings and deep-copy toJSON() calls Feb 11, 2022
@openshift-ci openshift-ci bot added the bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. label Feb 11, 2022

openshift-ci bot commented Feb 11, 2022

@jerolimov: This pull request references Bugzilla bug 2053685, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.11.0) matches configured target release for branch (4.11.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @sanketpathak

In response to this:

Bug 2053685: (Topology) Performance improvement by reducing rerenderings and deep-copy toJSON() calls

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Feb 11, 2022

openshift-ci bot commented Feb 11, 2022

@jerolimov: This pull request references Bugzilla bug 2053685, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.11.0) matches configured target release for branch (4.11.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @sanketpathak

In response to this:

Bug 2053685: (Topology) Performance improvement by reducing rerenderings and deep-copy toJSON() calls

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

1 similar comment

@christianvogt (Contributor)

/approve

@vikram-raj (Member) left a comment

Verified it using cluster bot, in Chrome and Safari, with 100 Deployments and 80 Pods in Topology; it is faster than before.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 14, 2022

openshift-ci bot commented Feb 14, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: christianvogt, jerolimov, spadgett, vikram-raj

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot (Contributor)

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 902446d into openshift:master Feb 14, 2022

openshift-ci bot commented Feb 14, 2022

@jerolimov: All pull requests linked via external trackers have merged:

Bugzilla bug 2053685 has been moved to the MODIFIED state.

In response to this:

Bug 2053685: (Topology) Performance improvement by reducing rerenderings and deep-copy toJSON() calls

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


openshift-ci bot commented Feb 14, 2022

@jerolimov: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jerolimov (Member Author)

/cherry-pick release-4.10

@openshift-cherrypick-robot

@jerolimov: new pull request created: #11059

In response to this:

/cherry-pick release-4.10

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. component/core Related to console core functionality component/sdk Related to console-plugin-sdk lgtm Indicates that a PR is ready to be merged.