CN performance stabilization tweaks #4145

SidSethi · 2022-10-17T21:53:06Z

Description

Reduce maxWaitingJobs for recurringSyncQueue and updateReplicaSetQueue by 10x
Reduce queue history (number of completed/failed jobs retained) for multiple queues from 100k to 1k
Set the default number of workers to half the number of logical cores
Add the ability to pause recurringSyncQueue and updateReplicaSetQueue via environment variables
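
The queue changes listed above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `QueueTuning` shape, `parseBooleanEnv`, `buildQueueTuning`, and the idea of passing the pause flag as a string are all hypothetical. It also assumes the history cap is expressed through Bull's `removeOnComplete` / `removeOnFail` job options, which the description does not confirm.

```typescript
// Hypothetical shape for the per-queue settings touched by this PR.
interface QueueTuning {
  maxWaitingJobs: number
  removeOnComplete: number // completed jobs kept in history (assumed Bull job option)
  removeOnFail: number // failed jobs kept in history (assumed Bull job option)
  paused: boolean
}

// Treat "true" or "1" (case-insensitive) as enabled; anything else, including unset, as disabled.
function parseBooleanEnv(value: string | undefined): boolean {
  if (value === undefined) return false
  const normalized = value.trim().toLowerCase()
  return normalized === 'true' || normalized === '1'
}

// Derive the new settings from the previous waiting-job cap and a raw env var value.
function buildQueueTuning(
  previousMaxWaitingJobs: number,
  pauseFlag: string | undefined
): QueueTuning {
  return {
    maxWaitingJobs: Math.floor(previousMaxWaitingJobs / 10), // "by 10x"
    removeOnComplete: 1000, // down from 100k
    removeOnFail: 1000, // down from 100k
    paused: parseBooleanEnv(pauseFlag)
  }
}
```

With an arbitrary previous cap of 10000 and the flag set to `'true'`, `buildQueueTuning` yields a cap of 1000 with `paused` enabled; a caller would then decide whether to call the queue's `pause()` before adding jobs.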

Tests

Automated tests are sufficient.
Also tested manually on a staging node.

Monitoring - How will this change be monitored? Are there sufficient logs / alerts?

All of the above are easy to monitor via the Bull dashboard, the Grafana Bull dashboard, and the Grafana profiling dashboard.

creator-node/src/services/stateMachineManager/stateReconciliation/index.js

dmanjunath · 2022-10-17T21:55:50Z

creator-node/src/utils/clusterUtils.ts

@@ -48,7 +48,7 @@ class ClusterUtils {

    // This is called `cpus()` but it actually returns the # of logical cores, which is possibly higher than # of physical cores if there's hyperthreading
    const logicalCores = cpus().length
-    return config.get('expressAppConcurrency') || logicalCores
+    return (config.get('expressAppConcurrency') || logicalCores) / 2


i was thinking this but w/e

Suggested change

-    return (config.get('expressAppConcurrency') || logicalCores) / 2
+    return config.get('expressAppConcurrency') || (logicalCores / 2)

oh yeah, good point
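
To make the difference between the two expressions concrete, here is a small sketch. The function names are hypothetical, and it assumes `config.get('expressAppConcurrency')` returns a falsy value such as 0 when unset. The first form halves an explicitly configured value as well, while the suggested form halves only the core-count fallback.

```typescript
// Original form from the diff: the configured value is halved too.
function halveEither(configured: number, logicalCores: number): number {
  return (configured || logicalCores) / 2
}

// Suggested form: an explicit setting is used as-is; only the fallback is halved.
function halveFallbackOnly(configured: number, logicalCores: number): number {
  return configured || logicalCores / 2
}

// With expressAppConcurrency = 4 on an 8-core machine:
const halved = halveEither(4, 8) // 2: the operator's explicit setting is cut in half
const respected = halveFallbackOnly(4, 8) // 4: the explicit setting is respected

// With the setting unset (assumed falsy 0), both forms fall back to half the cores.
const fallbackA = halveEither(0, 8) // 4
const fallbackB = halveFallbackOnly(0, 8) // 4
```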

theoilie · 2022-10-17T21:56:02Z

creator-node/src/utils/clusterUtils.ts

@@ -48,7 +48,7 @@ class ClusterUtils {

    // This is called `cpus()` but it actually returns the # of logical cores, which is possibly higher than # of physical cores if there's hyperthreading
    const logicalCores = cpus().length
-    return config.get('expressAppConcurrency') || logicalCores
+    return (config.get('expressAppConcurrency') || logicalCores) / 2


Wondering if this should use Math.ceil for machines with 1 core, so the result isn't 0.5, which truncates to 0? Approving either way since none of our local, staging, or prod nodes need to worry about that scenario.

good call, added
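
A minimal sketch of the rounding concern raised above. The exact form that was merged is not shown in this thread, so `defaultNumWorkers` is a hypothetical helper that combines the `Math.ceil` suggestion with the fallback-only halving from the other review comment.

```typescript
// Hypothetical helper: use the configured value if set (assumed falsy when unset),
// otherwise half the logical cores, rounded up so a 1-core machine still gets 1 worker.
function defaultNumWorkers(configured: number, logicalCores: number): number {
  return configured || Math.ceil(logicalCores / 2)
}

const singleCore = defaultNumWorkers(0, 1) // 1 instead of 0.5
const oddCores = defaultNumWorkers(0, 3) // 2 instead of 1.5
const eightCores = defaultNumWorkers(0, 8) // 4
const explicit = defaultNumWorkers(6, 8) // 6: the explicit setting wins
```

Rounding up keeps the result a positive integer for any machine with at least one core, which matters if the value is later used as a count of processes to fork.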

SidSethi added 2 commits October 17, 2022 21:48

CN SM tweaks

a4b8626

fix

1eab195

SidSethi requested review from dmanjunath and theoilie October 17, 2022 21:53

pull-request-size bot added the size/S label Oct 17, 2022

dmanjunath approved these changes Oct 17, 2022

theoilie approved these changes Oct 17, 2022

tweak numworkers

596187f

SidSethi requested a review from dmanjunath October 17, 2022 21:59

SidSethi changed the title ~~CN SM tweaks~~ CN performance stabilization tweaks Oct 17, 2022

dmanjunath approved these changes Oct 17, 2022

SidSethi merged commit a641a26 into master Oct 17, 2022

SidSethi deleted the ss-slow-down-cn-statemachine branch October 17, 2022 22:26

SidSethi added a commit that referenced this pull request Oct 17, 2022

CN performance stabilization tweaks (#4145)

c5c45fb

SidSethi added a commit that referenced this pull request Oct 17, 2022

CN performance stabilization tweaks (#4145)

7993fa2

SidSethi added a commit that referenced this pull request Oct 17, 2022

CN performance stabilization tweaks (#4145)

203f52f

dmanjunath pushed a commit that referenced this pull request Oct 31, 2022

CN performance stabilization tweaks (#4145)

c7190af
