[Port v2.9] cluster deploy running excessively and taking a long time to execute #44924
Labels
kind/bug
Milestone
This is a port of #39597
Rancher Server Setup
Information about the Cluster
User Information
Describe the bug
With a disconnected downstream cluster, the clusterdeploy handler takes several seconds to time out. This is fine with only a few downstream clusters disconnected, but when the number of disconnected clusters goes over 50 or so, the worker queues get saturated.
The reason for the slow timeout is that the handler may execute code that reaches out to the disconnected downstream cluster, typically clusterDeploy.getNodeAgentImage, which tries to list DaemonSets. That alone can take up to ~30 seconds due to how TunnelServer's remotedialer mechanism works.
To Reproduce
See #39597 (comment)
Result
Expected Result
Little to no executions of the cluster deploy handler while creating users. The cluster deploy handler should probably not exceed an execution time of 1 second.
Screenshots
N/A
Additional context
N/A