ScaledObject reconciliation delay due to redundant reconciles triggered by HPA status updates #5281
Comments
@deefreak thanks for reporting. We should add a predicate there. I wonder, should we really trigger the reconcile loop on HPA annotation changes?
I can contribute this 😄
At the moment we only replace Spec and Labels, if I am not mistaken: Lines 154 to 183 in a2c21ec
Can't recall if this is by design 🤷‍♂️ 😄
@zroubalik @JorTurFer, I am willing to contribute and have the PR ready. I will link the issue to the PR.
Report
In our organisation, we have large clusters with 350+ ScaledObjects whose workload autoscaling is managed by KEDA.
Lately, we have observed that a change to a ScaledObject takes around 10 minutes to reflect in its child HPA, which hampers autoscaler responsiveness.
For example, if we update scaledobject.spec.maxReplicaCount, it takes around 10 minutes to reflect in hpa.spec.maxReplicas.
After debugging and analyzing the KEDA operator pod logs and scaledobject_controller.go, we identified that ScaledObjects are continuously reconciled whenever the child HPA is updated. This includes status updates made to the HPA by other controllers, such as the HPA controller that is part of the kube-controller-manager, which keeps updating the HPA status with information such as conditions and its perceived resource metrics
(.status.currentMetrics). The volume of such updates is very high, causing unnecessary and redundant reconciles of the ScaledObject and further delaying any genuine updates.
Expected Behavior
Only changes to the HPA spec or labels/annotations should trigger ScaledObject reconciliation; status updates should be ignored.
Actual Behavior
In this controller initialisation part, any update to the HPA triggers a ScaledObject reconcile, which can be redundant and unnecessary.
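The filtering the expected behavior asks for can be sketched as a predicate-style decision function. The sketch below is not KEDA's code; `objectMeta` and `shouldReconcile` are hypothetical names chosen for illustration. It relies on the fact that status-only writes do not bump `metadata.generation`, which is the same idea behind controller-runtime's `GenerationChangedPredicate`:

```go
package main

import (
	"fmt"
	"reflect"
)

// objectMeta is a minimal stand-in (hypothetical, for illustration) for the
// Kubernetes metadata fields relevant to the filtering decision.
type objectMeta struct {
	Generation  int64 // bumped by the API server only on spec changes
	Labels      map[string]string
	Annotations map[string]string
}

// shouldReconcile mirrors the idea of an update predicate: an HPA update is
// only interesting if its generation (i.e. spec), labels, or annotations
// changed; status-only writes leave all three untouched and are skipped.
func shouldReconcile(oldMeta, newMeta objectMeta) bool {
	if oldMeta.Generation != newMeta.Generation {
		return true // spec changed
	}
	if !reflect.DeepEqual(oldMeta.Labels, newMeta.Labels) {
		return true
	}
	if !reflect.DeepEqual(oldMeta.Annotations, newMeta.Annotations) {
		return true
	}
	return false // status-only update: do not enqueue a reconcile
}

func main() {
	old := objectMeta{Generation: 3, Labels: map[string]string{"app": "web"}}
	statusOnly := old // status writes change neither generation nor labels
	specChange := objectMeta{Generation: 4, Labels: map[string]string{"app": "web"}}
	fmt.Println(shouldReconcile(old, statusOnly)) // false
	fmt.Println(shouldReconcile(old, specChange)) // true
}
```

In a real controller this check would be wired in as an update event filter when the controller watches owned HPAs, so that `.status.currentMetrics` churn from the kube-controller-manager never reaches the work queue.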
Steps to Reproduce the Problem
Have a cluster with 100+ ScaledObjects managed by KEDA. The time taken for a ScaledObject spec update to reach its HPA is on the order of minutes.
Logs from KEDA operator
We are seeing many logs like the following, triggered by repeated HPA status updates (.status.currentMetrics).
2023-12-08T13:44:04.647+0530 INFO controllers.ScaledObject Reconciling ScaledObject {"ScaledObject.Namespace": "xx", "ScaledObject.Name": "yy"}
2023-12-08T13:51:08.931+0530 INFO controllers.ScaledObject Reconciling ScaledObject {"ScaledObject.Namespace": "xx", "ScaledObject.Name": "yy"}
KEDA Version
2.12.0
Kubernetes Version
1.26
Platform
Other
Scaler Details
CPU
Anything else?
This issue is reproducible in both older and the latest KEDA versions. Also, tweaking KEDA_SCALEDOBJECT_CTRL_MAX_RECONCILES will not completely resolve it, since the redundant updates will still be processed.
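The point about KEDA_SCALEDOBJECT_CTRL_MAX_RECONCILES can be illustrated with a toy work-queue simulation. This is not KEDA's queue implementation; `processEvents` is a hypothetical name, and the sketch simply shows that adding workers changes how fast redundant events are drained, not how many reconciles are performed:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// processEvents simulates draining a reconcile queue with a configurable
// number of workers (analogous to raising KEDA_SCALEDOBJECT_CTRL_MAX_RECONCILES).
// It returns the total number of reconciles executed.
func processEvents(events, workers int) int64 {
	queue := make(chan int, events)
	for i := 0; i < events; i++ {
		queue <- i // every redundant HPA status update becomes a queue item
	}
	close(queue)

	var reconciles int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range queue {
				atomic.AddInt64(&reconciles, 1) // redundant work still happens
			}
		}()
	}
	wg.Wait()
	return reconciles
}

func main() {
	fmt.Println(processEvents(1000, 1))  // 1000 reconciles
	fmt.Println(processEvents(1000, 10)) // still 1000: more parallel, same total work
}
```

In other words, concurrency is a throughput knob; only filtering the events at the source (e.g. an update predicate that ignores status-only changes) removes the redundant work.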