Expected Behavior
Now that the CRDs are no longer included in the yaml, they are managed by the operator as expected, but we would expect fewer requests to be made to the API server.
Current Behavior
We upgraded Calico to a version supported on Kubernetes 1.32, moving from shipping the CRDs in the yaml file to having the operator manage them. This worked fine until we noticed API latency alerts firing in certain clusters. Some individual API requests take up to 5 seconds, which on its own isn't too bad, but the operator keeps making them so frequently that the total latency is very high, periodically triggering our alerts.
Possible Solution
Could the frequency of the CRD update requests be reduced so they don't add so much overall latency on the API server? I didn't see any flags for tuning this in the code currently, and I'm happy to make a PR. A less aggressive default, plus a flag to override it, might be useful; a rough sketch of the idea follows.
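As a minimal sketch of the shape this could take, assuming client-go's workqueue backoff is a reasonable mechanism here (the flag name --crd-update-max-backoff and all of the wiring below are hypothetical, not the operator's actual code):

```go
package main

import (
	"flag"
	"fmt"
	"time"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	// Hypothetical flag: caps the backoff between repeated update
	// attempts for the same CRD.
	maxBackoff := flag.Duration("crd-update-max-backoff", 5*time.Minute,
		"maximum backoff between CRD update retries (hypothetical)")
	flag.Parse()

	// Per-item exponential backoff: 1s, 2s, 4s, ... capped at the flag
	// value, so a CRD that keeps being rewritten backs off instead of
	// being retried at a fixed short interval.
	limiter := workqueue.NewItemExponentialFailureRateLimiter(time.Second, *maxBackoff)

	for i := 0; i < 5; i++ {
		fmt.Printf("attempt %d delayed by %v\n", i+1, limiter.When("crd.projectcalico.org"))
	}
}
```

Running this prints the growing delays (1s, 2s, 4s, ...) that would replace a fixed short retry interval.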
Steps to Reproduce (for bugs)
If you are using EKS, you can see the update volume in CloudWatch Logs Insights (querying the cluster's audit logs) with:
```
| sort @timestamp desc
| limit 10000
| filter objectRef.resource == "customresourcedefinitions"
| filter verb == "update"
| stats count(*) as c by responseObject.spec.group, user.username
| sort c desc
```
You will see a high update count for crd.projectcalico.org when --manage-crds=true is set.
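For clusters without CloudWatch access, one way to confirm the churn directly is a small client-go watcher that logs every event on the Calico CRDs; this is purely a standalone diagnostic sketch, not operator code:

```go
package main

import (
	"context"
	"fmt"
	"path/filepath"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	apiextensionsclient "k8s.io/apiextensions-apiserver/pkg/client/clientset/clientset"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
)

func main() {
	// Assumes the default kubeconfig location.
	kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
	cfg, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		panic(err)
	}
	client := apiextensionsclient.NewForConfigOrDie(cfg)

	// Watch all CRDs; frequent MODIFIED events for the Calico group
	// should line up with the update counts in the audit logs.
	w, err := client.ApiextensionsV1().CustomResourceDefinitions().Watch(
		context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for ev := range w.ResultChan() {
		crd, ok := ev.Object.(*apiextensionsv1.CustomResourceDefinition)
		if !ok || crd.Spec.Group != "crd.projectcalico.org" {
			continue
		}
		fmt.Printf("%s %s resourceVersion=%s\n", ev.Type, crd.Name, crd.ResourceVersion)
	}
}
```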
Context
To avoid the added latency, we are setting manage-crds to false, which leaves us with a more involved installation process so that the CRDs still get updated.
Your Environment
- Calico version: quay.io/tigera/operator:v1.38.0
- Calico dataplane (iptables, windows etc.):
- Orchestrator version (e.g. kubernetes, mesos, rkt): EKS, Server Version: v1.32.3-eks-4096722
- Operating System and version:
- Link to your project (optional):