Endpoint Slice Error "resourceVersion should not be set on objects to be created" subsys=ces-controller" #21005
Comments
Discussed over Slack. Please assign to @Weil0ng. |
Thanks for reporting the issue! @dlapcevic will work on a fix. Somehow I cannot assign this to him... |
@Weil0ng Any updates here? |
Hi @christarazi, this can hardly be categorized as a bug, because it has no negative effect besides the occasional error logs. I investigated the issue, and it turns out that when a new CES is being created while two CEPs are immediately assigned to it, both try to create the new CES, and we get the error. When I changed the reconciler so that reconcileCESCreate() empties the ResourceVersion field of the new CES before sending the request (as it always should be), creating the CES still fails. I also verified that there is no negative impact besides the error log. |
Thanks @dlapcevic for the investigation!
Curious, shouldn't the operator look up in the local cache instead before creating the CES? |
It happens in this order:
Basically the time between the CES creation call to the API server and the next event getting to the point when it checks the store is less than the time it takes for this new CES to appear in the store. Maybe I’m overlooking a simple way to fix it. |
I think having a |
This will be fixed by #23615 - "Reduce number of CES updates sent to API server in short time for the same CES". Otherwise, to fix the message, we would have to clear ResourceVersion in reconcileCESCreate() before sending the object to be created, because copying it is actually required for updates. I verified that if we don't copy over ResourceVersion, the updates fail. |
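As a rough illustration of the short fix mentioned above, the following Go sketch clears the version on a copy used for creation while keeping it for updates. Everything here is hypothetical: `slice` is a trimmed-down stand-in for a CES, `fakeAPI` is a simplified model of the API server's checks, and `forCreate` is an illustrative helper, not a function in Cilium. Only the create error text is taken from the log in this issue; the other error strings are placeholders.

```go
package main

import (
	"errors"
	"fmt"
)

// slice is a hypothetical, trimmed-down stand-in for a CiliumEndpointSlice.
type slice struct {
	Name            string
	ResourceVersion string
}

// fakeAPI is a simplified model: create rejects a set ResourceVersion,
// update requires the ResourceVersion to match the stored object.
type fakeAPI struct {
	stored map[string]slice
	next   int
}

func (f *fakeAPI) Create(s slice) (slice, error) {
	if s.ResourceVersion != "" {
		return slice{}, errors.New("resourceVersion should not be set on objects to be created")
	}
	if _, ok := f.stored[s.Name]; ok {
		return slice{}, errors.New("already exists") // placeholder text
	}
	f.next++
	s.ResourceVersion = fmt.Sprint(f.next)
	f.stored[s.Name] = s
	return s, nil
}

func (f *fakeAPI) Update(s slice) (slice, error) {
	cur, ok := f.stored[s.Name]
	if !ok {
		return slice{}, errors.New("not found") // placeholder text
	}
	if s.ResourceVersion == "" || s.ResourceVersion != cur.ResourceVersion {
		return slice{}, errors.New("resourceVersion missing or stale") // placeholder text
	}
	f.next++
	s.ResourceVersion = fmt.Sprint(f.next)
	f.stored[s.Name] = s
	return s, nil
}

// forCreate returns a copy with ResourceVersion cleared. The argument is
// passed by value, so the caller's object keeps its version for updates.
func forCreate(s slice) slice {
	s.ResourceVersion = ""
	return s
}

func main() {
	api := &fakeAPI{stored: map[string]slice{}}
	cached := slice{Name: "ces-example", ResourceVersion: "7"} // copy that carries a version

	_, err := api.Create(cached)
	fmt.Println("create with version:", err)

	created, err := api.Create(forCreate(cached))
	fmt.Println("create without version:", created.ResourceVersion, err)

	_, err = api.Update(slice{Name: "ces-example"})
	fmt.Println("update without version:", err)

	updated, err := api.Update(created)
	fmt.Println("update with version:", updated.ResourceVersion, err)
}
```

The sketch only removes the misleading error message on the create path; it does not address the underlying race between the two create attempts.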
Hmm, maybe I'm missing something but I don't think the client should specify what |
You are right. I was just describing the shortest fix in the current state, which is not the best one. A proper fix would require more refactoring, which we plan to do in the long term. This is because, before the fix, a race could happen between two closely spaced consecutive updates for a new CES: the CES copy is expected to be used only for CESs that have already been created and are present in the cache (store), which was sometimes not the case. |
Is there an existing issue for this?
What happened?
Steps
a) 900 Node AKS Cluster with BYO CNI
b) Cilium CNI version 1.12 installed on AKS
c) Networking Default mode VXLAN
Error:
PATCH:https://xxxxxx:443/api/v1/nodes/xxxxx/status" subsys=klog
level=info msg="Starting CNP derivative handler" subsys=cilium-operator-generic
level=info msg="Starting CCNP derivative handler" subsys=cilium-operator-generic
level=info msg="Initialization complete" subsys=cilium-operator-generic
level=info msg="Unable to create CiliumEndpointSlice in k8s-apiserver" ciliumEndpointSliceName=ces-zjlsmyyzb-q8766 error="resourceVersion should not be set on objects to be created" subsys=ces-controller
Cilium Version
1.12
Kernel Version
5.4.0-1086-azure #91~18.04.1-Ubuntu
Kubernetes Version
1.23.8
Sysdump
No response
Relevant log output
Anything else?
No response