ipam/crd: Fix spurious CiliumNode update status failures
When running in CRD-based IPAM modes (Alibaba, Azure, ENI, CRD), it is possible to observe spurious "Unable to update CiliumNode custom resource" failures in the cilium-agent.

The full error message is as follows: "Operation cannot be fulfilled on ciliumnodes.cilium.io <node>: the object has been modified; please apply your changes to the latest version and try again".

It means that the Kubernetes `UpdateStatus` call has failed because the local `ObjectMeta.ResourceVersion` of the submitted `CiliumNode` object is out of date. In the presence of races, this error is expected and will resolve itself once the agent receives a more recent version of the object with the new resource version.

However, it is possible that the resource version of a `CiliumNode` object is bumped even though the `Spec` and `Status` of the `CiliumNode` remain the same. This happens, for example, when `ObjectMeta.ManagedFields` is updated by the Kubernetes apiserver.

Unfortunately, `CiliumNode.DeepEqual` does _not_ consider any `ObjectMeta` fields (including the resource version). Therefore, two objects that differ only in their resource version are considered the same by the `CiliumNode` watcher used by IPAM, and the newer object is not stored.

But to be able to successfully call `UpdateStatus`, we need to know the most recent resource version. Otherwise, `UpdateStatus` will always fail until the `Spec` or `Status` of the `CiliumNode` object is changed externally for some reason.

Therefore, this commit modifies the logic to always store the most recent version of the `CiliumNode` object, even if `Spec` or `Status` has not changed. This in turn allows `nodeStore.refreshNode` (which invokes `UpdateStatus`) to always work on the most recently observed resource version.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>