Get CEP from k8s cache during initialization. #24340
Conversation
/test
Getting the CEP from the k8s cache is much cheaper than reading it from etcd. Now, during initialization, we try to get it from the cache first. If that fails, the next read goes to etcd. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com>
Force-pushed 265a9f6 to b91758c
if firstTry {
    // First we try getting CEP from the API server cache, as it's cheaper.
    // If it fails we get it from etcd to be sure to have fresh data.
    localCEP, err = ciliumClient.CiliumEndpoints(namespace).Get(ctx, podName, meta_v1.GetOptions{ResourceVersion: "0"})
Why does this make the Get come from the cache?
Because of `ResourceVersion: "0"`:
https://kubernetes.io/docs/reference/using-api/api-concepts/#semantics-for-get-and-list
This is also what the informer does by default in its ListAndWatch code. It is a quite standard pattern, take a look here: https://github.com/kubernetes/kubernetes/blob/4557c694ef42da6c63f42ac4c81bea5716b471b7/pkg/controller/controller_utils.go#L1036
Normally, without a resourceVersion specified, the request goes directly to etcd and through raft consensus. With resourceVersion=0 it is answered from the Kubernetes apiserver's watch cache instead. This smallish change can make a significant difference when cilium-agent is restarting and all agents are trying to get their CEP from k8s at the same time, spamming the apiserver with requests.
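To make the control flow concrete, here is a minimal, self-contained sketch of the cache-first/etcd-fallback pattern the diff implements. The types and names (`GetOptions`, `fakeClient`, `getCEP`) are hypothetical stand-ins for the client-go/Cilium equivalents so the snippet runs on its own; only the `ResourceVersion: "0"` semantics mirror the real API.

```go
package main

import (
	"errors"
	"fmt"
)

// GetOptions is a stand-in for meta_v1.GetOptions (hypothetical,
// so the sketch is self-contained).
type GetOptions struct{ ResourceVersion string }

// fakeClient simulates an apiserver whose watch cache may be stale.
type fakeClient struct{ cacheStale bool }

func (c *fakeClient) Get(name string, opts GetOptions) (string, error) {
	if opts.ResourceVersion == "0" {
		// Served from the apiserver watch cache: cheap, possibly stale.
		if c.cacheStale {
			return "", errors.New("not found in cache")
		}
		return "cep-from-cache", nil
	}
	// No resourceVersion: quorum read through etcd, always fresh.
	return "cep-from-etcd", nil
}

// getCEP tries the cache first, then falls back to a quorum read,
// mirroring the firstTry logic in the PR.
func getCEP(c *fakeClient) string {
	for firstTry := true; ; firstTry = false {
		opts := GetOptions{}
		if firstTry {
			opts.ResourceVersion = "0" // answered by the apiserver cache
		}
		cep, err := c.Get("pod-1", opts)
		if err == nil {
			return cep
		}
		if !firstTry {
			return "" // give up: the etcd read also failed
		}
	}
}

func main() {
	fmt.Println(getCEP(&fakeClient{cacheStale: false})) // cache hit
	fmt.Println(getCEP(&fakeClient{cacheStale: true}))  // falls back to etcd
}
```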
Makes sense to me.
I am wondering whether there is any specified consistency behaviour for separate resources of the same name (i.e. different UIDs). The endpoint sync depends on CiliumEndpoint.UID to prevent certain types of race conditions related to ambiguous "ownership" of the agent/controller-instance -> CEP mapping. There might be some weird cases if it's possible for the server to return resources that no longer actually exist.
I couldn't find any specification on that in the docs.
For example:
- The agent restores an endpoint without a CEP UID reference. The CEP resource no longer exists, but is still returned by the cache, and it is a "local" CEP (i.e. referencing an endpoint on the same node).
- The sync code may try to backfill the CEP UID ref of the restored endpoint. In this case, the endpoint will pick up the old CEP's UID, which will be written out to disk in the endpoint restore header.
- Patch updates to the CEP status are checked to ensure they have the same UID. In this case, the error will be k8serrors.NotFound (I think??).
- I think the EPS controller might just loop, with the wrong CEP UID, forever?

edit: I think the same thing might happen if you do a restore from an old endpoint header (without a CEP UID) and the returned CEP is both local and an old version; then it might also get stuck failing to patch its CEP.
This bit of code is particularly annoying; it is only necessary for upgrade paths.
Currently the only version of Cilium guaranteed to write the CEP UID into the endpoint restore header state is v1.13. So I think in v1.14 we can remove this code, which will also remove these annoying edge cases.
I think if we just remove that code going forward we should be fine; that should clear up the issues I mentioned.
I think every other scenario just ends with the controller retrying until it is eventually consistent.
(I'll have to double check our upgrade path practices.)
If the patch fails, wouldn't we just zero out localCEP and retry fetching it again?
localCEP, err = ciliumClient.CiliumEndpoints(namespace).Patch(
Yeah right, good point. I forgot this falls back through to the needsInit block in that case.
/test