daemon: Init endpoint queue during validation #13608
test-me-please
LGTM!
These error messages also happen in v1.8 CI.
This is useful during endpoint validation when endpoints are being restored. When they are being restored, their event queue is not yet initialized because they haven't been exposed to the endpoint manager. It is important to initialize an endpoint's event queue so that events are not missed during their restoration.

Signed-off-by: Chris Tarazi <chris@isovalent.com>
This commit fixes the following errors:

```
level=error msg="Unable to enqueue endpoint policy visibility event" containerID=9f680a5847 datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=3479 error="unable to Enqueue event" identity=8771 ipv4=10.116.2.10 ipv6= k8sPodName=cilium-monitoring/grafana-6d49bd9ff7-s8zsd subsys=endpoint
```

These errors occurred because, during endpoint validation (when the endpoint is being restored), its event queue had not been initialized yet. Once the endpoint is eventually exposed to the endpoint manager (after restoration), it will begin processing the events off the queue.

Signed-off-by: Chris Tarazi <chris@isovalent.com>
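To illustrate the ordering problem described in the commit messages above, here is a minimal, self-contained sketch. It is not Cilium's code: the `endpoint` and `eventQueue` types and the `initEventQueue`, `enqueue`, and `expose` methods are hypothetical stand-ins, and the real queue is processed asynchronously rather than drained in one call. The sketch only shows why splitting "initialize the queue" from "start processing" lets events enqueued during validation survive until the endpoint is exposed.

```go
package main

import (
	"errors"
	"fmt"
)

// eventQueue is a hypothetical stand-in for an endpoint's event queue.
type eventQueue struct {
	events chan string
}

// endpoint is a simplified, hypothetical endpoint; it is not Cilium's type.
type endpoint struct {
	queue *eventQueue
}

var errNoQueue = errors.New("unable to Enqueue event")

// initEventQueue allocates the queue but does not start processing it.
func (e *endpoint) initEventQueue(size int) {
	e.queue = &eventQueue{events: make(chan string, size)}
}

// enqueue fails if the queue has not been initialized or is full.
func (e *endpoint) enqueue(ev string) error {
	if e.queue == nil {
		return errNoQueue
	}
	select {
	case e.queue.events <- ev:
		return nil
	default:
		return errors.New("event queue is full")
	}
}

// expose models the point at which processing begins: it drains the
// buffered events in order. It must be called after initEventQueue.
func (e *endpoint) expose() []string {
	close(e.queue.events)
	var processed []string
	for ev := range e.queue.events {
		processed = append(processed, ev)
	}
	return processed
}

func main() {
	ep := &endpoint{}

	// Before the fix: validation enqueues an event on an uninitialized queue.
	fmt.Println(ep.enqueue("policy-visibility"))

	// After the fix: the queue is initialized during validation, so the
	// event is buffered until the endpoint is exposed.
	ep.initEventQueue(8)
	fmt.Println(ep.enqueue("policy-visibility"))
	fmt.Println(ep.expose())
}
```

In this sketch the first `enqueue` call fails with the same error text seen in the log, while the second succeeds and the event is delivered once `expose` runs.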
CI passed and code owners approved. Pushing to resolve the typo referenced above:
commit 6435f18a44a6354e1fec57dccf44a5333def349e
Author: Chris Tarazi <chris@isovalent.com>
Date: Sun Oct 18 14:32:14 2020 -0700
fixup! endpoint: Add function to initialize event queue
Signed-off-by: Chris Tarazi <chris@isovalent.com>
diff --git a/pkg/endpoint/manager.go b/pkg/endpoint/manager.go
index c241d4300..8472f0459 100644
--- a/pkg/endpoint/manager.go
+++ b/pkg/endpoint/manager.go
@@ -172,9 +172,9 @@ func (e *Endpoint) Unexpose(mgr endpointManager) <-chan struct{} {
// InitEventQueue initializes the endpoint's event queue. Note that this
// function does not begin processing events off the queue, as that's left up
-// to the caller when should call Expose in order to allow other subsystems to
-// access the endpoint. This function assumes that the endpoint ID has already
-// been allocated!
+// to the caller to call Expose in order to allow other subsystems to access
+// the endpoint. This function assumes that the endpoint ID has already been
+// allocated!
//
// Having this be a separate function allows us to prepare
// the event queue while the endpoint is being validated (during restoration)
4afdfa4 to 5cbdc91
@christarazi Backporting this PR to v1.8 leads to a deadlock situation when some endpoints are being deleted on restore (thanks @aanm for digging and finding out!):
A 1.8-specific version of this PR seems to be required.
Removing |
See commit msgs.
Fixes: #13398