Fix missed deletion events when reconnecting to/disconnecting from remote clusters (ipcache entries) #25675
Merged
dylandreimerink
merged 2 commits into
cilium:main
from
giorio94:mio/clustermesh-reconnect-deletion-events-ipcache
Jun 8, 2023
giorio94
added
kind/bug
This is a bug in the Cilium logic.
release-note/bug
This PR fixes an issue in a previous release of Cilium.
area/clustermesh
Relates to multi-cluster routing functionality in Cilium.
sig/kvstore
Impacts the KVStore package interactions.
labels
May 25, 2023
giorio94
force-pushed
the
mio/clustermesh-reconnect-deletion-events-ipcache
branch
3 times, most recently
from
May 26, 2023 12:24
400a3d9
to
e5d986f
Compare
This commit extends the IPIdentityPair struct to implement the store.Key interface, so that it can be read and written through the store abstractions.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Currently, a new ipcache watcher is created for each connection to the kvstore backend of a remote cluster, and the old one is dropped. Yet, this approach misses the deletion events for entries that are removed in the remote cluster during the reconnection process (added entries are instead detected by the initial "list" operation).

This commit fixes this issue by moving to a single ipcache watcher per remote cluster, which is reused upon reconnection. Specifically, it now leverages the newly introduced RestartableWatchStore to automatically trigger a deletion event for all previously known entries that are no longer present after a reconnection. Special handling is performed when the ClusterID associated with the watcher changes: all previously known entries are drained as no longer valid (even if still present, they need to be observed again with the new ClusterID).

Additionally, all ipcache entries are drained when disconnecting from a remote cluster, to properly clean up the state without requiring an agent restart. Conversely, they are not drained when simply shutting down, to avoid breaking existing connections on restart.

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
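The behavior described in this commit message can be illustrated with a small, self-contained sketch. It is not the RestartableWatchStore implementation from Cilium: the names `restartableStore`, `Resync`, `Drain`, `Observer`, and `recorder` are hypothetical and chosen only for illustration. The sketch assumes that each reconnection starts with a full "list" of the remote entries, and shows how keeping the set of known keys across connections allows a deletion event to be emitted for every entry that disappeared in the meantime.

```go
package main

import "fmt"

// Observer receives synthesized events (hypothetical interface, for illustration only).
type Observer interface {
	OnUpdate(key, value string)
	OnDelete(key string)
}

// recorder is a toy Observer that records the events it receives.
type recorder struct {
	updated []string
	deleted []string
}

func (r *recorder) OnUpdate(key, value string) { r.updated = append(r.updated, key+"="+value) }
func (r *recorder) OnDelete(key string)        { r.deleted = append(r.deleted, key) }
func (r *recorder) reset()                     { r.updated, r.deleted = nil, nil }

// restartableStore remembers the entries seen so far, so that the same
// instance can be reused across reconnections to a remote kvstore.
type restartableStore struct {
	obs   Observer
	known map[string]string
}

func newRestartableStore(obs Observer) *restartableStore {
	return &restartableStore{obs: obs, known: map[string]string{}}
}

// Resync stands in for the initial "list" performed after a (re)connection.
// Entries that are new or changed trigger OnUpdate; entries that were known
// before but are absent from the snapshot trigger OnDelete.
func (s *restartableStore) Resync(snapshot map[string]string) {
	for k, v := range snapshot {
		if old, ok := s.known[k]; !ok || old != v {
			s.obs.OnUpdate(k, v)
		}
	}
	for k := range s.known {
		if _, ok := snapshot[k]; !ok {
			s.obs.OnDelete(k)
		}
	}
	// Replace the known set with a copy of the snapshot.
	s.known = make(map[string]string, len(snapshot))
	for k, v := range snapshot {
		s.known[k] = v
	}
}

// Drain emits OnDelete for every known entry, as one would do when
// disconnecting from the remote cluster or when its ClusterID changes.
func (s *restartableStore) Drain() {
	for k := range s.known {
		s.obs.OnDelete(k)
	}
	s.known = map[string]string{}
}

func main() {
	rec := &recorder{}
	store := newRestartableStore(rec)

	// First connection: the initial list reports two entries.
	store.Resync(map[string]string{"10.0.0.1/32": "100", "10.0.0.2/32": "200"})

	// The connection drops and 10.0.0.2/32 is removed remotely in the meantime.
	rec.reset()
	store.Resync(map[string]string{"10.0.0.1/32": "100"})
	fmt.Println("deleted after reconnect:", rec.deleted)

	// Disconnecting from the remote cluster drains whatever is left.
	rec.reset()
	store.Drain()
	fmt.Println("deleted after drain:", rec.deleted)
}
```

With a fresh store per connection (the previous behavior described above), the second list would only report `10.0.0.1/32` and nothing would signal that `10.0.0.2/32` is gone. Not calling `Drain` on a plain shutdown matches the stated intent of leaving existing state untouched across agent restarts.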
giorio94
force-pushed
the
mio/clustermesh-reconnect-deletion-events-ipcache
branch
from
June 1, 2023 10:40
e5d986f
to
47c93e5
Compare
/test
giorio94
added
the
release-blocker/1.14
This issue will prevent the release of the next version of Cilium.
label
Jun 1, 2023
/ci-l4lb failed due to a flake which seems related to #24728 (comment).
One of the Cilium Conformance E2E tests failed with #25823. Manually restarting.
jrajahalme
approved these changes
Jun 8, 2023
I'm quite rusty on this code, but LGTM
maintainer-s-little-helper
bot
added
the
ready-to-merge
This PR has passed all tests and received consensus from code owners to merge.
label
Jun 8, 2023
Follow-up to #25499, targeting the synchronization of ipcache entries. Please refer to that PR and to the commit descriptions for additional information.
Related: #24740
Related: #25499