clustermesh: correctly remove remoteCache on connection disruption #23532
Conversation
force-pushed from 2d12982 to 85a48fe
force-pushed from 85a48fe to 1febbc6
/ci-multicluster
force-pushed from 1febbc6 to e423e25
/ci-multicluster
Something seems wrong; this failure doesn't seem like it should be happening. And this test has been pretty stable.
force-pushed from e423e25 to 98a72b1
Thanks for submitting the fix and thanks to @oblazek for finding it!
/ci-multicluster |
Good find! The refactoring and typo-fix changes could've been separated into their own commit. LGTM.
Some of the functions need an updated description.
pkg/identity/cache/allocator.go

@@ -440,18 +440,24 @@ func (m *CachingIdentityAllocator) ReleaseSlice(ctx context.Context, owner Ident
 // WatchRemoteIdentities starts watching for identities in another kvstore and
 // syncs all identities to the local identity cache.
-func (m *CachingIdentityAllocator) WatchRemoteIdentities(backend kvstore.BackendOperations) (*allocator.RemoteCache, error) {
+func (m *CachingIdentityAllocator) WatchRemoteIdentities(name string, backend kvstore.BackendOperations) (*allocator.RemoteCache, error) {
Nit: Please update the function description.
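For illustration only, a minimal Go sketch of why the watcher now takes a cluster name (the types and method names below are hypothetical stand-ins, not cilium's actual implementation): keying remote caches by name gives cleanup an exact handle on the entry to drop, and makes a reconnect replace a cache rather than accumulate a new one.

```go
package main

import (
	"fmt"
	"sync"
)

// remoteCache is a hypothetical stand-in for allocator.RemoteCache.
type remoteCache struct{ cluster string }

// identityAllocator tracks one remote cache per remote cluster, keyed by
// cluster name, mirroring the new WatchRemoteIdentities(name, backend) shape.
type identityAllocator struct {
	mu     sync.Mutex
	remote map[string]*remoteCache
}

// watchRemoteIdentities registers (or replaces) the cache for a named cluster.
func (a *identityAllocator) watchRemoteIdentities(name string) *remoteCache {
	a.mu.Lock()
	defer a.mu.Unlock()
	rc := &remoteCache{cluster: name}
	a.remote[name] = rc // same key: the old entry is replaced, not leaked
	return rc
}

// removeRemoteIdentities drops the cache for a named cluster on disconnect.
func (a *identityAllocator) removeRemoteIdentities(name string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.remote, name)
}

func main() {
	a := &identityAllocator{remote: map[string]*remoteCache{}}
	a.watchRemoteIdentities("cluster-2")
	a.watchRemoteIdentities("cluster-2") // reconnect: replaces, count stays 1
	fmt.Println(len(a.remote))           // 1
	a.removeRemoteIdentities("cluster-2")
	fmt.Println(len(a.remote)) // 0
}
```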
@@ -269,6 +269,7 @@ func (rc *remoteCluster) restartRemoteConnection(allocator RemoteIdentityWatcher
 	rc.releaseOldConnection()
 	rc.mesh.metricTotalNodes.WithLabelValues(rc.mesh.conf.Name, rc.mesh.conf.NodeName, rc.name).Set(float64(rc.remoteNodes.NumEntries()))
 	rc.mesh.metricReadinessStatus.WithLabelValues(rc.mesh.conf.Name, rc.mesh.conf.NodeName, rc.name).Set(metrics.BoolToFloat64(rc.isReadyLocked()))
+	allocator.RemoveRemoteIdentities(rc.name)
Can we also add this somewhere here? Otherwise it does not really fix the memleak issue; the controller itself is not stopped, so this cleanup is never run in my case.
@oblazek There should be only one controller per remote cluster, so as soon as a new connection is made, it will replace the old cache in the allocator. I decided to preserve the old one, so that connections aren't disrupted while etcd blips. Otherwise all identities are deleted and re-created, meaning connectivity is lost.
ah yes, got it
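A minimal sketch of the swap-on-reconnect behavior described above, assuming hypothetical types (`cache` and `swapWhenReady` are illustrative, not cilium code): the old cache keeps serving until the replacement is fully synced and only then is closed, so an etcd blip never deletes and re-creates every identity.

```go
package main

import "fmt"

// cache is a hypothetical stand-in for a remote identity cache.
type cache struct {
	generation int
	closed     bool
}

func (c *cache) close() { c.closed = true }

// swapWhenReady models the design described above: the old cache keeps
// serving until the replacement is fully populated, and only then is it
// closed, so readers never observe a window with no identities.
func swapWhenReady(current *cache, sync func() *cache) *cache {
	next := sync()  // blocks until the new connection's cache is synced
	current.close() // safe now: readers have a complete replacement
	return next
}

func main() {
	old := &cache{generation: 1}
	cur := swapWhenReady(old, func() *cache { return &cache{generation: 2} })
	fmt.Println(cur.generation, old.closed) // 2 true
}
```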
@oblazek found another memory leak :-). Fix coming shortly.
98a72b1
to
ef5e69b
Compare
OK, I've banged on this pretty hard and couldn't get the bug to appear. This seems good to go.
/test
ci-multicluster has failed. Investigating.
Found the issue; we were starting the connectivity test too soon after restarting the agents. I have a test run in #23716, which I'll include here when finished. I believe this change is safe.
Sorry for the late response. Changes look good to me (including the test changes), thanks!
Upon reconnect, we failed to remove the old remoteCache (we were looking at the wrong Allocator on cleanup), meaning that every time we reconnected, all old remoteCaches were kept around. This was, at best, a memory leak, and at worst meant that we continued to read stale data even after reconnecting, depending on the ordering of a map iteration.

Fixes: cilium#22988
Fixes: cilium#13446

Signed-off-by: Casey Callendrello <cdc@isovalent.com>

Before this change, RemoteCache would create and start a second cache on an existing Allocator, which is a waste of resources. So, just pass through to the Allocator's existing cache directly. There's no need to duplicate etcd watches or the in-memory cache.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>

When two clusters are connected, it causes a full agent rollout. So, wait for that to finish with "cilium status" before proceeding with connectivity tests.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
force-pushed from ef5e69b to 42c6584
Passing run for proposed CI changes here: https://github.com/cilium/cilium/actions/runs/4164795509/jobs/7206931563
/ci-multicluster |
CI is all green, including multicluster (the flake fixed by this CI change didn't hit us this time).
cilium status --wait --context ${{ steps.contexts.outputs.context1 }}
cilium status --wait --context ${{ steps.contexts.outputs.context2 }}
While this is fine for now, I suppose a more proper fix would be to have the `clustermesh status --wait` command automatically also imply a `status --wait` directly in CLI code. cc @tklauser for thoughts.
Well, ultimately, they do different things. Ideally there would be a `--wait` in `clustermesh join`.
/test
/ci-verifier
Upon reconnect, we failed to remove the old remoteCache (we were looking at the wrong Allocator on cleanup), meaning that every time we reconnected, all old remoteCaches were kept around.
This was, at best, a memory leak, and at worst meant that we continued to read stale data even after reconnecting, depending on the ordering of a map iteration.
Many thanks to @oblazek, who found the root cause.
Fixes: #22988
Fixes: #13446
Signed-off-by: Casey Callendrello <cdc@isovalent.com>
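To illustrate the "depending on the ordering of a map iteration" point above, a small Go sketch (the `lookup` helper and its data are hypothetical, not cilium code): with a leaked stale cache still registered, a scan over all caches may return either the stale or the fresh entry, since Go deliberately randomizes map iteration order.

```go
package main

import "fmt"

// lookup scans every registered remote cache for an answer, mimicking a
// lookup across clusters. With a leaked stale cache still in the map,
// which answer wins depends on Go's randomized map iteration order.
func lookup(caches map[string]string) string {
	for _, v := range caches {
		return v // first cache visited wins; visit order is unspecified
	}
	return ""
}

func main() {
	caches := map[string]string{
		"cluster-2 (stale, leaked)": "identity=1234",
		"cluster-2 (current)":       "identity=5678",
	}
	// May print either value; running repeatedly can show both.
	fmt.Println(lookup(caches))
}
```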