k8s: CEP GC runs in steps #6029
Conversation
test-me-please
Force-pushed from cb7a1ab to df4bd02
test-me-please
1 similar comment
test-me-please
pkg/endpoint/endpoint.go
Outdated
ceps, err := ciliumClient.CiliumEndpoints(meta_v1.NamespaceAll).List(listOpts)
switch {
case ceps.Continue != "" && err != nil:
	// this is ok, it means we saw a 410 ResourceExpired error but we
The comment is helpful, but is this documented somewhere? I'd suggest putting a link in a comment here to said documentation.
Good point. I was, in fact, checking this incorrectly. I've fixed it.
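For reference, a minimal sketch of one way such a check can be made explicit using the apimachinery error helpers; this is illustrative only, with hypothetical function and variable names, and is not the exact code that ended up in the PR:

```go
package gcsketch

import (
	k8serrors "k8s.io/apimachinery/pkg/api/errors"
	meta_v1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// handlePagedListResult shows, in isolation, how a 410 ResourceExpired from a
// paginated List can be told apart from other failures. continueToken is the
// value the previous page returned in ListMeta.Continue.
func handlePagedListResult(err error, continueToken string) meta_v1.ListOptions {
	switch {
	case err != nil && k8serrors.IsResourceExpired(err):
		// The continue token expired on the apiserver (HTTP 410, reason
		// "Expired"). The list has to be restarted from the beginning;
		// items seen so far may come from an older resource version.
		return meta_v1.ListOptions{}
	case err != nil:
		// Any other error: the caller should treat this as a failed pass.
		return meta_v1.ListOptions{}
	case continueToken != "":
		// More pages remain: pass the token back in the next ListOptions.
		return meta_v1.ListOptions{Continue: continueToken}
	default:
		return meta_v1.ListOptions{}
	}
}
```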
@raybejjani do you have any benchmarks from manual testing that you can post here to validate the fix?
@ianvernon I added 10k CEPs, all would be GCed. That took 12
The real one took
Oddly, listing the CEPs after this still shows them but the count is decreasing... so I think there is a delay on the delete. I'll run this again with no fetch limit... I'll be a few minutes.
Ok, maybe not... I stopped the agent and the count stopped dropping. I think another GC run had started... although that shouldn't have happened (the interval is 30 minutes).
Force-pushed from df4bd02 to 7f59875
test-me-please
Running this again but with 1000 CEPs (10k was taking too long):
- No CEPs to delete
- 1000 CEPs to delete
- 1000 CEPs to delete without a fetch limit
- 1000 CEPs to delete with

I don't think we can extrapolate to much larger numbers, necessarily, since there may be some cliff somewhere. In any case, the goal was to lower the likelihood of overloading the apiserver (or the etcd behind it) when fetching these.
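For context on what the fetch limit changes, a paginated List against the apiserver has roughly this shape. The sketch below uses the generic client-go Pods API rather than the CiliumEndpoint client, and the page size of 10 mirrors what this PR uses for CEPs:

```go
package gcsketch

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// listInSteps pages through all Pods 10 at a time instead of pulling the
// whole list in one request, which is the pattern the CEP GC adopts here.
// Note: client-go releases of this era take ListOptions directly; newer
// releases also take a context.Context as the first argument.
func listInSteps(client kubernetes.Interface) error {
	opts := metav1.ListOptions{Limit: 10}
	for {
		page, err := client.CoreV1().Pods(metav1.NamespaceAll).List(opts)
		if err != nil {
			return err
		}
		for i := range page.Items {
			fmt.Println(page.Items[i].Namespace, page.Items[i].Name)
		}
		if page.Continue == "" {
			return nil // last page reached
		}
		// Resume the next request from where this page stopped.
		opts.Continue = page.Continue
	}
}
```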
Force-pushed from 7f59875 to 30d9828
test-me-please
I'll run some more benchmarks but I updated the code a little.
Force-pushed from 30d9828 to 4790d8d
Ok, I finally ran with different limit values. It doesn't seem to make a difference:
Force-pushed from 4790d8d to 7b9380a
test-me-please
Force-pushed from 7b9380a to 89c344f
test-me-please
1 similar comment
test-me-please
The watcher also keeps a local cache of objects. We can use this for lookups in places like garbage-collection. Signed-off-by: Ray Bejjani <ray@covalent.io>
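As a rough illustration of that local cache (using generic client-go types rather than the Cilium helper factories, so the names below are assumptions): an informer hands back a cache.Store that can answer lookups without another round-trip to the apiserver:

```go
package gcsketch

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
)

// newPodWatcher starts a watcher whose local cache.Store can later be used
// for lookups (for example, "does this object still exist?") instead of
// issuing a fresh List during garbage collection.
func newPodWatcher(client kubernetes.Interface, stop <-chan struct{}) cache.Store {
	lw := cache.NewListWatchFromClient(
		client.CoreV1().RESTClient(), "pods", metav1.NamespaceAll, fields.Everything())
	store, controller := cache.NewInformer(
		lw, &corev1.Pod{}, 30*time.Second, cache.ResourceEventHandlerFuncs{})
	go controller.Run(stop)
	return store
}

// existsLocally consults only the watcher's cache, never the apiserver.
func existsLocally(store cache.Store, namespace, name string) bool {
	_, exists, err := store.GetByKey(namespace + "/" + name)
	return err == nil && exists
}
```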
On large clusters the CEP GC would fetch the entire list of CEPs. These can grow to significant sizes (100kB in one instance) and this causes extreme degradation in kube-apiserver. We now iterate 10 CEPs at a time, thus lowering the load we put on the cluster. Signed-off-by: Ray Bejjani <ray@covalent.io>
Force-pushed from 89c344f to e41505c
test-me-please
Will this patch be merged into the 1.2 version?
@WingkaiHo I'm assuming you ran into scale issues with CEP. We recommend upgrading to 1.4 to use CEP at scale. Upgrading this commit only will not be sufficient to resolve all scale issues present in 1.2 and 1.3.
On large clusters the CEP GC would fetch the entire list of CEPs. These
can grow to significant sizes (100kB in one instance) and this causes
extreme degradation in kube-apiserver. We now iterate 10 CEPs at a
time, thus lowering the load we put on the cluster.
I also return the cache.Store object from our k8s helper factories. No functionality has changed there.
I began by trying to watch for CEPs in the controller but the code became a little complex. This was partly because only the node that should do GC needs to watch for CEPs, but that meant creating a non-shared informer (shared informers cannot stop once they begin watching for a specific type). In the end, limiting the list to a small number seemed like a reasonable compromise, especially since the GC isn't a critical operation here.
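Since the GC is explicitly best-effort and interval-driven (the 30-minute interval mentioned above), the surrounding loop can stay simple. A hypothetical sketch of that shape, not Cilium's actual controller API:

```go
package gcsketch

import (
	"log"
	"time"
)

// runPeriodicGC runs one best-effort GC pass per interval until stop is
// closed. A failed pass is only logged; the next tick retries, which is why
// a partially complete or stale pass is acceptable here.
func runPeriodicGC(interval time.Duration, gcOnce func() error, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			if err := gcOnce(); err != nil {
				log.Printf("CEP GC pass failed, will retry next interval: %v", err)
			}
		case <-stop:
			return
		}
	}
}
```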
Relates to: #5913