Remove cluster level list/watch for configmaps, serviceaccounts, secrets #3469
Force-pushed from 561d305 to 5af1c83.
@israel-hdez Can you help test whether this fixes the out-of-memory issue?
Thank you for fixing this!
I'll test it on my Wednesday.
Force-pushed from 5af1c83 to b8e07c0.
@sivanantha321 Could you point me to where you removed the list calls for secrets? Perhaps I missed it somewhere, but I only see the removal from the manifests.
Review thread on pkg/controller/v1beta1/inferenceservice/components/explainer.go (outdated, resolved)
Review thread on ...ller/v1beta1/inferenceservice/reconcilers/cabundleconfigmap/cabundle_configmap_reconciler.go (resolved)
@sivanantha321 I'm still replicating the setup. I tried today, but my cluster died in the process.
We are not using list calls in the code base.
I've just replicated the OOMKill with […]. I used a script from @skonto, available in this gist: https://gist.github.com/skonto/188b2c45c5af449629caaa69a190392c, with a minor modification: edit line 3 to […]. I'm now trying the code changes in this PR.
I confirm the kserve-controller pod no longer crashes with OOMKill. Note that I only tested with Secrets, which is what I reported in #3467. I didn't try the other resources.
I meant "list/watch for secrets" as described in the PR title, but thanks, I got it now.
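For readers following the manifest side of this thread, here is a minimal sketch of how one might check a set of RBAC rules for list/watch verbs on the three resources named in the PR title. It is only an illustration: `PolicyRule` is a hypothetical, simplified struct (not the `k8s.io/api/rbac/v1` type), the `before` and `after` rules are made-up examples rather than the actual kserve manifests, and wildcard resources or verbs are not handled.

```go
package main

import "fmt"

// PolicyRule is a hypothetical, simplified stand-in for an RBAC rule.
// It is NOT the k8s.io/api/rbac/v1 type.
type PolicyRule struct {
	Resources []string
	Verbs     []string
}

// sensitive holds the resources named in the PR title.
var sensitive = map[string]bool{
	"configmaps":      true,
	"serviceaccounts": true,
	"secrets":         true,
}

// informerVerbs are the verbs a list/watch-based informer needs.
var informerVerbs = map[string]bool{"list": true, "watch": true}

// Violations returns "resource:verb" pairs for every rule that grants
// list or watch on one of the sensitive resources.
func Violations(rules []PolicyRule) []string {
	var out []string
	for _, r := range rules {
		for _, res := range r.Resources {
			if !sensitive[res] {
				continue
			}
			for _, v := range r.Verbs {
				if informerVerbs[v] {
					out = append(out, res+":"+v)
				}
			}
		}
	}
	return out
}

func main() {
	// Made-up rules for illustration only.
	before := []PolicyRule{{Resources: []string{"secrets"}, Verbs: []string{"get", "list", "watch"}}}
	after := []PolicyRule{{Resources: []string{"secrets"}, Verbs: []string{"get"}}}

	fmt.Println(Violations(before)) // [secrets:list secrets:watch]
	fmt.Println(Violations(after))  // []
}
```

A check of this shape only inspects granted privileges. It says nothing about what the controller code actually calls at runtime.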
Leaving a few minor comments.
Review thread on pkg/controller/v1alpha1/trainedmodel/reconcilers/modelconfig/modelconfig_reconciler.go (resolved)
Force-pushed from 7f50571 to 610f77b.
/lgtm
Per request in prow comment: /assign @rachitchauhan43
Force-pushed from 610f77b to cfe9a48.
/lgtm
/lgtm
I pinged Rachit previously.
A small change requested. Rest LGTM.
Looks like lots of […]
Also, @israel-hdez, since we can reproduce this, what do you think of adding a test case so that regression testing always catches such issues?
@rachitchauhan43 I do know how to reproduce it, but IMO that should be handled under its own ticket, because we would need to find a way to replicate it within CI. I crashed my dev cluster a few times before I could replicate it successfully, so what I did on my device may not be suitable for GitHub Actions. And, of course, we would need to ensure all relevant code paths are covered.
Actually, I think contributors and reviewers should be careful when a […]. This PR is around ConfigMaps, ServiceAccounts and Secrets, and it already removes the privileges that are no longer needed. But any resource that the controller watches may lead to an OOMKill if the watch/informer fills the available memory.
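To make the memory argument concrete, here is a toy model in plain Go. It is a sketch under stated assumptions, not how the kserve controller or any Kubernetes client library is implemented: `APIServer`, `ClusterCache` and `Object` are hypothetical names, and the "cache" simply deep-copies every object to mimic a cluster-wide list/watch, while a direct `Get` holds only the one object it asked for.

```go
package main

import "fmt"

// Object is a hypothetical stand-in for a Kubernetes object such as a Secret.
type Object struct {
	Namespace, Name string
	Data            []byte
}

// APIServer is a toy in-memory store playing the role of the API server.
type APIServer struct {
	objects map[string]Object
}

func key(ns, name string) string { return ns + "/" + name }

func NewAPIServer() *APIServer { return &APIServer{objects: map[string]Object{}} }

func (s *APIServer) Put(o Object) { s.objects[key(o.Namespace, o.Name)] = o }

// Get fetches a single object on demand, without any local cache.
func (s *APIServer) Get(ns, name string) (Object, bool) {
	o, ok := s.objects[key(ns, name)]
	return o, ok
}

// ClusterCache models a cluster-wide list/watch: it keeps its own copy
// of every object in every namespace.
type ClusterCache struct {
	store map[string]Object
}

func NewClusterCache(s *APIServer) *ClusterCache {
	c := &ClusterCache{store: make(map[string]Object, len(s.objects))}
	for k, o := range s.objects {
		// Deep-copy the payload so the cache really owns its memory.
		o.Data = append([]byte(nil), o.Data...)
		c.store[k] = o
	}
	return c
}

// Bytes returns the total payload size held by the cache.
func (c *ClusterCache) Bytes() int {
	total := 0
	for _, o := range c.store {
		total += len(o.Data)
	}
	return total
}

func main() {
	server := NewAPIServer()
	for i := 0; i < 1000; i++ {
		server.Put(Object{Namespace: "ns", Name: fmt.Sprintf("secret-%d", i), Data: make([]byte, 1024)})
	}

	cache := NewClusterCache(server)
	fmt.Println("cached bytes:", cache.Bytes()) // cached bytes: 1024000

	one, _ := server.Get("ns", "secret-7")
	fmt.Println("direct get bytes:", len(one.Data)) // direct get bytes: 1024
}
```

In this model the cache footprint grows with the number and size of objects in the whole cluster, regardless of how many the controller actually needs, which is the failure mode described above.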
We can do it in a separate PR, as many files require reorganization.
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Force-pushed from 06d911a to 48fcc80.
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Force-pushed from 48fcc80 to a886fbb.
Re-approving
/lgtm
Sure, that is what I meant. I don't want to have a multi-purpose PR.
Agreed. Please create a separate issue for that.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
Approval requirements bypassed by manually added approval.
This pull-request has been approved by: sivanantha321, terrytangyuan
…ets (kserve#3469)
* Remove cluster level list/watch for configmaps, serviceaccounts, secrets
  Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
* Resolve comments
  Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
---------
Signed-off-by: Sivanantham Chinnaiyan <sivanantham.chinnaiyan@ideas2it.com>
Signed-off-by: tjandy98 <3953059+tjandy98@users.noreply.github.com>
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #3467
Fixes #366