-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert "[Re-Apply][Distroless] Convert the GCE manifests for master containers." #77904
Revert "[Re-Apply][Distroless] Convert the GCE manifests for master containers." #77904
Conversation
@mborsz: Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm |
/test pull-kubernetes-typecheck |
@yuwenma @tallclair @MaciekPytel @anguslees - FYI /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mborsz, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cc @dims Hi Maciej, we really hope this change can go in in v1.15. Can you provide more context about what you have witnessed and proved the manifest change is the root cause? I'm happy to help triaging the issue but would need to have more inputs. |
We are quite confident that this change is causing visible regressions in 5K node cluster tests. There is a strict correlation between logrotate and huge increase in apiserver cpu usage, which in turn increases api latency, causes components to restart (due to slow connection to master), etc. Now, exactly the same run but with #76396 reverted. As you can see, there is a huge difference. Why don't know what is the exact root cause here, but we're 100% sure that #76396 introduced an regression. |
Reverts #76396
Sorry for reverting the change again, but we have evidence that it makes gce 5000 node test failing.
The problems we are seeing with this PR:
While 1. is quite easy to understand, the reason for 2. is not known yet.
I'm happy to collaborate with anyone who is interested in rolling forward this PR again to understand the issue and make sure third roll forward will be the successful one :)
/assign @wojtek-t
/cc @mm4tt @krzysied