Fix HPA sample sanitization #67252
Conversation
/sig autoscaling
/assign @DirectXMan12
Force-pushed from dc83aa0 to a5252de.
/ok-to-test
/retest
Force-pushed from a5252de to 058240f.
You need to add what the fix is to the PR description. This definitely needs a release note. All algorithm changes need release notes.
Separately, hard-coding a startup window is not acceptable. Some pods start up within 10s, others take 5 minutes. We can't just arbitrarily assume 2 minutes is correct.
Force-pushed from 058240f to 5ca5f95.
Done. I updated the description, and also changed the implementation a little (to match the conclusions in the doc).
@DirectXMan12 please take a look
Force-pushed from 5ca5f95 to d89cb2c.
After my previous changes, HPA wasn't behaving correctly in the following situation:
- Pods use a lot of CPU during initialization and become ready right after they initialize,
- a scale-up triggers,
- when the new pods become ready, HPA counts their usage (even though it's not related to any work that needs doing),
- another scale-up triggers, even though the existing pods can handle the work without a problem.
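For illustration, here is a minimal, self-contained Go sketch of the sanitization idea. The names (`podSample`, `usableCPUSamples`) and the window values are hypothetical, not the actual replica-calculator code: CPU samples from pods that are still inside their initialization window, or that only just became ready, are dropped before averaging, so start-up CPU burn does not trigger a second scale-up.

```go
// Sketch only: drop CPU samples from pods that are still initializing,
// so their start-up CPU usage does not feed another scale-up decision.
package main

import (
	"fmt"
	"time"
)

// podSample is a simplified view of a pod and its latest CPU metric.
type podSample struct {
	name          string
	ready         bool
	startTime     time.Time
	readySince    time.Time // when the pod last transitioned to Ready
	cpuMilliValue int64
}

// usableCPUSamples returns the samples the autoscaler should average over.
// A sample is dropped while the pod is inside its CPU initialization period
// and either is not Ready yet or became Ready only very recently.
func usableCPUSamples(pods []podSample, now time.Time, cpuInitPeriod, initialReadinessDelay time.Duration) []podSample {
	var usable []podSample
	for _, p := range pods {
		withinInit := now.Sub(p.startTime) < cpuInitPeriod
		justBecameReady := p.ready && now.Sub(p.readySince) < initialReadinessDelay
		if withinInit && (!p.ready || justBecameReady) {
			continue // ignore start-up CPU burn
		}
		usable = append(usable, p)
	}
	return usable
}

func main() {
	now := time.Now()
	pods := []podSample{
		{name: "old-pod", ready: true, startTime: now.Add(-10 * time.Minute), readySince: now.Add(-9 * time.Minute), cpuMilliValue: 100},
		{name: "new-pod", ready: true, startTime: now.Add(-20 * time.Second), readySince: now.Add(-5 * time.Second), cpuMilliValue: 900},
	}
	// Only old-pod's sample is counted; new-pod is still initializing.
	for _, p := range usableCPUSamples(pods, now, 5*time.Minute, 30*time.Second) {
		fmt.Println("counting", p.name, p.cpuMilliValue)
	}
}
```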
/assign @wojtek-t
@DirectXMan12
Please squash the last two commits.
Duration of the initialization taint on CPU and the window of the initial readiness setting are controlled by flags. Adding API violation exceptions following the example of e50340e.
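As a rough sketch of how those two windows could be exposed as flags (the flag names and defaults below are assumptions modelled on kube-controller-manager conventions, not necessarily the exact flags added by this PR):

```go
// Sketch: expose the two sanitization windows as duration flags.
package main

import (
	"flag"
	"fmt"
	"time"
)

func main() {
	cpuInitializationPeriod := flag.Duration(
		"horizontal-pod-autoscaler-cpu-initialization-period", 5*time.Minute,
		"How long after pod start CPU samples may be discarded while the pod initializes.")
	initialReadinessDelay := flag.Duration(
		"horizontal-pod-autoscaler-initial-readiness-delay", 30*time.Second,
		"Window after pod start during which readiness transitions are treated as initial readiness.")
	flag.Parse()

	fmt.Println("CPU initialization period:", *cpuInitializationPeriod)
	fmt.Println("Initial readiness delay:  ", *initialReadinessDelay)
}
```

Making both windows configurable addresses the earlier review comment that a hard-coded startup window cannot fit both pods that start in seconds and pods that take minutes.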
Force-pushed from c422739 to 4fd6a16.
@wojtek-t Done.
/lgtm [Though those validation exceptions need api-reviewer approval.]
/hold @thockin - for final approval of the exceptions.
I am not sure what I am being asked to approve. What is triggering the violation?
@thockin I'm adding new flags for HPA. They require corresponding fields in the componentconfig v1alpha1 API. All such fields have an exception (I think because they start with a capital letter).
/approve |
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: jbartosik, mwielgus, thockin, wojtek-t. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/hold cancel
@DirectXMan12 @mwielgus please merge this PR
/test all [submit-queue is verifying that this PR is safe to merge]
@jbartosik: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Automatic merge from submit-queue (batch tested with PRs 66916, 67252, 67794, 67619, 67328). If you want to cherry-pick this change to another branch, please follow the instructions here.
As long as the PRs are posted before code slush, that's acceptable.
What this PR does / why we need it: @mwielgus pointed out a case where HPA fails as a result of my changes to the HPA algorithm:
The fix is:
Reasoning behind this is in: https://docs.google.com/document/d/1UdtYedhmCxjaJIQi6hwJMY0eHQQKxlVD8lSHZC1BPOA/edit
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Special notes for your reviewer:
Release note: