[HPA e2e] Calculate more precise consumed CPU usage for N replicas #115584
Hi @pbeschetnov. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test.
/ok-to-test
nice idea 👍, let's see how it helps with flakes
LGTM label has been added. Git tree hash: 2fb282de5b80d93f66739dd93f4e15bae8241f41
pull-kubernetes-e2e-autoscaling-hpa-cpu logs are gone
/assign @mwielgus
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mwielgus, pbeschetnov
What type of PR is this?
/kind feature
/kind flake
What this PR does / why we need it:
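With the current setup of the HPA e2e tests with behavior (a CPU request of 500m per pod, a target utilization of 25%, and a usage of 110m for a single replica), and with the usage for n replicas calculated as `n * {usage for single replica}`, the consumed CPU for n replicas is n * 110m, which corresponds to a raw recommendation of n * 110 / 500 / 0.25 = 0.88 * n replicas.
The usage is more balanced for 4-5 replicas and skewed for <=3 and >=6 replicas. This leads to flaky tests when the CPU consumed by the ResourceConsumer fluctuates and goes over the intended usage. Example: a consumption of 254m gives 254 / 500 / 0.25 = 2.032, which rounds up to 3. For >=9 replicas, it does not even produce enough usage for the intended recommendation.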
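The arithmetic behind the current scheme can be sketched as follows. This is a simplified model, not the e2e test code: the constant and function names are hypothetical, the values are the ones quoted above, and the model only applies the round-up step while ignoring HPA tolerance and stabilization.

```python
# Values quoted in the PR description; the names are illustrative only.
REQUEST_M = 500         # CPU request per pod, in millicores
TARGET_PERCENT = 25     # target CPU utilization, in percent
SINGLE_REPLICA_M = 110  # usage for a single replica, in millicores


def recommended_replicas(consumed_m: int) -> int:
    """Round consumed / (request * target) up to a whole replica count."""
    per_replica_target = REQUEST_M * TARGET_PERCENT  # scaled by 100
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-(consumed_m * 100) // per_replica_target)


# Current scheme: total usage for n replicas is n * SINGLE_REPLICA_M.
for n in range(1, 11):
    consumed = n * SINGLE_REPLICA_M
    raw = consumed * 100 / (REQUEST_M * TARGET_PERCENT)
    print(f"n={n:2d} consumed={consumed:4d}m raw={raw:.2f} "
          f"recommended={recommended_replicas(consumed)}")
```

Under these assumptions the raw value for 2 replicas is 1.76, so a moderate upward fluctuation crosses 2.0 and yields 3, while for 9 replicas the raw value is 7.92 and the recommendation stays at 8.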
Instead of using this approach, I suggest calculating the replica usage for `n` pods as if `n - 0.5` replicas consume all CPU matching the target. It would be: `(replicas - 0.5) * request * targetPercentage / 100%`
The 0.5 replica reduction accommodates the deviation between the CPU actually consumed by the ResourceConsumer and the requested usage. HPA rounds recommendations up, so if the usage corresponds to, e.g., 3.5 replicas, the recommended replica count will be 4.
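A minimal sketch of the proposed formula, assuming the same 500m request and 25% target as in the current tests. The function names are hypothetical and the model again ignores HPA tolerance and stabilization; it only illustrates why the usage sits in the middle of the rounding interval.

```python
import math

REQUEST_M = 500      # CPU request per pod, in millicores
TARGET_PERCENT = 25  # target CPU utilization, in percent


def usage_for_replicas(n: int) -> float:
    """Proposed total usage: (replicas - 0.5) * request * targetPercentage / 100%."""
    return (n - 0.5) * REQUEST_M * TARGET_PERCENT / 100


def recommended_replicas(consumed_m: float) -> int:
    """Round consumed / (request * target) up to a whole replica count."""
    return math.ceil(consumed_m * 100 / (REQUEST_M * TARGET_PERCENT))


for n in range(1, 11):
    usage = usage_for_replicas(n)
    # A deviation of 60m in either direction is an arbitrary illustrative value.
    low, high = usage - 60, usage + 60
    print(f"n={n:2d} usage={usage:7.1f}m recommended={recommended_replicas(usage)} "
          f"(-60m: {recommended_replicas(low)}, +60m: {recommended_replicas(high)})")
```

With these values each replica's target share is 125m, so the intended recommendation holds as long as the consumed CPU stays within about 62.5m of the requested usage in either direction.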
This will eliminate flakiness in HPA e2e tests with behavior.
Does this PR introduce a user-facing change?