[release-4.6] Bug 1899406: HPA: Ignore deleted pods. #465

openshift-cherrypick-robot

This is an automated cherry-pick of #462

/assign joelsmith

When a pod is deleted, it is given a deletion timestamp. However, the
pod might still run for some time during graceful shutdown. During
this time it might still produce CPU utilization metrics and remain in
the Running phase.

Currently, the HPA replica calculator attempts to ignore deleted pods
by skipping over them. However, because they are not added to the
ignoredPods set, their metrics are not removed from the average
utilization calculation. This allows pods that are shutting down to
drag down the recommended number of replicas by producing near-0%
utilization metrics.
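
The effect can be illustrated with a deliberately simplified model of the calculation. This is a sketch, not the calculator's actual code: the `sample` type and the `recommend` function are hypothetical, tolerance and other details are left out, and it assumes the recommendation is ceil(readyCount * averageUtilization / target).

```go
package main

import (
	"fmt"
	"math"
)

// sample is a hypothetical, simplified stand-in for one pod's CPU metric.
type sample struct {
	utilization float64 // percent of the pod's CPU request
	deleting    bool    // true once the pod has a deletion timestamp
}

// recommend returns ceil(readyCount * averageUtilization / target).
// With dropDeleting=false, terminating pods are left out of readyCount
// but their metrics still count toward the average, which models the
// behaviour described above. With dropDeleting=true their metrics are
// removed as well.
func recommend(samples []sample, target float64, dropDeleting bool) int {
	var sum float64
	var counted, ready int
	for _, s := range samples {
		if s.deleting && dropDeleting {
			continue
		}
		sum += s.utilization
		counted++
		if !s.deleting {
			ready++
		}
	}
	if counted == 0 {
		return 0
	}
	return int(math.Ceil(float64(ready) * sum / (float64(counted) * target)))
}

func main() {
	// Four running pods at 100% and two terminating pods at 0%, target 50%.
	samples := []sample{
		{100, false}, {100, false}, {100, false}, {100, false},
		{0, true}, {0, true},
	}
	fmt.Println(recommend(samples, 50, false)) // 6: terminating pods dilute the average
	fmt.Println(recommend(samples, 50, true))  // 8: terminating pods' metrics removed
}
```

In this made-up scenario the two terminating pods lower the recommendation from 8 replicas to 6.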

In fact, the name ignoredPods is a misnomer. Those pods are not fully
ignored. When the replica calculator recommends scaling up, 0%
utilization metrics are filled in for those pods to limit the scale
up. This prevents overscaling when pods take some time to start up.
There should therefore be four sets considered (readyPods,
unreadyPods, missingPods, ignoredPods), not just three.

This change renames ignoredPods to unreadyPods and keeps its scale-up
limiting semantics. Another set, the (actually) ignoredPods, is added,
and deleted pods are added to it instead of being skipped during
grouping. Both ignoredPods and unreadyPods have their metrics removed
from consideration, but only unreadyPods have 0% utilization metrics
filled in upon scale up.
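
A minimal sketch of the four-set grouping described above follows. It is illustrative only: the `pod` and `podSets` types and the `groupPods` and `samplesFor` functions are hypothetical simplifications, not the calculator's real signatures. The handling of missingPods is left out.

```go
package main

import "fmt"

// pod is a hypothetical, simplified view of a pod.
type pod struct {
	name     string
	deleting bool // deletion timestamp is set
	ready    bool
}

// podSets holds the four groups named in the description.
type podSets struct {
	ready, unready, missing, ignored []string
}

// groupPods places every pod into exactly one set. Deleted pods go to
// ignored instead of being skipped.
func groupPods(pods []pod, metrics map[string]float64) podSets {
	var sets podSets
	for _, p := range pods {
		_, hasMetric := metrics[p.name]
		switch {
		case p.deleting:
			sets.ignored = append(sets.ignored, p.name)
		case !hasMetric:
			sets.missing = append(sets.missing, p.name)
		case !p.ready:
			sets.unready = append(sets.unready, p.name)
		default:
			sets.ready = append(sets.ready, p.name)
		}
	}
	return sets
}

// samplesFor returns the utilization values used for averaging. Metrics
// of ignored and unready pods are dropped; on scale up, each unready pod
// contributes a 0% sample, while ignored pods contribute nothing.
func samplesFor(sets podSets, metrics map[string]float64, scaleUp bool) []float64 {
	var out []float64
	for _, name := range sets.ready {
		out = append(out, metrics[name])
	}
	if scaleUp {
		for range sets.unready {
			out = append(out, 0)
		}
	}
	return out
}

func main() {
	pods := []pod{
		{"a", false, true}, {"b", false, true},
		{"c", false, false}, // still starting up
		{"d", true, true},   // shutting down
	}
	metrics := map[string]float64{"a": 100, "b": 100, "c": 5, "d": 1}
	sets := groupPods(pods, metrics)
	fmt.Println(sets.ready, sets.unready, sets.ignored) // [a b] [c] [d]
	fmt.Println(samplesFor(sets, metrics, true))        // [100 100 0]
	fmt.Println(samplesFor(sets, metrics, false))       // [100 100]
}
```

Pod `d` never influences the average, whereas pod `c` dampens it only when the calculation is heading toward a scale up.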
@openshift-ci-robot

@openshift-cherrypick-robot: Bugzilla bug 1899405 has been cloned as Bugzilla bug 1899406. Retitling PR to link against new bug.
/retitle [release-4.6] Bug 1899406: HPA: Ignore deleted pods.

In response to this:

[release-4.6] Bug 1899405: HPA: Ignore deleted pods.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot changed the title [release-4.6] Bug 1899405: HPA: Ignore deleted pods. [release-4.6] Bug 1899406: HPA: Ignore deleted pods. Nov 19, 2020
@openshift-ci-robot openshift-ci-robot added bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Nov 19, 2020
@openshift-ci-robot

@openshift-cherrypick-robot: This pull request references Bugzilla bug 1899406, which is invalid:

  • expected dependent Bugzilla bug 1899405 to be in one of the following states: VERIFIED, RELEASE_PENDING, CLOSED (ERRATA), but it is MODIFIED instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

[release-4.6] Bug 1899406: HPA: Ignore deleted pods.

@joelsmith

/cc @rphillips

Member

@soltysh soltysh left a comment

/lgtm
/approve

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 19, 2020
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: openshift-cherrypick-robot, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 19, 2020
@joelsmith

/bugzilla refresh

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Nov 19, 2020
@openshift-ci-robot

@joelsmith: This pull request references Bugzilla bug 1899406, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

6 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.z) matches configured target release for branch (4.6.z)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
  • dependent bug Bugzilla bug 1899405 is in the state VERIFIED, which is one of the valid states (VERIFIED, RELEASE_PENDING, CLOSED (ERRATA))
  • dependent Bugzilla bug 1899405 targets the "4.7.0" release, which is one of the valid target releases: 4.7.0
  • bug has dependents

In response to this:

/bugzilla refresh

@openshift-ci-robot openshift-ci-robot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Nov 19, 2020
@joelsmith

/retest

@ecordell ecordell added the cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. label Nov 19, 2020
@openshift-merge-robot openshift-merge-robot merged commit 43983cd into openshift:release-4.6 Nov 19, 2020
@openshift-ci-robot

@openshift-cherrypick-robot: All pull requests linked via external trackers have merged:

Bugzilla bug 1899406 has been moved to the MODIFIED state.

In response to this:

[release-4.6] Bug 1899406: HPA: Ignore deleted pods.

openshift-merge-robot pushed a commit that referenced this pull request Sep 17, 2021
Cherry pick #465 in cloud provider azure to 1.20: Cleanup subnet in frontend IP configs
openshift-merge-robot pushed a commit that referenced this pull request Oct 7, 2021
Cherry pick #465 in cloud provider azure to 1.19: Cleanup subnet in frontend IP configs
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. lgtm Indicates that a PR is ready to be merged.
7 participants