Automated cherry pick of #72558: add goroutine to move unschedulablepods to activeq regularly 1.13 #73454

@denkensk (Member) commented Jan 29, 2019

Cherry pick of #72558 on release-1.13.

#72558: add goroutine to move unschedulablepods to activeq regularly

What type of PR is this?
/kind bug
/sig scheduling
/priority important-longterm

What this PR does / why we need it:
The scheduler places unschedulable pods in the "unschedulable" queue and retries them only when certain events occur that could make them schedulable. This logic works well in almost all scenarios, but the race conditions that are inevitable in large distributed systems can cause some events to be seen before the pods are added to the unschedulable queue. When this happens, pods may be left in the unschedulable queue and not retried. Such scenarios should be rare, and when they do occur, other events usually trigger a retry and cover them. However, in smaller, low-churn clusters, other events may not be seen for a while, and pods may be stuck in the unschedulable queue for a long time.

Which issue(s) this PR fixes:
Fixes #72122

Special notes for your reviewer:

Move unschedulable pods to activeq if they are not retried for more than 1 minute

@k8s-ci-robot added the do-not-merge/cherry-pick-not-approved, size/L, needs-kind, needs-sig, and needs-priority labels on Jan 29, 2019
@k8s-ci-robot
Contributor

Hi @denkensk. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-ok-to-test, cncf-cla: yes, and sig/scheduling labels and removed the needs-sig label on Jan 29, 2019
@k8s-ci-robot added the release-note, kind/bug, and priority/important-longterm labels and removed the needs-kind and needs-priority labels on Jan 29, 2019
@denkensk changed the title from "Automated cherry pick of #72558: add goroutine to move unschedulablepods to activeq regularly" to "Automated cherry pick of #72558: add goroutine to move unschedulablepods to activeq regularly 1.13" on Jan 29, 2019
@Huang-Wei
Member

/ok-to-test

@k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label on Jan 29, 2019
@Huang-Wei (Member) left a comment

/lgtm

@k8s-ci-robot added the lgtm label on Jan 29, 2019
@denkensk force-pushed the automated-cherry-pick-of-#72558-upstream-release-1.13 branch from f4b1db8 to 7a02945 on January 29, 2019 09:22
@k8s-ci-robot removed the lgtm label on Jan 29, 2019
@denkensk force-pushed the automated-cherry-pick-of-#72558-upstream-release-1.13 branch from 7a02945 to 290862f on January 29, 2019 15:57
@denkensk force-pushed the automated-cherry-pick-of-#72558-upstream-release-1.13 branch from 290862f to 06d31c3 on January 29, 2019 20:59
@denkensk
Member Author

/cc @bsalamat @Huang-Wei

@bsalamat (Member) left a comment

/lgtm
/approve

Thanks, @denkensk!

@k8s-ci-robot added the lgtm label on Jan 29, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, denkensk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on Jan 29, 2019
@tpepper added the cherry-pick-approved label on Jan 29, 2019
@k8s-ci-robot removed the do-not-merge/cherry-pick-not-approved label on Jan 29, 2019
@tpepper
Member

tpepper commented Jan 29, 2019

/retest

@tpepper
Member

tpepper commented Jan 29, 2019

pull-kubernetes-e2e-kops-aws likely will not pass due to k/k issue 73444

@k8s-ci-robot
Contributor

@denkensk: The following test failed, say /retest to rerun them all:

Test name | Commit | Details | Rerun command
pull-kubernetes-e2e-kops-aws | 06d31c3 | link | /test pull-kubernetes-e2e-kops-aws

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot merged commit 0aa225f into kubernetes:release-1.13 on Jan 30, 2019
@denkensk deleted the automated-cherry-pick-of-#72558-upstream-release-1.13 branch on February 15, 2019 09:32
Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
cherry-pick-approved: Indicates a cherry-pick PR into a release branch has been approved by the release branch manager.
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
kind/bug: Categorizes issue or PR as related to a bug.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
release-note: Denotes a PR that will be considered when it comes time to generate release notes.
sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.
5 participants