de-prioritize ScheduleFailed pods in scheduler #85792

Open
yingnanzhang666 opened this issue Dec 2, 2019 · 3 comments

@yingnanzhang666 yingnanzhang666 commented Dec 2, 2019

We found a scheduler performance issue in our production environment.

Scenario:
If several pods cannot be scheduled, they are always added back to the queue. When a customer then creates a new pod, it is added to the tail of the work queue, and the scheduler only schedules it after it has gone through all the ScheduleFailed pods in that loop.
In a large cluster the impact is amplified, especially when the pods in the scheduler work queue use local volumes, whose scheduling latency is high. (Here, scheduler performance is taken to include the time a pod waits in the queue.)

So we should de-prioritize these ScheduleFailed pods so that they do not block the queue.

@yingnanzhang666 yingnanzhang666 commented Dec 2, 2019

/sig scheduling

@ProgramerGu ProgramerGu commented Dec 2, 2019

Do you want to cancel or optimize the priority of the new container?

@yingnanzhang666 yingnanzhang666 commented Dec 2, 2019

One proposal:

  • Keep de-prioritizing a ScheduleFailed pod each time it fails to be scheduled.
  • Restore the original priority of the de-prioritized pods when an event about added capacity is received.