scheduler pending #3518

jorahbi · 2024-06-13T02:35:28Z

Please provide an in-depth description of the question you have:

What do you think about this question?:
What causes this issue? Is it that Volcano fails to pick up modifications to the job YAML in a timely manner? What operation is needed to restore normal scheduling? Currently, scheduling resumes after the scheduler is restarted, and sometimes deleting and recreating the job also lets it be scheduled normally. The message "Retry to delete job xxxxxx" keeps appearing in the log.

Environment:

Volcano Version: 1.8.2
Kubernetes version (use kubectl version):1.27
Cloud provider or hardware configuration:
OS (e.g. from /etc/os-release):ubuntu 22.04
Kernel (e.g. uname -a):
Install tools:
Others:


googs1025 · 2024-06-13T14:26:24Z

Does this error affect your use of Volcano?
As far as I understand (please correct me if I am wrong), this should be a transient error. An update that fails with an operation conflict triggers another reconcile, in which the controller fetches the latest version of the resource from the apiserver and retries the update. This repeats until the update goes through.
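The conflict-and-retry behaviour described here can be illustrated with a minimal sketch. It is not Volcano or client-go code: `FakeAPIServer`, `Job`, `ConflictError`, and `update_with_retry` are hypothetical names, and the integer `resource_version` only stands in for the version check the apiserver performs on updates.

```python
from dataclasses import dataclass, replace


class ConflictError(Exception):
    """Raised when an update carries a stale resource_version."""


@dataclass(frozen=True)
class Job:
    name: str
    resource_version: int
    min_available: int


class FakeAPIServer:
    """In-memory stand-in for the apiserver (hypothetical, not a real client)."""

    def __init__(self, job):
        self._job = job

    def get(self):
        return self._job

    def update(self, job):
        # Reject writes that are based on an outdated copy of the object.
        if job.resource_version != self._job.resource_version:
            raise ConflictError(job.name)
        self._job = replace(job, resource_version=job.resource_version + 1)
        return self._job


def update_with_retry(server, mutate, stale_copy=None, max_attempts=5):
    """Apply mutate() to the job, refetching the latest copy after a conflict."""
    job = stale_copy if stale_copy is not None else server.get()
    for attempt in range(1, max_attempts + 1):
        try:
            return server.update(mutate(job)), attempt
        except ConflictError:
            job = server.get()  # reconcile again with the latest version
    raise RuntimeError("update did not succeed within max_attempts")


server = FakeAPIServer(Job("demo-job", resource_version=1, min_available=1))
cached = server.get()                            # copy held in a local cache
server.update(replace(cached, min_available=2))  # another writer gets in first
result, attempts = update_with_retry(
    server, lambda j: replace(j, min_available=3), stale_copy=cached
)
```

In this sketch the first attempt fails because the cached copy is stale, and the second attempt succeeds after refetching, which is why a single conflict message on its own is usually harmless.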

jorahbi · 2024-06-17T07:14:16Z

The pending state sometimes lasts for a long time. I think it is caused by an inconsistency between the scheduler's cached resources and the actual Kubernetes resources, because scheduling succeeds after restarting Volcano or after deleting the job and resubmitting it.

Monokaix · 2024-06-20T09:35:27Z

First, a job update conflict is a normal case, and the scheduler will retry scheduling the job.
Second, retrying the job deletion is normal when a pod of the job is deleted: the job in the cache is finally deleted once the pod has been removed from etcd, that is, once the pod's graceful deletion has finished.
Also, you can try v1.9.0; PR #3144 fixed some problems with slow retries of job deletion. @jorahbi
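To make the second point concrete, here is a rough sketch of a cache that keeps retrying a job deletion until the job's pods are gone. It is a toy model under stated assumptions, not the scheduler's actual implementation: `SchedulerCache` and its methods are hypothetical, and only the "Retry to delete job" wording is taken from the log message quoted in this issue.

```python
from collections import deque


class SchedulerCache:
    """Toy cache: a job can be dropped only once it has no pods left."""

    def __init__(self):
        self.jobs = {}               # job name -> set of pod names
        self.delete_queue = deque()  # jobs waiting to be removed from the cache
        self.log = []

    def add_pod(self, job, pod):
        self.jobs.setdefault(job, set()).add(pod)

    def remove_pod(self, job, pod):
        # Called once the pod object has finally disappeared (graceful
        # deletion finished), not when deletion was merely requested.
        self.jobs[job].discard(pod)

    def request_delete(self, job):
        self.delete_queue.append(job)

    def process_deletions(self):
        """One pass over the queue; jobs that still own pods are requeued."""
        for _ in range(len(self.delete_queue)):
            job = self.delete_queue.popleft()
            if self.jobs.get(job):
                self.log.append(f"Retry to delete job {job}")
                self.delete_queue.append(job)
            else:
                self.jobs.pop(job, None)
                self.log.append(f"Deleted job {job}")


cache = SchedulerCache()
cache.add_pod("job-a", "job-a-worker-0")
cache.request_delete("job-a")
cache.process_deletions()                    # pod still terminating: retried
cache.remove_pod("job-a", "job-a-worker-0")
cache.process_deletions()                    # pod gone: job leaves the cache
```

Under this model the retry message is expected while a pod is still terminating; it only points to a problem if it keeps repeating long after the pods have actually gone.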

jorahbi · 2024-08-28T09:36:11Z

This is a vcjob. Under what circumstances would this job need to be deleted? The result is that scheduling keeps failing until the volcano scheduler pod is restarted.

vc-scheduler log

vc-controller log

jorahbi · 2024-08-28T09:41:35Z

@Monokaix @lowang-bh Could you please help clarify this? Thank you very much.
