When autoscaling with K8s, I have a few minutes of lag between the node appearing as a K8s resource and the node becoming ready. During that window, this produces repeated logs reporting "Pod not yet started", which likely come from the K8s event API.
[2024-06-11, 14:12:16 PDT] {pod_manager.py:378} WARNING - Pod not yet started: $SOME_POD_NAME
...
[2024-06-11, 14:18:12 PDT] {pod_manager.py:378} WARNING - Pod not yet started: $SOME_POD_NAME
In the Gantt view, I can see that the time interval covered by the above logs coincides with the running state rather than the queued state.
In the docs, the states are defined as:
queued: The task has been assigned to an Executor and is awaiting a worker
running: The task is running on a worker (or on a local/synchronous executor)
...and I'm not sure where "task has been assigned but worker is not ready" falls.
Are there any configs that can modify this behavior, or am I missing some other mechanism?
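For context, the closest knob I've found is a minimal sketch like the one below, assuming KubernetesPodOperator from the cncf.kubernetes provider (the import path may differ by provider version; task_id, image, and the timeout value are illustrative placeholders). Raising startup_timeout_seconds keeps the task from failing while the node warms up, but as far as I can tell it doesn't change which state that waiting time is attributed to.

```python
# Minimal sketch, assuming KubernetesPodOperator from the cncf.kubernetes
# provider; task_id, image, and the timeout value are placeholders.
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

some_task = KubernetesPodOperator(
    task_id="some_task",
    image="some-image:latest",
    # How long the operator waits for the pod to start before failing;
    # the "Pod not yet started" warnings are emitted during this wait.
    startup_timeout_seconds=600,
)
```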