kubernetes_executor does not properly fail task-instances in cleanup_stuck_queued_tasks #39078
Open
2 tasks done
Labels
area:core
kind:bug
This is a clearly a bug
needs-triage
label for new issues that we didn't triage yet
Apache Airflow version
Other Airflow 2 version (please specify below)
If "Other Airflow 2 version" selected, which one?
2.8.3
What happened?
I'm chasing an issue involving a hard-to-reproduce issue with deferrable operators being "stuck" in Queued state.
In debugging, I noticed what seems to be a defficiency in the kubernetes_executor's override of cleanup_stuck_queued_tasks is not actually
fail
ing the instance.What you think should happen instead?
Background: In debugging, I focused on the logs available at default log-level, which included
In reviewing the code printing this message, it seems clear that the expectation of a base_executor subclass is to
fail
the task-instances. For reference, the celery_executor does.Problem: The kubernetes_executor does not.
Solution: It seems to me like it should.
How to reproduce
The true root cause of my issue is still a mystery to me. This is an attempt at fixing this safety net.
Operating System
debian
Versions of Apache Airflow Providers
apache-airflow-providers-cncf-kubernetes==8.0.0
Deployment
Other 3rd-party Helm chart
Deployment details
No response
Anything else?
No response
Are you willing to submit PR?
Code of Conduct
The text was updated successfully, but these errors were encountered: