Airflow 2.1.0 doesn't retry a task if it is externally killed #16285
Labels: affected_version:2.1, kind:bug, priority:high
Apache Airflow version: 2.1.0
Kubernetes version (if you are using kubernetes) (use kubectl version):
Environment:
Kernel (uname -a): Linux 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
What happened:
When a task gets killed externally, it is marked as Failed even though it still has retries left.
What you expected to happen:
When a task gets killed externally (kill -9 pid), it should be put back up for retry if its retries have not run out yet.
How to reproduce it:
I'm using Celery as the executor and I have a cluster of ~250 machines.
I have a task defined as follows. When the task starts executing and is killed externally by sending SIGKILL to it (or to the executor process and its children), it gets marked as FAILED and is not put up for retry, even though retries is set to 10.
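To make the scenario concrete without an Airflow cluster, here is a minimal standard-library sketch of the two pieces involved: a process being killed with SIGKILL, and the retry decision this report expects afterwards. It is an illustration only, not Airflow's implementation and not the task from this report. The helper names `expected_state` and `run_and_kill`, the 1-based `try_number` convention, and the 60-second sleep are all assumptions made for the example; only SIGKILL and retries=10 come from the report. It assumes a POSIX system.

```python
import signal
import subprocess
import sys


def expected_state(try_number: int, retries: int) -> str:
    """State this report expects after an external kill (hypothetical helper).

    Assumption: try_number is 1-based, so a task with `retries` retries
    gets retries + 1 attempts in total before it is finally failed.
    """
    return "up_for_retry" if try_number <= retries else "failed"


def run_and_kill() -> int:
    """Start a long-running child, SIGKILL it, and return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.send_signal(signal.SIGKILL)  # same effect as `kill -9 <pid>`
    return child.wait()


if __name__ == "__main__":
    status = run_and_kill()
    # On POSIX, a return code of -N means the child died from signal N.
    print("killed by SIGKILL:", status == -signal.SIGKILL)
    print("expected state on try 1 with retries=10:", expected_state(1, 10))
```

The point of the sketch is that the kill is visible to the parent (a negative return code), so the supervising process has what it needs to choose between retrying and failing. The behavior described above corresponds to always taking the "failed" branch.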
Anything else we need to know: This bug was introduced by #15537, as far as I know.
Below is the task log after sending SIGKILL to it.