Deferrable Operators get stuck as "scheduled" during backfill #25653
Labels
affected_version:2.3
Issues Reported for 2.3
area:async-operators
AIP-40: Deferrable ("Async") Operators
area:backfill
Specifically for backfill related
area:Scheduler
including HA (high availability) scheduler
kind:bug
This is a clearly a bug
Milestone
Apache Airflow version
2.3.3
What happened
If you try to backfill a DAG that uses any deferrable operators, those tasks will get indefinitely stuck in a "scheduled" state.
If I watch the Grid View, I can see the task state change: "scheduled" (or sometimes "queued") -> "deferred" -> "scheduled". I've tried leaving in this state for over an hour, but there are no further state changes.
When the task is stuck like this, the log appears as empty in the web UI. The corresponding log file does exist on the worker, but it does not contain any errors or warnings that might point to the source of the problem.
Ctrl-C-ing the backfill at this point seems to hang on "Shutting down LocalExecutor; waiting for running tasks to finish." Force-killing and restarting the backfill will "unstick" the stuck tasks. However, any deferrable operators downstream of the first will get back into that stuck state, requiring multiple restarts to get everything to complete successfully.
What you think should happen instead
Deferrable operators should work as normal when backfilling.
How to reproduce
airflow dags backfill test_backfill -s 2022-08-01 -e 2022-08-04
Operating System
CentOS Stream 8
Versions of Apache Airflow Providers
None
Deployment
Other
Deployment details
Self-hosted/standalone
Anything else
I was able to reproduce this with the following configurations:
standalone
mode + SQLite backend +SequentialExecutor
standalone
mode + Postgres backend +LocalExecutor
CeleryExecutor
I have not yet found anything telling in any of the backend logs.
Possibly related:
Are you willing to submit PR?
Code of Conduct
The text was updated successfully, but these errors were encountered: