Deferrable Operators get stuck as "scheduled" during backfill #25653

Closed
1 of 2 tasks
Gollum999 opened this issue Aug 10, 2022 · 1 comment · Fixed by #26205
Labels
affected_version:2.3 Issues Reported for 2.3 area:async-operators AIP-40: Deferrable ("Async") Operators area:backfill Specifically for backfill related area:Scheduler including HA (high availability) scheduler kind:bug This is a clearly a bug

@Gollum999
Contributor

Apache Airflow version

2.3.3

What happened

If you try to backfill a DAG that uses any deferrable operators, those tasks will get indefinitely stuck in a "scheduled" state.

If I watch the Grid View, I can see the task state change: "scheduled" (or sometimes "queued") -> "deferred" -> "scheduled". I've tried leaving it in this state for over an hour, but there are no further state changes.
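To make the symptom concrete, here is a minimal sketch of how the pattern above could be checked against a list of hand-recorded observations. It is not part of Airflow and does not query Airflow in any way: the function name `find_stuck_after_deferral`, the `(timestamp, state)` tuple format, and the 10-minute threshold are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def find_stuck_after_deferral(observations, now, threshold=timedelta(minutes=10)):
    """Return True if the task went deferred -> scheduled and then stopped changing.

    `observations` is a hypothetical, chronologically ordered list of
    (timestamp, state) tuples, e.g. noted by hand while watching the Grid View.
    """
    # Collapse consecutive duplicate states so only real transitions remain.
    transitions = []
    for ts, state in observations:
        if not transitions or transitions[-1][1] != state:
            transitions.append((ts, state))

    if len(transitions) < 2:
        return False

    (_, prev_state), (last_ts, last_state) = transitions[-2], transitions[-1]
    # "Stuck" here means: the last transition was deferred -> scheduled,
    # and nothing has changed for longer than the (assumed) threshold.
    return (
        prev_state == "deferred"
        and last_state == "scheduled"
        and now - last_ts > threshold
    )


t0 = datetime(2022, 8, 10, 12, 0)
observed = [
    (t0, "scheduled"),
    (t0 + timedelta(seconds=5), "deferred"),
    (t0 + timedelta(seconds=10), "scheduled"),
    (t0 + timedelta(minutes=30), "scheduled"),  # still unchanged half an hour later
]
print(find_stuck_after_deferral(observed, now=t0 + timedelta(hours=1)))  # True
```

A task that moves on to "running" or "success" after the deferral would not be flagged, since the last recorded transition would no longer be "deferred" -> "scheduled".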

When the task is stuck like this, the log appears as empty in the web UI. The corresponding log file does exist on the worker, but it does not contain any errors or warnings that might point to the source of the problem.

Pressing Ctrl-C on the backfill at this point seems to hang on "Shutting down LocalExecutor; waiting for running tasks to finish." Force-killing and restarting the backfill will "unstick" the stuck tasks. However, any deferrable operators downstream of the first will get stuck in the same way, so multiple restarts are required to get everything to complete successfully.

What you think should happen instead

Deferrable operators should work as normal when backfilling.

How to reproduce

#!/usr/bin/env python3
import datetime
import logging

import pendulum
from airflow.decorators import dag, task
from airflow.sensors.time_sensor import TimeSensorAsync


logger = logging.getLogger(__name__)


@dag(
    schedule_interval='@daily',
    start_date=datetime.datetime(2022, 8, 10),
)
def test_backfill():
    time_sensor = TimeSensorAsync(
        task_id='time_sensor',
        target_time=datetime.time(0).replace(tzinfo=pendulum.UTC),  # midnight - should succeed immediately when the trigger first runs
    )

    @task
    def some_task():
        logger.info('hello')

    time_sensor >> some_task()


dag = test_backfill()


if __name__ == '__main__':
    dag.cli()

airflow dags backfill test_backfill -s 2022-08-01 -e 2022-08-04

Operating System

CentOS Stream 8

Versions of Apache Airflow Providers

None

Deployment

Other

Deployment details

Self-hosted/standalone

Anything else

I was able to reproduce this with the following configurations:

  • standalone mode + SQLite backend + SequentialExecutor
  • standalone mode + Postgres backend + LocalExecutor
  • Production deployment (self-hosted) + Postgres backend + CeleryExecutor

I have not yet found anything telling in any of the backend logs.

Possibly related:

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@Gollum999 Gollum999 added area:core kind:bug This is a clearly a bug labels Aug 10, 2022
@eladkal eladkal added area:Scheduler including HA (high availability) scheduler area:async-operators AIP-40: Deferrable ("Async") Operators affected_version:2.3 Issues Reported for 2.3 and removed area:core labels Aug 11, 2022
@potiuk
Member

potiuk commented Aug 21, 2022

Likely related to #25859

@potiuk potiuk added this to the Airflow 2.4.0 milestone Aug 21, 2022
@eladkal eladkal added the area:backfill Specifically for backfill related label Aug 31, 2022
@ashb ashb modified the milestones: Airflow 2.4.0, Airflow 2.4.1 Sep 8, 2022
4 participants