
Prevent AlreadyRunningBackfill error caused by invalid date range request #66874

Merged
choo121600 merged 4 commits into apache:main from david-parkk:fix/backfill-invalid-date-range-validation
May 20, 2026

@david-parkk
Contributor

@david-parkk david-parkk commented May 13, 2026

Summary

Fix RuntimeError when creating a backfill with from_date after to_date by adding explicit validation that raises InvalidBackfillDateRange early, before any DB operations are attempted.

Problem

When a backfill is requested with from_date after to_date (e.g. from_date=2026-05-13, to_date=2026-05-12), the following chain of failures occurs:

  1. Orphaned Backfill record blocks subsequent backfills

In this case, _validate_backfill_params() passes without error. The Backfill record is committed to the DB (session.commit()) before _get_info_list() is called. Since _get_info_list() returns an empty list for an invalid date range, a RuntimeError("No runs to create for Dag ...") is raised, but by then the Backfill record already exists in the DB with completed_at=None and no associated BackfillDagRun records.

As a result, any subsequent backfill attempt for the same Dag immediately fails with AlreadyRunningBackfill, even though no DagRuns were ever created. The scheduler's _mark_backfills_complete() does eventually clean up such orphaned records (via the created_at < initializing_cutoff guard, a 2-minute window), but until then the Dag is effectively locked for backfilling.

  2. Misleading UI state

As a consequence of the above, the UI's backfill list shows the empty backfill as still running. Any new backfill attempt triggers an "Another backfill is running" popup even though no actual runs exist, leaving users unable to identify what went wrong or when it will resolve.
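The failure chain above can be reproduced in miniature. The following is a toy sketch, not Airflow code: FakeBackfillStore, dates_in_range and create_backfill_unvalidated are hypothetical stand-ins for the Backfill table, _get_info_list() and the pre-fix _create_backfill() ordering, and the stand-in exception only borrows the AlreadyRunningBackfill name.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


class AlreadyRunningBackfill(Exception):
    """Stand-in for the Airflow exception of the same name."""


@dataclass
class FakeBackfillStore:
    """In-memory stand-in for the Backfill table (hypothetical)."""

    active: dict = field(default_factory=dict)  # dag_id -> (from_date, to_date)


def dates_in_range(from_date: date, to_date: date) -> list:
    """Toy analogue of _get_info_list(): empty when from_date > to_date."""
    days = (to_date - from_date).days
    return [from_date + timedelta(days=i) for i in range(days + 1)]


def create_backfill_unvalidated(store, dag_id, from_date, to_date):
    """Mimics the pre-fix ordering: persist first, notice the empty range later."""
    if dag_id in store.active:
        raise AlreadyRunningBackfill(f"Another backfill is running for Dag {dag_id}")
    store.active[dag_id] = (from_date, to_date)  # "committed" before planning runs
    runs = dates_in_range(from_date, to_date)
    if not runs:
        # The record stored above is never rolled back: it is now orphaned.
        raise RuntimeError(f"No runs to create for Dag {dag_id}")
    return runs


store = FakeBackfillStore()
try:
    create_backfill_unvalidated(store, "demo", date(2026, 5, 13), date(2026, 5, 12))
except RuntimeError as exc:
    print(f"first attempt: {exc}")

try:
    # A perfectly valid range is now rejected because of the orphaned record.
    create_backfill_unvalidated(store, "demo", date(2026, 5, 1), date(2026, 5, 3))
except AlreadyRunningBackfill as exc:
    print(f"second attempt: {exc}")
```

In this toy model nothing ever clears the orphaned entry; in Airflow the scheduler eventually does, as described above.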

Before

airflow backfill create --dag-id test_backfill_validation --from-date 2026-05-13 --to-date 2026-05-1

Result in the UI (screenshot): no backfill data, but a backfill-in-progress message is shown.

  File "/opt/airflow/airflow-core/src/airflow/utils/providers_configuration_loader.py", line 54, in wrapped_function
    return func(*args, **kwargs)
  File "/opt/airflow/airflow-core/src/airflow/cli/commands/backfill_command.py", line 99, in create_backfill
    _create_backfill(
  File "/opt/airflow/airflow-core/src/airflow/models/backfill.py", line 655, in _create_backfill
    raise RuntimeError(f"No runs to create for Dag {dag_id}")
RuntimeError: No runs to create for Dag test_backfill_validation

Changes

models/backfill.py

  • Added InvalidBackfillDateRange exception class to distinguish date range errors from the existing InvalidBackfillDate (which covers future date requests)
  • Added from_date > to_date check at the top of _validate_backfill_params(), before any DB access — consistent with the "fail fast on bad input" principle
  • Grouped date-related validations (from_date > to_date and future date check) together, followed by DAG structure checks (depends_on_past) and config validation. This ordering matches the general convention of validating raw inputs before inspecting DAG internals

routes/public/backfills.py

  • Added InvalidBackfillDateRange to import and both except blocks in create_backfill and create_backfill_dry_run, so it is converted to a 400 RequestValidationError
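As a rough illustration of that route-layer change, the sketch below catches both date exceptions in one except clause and turns them into a client error. Everything here is a plain-Python stand-in: BadRequest, plan_backfill and create_backfill_route are hypothetical names, and no FastAPI or Airflow classes are used.

```python
from datetime import date


class InvalidBackfillDate(ValueError):
    """Stand-in for the existing future-date exception."""


class InvalidBackfillDateRange(ValueError):
    """Stand-in for the exception added in this PR."""


class BadRequest(Exception):
    """Hypothetical HTTP 400 error; not an Airflow or FastAPI class."""

    status_code = 400

    def __init__(self, detail: str) -> None:
        super().__init__(detail)
        self.detail = detail


def plan_backfill(from_date: date, to_date: date) -> dict:
    """Hypothetical model-layer call that validates before doing any work."""
    if from_date > to_date:
        raise InvalidBackfillDateRange(
            f"from_date ({from_date.isoformat()}) must not be after "
            f"to_date ({to_date.isoformat()})."
        )
    return {"from_date": from_date.isoformat(), "to_date": to_date.isoformat()}


def create_backfill_route(from_date: date, to_date: date) -> dict:
    """Route-layer wrapper: both date exceptions become the same client error."""
    try:
        return plan_backfill(from_date, to_date)
    except (InvalidBackfillDate, InvalidBackfillDateRange) as exc:
        raise BadRequest(str(exc)) from exc


try:
    create_backfill_route(date(2026, 5, 13), date(2026, 5, 1))
except BadRequest as exc:
    print(exc.status_code, exc.detail)
```

Listing the new exception next to the existing one keeps the create and dry-run paths consistent: either kind of bad date input surfaces as a client error rather than an unhandled server error.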

After

airflow backfill create --dag-id test_backfill_validation --from-date 2026-05-13 --to-date 2026-05-1

Result in the UI (screenshot): no backfill data, and no backfill-in-progress message.

    return func(*args, **kwargs)
  File "/opt/airflow/airflow-core/src/airflow/cli/commands/backfill_command.py", line 99, in create_backfill
    _create_backfill(
  File "/opt/airflow/airflow-core/src/airflow/models/backfill.py", line 642, in _create_backfill
    _validate_backfill_params(dag, reverse, from_date, to_date, reprocess_behavior, dag_run_conf)
  File "/opt/airflow/airflow-core/src/airflow/models/backfill.py", line 279, in _validate_backfill_params
    raise InvalidBackfillDateRange(
airflow.models.backfill.InvalidBackfillDateRange: from_date (2026-05-13T00:00:00+00:00) must not be after to_date (2026-05-01T00:00:00+00:00).

Discussion

Exception message datetime format

The error message uses datetime.isoformat():

from_date (2026-05-13T00:00:00+00:00) must not be after to_date (2021-01-01T00:00:00+00:00).

This format was adopted by referencing other parts of the codebase (e.g. timetables/base.py, utils/log/file_task_handler.py), but I'm not certain it is the right convention for exception messages specifically — the existing InvalidBackfillDate does not include date values at all ("Backfill cannot be executed for future dates."). Would appreciate guidance on whether to keep the values for debuggability or simplify to a static message.
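For reference, the two options under discussion can be compared side by side. This is only a small sketch using the standard library; the static wording is a hypothetical alternative, not text from the PR.

```python
from datetime import datetime, timezone

from_date = datetime(2026, 5, 13, tzinfo=timezone.utc)
to_date = datetime(2026, 5, 1, tzinfo=timezone.utc)

# Option A: include the offending values, formatted with datetime.isoformat().
with_values = (
    f"from_date ({from_date.isoformat()}) must not be after "
    f"to_date ({to_date.isoformat()})."
)

# Option B: a static message in the style of the existing InvalidBackfillDate
# (hypothetical wording).
static = "from_date must not be after to_date."

print(with_values)
print(static)
```

Option A carries the timezone offset along with the values, which can help when the request dates were parsed from a different input format.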


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)
    claude

I'm happy to make any adjustments based on your feedback. Thank you to the maintainers for taking the time to review this contribution!

…x/backfill-invalid-date-range-validation
@choo121600 choo121600 added the ready for maintainer review Set after triaging when all criteria pass. label May 15, 2026
Member

@choo121600 choo121600 left a comment


Cool, Looks good to me :)

@choo121600 choo121600 added the backport-to-v3-2-test Mark PR with this label to backport to v3-2-test branch label May 20, 2026
@choo121600 choo121600 merged commit 1536238 into apache:main May 20, 2026
142 checks passed
@github-actions github-actions Bot added this to the Airflow 3.2.2 milestone May 20, 2026
@github-actions
Contributor

Hi maintainer, this PR was merged without a milestone set.
We've automatically set the milestone to Airflow 3.2.2 based on: backport label targeting v3-2-test
If this milestone is not correct, please update it to the appropriate milestone.

This comment was generated by Milestone Tag Assistant.

@github-actions
Contributor

Backport failed to create: v3-2-test. View the failure log Run details

Note: As of "Merging PRs targeted for Airflow 3.X", the committer who merges the PR is responsible for backporting the PRs that are bug fixes (generally speaking) to the maintenance branches.

In case of doubt, please ask in the #release-management Slack channel.

Status Branch Result
v3-2-test Commit Link

You can attempt to backport this manually by running:

cherry_picker 1536238 v3-2-test

This should apply the commit to the v3-2-test branch and leave the commit in conflict state marking
the files that need manual conflict resolution.

After you have resolved the conflicts, you can continue the backport process by running:

cherry_picker --continue

If you don't have cherry-picker installed, see the installation guide.

vatsrahul1001 pushed a commit to choo121600/airflow that referenced this pull request May 21, 2026
…te range request (apache#66874)

When a backfill is requested with from_date after to_date, the Backfill
record was committed before _get_info_list() returned an empty list, leaving
an orphaned record that blocked subsequent backfills with
AlreadyRunningBackfill until the scheduler's 2-minute cleanup ran.

Add an InvalidBackfillDateRange exception and validate from_date <= to_date
at the top of _validate_backfill_params(), before any DB operations.

(cherry picked from commit 1536238)
bbovenzi pushed a commit that referenced this pull request May 21, 2026
…te range request (#66874) (#67250)

* Fix OTel timer metrics using Gauge instead of Histogram (#64207) (#66865)

* Fix OTel timer metrics using Gauge instead of Histogram

* Use ExponentialBucketHistogramAggregation for timing metrics

* Use public API import path for ExponentialBucketHistogramAggregation and fix histogram map isolation

(cherry picked from commit b2dadd2)

Co-authored-by: namratachaudhary <namratachaudhary@users.noreply.github.com>

* [v3-2-test] Prevent AlreadyRunningBackfill error caused by invalid date range request (#66874)

When a backfill is requested with from_date after to_date, the Backfill
record was committed before _get_info_list() returned an empty list, leaving
an orphaned record that blocked subsequent backfills with
AlreadyRunningBackfill until the scheduler's 2-minute cleanup ran.

Add an InvalidBackfillDateRange exception and validate from_date <= to_date
at the top of _validate_backfill_params(), before any DB operations.

(cherry picked from commit 1536238)

---------

Co-authored-by: Rahul Vats <43964496+vatsrahul1001@users.noreply.github.com>
Co-authored-by: namratachaudhary <namratachaudhary@users.noreply.github.com>
Co-authored-by: Park Jiwon <57484954+david-parkk@users.noreply.github.com>

Labels

  • area:API: Airflow's REST/HTTP API
  • backport-to-v3-2-test: Mark PR with this label to backport to v3-2-test branch
  • ready for maintainer review: Set after triaging when all criteria pass.
