Allow future logical_date for manual and operator-triggered DAG runs #65856

Open
ZhaoMJ wants to merge 4 commits into apache:main from ZhaoMJ:allow-manual-future-logical-date

@ZhaoMJ ZhaoMJ commented Apr 25, 2026

Summary

PR #46663 removed the allow_trigger_in_future config and hardcoded a blanket ban on future logical_date. Both the scheduler (_schedule_dag_run) and the RunnableExecDateDep task dependency unconditionally reject any DAG run whose logical_date is in the future, regardless of run type. The result: when a user triggers a DAG with a future logical_date, the DagRun appears as "running" but tasks never execute — and the run can't be deleted until the future date either.

This change relaxes the restriction for manually triggered and operator-triggered runs, allowing them to execute immediately even with a future logical_date. Scheduled runs with a future logical_date are still blocked. A new "Run immediately" checkbox in the Trigger DAG UI gives users explicit control over whether a future-dated run executes immediately or waits until the logical date. The default behavior is unchanged — existing workflows are not affected.

Key insight: The API already has separate logical_date and run_after fields. run_after controls when the scheduler picks up the run (WHERE run_after <= NOW()). Setting run_after=now() with logical_date=future should work — but a hard block in the scheduler and a task dep (RunnableExecDateDep) prevented it.
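As a rough illustration of that gate, the sketch below models the `run_after <= NOW()` pickup filter in plain Python. The `Run` class and `eligible_runs` helper are hypothetical stand-ins for illustration, not Airflow's `DagRun` model or the scheduler query.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Run:
    """Toy stand-in for a DAG run (hypothetical, not Airflow's DagRun)."""

    run_id: str
    logical_date: datetime
    run_after: datetime


def eligible_runs(runs: list[Run], now: datetime) -> list[Run]:
    """Mimic `WHERE run_after <= NOW()`; logical_date is never consulted."""
    return [r for r in runs if r.run_after <= now]


now = datetime(2026, 4, 26, 16, 0, tzinfo=timezone.utc)
tomorrow = now + timedelta(days=1)

runs = [
    # Future logical_date, but eligible right away.
    Run("run_immediately", logical_date=tomorrow, run_after=now),
    # Future logical_date that should wait until the date arrives.
    Run("wait_for_date", logical_date=tomorrow, run_after=tomorrow),
]

picked = [r.run_id for r in eligible_runs(runs, now)]
```

Under this model only `run_immediately` is picked up, which is the behavior the hard block in the scheduler and task dep currently prevents.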

Motivation

The rename from execution_date to logical_date in Airflow 3 reflects a deliberate design shift: logical_date is a logical organizer for DAG runs, not a constraint on when tasks execute. Blocking future values undermines this.

This is blocking our migration from Airflow 2 to 3. In Airflow 2, allow_trigger_in_future=True supported our workflows. That config was removed in Airflow 3 with no replacement, breaking existing production patterns.

Suggested workarounds don't work

In #46663, it was suggested: "Instead of using a future date, you can trigger with a None logical date whenever you want. A custom run_id can be supplied if you want it. If a date is needed, it can be passed as a DAG param instead."

However:

  • There is currently no option in the UI to set logical_date to None. The Trigger DAG form always pre-fills logical date with now() and provides no way to clear it.
  • Passing the date as a DAG param is not equivalent because:
    • Template macros like {{ logical_date }} are used extensively throughout task definitions
    • The Calendar view positions runs by logical_date — params are invisible there
    • UI filtering and sorting by date relies on logical_date, not params

Use cases

  • Custom session date models: Organizations that maintain their own concept of "session date" (e.g., triggering a Monday run on Friday night) where the session date maps directly to Airflow's logical_date. These runs should execute immediately at trigger time, not wait until the logical date arrives. This is our primary use case — we model business session dates that don't align with calendar dates, and logical_date is the natural representation for this in Airflow.

  • Cross-timezone event-driven ETL: DAGs triggered by external webhooks that serve data pipelines across multiple timezones. The logical_date represents the business date being processed, not when the task should execute. For Asia-Pacific timezones (e.g., Asia/Tokyo, Australia/Sydney), the business day rolls over hours before UTC midnight. When a webhook fires at 16:00 UTC to process the next business day's data for Tokyo (already April 27 in JST while still April 26 in UTC), the logical_date must be set to the next UTC day, which is a future date from the server's perspective.

  • Pre-staging and testing: Trigger a run for a future date to verify pipelines ahead of time (e.g., end-of-month processing, quarterly reports).

  • Forecast/forward-looking data: Pipelines that process weather forecasts, financial projections, or other data indexed by future dates naturally need future logical_date values.

  • Calendar view positioning: The Calendar view positions runs by logical_date. Without future logical_date support, manually triggered runs always cluster at today's date instead of appearing at their intended position.
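The cross-timezone case can be checked with a little date arithmetic. This is a minimal sketch that assumes the business date is mapped to midnight UTC of the same calendar day; `business_logical_date` is a hypothetical helper, and JST is modeled as a fixed UTC+9 offset (Japan observes no daylight saving time).

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Fixed offset is enough for Japan, which has no DST.
JST = timezone(timedelta(hours=9), name="JST")


def business_logical_date(fired_at_utc: datetime, business_tz: tzinfo) -> datetime:
    """Return midnight UTC of the business-local calendar date (assumed mapping)."""
    local_day = fired_at_utc.astimezone(business_tz).date()
    return datetime(local_day.year, local_day.month, local_day.day, tzinfo=timezone.utc)


# Webhook fires at 16:00 UTC on April 26, which is 01:00 on April 27 in Tokyo.
fired_at = datetime(2026, 4, 26, 16, 0, tzinfo=timezone.utc)
logical_date = business_logical_date(fired_at, JST)

# The resulting logical_date is 8 hours ahead of the server clock.
is_future = logical_date > fired_at
```

With this mapping the webhook at 16:00 UTC produces a logical_date of 2026-04-27 00:00 UTC, which is in the future relative to the trigger time.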

Changes

Scheduler and task dependency (commits 1–2)

  • Scheduler (_schedule_dag_run): skip future logical_date block for MANUAL and OPERATOR_TRIGGERED run types (the run_after <= now() query filter is the primary gate)
  • Task dep (RunnableExecDateDep): skip future-date check for MANUAL and OPERATOR_TRIGGERED run types
  • UI (TriggerDAGForm.tsx, types.ts, useTrigger.ts): add "Run immediately" checkbox that appears when logical date is in the future. When unchecked, run_after = logicalDate so the run waits; when checked, run_after defaults to now() and the run executes immediately
  • Locales: add translated runImmediately string for all 21 locales
  • Tests: add operator-triggered future logical_date tests for both scheduler and task dep; existing manual and scheduled tests remain
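The relaxed check in the scheduler and task dep can be summarized as a single predicate. The sketch below is a simplified model, not the actual Airflow code: `RunType` is a toy enum standing in for `DagRunType`, and `blocked_by_future_logical_date` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class RunType(Enum):
    """Toy stand-in for Airflow's DagRunType (only the members needed here)."""

    SCHEDULED = "scheduled"
    MANUAL = "manual"
    OPERATOR_TRIGGERED = "operator_triggered"


# Run types exempted from the future logical_date block by this PR.
EXEMPT_RUN_TYPES = (RunType.MANUAL, RunType.OPERATOR_TRIGGERED)


def blocked_by_future_logical_date(
    logical_date: Optional[datetime], run_type: RunType, now: datetime
) -> bool:
    """True when a run must wait because its logical_date is still in the future."""
    return (
        logical_date is not None
        and logical_date > now
        and run_type not in EXEMPT_RUN_TYPES
    )


now = datetime(2026, 4, 26, 16, 0, tzinfo=timezone.utc)
future = now + timedelta(days=1)

scheduled_blocked = blocked_by_future_logical_date(future, RunType.SCHEDULED, now)
manual_blocked = blocked_by_future_logical_date(future, RunType.MANUAL, now)
operator_blocked = blocked_by_future_logical_date(future, RunType.OPERATOR_TRIGGERED, now)
```

Scheduled runs with a future logical_date remain blocked, while manual and operator-triggered runs fall through to the `run_after` gate.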

Optional run_after on TriggerDagRunOperator (commit 3)

Adds an optional run_after parameter to TriggerDagRunOperator, threaded through the full chain: operator → DagRunTriggerException → task runner → supervisor → SDK client → Execution API → trigger_dag().

Default behavior is unchanged:

  • When neither logical_date nor run_after is set: run_after=None → trigger_dag() defaults to utcnow() → run executes immediately (same as before)
  • When logical_date is explicitly set but run_after is not: run_after defaults to logical_date, so the run waits until that time (same as before — previously enforced by RunnableExecDateDep, now by run_after)
  • When run_after is explicitly set: user controls exactly when the run becomes eligible
  • When run_after=None is explicitly set with a future logical_date: forces immediate execution
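The four cases above can be sketched as one resolution function. This is a simplified model under stated assumptions: `resolve_run_after` is a hypothetical helper that collapses the operator-side and API-side defaulting into one place, the `NOTSET` sentinel is re-created locally rather than imported from Airflow, and `logical_date=None` stands for "not set".

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Union


class _NotSet:
    """Local sentinel meaning 'argument was not passed' (stand-in for NOTSET)."""

    def __repr__(self) -> str:
        return "NOTSET"


NOTSET = _NotSet()


def resolve_run_after(
    logical_date: Optional[datetime],
    run_after: Union[datetime, None, _NotSet],
    now: datetime,
) -> datetime:
    """Return the effective run_after for a triggered run (assumed semantics)."""
    if isinstance(run_after, _NotSet):
        # Not passed: wait for logical_date if one was given, else run now.
        return logical_date if logical_date is not None else now
    if run_after is None:
        # Explicit None: force immediate execution, even with a future logical_date.
        return now
    # Explicit value: the user controls when the run becomes eligible.
    return run_after


now = datetime(2026, 4, 26, 16, 0, tzinfo=timezone.utc)
future = now + timedelta(days=1)

neither_set = resolve_run_after(None, NOTSET, now)
logical_only = resolve_run_after(future, NOTSET, now)
explicit_value = resolve_run_after(future, now + timedelta(hours=2), now)
explicit_none = resolve_run_after(future, None, now)
```

The sentinel is what lets "not passed" (wait until logical_date) be distinguished from an explicit `None` (run now).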

Files changed:

  • TriggerDagRunOperator: new run_after param with NOTSET default, added to template_fields
  • DagRunTriggerException: new run_after field
  • Task runner (_handle_trigger_dag_run): passes run_after to TriggerDagRun message
  • Supervisor: passes run_after to client.dag_runs.trigger()
  • SDK client (DagRunOperations.trigger): new run_after param, included in TriggerDAGRunPayload
  • Execution API model (TriggerDAGRunPayload): new run_after field
  • Execution API route: passes run_after to trigger_dag()
  • Execution API versioning: AddRunAfterToTriggerPayload version change for backward compat
  • SDK generated models: run_after field added to TriggerDAGRunPayload

related: #46663
related: #60443

"Run immediately" checkbox

Screenshot 2026-04-26 at 6 57 17 AM

Calendar view with future logical_date

Screenshot 2026-04-26 at 5 50 21 AM

Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (Claude Opus 4.6)

Generated-by: Claude Code (Claude Opus 4.6) following the guidelines

PR apache#46663 removed the `allow_trigger_in_future` config and hardcoded
a blanket ban on future `logical_date`. Both the scheduler and the
`RunnableExecDateDep` task dependency unconditionally reject any DAG
run whose `logical_date` is in the future, regardless of run type.

There are legitimate use cases (testing, pre-staging runs, Calendar
view organization) where users want to manually trigger a DAG with a
future `logical_date`. This change only blocks future `logical_date`
for scheduled runs, not manual ones.

Changes:
- Scheduler (`_schedule_dag_run`): skip future-date check for MANUAL runs
- Task dep (`RunnableExecDateDep`): skip future-date check for MANUAL runs
- Tests: updated existing tests, added scheduler-level tests for both
  manual (allowed) and scheduled (still blocked) future logical_date
@ZhaoMJ ZhaoMJ requested review from XD-DENG and ashb as code owners April 25, 2026 21:49
@boring-cyborg boring-cyborg Bot added the area:Scheduler including HA (high availability) scheduler label Apr 25, 2026
@ZhaoMJ ZhaoMJ marked this pull request as draft April 25, 2026 22:27
The previous commit removed the scheduler/task-dep hard block for manual
runs with a future logical_date. This extends the same treatment to
operator-triggered runs (TriggerDagRunOperator) and adds a UI checkbox
so users can choose whether a future-dated manual run executes
immediately or waits until the logical date.

- Scheduler: also skip the future logical_date block for
  OPERATOR_TRIGGERED runs (run_after query filter is sufficient)
- Task dep (RunnableExecDateDep): skip future-date check for both
  MANUAL and OPERATOR_TRIGGERED run types
- UI: show "Run immediately" checkbox when logical date is in the
  future; when unchecked, set run_after=logicalDate so the run waits
- Add translated strings for all 21 locales
- Add tests for operator-triggered runs in both scheduler and dep tests
@ZhaoMJ ZhaoMJ changed the title from "Allow manually triggered DAG runs with future logical_date" to "Allow future logical_date DAG runs to execute immediately" Apr 25, 2026
@ZhaoMJ ZhaoMJ marked this pull request as ready for review April 25, 2026 23:14
@ZhaoMJ ZhaoMJ marked this pull request as draft April 26, 2026 00:18
Thread `run_after` through the full operator → exception → task runner →
supervisor → SDK client → execution API chain so users can control when
an operator-triggered DAG run becomes eligible for scheduling.

When `logical_date` is explicitly set but `run_after` is not, `run_after`
defaults to `logical_date` so the run waits until that time. When neither
is set, `run_after` is left to the API server default (`utcnow()`).
Setting `run_after=None` explicitly forces immediate execution even with
a future `logical_date`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ZhaoMJ ZhaoMJ changed the title from "Allow future logical_date DAG runs to execute immediately" to "Allow future logical_date for manual and operator-triggered DAG runs" Apr 26, 2026
@ZhaoMJ ZhaoMJ marked this pull request as ready for review April 26, 2026 01:05
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

@SameerMesiah97 SameerMesiah97 left a comment

I can understand the rationale behind this PR, but this looks like a partial implementation toward decoupling logical_date from scheduling and execution semantics.

Right now, the scheduler explicitly uses logical_date as an ordering criterion when selecting task instances to queue (e.g. ORDER BY priority_weight, logical_date). This means the system assumes that execution progresses in logical_date order. I have left a comment where I go into this in a bit more detail, but my point is that implementing this fully may require removing logical_date from scheduling order and redefining how task prioritization works.

Have you discussed this with, or sought alignment from, the maintainers?

dag_run.logical_date
and dag_run.logical_date > timezone.utcnow()
and dag_run.run_type not in (DagRunType.MANUAL, DagRunType.OPERATOR_TRIGGERED)
):

This is not the only place where logical_date is being used as an invariant. If you look at other parts of scheduler_job_runner.py (specifically _executable_task_instances_to_queued), you will see that task instances are ordered by logical_date before starvation filters are applied. This means that if we allow a manually triggered run with a future logical_date to start executing early, it can still be deprioritized relative to earlier logical dates during scheduling.

So instead of being fully unblocked, these runs may end up in a state where they start but then make slower or inconsistent progress depending on what other runs exist. More generally, we now allow execution to violate logical ordering, while the scheduler still uses logical_date to enforce it.
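To make the ordering concern concrete, here is a toy model of the queueing order. It assumes higher `priority_weight` sorts first and earlier `logical_date` breaks ties; `QueuedTask` and `queue_order` are hypothetical names, and the real query in `_executable_task_instances_to_queued` involves more criteria and filters than shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class QueuedTask:
    """Toy stand-in for a schedulable task instance (hypothetical)."""

    label: str
    priority_weight: int
    logical_date: datetime


def queue_order(tasks: list[QueuedTask]) -> list[QueuedTask]:
    """Sort by priority_weight (higher first), then logical_date (earlier first)."""
    return sorted(tasks, key=lambda t: (-t.priority_weight, t.logical_date))


tasks = [
    QueuedTask("manual_future", 1, datetime(2026, 5, 1, tzinfo=timezone.utc)),
    QueuedTask("scheduled_today", 1, datetime(2026, 4, 26, tzinfo=timezone.utc)),
    QueuedTask("scheduled_yesterday", 1, datetime(2026, 4, 25, tzinfo=timezone.utc)),
]

ordered = [t.label for t in queue_order(tasks)]
```

At equal priority, the task from the future-dated manual run sorts last, so under slot pressure it would be the last to be queued.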

Author

@ZhaoMJ ZhaoMJ Apr 26, 2026

Thanks for the feedback, interesting observation! I haven't sought alignment from the maintainers in this regard. From my (limited) understanding, the direction is to pivot from logical_date to run_after entirely in scheduling, but I think it's a separate concern from this PR.

The goal of this PR is to add a non-breaking and non-confusing option to allow immediate execution of future logical date. I'm afraid changing TI scheduling order to use run_after instead of logical_date would affect existing workflows, and would be too impactful, risky and breaking.

The mismatch between run_after affecting dagrun scheduling decisions and logical_date affecting task prioritization already exists today — I'm just extending it to future logical dates.

That said, decoupling logical_date from scheduling and execution semantics feels like a separate meaningful discussion if the community wants to pursue it, and I'm happy to follow up on it.


Regarding the specific code here — I'm not quite sure how this changes anything regarding logical ordering violation. I believe you can already specify mismatched run_after and logical_date using the API today and potentially allow a dagrun with an earlier logical_date to execute later (than another one with a later logical_date but an earlier run_after.)

Besides, per our current definition of logical_date:

A date-time that logically identifies the current Dag run. This value does not contain any semantics, but is simply a value for identification.

I'd argue that the "logical" is really to the user rather than to Airflow internals, and it shouldn't be applied to ordering if not for backward compatibility reasons.

That said, looking at this part again and considering that run_after is defined as "A pendulum.DateTime instance that tells the scheduler when the Dag run can be scheduled", I feel like maybe we shouldn't be checking logical_date here at all.

@jscheffl
Contributor

I assume such a change requires a small [DISCUSS] thread on the devlist, as it changes the semantics of scheduling and can have a couple of side effects. No opinion on my side, and I did not look at the code.

@ZhaoMJ
Author

ZhaoMJ commented Apr 27, 2026

Thanks @jscheffl — I didn't start a discussion thread first because I believed the changes are not breaking. That said, I agree it is a good idea to have some discussion first, given the side effects and that the PR is more complex than I thought.

I've started a discussion thread: #65949. (I also sent an email to the devlist, but that didn't seem to go through for some reason.) Happy to incorporate feedback from there.

@jscheffl
Contributor

I've started a discussion thread: #65949.

Discussions in GitHub are "just support"; they are not real devlist discussions, and many contributors will miss them. If you don't see the email on the devlist, you might need to subscribe yourself?

@Lee-W
Member

Lee-W commented Apr 30, 2026

Yep, agree with @jscheffl: we'll need a dev mailing list discussion for this. I'm a bit surprised by the change. You can join the dev list by following the instructions here: https://airflow.apache.org/community/

@ZhaoMJ
Author

ZhaoMJ commented Apr 30, 2026

@Lee-W @jscheffl Thanks for the reply! I finally managed to subscribe and send the discussion email to the devlist after switching from Outlook to Gmail. Looking forward to the discussion there: https://lists.apache.org/thread/5j6ljfq6tdyo2jy63dvr209hthwhbykl

Labels

area:Scheduler including HA (high availability) scheduler

4 participants