I'd like to discuss a change proposed in PR #65856 that relaxes the blanket ban on future logical_date for manually triggered and operator-triggered DAG runs.
Background
PR #46663 removed the allow_trigger_in_future config and hardcoded a block on any DAG run whose logical_date is in the future. This affects both the scheduler (_schedule_dag_run) and the RunnableExecDateDep task dependency. The result: when a user triggers a DAG with a future logical_date, the DagRun appears as "running" but tasks do not execute until the logical_date is reached.
This is blocking our migration from Airflow 2 to 3 — we use logical_date to represent business session dates that don't align with calendar dates, and these runs need to execute immediately at trigger time.
Given that run_after is used for scheduling and logical_date is meant to "not contain any semantics, but is simply a value for logical identification," I believe it makes sense to relax the block on future logical_date. Please see the PR description for the rationale, use cases, and more details.
Proposed change
Skip the future logical_date block for MANUAL and OPERATOR_TRIGGERED run types only. Scheduled runs remain blocked. The existing run_after <= now() query filter still controls when the scheduler picks up a run, but DAG runs with a future logical_date will be able to execute immediately if run_after is set to None or now(). A "Run immediately" checkbox in the Trigger DAG UI gives users explicit control. The PR also adds run_after to TriggerDagRunOperator and TriggerDAGRunPayload so that users can control this from the operator path as well.
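To make the proposed rule concrete, here is a minimal, self-contained sketch of the decision it describes. It is not the PR's code: `RunType`, `EXEMPT_RUN_TYPES`, `can_execute`, and the `relaxed` flag are hypothetical names used only for illustration, and treating a missing run_after as "now" is an assumption based on the description above.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class RunType(Enum):
    """Hypothetical stand-in for Airflow's run type enum."""

    SCHEDULED = "scheduled"
    MANUAL = "manual"
    OPERATOR_TRIGGERED = "operator_triggered"


# Run types the proposal exempts from the future logical_date block.
EXEMPT_RUN_TYPES = frozenset({RunType.MANUAL, RunType.OPERATOR_TRIGGERED})


def can_execute(
    run_type: RunType,
    logical_date: Optional[datetime],
    run_after: Optional[datetime],
    now: datetime,
    relaxed: bool,
) -> bool:
    """Return True if tasks of this run may start at `now`.

    relaxed=False models the current blanket block;
    relaxed=True models the proposed exemption.
    """
    # Assumption: a missing run_after means "as soon as possible".
    effective_run_after = run_after if run_after is not None else now
    if effective_run_after > now:
        return False  # run_after still gates every run type
    if logical_date is None or logical_date <= now:
        return True  # nothing in the future, so nothing to block
    # Future logical_date: only exempt run types pass, and only when relaxed.
    return relaxed and run_type in EXEMPT_RUN_TYPES


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
future = now + timedelta(days=7)
print(can_execute(RunType.MANUAL, future, None, now, relaxed=False))
print(can_execute(RunType.MANUAL, future, None, now, relaxed=True))
print(can_execute(RunType.SCHEDULED, future, now, now, relaxed=True))
```

In this model run_after remains the only timing control for exempt run types, while scheduled runs keep the existing guard.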
Why I believe this is non-breaking
The default behavior is unchanged. The run_after field already controls when the scheduler picks up a run. Manual triggers via the Core API already allow setting run_after=now() with a future logical_date — this just removes the secondary block that prevents it from working.
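As an illustration of the case described here, the sketch below builds a trigger request body with run_after set to the current time and a logical_date a few days ahead. Only the two field names come from this post; the helper `build_trigger_body` is hypothetical, and the sketch deliberately makes no HTTP call and assumes nothing about the endpoint.

```python
import json
from datetime import datetime, timedelta, timezone


def build_trigger_body(logical_date: datetime, run_after: datetime) -> str:
    """Serialize the two fields discussed here as ISO 8601 strings."""
    return json.dumps(
        {
            "logical_date": logical_date.isoformat(),
            "run_after": run_after.isoformat(),
        }
    )


now = datetime.now(timezone.utc)
# Business session date three days ahead, but eligible to run right away.
body = build_trigger_body(logical_date=now + timedelta(days=3), run_after=now)
print(body)
```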
Broader question raised in review
logical_date is still used for TI prioritization in _executable_task_instances_to_queued and as a scheduling guard in _schedule_dag_run. Given that logical_date is defined as purely an identifier ("does not contain any semantics") while run_after is defined as the scheduling control, should we decouple logical_date from scheduling entirely? I think that's a separate, larger discussion; this PR intentionally avoids changing TI ordering.
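To show why the ordering question matters, here is a small sketch of how a future logical_date could affect queueing. The sort key used (descending priority weight, then ascending logical_date) is an assumption made for illustration; this post does not state the exact key used in _executable_task_instances_to_queued, and `CandidateTI` and `queue_order` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class CandidateTI:
    """Hypothetical, simplified view of a task instance awaiting a slot."""

    task_id: str
    priority_weight: int
    logical_date: datetime


def queue_order(candidates: List[CandidateTI]) -> List[CandidateTI]:
    # Assumed key: higher priority first, then older logical_date first.
    return sorted(candidates, key=lambda ti: (-ti.priority_weight, ti.logical_date))


candidates = [
    CandidateTI("manual_future", 1, datetime(2025, 6, 1, tzinfo=timezone.utc)),
    CandidateTI("scheduled_past", 1, datetime(2025, 1, 1, tzinfo=timezone.utc)),
]
print([ti.task_id for ti in queue_order(candidates)])
```

Under that assumed key, a manually triggered run with a future logical_date sorts behind equal-priority runs with earlier dates whenever slots are scarce, even though run_after would allow it to start immediately.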
Looking for feedback on whether the approach is reasonable and if the broader decoupling warrants its own discussion thread.
Thanks,
Mingjie Zhao