Skip to content

Conversation

@avkirilishin
Copy link
Contributor

closes: #20461

Problem: Airflow is trying to schedule tasks prior to DAG's start_date.
Solution: Added ExecDateAfterStartDateDep to the dependencies that need to be met for a given task instance to be set to 'RUNNING' state.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

@boring-cyborg boring-cyborg bot added area:CLI area:Scheduler including HA (high availability) scheduler labels Feb 19, 2022
Copy link
Contributor

@ephraimbuddy ephraimbuddy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right fix. If I understand correctly, the problem is that the scheduler is creating dagruns for those past dates and not that the task instances are being scheduled. The ExecDateAfterStartDateDep currently in place will prevent tasks from getting into queued state if the start date is before the execution date

@avkirilishin
Copy link
Contributor Author

I don't think this is the right fix. If I understand correctly, the problem is that the scheduler is creating dagruns for those past dates and not that the task instances are being scheduled. The ExecDateAfterStartDateDep currently in place will prevent tasks from getting into queued state if the start date is before the execution date

I explained it here: #20461 (comment)

I think there are two different problems:

The problem is related to the different logic of the scheduler before and after the update. Maybe there are no tasks in the running dags or something else. Can you show the rows for this dag run in dag, dag_run and task_instance?

I agree with you that it is not the right behavior for the scheduler to continue scheduling runs that are earlier than the actual start date of the DAG or Task. It can happen, for example, after turning the dag off and back on. So I made a PR to fix it: #21684

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Apr 10, 2022
@github-actions github-actions bot closed this Apr 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:CLI area:Scheduler including HA (high availability) scheduler stale Stale PRs per the .github/workflows/stale.yml policy file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Airflow is trying to schedule tasks prior to DAG's start_date

2 participants