Airflow is trying to schedule tasks prior to DAG's start_date #20461
Comments
The scheduler will not delete DAG runs that already exist, so you have to delete them manually. I think it's safe this way.
That is fine, and it makes sense that the scheduler won't delete the old runs. However, I don't think it is right for the scheduler to keep scheduling runs that are earlier than the DAG's actual start date.
It's very strange because in 2.2.2 we have strong checks on When were these runs created? Did you upgrade Airflow after it?
@avkirilishin The runs were likely created in 2.2.0 or 2.2.1, and then Airflow was upgraded to 2.2.2.
@andreychernih I think there are two different problems:
(Sorry I missed this) I think #21011 is different. That one schedules the task (incorrectly) at
Since tasks go through
I think airflow/airflow/ti_deps/dependencies_deps.py Lines 67 to 75 in 3c4524b
What I think we should do but I have not tried any:
I think the solution you have right now will have the scheduler put those task instances in the queued state and never move them to
Can someone please give me exact reproduction steps for this, including a DAG I can run?
This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.
This issue has been closed because it has not received response from the issue author.
Apache Airflow version
2.2.2
What happened
I have a DAG whose start_date was initially 09/01/2021 but was later changed to 11/01/2021. The DAG has some runs from before 11/01/2021 that did not get a chance to finish. I can now see that the scheduler is still trying to schedule those runs. None of their tasks start, presumably because of the start_date check. This maxes out the active runs and blocks any other dates within the DAG's range from being scheduled.
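To make the blocking effect concrete, here is a toy model in plain Python. It is only a sketch and not Airflow's scheduler code: the `Run` class, `free_run_slots`, and the limit of 3 active runs are hypothetical, and the dates are assumed to be in MM/DD/YYYY format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Run:
    """Hypothetical stand-in for a DAG run (not an Airflow class)."""
    logical_date: datetime
    state: str  # e.g. "running" or "success"


def free_run_slots(runs, max_active_runs):
    """Return how many new runs could start; each running run uses one slot."""
    active = sum(1 for r in runs if r.state == "running")
    return max(0, max_active_runs - active)


# Assumed reading of the dates in the report: 11/01/2021 is November 1, 2021.
start_date = datetime(2021, 11, 1)

# Three unfinished runs created while start_date was still 09/01/2021.
runs = [Run(datetime(2021, 9, day), "running") for day in (1, 2, 3)]

# Runs that are now earlier than the DAG's start_date.
stale = [r for r in runs if r.logical_date < start_date]

# With an assumed limit of 3 active runs, the stale runs use every slot.
slots = free_run_slots(runs, max_active_runs=3)
```

In this model the stale runs never finish, so `slots` stays at zero and no run on or after the new start date can begin.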
What you expected to happen
The scheduler should not try to schedule runs that are earlier than the DAG's start_date.
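The expected behavior can be sketched as a simple filter. This is an illustration of the request rather than a proposed patch: `schedulable` is a hypothetical helper, not an Airflow function, and the dates assume MM/DD/YYYY.

```python
from datetime import datetime


def schedulable(logical_dates, start_date):
    """Keep only logical dates that fall on or after start_date."""
    return [d for d in logical_dates if d >= start_date]


# Assumed reading of the new start date: November 1, 2021.
start_date = datetime(2021, 11, 1)

candidates = [
    datetime(2021, 9, 1),    # earlier than start_date, should be skipped
    datetime(2021, 10, 31),  # earlier than start_date, should be skipped
    datetime(2021, 11, 1),   # on start_date, should be kept
    datetime(2021, 11, 2),   # later than start_date, should be kept
]

kept = schedulable(candidates, start_date)
```

Only the last two candidates survive the filter, so runs left over from the old start date would no longer compete for active-run slots.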
How to reproduce
No response
Operating System
Airflow Docker
Versions of Apache Airflow Providers
No response
Deployment
Other Docker-based deployment
Deployment details
No response
Anything else
No response