Description
Apache Airflow version
2.1.3
What happened
The DAG run fails because some task instances fail on their first try, with either a "pod name must be provided" or a "pod not found" error.
On the second try the task completes successfully and is marked as success. By then, however, the downstream tasks have already been marked "upstream failed"; they do not rerun, and the whole DAG run is marked as failed.
In some instances the first try shows no error while fetching logs, but no logs are displayed apart from the initial "reading from the log file" statement.
On the second try the logs are shown properly and the task completes successfully. The same problem still occurs: downstream tasks were marked "upstream failed" before the second try completed, and the DAG run was marked as failed.
What you expected to happen
On the first try of a task instance, Airflow should be able to fetch the worker pod logs: the "pod name must be provided" error should not occur, and the logs should not be empty.
If the first try of a task fails but retries are configured for it, the downstream tasks should not be marked "upstream failed", even if they have the same number of retries set. If the task then succeeds on its second try, the downstream tasks should run and the DAG run should not be marked as failed.
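To make the expectation concrete, here is a minimal sketch of the state logic I would expect. It is a plain-Python model, not Airflow's scheduler code: the `State` enum only mirrors Airflow's state names, `expected_downstream_state` is a hypothetical helper, and it assumes the downstream task requires all upstream tasks to succeed.

```python
from enum import Enum


class State(Enum):
    # Names mirror Airflow task instance states; this enum is a stand-in
    # for illustration, not an import from Airflow.
    SUCCESS = "success"
    FAILED = "failed"
    UP_FOR_RETRY = "up_for_retry"
    UPSTREAM_FAILED = "upstream_failed"
    WAITING = "waiting"      # hypothetical: keep waiting on upstream tasks
    RUNNABLE = "runnable"    # hypothetical: ready to be scheduled


def expected_downstream_state(upstream_states):
    """Hypothetical helper: the state a downstream task is expected to get,
    assuming it needs every upstream task to succeed."""
    # Only a terminal failure upstream should propagate as upstream_failed.
    if any(s in (State.FAILED, State.UPSTREAM_FAILED) for s in upstream_states):
        return State.UPSTREAM_FAILED
    # All upstream tasks succeeded: the downstream task can run.
    if all(s is State.SUCCESS for s in upstream_states):
        return State.RUNNABLE
    # Otherwise (for example an upstream task is up_for_retry): keep waiting.
    return State.WAITING
```

With this model, an upstream task that failed its first try but still has retries left (`up_for_retry`) keeps the downstream task waiting. Once the second try succeeds, the downstream task becomes runnable instead of staying "upstream failed".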
How to reproduce
No response
Operating System
Debian GNU/Linux
Versions of Apache Airflow Providers
No response
Deployment
Other 3rd-party Helm chart
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct