Clearing a subdag task leaves parent dag in the failed state #15374

Closed
ctimbreza opened this issue Apr 14, 2021 · 13 comments · Fixed by #15562
Labels
affected_version:2.0 Issues Reported for 2.0 area:webserver Webserver related Issues kind:bug This is a clearly a bug

Comments

@ctimbreza

ctimbreza commented Apr 14, 2021

Apache Airflow version:
2.0.1

Kubernetes version:
Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.9-eks-d1db3c", GitCommit:"d1db3c46e55f95d6a7d3e5578689371318f95ff9", GitTreeState:"clean", BuildDate:"2020-10-20T22:18:07Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

What happened:
Clearing a failed subdag task with Downstream+Recursive does not automatically set the state of the parent dag to 'running' so that the downstream parent tasks can execute.

The workaround is to manually set the state of the parent dag to 'running' after clearing the subdag task.

What you expected to happen:
With Airflow 1.10.4, the parent dag was automatically set to 'running' in this same scenario.

How to reproduce it:

  • Clear a failed subdag task, selecting the Downstream+Recursive option.
  • See that all downstream tasks in the subdag, as well as in the parent dag, have been cleared.
  • See that the parent dag is left in the 'failed' state.
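
The symptom in the steps above can be illustrated with a small toy model. This is only a sketch and does not use Airflow's scheduler code: `schedulable_tasks` is a hypothetical helper, it assumes that only a run in the 'running' state has its cleared tasks (state `None`) picked up, and it ignores task dependencies entirely.

```python
from typing import Dict, List, Optional


def schedulable_tasks(run_state: str, task_states: Dict[str, Optional[str]]) -> List[str]:
    """Toy rule (an assumption): cleared tasks are only picked up in a 'running' run."""
    if run_state != "running":
        return []
    # A cleared task has no state (None) and is waiting to be scheduled again.
    return [task_id for task_id, state in task_states.items() if state is None]


# Parent dag tasks after clearing subdag_1 with Downstream+Recursive.
parent_tasks = {"start": "success", "subdag_1": None, "subdag_2": None, "end": None}

# The parent run is still 'failed', so nothing is picked up.
stuck = schedulable_tasks("failed", parent_tasks)

# After the manual workaround (run set to 'running'), the cleared tasks are eligible.
after_workaround = schedulable_tasks("running", parent_tasks)
```

In this model the cleared tasks stay idle for as long as the parent run remains 'failed', which matches the behavior reported here.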
@ctimbreza ctimbreza added the kind:bug This is a clearly a bug label Apr 14, 2021
@boring-cyborg

boring-cyborg bot commented Apr 14, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

@jedcunningham jedcunningham added affected_version:2.0 Issues Reported for 2.0 area:webserver Webserver related Issues labels Apr 14, 2021
@jedcunningham
Member

@ctimbreza, I wasn't able to reproduce this in master naively. Is there anything in the webserver log that might be helpful? Do you have a simple DAG that reproduces it?

@ctimbreza
Author

Here is the dag that I used to reproduce the issue. Subdag_1 task_2 is configured to fail randomly to help set up the scenario for this issue.

import random

from airflow.models.dag import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.dummy import DummyOperator
# Airflow 2.0 module paths; the *_operator modules are deprecated aliases.
from airflow.operators.python import PythonOperator
from airflow.operators.subdag import SubDagOperator
from airflow.utils.dates import days_ago

access_control = {
    "dst-view": ["can_read"],
    "dst-operator": ["can_read", "can_edit"],
    "dst-admin": ["can_read", "can_edit"]
}

def random_fail(**context):
    # Fail roughly half the time so that subdag_1 can end up in the failed state.
    if random.random() > 0.5:
        raise Exception("Test failure")
    print("Success")

def __build_subdag1(parent_dag, child_dag_name):
    with DAG(dag_id='%s.%s' % (parent_dag.dag_id, child_dag_name),
             default_args=parent_dag.default_args,
             schedule_interval=parent_dag.schedule_interval,
             start_date=parent_dag.start_date,
             params=parent_dag.params,
             access_control=parent_dag.access_control,
             is_paused_upon_creation=False) as subdag:
        task_1 = DummyOperator(task_id="task_1")
        # provide_context is deprecated in Airflow 2.0; the context is always passed.
        task_2 = PythonOperator(task_id="task_2", python_callable=random_fail)
        task_1 >> task_2
    return subdag

def __build_subdag2(parent_dag, child_dag_name):
    with DAG(dag_id='%s.%s' % (parent_dag.dag_id, child_dag_name),
             default_args=parent_dag.default_args,
             schedule_interval=parent_dag.schedule_interval,
             start_date=parent_dag.start_date,
             params=parent_dag.params,
             access_control=parent_dag.access_control,
             is_paused_upon_creation=False) as subdag:
        task_1 = DummyOperator(task_id="task_1")
        task_2 = BashOperator(task_id="task_2", bash_command='sleep 15')
        task_1 >> task_2
    return subdag


with DAG(dag_id="chris-test", start_date=days_ago(2), tags=["dev"], access_control=access_control) as dag:

    start = DummyOperator(task_id="start")

    subdag_1 = SubDagOperator(
        subdag=__build_subdag1(parent_dag=dag,
                              child_dag_name="subdag_1"
                              ),
        task_id="subdag_1",
        dag=dag
    )

    subdag_2 = SubDagOperator(
        subdag=__build_subdag2(parent_dag=dag,
                              child_dag_name="subdag_2"
                              ),
        task_id="subdag_2",
        dag=dag
    )

    end = DummyOperator(task_id='end')

    start >> subdag_1 >> subdag_2 >> end

@xinbinhuang
Contributor

xinbinhuang commented Apr 15, 2021

Hi @ctimbreza, welcome to Airflow! Before solving your specific issue with SubDagOperator, I wonder if you have considered using TaskGroup for your use case instead?
It was introduced as an alternative to SubDagOperator. It is simple to use and avoids the shortcomings of SubDag.

@ephraimbuddy
Contributor

@ctimbreza looks like this issue is resolved in #14776, can you verify?

@ctimbreza
Author

@xinbinhuang Thanks for your suggestion on TaskGroup. We are looking at TaskGroup for new dags going forward. However, we do have a lot of existing dags currently running on airflow1 that we would like to move over to airflow2 as is.

@ctimbreza
Author

@ephraimbuddy Thanks for pointing me to #14776. Looks like this is targeted for release 2.0.2 which is not out yet. I spent a little time today to see if I could run airflow locally to verify and was looking at the breeze development environment.

So far it has been pretty straightforward to set up. However, after launching the environment (breeze start-airflow) and accessing the UI (http://127.0.0.1:28080/home), I cannot seem to interact with the UI. My dag is shown but none of the UI links are working.

@ephraimbuddy
Contributor

> @ephraimbuddy Thanks for pointing me to #14776. Looks like this is targeted for release 2.0.2 which is not out yet. I spent a little time today to see if I could run airflow locally to verify and was looking at the breeze development environment.
>
> So far it has been pretty straightforward to set up. However, after launching the environment (breeze start-airflow) and accessing the UI (http://127.0.0.1:28080/home), I cannot seem to interact with the UI. My dag is shown but none of the UI links are working.

There is probably no user yet. You can create one in the breeze environment by running the create user command (the email argument is required; the address here is a placeholder):
airflow users create -f firstname -l lastname -u admin -p admin -r Admin -e admin@example.com

You can drop into the breeze environment with:
./breeze

@xinbinhuang
Contributor

> @ephraimbuddy Thanks for pointing me to #14776. Looks like this is targeted for release 2.0.2 which is not out yet. I spent a little time today to see if I could run airflow locally to verify and was looking at the breeze development environment.
>
> So far it has been pretty straightforward to set up. However, after launching the environment (breeze start-airflow) and accessing the UI (http://127.0.0.1:28080/home), I cannot seem to interact with the UI. My dag is shown but none of the UI links are working.

It's likely that the frontend assets haven't been built (I ran into this issue recently). Can you try, inside breeze, cd airflow/www; yarn build prod/dev? Then start the webserver again.

@ctimbreza
Author

I tried running cd airflow/www; yarn build prod/dev but received an error that I forgot to capture.

I ended up deleting the .build dir and starting over. This time when running breeze I noticed a message saying to run './airflow/www/compile_assets.sh'. I ran this and then the UI was up and responding as expected.

Hitting a new issue now where my dag doesn't seem to be processing the tasks and I am seeing this message about the scheduler not running. I started with 'breeze start-airflow' and the scheduler looks like it is running from the console window.
[image: screenshot of the scheduler-not-running message]

Thanks everyone for the quick responses on this issue. I'll need to put this aside for now as I work on higher priority items on my side. Feel free to close the issue if you think it has already been addressed.

@xinbinhuang
Contributor

> I tried running cd airflow/www; yarn build prod/dev but received an error that I forgot to capture.

I meant either cd airflow/www; yarn build prod or cd airflow/www; yarn build dev. Sorry for the confusion. compile_assets.sh will work too.

> Hitting a new issue now where my dag doesn't seem to be processing the tasks and I am seeing this message about the scheduler not running. I started with 'breeze start-airflow' and the scheduler looks like it is running from the console window.

You need to start both the scheduler and the webserver:

# start scheduler in background; `&` means run the command in background
airflow scheduler &
# start webserver 
airflow webserver

@ctimbreza
Author

Hi folks. It looks like #14776 did not resolve this issue. I am still able to reproduce it on Airflow 2.0.2.

@zarrarrana
Contributor

@ctimbreza @ephraimbuddy
#14776 came very close to fixing this. What it is missing is that the parent dag_id needs to be added to dag_ids when include_parentdag is true, so that the state of the parent DagRun is changed as well.
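
A minimal sketch of the idea described above, assuming a simplified model rather than Airflow's actual `DAG.clear` implementation. `ToyDag`, `dag_ids_to_reset`, and `set_runs_running` are hypothetical names introduced for illustration only; the point is that the parent's dag_id has to be in the collected set for its run state to be reset.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ToyDag:
    """Hypothetical stand-in for a DAG; not Airflow's DAG class."""
    dag_id: str
    parent: Optional["ToyDag"] = field(default=None, repr=False, compare=False)
    subdags: List["ToyDag"] = field(default_factory=list)


def dag_ids_to_reset(dag: ToyDag, include_subdags: bool, include_parentdag: bool) -> Set[str]:
    """Collect the dag_ids whose runs should go back to 'running' after a clear."""
    ids = {dag.dag_id}
    if include_subdags:
        stack = list(dag.subdags)
        while stack:
            sub = stack.pop()
            ids.add(sub.dag_id)
            stack.extend(sub.subdags)
    # The step described as missing: also include the parent dag_id.
    if include_parentdag and dag.parent is not None:
        ids.add(dag.parent.dag_id)
    return ids


def set_runs_running(run_states: Dict[str, str], dag_ids: Set[str]) -> None:
    """Set the run state to 'running' for every collected dag_id."""
    for dag_id in dag_ids:
        if dag_id in run_states:
            run_states[dag_id] = "running"


parent = ToyDag("chris-test")
subdag = ToyDag("chris-test.subdag_1", parent=parent)
parent.subdags.append(subdag)

# Without the parent dag_id, the parent run stays 'failed'.
without_parent = {"chris-test": "failed", "chris-test.subdag_1": "failed"}
set_runs_running(without_parent, dag_ids_to_reset(subdag, True, False))

# With the parent dag_id included, both runs are set to 'running'.
with_parent = {"chris-test": "failed", "chris-test.subdag_1": "failed"}
set_runs_running(with_parent, dag_ids_to_reset(subdag, True, True))
```

The sketch only adds the direct parent; how Airflow handles deeper nesting is not covered in this thread.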

kaxil pushed a commit that referenced this issue Apr 30, 2021
Closes: #15374
This pull request follows #14776. 

Clearing a subdag with Downstream+Recursive does not automatically set the state of the parent dag so that the downstream parent tasks can execute.
potiuk pushed a commit that referenced this issue May 9, 2021
Closes: #15374
This pull request follows #14776.

Clearing a subdag with Downstream+Recursive does not automatically set the state of the parent dag so that the downstream parent tasks can execute.

(cherry picked from commit a4211e2)
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Sep 17, 2021
Closes: apache/airflow#15374
This pull request follows apache/airflow#14776.

Clearing a subdag with Downstream+Recursive does not automatically set the state of the parent dag so that the downstream parent tasks can execute.

GitOrigin-RevId: a4211e276fce6521f0423fe94b01241a9c43a22c
Further commits with the same message and the same GitOrigin-RevId (a4211e276fce6521f0423fe94b01241a9c43a22c) referenced this issue:
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Sep 23, 2021
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Nov 27, 2021
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Mar 10, 2022
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Jun 4, 2022
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Jul 9, 2022
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Aug 27, 2022
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Oct 4, 2022
aglipska pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Oct 7, 2022
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Dec 7, 2022
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Jan 27, 2023
kosteev pushed a commit to kosteev/composer-airflow-test-copybara that referenced this issue Sep 12, 2024
kosteev pushed a commit to kosteev/composer-airflow-test-copybara that referenced this issue Sep 13, 2024
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Sep 17, 2024
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Nov 7, 2024