Skip to content

Use task_id as backup if no task.label#33103

Closed
bbovenzi wants to merge 1 commit intoapache:mainfrom
astronomer:Use-task-id-for-graph-node-is-no-label
Closed

Use task_id as backup if no task.label#33103
bbovenzi wants to merge 1 commit intoapache:mainfrom
astronomer:Use-task-id-for-graph-node-is-no-label

Conversation

@bbovenzi
Copy link
Contributor

@bbovenzi bbovenzi commented Aug 4, 2023

Fixes: #33093


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues labels Aug 4, 2023
@bbovenzi bbovenzi changed the title Use task_id as back for node if no label Use task_id as backup if not task.label Aug 4, 2023
@bbovenzi bbovenzi changed the title Use task_id as backup if not task.label Use task_id as backup if no task.label Aug 4, 2023
@uranusjr
Copy link
Member

uranusjr commented Aug 4, 2023

Where does label come from? Maybe we should fix that at the Python side instead.

@bbovenzi
Copy link
Contributor Author

bbovenzi commented Aug 8, 2023

@uranusjr I agree. I just don't know why label isn't being populated. @dstandish With your example dag, do you have any ideas?

@eladkal eladkal added this to the Airflow 2.7.0 milestone Aug 8, 2023
@dstandish
Copy link
Contributor

I was about to say that I think we can close this because after #33102, setting deps in a context no longer "auto adds" tasks to the context. So now the sample dag posted in #33093 will produce this graph:

image

I.e. nothing will be in the task group.

BUT...

then i added the tasks to the context with add_task:

    with TaskGroup("my_group") as tg:
        setup1 >> work1 >> teardown1
        setup1 >> teardown1
        tg.add_task(setup1)
        tg.add_task(teardown1)
        tg.add_task(work1)
    tg >> work2

And now we encounter the issue again:
image

I'll poke around and see if I can figure anything out.

@dstandish
Copy link
Contributor

Ok so the problem is this logic:

    @property
    def label(self) -> str | None:
        tg = self.task_group
        if tg and tg.node_id and tg.prefix_group_id:
            # "task_group_id.task_id" -> "task_id"
            return self.node_id[len(tg.node_id) + 1 :]
        return self.node_id

The issue is when the task is instantiated outside of the group, it doesn't get prefixed. So self.node_id[len(tg.node_id) + 1 :] is ''

@dstandish
Copy link
Contributor

dstandish commented Aug 9, 2023

An odd consequence of this is if the task group id is short, we'll actually get a partial name:
image

code to repro:

with DAG(dag_id="leaves_ignored", start_date=pendulum.now()):

    @task
    def setup1():
        ...

    @task
    def setup2():
        ...

    @task
    def teardown1():
        ...

    @task
    def teardown2():
        ...

    @task
    def work1():
        ...

    @task
    def work2():
        ...


    setup1 = setup1()
    # setup2 = setup2()
    teardown1 = teardown1()
    # teardown2 = teardown2()
    work1 = work1()
    work2 = work2()
    with TaskGroup("tg") as tg:
        setup1 >> work1 >> teardown1
        setup1 >> teardown1
        tg.add_task(setup1)
        tg.add_task(teardown1)
        tg.add_task(work1)
    tg >> work2

Maybe we should remove add_task for task group for now, wdyt @ephraimbuddy @uranusjr? It still has not been released yet.

@uranusjr
Copy link
Member

uranusjr commented Aug 9, 2023

Removing add_task sounds reasonable, I think that’s proposed once previously for another reason as well?

@ephraimbuddy
Copy link
Contributor

Removing add_task sounds reasonable, I think that’s proposed once previously for another reason as well?

Yeah. We should remove it. We previously wanted to make it add_tasks

@ephraimbuddy ephraimbuddy deleted the Use-task-id-for-graph-node-is-no-label branch August 9, 2023 20:46
@ephraimbuddy ephraimbuddy removed this from the Airflow 2.7.0 milestone Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Task ids not shown inside task group

5 participants