Render Dataset Conditions in DAG Graph view #41137

Merged
merged 2 commits into from
Jul 31, 2024

Conversation

bbovenzi
Contributor

@bbovenzi bbovenzi commented Jul 30, 2024

Previously, we rendered only a JSON object of the any/all dataset event conditions required for a DAG to run. Now we interpret those conditions and render them in the graph view with logic gates.

Datasets that actually had events will be highlighted with a different border so it's easy to see what triggered the selected run.

Also added a check to still create a dataset node when there is a dataset event, even if the getDatasets endpoint didn't return anything.

These are both workarounds. It would be best to refactor the DAG graph Python code to accept a with_datasets param, handle this logic, and include dataset aliases too. Hopefully, I can make that a follow-up PR.
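
To illustrate the idea of turning any/all conditions into gates, here is a minimal Python sketch. It is not the code from this PR (the change itself lives in the UI code); the function name `expression_to_graph`, the node ID format, and the assumed input shape (a URI string, or a one-key dict `{"any": [...]}` / `{"all": [...]}`) are all assumptions made for the example.

```python
from __future__ import annotations

from itertools import count

# Assumed mapping from condition keyword to the gate drawn in the graph.
GATES = {"any": "or-gate", "all": "and-gate"}


def expression_to_graph(expression, triggered_uris=frozenset()):
    """Flatten a nested any/all expression into gate/dataset nodes and edges.

    Hypothetical helper: `expression` is either a dataset URI string or a
    one-key dict such as {"any": [...]} whose items are expressions.
    """
    nodes: dict[str, dict] = {}
    edges: list[tuple[str, str]] = []
    gate_numbers = count(1)

    def visit(expr) -> str:
        if isinstance(expr, str):
            # Leaf: a dataset node, highlighted if it had an event for the run.
            node_id = f"dataset:{expr}"
            nodes.setdefault(
                node_id, {"type": "dataset", "highlight": expr in triggered_uris}
            )
            return node_id
        if not isinstance(expr, dict) or len(expr) != 1:
            raise ValueError(f"unsupported expression: {expr!r}")
        operator, children = next(iter(expr.items()))
        if operator not in GATES:
            raise ValueError(f"unknown operator: {operator!r}")
        gate_id = f"{GATES[operator]}:{next(gate_numbers)}"
        nodes[gate_id] = {"type": GATES[operator], "highlight": False}
        for child in children:
            # Each child (dataset or nested gate) feeds into this gate.
            child_id = visit(child)
            edges.append((child_id, gate_id))
        return gate_id

    root = visit(expression)
    return {"root": root, "nodes": nodes, "edges": edges}


graph = expression_to_graph(
    {"any": ["s3://bucket/my-task", {"all": ["2", "3"]}, {"all": ["4", "5"]}]},
    triggered_uris={"2", "3"},
)
```

In this sketch the root gate would connect to the DAG's entry point, and the `highlight` flag stands in for the different border on datasets that actually had events.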

Screenshot 2024-07-30 at 5 36 47 PM


@bbovenzi bbovenzi added this to the Airflow 2.10.0 milestone Jul 30, 2024
@boring-cyborg boring-cyborg bot added area:UI Related to UI/UX. For Frontend Developers. area:webserver Webserver related Issues labels Jul 30, 2024
@Lee-W
Member

Lee-W commented Jul 31, 2024

The UI looks great! I tested it with the following dag but couldn't see the graph. Is there anything I missed? Thanks!

from __future__ import annotations

import pendulum

from airflow import DAG
from airflow.datasets import Dataset
from airflow.decorators import task

with DAG(
    dag_id="issue_856",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    schedule=Dataset("s3://bucket/my-task") | Dataset("2") & Dataset("3") | (Dataset("4") & Dataset("5")),
    catchup=False,
    tags=["producer", "dataset"],
):

    @task
    def produce_dataset_events():
        pass

    produce_dataset_events()
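
As a reading aid for the `schedule` expression above: in Python, `&` binds more tightly than `|`, so the unparenthesized `Dataset("2") & Dataset("3")` is grouped as its own "all" condition. The sketch below uses a hypothetical stand-in class `Cond` (not Airflow's `Dataset`) so it runs without Airflow; the nested dict it builds is only an assumed way of writing the grouping down.

```python
from __future__ import annotations


def _combine(op: str, left, right) -> dict:
    """Join two expressions under `op`, extending `left` if it already uses `op`."""
    items = list(left[op]) if isinstance(left, dict) and op in left else [left]
    items.append(right)
    return {op: items}


class Cond:
    """Hypothetical stand-in for a dataset condition; only records grouping."""

    def __init__(self, expr):
        self.expr = expr

    def __or__(self, other: Cond) -> Cond:
        return Cond(_combine("any", self.expr, other.expr))

    def __and__(self, other: Cond) -> Cond:
        return Cond(_combine("all", self.expr, other.expr))


# Same operators and operand order as the schedule in the DAG above.
schedule = Cond("s3://bucket/my-task") | Cond("2") & Cond("3") | (Cond("4") & Cond("5"))
print(schedule.expr)
```

Read as logic, the schedule is "my-task OR (2 AND 3) OR (4 AND 5)", which is the shape one would expect the gates in the graph to reflect.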

Contributor

@eladkal eladkal left a comment


Nice!

@phanikumv phanikumv merged commit 16ed4df into apache:main Jul 31, 2024
48 checks passed
@phanikumv phanikumv deleted the dag_outlets branch July 31, 2024 12:59
@utkarsharma2 utkarsharma2 added the type:improvement Changelog: Improvements label Jul 31, 2024
@bbovenzi
Contributor Author

> The UI looks great! I tested it with the following dag but couldn't see the graph. Is there anything I missed? Thanks!

This is on the DAG graph, not the datasets dependency graph. Can you share a screenshot?

@Lee-W
Member

Lee-W commented Aug 1, 2024

Yep, it looks like this.
Screenshot 2024-08-01 at 9 25 32 AM

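import pendulum

from airflow import DAG
from airflow.datasets import Dataset

# Consumer DAG: scheduled when either dataset receives an event.
with DAG(
    dag_id="issue_consumer",
    start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
    schedule=Dataset("1") | Dataset("2"),
    catchup=False,
    tags=["consumer", "dataset"],
):
    ...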

@Lee-W
Member

Lee-W commented Aug 1, 2024

Oh, I finally got what you mean!
It looks super cool. Thanks @bbovenzi!
Image

5 participants