Filter Datasets by associated dag_ids (GET /datasets) #37512

Merged
15 commits merged on Feb 21, 2024

Satoshi-Sh
Contributor

Description

Added a dag_ids query parameter to the GET /datasets endpoint. Updated the documentation and unit tests accordingly.
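
For illustration, a client might build the filtered request roughly as follows. This is a minimal sketch, not taken from the PR: the base URL is a placeholder, the `datasets_url` helper is hypothetical, and it assumes multiple DAG IDs are passed as a single comma-separated `dag_ids` value.

```python
from urllib.parse import urlencode

# Placeholder base URL (assumption); adjust to your own deployment.
BASE_URL = "http://localhost:8080/api/v1"


def datasets_url(dag_ids, base_url=BASE_URL):
    """Build a GET /datasets URL filtered by one or more DAG IDs.

    Assumption: the endpoint accepts dag_ids as one comma-separated value.
    """
    # urlencode percent-encodes the comma, which is still a valid query string.
    query = urlencode({"dag_ids": ",".join(dag_ids)})
    return f"{base_url}/datasets?{query}"


if __name__ == "__main__":
    print(datasets_url(["dag1", "dag2"]))
```

The resulting URL can be passed to any HTTP client together with whatever authentication the deployment requires.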

Related Issue

closes #37423



@boring-cyborg bot added the area:API, area:UI, and area:webserver labels Feb 18, 2024
@uranusjr
Member

I wonder if it’s clear enough to call this just dag_ids, or whether a more descriptive name should be used, say consuming_dag_ids.

@Satoshi-Sh
Contributor Author

The dag_ids can come from either consuming_dags or producing_tasks.

We could have two separate query parameters, one for consuming_dags and one for producing_tasks. For now, I have combined them into a single dag_ids parameter.

@bbovenzi
Contributor

> I wonder if it’s clear enough to call this just dag_ids, or whether a more descriptive name should be used, say consuming_dag_ids.

My use case is that I want any datasets connected to a single DAG. But I am indifferent as to whether that is a single param dag_ids or whether I need to pass the dag_id twice, in consuming_dag_ids and producing_dag_ids. I guess the latter is the most flexible.

Contributor

@bbovenzi bbovenzi left a comment

Marking as "request changes" to make sure we don't accidentally merge.

@jedcunningham
Member

> My use case is that I want any datasets connected to a single DAG. But I am indifferent as to whether that is a single param dag_ids or whether I need to pass the dag_id twice, in consuming_dag_ids and producing_dag_ids. I guess the latter is the most flexible.

I think having both makes sense. In your use case, if we only had consuming/producing, you'd have to hit the endpoint twice (they'd be AND'd together, not OR'd).

So maybe we start with the simple dag_ids and make clear in the description that it matches datasets for either consuming or producing DAGs. Leave the more granular filters for another PR / future need?
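
To make the AND vs. OR point concrete, here is a small in-memory sketch. It is not the Airflow implementation (which queries the database); the `Dataset` class, its fields, and both filter functions are hypothetical and exist only to illustrate the semantics discussed above. The granular parameters are not part of this PR.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical stand-in for a dataset and the DAGs connected to it."""
    uri: str
    consuming_dag_ids: set = field(default_factory=set)
    producing_dag_ids: set = field(default_factory=set)


def filter_by_dag_ids(datasets, dag_ids):
    """Combined filter: keep a dataset if any given DAG consumes OR produces it."""
    wanted = set(dag_ids)
    return [
        d for d in datasets
        if wanted & (d.consuming_dag_ids | d.producing_dag_ids)
    ]


def filter_granular(datasets, consuming_dag_ids=None, producing_dag_ids=None):
    """Hypothetical granular filters: every filter that is supplied must match (AND)."""
    result = []
    for d in datasets:
        if consuming_dag_ids is not None and not set(consuming_dag_ids) & d.consuming_dag_ids:
            continue
        if producing_dag_ids is not None and not set(producing_dag_ids) & d.producing_dag_ids:
            continue
        result.append(d)
    return result


DATASETS = [
    Dataset("s3://a", consuming_dag_ids={"dag1"}),
    Dataset("s3://b", producing_dag_ids={"dag1"}),
    Dataset("s3://c", consuming_dag_ids={"dag2"}),
]

if __name__ == "__main__":
    # One call with the combined filter returns both datasets connected to dag1.
    print([d.uri for d in filter_by_dag_ids(DATASETS, ["dag1"])])
    # Passing dag1 to both granular filters at once matches nothing here,
    # because no single dataset is both consumed and produced by dag1.
    print([d.uri for d in filter_granular(DATASETS, ["dag1"], ["dag1"])])
```

With only the granular filters, the "everything connected to one DAG" use case would need two calls (one per filter) whose results are then merged, which is why starting with the combined dag_ids is the simpler option.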

@bbovenzi bbovenzi added this to the Airflow 2.9.0 milestone Feb 20, 2024
Co-authored-by: Brent Bovenzi <brent.bovenzi@gmail.com>
Contributor

@bbovenzi bbovenzi left a comment

Tested locally, works great. Thanks for picking this up!

@potiuk potiuk merged commit fae6310 into apache:main Feb 21, 2024
57 checks passed
@Satoshi-Sh Satoshi-Sh deleted the feat/#37423/filter_datasets_by_dag_id branch February 21, 2024 16:01
abhishekbhakat pushed a commit to abhishekbhakat/my_airflow that referenced this pull request Mar 5, 2024
@ephraimbuddy added the type:improvement label Mar 6, 2024
Labels
area:API, area:UI, area:webserver, type:improvement
Development

Successfully merging this pull request may close these issues.

Filter datasets by dag_id in rest API
8 participants