Add conditional logic for dataset triggering #37016

Merged

Conversation

dstandish
Contributor

@dstandish dstandish commented Jan 25, 2024

Add conditional logic for dataset-triggered dags.

This means we can schedule based on dataset1 OR dataset2.

This PR only implements the underlying classes, DatasetAny and DatasetAll. In a followup PR we will add more convenient syntax for this, specifically the | and & symbols, e.g. (dataset1 | dataset2) & dataset3.
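To make the any/all semantics concrete, here is a minimal standalone sketch. It is not the code from this PR and does not import Airflow: `Condition`, `Leaf`, `AnyOf`, and `AllOf` are hypothetical names that only mirror the roles of `DatasetAny` and `DatasetAll`, and the `evaluate` method and the URIs are assumptions made for illustration. The `|` and `&` overloads show how the follow-up syntax could map onto the same two classes.

```python
from __future__ import annotations


class Condition:
    """Hypothetical base class: a boolean condition over dataset update statuses."""

    def evaluate(self, statuses: dict[str, bool]) -> bool:
        raise NotImplementedError

    # Operator sugar, illustrating the | and & syntax described for the follow-up PR.
    def __or__(self, other: Condition) -> Condition:
        return AnyOf(self, other)

    def __and__(self, other: Condition) -> Condition:
        return AllOf(self, other)


class Leaf(Condition):
    """A single dataset, identified by its URI (illustrative stand-in)."""

    def __init__(self, uri: str) -> None:
        self.uri = uri

    def evaluate(self, statuses: dict[str, bool]) -> bool:
        # A dataset missing from the mapping counts as "not updated".
        return statuses.get(self.uri, False)


class AnyOf(Condition):
    """Satisfied when at least one child condition is satisfied."""

    def __init__(self, *children: Condition) -> None:
        self.children = children

    def evaluate(self, statuses: dict[str, bool]) -> bool:
        return any(child.evaluate(statuses) for child in self.children)


class AllOf(Condition):
    """Satisfied only when every child condition is satisfied."""

    def __init__(self, *children: Condition) -> None:
        self.children = children

    def evaluate(self, statuses: dict[str, bool]) -> bool:
        return all(child.evaluate(statuses) for child in self.children)


# Made-up URIs, for illustration only.
d1, d2, d3 = Leaf("s3://bucket/a"), Leaf("s3://bucket/b"), Leaf("s3://bucket/c")
condition = (d1 | d2) & d3

# d1 and d3 updated: (True or False) and True
ready = condition.evaluate({"s3://bucket/a": True, "s3://bucket/c": True})
# d1 and d2 updated, d3 not: (True or True) and False
not_ready = condition.evaluate({"s3://bucket/a": True, "s3://bucket/b": True})
print(ready, not_ready)  # True False
```

Nesting the two combinators is enough to express any AND/OR expression, which is presumably why the PR can ship only the two classes and leave the operator syntax for later.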

@dstandish
Contributor Author

dstandish commented Feb 2, 2024

@sunank200

  1. fix test on main PR
  2. look at the tests on the PR and determine which should be added or removed
  3. performance
    • why 10 minutes in main?
    • why slower with this PR?
    • what's going on?

Update: the performance concern appears to be invalid. The difference was due to randomness in task execution and unrelated scheduler restarts.

@sunank200 sunank200 force-pushed the add-conditional-logic-for-dataset-triggering branch 5 times, most recently from 83087fc to 3cffcf4 Compare February 7, 2024 06:24
@kaxil kaxil added this to the Airflow 2.9.0 milestone Feb 7, 2024
@sunank200 sunank200 force-pushed the add-conditional-logic-for-dataset-triggering branch 3 times, most recently from 7514d44 to ab979da Compare February 8, 2024 07:16
@sunank200
Collaborator

@sunank200

  1. fix test on main PR

  2. look at the tests on the PR and determine which should be added or removed

  3. performance

    • why 10 minutes in main?
    • why slower with this PR?
    • what's going on?

Documentation changes are done here

@dstandish dstandish changed the title DRAFT Add conditional logic for dataset triggering Add conditional logic for dataset triggering Feb 13, 2024
@dstandish dstandish marked this pull request as ready for review February 13, 2024 19:32
@dstandish dstandish force-pushed the add-conditional-logic-for-dataset-triggering branch from 870452c to 9753973 Compare February 20, 2024 22:14
@dstandish dstandish force-pushed the add-conditional-logic-for-dataset-triggering branch from 0137de6 to 71f6eba Compare February 21, 2024 17:47
dstandish and others added 5 commits February 21, 2024 09:49
Co-authored-by: Wei Lee <weilee.rx@gmail.com>
Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
@dstandish
Contributor Author

Do we plan to update object/next_run_datasets so the UI can show all of this logic?

Same for /next_run_datasets_summary. We should make sure the ready and total counts are still accurate. The total might need to change to a min and a max, so we can say something like "1 of 2-3 datasets updated".

Created issue on our board @bbovenzi
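One possible reading of the "min and max" idea quoted above, sketched as standalone Python. This is an assumption, not what Airflow's `/next_run_datasets_summary` endpoint does: here "min" is taken to be the fewest datasets that must update for the condition to be satisfied, and "max" the number of distinct datasets referenced. The tuple encoding and the function names are hypothetical.

```python
from __future__ import annotations

# Hypothetical tree encoding: a str is a single dataset name;
# ("any", [...]) and ("all", [...]) combine non-empty lists of child conditions.


def min_required(cond) -> int:
    """Fewest dataset updates that can satisfy the condition.

    Assumes each dataset appears at most once in the tree; shared
    datasets under "all" would be counted more than once.
    """
    if isinstance(cond, str):
        return 1
    op, children = cond
    counts = [min_required(child) for child in children]
    return min(counts) if op == "any" else sum(counts)


def referenced(cond) -> set[str]:
    """All distinct datasets mentioned anywhere in the condition."""
    if isinstance(cond, str):
        return {cond}
    _, children = cond
    return set().union(*(referenced(child) for child in children))


def summary(cond, updated: set[str]) -> str:
    """Render a 'ready of min-max' string under the assumptions above."""
    names = referenced(cond)
    ready = len(updated & names)
    low, high = min_required(cond), len(names)
    span = str(high) if low == high else f"{low}-{high}"
    return f"{ready} of {span} datasets updated"


# (dataset1 | dataset2) & dataset3, with only dataset1 updated so far.
cond = ("all", [("any", ["dataset1", "dataset2"]), "dataset3"])
text = summary(cond, {"dataset1"})
print(text)  # 1 of 2-3 datasets updated
```

For a plain "all" condition the min and max coincide, so the string collapses to the existing "ready of total" form.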

@dstandish dstandish merged commit f971232 into apache:main Feb 21, 2024
59 checks passed
@dstandish dstandish deleted the add-conditional-logic-for-dataset-triggering branch February 21, 2024 19:24
abhishekbhakat pushed a commit to abhishekbhakat/my_airflow that referenced this pull request Mar 5, 2024
Add conditional logic for dataset-triggered dags so that we can schedule based on dataset1 OR dataset2.

This PR only implements the underlying classes, DatasetAny and DatasetAll. In a followup PR we will add more convenient syntax for this, specifically the | and & symbols, e.g. (dataset1 | dataset2) & dataset3.

---------

Co-authored-by: Ankit Chaurasia <8670962+sunank200@users.noreply.github.com>
Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
Co-authored-by: Wei Lee <weilee.rx@gmail.com>
@ephraimbuddy ephraimbuddy added the type:new-feature Changelog: New Features label Mar 6, 2024