Introduce DatasetOrTimeSchedule #36710

Merged
merged 9 commits into from Feb 1, 2024


uranusjr
Member

@uranusjr uranusjr commented Jan 10, 2024

This special timetable allows a DAG to be run against a time-based schedule and dataset events at the same time. The logic is nothing special: scheduled runs are created based on a time-based timetable, and dataset-triggered runs are created when dataset events happen. The two do not interact in any way.
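
To make the "two independent triggers" idea concrete, here is a rough standalone sketch. It is not the code in this PR and does not import Airflow; `ToyDatasetOrTimeSchedule`, its fields, and its methods are hypothetical names made up for illustration. It also assumes that a dataset-triggered run fires only once every listed dataset has been updated, which the description above does not spell out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ToyDatasetOrTimeSchedule:
    """Hypothetical stand-in for illustration; not the class added in this PR."""

    start: datetime
    interval: timedelta  # stands in for the time-based timetable
    datasets: frozenset  # names of the datasets the DAG listens to
    _updated: set = field(default_factory=set, init=False)

    def __post_init__(self):
        # A non-positive interval would make scheduled_runs() loop forever.
        if self.interval <= timedelta(0):
            raise ValueError("interval must be positive")

    def scheduled_runs(self, until: datetime) -> list:
        """Time-based runs: depend only on the clock, never on dataset state."""
        runs, current = [], self.start
        while current <= until:
            runs.append(current)
            current += self.interval
        return runs

    def on_dataset_event(self, name: str) -> bool:
        """Dataset-triggered runs: return True when a run should be created.

        Assumption: a run fires once every listed dataset has been updated,
        after which the bookkeeping resets.
        """
        if name not in self.datasets:
            return False
        self._updated.add(name)
        if self._updated == self.datasets:
            self._updated.clear()
            return True
        return False
```

Calling `on_dataset_event` never changes what `scheduled_runs` returns, and vice versa, which is the whole point: the two kinds of run are tracked side by side without influencing each other.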

I may refactor this later so a DAG can take more complex combinations (e.g. dataset1 OR dataset2 OR timetable), but this is the simplest thing for the most common use case of “rerun the DAG whenever things update, but also periodically”.

Please suggest a better name for this class.

@phanikumv phanikumv changed the title Introduce DatasetTimetable Introduce DatasetOrTimeSchedule Jan 29, 2024
@phanikumv phanikumv changed the title Introduce DatasetOrTimeSchedule Introduce DatasetOrTimeSchedule Jan 29, 2024
@dstandish
Contributor

dstandish commented Jan 31, 2024

@sunank200 perhaps (either in this PR or a follow-up) we should update the UI so that the home page shows the next run time instead of "0 of 2 datasets updated" (as an example), or something special to indicate that there are both kinds of trigger. We should also update the alt text to be more accurate.

image

@sunank200 sunank200 force-pushed the dataset-timetable branch 2 times, most recently from 9f22bbe to b63cd00 Compare January 31, 2024 06:48
uranusjr and others added 9 commits February 1, 2024 15:58
@phanikumv phanikumv merged commit fb27898 into apache:main Feb 1, 2024
57 checks passed
@phanikumv phanikumv deleted the dataset-timetable branch February 1, 2024 13:39
@ephraimbuddy ephraimbuddy added this to the Airflow 2.9.0 milestone Feb 19, 2024
@ephraimbuddy ephraimbuddy added the type:new-feature Changelog: New Features label Feb 19, 2024
abhishekbhakat pushed a commit to abhishekbhakat/my_airflow that referenced this pull request Mar 5, 2024
* Introduce DatasetTimetable

Co-authored-by: Ankit Chaurasia <8670962+sunank200@users.noreply.github.com>
Co-authored-by: Daniel Standish <15932138+dstandish@users.noreply.github.com>