Datasets should have allowed_states parameter #49525

@karenbraganz

Description

Currently, a dataset is updated only when the producing task completes in the success state. I propose adding an allowed_states parameter to datasets, which would allow datasets to trigger the downstream consuming DAG even if the producing task is not successful. This would provide more flexibility with dataset scheduling.

Proposed Changes:

  • The consuming DAG should specify a list of allowed_states alongside each dataset used in its schedule.
  • The dataset should be updated once the producing task completes irrespective of the state of the producing task (not only when the producing task succeeds).
  • This update should trigger the consuming DAG if the state of the producing task is one of the states included in the allowed_states list.
  • allowed_states should default to the success state only, which matches the current behavior.
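
The proposed semantics can be sketched in plain Python, independent of Airflow. This is only an illustration of the matching rule described above, not an existing or planned Airflow API: DatasetSchedule, dags_to_trigger, the DAG ids, and the dataset URI are all hypothetical names, and task states are modeled as plain strings.

```python
from dataclasses import dataclass

SUCCESS = "success"
FAILED = "failed"


@dataclass(frozen=True)
class DatasetSchedule:
    """Hypothetical record: one consuming DAG scheduled on one dataset."""

    dag_id: str
    dataset_uri: str
    # Proposed default: only a successful producing task triggers the DAG.
    allowed_states: frozenset = frozenset({SUCCESS})


def dags_to_trigger(schedules, dataset_uri, producer_state):
    """Return ids of consuming DAGs to trigger for one dataset update.

    The update is recorded whatever the producing task's state is; a
    consuming DAG is triggered only if that state is in its allowed_states.
    """
    return [
        s.dag_id
        for s in schedules
        if s.dataset_uri == dataset_uri and producer_state in s.allowed_states
    ]


URI = "s3://bucket/data.csv"  # made-up dataset URI
schedules = [
    DatasetSchedule("load_report", URI),  # default: success only
    DatasetSchedule("alert_on_failure", URI, frozenset({FAILED})),
    DatasetSchedule("audit_every_run", URI, frozenset({SUCCESS, FAILED})),
]

print(dags_to_trigger(schedules, URI, SUCCESS))
print(dags_to_trigger(schedules, URI, FAILED))
```

With these three schedules, a successful producing task would trigger load_report and audit_every_run, while a failed one would trigger alert_on_failure and audit_every_run, which covers both use cases listed below.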

Use case/motivation

  • User wants consuming DAG to be triggered irrespective of the state of the producing task.
  • User wants different consuming DAGs to be triggered depending on the state of the producing task.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
