Add DAG "Skipped" status in case of "startDate" in the future while manually triggering a DAGRun #23911

Open
2 tasks done
omarsmak opened this issue May 25, 2022 · 6 comments
Labels
kind:feature Feature Requests

Comments

@omarsmak
Member

Description

Currently, if we set a startDate in the future (say, 2029) with backfill disabled, manually triggering a DAGRun sets its status to "Success" even though no tasks have run because of the future startDate. This can mislead the user into thinking the run actually executed. I therefore think it would make sense to add a new DAGRun status for this case, such as "Skipped" or something similar, that indicates the run was skipped.

Use case/motivation

The DAGRun status should indicate that no tasks have run, either through a "Skipped" DAGRun status or something similar.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@omarsmak omarsmak added the kind:feature Feature Requests label May 25, 2022
@uranusjr
Member

uranusjr commented May 26, 2022

Skipped sounds wrong because a skipped task means it won’t run. Success is technically correct since there’s no failure. Which part do you find confusing? If it’s the UI, we can probably do something just in the UI to signify that a run is “empty” without touching the database definition. If it’s the API, maybe we can add an additional field to make it more obvious to the user that no tasks were actually run.

@omarsmak
Member Author

> Skipped sounds wrong because a skipped task means it won’t run. Success is technically correct since there’s no failure. Which part do you find confusing? If it’s the UI, we can probably do something just in the UI to signify that a run is “empty” without touching the database definition. If it’s the API, maybe we can add an additional field to make it more obvious to the user that no tasks were actually run.

The confusing part is the API. When I query the DAGRuns for a particular DAG through the REST API, the DAGRuns that didn't run anything come back as success, which is really misleading since no tasks have run for that DAG. The Skipped status is just an idea; an extra field or a different status that tells me nothing has run would suffice.
What kind of additional field do you have in mind?

@uranusjr
Member

uranusjr commented Jun 1, 2022

I guess we have different opinions, then. Personally, I find it makes sense for a run without any tasks to be marked success, the same way all([]) is True.
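The analogy can be checked directly in Python. The sketch below is illustrative only: `derive_run_state` is a hypothetical helper, not Airflow's actual scheduling logic, and it only shows why a run with zero tasks comes out as "success" under an "every task succeeded" rule.

```python
def derive_run_state(task_states):
    """Hypothetical rule: a run succeeds when every task state is 'success'."""
    # all() over an empty iterable is True, so an empty run counts as success.
    return "success" if all(s == "success" for s in task_states) else "failed"


print(all([]))                                  # True
print(derive_run_state([]))                     # success
print(derive_run_state(["success", "failed"]))  # failed
```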

Perhaps the actual solution is not to dwell on the state itself, but to introduce some context in the API response so the user can read more into what the state means. For example, we could add a task_count field, and the user could easily tell something might not be right if it shows 0.
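As a rough sketch of how a client might use such a field: the payload below imitates a DAG run object, but `task_count` is only the field proposed in this comment and is not part of the current API; `is_empty_success` and the sample values are likewise hypothetical.

```python
def is_empty_success(dag_run: dict) -> bool:
    """Return True when a run is marked success but contains no tasks.

    Assumes the response carries the proposed (hypothetical) `task_count` field.
    """
    return dag_run.get("state") == "success" and dag_run.get("task_count") == 0


# Made-up payloads for illustration; `task_count` is the proposed addition.
empty_run = {"dag_id": "example_dag", "state": "success", "task_count": 0}
normal_run = {"dag_id": "example_dag", "state": "success", "task_count": 3}

print(is_empty_success(empty_run))   # True
print(is_empty_success(normal_run))  # False
```

With this shape, the state stays "success" and existing clients keep working, while clients that care can check the extra field.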

@omarsmak
Member Author

omarsmak commented Jun 2, 2022

> I guess we have different opinions, then. Personally, I find it makes sense for a run without any tasks to be marked success, the same way all([]) is True.
>
> Perhaps the actual solution is not to dwell on the state itself, but to introduce some context in the API response so the user can read more into what the state means. For example, we could add a task_count field, and the user could easily tell something might not be right if it shows 0.

Sounds good, as long as there is a way to identify such a status. Quick question: is the number of tasks that ran written to the database? If so, this could be a quick PR.

Thanks

@spatocode

> Skipped sounds wrong because a skipped task means it won’t run. Success is technically correct since there’s no failure. Which part do you find confusing? If it’s the UI, we can probably do something just in the UI to signify that a run is “empty” without touching the database definition. If it’s the API, maybe we can add an additional field to make it more obvious to the user that no tasks were actually run.

@uranusjr I believe PENDING would be the best fit for this scenario.

@uranusjr
Member

> is the number of tasks that ran written to the database?

There’s no count available directly, but each task is an individual row in the database, so you can issue a count() with an appropriate filter on state to get the number.
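A minimal sketch of that kind of count, using an in-memory SQLite database. The `task_instance` table here is a simplified stand-in with only four columns, assumed for illustration; it is not Airflow's real schema, and `count_task_instances` is a hypothetical helper.

```python
import sqlite3


def count_task_instances(conn, dag_id, run_id, state=None):
    """Count task rows for one run, optionally filtered by state."""
    query = "SELECT COUNT(*) FROM task_instance WHERE dag_id = ? AND run_id = ?"
    params = [dag_id, run_id]
    if state is not None:
        query += " AND state = ?"
        params.append(state)
    return conn.execute(query, params).fetchone()[0]


# Simplified stand-in table (assumption), one row per task in a run.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance (dag_id TEXT, run_id TEXT, task_id TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
    [
        ("demo", "run_a", "extract", "success"),
        ("demo", "run_a", "load", "failed"),
    ],
)

print(count_task_instances(conn, "demo", "run_a", "success"))  # 1
print(count_task_instances(conn, "demo", "run_b"))             # 0
```

A run with no rows at all, like `run_b` above, yields 0, which is the signal discussed in this thread.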

3 participants