[Feature] Add support for retrying failed tasks #352

When running on a heterogeneous cluster, a task sometimes fails not because of a fundamental error in the function, but because of node-specific problems such as low memory.
For such tasks, it would be useful to be able to pass the executor a "retry_count=n" kwarg, meaning that if the task fails, the scheduler will attempt to rerun it (preferably on a different worker). Possibly this could be combined with a list of exceptions that are "expected" in such a case.
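
To make the request concrete, here is a rough sketch of the semantics I have in mind, written against the standard library's `concurrent.futures` rather than this project's executor. The helper name `submit_with_retries` and the `retry_on` parameter are made up for illustration; only `retry_count` comes from the proposal above. The sketch blocks on each attempt and does not pick a different worker, which a real scheduler would presumably handle itself.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_with_retries(executor, fn, *args, retry_count=0,
                        retry_on=(Exception,), **kwargs):
    """Hypothetical helper: run fn on the executor, rerunning it on expected failures.

    retry_count is the number of extra attempts after the first failure.
    retry_on is the tuple of exception types treated as "expected".
    """
    attempts_left = retry_count
    while True:
        future = executor.submit(fn, *args, **kwargs)
        try:
            # result() re-raises any exception raised inside the task.
            return future.result()
        except retry_on:
            if attempts_left == 0:
                raise  # retries exhausted: surface the last failure
            attempts_left -= 1


# Simulated node-specific problem: the first two attempts run out of memory.
calls = {"n": 0}


def flaky(x):
    calls["n"] += 1
    if calls["n"] < 3:
        raise MemoryError("simulated low-memory node")
    return x * 2


with ThreadPoolExecutor(max_workers=2) as pool:
    result = submit_with_retries(
        pool, flaky, 21, retry_count=2, retry_on=(MemoryError,)
    )
```

An exception that is not listed in `retry_on` propagates immediately, so a genuine bug in the function is not rerun needlessly.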

