Defer error exit status to end of pipeline run #4446

adamrtalbot · 2023-10-26T19:04:01Z

really go as far as possible in running the pipeline, but then exit 1 at the very end if any errors were ignored.

This is a pretty useful feature. Imagine you have 96 independent samples: if one fails and brings the whole pipeline crashing down, that's annoying, but if you use errorStrategy = 'ignore' then you have to write your own solution. We could have process.errorStrategy = 'catch'. This might be a separate ticket though.

Having some form of introspection on tasks in the workflow.onComplete block would also help for similar reasons and be more generally useful.
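As a rough illustration of what is already possible at the aggregate level, here is a minimal sketch of a completion handler that reports ignored failures. It assumes workflow.stats.ignoredCount is available in the Nextflow version in use. It only logs a warning: it does not give per-task introspection and does not change the exit status, which is the gap this issue is about.

```groovy
// main.nf (sketch): report ignored failures once the run has finished
workflow.onComplete {
    // ignoredCount is assumed to count failed tasks that were skipped via errorStrategy 'ignore'
    if (workflow.stats.ignoredCount > 0) {
        log.warn "${workflow.stats.ignoredCount} task(s) failed and were ignored"
    }
}
```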

Originally posted by @adamrtalbot in #4365 (comment)


pditommaso · 2023-10-26T19:06:45Z

Isn't the finish error strategy already doing this?

ewels · 2023-10-26T20:09:46Z

No, finish allows the currently running jobs to complete. This would tell Nextflow to continue submitting new jobs for as long as it can before dying.

The use case is for automated runs. If there is one bad sample out of hundreds, the run will fail early and need to wait until it can be manually restarted. Then the whole pipeline needs to run through, which is slow. With this option, the other samples would all process so when the pipeline is resumed without the bad sample it would complete very quickly.
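For reference, a minimal config sketch of the two existing strategies being contrasted here. The process name ALIGN_SAMPLE is hypothetical and only there to show a per-process override.

```groovy
// nextflow.config (sketch; the process name ALIGN_SAMPLE is hypothetical)
process {
    // 'finish': on error, stop submitting new tasks, wait for already submitted ones, then exit
    errorStrategy = 'finish'

    withName: 'ALIGN_SAMPLE' {
        // 'ignore': skip the failed task and keep scheduling the rest of the run
        errorStrategy = 'ignore'
    }
}
```

With 'ignore' the remaining samples are processed, but the failure is not reflected in the final exit status, which is what the request above would change.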

pditommaso · 2023-10-27T07:58:52Z

yeah, you are right. I was trying to be lazy 😆

bentsherman · 2023-10-27T14:46:18Z

We could either change ignore to return a non-zero exit code or add a new strategy like ignoreThenFail, which one would you guys prefer?

pinin4fjords · 2023-10-27T15:06:23Z

I think ignoreThenFail is the more generally understandable solution.

adamrtalbot · 2023-10-27T16:05:28Z

Agreed.

adamrtalbot · 2023-10-27T16:11:00Z

If you want to be fancy, we could call it defer but I'd like a non-native speaker to check that makes sense.

ewels · 2023-10-30T20:14:59Z

I generally prefer verbose and clear, for me ignoreThenFail is the winner in this list 👆🏻 defer is more likely to send me scurrying to the docs.

boothms · 2024-01-25T04:11:29Z

This is the solution that I was looking for. Is it still under consideration?

ghost · 2024-01-25T13:46:29Z

I would like to second @boothms's message. This would greatly improve runs with multiple samples where at least one process for one sample might fail. It would be great if execution could continue for the rest of the healthy samples.

ghost · 2024-01-25T18:15:42Z

@bentsherman, thank you so much for jumping on this. I have a quick question about the implementation.

If one sets the errorStrategy 'ignoreThenFail', that implies that the task will never be retried, correct?

Would there be a place where these 2 are not mutually exclusive? Try the maxRetries, and then if it fails, still have the possibility to defer the exit status to the end of the pipeline?

bentsherman · 2024-01-25T19:59:47Z

You could do that with a closure:

process.errorStrategy = { task.attempt < 3 ? 'retry' : 'ignoreThenFail' }
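A slightly fuller sketch of the same closure, with one assumption spelled out: as far as I understand the retry strategy, a task is only re-run while its attempt count stays within maxRetries (which defaults to 1), so the limit probably needs raising for the closure to reach its third attempt. The name ignoreThenFail is the strategy as proposed at this point in the thread, not a confirmed released value.

```groovy
// nextflow.config (sketch)
process {
    // assumption: allow two retries so that attempts 1 and 2 can both be retried
    maxRetries = 2

    // attempts 1 and 2 return 'retry'; a failure on attempt 3 falls through to the proposed strategy
    errorStrategy = { task.attempt < 3 ? 'retry' : 'ignoreThenFail' }
}
```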

ghost · 2024-01-25T20:02:34Z

This is great. Thank you once again.

I'm really looking forward to testing this.

ghost · 2024-02-22T17:32:42Z

Hi folks, I saw that the PR is almost done. Do you have an estimate on when this might be merged?

adamrtalbot · 2024-04-04T11:30:28Z

Relevant community discussion: https://community.seqera.io/t/handling-single-sample-failures/618/3

pditommaso · 2024-06-13T09:06:16Z

So ignoreThenFail cannot be used because errorStrategy cannot be longer than 10 chars.

Two choices:

find a shorter name
turn this setting into a nextflow config option (not an errorStrategy) e.g. nextflow.failOnIgnoredErrors = true

pinin4fjords · 2024-06-13T10:03:40Z

Where does the 10 character limit derive from?

But from conversations with users, I think the errorStrategy is the most intuitive thing, the thing people expect to find.

ignoreFail? Though that sounds a bit like we're ignoring a failure.

adamrtalbot · 2024-06-13T13:34:35Z

My other suggestion was defer, which isn't as clear but is shorter than 10 characters.

Like Jon, I have to ask where does the 10 character limit come from?

pditommaso · 2024-06-13T14:05:15Z

> where does the 10 character limit come from

The platform db schema

pinin4fjords · 2024-06-13T14:06:12Z

deferFail?

FriederikeHanssen · 2024-06-13T14:11:03Z

delayFail ?

pinin4fjords · 2024-06-13T14:16:36Z

I asked ChatGPT:

failAtEnd
finishFail
runThenFail
completeErr
ignoreFail
tryThenFail
endWithErr

pditommaso · 2024-06-13T14:17:12Z

and the winner is failAtEnd 😄

bentsherman · 2024-06-14T16:03:28Z

I think I'm coming around to ignoreFail and just relying on the language server to provide an inline explanation (i.e. hover hint, code completion) if the user isn't clear on the meaning

pditommaso · 2024-06-14T17:13:18Z

I'm still not convinced about the best way to proceed with this. Apart from the strategy name length, I'd like to avoid introducing a new error strategy for doing a similar thing to ignore, above all because this is going to impact some logic in Platform about how errors are reported, etc.

I've made a variation in which the user is expected to use the string ignoreThenFail as the error strategy; behind the scenes it is mapped to ignore, but it still causes the failure on completion.

#5066

In this way, the change is transparent from the point of view of Platform. However, I fear that it can be difficult to determine why the run failed, because tasks will report ignore as the retry action

pditommaso · 2024-06-17T11:43:47Z

I've made two other implementations of this feature that prevent the propagation of a new error strategy literal.

PR #5066 implements a pseudo-error strategy named ignoreThenFail, which is replaced behind the scenes with ignore and triggers the expected behaviour. However, it feels a bit hacky and the behaviour is not transparent.

PR #5071 implements the same logic as a nextflow config setting.

bentsherman · 2024-06-17T18:06:06Z

I think I prefer #5071. When the workflow fails due to failOnIgnore, we can say as much in the error message so that it's clear why the workflow failed
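A minimal sketch of how the config-setting approach might look in use. The thread only names the option failOnIgnore; placing it under the workflow scope is my recollection of the Nextflow config reference rather than something stated here, so treat the exact scope as an assumption and check the reference for your version.

```groovy
// nextflow.config (sketch)

// keep running past failed tasks instead of aborting the run
process.errorStrategy = 'ignore'

// assumed scope and name: exit with a non-zero status at the end if any failed task was ignored
workflow.failOnIgnore = true
```

This keeps ignore as the only strategy literal that tasks report, while the run-level setting decides the final exit status.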

pditommaso assigned bentsherman Oct 30, 2023

marcodelapierre added the core-runtime label Nov 1, 2023

bentsherman removed the core-runtime label Nov 1, 2023

bentsherman mentioned this issue Jan 25, 2024

Add ignoreThenFail error strategy #4686

Merged

pditommaso linked a pull request Jun 17, 2024 that will close this issue

4446 ignore then fail strategy v3 #5071

Merged

pditommaso linked a pull request Jun 17, 2024 that will close this issue

Implement ignoreThenFail as pseudo error strategy #5066

Closed

pditommaso closed this as completed in #4686 Jun 17, 2024
