Error message control #725
Comments
I got the idea of the |
Do you mean problems with the data that is produced while running the pipeline? In those cases you might consider using the |
@pditommaso We want to be able to control the output, which is why I thought it might be good to be able to call a Python program to produce it. Perhaps the main script of the process could produce an output file, which could then be printed. I suppose what I want is an exception handler.
@stevekm This works well in many cases, but it assumes that the checking is cheap. It also requires us to write two sets of code to manage the same data. Where we have many large data sets, it wouldn't be computationally feasible to sequentially check all the data before running processes. Rather, it's at the point of actual processing that we can have an exception and handle it. Let me give two examples, simplifications of real cases I've had.
Someone runs a workflow and one step runs PLINK on a large file, but they've done something stupid (e.g. not provided the appropriate phenotype file) and PLINK then fails. I can check the PLINK log file and print out an appropriate error message. Here I can only reasonably detect that there's a problem with the data when PLINK fails. I couldn't duplicate this code.
We're processing hundreds of input tables of about 100MB each, and we get a bad value in a column in one file, which means we can no longer continue. This is not something that I could practically write Groovy code to check beforehand, since it would take far too long (and means I have to duplicate code). My processing code (in Python) has an exception handler for when it detects errors. I want to be able to print out a simple message saying: "File 'dataABD' column 'SNPcount' line 3189 has an error. See directory /.../.../work/ef/272861abce77827171". I don't want to print the normal trace, and I don't want to print the script (which, especially if it is a template, is very confusing for others) |
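A minimal sketch of the kind of exception handler described in the comment above, assuming the processing code is Python. The names `DataError`, `check_snp_count`, and `run` are hypothetical, and exit code 45 is simply borrowed from the example in the issue description; none of this is part of Nextflow. The handler prints one short line to stderr and returns a distinctive exit code instead of a traceback.

```python
import sys

EXIT_BAD_DATA = 45  # hypothetical exit code reserved for data errors


class DataError(Exception):
    """Raised when an input table contains a value that cannot be processed."""

    def __init__(self, filename, column, line_no):
        super().__init__(
            f'File "{filename}" column "{column}" line {line_no} has an error'
        )


def check_snp_count(filename, rows, column="SNPcount"):
    """Sum one column while processing; rows is an iterable of dicts."""
    total = 0
    for line_no, row in enumerate(rows, start=1):
        try:
            total += int(row[column])
        except (KeyError, ValueError):
            # Drop the chained traceback; only the short message matters here.
            raise DataError(filename, column, line_no) from None
    return total


def run(filename, rows, err=sys.stderr):
    """Return an exit code; on bad data print a single line and no traceback."""
    try:
        check_snp_count(filename, rows)
    except DataError as exc:
        print(exc, file=err)
        return EXIT_BAD_DATA
    return 0
```

A task script would end with something like `sys.exit(run(path, rows))`. With Nextflow as discussed in this thread, the message would still appear inside the standard task failure report rather than on its own.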
I understand, but I'm not sure about supporting this feature, because there's no way to prevent a user from executing compute/memory-intensive tasks through this mechanism. Unless the job is killed abruptly by the cluster, you should be able to wrap your task execution in another script that terminates gracefully and returns a more informative error message if the task fails. |
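The wrapping approach suggested above could look roughly like this in Python. It is a sketch only: `run_wrapped` and its `hint` parameter are hypothetical names, and the wrapper cannot help if the scheduler kills the job outright.

```python
import subprocess
import sys


def run_wrapped(cmd, hint, err=sys.stderr):
    """Run cmd; on failure print a short hint to err and return its exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Keep only the last stderr line from the tool, then add our own hint.
        lines = result.stderr.strip().splitlines()
        if lines:
            print(lines[-1], file=err)
        print(hint, file=err)
    return result.returncode
```

The task script would call `sys.exit(run_wrapped([...], "..."))` so the tool's exit code is passed through and the task is still reported as failed.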
Would this be possible though
be possible. A particular problem is template scripts: when one fails, the whole template is printed, which could be hundreds of lines long. So suppressing the normal error messages and trace is important |
Basically an errorTerse mode that would only show the stderr output? |
Perfect — that would be great thanks
Scott
On 18 Jun 2018, at 17:29, Paolo Di Tommaso <notifications@github.com> wrote:
Basically an errorTerse mode that would only show the stderr output?
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
I think this would have been an extremely useful feature, but it seems to have been ignored? I'm in a similar situation where I can either fail silently using a |
I agree with what @mhoban said. For example, I'm now doing an integrity check on some Something like this would have been very nice to have:
I am fully aware that the pipeline will stop and that I'll have all the information in the work directory and related files. But it would have been something useful to reduce the stderr clutter to the bare minimum in some selected situations where maybe someone else is running my pipeline :) Just a thought. |
You should be able to customize the error message by piggybacking on Nextflow's standard error message for a task failure, which includes the stderr of the task. So in your Bash script you could catch a particular exit code and print a custom error message:
gzip -t ${gz_file}
ret=$?
[[ $ret != 0 ]] && >&2 echo "Your gz files are not okay, please check them"
exit $ret
Then the task will fail and Nextflow will print the standard task failure message with stdout and stderr, including your message |
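For tasks whose script is Python rather than Bash, a similar check can be sketched with the standard library `gzip` module. The function names `gz_is_ok` and `check_gz` are hypothetical, and the exit code 1 is an arbitrary choice rather than whatever `gzip -t` would return.

```python
import gzip
import sys
import zlib


def gz_is_ok(path, chunk_size=1 << 20):
    """Return True if the gzip file at path decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end in chunks so large files are not held in memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # Covers bad headers, truncated streams, corrupt data, missing files.
        return False
    return True


def check_gz(path, err=sys.stderr):
    """Return 0 if the file is intact, else print a short message and return 1."""
    if gz_is_ok(path):
        return 0
    print("Your gz files are not okay, please check them", file=err)
    return 1
```

As with the Bash version, the message would show up in the stderr part of Nextflow's standard failure report.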
Respectfully, this suggestion doesn't address the issue both I and @MatteoSchiavinato have brought up. The point is not whether we can control what the error message is. The point is that when we do so it'd be very nice to have the option to display only the error message that would be helpful in the case, rather than all the other information that ends up being displayed (viz. script contents, working file path, etc.). Perhaps I'm missing something in what you're suggesting? Here is a pared-down example:
#!/usr/bin/env nextflow
nextflow.enable.dsl=2
process just_fail {
output:
stdout
script:
"""
>&2 echo "This is a bad error!"
exit 1
"""
}
workflow {
just_fail | view
}
When I run this, I get the following output:
What we would love is the option to display only the part within the "Command error" section and not all the other stuff. The "errorTerse" mode suggested above is exactly the sort of thing we're looking for. |
I would like to have a feature that lets me control precisely what the error message is when something fails, suppressing the normal Nextflow error message and stack trace.
Sometimes the workflow runs into an error because of some problem with the data, and I want to print a clear error message that explains the problem and how to fix it. However, the normal error message is very verbose and, for someone who is not the developer of the workflow, very confusing. This is particularly the case when the script being run is a template, where the whole templatised code gets dumped out as the "script".
What I'd want is a directive like
errorMessage 45 "pheno_err_msg.py $fname"
which would mean "if any step in the script returns error code 45, call the script bin/pheno_err_msg.py"