Add warnings to STDOUT for all skipped and failed strandedness check samples #961

Closed
calizilla opened this issue Mar 16, 2023 · 4 comments
Labels
question Further information is requested
Milestone

Comments

@calizilla

Description of the bug

If the number of reads after trimming with trimgalore falls below min_trimmed_reads, alignment and subsequent steps are not run, yet no warning message is printed. The final message 'Pipeline completed successfully' is misleading and uninformative.

To determine why there are no BAM files, one must trawl through .nextflow.log (the only clue being that the STAR processes are 'Started' but not 'Submitted' or 'Completed') and the Nextflow code on GitHub to eventually find the answer.

Comparing the STDOUT of such a failed run with that of an unaffected run shows a difference: for a run not affected by this filter, the message reads N/N samples passed STAR 5% mapped threshold. For a run where all samples failed, there is no summary of the number of samples that passed any required minimum thresholds.

The STDOUT should include warning messages for any samples failing filters, and direct the user to the relevant file. Something along the lines of N/N samples passed min_trimmed_reads parameter, please see <outdir>/multiqc/star_salmon/multiqc_data/multiqc_fail_trimmed_samples-plot.txt.
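As a rough illustration of the kind of summary line requested above, here is a minimal Python sketch. It is not the pipeline's code (the pipeline itself is written in Nextflow/Groovy). The function name `trim_filter_summary`, the sample names, the read counts, and the threshold value are all hypothetical, and the sketch assumes a sample fails when its trimmed read count is below `min_trimmed_reads`, as described in this issue.

```python
def trim_filter_summary(trimmed_reads, min_trimmed_reads, outdir):
    """Build a one-line STDOUT summary for the trimmed-read filter.

    trimmed_reads: dict mapping sample ID -> read count after trimming.
    A sample is assumed to fail when its count is below min_trimmed_reads.
    """
    total = len(trimmed_reads)
    passed = sum(1 for n in trimmed_reads.values() if n >= min_trimmed_reads)
    # Report path taken from the wording suggested in this issue.
    report = (
        f"{outdir}/multiqc/star_salmon/multiqc_data/"
        "multiqc_fail_trimmed_samples-plot.txt"
    )
    return (
        f"{passed}/{total} samples passed min_trimmed_reads parameter, "
        f"please see {report}"
    )


# Hypothetical counts and threshold, for illustration only.
counts = {"sampleA": 2500000, "sampleB": 120, "sampleC": 10000}
print(trim_filter_summary(counts, min_trimmed_reads=10000, outdir="results"))
```

Only the counts are reported, not the sample IDs, so the line stays the same length regardless of cohort size.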

Command used and terminal output

nextflow run ../rnaseq/main.nf -profile singularity -params-file params.yaml

Relevant files

log_and_config_and_stdout.zip

System information

Nextflow version - 22.10.6.5843
Hardware - Pawsey Nimbus VM
Executor - local
Container engine - Singularity
OS - Ubuntu
Version of nf-core/rnaseq - 3.10.1

@calizilla calizilla added the bug Something isn't working label Mar 16, 2023
@drpatelh drpatelh added this to the 3.11 milestone Mar 16, 2023
@drpatelh drpatelh added question Further information is requested and removed bug Something isn't working labels Mar 16, 2023
@drpatelh
Member

Hi @Calliza ! All of these warnings are intentionally printed right at the top of the MultiQC report (alluded to in the docs):
image

There is an indication as to how many samples failed the mapping threshold printed to STDOUT but I'm not sure it's a good idea to start bloating the STDOUT with various reasons regarding why samples failed - this is exactly what the MultiQC report is for.

image

The pipeline completing successfully is somewhat unrelated to which samples failed quality thresholds because it indicates that there were no issues running the pipeline itself.

@calizilla
Author

Thanks for your quick response @drpatelh. My multiQC report looks different to yours (version 1.13) - there are no WARNING messages. The 'Fail trimmed samples' plot is present and shows that 6 samples failed; however, it would be nice if the rnaseq workflow made it immediately clear that samples were failing to enter the mapping stage of the workflow (i.e. without having to learn this from MultiQC).

I understand your concern about bloating the STDOUT, but it seems arbitrary to report on STDOUT when a sample fails the STAR mapping threshold and not when it fails the trim threshold. Both have the same end result - no BAM file for that sample. A simple message 'N/N samples failed trim threshold' seems warranted. To avoid making these warnings excessively long, especially for large cohorts, printing the sample IDs would not even be necessary - just the summary 'N of N failed for <parameter/process>' and a direction to the relevant file (e.g. multiqc report or multiqc_fail_trimmed_samples-plot.txt).

It seems like a simple yet extremely helpful addition to nf-core workflows to report to STDOUT if samples failed any critical stages :-)

mqc_fail_trimmed

@drpatelh
Member

I think we have to remember that quite a lot of users won't even look at STDOUT if they are running on Cloud infras or non-interactively. But I see how this could be useful and I have been meaning to add messages for scenarios where the trimming and strand checks aren't successful. Turns out it wasn't as simple as expected because there is quite a bit of Groovy code that needs to be tested to re-write this 😅

But here we go! I have removed reporting sample ids and just included summary messages after the completion message pointing users to check the MultiQC report:

image
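The screenshot is not reproduced in this text. As a hedged illustration only, the behaviour described (count-only summary messages, printed after the completion message and pointing to the MultiQC report) could be sketched in Python as follows. The actual change is in the pipeline's Groovy code, and the function name `completion_messages`, the message wording, and the total of 8 samples are assumptions made for this sketch.

```python
def completion_messages(total, failed_by_check):
    """Return the lines to print once the run has finished.

    failed_by_check: dict mapping a check description -> number of samples
    that failed it. Sample IDs are deliberately not reported.
    """
    messages = ["Pipeline completed successfully"]
    for check, n_failed in failed_by_check.items():
        if n_failed:  # stay silent for checks that every sample passed
            messages.append(
                f"WARNING: {n_failed}/{total} samples failed {check}; "
                "please check the MultiQC report"
            )
    return messages


# Hypothetical run: 6 of 8 samples fell below the trimmed-read threshold.
for line in completion_messages(
    8, {"the trimmed-read threshold": 6, "the strandedness check": 0}
):
    print(line)
```

Keeping the completion message first preserves its meaning (the pipeline itself ran without error) while still flagging samples that were skipped.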

My multiQC report looks different to yours (version 1.13) - there are no WARNING messages.

This is weird....I see the warnings as expected.

PR incoming!

@drpatelh drpatelh changed the title from "Failure to provide suitable warning message when trimmed reads fail min_trimmed_reads threshold" to "Add warnings to STDOUT for all skipped and failed strandedness check samples" Mar 17, 2023
drpatelh added a commit to drpatelh/nf-core-rnaseq that referenced this issue Mar 17, 2023
@drpatelh drpatelh mentioned this issue Mar 17, 2023
drpatelh added a commit that referenced this issue Mar 17, 2023
@calizilla
Author

Thanks for this, looks great :-)

a lot of users won't even look at STDOUT if they are running on Cloud infras or non-interactively

Fair point!
