unable to plot "Filtered Reads" for cutadapt, when no read passed filters #1328

qifei9 · 2020-10-26T16:15:12Z

Description of bug:
I ran MultiQC on a cutadapt log that contains:

=== Summary ===

Total read pairs processed:          9,029,687
  Read 1 with adapter:               8,812,237 (97.6%)
  Read 2 with adapter:               8,609,012 (95.3%)
Pairs written (passing filters):             0 (0.0%)

In the MultiQC report, the Cutadapt -> Filtered Reads section shows: "Error - was not able to plot data."

I think a result where no reads pass the filters is itself informative to the user. It may be caused by the wrong adapter, data, or filters being used in the cutadapt run. It would therefore be good to see this result in the MultiQC report, perhaps as a plot or as text telling the user that no reads passed. Otherwise one may dismiss it, assuming it is due to a bug or error within MultiQC.

MultiQC Error log:

log:

[INFO   ]         multiqc : This is MultiQC v1.9
...
[INFO   ]        cutadapt : Found 1 reports
[WARNING]        bargraph : Tried to make bar plot, but had no data
...

multiqc_cutadapt.txt:

Sample  pairs_processed r1_with_adapters        r2_with_adapters        pairs_written   ...
ss 9029687 8812237 8609012 0       ...

File that triggers the error:

MultiQC run details (please complete the following):

MultiQC Version: 1.9
Method of MultiQC installation: snakemake -> singularity -> docker://ewels/multiqc

ewels · 2020-12-28T22:49:43Z

Thanks for reporting @qifei9 - I agree that this should be handled in a nicer way.

If you have an actual file from cutadapt that I could use for testing that would be better than a truncated excerpt 👍🏻

Phil

ewels · 2021-07-02T21:45:00Z

Hi @qifei9,

@ErikDanielsson and I have been looking into this today and trying to figure out how to solve the issue. It was made quite a lot harder by the fact that we don't have the full cutadapt log files that you generated, so we had to recreate our own with your snippet and some guessing. Cutadapt log syntax changes over versions, so I also had to do a bit of forensics to try to guess which version you're working with.

The problem here is not just that you have 0 reads passing filters, but more that your log snippet doesn't contain the number of reads in different filter categories. Every cutadapt log I've seen and can generate looks like this:

=== Summary ===

Total read pairs processed:            250,000
  Read 1 with adapter:                 106,082 (42.4%)
  Read 2 with adapter:                 105,259 (42.1%)
Pairs that were too short:               3,850 (1.5%)
Pairs written (passing filters):       246,150 (98.5%)

Note the line with Pairs that were too short that accounts for the filtered reads. These category lines are only missing for me when 100% of reads pass filters. This is why the report is failing, as MultiQC parses these categories. In your case there are no passing reads and no categories, so MultiQC assumes no failing reads either, so the plot is empty.
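To illustrate the explanation above, here is a minimal Python sketch. It is not MultiQC's actual module code: the regexes and the `parse_summary` name are assumptions made up for this example, and only the two log excerpts quoted in this thread are used as input. It shows that the reporter's log yields no filter categories at all, which is why the bar plot has nothing to draw.

```python
import re

# Hypothetical patterns for illustration only; MultiQC's real regexes differ.
PROCESSED_RE = re.compile(r"^Total read pairs processed:\s+([\d,]+)", re.M)
WRITTEN_RE = re.compile(r"^Pairs written \(passing filters\):\s+([\d,]+)", re.M)
CATEGORY_RE = re.compile(r"^Pairs that were ([\w ]+):\s+([\d,]+)", re.M)


def to_int(text):
    """Convert a count such as '9,029,687' to an integer."""
    return int(text.replace(",", ""))


def parse_summary(log_text):
    """Return processed, written and per-category counts from a summary block."""
    return {
        "processed": to_int(PROCESSED_RE.search(log_text).group(1)),
        "written": to_int(WRITTEN_RE.search(log_text).group(1)),
        "categories": {
            name: to_int(count) for name, count in CATEGORY_RE.findall(log_text)
        },
    }


# Excerpt from the original report: no filter category lines.
reported_log = """=== Summary ===

Total read pairs processed:          9,029,687
  Read 1 with adapter:               8,812,237 (97.6%)
  Read 2 with adapter:               8,609,012 (95.3%)
Pairs written (passing filters):             0 (0.0%)
"""

# Excerpt from a typical log: one filter category line.
typical_log = """=== Summary ===

Total read pairs processed:            250,000
  Read 1 with adapter:                 106,082 (42.4%)
  Read 2 with adapter:                 105,259 (42.1%)
Pairs that were too short:               3,850 (1.5%)
Pairs written (passing filters):       246,150 (98.5%)
"""

reported = parse_summary(reported_log)
typical = parse_summary(typical_log)
print(reported["categories"])  # empty dict: nothing to plot as filtered
print(typical["categories"])
```

With no category lines and zero pairs written, every value that would feed the Filtered Reads plot is zero or absent.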

Instead of throwing warnings in this case (as @ErikDanielsson initially added in #1480), I have added some code that checks for missing categories in the more recent cutadapt log syntax and counts these reads. These are then included in the plot, which should now show filtered reads that are unaccounted for:

(code added in 5e40729 and d546366)
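The idea of counting reads that no category accounts for can be sketched roughly as follows. This is an assumption-laden illustration, not the code from the commits above: the function name `with_unaccounted` and the plot labels are hypothetical, and the inputs are the numbers quoted in this thread.

```python
def with_unaccounted(processed, written, categories):
    """Build plot data, adding reads that no filter category explains.

    processed:  total reads or pairs processed
    written:    reads or pairs that passed filters
    categories: dict mapping filter category name to count
    """
    plot_data = {"passing filters": written}
    plot_data.update(categories)

    # Anything not written and not in a known category is unexplained.
    unaccounted = processed - written - sum(categories.values())
    if unaccounted > 0:
        # Hypothetical label; the real report may name this differently.
        plot_data["filtered (unaccounted)"] = unaccounted
    return plot_data


# Numbers from the original report: all pairs are unaccounted for.
reported_plot = with_unaccounted(9029687, 0, {})

# Numbers from the typical log: categories explain every filtered pair.
typical_plot = with_unaccounted(250000, 246150, {"too short": 3850})

print(reported_plot)
print(typical_plot)
```

Deriving the missing count by subtraction means the plot is never empty when reads were processed, even if the log omits the reason they were discarded.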

I know you created this issue quite a long time ago, so I don't have high hopes, but - if you are able to tell us what version of Cutadapt you were using, and ideally what commands you used to run + the full log output file that we can run with MultiQC, that would be fantastic. Then we can confirm that my detective work above is correct and that the fix works.

Many thanks for reporting, and I hope that the fix is useful!

Cheers,

Phil

ewels · 2021-07-02T21:45:21Z

cc @marcelm just in case you are interested in this! 😉

marcelm · 2021-07-03T08:21:15Z

Cutadapt before 3.1 did not print statistics for the --max-ee and --discard-casava filters, see marcelm/cutadapt#482, so perhaps these were used? Perhaps --discard-trimmed is also involved; I’ll have to check. But in any case, it’s strange that all reads were filtered.

ewels · 2021-07-03T12:44:14Z

Right - I should probably revisit the regexes that MultiQC uses to parse the logs too, as I bet I'm missing a bunch of categories:

https://github.com/ewels/MultiQC/blob/7594729cc41e37e66c20eff83af68951faa9c8fd/multiqc/modules/cutadapt/cutadapt.py#L83-L88

ewels · 2021-07-03T12:45:16Z

..and refactor to avoid the pairs / reads duplication.. 🤔 (this module code is pretty old now)

ewels added the "bug: module" label Dec 28, 2020

ewels assigned ErikDanielsson Jun 29, 2021

ewels added this to the MultiQC v1.11 milestone Jul 2, 2021

ErikDanielsson mentioned this issue Jul 2, 2021

Added clearer warning messages when no bargraph can be created in Cutadapt #1480

Closed

2 tasks

ewels mentioned this issue Jul 2, 2021

Created file to recreate ewels/MultiQC#1328 MultiQC/test-data#206

Merged

ewels added a commit to MultiQC/test-data that referenced this issue Jul 2, 2021

Merge pull request #206 from ErikDanielsson/cutadapt-no-filtered-reads

c0bbacb

Created file to recreate MultiQC/MultiQC#1328

ewels added a commit to MultiQC/test-data that referenced this issue Jul 2, 2021

Delete cutadapt synthetic logs for MultiQC/MultiQC#1328 as we already…

8d393fd

… had some

ewels added a commit to MultiQC/test-data that referenced this issue Jul 2, 2021

Remove filter categories to match log output from MultiQC/MultiQC#1328

fdbe431

ewels closed this as completed in d546366 Jul 2, 2021

marcelm mentioned this issue Jul 3, 2021

Add lines for --discard-trimmed and --discard-untrimmed to the report marcelm/cutadapt#541

Closed

marcelm mentioned this issue Jul 3, 2021

Asking for feedback on a JSON log file format #1482

Closed

vladsavelyev pushed a commit to vladsavelyev/MultiQC_TestData that referenced this issue Apr 16, 2022

Remove filter categories to match log output from MultiQC/MultiQC#1328

e38ca49
