FastQC: add top overrepresented sequences table #2075

vladsavelyev · 2023-09-25T18:12:30Z

Add a table to the FastQC module showing the most common overrepresented sequences across all samples.

By default, it shows the top 20 sequences that occur in the largest number of samples. The number can be customised:

fastqc_config:
  top_overrepresented_sequences: 50

It can also be customised to rank sequences by the total number of occurrences instead of the number of samples:

fastqc_config:
  top_overrepresented_sequences_by: "total"
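
A minimal sketch of how the two ranking modes could behave, not the code in this PR. The function name `top_overrepresented`, the `per_sample` layout, the `"samples"` mode name, and the tie-breaking rule are all assumptions made for illustration; only the default of 20 and the `"total"` option come from the config above.

```python
from collections import Counter


def top_overrepresented(per_sample, top_n=20, by="samples"):
    """Rank overrepresented sequences across samples (illustrative only).

    per_sample: {sample_name: {sequence: read_count}}  (hypothetical layout)
    by: "samples" ranks by how many samples contain the sequence,
        "total" ranks by the summed read count across samples.
    Returns a list of (sequence, n_samples, total_count) tuples.
    """
    if by not in ("samples", "total"):
        raise ValueError(f"Unknown ranking mode: {by!r}")

    n_samples = Counter()
    total = Counter()
    for seqs in per_sample.values():
        for seq, count in seqs.items():
            n_samples[seq] += 1
            total[seq] += count

    # Primary key is the chosen metric; the other metric and the sequence
    # itself break ties so the ordering is deterministic (an assumption).
    primary, secondary = (n_samples, total) if by == "samples" else (total, n_samples)
    ranked = sorted(total, key=lambda s: (-primary[s], -secondary[s], s))
    return [(s, n_samples[s], total[s]) for s in ranked[:top_n]]


example = {
    "A": {"ACGT": 10, "TTTT": 500},
    "B": {"ACGT": 20},
    "C": {"ACGT": 5, "GGGG": 1},
}
by_samples = top_overrepresented(example, top_n=2)
by_total = top_overrepresented(example, top_n=2, by="total")
```

With this toy input, `ACGT` ranks first by sample count because it appears in all three samples, while `TTTT` ranks first by total because of its single large count.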

github-actions · 2023-09-25T18:13:43Z

🚀 Deployed on https://mqc-pr-2075--multiqc.netlify.app

ewels · 2023-09-26T09:36:42Z

How about having a third column that shows the read count as a percentage of the total read count across all samples?

vladsavelyev · 2023-09-27T09:04:56Z

> How about having a third column that shows the read count as a percentage of the total read count across all samples?

That's always going to be close to zero, so I'm not sure a column like this is helpful 🤔

Though maybe it is, as it just means that everything is okay? Maybe I should add another column with the max percentage across samples, to quickly identify the bad sequence in a bad sample.
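
To make the trade-off concrete, here is a rough sketch of both candidate columns: the percentage of all reads across samples, and the maximum per-sample percentage. The function name `percentage_columns` and the input layouts are hypothetical, chosen only for illustration, and do not reflect how the module stores its data.

```python
def percentage_columns(per_sample, total_reads):
    """Compute two percentages per sequence (illustrative only).

    per_sample:  {sample_name: {sequence: read_count}}   (hypothetical layout)
    total_reads: {sample_name: total_reads_in_sample}    (hypothetical layout)
    Returns {sequence: (pct_of_all_reads, max_pct_in_any_sample)}.
    """
    all_reads = sum(total_reads.values())
    summed = {}
    max_pct = {}
    for sample, seqs in per_sample.items():
        n = total_reads[sample]
        for seq, count in seqs.items():
            summed[seq] = summed.get(seq, 0) + count
            pct = 100.0 * count / n if n else 0.0
            max_pct[seq] = max(max_pct.get(seq, 0.0), pct)

    return {
        seq: (100.0 * summed[seq] / all_reads if all_reads else 0.0, max_pct[seq])
        for seq in summed
    }


# One small "bad" sample next to a large clean one.
cols = percentage_columns(
    {"good": {"ACGT": 10}, "bad": {"ACGT": 400}},
    {"good": 1_000_000, "bad": 1_000},
)
overall_pct, worst_pct = cols["ACGT"]
```

In this toy case the overall percentage is well under 0.1% because the large clean sample dilutes it, whereas the per-sample maximum is 40%, which is what would flag the bad sample.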

vladsavelyev · 2023-09-27T16:07:49Z

@multiqc-bot changelog

ewels

Great!

ewels · 2023-09-28T23:08:54Z

@multiqc-bot fix linting

ewels · 2023-09-28T23:09:29Z

> Maybe I should add another column with the max percentage across samples, to quickly identify the bad sequence in a bad sample.

Could be nice, but might be approaching overkill a little. I think I'm happy to merge as-is for now; we can wait to see if we get any feedback about this and always add it at a later date.

ewels · 2023-09-28T23:11:50Z

@multiqc-bot fix linting

ewels · 2023-09-28T23:18:53Z

Added a commit to fix https://github.com/ewels/MultiQC/pull/2082/files#r1340743794

* master:
  Just run CI on the oldest + newest supported Python versions (MultiQC#2074)
  Picard: fix parsing mixed strings/numbers, account for trailing tab (MultiQC#2083)
  FastQC: add top overrepresented sequences table (MultiQC#2075)
  Add GitHub Actions bot workflow to fix code linting from a PR comment (MultiQC#2082)
  Use custom exception type instead of `UserWarning` when no samples are found. (MultiQC#2049)
  Lint modules for missing `self.add_software_version` (MultiQC#2081)
  Changelog bot: Update docs (MultiQC#2077)
  Changelog action: remove `.capitalize()`, add changelog entry (MultiQC#2080)
  Add action to populate the change log from PR titles triggered by `@multiqc-bot changelog` (MultiQC#2025)

# Conflicts:
#   CHANGELOG.md
#   multiqc/modules/ngsderive/ngsderive.py

vladsavelyev added 2 commits September 25, 2023 20:00

FastQC: add overrepresented sequences table

20d69fd

Clean up

0ce693f

vladsavelyev requested a review from ewels September 25, 2023 18:12

vladsavelyev added the module: enhancement label Sep 25, 2023

vladsavelyev added this to the MultiQC v1.17 milestone Sep 25, 2023

vladsavelyev added the awaits-review Awaiting final review and merge. label Sep 25, 2023

vladsavelyev and others added 4 commits September 27, 2023 13:33

Percentage in all reads rather than number of samples

0c4a86d

Merge branch 'master' into fastqc-overrep

9cdccda

Merge branch 'master' into fastqc-overrep

9a82259

Fix adapter content

4be59f1

vladsavelyev changed the title ~~FastQC: top overrepresented sequences table~~ FastQC: add top overrepresented sequences table Sep 27, 2023

[automated] Update CHANGELOG.md

ffa7a7a

vladsavelyev enabled auto-merge (squash) September 27, 2023 16:09

vladsavelyev disabled auto-merge September 28, 2023 08:17

ewels approved these changes Sep 28, 2023

CHANGELOG.md (outdated, resolved)

docs/modules/fastqc.md (outdated, resolved)

docs/modules/fastqc.md (outdated, resolved)

ewels added 2 commits September 29, 2023 01:07

Apply suggestions from code review

513b3de

Merge branch 'master' into fastqc-overrep

0437ba1

Fix linting: Don't exit code 1 if we changed something

5b698c7

Fix linting

1620e9a

ewels merged commit f64ee52 into master Sep 28, 2023
11 checks passed

ewels deleted the fastqc-overrep branch September 28, 2023 23:21
