Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dragen: v1.14 no longer looks for .bed files for coverage metrics #1844

Closed
iatol1 opened this issue Jan 17, 2023 · 9 comments · Fixed by #1854
Closed

Dragen: v1.14 no longer looks for .bed files for coverage metrics #1844

iatol1 opened this issue Jan 17, 2023 · 9 comments · Fixed by #1854
Labels
bug: module Bug in a MultiQC module

Comments

@iatol1
Copy link

iatol1 commented Jan 17, 2023

Description of bug

Somewhere between v1.13 and v1.14, search_patterns.yaml was changed as follows:

https://github.com/ewels/MultiQC/blob/c247d077801bc1cd8d6e7e1f9078db5036bae6e1/multiqc/utils/search_patterns.yaml#L262-L267

vs.

https://github.com/ewels/MultiQC/blob/4702c8086ce95521d1e2c46d1a85ebfb41812d8e/multiqc/utils/search_patterns.yaml#L284-L291

This change means that MultiQC no longer looks for "*.target_bed_*.csv when plotting coverage metrics. It seems like a fix would be similar to what was done in #1290, but I am not sure.

The following regexes also seem relevant, but I don't know how they are used:

https://github.com/ewels/MultiQC/blob/4702c8086ce95521d1e2c46d1a85ebfb41812d8e/multiqc/modules/dragen/coverage_metrics.py#L292

https://github.com/ewels/MultiQC/blob/4702c8086ce95521d1e2c46d1a85ebfb41812d8e/multiqc/modules/dragen/coverage_hist.py#L144

https://github.com/ewels/MultiQC/blob/4702c8086ce95521d1e2c46d1a85ebfb41812d8e/multiqc/modules/dragen/coverage_per_contig.py#L166

If someone can give advice, I am happy to write a PR for a fix.

@ewels ewels changed the title MultiQC 1.14 no longer looks for .bed files for coverage metrics Dragen: v1.14 no longer looks for .bed files for coverage metrics Jan 17, 2023
@ewels ewels added the bug: module Bug in a MultiQC module label Jan 17, 2023
@ewels
Copy link
Member

ewels commented Jan 17, 2023

Could @vladsavelyev or @alexiswl or anyone else involved in #1829 / the regent DRAGEN updates please take a look at this?

I guess it's just that merging this old PR basically overwrote what was added by @vladsavelyev in #1363.

It would be good if someone with closer knowledge of this module could comment though please 🙏🏻

@alexiswl
Copy link
Contributor

Thanks @iatol1 for picking this up, my guess is that this mammoth commit 08c5272 by @andrei-seleznev back in July 2021 to merge upstream master into the bnbowman master chose between the target and wgs coverage options when resolving conflicts.

There was a lot of code to digest through so pretty easy to miss. Since that merge commit would have both the bnbowman master and upstream master as parents, we wouldn't have seen this as a conflict when #1232 was finally merged.

I have access to a dragen instance so can try a fix on our side.

@alexiswl
Copy link
Contributor

@iatol1 which version of dragen are you using as well? I will likely run these tests on dragen 4.0.3

@iatol1
Copy link
Author

iatol1 commented Jan 17, 2023

@alexiswl Yes, we are using Dragen 4.0.3 as well

@ckandoth
Copy link

The Dragen pipeline always generates *.wgs_*.csv files, and I don't see an option to disable it. They are not relevant for targeted sequencing, so it is preferred that MultiQC use *.target_bed_*.csv files when available and ignore the WGS ones.

The *.target_bed_*.csv files are only generated when a --vc-target-bed is specified or what we do is --qc-coverage-tag-1 target_bed --qc-coverage-region-1 exons.bed --qc-coverage-reports-1 cov_report. This latter method has the benefit of generating coverage metrics in targeted regions, but not limiting variants calls to the same regions.

@alexiswl
Copy link
Contributor

alexiswl commented Jan 17, 2023

The Dragen pipeline always generates .wgs_.csv files, and I don't see an option to disable it. They are not relevant for targeted sequencing, so it is preferred that MultiQC use .target_bed_.csv files when available and ignore the WGS ones.

@ckandoth thank you for clarifying

This latter method has the benefit of generating coverage metrics in targeted regions, but not limiting variants calls to the same regions.

That's a really good idea, thank you for sharing!

@ewels
Copy link
Member

ewels commented Feb 1, 2023

Hi all! Any progress on this front? I'm keen to release a patch release ASAP to fix a few bugs that have appeared, it would be great to be able to include this if possible..

@alexiswl
Copy link
Contributor

alexiswl commented Feb 1, 2023

Hi @ewels apologies, will get you something for this by the end of this week!

alexiswl added a commit to alexiswl/MultiQC that referenced this issue Feb 2, 2023
Refactored parse_wgs_coverage_metrics to generic parse_coverage_metrics.  

Fixes MultiQC#1844

Target bed, qc region and wgs qc then call parse_coverage_metrics each with different parameters.  

 Fixed search patterns to include target bed coverage metrics and qc region coverage metrics.
@alexiswl
Copy link
Contributor

alexiswl commented Feb 2, 2023

@iatol1 I have made a PR at #1854 (with some examples present too)

Please feel free to pass through any dragen csv files to test this with, otherwise you can checkout the branch at https://github.com/alexiswl/MultiQC/tree/bugfix/re-add-target-metrics-to-dragen-module.

Note that this still uses the wgs metrics, just also adds in the qc coverage metrics and target bed metrics tables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug: module Bug in a MultiQC module
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants