MultiQC Dragen abort: division by zero for empty sample #1735

Closed
stefanhelfert opened this issue Jul 31, 2022 · 3 comments
Labels
bug: module (Bug in a MultiQC module)

Comments

stefanhelfert commented Jul 31, 2022

Description of bug

╭───────────────────────────────────────────── Oops! The 'dragen' MultiQC module broke... ─────────────────────────────────────────────╮
│ Please copy this log and report it at https://github.com/ewels/MultiQC/issues                                                        │
│ Please attach a file that triggers the error. The last file found was: ./z04748_syw1/z04748_syw1.wgs_fine_hist.csv                   │
│                                                                                                                                      │
│ Traceback (most recent call last):                                                                                                   │
│   File "/home/helfert/miniconda3/lib/python3.8/site-packages/multiqc/multiqc.py", line 651, in run                                   │
│     output = mod()                                                                                                                   │
│   File "/home/helfert/miniconda3/lib/python3.8/site-packages/multiqc/modules/dragen/dragen.py", line 64, in __init__                 │
│     samples_found |= self.add_coverage_hist()                                                                                        │
│   File "/home/helfert/miniconda3/lib/python3.8/site-packages/multiqc/modules/dragen/coverage_hist.py", line 21, in add_coverage_hist │
│     s_name, data_by_phenotype = parse_fine_hist(f)                                                                                   │
│   File "/home/helfert/miniconda3/lib/python3.8/site-packages/multiqc/modules/dragen/coverage_hist.py", line 140, in parse_fine_hist  │
│     cum_pct = cum_cnt / total_cnt * 100.0                                                                                            │
│ ZeroDivisionError: division by zero                                                                                                  │

The underlying problem is probably that sample z04748 is a complete dropout: its FASTQ files are only 20 bytes long, i.e. they contain no reads.
Our FASTQ->BAM pipeline worked through the samplesheet automatically and still produced output for this sample. In that output a value such as total_cnt is of course 0, which triggers the division-by-zero error.
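
To illustrate the failure mode, here is a minimal sketch of a guarded cumulative-percentage calculation. The function name `cumulative_percentages` and the `depth -> count` input shape are assumptions made for this example; this is not the MultiQC code, only an illustration of what goes wrong in `cum_pct = cum_cnt / total_cnt * 100.0` when `total_cnt` is 0.

```python
from typing import Dict


def cumulative_percentages(depth_counts: Dict[int, int]) -> Dict[int, float]:
    """Return, per depth, the percentage of positions covered at >= that depth.

    Hypothetical helper for illustration only (not MultiQC's parse_fine_hist).
    `depth_counts` maps a coverage depth to the number of positions at that depth.
    """
    total_cnt = sum(depth_counts.values())
    cum_pct_by_depth: Dict[int, float] = {}
    cum_cnt = 0
    # Walk from the highest depth down so cum_cnt counts positions at >= depth.
    for depth in sorted(depth_counts, reverse=True):
        cum_cnt += depth_counts[depth]
        # Guard: an empty sample has total_cnt == 0, so report 0% instead of dividing.
        cum_pct_by_depth[depth] = cum_cnt / total_cnt * 100.0 if total_cnt else 0.0
    return cum_pct_by_depth
```

With an all-zero histogram this returns 0.0 for every depth instead of raising, which would let the rest of the report be built for the remaining samples.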

File that triggers the error

z04748_syw1.wgs_fine_hist.csv

MultiQC Error log

/data/NGS/NovaSeq_analysis_data/220729_A00866_0397_BHCMYCDMXY_syw1_bam> multiqc -f . /data/NGS/NovaSeq_analysis_data/220729_A00866_0397_BHCMYCDMXY_syw1_fastq/

  /// MultiQC 🔍 | v1.12

|           multiqc | Search path : /data/netapp/DE_MHG2_FASCL01_fvol3/NGS/NovaSeq_analysis_data/220729_A00866_0397_BHCMYCDMXY_syw1_bam
|           multiqc | Search path : /data/NGS/NovaSeq_analysis_data/220729_A00866_0397_BHCMYCDMXY_syw1_fastq
|         searching | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 5001/5001  
|          mosdepth | Found 79 reports
╭───────────────────────────────────────────── Oops! The 'dragen' MultiQC module broke... ─────────────────────────────────────────────╮
│ Please copy this log and report it at https://github.com/ewels/MultiQC/issues                                                        │
│ Please attach a file that triggers the error. The last file found was: ./z04748_syw1/z04748_syw1.wgs_fine_hist.csv                   │
│                                                                                                                                      │
│ Traceback (most recent call last):                                                                                                   │
│   File "/home/helfert/miniconda3/lib/python3.8/site-packages/multiqc/multiqc.py", line 651, in run                                   │
│     output = mod()                                                                                                                   │
│   File "/home/helfert/miniconda3/lib/python3.8/site-packages/multiqc/modules/dragen/dragen.py", line 64, in __init__                 │
│     samples_found |= self.add_coverage_hist()                                                                                        │
│   File "/home/helfert/miniconda3/lib/python3.8/site-packages/multiqc/modules/dragen/coverage_hist.py", line 21, in add_coverage_hist │
│     s_name, data_by_phenotype = parse_fine_hist(f)                                                                                   │
│   File "/home/helfert/miniconda3/lib/python3.8/site-packages/multiqc/modules/dragen/coverage_hist.py", line 140, in parse_fine_hist  │
│     cum_pct = cum_cnt / total_cnt * 100.0                                                                                            │
│ ZeroDivisionError: division by zero                                                                                                  │
│                                                                                                                                      │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
|            fastqc | Found 79 reports
|            fastqc | Sample had zero reads: 'z04748_syw1'
|        bclconvert | 2 lanes and 79 samples found
|           multiqc | Compressing plot data
|           multiqc | Deleting    : multiqc_report.html   (-f was specified)
|           multiqc | Deleting    : multiqc_data   (-f was specified)
|           multiqc | Report      : multiqc_report.html
|           multiqc | Data        : multiqc_data
|           multiqc | MultiQC complete

@cmatKhan

I just had the same error. This is from a minimal test set which seems to be a bit too minimal.

  ╭───────────────── Oops! The 'rseqc' MultiQC module broke... ──────────────────╮
  │ Please copy this log and report it at                                        │
  │ https://github.com/ewels/MultiQC/issues                                      │
  │ Please attach a file that triggers the error. The last file found was:       │
  │ ./4/hg002_T1_star.bam_stat.txt                                               │
  │                                                                              │
  │ Traceback (most recent call last):                                           │
  │   File "/usr/local/lib/python3.10/site-packages/multiqc/multiqc.py", line 65 │
  │     output = mod()                                                           │
  │   File "/usr/local/lib/python3.10/site-packages/multiqc/modules/rseqc/rseqc. │
  │     n[sm] = getattr(module, "parse_reports")(self)                           │
  │   File "/usr/local/lib/python3.10/site-packages/multiqc/modules/rseqc/bam_st │
  │     d["unique_percent"] = (float(d["mapq_gte_mapq_cut_unique"]) / t) * 100.0 │
  │ ZeroDivisionError: float division by zero                                    │
  │                                                                              │
  ╰──────────────────────────────────────────────────────────────────────────────╯

The bam stat file looks like this:

#==================================================
#All numbers are READ count
#==================================================

Total records:                          0

QC failed:                              0
Optical/PCR duplicate:                  0
Non primary hits                        0
Unmapped reads:                         0
mapq < mapq_cut (non-unique):           0

mapq >= mapq_cut (unique):              0
Read-1:                                 0
Read-2:                                 0
Reads map to '+':                       0
Reads map to '-':                       0
Non-splice reads:                       0
Splice reads:                           0
Reads mapped in proper pairs:           0
Proper-paired reads map to different chrom:0

hg002_T1_star.bam_stat.txt
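
For reference, a minimal sketch of how a file like the one above could be parsed and the unique-read percentage computed without dividing by zero. The function names `parse_bam_stat` and `unique_percent` are hypothetical, and using "Total records" as the denominator is an assumption; the traceback is truncated and does not show how the MultiQC RSeQC module actually derives `t`.

```python
import re
from typing import Dict, Optional


def parse_bam_stat(text: str) -> Dict[str, int]:
    """Parse 'label: count' lines from RSeQC bam_stat output (hypothetical helper)."""
    counts: Dict[str, int] = {}
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # skip the banner/comment lines
        # Label, optional colon, optional whitespace, then an integer count.
        match = re.match(r"^(.+?):?\s*(\d+)\s*$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def unique_percent(counts: Dict[str, int]) -> Optional[float]:
    """Percentage of uniquely mapped reads, or None when there are no records."""
    total = counts.get("Total records", 0)  # assumed denominator
    if total == 0:
        return None  # undefined for an empty BAM; avoids ZeroDivisionError
    return counts.get("mapq >= mapq_cut (unique)", 0) / total * 100.0
```

Returning `None` rather than 0.0 is a design choice in this sketch: it keeps "no data" distinguishable from "0% unique".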

ewels added the "bug: module" (Bug in a MultiQC module) label Jan 10, 2023

Member

ewels commented Jan 15, 2023

> I just had the same error.

I don't think so: your error comes from the RSeQC module, whereas the first one comes from Dragen. Both raise a ZeroDivisionError, but that is a very standard core Python exception type, and the two bugs are quite separate.

Thanks for commenting and providing test data though, I'll check and try to fix both bugs 👍🏻 I missed labelling this issue before so hadn't caught it before release, apologies.

Phil

ewels added a commit to MultiQC/test-data that referenced this issue Jan 15, 2023
ewels closed this as completed in 8c3fff2 Jan 15, 2023

Member

ewels commented Jan 15, 2023

@stefanhelfert I just tried your example data with the latest release and I don't get the same error any more. So I think that this was fixed as part of one of the several large Dragen PRs that have been merged recently. I correctly get a report with two basically empty plots, showing zero coverage.

@cmatKhan - I found and fixed the RSeQC bug in 8c3fff2. Thanks for reporting! 🎉
