htseq-count output files skipped when more than two columns present #2114

Closed
4 tasks done
gl-eb opened this issue Oct 12, 2023 · 1 comment · Fixed by #2129
Labels
bug: module Bug in a MultiQC module

Comments

@gl-eb

gl-eb commented Oct 12, 2023

Description of bug

If htseq-count output files contain more than the two standard columns (feature and count), MultiQC ignores them. htseq-count users can add further columns, e.g. for the feature name or the chromosome on which the feature is located.
This behaviour is present in both MultiQC 1.15 and 1.16; versions 1.13 and 1.14 work as expected.

The current regex pattern ^(feature\tcount|\w+\t\d+)$ does not allow for additional columns. Adding the following pattern to my system-wide config file fixed the issue: ^(feature\tcount|\w+.*\t\d+)$
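
To illustrate the difference between the two patterns, here is a minimal sketch using Python's `re` module. The sample lines and the `matches` helper are made up for illustration (they are not taken from the attached file), and the sketch only tests single lines; it does not reproduce how MultiQC itself applies the pattern while searching files.

```python
import re

# The two patterns quoted above, compiled as-is.
OLD_PATTERN = re.compile(r"^(feature\tcount|\w+\t\d+)$")
NEW_PATTERN = re.compile(r"^(feature\tcount|\w+.*\t\d+)$")

# Hypothetical sample lines (illustrative values only).
two_col = "ENSG00000000003\t42"            # feature, count
three_col = "ENSG00000000003\tTSPAN6\t42"  # feature, extra column, count


def matches(pattern: re.Pattern, line: str) -> bool:
    """Return True if the whole line matches the pattern."""
    return pattern.match(line) is not None


for label, line in [("two columns", two_col), ("three columns", three_col)]:
    print(f"{label}: old={matches(OLD_PATTERN, line)} new={matches(NEW_PATTERN, line)}")
```

The old pattern requires the count to follow the first tab directly, so a line with an extra column fails. In the new pattern, `.*` absorbs any intermediate columns before the final tab and count. Note that `.*` is permissive and also accepts content other than tab-separated columns between the feature and the count.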

File that triggers the error

WTCHG_536273_229157.tsv.zip

MultiQC Error log

multiqc -v --force --interactive --filename multiqc-test --outdir <outdir> <srcdir>

  /// MultiQC 🔍 | v1.16

[2023-10-12 20:53:29] multiqc                                            [DEBUG  ]  This is MultiQC v1.16
[2023-10-12 20:53:29] multiqc                                            [DEBUG  ]  Command used: /Users/gleb/micromamba/envs/multiqc-new/bin/multiqc -v --force --interactive --filename multiqc-test --outdir <outdir> <srcdir>
[2023-10-12 20:53:30] multiqc                                            [DEBUG  ]  Latest MultiQC version is v1.16
[2023-10-12 20:53:30] multiqc                                            [DEBUG  ]  Working dir : /Users/gleb
[2023-10-12 20:53:30] multiqc                                            [DEBUG  ]  Template    : default
[2023-10-12 20:53:30] multiqc                                            [DEBUG  ]  Running Python 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:37:07) [Clang 15.0.7 ]
[2023-10-12 20:53:30] multiqc                                            [DEBUG  ]  Analysing modules: custom_content, ccs, ngsderive, purple, conpair, lima, peddy, somalier, methylQA, mosdepth, phantompeakqualtools, qualimap, preseq, hifiasm, quast, qorts, rna_seqc, rockhopper, rsem, rseqc, busco, bustools, goleft_indexcov, gffcompare, disambiguate, supernova, deeptools, sargasso, verifybamid, mirtrace, happy, mirtop, sambamba, gopeaks, homer, hops, macs2, theta2, snpeff, gatk, htseq, bcftools, featureCounts, fgbio, dragen, dragen_fastqc, dedup, pbmarkdup, damageprofiler, mapdamage, biobambam2, jcvi, mtnucratio, picard, vep, sentieon, bakta, prokka, qc3C, nanostat, samblaster, samtools, sexdeterrmine, eigenstratdatabasetools, bamtools, jellyfish, vcftools, longranger, stacks, varscan2, snippy, umitools, bbmap, bismark, biscuit, diamond, hicexplorer, hicup, hicpro, salmon, kallisto, slamdunk, star, hisat2, tophat, bowtie2, bowtie1, cellranger, snpsplit, odgi, pangolin, nextclade, freyja, humid, kat, leehom, librarian, adapterRemoval, bbduk, clipandmerge, cutadapt, flexbar, sourmash, kaiju, kraken, malt, motus, trimmomatic, sickle, skewer, sortmerna, biobloomtools, fastq_screen, afterqc, fastp, fastqc, filtlong, prinseqplusplus, pychopper, porechop, pycoqc, minionqc, anglerfish, multivcfanalyzer, clusterflow, checkqc, bcl2fastq, bclconvert, interop, ivar, flash, seqyclean, optitype, whatshap
[2023-10-12 20:53:30] multiqc                                            [DEBUG  ]  Using temporary directory for creating report: /var/folders/2g/nwb9zhbj4kb0_x6y0r2yx5hw0000gn/T/tmpqi862ikb
[2023-10-12 20:53:30] multiqc                                            [INFO   ]  Search path : <srcdir>
|         searching | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 27/27
[2023-10-12 20:53:30] multiqc                                            [DEBUG  ]  Summary of files that were skipped by the search: [skipped_module_specific_max_filesize: 189] // [skipped_no_match: 27]
[2023-10-12 20:53:30] multiqc.plots.bargraph                             [DEBUG  ]  Using matplotlib version 3.8.0
[2023-10-12 20:53:30] multiqc.plots.linegraph                            [DEBUG  ]  Using matplotlib version 3.8.0
[2023-10-12 20:53:30] multiqc                                            [DEBUG  ]  No samples found: custom_content
[2023-10-12 20:53:30] multiqc.utils.software_versions                    [DEBUG  ]  Reading software versions from config.software_versions
[2023-10-12 20:53:30] multiqc                                            [WARNING]  No analysis results found. Cleaning up..
[2023-10-12 20:53:30] multiqc                                            [INFO   ]  MultiQC complete

Before submitting

  • I have read the troubleshooting documentation.
  • I am using the latest release of MultiQC.
  • I have included a full MultiQC log, not truncated.
  • I have attached an input file (.zip if necessary) that triggers the error.
@vladsavelyev
Member

Thanks a lot, @gl-eb, for the bug report, the test example, and even the fix itself!

I merged your example into MultiQC_TestData and added your fix in this PR: #2129

@vladsavelyev vladsavelyev added this to the MultiQC v1.17 milestone Oct 16, 2023
@vladsavelyev vladsavelyev added the bug: module Bug in a MultiQC module label Oct 16, 2023