Bug in multiqc / mirTop broke #137

Closed
apeltzer opened this issue Mar 18, 2022 · 29 comments
Labels
bug Something isn't working

Comments

@apeltzer
Member

Had this unfortunately with the dev branch:

  /// MultiQC <U+1F50D> | v1.12

|           multiqc | Search path : /work/38/b3bd004777869168b75aaf74ee8d69
|         searching | ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 418/418
|    custom_content | software_versions: Found 1 sample (html)
|    custom_content | nf-core-smrnaseq-summary: Found 1 sample (html)
╭───────────────── Oops! The 'mirtop' MultiQC module broke... ─────────────────╮
│ Please copy this log and report it at                                        │
│ https://github.com/ewels/MultiQC/issues                                      │
│ Please attach a file that triggers the error. The last file found was:       │
│ ./full_mirtop_stats.log                                                      │
│                                                                              │
│ Traceback (most recent call last):                                           │
│   File "/usr/local/lib/python3.10/site-packages/multiqc/multiqc.py", line 65 │
│     output = mod()                                                           │
│   File "/usr/local/lib/python3.10/site-packages/multiqc/modules/mirtop/mirto │
│     self.parse_mirtop_report(f)                                              │
│   File "/usr/local/lib/python3.10/site-packages/multiqc/modules/mirtop/mirto │
│     parsed_data["read_count"] = parsed_data["isomiR_sum"] + parsed_data["ref │
│ KeyError: 'ref_miRNA_sum'                                                    │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯
|          samtools | Found 158 stats reports
|            fastqc | Found 79 reports
|           multiqc | Compressing plot data
|           multiqc | Report      : multiqc_report.html
|           multiqc | Data        : multiqc_data
|           multiqc | Plots       : multiqc_plots
|           multiqc | MultiQC complete
|           multiqc | 1 flat-image plot used in the report due to large sample numbers
|           multiqc | To force interactive plots, use the '--interactive' flag.
See the documentation.

Not sure what is causing this; I haven't checked the module so far.

@ewels any idea?

@apeltzer apeltzer added the bug Something isn't working label Mar 18, 2022
@ewels
Member

ewels commented Mar 19, 2022

Not sure sorry, I guess the output is different for some reason? Not seen it before, need to see the tool output. These things are usually fairly easy to patch though.

@apeltzer
Member Author

Hi @ewels, I have added some fake data (closely resembling the original data) that could serve as a test case for mirtop. See the linked PR to the test data: MultiQC/test-data#227

@lpantano might also be interested in seeing this: it looks like the ref_miRNA_sum variable is not present for all the samples here, which would explain why the access fails.
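
A minimal sketch of the suspected failure mode, assuming the module sums the two counts per sample as the (truncated) traceback line and the KeyError suggest. The function names and sample values are hypothetical and this is not the actual MultiQC module code; treating a missing count as zero is just one possible way to handle it.

```python
# Hypothetical illustration; not the actual MultiQC mirtop module code.

def read_count_strict(parsed_data):
    # Assumed shape of the failing line: raises KeyError if either key is absent.
    return parsed_data["isomiR_sum"] + parsed_data["ref_miRNA_sum"]


def read_count_tolerant(parsed_data):
    # Treat a missing category as zero reads instead of crashing.
    return parsed_data.get("isomiR_sum", 0) + parsed_data.get("ref_miRNA_sum", 0)


# Made-up values: the second sample has no reads matching a reference miRNA.
samples = {
    "sample_ok": {"isomiR_sum": 120, "ref_miRNA_sum": 80},
    "sample_no_ref": {"isomiR_sum": 5},
}

for name, data in samples.items():
    try:
        strict = read_count_strict(data)
    except KeyError as err:
        strict = f"KeyError: {err}"
    print(name, strict, read_count_tolerant(data))
```

With the tolerant variant a sample without reference matches still gets a read count, so one unusual sample would not take down the whole module.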

@lpantano
Contributor

Oh, I didn't think about not having that for all samples. To be honest, that seems weird and related to the data itself: having no sequences matching the references is unusual. Thanks for spotting this.

@apeltzer
Member Author

Considering that we likely need to address this upstream, either in mirtop (to enforce that this is always present in the output) or in the MultiQC module, can we potentially add a --skip_mirtop option so users can circumvent this bug until it is fixed upstream? @JoseEspinosa @lpantano what do you think about that?

@JoseEspinosa
Member

Yes, that might be a good solution for now. Should we skip the whole process, or only the MultiQC part, i.e. --skip_mirtop_multiqc?

@lpantano
Contributor

I think the quantification is useful. If it were my data, I would prefer to skip only the MultiQC QC step in this case.

@lpantano
Contributor

I fixed that in mirtop. We are waiting for Bioconda to bump the version, and then we need to update the Docker container. I forgot how to do that, so if somebody knows how, I'm happy to get help here :)

@JoseEspinosa
Member

You mean the mulled container here? I can take care of it if that is what you are referring to.

@JoseEspinosa
Member

Here is the PR in the BioContainers multi-package-containers repository: BioContainers/multi-package-containers#2150

@JoseEspinosa
Member

Here is the PR with the updated container: #143

@apeltzer
Member Author

All fine now 👍🏻

@apeltzer
Member Author

Hmm, unfortunately still an issue. I've had more projects with the same error :-(

@apeltzer apeltzer reopened this Jun 17, 2022
@apeltzer
Member Author

It still looks like some samples are missing the same entry, i.e. ref_miRNA_sum is unfortunately not present for all samples. The MultiQC module therefore breaks, crashing the pipeline :-(

@apeltzer apeltzer added this to the Release 2.1.0 milestone Jun 17, 2022
@gpalidwor

I'm getting the same problem, and it looks like the same cause: the smrnaseq pipeline fails on the same mirTop parsing error, while the rest of the pipeline completes. Can you recommend a workaround? Is there a way to suppress MultiQC parsing of mirTop in the pipeline, or should I just run MultiQC separately?

@apeltzer
Member Author

I think the problem is that the MultiQC module assumes certain keys are present in the mirtop output. However, mirtop does not always produce them, so the MultiQC module crashes when it can't find them for all samples while computing metrics. We hoped that the fix @lpantano made in mirtop 0.4.25 had resolved this, but apparently not :-(

There is currently not really a workaround: skipping MultiQC and running it outside the pipeline means you will not have a nice report, which is also not ideal.

@ewels
Member

ewels commented Jun 22, 2022

Is there an issue / PR already for this on the MultiQC repo? Should be fairly easy to fix.

@gpalidwor

gpalidwor commented Jun 22, 2022

Is there an issue / PR already for this on the MultiQC repo? Should be fairly easy to fix.

I don't see one, want me to open it?

@ewels
Member

ewels commented Jun 22, 2022

Yes please 👍🏻 Then maybe we can put @ErikDanielsson on the job over a coffee break 😅

@gpalidwor

Bug opened MultiQC/MultiQC#1712

@apeltzer
Member Author

Yeah, maybe a good idea to fix this in MultiQC, although that means we have to wait for MultiQC v1.13 to fix this 😢

@ewels
Member

ewels commented Jun 22, 2022

wait

@lpantano
Contributor

Hi,
If you give me a GFF file from mirtop that is missing that entry, I can take a look. I know where the code is, so I can push a fix quickly. It is working for us, so I don't have an example.
Thanks!

@gpalidwor

gpalidwor commented Jun 23, 2022

Below is the GFF file from the mirtop output in the results directory of the nf-core/smrna output, from the run where the MultiQC parsing of the mirtop output failed. It's mostly zero counts because the configuration was bad; I'm rerunning to see if MultiQC succeeds with the proper config.

mirtop.gff.gz

@gpalidwor

FWIW: my issue was due to a misconfiguration leading to samples with low or zero mapping. Fixing the configuration fixed the MultiQC/mirtop problem for me. This is probably still a MultiQC issue that should be fixed eventually, as a lack of any mapped miRNA is a reasonable edge case.

@ewels
Member

ewels commented Jun 27, 2022

Yeah, MultiQC should never crash with a traceback like that. For starters it means that all subsequent logs from that tool will be ignored, so it should definitely be fixed 👍🏻
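
As a rough sketch of that idea (hypothetical function names and made-up file contents, not MultiQC's real internals): if each file is parsed inside its own try/except, one malformed report is logged and skipped while the remaining logs for the tool are still processed.

```python
import logging

log = logging.getLogger("sketch")


def parse_report(contents):
    # Hypothetical parser that expects both counts to be present.
    return {"read_count": contents["isomiR_sum"] + contents["ref_miRNA_sum"]}


def parse_all(files):
    """Parse every file; skip (and report) the ones that fail instead of aborting."""
    parsed, skipped = {}, []
    for name, contents in files.items():
        try:
            parsed[name] = parse_report(contents)
        except KeyError as err:
            log.warning("Skipping %s: missing key %s", name, err)
            skipped.append(name)
    return parsed, skipped


files = {
    "a.log": {"isomiR_sum": 10, "ref_miRNA_sum": 4},
    "b.log": {"isomiR_sum": 3},  # made-up file lacking ref_miRNA_sum
    "c.log": {"isomiR_sum": 7, "ref_miRNA_sum": 1},
}
parsed, skipped = parse_all(files)
print(sorted(parsed), skipped)
```

Whether to skip the sample or fill in a default is a design choice; skipping keeps the report honest about what could not be parsed.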

@ewels
Member

ewels commented Jun 27, 2022

Ok @ErikDanielsson should have fixed it in MultiQC, just merged to dev.

If you want this quickly you can do a Bioconda release of dev if you like. People have done that in the past.

@apeltzer
Member Author

I'll do that then, should be fine to do in such a case and then upgrade in smrnaseq once we have 1.13 again 🥳

@apeltzer
Member Author

The PR in modules is open: nf-core/modules#1814. Once this is merged, we can upgrade the MultiQC module here in the repository and we're fine again 👍🏻

@apeltzer
Member Author

MultiQC 1.13 is released, so this should be fixable by upgrading to the MultiQC 1.13 module.
