[FIX] Found errors in qc after debbuging #69

Merged
merged 4 commits into from
May 4, 2022

@Mxrcon Mxrcon commented Apr 28, 2022

Hey 👋, as requested in our latest meeting and mentioned in #58 (comment), I'm giving some attention to the qc_reports workflow.

I tried running the pipeline normally, without specifying --skip-qc, to check whether the QC report works as expected.

I got this error message:

 Error executing process > 'QC_REPORTS:MULTIQC'

Caused by:
  Not a valid path value type: java.util.ArrayList ([/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/work/5d/7a1c18eec6224ab8d4e86fa82c42b2/G049482_R1_fastqc.html, /data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/work/5d/7a1c18eec6224ab8d4e86fa82c42b2/G049482_R2_fastqc.html])


Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

This error was thrown because the fastqc outputs did not match the multiqc inputs, so I emitted a new named channel so that they match.
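To illustrate the mismatch: the error message reports a list of two HTML paths arriving where a single path value was expected. The sketch below is only a Python analogy of that shape problem, not the Nextflow change made in this PR; the file names and the `flatten_paths` helper are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Union

PathOrList = Union[Path, List[Path]]


def flatten_paths(items: Iterable[PathOrList]) -> List[Path]:
    """Return a flat list of paths from items that are paths or lists of paths."""
    flat: List[Path] = []
    for item in items:
        if isinstance(item, list):
            flat.extend(item)  # unpack a per-sample list of reports
        else:
            flat.append(item)  # already a single path
    return flat


# Hypothetical per-sample outputs: each sample yields a *list* of two reports.
per_sample_reports = [
    [Path("sampleA_R1_fastqc.html"), Path("sampleA_R2_fastqc.html")],
    [Path("sampleB_R1_fastqc.html"), Path("sampleB_R2_fastqc.html")],
]

# A consumer that expects one path per element would see lists instead.
assert all(isinstance(item, list) for item in per_sample_reports)

# After flattening, every element is a single path.
all_reports = flatten_paths(per_sample_reports)
print(len(all_reports))
```

Whichever way the shapes are reconciled, the point is the same: the producer and the consumer have to agree on whether an element is one file or a group of files.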

After that, the installed version of multiqc gave an odd error:

  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/bin/multiqc", line 6, in <module>
    from multiqc.__main__ import multiqc
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/multiqc/__init__.py", line 16, in <module>
    from .multiqc import run
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/multiqc/multiqc.py", line 38, in <module>
    from .plots import table
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/multiqc/plots/table.py", line 9, in <module>
    from multiqc.utils import config, report, util_functions, mqc_colour
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/multiqc/utils/mqc_colour.py", line 7, in <module>
    import spectra
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/spectra/__init__.py", line 1, in <module>
    from .core import COLOR_SPACES, Color, Scale
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/spectra/core.py", line 1, in <module>
    from colormath import color_objects, color_conversions
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/colormath/color_conversions.py", line 13, in <module>
    import networkx
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/networkx/__init__.py", line 81, in <module>
    from networkx import algorithms
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/networkx/algorithms/__init__.py", line 81, in <module>
    from networkx.algorithms import tree
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/networkx/algorithms/tree/__init__.py", line 1, in <module>
    from .branchings import *
  File "/data/mariliaconceicao/Davi/test_qc_wf/mtbseq-nf/conda_envs/mtbseq-nf-env/lib/python3.6/site-packages/networkx/algorithms/tree/branchings.py", line 30, in <module>
    from dataclasses import dataclass, field
ModuleNotFoundError: No module named 'dataclasses'

This could be fixed by updating multiqc to the latest version (v1.12). We should check this update, as it will downgrade picard to 2.17; this shouldn't be a problem, since the bioconda recipe for mtbseq requires picard >=2.17.0.
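For context, the traceback ends with networkx importing `dataclasses`, a module that only joined the Python standard library in 3.7, while the paths in the traceback point to a python3.6 environment. A small check along these lines could confirm whether a given environment is affected before rerunning the pipeline. This is a minimal sketch; the helper names are hypothetical.

```python
import importlib.util
import sys


def has_dataclasses() -> bool:
    """Return True if the `dataclasses` module can be imported here."""
    return importlib.util.find_spec("dataclasses") is not None


def describe_environment() -> str:
    """Summarise the interpreter version and dataclasses availability."""
    version = "{}.{}".format(*sys.version_info[:2])
    if has_dataclasses():
        return "Python {}: dataclasses is available".format(version)
    return "Python {}: dataclasses is missing (stdlib only from 3.7)".format(version)


print(describe_environment())
```

Run with the interpreter from the conda env in question, this reports whether the import at the bottom of the traceback would succeed.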

Running MTBseq --check:

MTBseq --check
<INFO>	[2022-04-28 14:53:25]	Found perl module: MCE
<INFO>	[2022-04-28 14:53:25]	Found perl module: Statistics::Basic
<INFO>	[2022-04-28 14:53:26]	Found bwa in your PATH!
<INFO>	[2022-04-28 14:53:26]	Found samtools in your PATH!
<INFO>	[2022-04-28 14:53:26]	Found gatk in your PATH!
<INFO>	[2022-04-28 14:53:26]	Found picard in your PATH!

After those changes, I ran mtbseq_nf again to double-check whether the changes to the conda env would make any difference.

N E X T F L O W  ~  version 21.04.0
Launching `main.nf` [furious_mayer] - revision: affe96c6e4
executor >  local (24)
[43/09d9bd] process > QC_REPORTS:FASTQC (G049502)                                      [100%] 3 of 3 ✔
[73/9f8eb0] process > QC_REPORTS:MULTIQC                                               [100%] 1 of 1 ✔
[75/9195e2] process > PARALLEL_ANALYSIS:PER_SAMPLE_ANALYSIS:TBBWA (G049502 - prj)      [100%] 3 of 3 ✔
[85/58f29e] process > PARALLEL_ANALYSIS:PER_SAMPLE_ANALYSIS:TBREFINE (G049502 - prj)   [100%] 3 of 3 ✔
[c6/dc4aa8] process > PARALLEL_ANALYSIS:PER_SAMPLE_ANALYSIS:TBPILE (G049502 - prj)     [100%] 3 of 3 ✔
[e1/9632d5] process > PARALLEL_ANALYSIS:PER_SAMPLE_ANALYSIS:TBLIST (G049502 - prj)     [100%] 3 of 3 ✔
[a7/03c05c] process > PARALLEL_ANALYSIS:PER_SAMPLE_ANALYSIS:TBVARIANTS (G049502 - prj) [100%] 3 of 3 ✔
[15/d61257] process > PARALLEL_ANALYSIS:PER_SAMPLE_ANALYSIS:TBSTATS (prj)              [100%] 1 of 1 ✔
[0f/d681ee] process > PARALLEL_ANALYSIS:PER_SAMPLE_ANALYSIS:TBSTRAINS (prj)            [100%] 1 of 1 ✔
[c3/880d59] process > PARALLEL_ANALYSIS:COHORT_ANALYSIS:TBJOIN (prj)                   [100%] 1 of 1 ✔
[57/52a62a] process > PARALLEL_ANALYSIS:COHORT_ANALYSIS:TBAMEND (prj)                  [100%] 1 of 1 ✔
[71/e564ac] process > PARALLEL_ANALYSIS:COHORT_ANALYSIS:TBGROUPS (prj)                 [100%] 1 of 1 ✔
Completed at: 28-Apr-2022 15:47:48
Duration    : 57m 29s
CPU hours   : 2.4
Succeeded   : 24


Happy to contribute to this work. Feel free to add changes or request any modifications to my approach to these errors.

Kindly, Davi

@Mxrcon Mxrcon requested a review from abhi18av April 28, 2022 19:01
@Mxrcon Mxrcon self-assigned this Apr 28, 2022
@Mxrcon Mxrcon changed the title from "[FIX] Found erros in qc after debbuging" to "[FIX] Found errors in qc after debbuging" Apr 28, 2022

@abhi18av abhi18av left a comment

Thanks for looking into this @Mxrcon! 🚀

Approving with one suggested change of updating the output channel name.

  1. Generally, we've followed the _ch naming pattern, so I've updated the channel name accordingly.

  2. Since the same output could be consumed by other processes in the future, I've used a generic name.

modules/qc/fastqc.nf (outdated, resolved)
workflows/qc_reports.nf (outdated, resolved)
@abhi18av

> which could be fixed by updating multiqc to the latest version (v1.12). We should check this update, as it will downgrade picard to 2.17; this shouldn't be a problem, since the bioconda recipe for mtbseq requires picard >=2.17.0.

So, what's the picard version in the env as of now?

Mxrcon commented Apr 29, 2022

> So, what's the picard version in the env as of now?

Using conda env export: picard=2.27.1=hdfd78af_0
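As a quick sanity check against the picard >=2.17.0 requirement mentioned earlier, the exported line can be parsed and compared numerically. This is a minimal sketch that assumes purely numeric dotted versions; `parse_conda_spec` and `satisfies_minimum` are hypothetical helper names.

```python
from typing import Tuple


def parse_conda_spec(spec: str) -> Tuple[str, Tuple[int, ...]]:
    """Split a `name=version=build` line into (name, numeric version tuple)."""
    name, version = spec.split("=")[:2]
    return name, tuple(int(part) for part in version.split("."))


def satisfies_minimum(version: Tuple[int, ...], minimum: Tuple[int, ...]) -> bool:
    """Compare versions as integer tuples rather than as strings."""
    return version >= minimum


name, version = parse_conda_spec("picard=2.27.1=hdfd78af_0")
print(name, version, satisfies_minimum(version, (2, 17, 0)))
```

Comparing integer tuples avoids the string-comparison pitfall where "2.9" would sort above "2.17".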

abhi18av commented May 4, 2022

Cool, then I think that we're good to go 🤞

Thanks Davi!

@abhi18av abhi18av merged commit ad52933 into master May 4, 2022
@abhi18av abhi18av deleted the mxrcon/fix_qc branch May 4, 2022 10:36