Passing TrimGalore --hardtrim3 / --hardtrim5 via custom config raises missing output filename error #922

Closed
isaiahwtaylor opened this issue Jan 4, 2023 · 1 comment · Fixed by nf-core/modules#2715

@isaiahwtaylor

Description of the bug

When I set the Trim Galore hard-trim option (--hardtrim5 50) in a custom config file, the resulting FastQ files are named like XYZ_1.50bp_5prime.fq.gz, which does not match the output pattern the process expects. Nextflow then throws an error:
Missing output file(s) '*{trimmed,val}*.fq.gz' expected by process

If I don't set --hardtrim5 50, the pipeline succeeds.
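To illustrate why the hard-trimmed file is not picked up, here is a minimal Python sketch that checks file names against the glob from the error message. It is only an approximation: Nextflow resolves output globs with its own (Java-based) matcher, `expand_braces` and `matches` are hypothetical helpers written for this example, and the `_val_1` file name is an assumption about what Trim Galore normally writes for paired-end input.

```python
from fnmatch import fnmatchcase


def expand_braces(pattern):
    """Expand {a,b,...} groups into a list of plain glob patterns."""
    start = pattern.find("{")
    end = pattern.find("}", start)
    if start == -1 or end == -1:
        return [pattern]
    head, body, tail = pattern[:start], pattern[start + 1:end], pattern[end + 1:]
    expanded = []
    for alternative in body.split(","):
        # Recurse so that any later brace groups are expanded as well.
        expanded.extend(expand_braces(head + alternative + tail))
    return expanded


def matches(filename, pattern):
    """True if the file name matches any brace-expanded variant of the glob."""
    return any(fnmatchcase(filename, p) for p in expand_braces(pattern))


# Glob copied from the Nextflow error message.
OUTPUT_GLOB = "*{trimmed,val}*.fq.gz"

# File name written by --hardtrim5 50, copied from the Trim Galore log.
hardtrim_name = "study_06200546b0001s_20210430_1.50bp_5prime.fq.gz"

# Assumed name for a normal paired-end run (not taken from this issue).
usual_name = "study_06200546b0001s_20210430_1_val_1.fq.gz"

print(matches(hardtrim_name, OUTPUT_GLOB))
print(matches(usual_name, OUTPUT_GLOB))
```

The hard-trimmed name contains neither "trimmed" nor "val", so no variant of the glob can match it, even though Trim Galore itself exits with status 0.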

Command used and terminal output

Command: 

nextflow run nf-core/rnaseq --input /path/to/input/sample_sheet.csv --outdir /path/to/output/results/ --fasta /path/to/genome.fa --gtf /path/to/gtf/genes.gtf --skip_markduplicates -profile singularity -c gne_11_28_22.config

Output:

Work dir:
/gstore/scratch/u/taylori5/d1/915e503fceb06efff5d79e464b1cae
Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out
executor > slurm (100)
[88/95579c] process > NFCORE_RNASEQ:RNASEQ:PREPAR... [100%] 1 of 1 ✔
[17/65fe93] process > NFCORE_RNASEQ:RNASEQ:PREPAR... [100%] 1 of 1 ✔
[-    ] process > NFCORE_RNASEQ:RNASEQ:PREPAR... [ 0%] 0 of 1
[16/a2ca06] process > NFCORE_RNASEQ:RNASEQ:PREPAR... [100%] 1 of 1 ✔
[c4/694a51] process > NFCORE_RNASEQ:RNASEQ:PREPAR... [100%] 1 of 1 ✔
[74/a03d96] process > NFCORE_RNASEQ:RNASEQ:INPUT_... [100%] 1 of 1 ✔
[-    ] process > NFCORE_RNASEQ:RNASEQ:CAT_FASTQ -
[f7/b7137c] process > NFCORE_RNASEQ:RNASEQ:FASTQC... [ 29%] 56 of 188
[d2/1bb566] process > NFCORE_RNASEQ:RNASEQ:FASTQC... [ 20%] 39 of 188, failed...
[-    ] process > NFCORE_RNASEQ:RNASEQ:MULTIQ... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:QUANTI... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:QUANTI... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:QUANTI... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:QUANTI... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:QUANTI... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:QUANTI... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:QUANTI... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:DESEQ2... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:MULTIQ... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:PRESEQ... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:STRING... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:SUBREA... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:MULTIQ... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:BEDTOO... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:BEDGRA... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:BEDGRA... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:BEDGRA... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:BEDGRA... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:QUALIM... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:DUPRADAR -
[-    ] process > NFCORE_RNASEQ:RNASEQ:RSEQC:... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:RSEQC:... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:RSEQC:... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:RSEQC:... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:RSEQC:... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:RSEQC:... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:RSEQC:... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:MULTIQ... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:CUSTOM... -
[-    ] process > NFCORE_RNASEQ:RNASEQ:MULTIQC  -
-[nf-core/rnaseq] Pipeline completed with errors-
Error executing process > 'NFCORE_RNASEQ:RNASEQ:FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE (study_06200546b0001s_20210430)'
Caused by:
 Missing output file(s) *{trimmed,val}*.fq.gz expected by process NFCORE_RNASEQ:RNASEQ:FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE (study_06200546b0001s_20210430)
Command executed:
 [ ! -f study_06200546b0001s_20210430_1.fastq.gz ] && ln -s study_ngs_rna_wts_totalrna_06200546b0001s_20210430_1.fastq.gz study_06200546b0001s_20210430_1.fastq.gz
 [ ! -f study_06200546b0001s_20210430_2.fastq.gz ] && ln -s study_ngs_rna_wts_totalrna_06200546b0001s_20210430_2.fastq.gz study_06200546b0001s_20210430_2.fastq.gz
 trim_galore \
   --hardtrim5 50 --fastqc_args '-t 8' \
   --cores 4 \
   --paired \
   --gzip \
    \
    \
    \
    \
   study_06200546b0001s_20210430_1.fastq.gz \
   study_06200546b0001s_20210430_2.fastq.gz
 cat <<-END_VERSIONS > versions.yml
 "NFCORE_RNASEQ:RNASEQ:FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE":
   trimgalore: $(echo $(trim_galore --version 2>&1) | sed 's/^.*version //; s/Last.*$//')
   cutadapt: $(cutadapt --version)
 END_VERSIONS
Command exit status:
 0
Command output:
 pigz 2.6
Command error:
 Path to Cutadapt set as: 'cutadapt' (default)
 Cutadapt seems to be working fine (tested command 'cutadapt --version')
 Cutadapt version: 3.4
 Could not detect version of Python used by Cutadapt from the first line of Cutadapt (but found this: >>>#!/bin/sh<<<)
 Letting the (modified) Cutadapt deal with the Python version instead
 Parallel gzip (pigz) detected. Proceeding with multicore (de)compression using 4 cores
 Hard-trimming from the 3'-end selected. File(s) will be trimmed to leave the leftmost 50 bp on the 5'-end, and Trim Galore will then exit.
 Input file name: study_06200546b0001s_20210430_1.fastq.gz
 Writing trimmed version (using the first 50 bp only) of the input file 'study_06200546b0001s_20210430_1.fastq.gz' to 'study_06200546b0001s_20210430_1.50bp_5prime.fq.gz'
 Finished writing out converted version of the FastQ file study_06200546b0001s_20210430_1.fastq.gz (86076033 sequences in total)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Input file name: study_06200546b0001s_20210430_2.fastq.gz
 Writing trimmed version (using the first 50 bp only) of the input file 'study_06200546b0001s_20210430_2.fastq.gz' to 'study_06200546b0001s_20210430_2.50bp_5prime.fq.gz'
 Finished writing out converted version of the FastQ file study_06200546b0001s_20210430_2.fastq.gz (86076033 sequences in total)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Work dir:
 /gstore/scratch/u/taylori5/d1/915e503fceb06efff5d79e464b1cae
Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out

Relevant files

Custom config:
params {
    config_profile_description = 'HPC nf-core config.'
    config_profile_contact = 'Isaiah Taylor'
    singularity_cache_dir = '/path/to/cache/singularity_cache_nfcore'
}

executor {
    name = 'slurm'
    queueSize = 50
    queue = 'defq'
    clusterOptions = '--qos long'
}

process {
    beforeScript = "module load singularity"
}

singularity {
    enabled = true
    autoMounts = true
    cacheDir = params.singularity_cache_dir
}

params {
    max_memory = 64.GB
    max_cpus = 8
    max_time = 336.h
}

process {
    withName: '.*:FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE' {
        ext.args = {
            [
                "--hardtrim5 50 --fastqc_args '-t ${task.cpus}' ",
                params.trim_nextseq > 0 ? "--nextseq ${params.trim_nextseq}" : ''
            ].join(' ').trim()
        }
    }
}

System information

nf-core/rnaseq 3.9

@isaiahwtaylor added the bug label Jan 4, 2023
@drpatelh changed the title from "Setting Trim-galore hard-trimming causes output unexpected output filenames" to "Passing TrimGalore --hardtrim3 / --hardtrim5 via custom config raises missing output filename error" Jan 4, 2023
@drpatelh added this to the 3.11 milestone Jan 4, 2023
@drpatelh added a commit to drpatelh/nf-core-rnaseq that referenced this issue Jan 5, 2023
@drpatelh mentioned this issue Jan 5, 2023
@drpatelh
Member

drpatelh commented Jan 5, 2023

TrimGalore doesn't produce *report.txt files when --hardtrim3 or --hardtrim5 is set, which means we can't get the read stats info. So I had to refactor the logic that filters channels if the FastQ files are empty after trimming.
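As a rough illustration of the idea (not the actual pipeline code, which is written in Nextflow and may work differently), the Python sketch below shows one way a filter could tolerate a missing report: a sample whose read count is unknown is passed through instead of being dropped. The names `keep_sample`, `read_count` and `min_reads`, and the sample values, are all hypothetical.

```python
from typing import Optional


def keep_sample(read_count: Optional[int], min_reads: int = 1) -> bool:
    """Decide whether a trimmed sample should continue downstream.

    read_count is None when no trimming report exists, for example
    because hard-trimming mode was used and no report was written.
    """
    if read_count is None:
        # No stats available: keep the sample rather than guess it is empty.
        return True
    return read_count >= min_reads


# Hypothetical samples: (name, read count parsed from a report, or None).
samples = [
    ("sample_a", 86076033),  # report present, plenty of reads
    ("sample_b", 0),         # report present, nothing left after trimming
    ("sample_c", None),      # no report produced
]

kept = [name for name, count in samples if keep_sample(count)]
print(kept)
```

The design choice in this sketch is to treat "unknown" as "keep", so that the absence of a report never silently removes a sample from the run.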
