"Not a valid path value" error when the genome is provided as gzip-compressed FASTA and GTF without a genome index or gene BED #427

@ljw20180420

When the genome is provided as a FASTA and a GTF, the workflow tries to generate the genome index and the gene BED if they are not provided. This works fine when the FASTA and GTF are uncompressed. However, if either of them is compressed (gz format in my case) and the corresponding downstream file (i.e. the genome index or the gene BED) is not provided, I get the following error:

Not a valid path value: '../genome/Mus_musculus.GRCm38.dna_sm.primary_assembly.fa.gz'
Not a valid path value: '../genome/Mus_musculus.GRCm38.102.gtf.gz'

My params.yaml is

input: './samplesheet.csv'
outdir: './results/'
fasta: '../genome/Mus_musculus.GRCm38.dna_sm.primary_assembly.fa.gz'
gtf: '../genome/Mus_musculus.GRCm38.102.gtf.gz'
narrow_peak: true
aligner: 'bowtie2'
read_length: 150
max_memory: '50.GB'
max_cpus: 24
save_reference: true

My command is

HTTPS_PROXY=http://localhost:1081 nextflow run $PWD/../pipeline/nf-core-chipseq/2_1_0/main.nf -profile singularity -resume -params-file params.yaml

Both should be correct, because the workflow runs fine if I uncompress 'Mus_musculus.GRCm38.dna_sm.primary_assembly.fa.gz' and 'Mus_musculus.GRCm38.102.gtf.gz'.
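
For reference, the decompression workaround can be sketched in shell as below. This is only a minimal sketch, assuming `gzip` is on the PATH. The helper name `decompress_keep` is hypothetical and not part of the pipeline. It writes the uncompressed file next to the original and keeps the `.gz` copy.

```shell
# Hypothetical helper (not part of nf-core/chipseq): decompress FILE.gz to FILE
# in the same directory, keep the compressed original, and print the new path.
decompress_keep() {
    src="$1"
    dest="${src%.gz}"                      # strip the trailing .gz suffix
    gzip -dc "$src" > "$dest" || return 1  # -d decompress, -c write to stdout
    printf '%s\n' "$dest"
}
```

Calling it as `decompress_keep ../genome/Mus_musculus.GRCm38.102.gtf.gz` (and likewise for the FASTA) produces the uncompressed files, which `fasta` and `gtf` in params.yaml can then point at.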

I also found some related issues in other nf-core workflows:

https://github.com/nf-core/rnaseq/issues/1311
https://github.com/nf-core/atacseq/issues/277
https://github.com/nf-core/cutandrun/issues/187

Thank you.

Labels: bug