local module: csvtk_split and snpsift_split (splitVariants) #93

Closed
marissaDubbelaar opened this issue Oct 27, 2021 · 5 comments

Comments

@marissaDubbelaar
Contributor

This local module uses csvtk to split a CSV/TSV file into multiple files according to the values of a column.

@marissaDubbelaar marissaDubbelaar created this issue from a note in Hackathon-October-2021 (epitopeprediction) Oct 27, 2021
@marissaDubbelaar marissaDubbelaar changed the title local module: csvtk_split (splitVariants) local module: csvtk_split and snpsift_split $(splitVariants) Oct 27, 2021
@marissaDubbelaar marissaDubbelaar changed the title local module: csvtk_split and snpsift_split $(splitVariants) local module: csvtk_split and snpsift_split (splitVariants) Oct 27, 2021
@christopher-mohr christopher-mohr added dsl2 enhancement New feature or request labels Oct 27, 2021
@marissaDubbelaar
Contributor Author

We could also make this a global module (csvtk concat is already present in nf-core/modules).

@marissaDubbelaar marissaDubbelaar self-assigned this Oct 27, 2021
@marissaDubbelaar
Contributor Author

```nextflow
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName; getProcessName } from './functions'

params.options = [:]
options        = initOptions(params.options)

process SNPSIFT_SPLIT {

    publishDir "${params.outdir}",
        mode: params.publish_dir_mode,
        saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'', meta:[:], publish_by_meta:[]) }

    conda (params.enable_conda ? "bioconda::snpsift=4.2" : null)
    if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) {
        container "https://depot.galaxyproject.org/singularity/snpsift:4.2--hdfd78af_5"
    } else {
        container "quay.io/biocontainers/snpsift:4.2--hdfd78af_5"
    }

    input:
        // Stage the variant file alongside its meta map
        tuple val(meta), path(variants)

    output:
        tuple val(meta), path("*.vcf"), path("*.tsv"), path("*.GSvar"), emit: splitted
        path "versions.yml", emit: versions
    // when: !params.peptides && !params.show_supported_models // TODO: Remove this by creating a statement in the main workflow

    script:
    // TODO: put the if else statement outside of the process call
    // if ( variants.toString().endsWith('.vcf') || variants.toString().endsWith('.vcf.gz') ) {
        // """
        // SnpSift split ${variants}
        // """
    // }
    // else {
        // """
        // sed -i.bak '/^##/d' ${variants}
        // csvtk split ${variants} -t -C '&' -f '#chr'
        // """
    // }
    // The heredoc terminator must share the indentation of `cat`,
    // because `<<-` strips tabs only, not spaces
    """
    SnpSift split ${variants}

    cat <<-END_VERSIONS > versions.yml
    ${getProcessName(task.process)}:
        snpsift: \$(echo \$(SnpSift 2>&1) )
    END_VERSIONS
    """
}
```

@marissaDubbelaar
Contributor Author

```nextflow
// Import generic module functions
include { initOptions; saveFiles; getSoftwareName; getProcessName } from './functions'

params.options = [:]
options        = initOptions(params.options)

process CSVTK_SPLIT {

    publishDir "${params.outdir}",
        mode: params.publish_dir_mode,
        saveAs: { filename -> saveFiles(filename:filename, options:params.options, publish_dir:'', meta:[:], publish_by_meta:[]) }

    conda (params.enable_conda ? "bioconda::csvtk=0.23.0" : null)
    if (workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container) {
        container "https://depot.galaxyproject.org/singularity/csvtk:0.23.0--h9ee0642_0"
    } else {
        container "quay.io/biocontainers/csvtk:0.23.0--h9ee0642_0"
    }

    input:
        // Stage the variant file alongside its meta map
        tuple val(meta), path(variants)

    output:
        tuple val(meta), path("*.vcf"), path("*.tsv"), path("*.GSvar"), emit: splitted
        path "versions.yml", emit: versions

    // when: !params.peptides && !params.show_supported_models // TODO: Remove this by creating a statement in the main workflow

    script:
    // Drop the '##' header lines, then split the table on the '#chr' column.
    // The heredoc terminator must share the indentation of `cat`,
    // because `<<-` strips tabs only, not spaces
    """
    sed -i.bak '/^##/d' ${variants}
    csvtk split ${variants} -t -C '&' -f '#chr'

    cat <<-END_VERSIONS > versions.yml
    ${getProcessName(task.process)}:
        csvtk: \$(echo \$( csvtk version | sed -e "s/csvtk v//g" ))
    END_VERSIONS
    """
}
```

@marissaDubbelaar
Contributor Author

Since two processes are available at this step, they need to be kept as separate modules and selected with an if/else statement in epitopeprediction.nf.
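
A minimal sketch of how that selection could look in the workflow, assuming both modules accept a `[ meta, variants ]` tuple. The include paths, the `ch_variants` channel and `params.input` are hypothetical and only serve the illustration. It uses Nextflow's `branch` operator rather than a literal if/else, so each input file is routed by its extension:

```nextflow
// Sketch only: module paths, ch_variants and params.input are assumptions
include { SNPSIFT_SPLIT } from '../modules/local/snpsift_split'
include { CSVTK_SPLIT   } from '../modules/local/csvtk_split'

workflow {
    // Hypothetical input channel emitting [ meta, variants ] tuples
    ch_variants = Channel
        .fromPath(params.input)
        .map { f -> [ [ id: f.baseName ], f ] }

    // Route VCF files to SnpSift and everything else (TSV, GSvar) to csvtk
    ch_variants
        .branch { meta, variants ->
            vcf: variants.toString().endsWith('.vcf') || variants.toString().endsWith('.vcf.gz')
            tab: true
        }
        .set { ch_by_type }

    SNPSIFT_SPLIT ( ch_by_type.vcf )
    CSVTK_SPLIT   ( ch_by_type.tab )
}
```

Routing at the channel level keeps each process limited to a single tool, which is what allows the commented-out if/else inside the `script:` block to be dropped.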

@ggabernet
Member

Done in #124
