
Support native cluster commands #1640

Closed

henryjtaylor opened this issue Jun 18, 2020 · 3 comments

@henryjtaylor

Hi,

My group is using Nextflow to build a portable, flexible pipeline for analyzing single-cell RNA-seq data. To allow for greater portability, we are building config files that let users run the pipeline over several different executors (SGE, Slurm, etc.).

Part of our group uses a cluster that is based on SGE but has several custom modifications. For example, it uses non-conventional arguments to specify details such as the number of CPUs and memory.

We're facing two issues with Nextflow:

  1. Is there a way we can change the default definitions of how Nextflow passes certain keywords, such as cpus or memory, in a config file? We would like to stick to a pattern of using a base config plus cluster-specific configs to adapt to non-standard clusters. This has been an issue, and our current workaround is rewriting the whole config to use clusterOptions for all CPU and memory demands.
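For reference, the layered-config pattern described above can be sketched roughly like this (the file names, the non-standard `-l customMem` / `-pe customPe` flags, and the resource values are hypothetical, not taken from our actual cluster):

```groovy
// base.config -- portable defaults expressed with standard directives
process {
    cpus   = 4
    memory = 8.GB
}

// custom_sge.config -- loaded only on the modified SGE cluster; because
// the scheduler rejects the arguments Nextflow generates from cpus/memory,
// every resource request must be re-expressed through clusterOptions
process {
    executor = 'sge'
    // hypothetical non-standard flags for this cluster, rebuilt from the
    // task's cpus/memory so the process definitions stay portable
    clusterOptions = { "-l customMem=${task.memory.toGiga()}G -pe customPe ${task.cpus}" }
}
```

The drawback is that the clusterOptions closure has to duplicate information already carried by the cpus and memory directives, which is exactly the repetition we were hoping a config-level override could avoid.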

  2. We're getting an odd issue that I believe is specific to our cluster, but I was curious whether it has been seen before and a known solution exists:

We're using dynamic allocation to increase memory after job failure, up to three retries. We are seeing a scenario where a job fails due to too little memory, Nextflow retries it, and the allocation bump is correct (we double the memory; I can check the job and it is correct), but Nextflow fails to pass the input files into the second job. When I check the working directory, the files are correctly sym-linked into it, but they apparently aren't being copied into the keyword in the job.
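For context, the retry-with-more-memory pattern we use looks roughly like this (the process name, base amount, and retry count here are illustrative, not our exact values):

```groovy
// Sketch of doubling memory on each retry using standard Nextflow
// directives: errorStrategy 'retry' resubmits failed tasks, and the
// memory closure is re-evaluated with the current task.attempt
process example_task {
    errorStrategy 'retry'   // resubmit the task on failure
    maxRetries    3         // give up after three retries
    // attempt 1 -> 8 GB, attempt 2 -> 16 GB, attempt 3 -> 32 GB, ...
    memory        { 8.GB * (2 ** (task.attempt - 1)) }

    script:
    """
    echo "attempt ${task.attempt} running with ${task.memory}"
    """
}
```

In our runs the doubled request shows up correctly in the resubmitted job, so the directive side of this pattern behaves as expected; it is only the staging of the input files into the retried task that goes wrong.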

Below is one of the jobs where we're having the issue. Specifically, files__anndata fails to populate:

process umap_gather {
    // Merge UMAP from reduced_dims (reduce or gather).
    // ------------------------------------------------------------------------
    //tag { output_dir }
    //cache false        // cache results from run
    scratch false      // use tmp directory
    echo echo_mode          // echo output from script

    publishDir  path: "${outdir}",
                saveAs: {filename ->
                    if(filename.endsWith("metadata.tsv.gz")) {
                        null
                    } else if(filename.endsWith("pcs.tsv.gz")) {
                        null
                    } else if(filename.endsWith("reduced_dims.tsv.gz")) {
                        null
                    } else {
                        filename.replaceAll("${runid}-", "")
                    }
                },
                mode: "${task.publish_mode}",
                overwrite: "true"

    input:
        tuple(
            val(key),
            val(outdir_prev),
            path(original__file__anndata),
            path(original__file__metadata),
            path(original__file__pcs),
            path(original__file__reduced_dims),
            path(files__anndata)
        )
    output:
        val(outdir_prev), emit: outdir
        path("${runid}-${outfile}.h5ad"), emit: anndata
        path(original__file__metadata), emit: metadata
        path(original__file__pcs), emit: pcs
        path(original__file__reduced_dims), emit: reduced_dims

    script:
        runid = random_hex(16)
        outdir = "${outdir_prev}"  // For some reason dir here messed up?
        outfile = "${original__file__anndata}".minus(".h5ad").split("-")
            .drop(1).join("-")
        outfile = "${outfile}-umap"
        // outfile = "adata-umap"
        // Get all of the adata files that we want to gather
        files__anndata = files__anndata.join(',')
        process_info = "${runid} (runid)"
        process_info = "${process_info}, ${task.cpus} (cpus)"
        process_info = "${process_info}, ${task.memory} (memory)"
        """
        echo "umap_gather: ${process_info}"
        umap_gather.py \
            --h5_anndata_list ${files__anndata} \
            --h5_root ${original__file__anndata} \
            --output_file ${runid}-${outfile}
        """
}
@pditommaso
Member

but they apparently aren't being copied into the keyword in the job

What does it mean?

@henryjtaylor
Author

Hi,

We have a workaround for the issue stated with:

but they apparently aren't being copied into the keyword in the job

So, please ignore this issue.

I would like to focus on the first statement:

  1. Is there a way we can change the default definitions of how Nextflow passes certain keywords, such as cpus or memory, in a config file?

We would like to stick to a pattern of using a base config plus cluster-specific configs to adapt to non-standard clusters. This has been an issue, and our current workaround is rewriting the whole config to use clusterOptions for all CPU and memory demands.

We have a use case where the cluster takes irregular commands. As such, specifying cpus using the conventional SGE approach causes the pipeline to fail.
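To make the failure mode concrete, on a stock SGE cluster the standard directives are enough, e.g. (the parallel environment name is illustrative):

```groovy
// Works on a conventional SGE cluster: penv names the parallel
// environment, and Nextflow turns cpus into a qsub -pe request
process {
    executor = 'sge'
    penv     = 'smp'      // standard PE -> submitted as: qsub -pe smp 4 ...
    cpus     = 4
    memory   = 8.GB
}
// On our modified cluster, qsub rejects the generated -pe/-l arguments,
// so these same directives cause every submission to fail.
```

Because the executor's translation of cpus/memory into qsub arguments is fixed, the only escape hatch we have found is recomputing everything inside clusterOptions, which defeats the point of keeping the resource directives in a shared base config.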

@stale

stale bot commented Dec 26, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Dec 26, 2020
@stale stale bot closed this as completed Feb 25, 2021