My group is using Nextflow to build a portable, flexible pipeline for analyzing single-cell RNA-seq data. For greater portability, we are building config files that allow users to run the pipeline on several different executors (SGE, Slurm, etc.).
Part of our group uses a cluster that is based on SGE but has several custom modifications. For example, it uses non-conventional arguments to specify details such as the number of CPUs and the amount of memory.
We're facing two issues with Nextflow:
Is there a way we can change the default definitions of how Nextflow translates certain keywords, such as cpus or memory, in a config file? We would like to stick to a pattern of a base config plus cluster-specific configs that adapt the pipeline to each native cluster. This has been an issue, and our current work-around is rewriting the whole config to use clusterOptions for all CPU and memory demands.
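For context, the config layering we are aiming for follows the usual Nextflow profile pattern, roughly like the sketch below (file names and profile names are illustrative, not our actual configs):

```groovy
// nextflow.config -- base settings plus cluster-specific profiles.
profiles {
    sge {
        includeConfig 'conf/base.config'
        process.executor = 'sge'
    }
    slurm {
        includeConfig 'conf/base.config'
        process.executor = 'slurm'
    }
    custom_sge {
        // For the modified SGE cluster: the standard cpus/memory directives
        // get translated into conventional qsub flags, which this cluster
        // rejects, so the cluster-specific config has to override them.
        includeConfig 'conf/base.config'
        includeConfig 'conf/custom_sge.config'
    }
}
```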
We're getting an odd issue that I believe is specific to our cluster, but I was curious whether it has been seen before and whether a known solution exists:
We're using dynamic allocation to increase memory after job failure, up to three retries. We are seeing a scenario where a job fails due to too little memory, Nextflow retries the job, and the allocation bump is correct (we double the memory; I can check the job and it is correct), but Nextflow fails to pass the input files into the second job. I check the working directory and the files are correctly sym-linked into it, but they apparently aren't being copied into the input keyword in the job.
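The retry pattern we use is roughly the standard dynamic-directive one from the Nextflow docs (a sketch; the exit codes and base memory are illustrative, not our exact values):

```groovy
// Sketch of dynamic memory escalation on retry.
process {
    // Retry on out-of-memory-style exit codes, otherwise give up.
    errorStrategy = { task.exitStatus in [137, 140] ? 'retry' : 'terminate' }
    maxRetries    = 3
    // Double the memory request on each attempt (attempt 1 = 8 GB,
    // attempt 2 = 16 GB, ...).
    memory        = { 8.GB * task.attempt }
}
```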
Below is one of the jobs where we're having the issue. Specifically, files__anndata fails to populate:
process umap_gather {
    // Merge UMAP from reduced_dims (reduce or gather).
    // ------------------------------------------------------------------------
    // tag { output_dir }
    // cache false       // cache results from run
    scratch false        // do not run in a scratch tmp directory
    echo echo_mode       // echo output from script
    publishDir path: "${outdir}",
        saveAs: { filename ->
            if (filename.endsWith("metadata.tsv.gz")) {
                null
            } else if (filename.endsWith("pcs.tsv.gz")) {
                null
            } else if (filename.endsWith("reduced_dims.tsv.gz")) {
                null
            } else {
                filename.replaceAll("${runid}-", "")
            }
        },
        mode: "${task.publish_mode}",
        overwrite: true

    input:
    tuple(
        val(key),
        val(outdir_prev),
        path(original__file__anndata),
        path(original__file__metadata),
        path(original__file__pcs),
        path(original__file__reduced_dims),
        path(files__anndata)
    )

    output:
    val(outdir_prev), emit: outdir
    path("${runid}-${outfile}.h5ad"), emit: anndata
    path(original__file__metadata), emit: metadata
    path(original__file__pcs), emit: pcs
    path(original__file__reduced_dims), emit: reduced_dims

    script:
    runid = random_hex(16)
    outdir = "${outdir_prev}" // For some reason dir here messed up?
    outfile = "${original__file__anndata}".minus(".h5ad").split("-")
        .drop(1).join("-")
    outfile = "${outfile}-umap"
    // outfile = "adata-umap"
    // Get all of the adata files that we want to gather.
    files__anndata = files__anndata.join(',')
    process_info = "${runid} (runid)"
    process_info = "${process_info}, ${task.cpus} (cpus)"
    process_info = "${process_info}, ${task.memory} (memory)"
    """
    echo "umap_gather: ${process_info}"
    umap_gather.py \
        --h5_anndata_list ${files__anndata} \
        --h5_root ${original__file__anndata} \
        --output_file ${runid}-${outfile}
    """
}
Update on the second issue ("but they apparently aren't being copied into the input keyword in the job"): please ignore it.
I would like to focus on the first question:
Is there a way we can change the default definitions of how Nextflow translates certain keywords, such as cpus or memory, in a config file?
We would like to stick to a pattern of a base config plus cluster-specific configs that adapt the pipeline to each native cluster. This has been an issue, and our current work-around is rewriting the whole config to use clusterOptions for all CPU and memory demands.
We have a use case where the cluster takes irregular commands. As such, specifying cpus using the conventional SGE approach causes the pipeline to fail.
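Our current work-around looks roughly like the sketch below: we bypass Nextflow's built-in translation of cpus/memory into conventional qsub flags and emit the cluster's custom arguments ourselves via clusterOptions. The flag names --ncpu and -l mem are placeholders for this cluster's non-conventional arguments, not real options:

```groovy
// conf/custom_sge.config -- hypothetical cluster-specific config, layered
// on top of the base config via -c or a profile.
process {
    executor = 'sge'

    // Bypass the default SGE resource flags and build the submit options
    // from the task's cpus/memory directives instead. Because this is a
    // closure, it is re-evaluated on each attempt, so retry-based memory
    // bumps are still picked up.
    clusterOptions = { "--ncpu ${task.cpus} -l mem=${task.memory.toMega()}M" }
}
```

The downside, as noted above, is that every process's CPU and memory demands effectively flow through clusterOptions rather than the portable cpus/memory directives.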
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi,
My group is using Nextflow to build a portable, flexible pipeline for analyzing single cell RNAseq data. To allow for greater portability, we are building config files that will allow users to run the pipeline over several different executors (sge, Slurm, etc.).
One facet of our group uses a cluster that is based on SGE, but has several custom modifications. For example, it uses non-conventional arguments to specify details such as the number of
cpus
andmemory
.We're facing two issues with Nextflow:
Is there a way we can change the default definitions of how Nextflow passes certain key words, such as
cpus
ormemory
in a config file? We would like to stick to a pattern of using a base config, and then cluster-specific configs to help adapt to native clusters. But, this has been an issue, and our current work-around is re-writing the whole config to useclusterOptions
for allcpu
andmemory
demands.We're getting an odd issue that I believe is an issue specific to our cluster, but was curious if it has been seen before, and a known solution exists:
We're using dynamic allocation to increase memory after job failure up to three retries. We are finding a scenario where a job will fail due to too little memory, nextflow retries the job, the allocation bump is correct (we double the memory -- I can check the job and it is correct), but nextflow fails to pass the input files into the second job. I check the working directory, and the files are correctly sym-linked into the directory, but they apparently aren't being copied into the the keyword in the job.
Below is one of the jobs we're having the issue. Specifically, the
files__anndata
fails to populate:The text was updated successfully, but these errors were encountered: