Best practices / questions when running different nextflow pipelines concurrently on HPC compute cluster with shared storage for scratch / workDir #4584

flipfloptech asked this question in Q&A · Answered by bentsherman

Concurrent pipeline runs are guaranteed not to collide with each other because each run has a unique session ID, which is included in every task hash (i.e., the task work directory name). Personally, I like using a single work directory for all of my pipelines, especially on scratch storage with a cleanup policy; then I can set it and forget about it.
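For example, two unrelated runs can safely point at the same work directory, since each run's session ID keeps their task directories disjoint. A minimal sketch (the scratch path, profile name, and pipeline choices are placeholders, not anything from this thread):

```bash
# Both runs share one work directory on scratch storage (path is hypothetical).
# Nextflow includes the run's unique session ID in every task hash, so the two
# runs write to disjoint task subdirectories and cannot collide.
WORKDIR=/scratch/$USER/nf-work

nextflow run nf-core/rnaseq -profile slurm -work-dir "$WORKDIR" &
nextflow run nf-core/sarek  -profile slurm -work-dir "$WORKDIR" &
wait
```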

The important thing for your cleanup policy is that (1) the maximum retention is greater than the total walltime of most pipeline runs on your cluster, and (2) the policy only deletes files based on age. It sounds like your policy is deleting empty directories regardless of age, which could be a problem in the exact scenario you mentioned.
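As a sketch of what an age-based policy looks like (the 14-day retention window and the scratch path are assumptions, and on most clusters this kind of sweep is run by the storage admins rather than the user):

```bash
#!/usr/bin/env bash
# Hypothetical age-based cleanup for a shared Nextflow work directory.
# Retention window and path are illustrative assumptions.
WORKDIR=/scratch/$USER/nf-work
RETENTION_DAYS=14

# Delete only files older than the retention window...
find "$WORKDIR" -type f -mtime +"$RETENTION_DAYS" -delete

# ...and only prune directories that are both empty AND old, so a freshly
# created (still empty) task directory of an in-flight run is left alone.
find "$WORKDIR" -mindepth 1 -type d -empty -mtime +"$RETENTION_DAYS" -delete
```

Filtering directories on both `-empty` and `-mtime` is the point: it avoids the failure mode described above, where a task directory created moments ago is swept away just because it hasn't been populated yet.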
