This cookiecutter provides a template Snakemake profile for configuring Snakemake to run on the SLURM Workload Manager. The profile defines the following scripts:

- `slurm-submit.py` - submits a jobscript to slurm
- `slurm-jobscript.sh` - a template jobscript
- `slurm-status.py` - checks the status of jobs in slurm
- `slurm-sidecar.py` - runs a Snakemake cluster sidecar for caching queries to Slurm's controller/database daemons
and a configuration file `config.yaml` that defines default values for snakemake command line arguments.
Given an installed profile `profile_name`, running snakemake with `--profile profile_name` passes the configuration keys and values from `config.yaml` to snakemake, plus any additional options the user supplies on the command line.
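For illustration, a generated `config.yaml` could look roughly like the sketch below, where each top-level key corresponds to a snakemake command line option. The exact keys and values depend on your cookiecutter answers, so treat this as an assumed example rather than the literal generated file.

```yaml
# Hypothetical sketch of a generated profile config.yaml; the actual
# contents depend on the cookiecutter answers given at creation time.
cluster: "slurm-submit.py"
cluster-status: "slurm-status.py"
jobscript: "slurm-jobscript.sh"
jobs: 500
latency-wait: 60
max-jobs-per-second: 10
max-status-checks-per-second: 10
```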
Note that the use of the option `--cluster-config` is discouraged, but the profile still supports it for backwards compatibility. The default configuration file therefore contains a commented section with examples of resource configuration (see also snakemake best practices):
```yaml
# Example resource configuration
# default-resources:
#   - runtime=100
#   - mem_mb=6000
#   - disk_mb=1000000
# # set-threads: map rule names to threads
# set-threads:
#   - single_core_rule=1
#   - multi_core_rule=10
# # set-resources: map rule names to resources in general
# set-resources:
#   - high_memory_rule:mem_mb=12000
#   - long_running_rule:runtime=1200
```
See the snakemake documentation on profiles for more information.
For a more light-weight alternative, see the excellent repo smk-simple-slurm by @jdblischak. In particular, it can handle larger numbers of jobs than this profile (see issue #79).
To create a slurm profile from the cookiecutter, simply run
```bash
# create config directory that snakemake searches for profiles (or use something else)
profile_dir="${HOME}/.config/snakemake"
mkdir -p "$profile_dir"
# use cookiecutter to create the profile in the config directory
template="gh:Snakemake-Profiles/slurm"
cookiecutter --output-dir "$profile_dir" "$template"
```
You will be prompted to set some values for your profile (here assumed to be called `profile_name`), after which the profile scripts and configuration file will be installed in `$profile_dir` as `profile_name/`. Then you can run Snakemake with

```bash
snakemake --profile profile_name ...
```
Note that the `--profile` argument can be either a relative or absolute path. In addition, snakemake will search for a corresponding folder `profile_name` in `/etc/xdg/snakemake` and `$HOME/.config/snakemake`, where globally accessible profiles can be placed.
One typical use case is to set up a profile to use a specific slurm account:
```console
$ cookiecutter --output-dir "$profile_dir" "$template"
profile_name [slurm]: slurm.my_account
sbatch_defaults []: account=my_account no-requeue exclusive
cluster_sidecar_help: [Use cluster sidecar. NB! Requires snakemake >= 7.0! Enter to continue...]
Select cluster_sidecar:
1 - yes
2 - no
Choose from 1, 2 [1]:
cluster_name []:
cluster_config_help: [The use of cluster-config is discouraged. Rather, set snakemake CLI options in the profile configuration file (see snakemake documentation on best practices). Enter to continue...]
cluster_config []:
```
The command `snakemake --profile slurm.my_account ...` will submit jobs with `sbatch --parsable --account=my_account --no-requeue --exclusive`. Note that the option `--parsable` is always added.
It is possible to install multiple profiles in a project directory. Assuming our HPC defines a multi-cluster environment, we can create a profile that uses a specified cluster:
```console
$ cookiecutter slurm
profile_name [slurm]: slurm.dusk
sbatch_defaults []: account=my_account
cluster_sidecar_help: [Use cluster sidecar. NB! Requires snakemake >= 7.0! Enter to continue...]
Select cluster_sidecar:
1 - yes
2 - no
Choose from 1, 2 [1]:
cluster_name []: dusk
cluster_config_help: [The use of cluster-config is discouraged. Rather, set snakemake CLI options in the profile configuration file (see snakemake documentation on best practices). Enter to continue...]
cluster_config []:
```
(Note that once a cookiecutter has been installed, we can reuse it without the GitHub URL.)
The command `snakemake --profile slurm.dusk ...` will now submit jobs with `sbatch --parsable --account=my_account --cluster=dusk`. In addition, the `slurm-status.py` script will check for jobs in the `dusk` cluster job queue.
The cookiecutter prompts for the following options:

- `profile_name`: A name to address the profile via the `--profile` Snakemake option.
- `use_singularity`: This sets the default `--use-singularity` parameter. Default is not to use (`false`).
- `use_conda`: This sets the default `--use-conda` parameter. Default is not to use (`false`).
- `jobs`: This sets the default `--cores`/`--jobs`/`-j` parameter.
- `restart_times`: This sets the default `--restart-times`/`-T` parameter.
- `max_status_checks_per_second`: This sets the default `--max-status-checks-per-second` parameter.
- `max_jobs_per_second`: This sets the default `--max-jobs-per-second` parameter.
- `latency_wait`: This sets the default `--latency-wait`/`--output-wait`/`-w` parameter.
- `print_shell_commands`: This sets the default `--printshellcmds`/`-p` parameter.
- `sbatch_defaults`: List of (space-separated) default arguments to sbatch, e.g. `qos=short time=60`. Note that we support human-friendly time specification.
- `cluster_sidecar`: Whether to use the cluster sidecar feature (requires Snakemake >= 7.0).
- `cluster_name`: Some HPCs define multiple SLURM clusters. Set the cluster name, or leave empty to use the default. This adds the `--cluster` string to the sbatch defaults and adjusts `slurm-status.py` to check status on the relevant cluster.
- `cluster_jobname`: A pattern to use for naming Slurm jobs (`--job-name`). See Patterns below. Set to `""` (i.e., blank) to use the slurm default.
- `cluster_logpath`: A pattern to use for setting the `--output` and `--error` log files. You can use slurm filename patterns and the Patterns below. Set to `""` (i.e., blank) to use the slurm default. For example, `logs/slurm/%r_%w` creates logs named `%r_%w.out` and `%r_%w.err` in the directory `logs/slurm`.
- `cluster_config` (NB: discouraged): Path to a YAML or JSON configuration file analogous to the Snakemake `--cluster-config` option. The path may be relative to the profile directory or absolute, and may include environment variables (e.g. `$PROJECT_ROOT/config/slurm_defaults.yaml`).
For job names and log paths we provide a custom pattern syntax:

- `%r`: Rule name. If it is a group job, the group ID will be used instead.
- `%i`: Snakemake job ID.
- `%w`: Wildcards string; e.g., wildcards A and B will be concatenated as `A=<val>.B=<val>`.
- `%U`: A random universally unique identifier (UUID).
- `%S`: A shortened version of `%U`. For example, `16fd2706-8baf-433b-82eb-8c7fada847da` would become `16fd2706`.
- `%T`: The Unix timestamp (rounded to an integer).
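As a sketch of how these patterns behave, the following hypothetical Python function (illustrative only, not the profile's actual implementation) expands them for a given job:

```python
import time
import uuid


# Hypothetical illustration of the pattern expansion described above;
# the real logic lives in the profile's submission script.
def expand_pattern(pattern: str, rule: str, jobid: str, wildcards: dict) -> str:
    full_uuid = str(uuid.uuid4())
    substitutions = {
        "%r": rule,  # rule name (or group ID for group jobs)
        "%i": jobid,  # Snakemake job ID
        "%w": ".".join(f"{k}={v}" for k, v in wildcards.items()),
        "%U": full_uuid,  # full UUID
        "%S": full_uuid.split("-")[0],  # shortened UUID
        "%T": str(int(time.time())),  # integer Unix timestamp
    }
    for key, value in substitutions.items():
        pattern = pattern.replace(key, value)
    return pattern


# expand_pattern("logs/slurm/%r_%w", "bwa_mem", "42", {"sample": "A"})
# -> "logs/slurm/bwa_mem_sample=A"
```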
Other default arguments to `snakemake` may be adjusted in the resulting `<profile_name>/config.yaml` file.
NB! As previously pointed out, the use of cluster-config is discouraged. Rule-specific resource configuration is better handled by snakemake's CLI arguments (see snakemake best practices), which can be put in the profile configuration file.
Arguments are set and overridden in the following order, and must be named according to sbatch long option names:

1. `sbatch_defaults` cookiecutter option
2. Profile `cluster_config` file `__default__` entries
3. Snakefile threads and resources (time, mem)
4. Profile `cluster_config` file `<rulename>` entries
5. `--cluster-config` passed to Snakemake (deprecated since Snakemake 5.10)
6. Snakemake CLI resource configuration in the profile configuration file
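As a concrete illustration of this ordering (assuming a profile whose `sbatch_defaults` sets `time=120`), a rule's own resources (item 3) override the cookiecutter default (item 1):

```python
# With sbatch_defaults containing time=120, this rule is still
# submitted with a 10-minute limit, since rule resources rank higher.
rule quick_rule:
    resources:
        time = "00:10:00"
```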
In addition to Snakemake CLI resource configuration, resources can be specified in Snakefile rules, and must all be in the correct unit/format as expected by `sbatch` (except time). The implemented resource names are given (and may be adjusted) in the `slurm-submit.py` variable `RESOURCE_MAPPING`. This is intended for system-agnostic resources such as time and memory. Currently supported resources are `time`, `mem`, `mem-per-cpu`, `nodes`, and `partition`. An example rule resources configuration follows:
```python
rule bwa_mem:
    resources:
        time = "00:10:00",
        mem = 12000,
        partition = "debug"
```
Resources not listed in `RESOURCE_MAPPING` can also be specified via the special `slurm` resource. For example, to specify a specific QoS and 2 GPU resources for a rule `gpu_stuff`:
```python
rule gpu_stuff:
    resources:
        time = "12:00:00",
        mem_mb = 8000,
        partition = "gpu",
        slurm = "qos=gpuqos gres=gpu:2"
```
Note: `slurm` must be a space-separated string of the form `<option>=<value>`. The `<option>` names must match the long option names of sbatch (see the sbatch documentation for the list; they can be snake_case or kebab-case). Flags (i.e., options that do not take a value) should be given without a value. For example, to use the wait flag, you would pass `slurm="qos=gpuqos wait"`.
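To make the conversion concrete, here is a minimal sketch (an assumed helper, not the profile's actual code) of how such a string could be mapped to sbatch arguments:

```python
# Illustrative only: split the space-separated string into sbatch
# long options, converting snake_case to kebab-case and leaving
# value-less flags bare.
def slurm_string_to_sbatch_args(slurm: str) -> list:
    args = []
    for token in slurm.split():
        if "=" in token:
            option, value = token.split("=", 1)
            args.append(f"--{option.replace('_', '-')}={value}")
        else:
            args.append(f"--{token.replace('_', '-')}")
    return args


# slurm_string_to_sbatch_args("qos=gpuqos gres=gpu:2 wait")
# -> ["--qos=gpuqos", "--gres=gpu:2", "--wait"]
```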
We support specifying the time for a rule in a "human-friendly" format. For example, a rule with a time limit of 4 hours and 30 minutes can be specified as `time="4h30m"`.
Supported (case-insensitive) time units are:

- `w`: week
- `d`: day
- `h`: hour
- `m`: minute
- `s`: second
However, you may also pass the time in the slurm format.
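A minimal sketch of how such a specification could be converted to minutes (a hypothetical parser, assuming sbatch-style minute granularity; the profile's actual parser may differ):

```python
import math
import re


# Hypothetical parser for the human-friendly time format described above.
def human_time_to_minutes(spec: str) -> int:
    minutes_per_unit = {"w": 7 * 24 * 60, "d": 24 * 60, "h": 60, "m": 1, "s": 1 / 60}
    total = 0.0
    for value, unit in re.findall(r"(\d+)([wdhms])", spec, flags=re.IGNORECASE):
        total += int(value) * minutes_per_unit[unit.lower()]
    return math.ceil(total)


# human_time_to_minutes("4h30m")  # -> 270
```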
Although cluster configuration has been officially deprecated in favor of profiles since snakemake 5.10, the `--cluster-config` option can still be used to configure default and per-rule options. Upon creating a slurm profile, the user will be prompted for the location of a cluster configuration file, which is a YAML or JSON file (see the example below).
```yaml
__default__:
  account: staff
  mail-user: slurm@johndoe.com

large_memory_requirement_job:
  constraint: mem2000MB
  ntasks: 16
```
The `__default__` entry will apply to all jobs.
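For illustration, with the file above every job would be submitted with `--account=staff --mail-user=slurm@johndoe.com`, while jobs from the rule `large_memory_requirement_job` would additionally receive `--constraint=mem2000MB --ntasks=16`.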
Tests can be run on an HPC running SLURM or locally in a docker stack. To execute tests, run

```bash
pytest -v -s tests
```

from the source code root directory. Test options can be configured via the pytest configuration file `tests/pytest.ini`.
Test dependencies are listed in `test-environment.yml` and can be installed in e.g. a conda environment.
Test fixtures are set up in temporary directories created by pytest. Usually fixtures end up in `/tmp/pytest-of-user` or something similar. In any case, these directories are usually not accessible on the nodes where the tests are run. Therefore, when running tests on an HPC running SLURM, by default test fixtures will be written to the directory `.pytest` relative to the current working directory (which should be the source code root!). You can change the location with the pytest option `--basetemp`.
For local testing the test suite will deploy a docker stack `cookiecutter-slurm` that runs two services based on the following images:
The docker stack will be automatically deployed provided that the user has installed docker and enabled docker swarm (`docker swarm init`). The docker stack can also be deployed manually from the top-level directory as follows:

```bash
DOCKER_COMPOSE=tests/docker-compose.yaml ./tests/deploystack.sh
```
See the deployment script `tests/deploystack.sh` for details.
Testing of the cookiecutter template is enabled through the pytest plugin for Cookiecutters.
The slurm tests all depend on fixtures and fixture factories defined in `tests/conftest.py` to function properly. The `cookie_factory` fixture factory generates slurm profiles, and the `data` fixture copies the files `tests/Snakefile` and `tests/cluster-config.yaml` to the test directory. There is also a `datafile` fixture factory that copies any user-provided data file to the test directory. Finally, the `smk_runner` fixture provides an instance of the `tests/wrapper.SnakemakeRunner` class for running Snakemake tests.
As an example, `tests/test_slurm_advanced.py` defines a fixture `profile` that uses the `cookie_factory` fixture factory to create a slurm profile that uses advanced argument conversion:
```python
@pytest.fixture
def profile(cookie_factory, data):
    cookie_factory(advanced="yes")
```
The test `tests/test_slurm_advanced.py::test_adjust_runtime` depends on this fixture and `smk_runner`:
```python
def test_adjust_runtime(smk_runner, profile):
    smk_runner.make_target(
        "timeout.txt", options=f"--cluster-config {smk_runner.cluster_config}"
    )
```
The `make_target` method makes a snakemake target with additional options passed to Snakemake.
See `tests/test_issues.py` and `tests/Snakefile_issue49` for an example of how to write a test with a custom Snakefile.