fix: quote jobid passed to status script to support multi-cluster Slurm setup (#1459)

* fix: quote jobid passed to status script to support multi-cluster setup

Motivation: In a multi-cluster Slurm setup, i.e. when the `--clusters`
flag is passed to `sbatch`, the job id returned with the `--parsable`
flag is actually `jobid;cluster_name`. This breaks a custom status
script because the shell ends the command at the semicolon. This PR
quotes `jobid;cluster_name` so that a custom status script can parse
this input and obtain the job status from the specified cluster_name
and jobid.

* Add a test for multi-cluster status script

Co-authored-by: Johannes Köster <johannes.koester@tu-dortmund.de>
jdblischak and johanneskoester committed Mar 7, 2022
1 parent ef0475c commit 023220160c6146810e3da2b277439441e8af9827
Showing 6 changed files with 42 additions and 1 deletion.
@@ -1215,7 +1215,7 @@ def job_status(job, valid_returns=["running", "success", "failed"]):
 if self.sidecar_vars:
     env["SNAKEMAKE_CLUSTER_SIDECAR_VARS"] = self.sidecar_vars
 ret = subprocess.check_output(
-    "{statuscmd} {jobid}".format(
+    "{statuscmd} '{jobid}'".format(
         jobid=job.jobid, statuscmd=self.statuscmd
     ),
     shell=True,
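
The effect of the quoting can be reproduced outside Snakemake. The sketch below is illustrative only and is not the code from this commit: `run_status` is a hypothetical helper, `echo` stands in for a real status script, the job id is a made-up value, and `shlex.quote` is used instead of the literal single quotes in the change above. It assumes a POSIX shell.

```python
import shlex
import subprocess


def run_status(statuscmd: str, jobid: str, quote: bool = True) -> subprocess.CompletedProcess:
    """Run a status command through the shell, optionally quoting the job id.

    Hypothetical helper for illustration; not part of Snakemake.
    """
    # shlex.quote wraps the value so the shell treats it as a single argument.
    arg = shlex.quote(jobid) if quote else jobid
    return subprocess.run(
        f"{statuscmd} {arg}",
        shell=True,
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    jobid = "12345;name-of-cluster"  # made-up value in the `jobid;cluster_name` form
    # Unquoted: the shell splits at ';' and the command only receives "12345".
    print(repr(run_status("echo", jobid, quote=False).stdout))
    # Quoted: the command receives the full "12345;name-of-cluster" string.
    print(repr(run_status("echo", jobid, quote=True).stdout))
```

In the unquoted case the shell also tries to run `name-of-cluster` as a second command, which is why a custom status script never sees the cluster name.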
@@ -0,0 +1,13 @@
from snakemake import shell

envvars:
    "TESTVAR"



rule all:
    input: 'output.txt'

rule compute:
    output: 'output.txt'
    shell: 'touch {output}'
Empty file.
@@ -0,0 +1,8 @@
#!/bin/bash
echo `date` >> sbatch.log
tail -n1 $1 >> sbatch.log
# simulate printing of job id by a random number plus the name
# of the cluster
echo "$RANDOM;name-of-cluster"
cat $1 >> sbatch.log
sh $1
@@ -0,0 +1,9 @@
#!/bin/bash

# The argument passed from sbatch is "jobid;cluster_name"

arg="$1"
jobid="${arg%%;*}"
cluster="${arg##*;}"

echo success
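
For status scripts written in Python rather than bash, the same split could look roughly like the sketch below. `split_job_arg` is a hypothetical name, not something provided by Snakemake or Slurm. Unlike the bash expansions above, which leave both variables equal to the whole argument when no semicolon is present, this version returns `None` for the cluster in that case.

```python
from typing import Optional, Tuple


def split_job_arg(arg: str) -> Tuple[str, Optional[str]]:
    """Split a `jobid;cluster_name` argument into its two parts.

    Hypothetical helper for illustration. Returns (jobid, None) when the
    argument contains no semicolon, i.e. in a single-cluster setup.
    """
    # partition splits at the first ';' and reports whether one was found.
    jobid, sep, cluster = arg.partition(";")
    return jobid, (cluster if sep else None)


if __name__ == "__main__":
    # Made-up example values.
    print(split_job_arg("12345;name-of-cluster"))
    print(split_job_arg("12345"))
```

A real status script would then query the scheduler for `jobid` on `cluster` and print one of `running`, `success`, or `failed`, the values accepted by `job_status` in the hunk above.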
@@ -166,6 +166,17 @@ def test_cluster_cancelscript_nargs1():
    assert len(scancel_lines[1].split(" ")) == 2


@skip_on_windows
def test_cluster_statusscript_multi():
    os.environ["TESTVAR"] = "test"
    run(
        dpath("test_cluster_statusscript_multi"),
        snakefile="Snakefile.nonstandard",
        cluster="./sbatch",
        cluster_status="./status.sh",
    )


def test15():
    run(dpath("test15"))
