Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: quote jobid passed to status script to support multi-cluster Slurm setup #1459

Merged
merged 3 commits into from Mar 7, 2022

Conversation

jdblischak
Copy link
Contributor

@jdblischak jdblischak commented Mar 4, 2022

Note: This is a resubmission of my previous PR #977. I closed it temporarily and rebased it onto main, but since I force pushed it I couldn't re-open it. I apologize for the extra noise. This is a minor fix that comes with a test, and I think is a nice complement to the recent cluster feature updates (e.g. cluster-cancel) and the Slurm overhaul in PR #1015


I'm a first-time contributor but long-time user, so I wanted to start by saying: Snakemake is awesome! I really appreciate all the effort to develop and maintain this awesome resource.

Description

Motivation

I am trying to use --cluster-status with a multi-cluster Slurm setup. To check the status of the job, both sacct and scontrol require the jobid and the name of the cluster. When I use the sbatch option --parsable, it returns both in the format jobid;cluster_name. This entire string gets passed to the custom script I pass to --cluster-status, which is great since it provides all the information that the script requires. The problem is that the string isn't quoted, thus the semicolon delimiter is misinterpreted as the end of a line. Thus the shell attempts to execute the name of the cluster. I discovered this behavior with Snakemake 6.2.1.

This happens because the first argument passed to the cluster status script isn't quoted:

def job_status(job, valid_returns=["running", "success", "failed"]):
try:
# this command shall return "success", "failed" or "running"
ret = subprocess.check_output(
"{statuscmd} {jobid}".format(
jobid=job.jobid, statuscmd=self.statuscmd
),
shell=True,
).decode()

Proposed solution

This PR quotes the jobid string passed as the first argument to the cluster status script. I confirmed it works both in a standard Slurm setup (i.e. when --parsable returns only the job id) and also when specifying the cluster name with --clusters.

QC

  • The PR contains a test case for the changes or the changes are already covered by an existing test case.
  • The documentation (docs/) is updated to reflect the changes or this is not necessary (e.g. if the change does neither modify the language nor the behavior or functionalities of Snakemake).

jdblischak added 2 commits Mar 4, 2022
Motivation: In a multi-cluster Slurm setup, i.e. passing the flag
`--clusters` to `sbatch`, then the job id returned with the `--parsable`
flag is actually `jobid;cluster_name`. This breaks a custom status
script because the shell ends the command at the semi-colon. This PR
quotes `jobid;cluster_name` so that a custom status script could then
parse this input to obtain the job status from the specified
cluster_name and jobid.
@jdblischak jdblischak requested a review from johanneskoester as a code owner Mar 4, 2022
@sonarcloud
Copy link

sonarcloud bot commented Mar 6, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants