setting ntasks for non-MPI jobs #40

Closed
tatumdmortimer opened this issue Feb 26, 2024 · 6 comments

@tatumdmortimer

I am working on transitioning my snakemake workflows to Snakemake v8. However, when I use this plugin with a profile, I get the following error message:
WorkflowError: SLURM job submission failed. The error message was sbatch: error: You must specify a number of tasks. sbatch: error: Batch job submission failed: Job size specification needs to be provided

Here are the versions of Snakemake, this plugin, and slurm that I am working with:

Snakemake 8.5.2
snakemake-executor-plugin-slurm 0.3.2
slurm 23.02.4

Here is my profile:

executor: "slurm"
software-deployment-method: "conda"
latency-wait: 60
jobs: 100
default-resources:
  runtime: 10
  slurm_partition: "batch"
  mem_mb: 4000
  tasks: 1
  nodes: 1

I cloned the repository, removed the if job.resources.get("mpi", False): check on line 107 of __init__.py, and reinstalled the plugin. This seems to have fixed the issue for me. Is there a reason the task settings were only included in the sbatch command for MPI jobs?
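
For readers following along, the gating described above can be sketched roughly as follows. This is not the plugin's source: `build_task_flags` and the plain resource dictionary are hypothetical stand-ins, and only the `mpi` check mirrors the condition quoted from `__init__.py`.

```python
def build_task_flags(resources: dict) -> list[str]:
    """Hypothetical sketch: the task count is only passed on for MPI jobs."""
    flags = []
    # Mirrors the quoted condition: without an "mpi" resource, nothing is added,
    # so a "tasks: 1" entry in the profile never reaches sbatch.
    if resources.get("mpi", False):
        flags.append(f"--ntasks={resources.get('tasks', 1)}")
    return flags


# Non-MPI job with the profile defaults: no --ntasks flag is produced.
print(build_task_flags({"tasks": 1, "nodes": 1}))
# MPI job: the task count is forwarded.
print(build_task_flags({"mpi": "srun", "tasks": 4}))
```

On a cluster that rejects submissions without a task count, the first case would trigger the error shown at the top of this report.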

Thanks!

@cmeesters
Collaborator

cmeesters commented Feb 27, 2024

Can you please run a minimal example with the --verbose flag and paste or attach the output here (it is probably a mouthful)? Edit: Oh, please include the command line, too.

Suggestion for a minimal Snakefile:

rule all:
     input: "results/a.out"

rule test1:
     output: "results/a.out"
     shell: "touch {output}"

@tatumdmortimer
Author

I've attached the output from running the minimal Snakefile you suggested. This is the command line that I used: snakemake --cores 1 --workflow-profile slurm_profile --verbose

snakemake_test.txt

@cmeesters
Collaborator

cmeesters commented Feb 28, 2024

Thank you. This is really weird, because the SLURM docs state that:

The default is one task per node ...

This means there should be no requirement to set -n/--ntasks explicitly.

It will not hurt to submit with a default of 1 for the tasks, so that users in a situation like yours do not have to patch the plugin. But I wonder: are you in contact with your admins, and can you tell us why this cluster deviates from the SLURM defaults?
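
A minimal sketch of what "a default of 1 for the tasks" could look like, for illustration only. The helper `build_sbatch_call`, the job script name, and the dictionary layout are assumptions and do not reproduce the plugin's code; `--partition` and `--ntasks` are regular sbatch options.

```python
def build_sbatch_call(resources: dict, jobscript: str = "job.sh") -> str:
    """Hypothetical sketch: always pass --ntasks, falling back to 1."""
    call = ["sbatch"]
    if "slurm_partition" in resources:
        call.append(f"--partition={resources['slurm_partition']}")
    # Assumption: an explicit task count is harmless on clusters that follow
    # the SLURM default and required on clusters that do not.
    call.append(f"--ntasks={resources.get('tasks', 1)}")
    call.append(jobscript)
    return " ".join(call)


# No "tasks" resource given: the call still carries --ntasks=1.
print(build_sbatch_call({"slurm_partition": "batch", "mem_mb": 4000}))
```

With this shape, a job that does set tasks (for example an MPI rule) keeps its own value, while every other job gets an explicit 1.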

@cmeesters cmeesters self-assigned this Feb 29, 2024
fgvieira pushed a commit that referenced this issue Feb 29, 2024
related to #40, ought to fix this. Does not hurt the smp nor the mpi case.
cmeesters pushed a commit that referenced this issue Feb 29, 2024
🤖 I have created a release *beep* *boop*
---

## [0.4.1](v0.4.0...v0.4.1) (2024-02-29)

### Bug Fixes

* fixes issue [#40](#40) - ntasks set explicitly ([#44](#44)) ([f5c2c2c](f5c2c2c))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@cmeesters
Collaborator

cmeesters commented Feb 29, 2024

@tatumdmortimer please update to the latest release and give it a try.

@tatumdmortimer
Author

I updated to the latest release, and the minimal Snakefile completed successfully. Thanks for fixing this so quickly.

I did get in touch with research computing at my university, and while they confirmed that the cluster deviates from the defaults, they didn't give me a reason why.

@cmeesters
Collaborator

Well, then I am just glad it's working for you (and, I hope, for anybody else, too).

FYI: SchedMD's (the company behind SLURM) "ecosystem" shows some odd flowers, mainly due to service policies. I am merely trying to get information about the how and why of deviations from the standard to improve and stabilize these plugins. And sometimes the answer is "just because". Anyway, thanks for asking!
