Description
When running Gator with --scheduler slurm, standalone !Job specifications execute locally on the control node instead of being submitted to the configured Slurm scheduler. However, !JobGroup specifications with the same jobs correctly submit to Slurm.
Expected Behavior
Jobs specified with !Job should respect the --scheduler slurm command-line flag and submit to the Slurm cluster.
Actual Behavior
- Single !Job: Executes locally on the control node, no Slurm scheduler messages in logs
- !JobGroup with !Job children: Correctly submits to Slurm, logs show "Slurm scheduler using REST API"
Reproduction
Configure Gator with working Slurm REST API
Create a single job spec (/tmp/single_job.yaml):
!Job
ident: test_single
command: hostname
resources:
- !Cores [1]
- !Memory [1, GB]
Run: python3 -m gator --scheduler slurm /tmp/single_job.yaml
Result: Job runs locally, no Slurm submission
Create a job group spec (/tmp/group_job.yaml):
!JobGroup
ident: test_group
jobs:
- !Job
  ident: test_node
  command: hostname
  resources:
  - !Cores [1]
  - !Memory [1, GB]
Run: python3 -m gator --scheduler slurm /tmp/group_job.yaml
Result: Job correctly submits to Slurm
Evidence
Single !Job output (no scheduler messages):
[05:10:23] INFO root Running on slurm-ctrl.farm.trustdoubtless.com as PID 1950
INFO root Layer 'test_single' launching sub-jobs
!JobGroup output (scheduler active):
[05:16:46] INFO root Running on slurm-ctrl.farm.trustdoubtless.com as PID 2088
INFO root Layer 'test_group' launching sub-jobs
INFO root Slurm scheduler using REST API version v0.0.41
INFO root Waiting for 1 dependency tasks to complete
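The difference between the two logs can be checked mechanically. The following is a minimal sketch, assuming the "Slurm scheduler using REST API" line is a reliable marker that the Slurm scheduler was activated; this is inferred from the logs above, not from a documented Gator guarantee. The helper name `used_slurm` is hypothetical.

```python
# Marker text taken from the !JobGroup log above. Treating it as proof of
# Slurm submission is an assumption based on these two runs only.
SLURM_MARKER = "Slurm scheduler using REST API"


def used_slurm(log_text: str) -> bool:
    """Return True if any log line contains the Slurm scheduler marker."""
    return any(SLURM_MARKER in line for line in log_text.splitlines())


# The two log excerpts from this report, copied verbatim.
single_log = """\
[05:10:23] INFO root Running on slurm-ctrl.farm.trustdoubtless.com as PID 1950
INFO root Layer 'test_single' launching sub-jobs
"""

group_log = """\
[05:16:46] INFO root Running on slurm-ctrl.farm.trustdoubtless.com as PID 2088
INFO root Layer 'test_group' launching sub-jobs
INFO root Slurm scheduler using REST API version v0.0.41
INFO root Waiting for 1 dependency tasks to complete
"""

print("single !Job used Slurm:", used_slurm(single_log))
print("!JobGroup used Slurm:", used_slurm(group_log))
```

A check like this could serve as a regression test once the fix lands: the single !Job run should then produce the marker as well.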
Environment
Gator version: main branch
Slurm version: 24.05.4
Python version: 3.11+
Impact
This makes it impossible to submit simple single-job workflows to Slurm without wrapping them in a !JobGroup, which is counter-intuitive and inconsistent with the --scheduler flag's purpose.
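Until the bug is fixed, the wrapping workaround described above can be automated. The sketch below is a text-level transformation, not part of Gator: the function name `wrap_in_group` is hypothetical, and it assumes the spec is a single document whose first line is the bare `!Job` tag, as in the reproduction file. It mirrors the layout of the group spec from the reproduction steps.

```python
import textwrap


def wrap_in_group(job_yaml: str, group_ident: str) -> str:
    """Wrap a standalone !Job spec in a !JobGroup with a single child.

    Works on the text only: the first line must be the bare "!Job" tag,
    and the remaining lines are indented under a "- !Job" list item.
    """
    lines = job_yaml.strip().splitlines()
    if not lines or lines[0].strip() != "!Job":
        raise ValueError("expected a spec whose first line is the !Job tag")
    # Indent the job's fields so they belong to the list item.
    body = textwrap.indent("\n".join(lines[1:]), "  ")
    return f"!JobGroup\nident: {group_ident}\njobs:\n- !Job\n{body}\n"


# The single-job spec from the reproduction steps.
single_job = """\
!Job
ident: test_single
command: hostname
resources:
- !Cores [1]
- !Memory [1, GB]
"""

wrapped = wrap_in_group(single_job, "test_group")
print(wrapped)
```

The output can be written to a file and passed to `python3 -m gator --scheduler slurm` in place of the original spec. Note that the group needs its own `ident`, so the job's position in the hierarchy changes.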