User Story
As the WMS consumer,
I want BatchBackend(SSHTransport, Slurm) and BatchBackend(LocalTransport, Slurm) working
end-to-end against the containerised Slurm stacks,
So that the N+M composition promise of IC-ADR-001 §3 is proven on the first real scheduler.
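As a rough illustration of the N+M idea (N transports plus M schedulers, one generic backend, instead of N×M combined classes), the composition might look like the sketch below. The `run` method on the transport, the simplified `submit_cmd(script)` signature and the `submit` method are hypothetical names chosen for this sketch; the issue itself only names `Transport.put/get` and `submit_cmd(spec)`.

```python
import shlex
import subprocess
from typing import Protocol


class Transport(Protocol):
    """Where a command runs. `run` is a hypothetical method name."""

    def run(self, argv: list[str]) -> str: ...


class Scheduler(Protocol):
    """Which command to run. The spec is reduced to a script path here."""

    def submit_cmd(self, script: str) -> list[str]: ...


class LocalTransport:
    def run(self, argv: list[str]) -> str:
        # Run the command on the local host and return its stdout.
        return subprocess.run(
            argv, check=True, capture_output=True, text=True
        ).stdout


class SSHTransport:
    def __init__(self, host: str) -> None:
        self.host = host

    def run(self, argv: list[str]) -> str:
        # Quote the argv so the remote shell sees the same arguments.
        remote = shlex.join(argv)
        return subprocess.run(
            ["ssh", self.host, remote],
            check=True, capture_output=True, text=True,
        ).stdout


class Slurm:
    def submit_cmd(self, script: str) -> list[str]:
        # --parsable makes sbatch print only the job id.
        return ["sbatch", "--parsable", script]


class BatchBackend:
    """One generic class; any transport pairs with any scheduler."""

    def __init__(self, transport: Transport, scheduler: Scheduler) -> None:
        self.transport = transport
        self.scheduler = scheduler

    def submit(self, script: str) -> str:
        return self.transport.run(self.scheduler.submit_cmd(script)).strip()
```

Under these assumptions, `BatchBackend(SSHTransport("login-node"), Slurm())` and `BatchBackend(LocalTransport(), Slurm())` differ only in the constructor arguments; adding a scheduler or a transport adds one class rather than one class per pairing.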
Feature Description
- Scheduler protocol finalised: submit_cmd(spec), parse_status(raw), kill_cmd(ids), stages_own_files flag (+ stage_inputs/collect_outputs hooks where needed). Evaluate the ADR's open question (promote staging to a Stager collaborator) and record the outcome.
- _slurm internal module (Tier C): sbatch script/option generation from SubmissionSpec (incl. count→--array for identical fan-out), squeue/sacct parsing, native-state → JobStatus map (port DIRAC's mapping: PENDING/SUSPENDED/CONFIGURING→waiting, COMPLETED→done, CANCELLED/PREEMPTED→aborted…).
- Slurm scheduler consuming _slurm; registered via intercede.schedulers.
- Generic BatchBackend (Tier B): satisfies JobBackend + OutputRetriever (destructive=False) + Cancellable + Purgeable + LoadReporter; sandbox staging via Transport.put/get when stages_own_files is false; workdir layout per job; JobHandle carries host/workdir routing (replaces DIRAC's ssh<batch>:// ref encoding).
- Phase 2 wiring: the integration contract suite (submit → status → fetch → kill, fetch-twice,
partial-failure maps) runs against ssh-slurm and local-slurm stacks via markers.
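
The Scheduler protocol and the Slurm pieces described above could be sketched roughly as follows. This is a minimal sketch, not the intercede implementation: the return types, the SubmissionSpec fields (`script_path`, `count`), the `squeue` output format and the RUNNING mapping are assumptions made for illustration. Only the method names, the `stages_own_files` flag and the listed native-state mappings come from this issue.

```python
import enum
from dataclasses import dataclass
from typing import Protocol, Sequence


class JobStatus(enum.Enum):
    WAITING = "waiting"
    RUNNING = "running"  # assumption: not named in the issue
    DONE = "done"
    ABORTED = "aborted"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class SubmissionSpec:
    # Hypothetical fields; the real SubmissionSpec is not shown in the issue.
    script_path: str
    count: int = 1


class Scheduler(Protocol):
    stages_own_files: bool

    def submit_cmd(self, spec: SubmissionSpec) -> list[str]: ...
    def parse_status(self, raw: str) -> dict[str, JobStatus]: ...
    def kill_cmd(self, ids: Sequence[str]) -> list[str]: ...


# Native Slurm state -> JobStatus, limited to the states listed in the issue
# plus RUNNING (assumed). Anything else falls back to UNKNOWN.
_STATE_MAP = {
    "PENDING": JobStatus.WAITING,
    "SUSPENDED": JobStatus.WAITING,
    "CONFIGURING": JobStatus.WAITING,
    "RUNNING": JobStatus.RUNNING,
    "COMPLETED": JobStatus.DONE,
    "CANCELLED": JobStatus.ABORTED,
    "PREEMPTED": JobStatus.ABORTED,
}


class Slurm:
    stages_own_files = False

    def submit_cmd(self, spec: SubmissionSpec) -> list[str]:
        cmd = ["sbatch", "--parsable"]
        if spec.count > 1:
            # Identical fan-out becomes one job array, not N submissions.
            cmd.append(f"--array=0-{spec.count - 1}")
        cmd.append(spec.script_path)
        return cmd

    def parse_status(self, raw: str) -> dict[str, JobStatus]:
        # Assumes output of: squeue --noheader --format="%i|%T"
        statuses: dict[str, JobStatus] = {}
        for line in raw.splitlines():
            job_id, sep, state = line.strip().partition("|")
            if not sep or not state:
                continue  # skip blank or malformed lines
            native = state.split()[0]  # tolerate trailing detail after the state
            statuses[job_id] = _STATE_MAP.get(native, JobStatus.UNKNOWN)
        return statuses

    def kill_cmd(self, ids: Sequence[str]) -> list[str]:
        return ["scancel", *ids]
```

Because the scheduler only builds argv lists and parses text, it never needs to know whether the command runs locally or over SSH; that decision stays with the transport.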
Definition of Done
- Integration contract suite runs against the ssh-slurm and local-slurm stacks in CI
- count > 1 submits via a single --array (asserted, not N submits)
- fetch_output streams an arbitrary output sandbox (not just stdout/stderr) with bounded materialisation (size/count/timeout/path-containment)
- LoadReporter.counts() from squeue; unknown ids → JobStatus.UNKNOWN
Alternatives Considered
- Per-combination classes (SSHSlurmBackend, LocalSlurmBackend…): N×M explosion; rejected.
- Porting DIRAC's remote-executed SLURM.py driver: requires remote python; dropped (see issue-10).
Additional Context
This issue is the template for every further scheduler.
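
Since later schedulers will inherit the same output handling, it may help to pin down what "bounded materialisation" with path containment could mean in practice. The sketch below is one possible reading, covering the count, size and path-containment bounds; the timeout bound is left out. The function name `materialise`, the `SandboxLimitError` exception, the entry format and the default limits are all hypothetical.

```python
from pathlib import Path
from typing import Iterable


class SandboxLimitError(Exception):
    """Raised when an output sandbox violates a configured bound."""


def materialise(
    entries: Iterable[tuple[str, Iterable[bytes]]],
    dest: Path,
    max_files: int = 100,
    max_bytes: int = 10 * 1024 * 1024,
) -> list[Path]:
    """Write (relative_name, chunks) pairs under dest, enforcing bounds."""
    dest = dest.resolve()
    written: list[Path] = []
    total = 0
    for name, chunks in entries:
        if len(written) >= max_files:
            raise SandboxLimitError("too many files in output sandbox")
        target = (dest / name).resolve()
        # Path containment: reject names that resolve to dest itself or
        # anywhere outside it (absolute paths, "../" traversal).
        if target == dest or not target.is_relative_to(dest):
            raise SandboxLimitError(f"path escapes sandbox: {name!r}")
        target.parent.mkdir(parents=True, exist_ok=True)
        with open(target, "wb") as fh:
            for chunk in chunks:
                total += len(chunk)
                if total > max_bytes:
                    # The partially written file is left for the caller to purge.
                    raise SandboxLimitError("output sandbox too large")
                fh.write(chunk)
        written.append(target)
    return written
```

Consuming the chunks lazily keeps memory use bounded by the chunk size rather than the sandbox size, which is the point of streaming the output instead of loading it whole.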