ROCm · coketaste · May 29, 2026 · May 9, 2026
@@ -5,6 +5,50 @@ All notable changes to madengine will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [Unreleased]
+
+## [2.1.0] - 2026-05-28
+
+### Added
+
+- **`slurm_multi` SLURM escape-hatch launcher**: New self-managed multi-node launcher for workloads that orchestrate their own per-node Docker containers via `srun` (e.g. SGLang Disaggregated proxy + prefill + decode topologies). Selected via `distributed.launcher: "slurm_multi"` (or `"slurm-multi"` alias). Generates a wrapper SBATCH script that runs the model's `.slurm` script directly on baremetal so `srun`/`scontrol` work inside it; performs parallel `srun docker pull` of the registry image on all allocated nodes when the model card sets `env_vars.DOCKER_IMAGE_NAME`. Honors model-card and `--additional-context` `slurm` fields (`partition`, `nodes`, `gpus_per_node`, `time`, `exclusive`, `reservation`, `nodelist`). This launcher coexists with the standard templated launchers (torchrun, vllm, sglang, deepspeed, megatron, torchtitan, primus) — those continue to flow through the standard sbatch template unchanged; only `slurm_multi`/`slurm-multi` takes the self-managed bypass path.
+
+- **`madengine build --use-image [IMAGE | auto]`**: Skip the local Docker build and use a pre-built image instead. With no value, or with `auto`, resolves to the model card's `env_vars.DOCKER_IMAGE_NAME`. Mutually exclusive with `--registry` and `--build-on-compute`. Manifest entries are keyed by model name with `local_image: True` so `ContainerRunner.run_models_from_manifest()` resolves `run_image` correctly and pulls on demand.
+
+- **`madengine build --build-on-compute`**: Build Docker images on a SLURM compute node and push to a registry, then have `madengine run` pull the image in parallel on all allocated nodes. Requires `--registry`. The resulting manifest carries `built_on_compute: true`.
+
+- **slurm_multi build registry gate**: When `madengine build` discovers a `slurm_multi` model and no `--registry`/`--use-image`/`--build-on-compute` is given, the orchestrator either auto-uses `env_vars.DOCKER_IMAGE_NAME` from the model card (implicit `--use-image` fallback) or raises a structured `ConfigurationError` with the four supported options listed.
+
+- **bash-in-salloc execution path** for slurm_multi: when `madengine run` detects `SLURM_JOB_ID` (i.e. running inside an existing `salloc`), the slurm_multi launcher runs the generated wrapper synchronously with `bash` instead of nesting another `sbatch` job. Other launchers continue to use `sbatch` even inside `salloc` (no behavior change for non-slurm_multi).
+
+- **Local self-managed launcher execution** (`container_runner.py`): `ContainerRunner._run_self_managed()` runs the model script directly on the host for self-managed launchers, bypassing madengine's Docker wrapper. Used when `madengine run` detects a `slurm_multi` launcher in local/non-SLURM contexts. Environment variables from the model card and `--additional-context` are injected; keys are logged without values to avoid leaking credentials.
+
+- **Model card config merge into manifest `deployment_config`**: `_execute_with_prebuilt_image` now merges the model card's `distributed` and `slurm` sections into the manifest's `deployment_config`, so the run phase auto-detects SLURM deployment and launcher settings without requiring `--additional-context`. User-supplied CLI values take precedence over model card defaults.
+
+- **`DockerBuilder` registry image injection for parallel pull**: After a successful registry push, `DockerBuilder.generate_manifest()` now sets `DOCKER_IMAGE_NAME` in each `built_models` entry's `env_vars` to the registry image, enabling slurm_multi parallel `srun docker pull` on all nodes without requiring manual image specification.
+
+- **`DeploymentResult.skip_monitoring`** (`deployment/base.py`): new dataclass field so synchronous deploy paths (e.g. slurm_multi's bash-in-salloc) can skip the monitor poll.
+
+- **`SlurmNodeSelector` `reservation` parameter**: optional reservation name forwarded to srun health/cleanup commands so node-prep srun calls run inside the reservation.
+
+- **`tests/unit/test_slurm_multi.py`**: contract tests for `slurm_multi` registry membership, hyphen alias normalization, end-to-end env_vars-export contract against MAD-private PR #186's `pyt_sglang_disagg_qwen3-32b_short` model card, and `_execute_with_prebuilt_image` manifest key-set contract (`built_images.keys() == built_models.keys()`).
+
+- **`examples/slurm-configs/minimal/slurm-multi-minimal.json`**: minimal reference config for the new launcher.
+
+### Changed
+
+- **Early model discovery reuse in `BuildOrchestrator`**: The `DiscoverModels` result from the slurm_multi registry-gate check is now cached and reused for the actual build step, avoiding duplicate `get_models_json.py` execution and duplicate console output.
+
+- **E2E test cleanup defaults expanded**: `DEFAULT_CLEAN_FILES` in `tests/fixtures/utils.py` now includes `build_manifest.json` and related perf artefacts (`perf_super.json`, `perf_entry.csv`, etc.) so stale manifests from prior e2e tests cannot silently cause the wrong image to be executed.
+
+### Fixed
+
+- **slurm_multi: cwd `perf.csv` aggregation**: After a successful slurm_multi run, `madengine run` previously printed a cosmetic `Performance CSV not found: perf.csv` warning even though `_collect_slurm_multi_results` had ingested the per-job CSV from `/shared_inference/$USER/$JOBID/perf.csv`. The reporter (`display_performance_table`) reads cwd `perf.csv` by default. Now `_collect_slurm_multi_results` also writes the per-job rows into cwd `perf.csv` (copy if absent, append-data-rows if present) so reporting and HTML generation work without extra args. Local and classic-SLURM flows are unchanged.
+
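The "copy if absent, append-data-rows if present" behavior in the entry above can be sketched as follows. This is a minimal illustration, not the code in `_collect_slurm_multi_results`: the helper name `merge_perf_csv` and the column names used in the usage note are hypothetical, and the sketch assumes both files share the same header and that the existing file ends with a newline.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def merge_perf_csv(job_csv: Path, cwd_csv: Path) -> int:
    """Copy job_csv to cwd_csv if absent; otherwise append only its data rows.

    Hypothetical helper. Returns the number of data rows contributed.
    """
    with job_csv.open(newline="") as src:
        rows = list(csv.reader(src))
    data_rows = rows[1:]  # first row is assumed to be the header

    if not cwd_csv.exists():
        # No aggregate file yet: a plain copy keeps the header intact.
        shutil.copyfile(job_csv, cwd_csv)
        return len(data_rows)

    # Aggregate file exists: append data rows only, so the header
    # is not repeated in the middle of the file.
    with cwd_csv.open("a", newline="") as dst:
        csv.writer(dst, lineterminator="\n").writerows(data_rows)
    return len(data_rows)
```

Calling the helper twice with the same per-job file (for example one containing `model,performance` and a single row) leaves one header and two data rows in the cwd file, which is the shape a reporter that reads cwd `perf.csv` would expect.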
+### Security
+
+- **Shell injection hardening in slurm_multi wrapper scripts**: `shlex.quote()` is applied to env_var values, the model script name, and model args in the generated SBATCH wrapper script (`slurm.py::_prepare_slurm_multi_script`) and the local self-managed runner (`container_runner.py::_run_self_managed`), preventing shell metacharacters (`$()`, backticks, `;`, `"`, etc.) in user-supplied inputs from triggering host-shell expansion.
+
 ## [2.0.3] - 2026-05-26
 
 ### Added

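The shell-quoting hardening described in the Security entry above can be illustrated with a short sketch. It is not the body of `_prepare_slurm_multi_script`: the function `render_wrapper` and the generated script layout are hypothetical, and the variable-name check is an extra assumption of this sketch rather than something the changelog states. Only `shlex.quote()` itself comes from the source.

```python
import re
import shlex

# Assumption of this sketch: env var names must be plain shell identifiers.
_VALID_KEY = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def render_wrapper(env_vars: dict, script: str, args: list) -> str:
    """Render a wrapper script with quoted values, script name, and args.

    Hypothetical helper showing where shlex.quote() is applied.
    """
    lines = ["#!/bin/bash", "set -euo pipefail"]
    for key, value in env_vars.items():
        if not _VALID_KEY.match(key):
            raise ValueError(f"invalid environment variable name: {key!r}")
        # Quoting the value keeps $(), backticks, ';' and quotes literal.
        lines.append(f"export {key}={shlex.quote(str(value))}")
    # The script name and every argument are quoted individually.
    command = " ".join(shlex.quote(part) for part in [script, *args])
    lines.append(f"bash {command}")
    return "\n".join(lines) + "\n"
```

With this shape, a value such as `$(whoami); echo hi` is emitted inside single quotes and reaches the model script as literal text, while a value made only of safe characters, such as `lmsysorg/sglang:latest`, is emitted unchanged.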
@@ -97,6 +97,8 @@ madengine build [OPTIONS]
 | `--tags` | `-t` | TEXT | `[]` | Model tags to build (can specify multiple) |
 | `--target-archs` | `-a` | TEXT | `[]` | Target GPU architectures (e.g., gfx908,gfx90a,gfx942) |
 | `--registry` | `-r` | TEXT | `None` | Docker registry to push images to |
+| `--use-image` | | TEXT | `None` | Skip Docker build and use a pre-built image. Omit value or pass `auto` to resolve from model card's `DOCKER_IMAGE_NAME`. Mutually exclusive with `--registry` and `--build-on-compute` |
+| `--build-on-compute` | | FLAG | `False` | Build Docker images on a SLURM compute node and push to registry. Requires `--registry` |
 | `--batch-manifest` | | TEXT | `None` | Input batch.json file for batch build mode |
 | `--additional-context` | `-c` | TEXT | `"{}"` | Additional context as JSON string |
 | `--additional-context-file` | `-f` | TEXT | `None` | File containing additional context JSON |
@@ -142,6 +144,15 @@ madengine build --tags model \
 
 # Real-time output with verbose logging
 madengine build --tags model --live-output --verbose
+
+# Use a pre-built image (skip Docker build)
+madengine build --tags model --use-image lmsysorg/sglang:v0.5.2rc1-rocm700-mi30x
+
+# Auto-detect image from model card's DOCKER_IMAGE_NAME
+madengine build --tags model --use-image
+
+# Build on SLURM compute node and push to registry
+madengine build --tags model --build-on-compute --registry docker.io/myorg
 ```
 
 **Default Values:**
@@ -658,6 +669,6 @@ madengine recognizes these environment variables:
 
 ---
 
-**Version:** 2.0.0  
-**Last Updated:** December 2025
+**Version:** 2.1.0  
+**Last Updated:** May 2026
 
@@ -472,6 +472,8 @@ Automatically applies (see presets under `src/madengine/deployment/presets/k8s/`
 - `gpus_per_node` - GPUs per node (default: 1)
 - `nodes` - Number of nodes (default: 1)
 - `nodelist` - Comma-separated node names to run on (e.g. `"node01,node02"`); when set, job is restricted to these nodes and automatic node health preflight is skipped
+- `reservation` - SLURM reservation name; forwarded to srun health/cleanup commands and SBATCH directives
+- `exclusive` - Exclusive node access (default: `true`)
 - `time` - Wall time limit HH:MM:SS (required)
 - `mem` - Memory per node (e.g., "64G")
 - `mail_user` - Email for notifications
@@ -521,8 +523,11 @@ Automatically applies (see presets under `src/madengine/deployment/presets/k8s/`
 - `deepspeed` - ZeRO optimization
 - `megatron` - Large transformers (K8s + SLURM)
 - `torchtitan` - LLM pre-training
+- `primus` - Primus unified pretrain
 - `vllm` - LLM inference
 - `sglang` - Structured generation
+- `sglang-disagg` - Disaggregated SGLang
+- `slurm_multi` / `slurm-multi` - Self-managed multi-container topologies (SLURM only)
 
 See [Launchers Guide](launchers.md) for details.
 

@@ -144,6 +144,7 @@ This creates:
 - `vllm` - LLM inference
 - `sglang` - Structured generation
 - `sglang-disagg` - Disaggregated SGLang (multi-node)
+- `slurm_multi` / `slurm-multi` - Self-managed multi-container topologies (SLURM only, escape hatch)
 
 See [Launchers Guide](launchers.md) for details.
 
@@ -242,8 +243,10 @@ The deployment target is automatically detected from the `slurm` key in the conf
 - `gpus_per_node`: Number of GPUs per node
 - `nodes`: Number of nodes (for multi-node)
 - `nodelist`: Comma-separated node names to run on (e.g. `"node01,node02"`); when set, job runs only on these nodes and node health preflight is skipped
+- `reservation`: SLURM reservation name; forwarded to srun health/cleanup commands
 - `time`: Wall time limit (HH:MM:SS)
 - `mem`: Memory per node (e.g., "64G")
+- `exclusive`: Exclusive node access (default: `true`)
 - `mail_user`: Email for job notifications
 - `mail_type`: Notification types (BEGIN, END, FAIL, ALL)
 
@@ -291,6 +294,53 @@ scontrol show job <job_id>
 tail -f slurm-<job_id>.out
 ```
 
+### Pre-Built Images and Build-on-Compute
+
+For workloads that use externally maintained Docker images (e.g. SGLang, vLLM releases):
+
+```bash
+# Skip Docker build, use a pre-built image
+madengine build --tags model --use-image lmsysorg/sglang:latest
+
+# Auto-detect image from model card's DOCKER_IMAGE_NAME
+madengine build --tags model --use-image
+
+# Build on a SLURM compute node and push to registry
+madengine build --tags model --build-on-compute --registry docker.io/myorg
+```
+
+The manifest generated by `--use-image` merges the model card's `distributed` and `slurm` config into `deployment_config`, so the run phase auto-detects SLURM deployment without requiring `--additional-context`.
+
+### slurm_multi Launcher (Self-Managed)
+
+For workloads that orchestrate their own per-node Docker containers (e.g. SGLang Disaggregated proxy + prefill + decode topologies), use the `slurm_multi` launcher:
+
+```json
+{
+  "distributed": {
+    "launcher": "slurm_multi"
+  },
+  "slurm": {
+    "partition": "gpu",
+    "nodes": 3,
+    "gpus_per_node": 8,
+    "reservation": "my-reservation"
+  }
+}
+```
+
+Unlike templated launchers, slurm_multi runs the model's `.slurm` script directly on baremetal. The script manages its own Docker containers via `srun` internally. See [Launchers Guide — slurm_multi](launchers.md#9-slurm_multi-self-managed-escape-hatch) for details.
+
+### Running Inside salloc
+
+When `madengine run` detects an existing SLURM allocation (`SLURM_JOB_ID` is set, e.g. inside `salloc`), the slurm_multi launcher runs the generated wrapper script synchronously with `bash` instead of nesting another `sbatch`. Other launchers continue to use `sbatch` even inside `salloc`.
+
+```bash
+salloc --nodes=3 --gpus-per-node=8 --partition=gpu
+madengine run --manifest-file build_manifest.json
+# → Detects salloc, runs synchronously
+```
+
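The salloc rule described above reduces to a small decision: only slurm_multi switches to `bash`, and only when `SLURM_JOB_ID` is set. The sketch below is an illustration under that reading. The function name `choose_submit_command` and its signature are hypothetical and do not come from the madengine source.

```python
import os
from typing import List, Mapping, Optional

# Both spellings are accepted; the hyphen form is an alias.
_SLURM_MULTI_NAMES = {"slurm_multi", "slurm-multi"}


def choose_submit_command(
    launcher: str,
    wrapper: str,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the command used to start the generated wrapper script.

    Hypothetical helper: bash inside an existing allocation for
    slurm_multi, sbatch in every other case.
    """
    env = os.environ if env is None else env
    inside_allocation = bool(env.get("SLURM_JOB_ID"))
    if launcher in _SLURM_MULTI_NAMES and inside_allocation:
        return ["bash", wrapper]  # synchronous, no nested sbatch job
    return ["sbatch", wrapper]
```

Passing the environment as a parameter keeps the decision testable without mutating `os.environ`; a synchronous result is also the case where a field like `DeploymentResult.skip_monitoring` would be set.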
 ### Cancellation
 
 ```bash

@@ -20,6 +20,7 @@ madengine provides unified support for multiple distributed frameworks, enabling
 | **vLLM** | Inference | High-throughput LLM serving | ✅ | ✅ | ✅ |
 | **SGLang** | Inference | Fast LLM inference | ✅ | ✅ | ✅ |
 | **SGLang Disaggregated** | Inference | Large-scale disaggregated inference | ✅ | ✅ | ✅ (min 3) |
+| **slurm_multi** | Escape hatch | Self-managed multi-container topologies | ❌ | ✅ | ✅ |
 
 ---
 
@@ -557,6 +558,108 @@ madengine run --tags model --config custom-split-config.json
 
 ---
 
+### 9. slurm_multi (Self-Managed Escape Hatch)
+
+**Purpose**: Run workloads that manage their own per-node Docker containers via `srun` — an escape hatch for topologies that don't fit the standard templated launchers.
+
+**When to Use**:
+- ✅ Multi-container SLURM topologies (e.g. SGLang Disaggregated proxy + prefill + decode)
+- ✅ Workloads whose `.slurm` script orchestrates Docker containers via `srun` internally
+- ✅ Scenarios requiring baremetal `srun`/`scontrol` access from the model script
+- ❌ NOT a peer of templated launchers — use torchrun, vllm, sglang, etc. for standard workloads
+
+**Configuration**:
+```json
+{
+  "distributed": {
+    "launcher": "slurm_multi",
+    "nnodes": 3,
+    "nproc_per_node": 8
+  },
+  "slurm": {
+    "partition": "gpu",
+    "nodes": 3,
+    "gpus_per_node": 8,
+    "time": "04:00:00",
+    "exclusive": true,
+    "reservation": "my-reservation"
+  }
+}
+```
+
+**How It Works**:
+
+Unlike templated launchers that inject `MAD_MULTI_NODE_RUNNER` and wrap the model script inside a Docker container, slurm_multi:
+
+1. Generates a wrapper SBATCH script that exports `env_vars` from the model card
+2. Performs a parallel `srun docker pull` on all allocated nodes when using registry images
+3. Runs the model's own `.slurm` script directly on baremetal (head node)
+4. Lets the model script orchestrate per-node Docker containers via `srun` internally
+5. Writes a completion marker file for robust job completion detection
+
+```
+┌─────────────────────────────────────────────────┐
+│  madengine build --use-image <image>             │
+│  → Generates manifest with pre-built image       │
+│  → Merges model card slurm/distributed config    │
+└───────────────────┬─────────────────────────────┘
+                    ↓
+┌─────────────────────────────────────────────────┐
+│  madengine run --manifest-file manifest.json     │
+│  → Detects slurm_multi launcher                  │
+│  → Generates wrapper SBATCH script               │
+│  → Parallel docker pull on all nodes (if needed) │
+│  → Submits sbatch (or runs bash if inside salloc)│
+└───────────────────┬─────────────────────────────┘
+                    ↓
+┌─────────────────────────────────────────────────┐
+│  Model's .slurm script runs on head node         │
+│  → Orchestrates Docker containers via srun       │
+│  → Manages its own topology (proxy/prefill/...)  │
+│  → Writes perf.csv (collected by madengine)      │
+└─────────────────────────────────────────────────┘
+```
+
+**Build Phase**:
+
+slurm_multi models typically use pre-built images. The build phase has a **registry gate**: if no `--registry`, `--use-image`, or `--build-on-compute` is given, the orchestrator either auto-detects `DOCKER_IMAGE_NAME` from the model card (implicit `--use-image`) or raises a `ConfigurationError` listing the supported options.
+
+```bash
+# Use a pre-built image (recommended for slurm_multi)
+madengine build --tags my_model --use-image lmsysorg/sglang:latest
+
+# Auto-detect image from model card's DOCKER_IMAGE_NAME
+madengine build --tags my_model --use-image
+
+# Build on compute node and push to registry
+madengine build --tags my_model --build-on-compute --registry docker.io/myorg
+```
+
+**Run Phase — salloc support**:
+
+When `madengine run` detects `SLURM_JOB_ID` (running inside an existing `salloc` allocation), the slurm_multi launcher runs the wrapper script synchronously with `bash` instead of nesting another `sbatch`. Other launchers continue to use `sbatch` inside `salloc` (no behavior change).
+
+```bash
+# Inside salloc: runs synchronously with bash
+salloc --nodes=3 --gpus-per-node=8 --partition=gpu
+madengine run --manifest-file build_manifest.json
+```
+
+**Alias**: `"slurm-multi"` (hyphen) is normalized to `"slurm_multi"` (underscore).
+
+**Features**:
+- Wrapper SBATCH script with shell-quoted env_vars (injection-safe)
+- Parallel `srun docker pull` on all nodes for registry images
+- Completion marker for robust job status detection
+- bash-in-salloc synchronous execution path
+- `DeploymentResult.skip_monitoring` for synchronous runs
+- Model card slurm/distributed config auto-merged into manifest
+
+**Examples**:
+- SLURM: `examples/slurm-configs/minimal/slurm-multi-minimal.json`
+
+---
+
 ## Comparison Matrix
 
 ### Training Launchers
@@ -732,7 +835,7 @@ SGLANG_NODE_RANK=${SLURM_PROCID}
 ```bash
 Error: Unknown launcher type 'xyz'
 ```
-Solution: Use one of: `torchrun`, `deepspeed`, `megatron`, `torchtitan`, `primus`, `vllm`, `sglang`, `sglang-disagg`
+Solution: Use one of: `torchrun`, `deepspeed`, `megatron`, `torchtitan`, `primus`, `vllm`, `sglang`, `sglang-disagg`, `slurm_multi` (or `slurm-multi`)
 
 **2. Multi-Node Communication Fails**
 ```bash
@@ -782,6 +885,9 @@ $MAD_MULTI_NODE_RUNNER your_training_script.py --args
 
 # For vLLM/sglang (no MAD_MULTI_NODE_RUNNER)
 python your_inference_script.py --args
+
+# For slurm_multi (no MAD_MULTI_NODE_RUNNER; script runs on baremetal and manages Docker via srun)
+# The model's .slurm script is executed directly — it handles srun, docker run, etc. internally
 ```
 
 ### Launcher Detection