Release v1.0.0 · NVIDIA/srt-slurm

What's Changed

Update README to include TensorRT LLM and vLLM in description by @nlevin-ui in #1
[MISC] Add License / headers, and a small check to prepare for release by @xinli-sw in #4
feat: enable runtime container detection for portable dynamo source builds by @qiching in #3
Sync ishandhanani/srt-slurm history into NVIDIA/srt-slurm by @csahithi in #14
Add trace-replay benchmark type by @alec-flowers in #16
fix: use custom_tokenizer to workaround the trtllm + glm5 tokenizer loading issue by @richardhuo-nv in #20
fix: add nvidia pypi as an extra index to be able to pip install the prerelease dynamo wheels by @richardhuo-nv in #22
fix: support cross-arch clusters (x86_64 login, aarch64 compute) by @alec-flowers in #17
feat: trace-replay benchmark with aiperf_args passthrough by @alec-flowers in #18
feat: add mocker backend for pipeline smoke tests by @alec-flowers in #25
feat: separate login-node and compute-node venvs by @alec-flowers in #29
feat: runtime fingerprinting, identity verification, and lockfile by @alec-flowers in #19
feat: configurable NATS max_payload for disagg serving by @alec-flowers in #31
Copy {job_id}.json into log directory for S3 upload by @KaunilD in #15
TRTLLM nsys profiling harness + Dynamo OTEL tracing automation by @karen-sy in #27
Add CODEOWNERS file by @xinli-sw in #37
Add CSV export for sa-bench rollup by @weireweire in #26
Sanitize srun output in node IP resolution by @weireweire in #38
feat: lockfile v2 — shareable recipe + lock section by @alec-flowers in #32
fix: Install maturin if not present by @trevor-m in #45
[codex] Add generic telemetry and custom benchmark support by @ishandhanani in #43
[codex] Port HF cache cleanup by @ishandhanani in #49
Add srt-slurm MCP spec server and preflight validation by @ishandhanani in #53
Push logs_url to status API eagerly and via final PUT by @ishandhanani in #54
[codex] narrow srtctl mcp to authoring and validation by @ishandhanani in #55
[codex] Keep MCP validation off host cluster config by @ishandhanani in #56
fix: emit aggregated resources and harden sa-bench rollup by @ishandhanani in #58
feat: use pre-generated custom dataset for benchmarking MTP with chat template by @richardhuo-nv in #64
docs: loud warnings on custom benchmark templating and nginx-off mode by @ishandhanani in #66
feat(sa-bench): add sglang DeepSeek-V4 tokenizer by @YAMY1234 in #73
feat: DeepSeek-V4-Pro perf recipes for GB300 / GB200 (1k/1k agg) by @elvischenv in #70
fix(orchestrate): robust container bootstrap (maturin/protoc/venv-race) by @ishandhanani in #81
fix(sa-bench): actionable error + warmup parity for use_chat_template by @YAMY1234 in #76
feat(schema): make gsm8k a first-class BenchmarkType by @ishandhanani in #82
[codex] add AIME benchmark by @ishandhanani in #83
feat(aime): rework around ns eval for reasoning-model parity by @ishandhanani in #87
Add scripts for wideEP; Note we can reach a PD balance with dep8, cc=2048 by @samuellees in #52
Revert "Add scripts for wideEP; Note we can reach a PD balance with dep8, cc=2048" by @ishandhanani in #89
refactor(aime): drop structured runner, ship configs/aime/{run.sh,rescore.py} by @ishandhanani in #91
Add the chat template to the glm5 tokenizer and apply that when sampling the requests by @richardhuo-nv in #65
feat(config): resolve container aliases for telemetry + preflight by @ishandhanani in #101
[codex] Add Dynamo nightly wheel install support by @alec-flowers in #99
feat(dynamo): cache hash-pinned source builds on /configs by @ishandhanani in #88
Add DeepSeek V4 Pro vLLM GB200 recipes by @alec-flowers in #102
feat(config): cluster-wide default_bash_preamble for ulimits and the like by @ishandhanani in #104
fix(nginx): raise file descriptor limit for nginx workers by @ishandhanani in #108
log: always set dyn skip log fmt by @ishandhanani in #109
[NOT FINAL] add wip DSv4 aggregate and disaggregate recipes by @ishandhanani in #85
nginx: rework to make ulimit optional by @ishandhanani in #110
log: demote per-srun command line to DEBUG by @cquil11 in #111
fix: using a setup script to install pip in trtllm venv by @richardhuo-nv in #116
default dyn log by @ishandhanani in #118
feat: Add live monitor to SRT-SLURM by @leo-cf-tian in #119
Pass in bootstrap port on prefill by @wenscarl in #121
Cherry-pick lm-eval benchmark runner from sa-submission-q2-2026 by @ishandhanani in #122
fix: preflight accepts hf:* model paths and Docker image URIs by @Thunderbeee in #125
Add GLM5 B200 FP8 disaggregated recipe by @weireweire in #50
[NOT FINAL] Qwen3.5 fp8 mtp-off recipes by @samuellees in #128
feat: live in-flight batch-metrics snapshotter (opt-in) by @YAMY1234 in #115
feat(profiling): add extra_nsys_args for optional nsys CLI flags by @zhengd-nv in #59
Handle null telemetry in live metrics startup by @weireweire in #135
Add GPT-OSS TRT-LLM aggregated recipe by @faradawn in #132
feat: peak gen throughput metric in sa-bench + server-side node metrics CSV export by @zhengd-nv in #93
feat: first-class mooncake KV store support for SGLang backend by @ishandhanani in #136
feat: SGLang decode slow_down for PD disagg nsys profiling (with skip-warmup workflow) by @zhengd-nv in #60
sglang: enable mooncake_master HTTP metadata server + auto-inject MOONCAKE_TE_META_DATA_SERVER by @ishandhanani in #138
recipes: update glm5 sglang to use faster weights loading by @weireweire in #137
sa-bench: make SGLangDeepseekV4Tokenizer callable by @ch-wan in #144
fix(batch-metrics): split agg logs by DP rank by @YAMY1234 in #145
Capture git state for extra mounts by @YAMY1234 in #146
Sglang port jitter by @nvjullin in #134
Default SA-Bench random workers to auto by @weireweire in #147
Update GB300 FP4 GLM-5 recipe by @weireweire in #152
Expand environment variables in extra_mount paths by @weireweire in #153
Make batch metrics legends translucent by @YAMY1234 in #151
Centralize safe runtime port allocation by @weireweire in #156
Support default sbatch directives in srtslurm config by @weireweire in #159
Update GB300 FP8 GLM-5 recipe by @weireweire in #160
Add Nemotron Super 120B recipes by @faradawn in #150
Add --no-preflight CLI flag to srtctl apply by @cquil11 in #162
Add Qwen3.5 DeepEP MTP recipes by @YAMY1234 in #163
Accept legacy token metric names in telemetry plots by @weireweire in #166
Fixing sweep submissions for 'sweep' block by @AlphaBladez in #170
Add DSV4 GB300 8k1k recipe by @weireweire in #173
Add GB300 FP8 GLM5 MTP recipes and update max-running-requests by @weireweire in #168
Add spread worker placement and vLLM colocation (PR against main) by @jasonlizhengjian in #182
Force-reinstall maturin in portable top_of_tree dynamo source build by @Ankur-singh in #183
feat(config): add default_health_check cluster-level default in srtslurm.yaml by @shljessie in #180
fix(slurm): use --key=value for srun options (Slurm 25.11 cpu-bind regression) by @shljessie in #179
Added heterogeneous job support by @nvjullin in #178
Copy resolved override/zip config into log dir for S3 upload by @KaunilD in #194
Add workflow: auto-release on PR merge to main by @Ankur-singh in #185
Fix release workflow 403 by using pull_request_target by @Ankur-singh in #200

New Contributors

@xinli-sw made their first contribution in #4
@csahithi made their first contribution in #14
@KaunilD made their first contribution in #15
@karen-sy made their first contribution in #27
@trevor-m made their first contribution in #45
@ishandhanani made their first contribution in #43
@elvischenv made their first contribution in #70
@samuellees made their first contribution in #52
@cquil11 made their first contribution in #111
@leo-cf-tian made their first contribution in #119
@wenscarl made their first contribution in #121
@Thunderbeee made their first contribution in #125
@zhengd-nv made their first contribution in #59
@faradawn made their first contribution in #132
@ch-wan made their first contribution in #144
@nvjullin made their first contribution in #134
@AlphaBladez made their first contribution in #170
@Ankur-singh made their first contribution in #183
@shljessie made their first contribution in #180

Full Changelog: https://github.com/NVIDIA/srt-slurm/commits/v1.0.0
