Add MI355X config: glm5-fp8-sglang-mtp#1086
Conversation
There was a problem hiding this comment.
LGTM — straightforward MTP config addition mirroring existing patterns; the stale image name in the comment is flagged inline.
Extended reasoning...
Overview
This PR adds a new glm5-fp8-mi355x-sglang-mtp entry to amd-master.yaml, a matching benchmarks/single_node/glm5_fp8_mi355x_mtp.sh launch script, and an append-only perf-changelog.yaml entry. Three files touched, all additive.
Security risks
None. This is a benchmark configuration and shell script that runs inside a pre-defined container. No auth, crypto, or permission-sensitive paths are involved.
Level of scrutiny
Low. The change follows a well-worn pattern — identical structure to dsr1-fp8-b200-sglang-mtp, qwen3.5-fp8-b200-sglang-mtp, and other MTP siblings already in the repo. The EAGLE speculative-decoding flags, SGLANG_ENABLE_SPEC_V2=1, and --use-chat-template are consistent with existing MTP scripts.
Other factors
The one bug found (stale image name in a comment) is a documentation nit with zero functional impact — the actual image is correctly specified in the YAML and the pip install workaround runs unconditionally regardless. The inline comment already covers this. No outstanding reviewer threads exist.
| # GLM-5 requires transformers with glm_moe_dsa model type support. | ||
| # However, the Image rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219 doesn't provide this support. | ||
| python3 -m pip install -U --no-cache-dir \ | ||
| "git+https://github.com/huggingface/transformers.git@6ed9ee36f608fd145168377345bfc4a5de12e1e2" |
There was a problem hiding this comment.
🟡 The comment on lines 18-19 of glm5_fp8_mi355x_mtp.sh references image rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219, but the script actually runs against lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413 (as defined in the YAML config). Update the comment to reference the correct image so maintainers can accurately evaluate whether the pinned transformers pip install is still required.
Extended reasoning...
What the bug is: Lines 18–19 of the new benchmarks/single_node/glm5_fp8_mi355x_mtp.sh contain the comment:
# However, the Image rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219 doesn't provide this support.
This comment explains why the script unconditionally runs pip install -U … transformers@<commit>. The stated reason is that a specific image lacks GLM-5 glm_moe_dsa model type support.
The specific code path that triggers it: A developer reading the script to decide whether the workaround is still necessary will look at lines 18–21 to understand the rationale. The comment identifies rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219 as the deficient image.
Why existing code doesn't prevent it: The comment was copy-pasted from benchmarks/single_node/glm5_fp8_mi355x.sh (the non-MTP script), which itself already had a stale comment after PR #1023 upgraded the non-MTP image from that old rocm/sgl-dev image to lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413 without updating the explanatory comment. The MTP script was then created by copying the non-MTP script and inheriting the already-stale comment.
What the impact would be: The actual image used—as defined by glm5-fp8-mi355x-sglang-mtp in amd-master.yaml—is lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413. A maintainer reading the comment might:
- Assume the pip install workaround is still needed because of a limitation in the old v0.5.8 image, and not think to re-check it against the newer v0.5.10rc0 image.
- Conversely, if they do notice the discrepancy, they waste time tracing the history to figure out which image the comment was meant to describe.
The pip install runs regardless, so there is no functional impact—this is purely a documentation accuracy issue.
How to fix it: Update the comment to reference the image actually in use. For example:
# GLM-5 requires transformers with glm_moe_dsa model type support.
# However, the image lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413 doesn't provide this support.The same fix should also be applied to the identical stale comment in glm5_fp8_mi355x.sh (the non-MTP script), which was introduced in PR #1023.
Step-by-step proof:
- Open
.github/configs/amd-master.yamland look at theglm5-fp8-mi355x-sglang-mtpkey →image: lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413. - Open
benchmarks/single_node/glm5_fp8_mi355x_mtp.sh, line 19 →# However, the Image rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219 doesn't provide this support. - The image name in the comment (
rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219) is entirely different from the image actually pulled (lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413) — different registry, different tag, different date. - Check
benchmarks/single_node/glm5_fp8_mi355x.shline 19 → identical stale comment, confirming the MTP script was copy-pasted from there without updating the comment.
1f01c5a to
fa2f438
Compare
Mirrors the existing glm5-fp8-mi355x-sglang non-MTP recipe and adds EAGLE speculative decoding (num-steps=3, eagle-topk=1, num-draft-tokens=4) via the standard spec-decoding=mtp suffix. SGLANG_ENABLE_SPEC_V2=1 is set before launching the server as required for GLM-5 MTP. Script also passes --use-chat-template to run_benchmark_serving, as required by AGENTS.md for all MTP configs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fa2f438 to
c698215
Compare
Summary
glm5-fp8-mi355x-sglang-mtpconfig mirroring the existingglm5-fp8-mi355x-sglangnon-MTP recipe, plus a newbenchmarks/single_node/glm5_fp8_mi355x_mtp.shlaunch script.--speculative-algorithm EAGLE --speculative-num-steps 3 --speculative-eagle-topk 1 --speculative-num-draft-tokens 4.SGLANG_ENABLE_SPEC_V2=1in the env beforesglang.launch_server(required for GLM-5 MTP).--use-chat-templatetorun_benchmark_servingper the AGENTS.md requirement for all MTP scripts.spec-decoding: mtpso the runner picks up the_mtp.shvariant.perf-changelog.yamldiff is append-only (no modifications to any existing line).Test plan
bash -n benchmarks/single_node/glm5_fp8_mi355x_mtp.sh— bash syntax OK.git diff perf-changelog.yamlshows only additions.python3 utils/matrix_logic/generate_sweep_configs.py full-sweep --config-files .github/configs/amd-master.yaml— emits 10 entries (2 ISL/OSL × 5 concurrencies) with spec-decoding=mtp.🤖 Generated with Claude Code