Skip to content

Add MI355X config: glm5-fp8-sglang-mtp#1086

Merged
functionstackx merged 1 commit intomainfrom
claude/add-glm5-fp8-mi355x-mtp
Apr 18, 2026
Merged

Add MI355X config: glm5-fp8-sglang-mtp#1086
functionstackx merged 1 commit intomainfrom
claude/add-glm5-fp8-mi355x-mtp

Conversation

@functionstackx
Copy link
Copy Markdown
Contributor

Summary

  • Adds glm5-fp8-mi355x-sglang-mtp config mirroring the existing glm5-fp8-mi355x-sglang non-MTP recipe, plus a new benchmarks/single_node/glm5_fp8_mi355x_mtp.sh launch script.
  • Adds EAGLE speculative decoding flags on top of the non-MTP script: --speculative-algorithm EAGLE --speculative-num-steps 3 --speculative-eagle-topk 1 --speculative-num-draft-tokens 4.
  • Sets SGLANG_ENABLE_SPEC_V2=1 in the env before sglang.launch_server (required for GLM-5 MTP).
  • Passes --use-chat-template to run_benchmark_serving per the AGENTS.md requirement for all MTP scripts.
  • Search space rows carry spec-decoding: mtp so the runner picks up the _mtp.sh variant.
  • perf-changelog.yaml diff is append-only (no modifications to any existing line).

Test plan

  • YAML parses for both master config and perf-changelog.
  • bash -n benchmarks/single_node/glm5_fp8_mi355x_mtp.sh — bash syntax OK.
  • git diff perf-changelog.yaml shows only additions.
  • python3 utils/matrix_logic/generate_sweep_configs.py full-sweep --config-files .github/configs/amd-master.yaml — emits 10 entries (2 ISL/OSL × 5 concurrencies) with spec-decoding=mtp.
  • CI sweep passes on MI355X.

🤖 Generated with Claude Code

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — straightforward MTP config addition mirroring existing patterns; the stale image name in the comment is flagged inline.

Extended reasoning...

Overview

This PR adds a new glm5-fp8-mi355x-sglang-mtp entry to amd-master.yaml, a matching benchmarks/single_node/glm5_fp8_mi355x_mtp.sh launch script, and an append-only perf-changelog.yaml entry. Three files touched, all additive.

Security risks

None. This is a benchmark configuration and shell script that runs inside a pre-defined container. No auth, crypto, or permission-sensitive paths are involved.

Level of scrutiny

Low. The change follows a well-worn pattern — identical structure to dsr1-fp8-b200-sglang-mtp, qwen3.5-fp8-b200-sglang-mtp, and other MTP siblings already in the repo. The EAGLE speculative-decoding flags, SGLANG_ENABLE_SPEC_V2=1, and --use-chat-template are consistent with existing MTP scripts.

Other factors

The one bug found (stale image name in a comment) is a documentation nit with zero functional impact — the actual image is correctly specified in the YAML and the pip install workaround runs unconditionally regardless. The inline comment already covers this. No outstanding reviewer threads exist.

Comment on lines +18 to +21
# GLM-5 requires transformers with glm_moe_dsa model type support.
# However, the Image rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219 doesn't provide this support.
python3 -m pip install -U --no-cache-dir \
"git+https://github.com/huggingface/transformers.git@6ed9ee36f608fd145168377345bfc4a5de12e1e2"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The comment on lines 18-19 of glm5_fp8_mi355x_mtp.sh references image rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219, but the script actually runs against lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413 (as defined in the YAML config). Update the comment to reference the correct image so maintainers can accurately evaluate whether the pinned transformers pip install is still required.

Extended reasoning...

What the bug is: Lines 18–19 of the new benchmarks/single_node/glm5_fp8_mi355x_mtp.sh contain the comment:

# However, the Image rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219 doesn't provide this support.

This comment explains why the script unconditionally runs pip install -U … transformers@<commit>. The stated reason is that a specific image lacks GLM-5 glm_moe_dsa model type support.

The specific code path that triggers it: A developer reading the script to decide whether the workaround is still necessary will look at lines 18–21 to understand the rationale. The comment identifies rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219 as the deficient image.

Why existing code doesn't prevent it: The comment was copy-pasted from benchmarks/single_node/glm5_fp8_mi355x.sh (the non-MTP script), which itself already had a stale comment after PR #1023 upgraded the non-MTP image from that old rocm/sgl-dev image to lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413 without updating the explanatory comment. The MTP script was then created by copying the non-MTP script and inheriting the already-stale comment.

What the impact would be: The actual image used—as defined by glm5-fp8-mi355x-sglang-mtp in amd-master.yaml—is lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413. A maintainer reading the comment might:

  1. Assume the pip install workaround is still needed because of a limitation in the old v0.5.8 image, and not think to re-check it against the newer v0.5.10rc0 image.
  2. Conversely, if they do notice the discrepancy, they waste time tracing the history to figure out which image the comment was meant to describe.
    The pip install runs regardless, so there is no functional impact—this is purely a documentation accuracy issue.

How to fix it: Update the comment to reference the image actually in use. For example:

# GLM-5 requires transformers with glm_moe_dsa model type support.
# However, the image lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413 doesn't provide this support.

The same fix should also be applied to the identical stale comment in glm5_fp8_mi355x.sh (the non-MTP script), which was introduced in PR #1023.

Step-by-step proof:

  1. Open .github/configs/amd-master.yaml and look at the glm5-fp8-mi355x-sglang-mtp key → image: lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413.
  2. Open benchmarks/single_node/glm5_fp8_mi355x_mtp.sh, line 19 → # However, the Image rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219 doesn't provide this support.
  3. The image name in the comment (rocm/sgl-dev:v0.5.8.post1-rocm720-mi35x-20260219) is entirely different from the image actually pulled (lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260413) — different registry, different tag, different date.
  4. Check benchmarks/single_node/glm5_fp8_mi355x.sh line 19 → identical stale comment, confirming the MTP script was copy-pasted from there without updating the comment.

@functionstackx functionstackx force-pushed the claude/add-glm5-fp8-mi355x-mtp branch from 1f01c5a to fa2f438 Compare April 18, 2026 03:07
Mirrors the existing glm5-fp8-mi355x-sglang non-MTP recipe and adds
EAGLE speculative decoding (num-steps=3, eagle-topk=1, num-draft-tokens=4)
via the standard spec-decoding=mtp suffix. SGLANG_ENABLE_SPEC_V2=1 is
set before launching the server as required for GLM-5 MTP. Script also
passes --use-chat-template to run_benchmark_serving, as required by
AGENTS.md for all MTP configs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@functionstackx functionstackx force-pushed the claude/add-glm5-fp8-mi355x-mtp branch from fa2f438 to c698215 Compare April 18, 2026 06:03
@functionstackx functionstackx merged commit ca59324 into main Apr 18, 2026
3 checks passed
@functionstackx functionstackx deleted the claude/add-glm5-fp8-mi355x-mtp branch April 18, 2026 06:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

1 participant