
Update qwen3.5-fp8-mi355x-sglang and mtp SGLang ROCm image to v0.5.12-rocm720-mi35x-20260517#1444

Merged
functionstackx merged 7 commits into main from claude/issue-1154-qwen3.5-fp8-mi355x-sglang-mtp on May 19, 2026

Conversation

@Klaud-Cold
Collaborator

@Klaud-Cold Klaud-Cold commented May 17, 2026

Summary

  • Update qwen3.5-fp8-mi355x-sglang image from lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260414 to lmsysorg/sglang-rocm:v0.5.12-rocm720-mi35x-20260517
  • Update qwen3.5-fp8-mi355x-sglang-mtp image from lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260414 to lmsysorg/sglang-rocm:v0.5.12-rocm720-mi35x-20260517

Ref #1154

Generated with Claude Code

…-rocm720-mi35x-20260517

Ref #1154

Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Comment thread perf-changelog.yaml Outdated
- qwen3.5-fp8-mi355x-sglang-mtp
description:
- "Update SGLang ROCm image from v0.5.10rc0-rocm720-mi35x-20260414 to v0.5.12-rocm720-mi35x-20260517"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

🔴 The new perf-changelog.yaml entry uses https://github.com/SemiAnalysisAI/InferenceX/pull/XXX as its pr-link — a placeholder that was never substituted with the real PR number. This should be https://github.com/SemiAnalysisAI/InferenceX/pull/1444 to match every other entry in the file and to make the link actually resolve.

Extended reasoning...

What the bug is

At perf-changelog.yaml:2632, the newly-added entry for the qwen3.5-fp8-mi355x-sglang and qwen3.5-fp8-mi355x-sglang-mtp image bump ends with:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

XXX is a placeholder — almost certainly the literal text from a changelog template — that was never replaced with the actual PR number before this PR was opened.

Why this is wrong

The PR metadata for this change is #1444, so the link should be https://github.com/SemiAnalysisAI/InferenceX/pull/1444. Every other recent entry in the same file uses a real numeric PR id — the four entries immediately preceding this one point at #1423, #1429, #1416, and #1394 (lines 2606, 2612, 2619, 2625). The convention is unambiguous and this entry breaks it.

Impact

  1. The link is a hard 404 — https://github.com/SemiAnalysisAI/InferenceX/pull/XXX does not (and cannot) resolve to a valid GitHub PR, so anyone clicking it from the changelog lands on an error page.
  2. Any tooling that consumes perf-changelog.yaml and parses pr-link (e.g. release-notes generators, dashboards, or scripts that cross-reference entries with GitHub PR titles/labels) will either skip this row, log a warning, or break entirely when it cannot resolve XXX to an integer.
  3. The changelog loses its provenance for this image bump — once the PR is merged you can no longer trace the entry back to its discussion/review.

Step-by-step proof

  1. Open the diff for perf-changelog.yaml in this PR. The added block (lines 2627–2632) is:
    - config-keys:
        - qwen3.5-fp8-mi355x-sglang
        - qwen3.5-fp8-mi355x-sglang-mtp
      description:
        - "Update SGLang ROCm image from v0.5.10rc0-rocm720-mi35x-20260414 to v0.5.12-rocm720-mi35x-20260517"
      pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX
  2. Look at the four immediately-preceding entries (lines 2606, 2612, 2619, 2625): each ends with pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/<integer> (1423, 1429, 1416, 1394). No other entry uses XXX or any non-numeric token.
  3. The PR creating this entry is #1444 (see PR metadata in this review).
  4. Therefore XXX on line 2632 is a leftover template placeholder.

Fix

Replace XXX with 1444:

  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1444

Trivial one-token change; no other lines need updating.
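A check like the one described in this comment could be automated before review. The following is a minimal sketch, assuming the changelog keeps one `pr-link:` per entry in the URL form shown above; the function name `find_bad_pr_links` and the `some-other-config` key in the sample are hypothetical and exist only for illustration.

```python
import re

# Matches a "pr-link: <url>" line and captures the URL.
PR_LINK = re.compile(r"^\s*pr-link:\s*(\S+)\s*$")
# A valid link ends in an integer PR id (repository URL taken from the comment above).
VALID = re.compile(r"^https://github\.com/SemiAnalysisAI/InferenceX/pull/\d+$")


def find_bad_pr_links(changelog_text):
    """Return (line_number, url) pairs whose pr-link is not a numeric PR URL."""
    bad = []
    for lineno, line in enumerate(changelog_text.splitlines(), start=1):
        match = PR_LINK.match(line)
        if match and not VALID.match(match.group(1)):
            bad.append((lineno, match.group(1)))
    return bad


# Hand-written sample; the first entry's config key is made up.
sample = """\
- config-keys:
    - some-other-config
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1394
- config-keys:
    - qwen3.5-fp8-mi355x-sglang
    - qwen3.5-fp8-mi355x-sglang-mtp
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX
"""

print(find_bad_pr_links(sample))
```

On the sample this reports only the `XXX` entry on line 7. The sketch scans text with a regular expression instead of parsing YAML so that it needs nothing beyond the standard library.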


  qwen3.5-fp8-mi355x-sglang:
-   image: lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260414
+   image: lmsysorg/sglang-rocm:v0.5.12-rocm720-mi35x-20260517

🟡 Nit: the qwen3.5-fp8-mi355x-sglang-agentic sibling (line 271) still pins lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260414, but this PR bumps the parent qwen3.5-fp8-mi355x-sglang to v0.5.12-rocm720-mi35x-20260517. The comment block immediately above the -agentic entry states 'Metadata is identical to origin/main's qwen3.5-fp8-mi355x-sglang' — after merge that claim becomes inaccurate. Consider either bumping the -agentic image to match, or adding a 'Reasons below' note in the same style as kimik2.5-fp4-mi355x-vllm-agentic / minimaxm2.5-...-vllm-agentic to document the intentional image divergence.

Extended reasoning...

What the comment claims. Lines 263–267 of .github/configs/amd-master.yaml carry this block, introduced by PR #1393 when the agentic-coding sibling was first split off:

  # Diverged from qwen3.5-fp8-mi355x-sglang (agentic-coding sibling). Metadata is
  # identical to origin/main's qwen3.5-fp8-mi355x-sglang; the split exists because this
  # PR adds an agentic-coding scenarios block that differs from main
  # (either main had none or had a different conc/offload sweep).
  # The original qwen3.5-fp8-mi355x-sglang entry stays byte-identical to origin/main.

The first sentence is a present-tense factual assertion ("Metadata is identical to origin/main's qwen3.5-fp8-mi355x-sglang") about a relationship between two configs that both live in this file. The last sentence reinforces this by saying the parent stays byte-identical to main.

Why this PR breaks the assertion. This PR bumps the parent qwen3.5-fp8-mi355x-sglang (line 222) from v0.5.10rc0-rocm720-mi35x-20260414 to v0.5.12-rocm720-mi35x-20260517. It does not touch the -agentic sibling at line 271, which remains pinned to the older v0.5.10rc0 image. So once this PR lands on main, the claim "Metadata is identical to origin/main's qwen3.5-fp8-mi355x-sglang" is no longer true: the image: field of the -agentic sibling diverges from the parent it claims to mirror.

Step-by-step proof:

  1. Pre-PR state on main: parent image = v0.5.10rc0-...-20260414; -agentic image = v0.5.10rc0-...-20260414. Comment is accurate (both fields identical).
  2. This PR's diff modifies only the parent's image: to v0.5.12-...-20260517 (line 222), and the corresponding -mtp sibling (line 244). The -agentic entry at line 271 is untouched.
  3. Post-merge state on main: parent image = v0.5.12-...-20260517; -agentic image = v0.5.10rc0-...-20260414. The two fields differ, so the comment's assertion that they are "identical" is now stale.

Convention in the same file. When -agentic siblings intentionally diverge from their fixed-seq-len parent on the image field, this file's convention is to call it out explicitly. See the kimik2.5-fp4-mi355x-vllm-agentic block (lines ~430-460 in the modified file), which carries a 'Reasons below' header with an explicit image: 'old' -> 'new' line, and the three minimaxm2.5-...-vllm-agentic blocks which do the same. The qwen3.5-fp8-mi355x-sglang-agentic block has no such documentation because, until this PR, it didn't need any: the images were aligned. After this PR the divergence is undocumented.

Addressing the refutation that this is a historical snapshot. One reading of the comment is that "this PR" refers to #1393 (the PR that created the divergence), making the whole block a historical narrative about how the split came to be rather than an invariant. That reading is plausible for the "this PR adds..." clause, but the framing sentences ("Metadata is identical..." and "The original ... stays byte-identical to origin/main") use present/continuous tense and read as ongoing claims about file state, not a past event. Future readers of the comment after merge will be misled. The fix is cheap either way: bump the image (one-line change) or replace the now-stale assertion with a Reasons below block matching the kimik2.5 / minimaxm2.5 pattern.

Impact and how to fix. Purely a documentation/maintenance consistency issue: both images are independently valid and the system runs correctly, so this is a nit, not a blocking bug. To resolve, either:

  (a) also bump line 271 to lmsysorg/sglang-rocm:v0.5.12-rocm720-mi35x-20260517 (and add the key to the perf-changelog entry), or
  (b) replace lines 263–267 with a 'Reasons below' block listing image: 'v0.5.10rc0-rocm720-mi35x-20260414' -> 'v0.5.12-rocm720-mi35x-20260517' (intentionally pinned because <reason>).
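The parent/sibling comparison walked through in this comment can be sketched as a small consistency check. This is only an illustration under assumptions: it treats the config file as an already-parsed mapping from config key to a dict with an `image` field, and the helper name `image_divergences` and the suffix list are hypothetical, not part of the repository.

```python
def image_divergences(configs, parent, suffixes=("-mtp", "-agentic")):
    """Return {sibling_key: (parent_image, sibling_image)} for siblings whose image differs.

    `configs` is assumed to map config keys to dicts carrying an "image" field.
    """
    parent_image = configs[parent]["image"]
    diverged = {}
    for suffix in suffixes:
        key = parent + suffix
        if key in configs and configs[key]["image"] != parent_image:
            diverged[key] = (parent_image, configs[key]["image"])
    return diverged


OLD = "lmsysorg/sglang-rocm:v0.5.10rc0-rocm720-mi35x-20260414"
NEW = "lmsysorg/sglang-rocm:v0.5.12-rocm720-mi35x-20260517"

# Post-merge state as described in the step-by-step proof above.
post_merge = {
    "qwen3.5-fp8-mi355x-sglang": {"image": NEW},
    "qwen3.5-fp8-mi355x-sglang-mtp": {"image": NEW},
    "qwen3.5-fp8-mi355x-sglang-agentic": {"image": OLD},
}

print(image_divergences(post_merge, "qwen3.5-fp8-mi355x-sglang"))
```

For the post-merge state only the -agentic key is reported. A divergence found this way is not necessarily a bug; it is a prompt to either bump the sibling or document why it stays pinned.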

@Klaud-Cold
Collaborator Author

e2e test result: SLURM infrastructure failure

Run 26000388078: All single-node and eval jobs failed — SLURM salloc cancelled while waiting for GPU allocation (same mi355x cluster contention). Not related to the image update.

claude-fix-bot added 2 commits May 17, 2026 19:02
Drops the -20260517 nightly suffix so the recipe uses the
lmsysorg/sglang-rocm:v0.5.12-rocm720-mi35x release tag rather than
a date-pinned nightly build.

@functionstackx functionstackx changed the title from "Update qwen3.5-fp8-mi355x-sglang and mtp SGLang ROCm image to v0.5.12-rocm720-mi35x-20260517" to "Update qwen3.5-fp8-mi355x-sglang and mtp SGLang ROCm image to v0.5.12-rocm720-mi35x" May 17, 2026
Docker Hub does not publish a clean lmsysorg/sglang-rocm:v0.5.12-rocm720-mi35x
release tag — only the dated nightly variant. The earlier switch to the
un-suffixed tag was a mistake (caused 'manifest not found' on every job).

Restoring the dated nightly tag that does exist.
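The failure described in this commit message (an un-suffixed tag that was never published) could be caught before a recipe change is pushed. Below is a minimal sketch, assuming the list of published tags has already been obtained from the registry by some other means; `check_tag` is a hypothetical helper and the `published` set is a hand-written stand-in containing only the two tags named in this PR.

```python
import re

# A dated nightly suffix such as "-20260517".
DATE_SUFFIX = re.compile(r"-\d{8}")


def check_tag(requested, published_tags):
    """Return (exists, dated_variants) for a requested image tag.

    dated_variants lists published tags equal to `requested` plus a date suffix.
    """
    exists = requested in published_tags
    variants = sorted(
        tag
        for tag in published_tags
        if tag.startswith(requested) and DATE_SUFFIX.fullmatch(tag[len(requested):])
    )
    return exists, variants


# Stand-in for a registry listing; not fetched from Docker Hub.
published = {
    "v0.5.10rc0-rocm720-mi35x-20260414",
    "v0.5.12-rocm720-mi35x-20260517",
}

print(check_tag("v0.5.12-rocm720-mi35x", published))
```

With this stand-in listing, the un-suffixed tag is reported as missing and the dated tag is offered as the only variant, which matches the situation the commit message describes.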
@functionstackx functionstackx changed the title from "Update qwen3.5-fp8-mi355x-sglang and mtp SGLang ROCm image to v0.5.12-rocm720-mi35x" to "Update qwen3.5-fp8-mi355x-sglang and mtp SGLang ROCm image to v0.5.12-rocm720-mi35x-20260517" May 18, 2026
# Conflicts:
#	perf-changelog.yaml

@functionstackx
Collaborator

/reuse-sweep-run

@functionstackx functionstackx merged commit 52f4d4b into main May 19, 2026
4 of 5 checks passed
@functionstackx functionstackx deleted the claude/issue-1154-qwen3.5-fp8-mi355x-sglang-mtp branch May 19, 2026 03:37