Update dsr1-fp4-b200-sglang SGLang image to v0.5.12-cu130#1415
Ref #1154 Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your single-node PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.
PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. Often, failures are just flakes, and simply re-running the failed jobs will fix them. If failed jobs are re-run, PR authors are responsible for ensuring that they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow
As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.
If additional help is needed, PR authors can reach out to core maintainers over Slack.
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980010941
LGTM, straightforward SGLang image version bump.
Overview
This PR bumps the SGLang image for the dsr1-fp4-b200-sglang config from v0.5.9-cu130 to v0.5.12-cu130 in .github/configs/nvidia-master.yaml and adds a corresponding entry to perf-changelog.yaml.
Security risks
None. This is a container image version pin change in a CI/benchmark config; it does not touch auth, crypto, permissions, or any executable code paths in the repo itself.
Level of scrutiny
Low. This is a mechanical, single-line image tag bump matching the established pattern used across the repo for routine SGLang/vLLM upgrades, scoped to one recipe entry.
Other factors
No bugs were flagged by the bug hunting system, there are no outstanding reviewer comments (only the automated recipe-reminder bot post), and the changelog entry is consistent with prior version-bump PRs in this repo.
# Conflicts: # perf-changelog.yaml
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980011057
The e2e workflow downloads artifacts matching the pattern 'agentic_*', but benchmark-tmpl.yml uploads them as 'bmk_agentic_*', so the agentic step always fails on artifact collection. Comment out the agentic-coding block on this recipe until the workflow naming is aligned; the rest of the sweep (fixed-seq-len 1k1k + 8k1k) can finish green.
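The mismatch described in this commit message can be illustrated with a minimal Python sketch. It uses `fnmatch` from the standard library as a stand-in for the glob matching done by the artifact download step, which is an assumption; the artifact names and the `matching_artifacts` helper below are hypothetical and only follow the two prefixes mentioned above.

```python
from fnmatch import fnmatchcase


def matching_artifacts(pattern, uploaded):
    """Return the uploaded artifact names that match a glob pattern."""
    return [name for name in uploaded if fnmatchcase(name, pattern)]


# Hypothetical artifact names; only the 'bmk_agentic_' prefix comes from the PR.
uploaded = ["bmk_agentic_dsr1-fp4-b200-sglang", "bmk_fixed_1k1k"]

# The pattern the e2e workflow is described as using finds nothing,
# because a glob must match the whole name, including the 'bmk_' prefix.
print(matching_artifacts("agentic_*", uploaded))

# Aligning the pattern with the upload prefix finds the agentic artifact.
print(matching_artifacts("bmk_agentic_*", uploaded))
```

Under these assumptions the first call returns an empty list, which is consistent with the agentic step failing at artifact collection regardless of whether the benchmark itself succeeded.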
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25984497141
Rebase to current main so the full-sweep-enabled label can launch a fresh sweep. perf-changelog.yaml restored byte-for-byte from main with the PR entry re-appended at the tail.
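One way to sanity-check a "restored byte-for-byte with the entry re-appended" claim is to confirm that main's copy of the file is an exact prefix of the PR's copy. The sketch below is an illustration only; the `appended_tail` helper and the sample bytes are hypothetical and not part of the repository.

```python
def appended_tail(main_bytes, pr_bytes):
    """Return the bytes appended after main's content.

    Returns None when main's content is not an exact prefix of the PR's
    content, meaning the file was changed somewhere other than its tail.
    """
    if not pr_bytes.startswith(main_bytes):
        return None
    return pr_bytes[len(main_bytes):]


# Hypothetical file contents standing in for perf-changelog.yaml.
main_copy = b"- config-keys:\n    - some-existing-config\n"
pr_copy = main_copy + b"- config-keys:\n    - dsr1-fp4-b200-sglang\n"

tail = appended_tail(main_copy, pr_copy)
print(tail is not None)  # True only if everything before the tail is unchanged
```

In practice the two inputs would be read in binary mode from the main and PR revisions of the file, so that whitespace and line-ending differences are not hidden.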
Closing + immediately reopening to retrigger
LGTM — straightforward SGLang image bump; the only finding is a minor changelog typo flagged inline.
Overview
This PR bumps the dsr1-fp4-b200-sglang SGLang image from v0.5.9-cu130 to v0.5.12-cu130 in .github/configs/nvidia-master.yaml, temporarily comments out the agentic-coding scenario for that config (with an explanatory comment pointing at an unrelated artifact-name mismatch in e2e-tests.yml), and adds a corresponding entry to perf-changelog.yaml. Scope is two files, all declarative YAML.
Security risks
None. No code, auth, crypto, or permissions involved — only a container image tag bump and a benchmark scenario toggle.
Level of scrutiny
Low. This is the standard mechanical pattern used across the repo for image bumps, matches similar recent PRs (#1404, #1416, #1394), and the disabled scenario is gated behind a clear inline comment explaining the upstream workflow blocker.
Other factors
The bug hunter flagged a documentation-only nit: the changelog says "from v0.5.11-cu130" but the actual previous version (per the diff and the PR description) was v0.5.9-cu130. This appears to be a copy-paste artifact from the adjacent #1416 entry. It does not affect runtime behavior, only the human-readable changelog, so I am comfortable approving with the inline comment attached. The PR author also already re-triggered the full-sweep after a merge-main rebase per their latest comment.
- config-keys:
    - dsr1-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
    - "Temporarily disable agentic-coding scenario (blocked by e2e-tests.yml artifact-name mismatch)"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1415
🟡 The new changelog entry says "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130", but the diff in .github/configs/nvidia-master.yaml shows the previous image was v0.5.9-cu130, not v0.5.11. The PR title and description also state v0.5.9 → v0.5.12, so the "from" version in the changelog should read v0.5.9-cu130.
What the bug is
The new entry in perf-changelog.yaml (lines 2633-2638) describes this PR as:
"Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
But the actual change in .github/configs/nvidia-master.yaml (line 1673) is:
dsr1-fp4-b200-sglang:
- image: lmsysorg/sglang:v0.5.9-cu130
+ image: lmsysorg/sglang:v0.5.12-cu130
The previous version was v0.5.9-cu130, not v0.5.11. The PR title ("Update dsr1-fp4-b200-sglang SGLang image to v0.5.12-cu130") and the PR description ("Updates SGLang image for dsr1-fp4-b200-sglang from v0.5.9-cu130 to v0.5.12-cu130") both confirm this.
Step-by-step proof
1. Look at the PR diff for .github/configs/nvidia-master.yaml line 1673: "- image: lmsysorg/sglang:v0.5.9-cu130" / "+ image: lmsysorg/sglang:v0.5.12-cu130". The pre-image is v0.5.9.
2. Cross-check the prior changelog entry that brought dsr1-fp4-b200-sglang to its current version: perf-changelog.yaml:1172 records "dsr1-fp4-b200-sglang: v0.5.6-cu129-amd64 → v0.5.9-cu130". So v0.5.9 was indeed the last shipped version, and v0.5.11 was never used for this config-key on main.
3. The new entry at perf-changelog.yaml:2636 nonetheless claims "from v0.5.11-cu130". This contradicts (1) and (2).
Likely root cause
The adjacent entry at perf-changelog.yaml:2614-2618 (for PR #1416, dsr1-fp8-b200-sglang and dsr1-fp8-b200-sglang-mtp) legitimately goes from v0.5.11 → v0.5.12. The wording for this PR looks copy-pasted from there without updating the "from" version.
Impact
The perf changelog is the canonical, user-visible record of what version each config was running. Getting the "from" version wrong defeats its purpose for anyone bisecting a regression or auditing prior runs — they will look for v0.5.11 results that do not exist for this config.
Suggested fix
Change perf-changelog.yaml:2636 from:
- "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
to:
- "Update SGLang image from v0.5.9-cu130 to v0.5.12-cu130"
No runtime behavior changes, just a one-token documentation fix.
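The cross-check performed in this review could be automated along the following lines. This is a minimal sketch, not something the repository is known to contain: the function names are hypothetical, and it assumes the changelog description follows the "from X to Y" wording and that the diff contains a single removed `image:` line.

```python
import re


def removed_image_tag(diff_text):
    """Return the tag of the first removed `image:` line in a diff, or None."""
    match = re.search(r"^-\s*image:\s*\S+:(\S+)\s*$", diff_text, re.MULTILINE)
    return match.group(1) if match else None


def changelog_from_version(description):
    """Return the 'from' version in a 'from X to Y' description, or None."""
    match = re.search(r"from\s+(\S+)\s+to\s+(\S+)", description)
    return match.group(1) if match else None


def from_version_matches(diff_text, description):
    """True when the changelog's 'from' version equals the removed image tag."""
    return removed_image_tag(diff_text) == changelog_from_version(description)


diff = (
    "dsr1-fp4-b200-sglang:\n"
    "- image: lmsysorg/sglang:v0.5.9-cu130\n"
    "+ image: lmsysorg/sglang:v0.5.12-cu130\n"
)
flagged = "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
fixed = "Update SGLang image from v0.5.9-cu130 to v0.5.12-cu130"

print(from_version_matches(diff, flagged))  # False: v0.5.11 vs v0.5.9
print(from_version_matches(diff, fixed))    # True
```

A check like this would only catch wording that follows the assumed pattern; free-form descriptions would still need a human reviewer.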
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26006037916
/reuse-sweep-run
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26010770149
Updates SGLang image for dsr1-fp4-b200-sglang from v0.5.9-cu130 to v0.5.12-cu130.
Ref #1154
Generated with Claude Code