
Update dsr1-fp4-b200-sglang SGLang image to v0.5.12-cu130 #1415

Merged
functionstackx merged 5 commits into main from claude/issue-1154-dsr1-fp4-b200-sglang
May 18, 2026

Conversation

@Klaud-Cold
Collaborator

Updates SGLang image for dsr1-fp4-b200-sglang from v0.5.9-cu130 to v0.5.12-cu130.
Ref #1154

Generated with Claude Code

Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>
@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Contributor

@claude claude Bot left a comment


LGTM, straightforward SGLang image version bump.

Extended reasoning...

Overview

This PR bumps the SGLang image for the dsr1-fp4-b200-sglang config from v0.5.9-cu130 to v0.5.12-cu130 in .github/configs/nvidia-master.yaml and adds a corresponding entry to perf-changelog.yaml.

Security risks

None. This is a container image version pin change in a CI/benchmark config; it does not touch auth, crypto, permissions, or any executable code paths in the repo itself.

Level of scrutiny

Low. This is a mechanical, single-line image tag bump matching the established pattern used across the repo for routine SGLang/vLLM upgrades, scoped to one recipe entry.

Other factors

No bugs were flagged by the bug hunting system. There are no outstanding reviewer comments (only the automated recipe-reminder bot post), and the changelog entry is consistent with prior version-bump PRs in this repo.

# Conflicts:
#	perf-changelog.yaml

The e2e workflow downloads artifact pattern 'agentic_*' but
benchmark-tmpl.yml uploads as 'bmk_agentic_*', so the agentic step
always fails on artifact collection. Comment out the agentic-coding block
on this recipe until the workflow naming is aligned; the rest of the
sweep (fixed-seq-len 1k1k + 8k1k) can finish green.
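
The mismatch described in this commit message can be reproduced with shell-style glob matching. This is a minimal sketch, assuming the artifact download pattern behaves like a shell glob (which is what Python's `fnmatch` implements). The two prefixes come from the commit message; the full artifact names and the helper `matching_artifacts` are hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern the e2e workflow downloads, per the commit message.
DOWNLOAD_PATTERN = "agentic_*"

# Hypothetical artifact names; only the 'bmk_agentic_' prefix is from the source.
uploaded = ["bmk_agentic_dsr1-fp4-b200-sglang", "bmk_agentic_example"]


def matching_artifacts(pattern, names):
    """Return the names that a shell-style glob pattern would select."""
    return [name for name in names if fnmatchcase(name, pattern)]


# A glob must match the whole name from its first character, so the
# 'bmk_' prefix keeps every uploaded artifact from being selected.
print(matching_artifacts(DOWNLOAD_PATTERN, uploaded))

# A pattern that includes the upload prefix selects the artifacts.
print(matching_artifacts("bmk_agentic_*", uploaded))
```

Under that assumption, aligning either the upload name or the download pattern would be enough for the collection step to find the artifacts.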

Rebase to current main so the full-sweep-enabled label can launch a
fresh sweep. perf-changelog.yaml restored byte-for-byte from main with
the PR entry re-appended at the tail.
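
One way to confirm a "restored byte-for-byte, entry appended at the tail" claim is to check that the branch copy of the file starts with main's exact bytes. This is a hedged sketch only: the function name `is_pure_append` and the sample contents are illustrative and not taken from the repository.

```python
def is_pure_append(main_bytes: bytes, branch_bytes: bytes) -> bool:
    """True when the branch file is main's file plus extra bytes at the end."""
    return len(branch_bytes) > len(main_bytes) and branch_bytes.startswith(main_bytes)


# Illustrative contents standing in for perf-changelog.yaml on main and on the branch.
main_copy = b"- config-keys:\n    - existing-config\n"
pr_entry = b"- config-keys:\n    - dsr1-fp4-b200-sglang\n"

print(is_pure_append(main_copy, main_copy + pr_entry))  # entry appended at the tail
print(is_pure_append(main_copy, pr_entry + main_copy))  # entry inserted at the head
```

In practice the two byte strings would be read from the file at each revision; how they are obtained is left out here.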
@functionstackx
Collaborator

Closing + immediately reopening to retrigger full-sweep-enabled after the merge-main rebase. Branch is now up to date with main.

Contributor

@claude claude Bot left a comment


LGTM — straightforward SGLang image bump; the only finding is a minor changelog typo flagged inline.

Extended reasoning...

Overview

This PR bumps the dsr1-fp4-b200-sglang SGLang image from v0.5.9-cu130 to v0.5.12-cu130 in .github/configs/nvidia-master.yaml, temporarily comments out the agentic-coding scenario for that config (with an explanatory comment pointing at an unrelated artifact-name mismatch in e2e-tests.yml), and adds a corresponding entry to perf-changelog.yaml. Scope is two files, all declarative YAML.

Security risks

None. No code, auth, crypto, or permissions involved — only a container image tag bump and a benchmark scenario toggle.

Level of scrutiny

Low. This is the standard mechanical pattern used across the repo for image bumps, matches similar recent PRs (#1404, #1416, #1394), and the disabled scenario is gated behind a clear inline comment explaining the upstream workflow blocker.

Other factors

The bug hunter flagged a documentation-only nit: the changelog says "from v0.5.11-cu130" but the actual previous version (per the diff and the PR description) was v0.5.9-cu130. This appears to be a copy-paste artifact from the adjacent #1416 entry. It does not affect runtime behavior, only the human-readable changelog, so I am comfortable approving with the inline comment attached. The PR author also already re-triggered the full-sweep after a merge-main rebase per their latest comment.

Comment thread perf-changelog.yaml
Comment on lines +2633 to +2638
- config-keys:
    - dsr1-fp4-b200-sglang
  description:
    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
    - "Temporarily disable agentic-coding scenario (blocked by e2e-tests.yml artifact-name mismatch)"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1415
Contributor


🟡 The new changelog entry says "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130", but the diff in .github/configs/nvidia-master.yaml shows the previous image was v0.5.9-cu130, not v0.5.11. The PR title and description also state v0.5.9 → v0.5.12, so the "from" version in the changelog should read v0.5.9-cu130.

Extended reasoning...

What the bug is

The new entry in perf-changelog.yaml (lines 2633-2638) describes this PR as:

"Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"

But the actual change in .github/configs/nvidia-master.yaml (line 1673) is:

 dsr1-fp4-b200-sglang:
-  image: lmsysorg/sglang:v0.5.9-cu130
+  image: lmsysorg/sglang:v0.5.12-cu130

The previous version was v0.5.9-cu130, not v0.5.11. The PR title ("Update dsr1-fp4-b200-sglang SGLang image to v0.5.12-cu130") and the PR description ("Updates SGLang image for dsr1-fp4-b200-sglang from v0.5.9-cu130 to v0.5.12-cu130") both confirm this.

Step-by-step proof

  1. Look at the PR diff for .github/configs/nvidia-master.yaml line 1673: - image: lmsysorg/sglang:v0.5.9-cu130 / + image: lmsysorg/sglang:v0.5.12-cu130. The pre-image is v0.5.9.
  2. Cross-check the prior changelog entry that brought dsr1-fp4-b200-sglang to its current version: perf-changelog.yaml:1172 records "dsr1-fp4-b200-sglang: v0.5.6-cu129-amd64 → v0.5.9-cu130". So v0.5.9 was indeed the last shipped version, and v0.5.11 was never used for this config-key on main.
  3. The new entry at perf-changelog.yaml:2636 nonetheless claims "from v0.5.11-cu130". This contradicts (1) and (2).
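
The cross-check in the steps above could be automated along these lines. This is a sketch under stated assumptions, not tooling that exists in the repo: the helper names and both regular expressions are hypothetical, and the inputs are the diff and changelog strings quoted in this comment.

```python
import re

# Hypothetical patterns: an SGLang image tag, and the "from X to Y" changelog wording.
IMAGE_TAG = re.compile(r"sglang:(v[\w.\-]+)")
FROM_TO = re.compile(r"from (v[\w.\-]+) to (v[\w.\-]+)")


def diff_versions(diff_text):
    """Return (old_tag, new_tag) from the removed and added image lines of a diff."""
    old = new = None
    for line in diff_text.splitlines():
        match = IMAGE_TAG.search(line)
        if not match:
            continue
        if line.startswith("-"):
            old = match.group(1)
        elif line.startswith("+"):
            new = match.group(1)
    return old, new


def changelog_versions(description):
    """Return (from_tag, to_tag) parsed from a changelog description, or None."""
    match = FROM_TO.search(description)
    return (match.group(1), match.group(2)) if match else None


diff = """\
 dsr1-fp4-b200-sglang:
-  image: lmsysorg/sglang:v0.5.9-cu130
+  image: lmsysorg/sglang:v0.5.12-cu130
"""
entry = "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"

print(diff_versions(diff))
print(changelog_versions(entry))
# False here signals that the changelog wording disagrees with the diff.
print(diff_versions(diff) == changelog_versions(entry))
```

With the corrected "from v0.5.9-cu130" wording, the two tuples would agree.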

Likely root cause

The adjacent entry at perf-changelog.yaml:2614-2618 (for PR #1416, dsr1-fp8-b200-sglang and dsr1-fp8-b200-sglang-mtp) legitimately goes from v0.5.11 → v0.5.12. The wording for this PR looks copy-pasted from there without updating the "from" version.

Impact

The perf changelog is the canonical, user-visible record of what version each config was running. Getting the "from" version wrong defeats its purpose for anyone bisecting a regression or auditing prior runs — they will look for v0.5.11 results that do not exist for this config.

Suggested fix

Change perf-changelog.yaml:2636 from:

    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"

to:

    - "Update SGLang image from v0.5.9-cu130 to v0.5.12-cu130"

No runtime behavior changes — just a one-token documentation fix.


@functionstackx
Collaborator

/reuse-sweep-run

@functionstackx functionstackx merged commit c824b1a into main May 18, 2026
3 of 5 checks passed
@functionstackx functionstackx deleted the claude/issue-1154-dsr1-fp4-b200-sglang branch May 18, 2026 02:46