Update dsr1-fp4-b200-sglang SGLang image to v0.5.12-cu130 by Klaud-Cold · Pull Request #1415 · SemiAnalysisAI/InferenceX

Klaud-Cold · 2026-05-17T03:15:58Z

Updates SGLang image for dsr1-fp4-b200-sglang from v0.5.9-cu130 to v0.5.12-cu130.
Ref #1154

Ref #1154 Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>

github-actions · 2026-05-17T03:16:05Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

github-actions · 2026-05-17T03:16:22Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980010941
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25980010941

claude

LGTM, straightforward SGLang image version bump.

Extended reasoning...

Overview

This PR bumps the SGLang image for the dsr1-fp4-b200-sglang config from v0.5.9-cu130 to v0.5.12-cu130 in .github/configs/nvidia-master.yaml and adds a corresponding entry to perf-changelog.yaml.

Security risks

None. This is a container image version pin change in a CI/benchmark config; it does not touch auth, crypto, permissions, or any executable code paths in the repo itself.

Level of scrutiny

Low. This is a mechanical, single-line image tag bump matching the established pattern used across the repo for routine SGLang/vLLM upgrades, scoped to one recipe entry.

Other factors

No bugs were flagged by the bug hunting system, there are no outstanding reviewer comments (only the automated recipe-reminder bot post), and the changelog entry is consistent with prior version-bump PRs in this repo.

# Conflicts: # perf-changelog.yaml

github-actions · 2026-05-17T07:22:15Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25980011057
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25980011057

The e2e workflow downloads artifacts matching the pattern 'agentic_*', but benchmark-tmpl.yml uploads them as 'bmk_agentic_*', so the agentic step always fails on artifact collection. Comment out the agentic-coding block on this recipe until the workflow naming is aligned; the rest of the sweep (fixed-seq-len 1k1k + 8k1k) can finish green.
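
The naming mismatch described above can be illustrated with a minimal sketch, assuming the download step selects artifacts by shell-style glob matching against the full artifact name. Only the two patterns come from the commit message; the artifact names and the helper `matching_artifacts` are hypothetical and exist purely for illustration.

```python
from fnmatch import fnmatchcase

# Patterns quoted in the commit message above.
DOWNLOAD_PATTERN = "agentic_*"    # what the e2e workflow asks for
UPLOAD_PATTERN = "bmk_agentic_*"  # naming used by benchmark-tmpl.yml


def matching_artifacts(pattern, names):
    """Return the artifact names that a glob pattern would select."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical artifact names, not taken from the real workflow runs.
uploaded = ["bmk_agentic_dsr1", "bmk_fixed_1k1k"]

# The download pattern is anchored at the start of the name, so the
# "bmk_" prefix prevents any match.
print(matching_artifacts(DOWNLOAD_PATTERN, uploaded))
print(matching_artifacts(UPLOAD_PATTERN, uploaded))
```

Under that assumption the download pattern selects nothing, which is consistent with the agentic step failing at artifact collection while the other scenarios are unaffected.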

github-actions · 2026-05-17T13:33:40Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25984497141
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25984497141

Rebase to current main so the full-sweep-enabled label can launch a fresh sweep. perf-changelog.yaml restored byte-for-byte from main with the PR entry re-appended at the tail.
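
One way to confirm the "byte-for-byte from main, entry re-appended at the tail" property is a prefix check on the raw bytes. The following is a minimal sketch, not the procedure the author used; the helper `appended_tail` and the sample contents are hypothetical, and in practice the two byte strings would be read from the two versions of perf-changelog.yaml.

```python
def appended_tail(base: bytes, candidate: bytes):
    """Return the bytes appended to `base`, or None if `candidate`
    does not begin with `base` byte-for-byte."""
    if not candidate.startswith(base):
        return None
    return candidate[len(base):]


# Hypothetical stand-ins for the changelog on main and the PR's new entry.
main_copy = b"- config-keys:\n    - example-config\n"
pr_entry = b"- config-keys:\n    - dsr1-fp4-b200-sglang\n"

# A clean restore leaves main's bytes untouched and adds only the new entry.
print(appended_tail(main_copy, main_copy + pr_entry) == pr_entry)

# Any edit to the existing content breaks the prefix property.
print(appended_tail(main_copy, b"# edited\n" + main_copy + pr_entry))
```

Comparing bytes rather than parsed YAML also catches whitespace or line-ending changes, which are a common source of repeated merge conflicts in append-only files.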

functionstackx · 2026-05-17T23:37:30Z

Closing + immediately reopening to retrigger full-sweep-enabled after the merge-main rebase. Branch is now up to date with main.

claude

LGTM — straightforward SGLang image bump; the only finding is a minor changelog typo flagged inline.

Extended reasoning...

Overview

This PR bumps the dsr1-fp4-b200-sglang SGLang image from v0.5.9-cu130 to v0.5.12-cu130 in .github/configs/nvidia-master.yaml, temporarily comments out the agentic-coding scenario for that config (with an explanatory comment pointing at an unrelated artifact-name mismatch in e2e-tests.yml), and adds a corresponding entry to perf-changelog.yaml. Scope is two files, all declarative YAML.

Security risks

None. No code, auth, crypto, or permissions involved — only a container image tag bump and a benchmark scenario toggle.

Level of scrutiny

Low. This is the standard mechanical pattern used across the repo for image bumps, matches similar recent PRs (#1404, #1416, #1394), and the disabled scenario is gated behind a clear inline comment explaining the upstream workflow blocker.

Other factors

The bug hunter flagged a documentation-only nit: the changelog says "from v0.5.11-cu130" but the actual previous version (per the diff and the PR description) was v0.5.9-cu130. This appears to be a copy-paste artifact from the adjacent #1416 entry. It does not affect runtime behavior, only the human-readable changelog, so I am comfortable approving with the inline comment attached. The PR author also already re-triggered the full-sweep after a merge-main rebase per their latest comment.

claude · 2026-05-17T23:45:40Z

+- config-keys:
+    - dsr1-fp4-b200-sglang
+  description:
+    - "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"
+    - "Temporarily disable agentic-coding scenario (blocked by e2e-tests.yml artifact-name mismatch)"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1415


🟡 The new changelog entry says "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130", but the diff in .github/configs/nvidia-master.yaml shows the previous image was v0.5.9-cu130, not v0.5.11. The PR title and description also state v0.5.9 → v0.5.12, so the "from" version in the changelog should read v0.5.9-cu130.

Extended reasoning...

What the bug is

The new entry in perf-changelog.yaml (lines 2633-2638) describes this PR as:

"Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"

But the actual change in .github/configs/nvidia-master.yaml (line 1673) is:

dsr1-fp4-b200-sglang:
-  image: lmsysorg/sglang:v0.5.9-cu130
+  image: lmsysorg/sglang:v0.5.12-cu130

The previous version was v0.5.9-cu130, not v0.5.11. The PR title ("Update dsr1-fp4-b200-sglang SGLang image to v0.5.12-cu130") and the PR description ("Updates SGLang image for dsr1-fp4-b200-sglang from v0.5.9-cu130 to v0.5.12-cu130") both confirm this.

Step-by-step proof

1. Look at the PR diff for .github/configs/nvidia-master.yaml line 1673: - image: lmsysorg/sglang:v0.5.9-cu130 / + image: lmsysorg/sglang:v0.5.12-cu130. The pre-image is v0.5.9.

2. Cross-check the prior changelog entry that brought dsr1-fp4-b200-sglang to its current version: perf-changelog.yaml:1172 records "dsr1-fp4-b200-sglang: v0.5.6-cu129-amd64 → v0.5.9-cu130". So v0.5.9 was indeed the last shipped version, and v0.5.11 was never used for this config-key on main.

3. The new entry at perf-changelog.yaml:2636 nonetheless claims "from v0.5.11-cu130". This contradicts (1) and (2).
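
The cross-check in the steps above could be automated along these lines. This is a minimal sketch under stated assumptions: the regular expressions, the helper names `versions_from_diff` and `versions_from_description`, and the "from X to Y" wording convention are illustrative and are not part of the repository's tooling. The diff lines and the changelog string are the ones quoted in this comment.

```python
import re

# Assumed shapes: "-/+ image: <repo>:<tag>" in the diff, "from X to Y" in the changelog.
IMAGE_RE = re.compile(r"^([+-])\s*image:\s*\S+:(\S+)$")
DESC_RE = re.compile(r"from (\S+) to (\S+)")


def versions_from_diff(diff_text):
    """Return (old_tag, new_tag) taken from the removed and added image lines."""
    tags = {}
    for line in diff_text.splitlines():
        match = IMAGE_RE.match(line.strip())
        if match:
            tags[match.group(1)] = match.group(2)
    return tags.get("-"), tags.get("+")


def versions_from_description(description):
    """Return (old_tag, new_tag) as claimed by a changelog description."""
    match = DESC_RE.search(description)
    return (match.group(1), match.group(2)) if match else (None, None)


diff = """\
- image: lmsysorg/sglang:v0.5.9-cu130
+ image: lmsysorg/sglang:v0.5.12-cu130
"""
entry = "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"

print(versions_from_diff(diff))
print(versions_from_description(entry))
print(versions_from_diff(diff) == versions_from_description(entry))
```

With the entry as submitted, the two pairs differ in the "from" tag, so the final comparison is false; with the corrected wording they would agree.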

Likely root cause

The adjacent entry at perf-changelog.yaml:2614-2618 (for PR #1416, dsr1-fp8-b200-sglang and dsr1-fp8-b200-sglang-mtp) legitimately goes from v0.5.11 → v0.5.12. The wording for this PR looks copy-pasted from there without updating the "from" version.

Impact

The perf changelog is the canonical, user-visible record of what version each config was running. Getting the "from" version wrong defeats its purpose for anyone bisecting a regression or auditing prior runs — they will look for v0.5.11 results that do not exist for this config.

Suggested fix

Change perf-changelog.yaml:2636 from:

- "Update SGLang image from v0.5.11-cu130 to v0.5.12-cu130"

to:

- "Update SGLang image from v0.5.9-cu130 to v0.5.12-cu130"

No runtime behavior changes — just a one-token documentation fix.

github-actions · 2026-05-18T02:26:19Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26006037916
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26006037916

functionstackx · 2026-05-18T02:45:46Z

/reuse-sweep-run

github-actions · 2026-05-18T02:46:26Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26010770149
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26010770149

Update dsr1-fp4-b200-sglang SGLang image to v0.5.12-cu130

d8746da

Ref #1154 Co-authored-by: Klaud Cold <Klaud-Cold@users.noreply.github.com>

Klaud-Cold requested a review from a team May 17, 2026 03:15

Klaud-Cold added the full-sweep-enabled label May 17, 2026

Klaud-Cold requested review from jgangani and kedarpotdar-nv as code owners May 17, 2026 03:15

github-project-automation Bot added this to InferenceMAX Board May 17, 2026

Klaud-Cold mentioned this pull request May 17, 2026

[Auto] Docker Image Updates Available - 2026-04-25 #1154

Open

claude Bot reviewed May 17, 2026

Merge remote-tracking branch 'origin/main' into HEAD

dcdb600

# Conflicts: # perf-changelog.yaml

Merge main into PR branch + restore changelog cleanly

358482e

Rebase to current main so the full-sweep-enabled label can launch a fresh sweep. perf-changelog.yaml restored byte-for-byte from main with the PR entry re-appended at the tail.

functionstackx closed this May 17, 2026

github-project-automation Bot moved this to Done in InferenceMAX Board May 17, 2026

functionstackx reopened this May 17, 2026

claude Bot reviewed May 17, 2026

Merge branch 'main' into claude/issue-1154-dsr1-fp4-b200-sglang

fd2934c

functionstackx merged commit c824b1a into main May 18, 2026
3 of 5 checks passed

functionstackx deleted the claude/issue-1154-dsr1-fp4-b200-sglang branch May 18, 2026 02:46

functionstackx merged 5 commits into main from claude/issue-1154-dsr1-fp4-b200-sglang
