[Klaud Cold] Update minimaxm2.5-fp8-b300-vllm vLLM image to v0.21.0 by functionstackx · Pull Request #1454 · SemiAnalysisAI/InferenceX

functionstackx · 2026-05-17T23:44:28Z

Summary

Bumps minimaxm2.5-fp8-b300-vllm from vllm/vllm-openai:v0.19.0-cu130 (26d old) to vllm/vllm-openai:v0.21.0.
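
To make the size of the bump explicit, the two image references can be split into repository, version, and optional suffix. The following is a minimal sketch, not part of the PR: `parse_image` and `TAG_RE` are hypothetical names, and the tag layout `vMAJOR.MINOR.PATCH[-suffix]` is an assumption inferred from the two tags quoted above.

```python
import re

# Assumed tag layout: vMAJOR.MINOR.PATCH with an optional "-suffix" (e.g. "-cu130").
TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-(.+))?$")


def parse_image(ref):
    """Split 'repo:tag' into (repo, (major, minor, patch), suffix or None)."""
    repo, _, tag = ref.rpartition(":")
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag layout: {tag!r}")
    major, minor, patch, suffix = match.groups()
    return repo, (int(major), int(minor), int(patch)), suffix


old = parse_image("vllm/vllm-openai:v0.19.0-cu130")
new = parse_image("vllm/vllm-openai:v0.21.0")

# Same repository, newer version, and the old tag's suffix is absent from the new one.
print(old[0] == new[0], new[1] > old[1], old[2], new[2])
```

Comparing the version tuples rather than the raw strings avoids lexicographic surprises and makes the dropped suffix visible as a separate field.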

Test plan

Full sweep passes with full-sweep-enabled label.

🤖 Generated with Claude Code

github-actions · 2026-05-17T23:44:36Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

github-actions · 2026-05-17T23:44:58Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26006185504
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26006185504

claude

LGTM — straightforward image bump following the established pattern; the inline nit about the stale divergence comment is non-functional.

Extended reasoning...

Overview

This PR bumps the minimaxm2.5-fp8-b300-vllm image from vllm/vllm-openai:v0.19.0-cu130 to vllm/vllm-openai:v0.21.0 in .github/configs/nvidia-master.yaml and adds a corresponding perf-changelog.yaml entry. Two-file, ~7-line mechanical change consistent with recent sibling bumps (#1404, #1394, #1416).

Security risks

None. This is a YAML config update pinning a public container image tag for benchmark runs. No code paths, secrets, auth, or permissions are touched.

Level of scrutiny

Very low. Single-line version bump in a benchmark config file, following the same pattern as multiple recently merged image-bump PRs. The full-sweep-enabled label means CI will validate the new image against the full sweep before merge.

Other factors

The bug hunter raised a single [Nit] about a stale divergence comment block (lines 4329-4332) that still references the pre-bump version delta and claims the b300-vllm entry is identical to origin/main. That is a documentation accuracy issue, not a functional bug — the YAML data itself is consistent, and the inline comment is already attached for the author to address if they want. No other concerns.

claude · 2026-05-17T23:49:13Z


 minimaxm2.5-fp8-b300-vllm:
-  image: vllm/vllm-openai:v0.19.0-cu130
+  image: vllm/vllm-openai:v0.21.0


🟡 The divergence comment block at .github/configs/nvidia-master.yaml:4329-4332 becomes stale after this bump: it still claims the b300-vllm entry is 'left identical to origin/main' (no longer true) and records the divergence as 'v0.19.0-cu130 -> v0.19.1' when the actual divergence is now 'v0.21.0 -> v0.19.1' (with b300-vllm-agentic at line 4334 still pinned to v0.19.1). Either update the comment to reflect the new divergence, or reconsider whether bumping b300-vllm while leaving b300-vllm-agentic pinned is intended.

Extended reasoning...

What the bug is

This PR bumps minimaxm2.5-fp8-b300-vllm at .github/configs/nvidia-master.yaml:4306 from vllm/vllm-openai:v0.19.0-cu130 to vllm/vllm-openai:v0.21.0, but does not touch the adjacent divergence comment block at lines 4329-4332 that documents this exact image. The b300-vllm-agentic sibling at line 4334 remains pinned to vllm/vllm-openai:v0.19.1.

Why both claims in the comment are now wrong

The comment currently reads:

# Diverged from minimaxm2.5-fp8-b300-vllm (agentic-coding sibling). Reasons below;
# the original minimaxm2.5-fp8-b300-vllm entry is left identical to origin/main so
# its fixed-seq-len sweep is unaffected.
# - image: 'vllm/vllm-openai:v0.19.0-cu130' -> 'vllm/vllm-openai:v0.19.1'

1. Claim 1 ('left identical to origin/main'): This PR explicitly modifies the b300-vllm entry, so it is no longer identical to origin/main. The rationale for forking the -agentic sibling (preserving an unchanged fixed-seq-len sweep history on the upstream image) is no longer being honored.
2. Claim 2 (divergence delta): The recorded delta 'v0.19.0-cu130' -> 'v0.19.1' is now inaccurate on the upstream side: b300-vllm is now v0.21.0 (different version, -cu130 suffix dropped), so the actual divergence becomes v0.21.0 -> v0.19.1.

Step-by-step proof

1. git blame shows the comment was introduced by commit e92a9bf9 (Cameron Quilici, 'Migrate agentic-coding benchmarks to aiperf v0.2'). At that time, b300-vllm was on v0.19.0-cu130 and the -agentic sibling was intentionally forked to v0.19.1 to keep the original sweep on the upstream image.
2. This PR changes line 4306 image: vllm/vllm-openai:v0.19.0-cu130 → image: vllm/vllm-openai:v0.21.0.
3. Line 4334 (b300-vllm-agentic image) is untouched by this PR; it still reads vllm/vllm-openai:v0.19.1.
4. Lines 4329-4332 are also untouched; they still describe the prior divergence delta and assert b300-vllm is identical to origin/main.
5. Therefore, after the merge, the comment misrepresents both the divergence delta (says v0.19.0-cu130 -> v0.19.1, actually v0.21.0 -> v0.19.1) and the design invariant (says b300-vllm is unchanged from origin/main, but it just changed).

Impact and fix

The comment is pure YAML #-prefixed text, so there is no runtime impact and the sweep configuration itself is consistent. However, the comment actively misleads future readers and reviewers about why the -agentic sibling exists as a separate entry. The fix is to either (a) update the comment block to record the new v0.21.0 -> v0.19.1 divergence and drop the 'identical to origin/main' wording, or (b) also bump b300-vllm-agentic to v0.21.0 in this PR so the divergence is closed and the comment can be removed entirely. The b200 sibling block at lines 4278-4282 is unaffected.
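
The staleness described above could in principle be caught mechanically. The following is a minimal sketch of such a check, not something the repository is known to contain: `CONFIG_TEXT` is a hypothetical, shortened excerpt that mirrors the post-bump state described in this review (the real file layout is assumed), and `entry_images` and `stale_deltas` are illustrative helper names.

```python
import re

# Hypothetical excerpt mirroring the post-bump state described in the review.
# The layout is assumed for illustration, not copied from nvidia-master.yaml.
CONFIG_TEXT = """\
minimaxm2.5-fp8-b300-vllm:
  image: vllm/vllm-openai:v0.21.0

# Diverged from minimaxm2.5-fp8-b300-vllm (agentic-coding sibling). Reasons below;
# - image: 'vllm/vllm-openai:v0.19.0-cu130' -> 'vllm/vllm-openai:v0.19.1'
minimaxm2.5-fp8-b300-vllm-agentic:
  image: vllm/vllm-openai:v0.19.1
"""

# A top-level key followed directly by an indented "image:" line.
ENTRY_RE = re.compile(
    r"^(?P<name>[\w.\-]+):[ \t]*\n[ \t]+image:[ \t]*(?P<image>\S+)", re.MULTILINE
)
# A comment line of the form:  # - image: 'BASE' -> 'FORK'
DELTA_RE = re.compile(
    r"^#\s*-\s*image:\s*'(?P<base>[^']+)'\s*->\s*'(?P<fork>[^']+)'", re.MULTILINE
)


def entry_images(text):
    """Map each entry name to the image it currently pins."""
    return {m["name"]: m["image"] for m in ENTRY_RE.finditer(text)}


def stale_deltas(text, base_name, fork_name):
    """Return messages for recorded deltas that disagree with the pinned images."""
    images = entry_images(text)
    problems = []
    for m in DELTA_RE.finditer(text):
        if m["base"] != images[base_name]:
            problems.append(
                f"comment records base {m['base']}, entry pins {images[base_name]}"
            )
        if m["fork"] != images[fork_name]:
            problems.append(
                f"comment records fork {m['fork']}, entry pins {images[fork_name]}"
            )
    return problems


problems = stale_deltas(
    CONFIG_TEXT, "minimaxm2.5-fp8-b300-vllm", "minimaxm2.5-fp8-b300-vllm-agentic"
)
for problem in problems:
    print(problem)
```

On this excerpt the check reports only the base side of the delta, since the -agentic pin still matches what the comment records. It covers the version delta only; the 'identical to origin/main' wording would still need a human reader.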

Update minimaxm2.5-fp8-b300-vllm vLLM image to v0.21.0

e4a190b

functionstackx requested a review from a team May 17, 2026 23:44

functionstackx added the full-sweep-enabled label May 17, 2026

functionstackx requested review from jgangani and kedarpotdar-nv as code owners May 17, 2026 23:44

github-project-automation Bot added this to InferenceMAX Board May 17, 2026

chore: fill pr-link for #1454

a1d561e

functionstackx changed the title ~~Update minimaxm2.5-fp8-b300-vllm vLLM image to v0.21.0~~ [Klaud Cold] Update minimaxm2.5-fp8-b300-vllm vLLM image to v0.21.0 May 17, 2026

claude Bot reviewed May 17, 2026

functionstackx wants to merge 2 commits into main from update-minimaxm2.5-fp8-b300-vllm-v0.21.0
