[Klaud Cold] Update kimik2.5-int4-b300-vllm vLLM image to v0.21.0 by functionstackx · Pull Request #1457 · SemiAnalysisAI/InferenceX

functionstackx · 2026-05-17T23:45:04Z

Summary

Bumps kimik2.5-int4-b300-vllm from vllm/vllm-openai:v0.20.0-cu130 (14d old) to vllm/vllm-openai:v0.21.0.

Test plan

Full sweep passes with full-sweep-enabled label.

🤖 Generated with Claude Code

github-actions · 2026-05-17T23:45:12Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please open a PR there first, before we merge your single-node PR into the master branch. Let's make sure the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

github-actions · 2026-05-17T23:45:39Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26006199572
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26006199572

claude

LGTM — straightforward vLLM image bump matching the pattern from recent sibling PRs (#1404, #1394, #1416).

Extended reasoning...

Overview

This PR is a mechanical Docker image version bump for the kimik2.5-int4-b300-vllm recipe in .github/configs/nvidia-master.yaml (vllm/vllm-openai:v0.20.0-cu130 → vllm/vllm-openai:v0.21.0), plus a corresponding entry appended to perf-changelog.yaml. Two files changed, ~8 lines added total.

Security risks

None. This is a config-only change pointing at a publicly published, pinned upstream image tag from the official vLLM project. No code, auth, crypto, or permission paths are touched.

Level of scrutiny

Low. This matches an established pattern visible in the recent commit history — three near-identical sibling PRs (#1404, #1394, #1416) bumped other recipes to v0.21.0 and were merged the same way. The PR is gated by the full-sweep-enabled label which triggers actual sweep validation, so functional correctness is verified by CI rather than by reading the diff.

Other factors

The bug hunter flagged a single cosmetic nit: the changelog description includes a (14d old) parenthetical that no other entry in perf-changelog.yaml has, and the relative age becomes meaningless after merge. It's a one-line description edit with no runtime impact — the inline comment captures the suggested fix and the author can adopt it or not without holding up the bump. Not blocking.

claude · 2026-05-17T23:49:47Z

+    - kimik2.5-int4-b300-vllm
+  description:
+    - "Update vLLM image from v0.20.0-cu130 (14d old) to v0.21.0"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1457


🟡 The changelog entry's description includes an inconsistent (14d old) parenthetical that doesn't appear in any other entry in perf-changelog.yaml. Every other description in the file uses the clean pattern Update {vLLM,SGLang,vLLM ROCm} image from vA to vB, and the 14d age is relative to PR creation time so it becomes meaningless after merge. Suggest changing the description to "Update vLLM image from v0.20.0-cu130 to v0.21.0" for consistency.

Extended reasoning...

What the issue is

The new entry added at perf-changelog.yaml lines 2632–2637 contains the description:

- "Update vLLM image from v0.20.0-cu130 (14d old) to v0.21.0"

The (14d old) parenthetical is unique to this entry. Every other description in perf-changelog.yaml follows the clean pattern Update {vLLM,SGLang,vLLM ROCm} image from vA to vB with no age annotation — see for example the immediately preceding entries at lines 2581, 2587, 2593, 2599, 2605, 2611, 2618, 2624, and 2630.

Proof / verification

Running grep -nE '\([0-9]+d old\)' perf-changelog.yaml returns exactly one match: the new line 2636 added by this PR. The same (14d old) phrasing appears in the PR description ("Bumps kimik2.5-int4-b300-vllm from vllm/vllm-openai:v0.20.0-cu130 (14d old) to ..."), strongly suggesting it leaked into the changelog from the automated PR-generation pipeline that produces both the PR body and the changelog entry.
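For anyone who wants to repeat that check without grep, here is a minimal Python sketch of the same search. The `find_age_notes` helper and the two-line sample are hypothetical and only illustrate the pattern; the real input would be the contents of perf-changelog.yaml.

```python
import re

# Same pattern as the grep -nE call: a parenthesised day count such as "(14d old)".
AGE_NOTE = re.compile(r"\([0-9]+d old\)")


def find_age_notes(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, line) pairs that contain an age note."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if AGE_NOTE.search(line)
    ]


# Hypothetical excerpt standing in for perf-changelog.yaml.
sample = "\n".join([
    '    - "Update SGLang image from vA to vB"',
    '    - "Update vLLM image from v0.20.0-cu130 (14d old) to v0.21.0"',
])
matches = find_age_notes(sample)
```

On the real file, the call would be something like `find_age_notes(open("perf-changelog.yaml").read())`, and an empty result would mean no age annotations remain.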

Why it doesn't belong here

perf-changelog.yaml is a persistent, user-facing record of image bumps. "14 days old" is a value measured relative to PR creation time on 2026-05-17 — once merged, the relative age becomes meaningless and even misleading (a reader six months from now has no way to interpret it). The PR body is the right place for that context; the durable changelog should match the established pattern.

Impact

Cosmetic / consistency only — no runtime effect. Worth fixing because the changelog is the canonical reference for these bumps and it's a one-line edit.

Fix

Change line 2636 to:

- "Update vLLM image from v0.20.0-cu130 to v0.21.0"
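If the parenthetical does come from the automated PR-generation step, one option is to normalise the description before it is written to the changelog. The sketch below is only an assumption about how that could look; `strip_age_note` is a hypothetical helper, not part of the existing tooling.

```python
import re

# Matches an age note such as " (14d old)" together with the whitespace before it.
AGE_NOTE = re.compile(r"\s*\([0-9]+d old\)")


def strip_age_note(description: str) -> str:
    """Drop relative-age annotations so the description stays valid after merge."""
    return AGE_NOTE.sub("", description)


fixed = strip_age_note("Update vLLM image from v0.20.0-cu130 (14d old) to v0.21.0")
```

Descriptions without an age note pass through unchanged, so the helper is safe to apply to every generated entry.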

Update kimik2.5-int4-b300-vllm vLLM image to v0.21.0

6933092

functionstackx requested a review from a team May 17, 2026 23:45

functionstackx added the full-sweep-enabled label May 17, 2026

functionstackx requested review from jgangani and kedarpotdar-nv as code owners May 17, 2026 23:45

github-project-automation Bot added this to InferenceMAX Board May 17, 2026

chore: fill pr-link for #1457

b3dd2d7

functionstackx changed the title ~~Update kimik2.5-int4-b300-vllm vLLM image to v0.21.0~~ [Klaud Cold] Update kimik2.5-int4-b300-vllm vLLM image to v0.21.0 May 17, 2026

claude Bot reviewed May 17, 2026

functionstackx wants to merge 2 commits into main from update-kimik2.5-int4-b300-vllm-v0.21.0