[Klaud Cold] Update kimik2.5-int4-b300-vllm vLLM image to v0.21.0 #1457
functionstackx wants to merge 2 commits into
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. Failures are often just flakes, and re-running the failed jobs will fix them. If a re-run is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack.
See the unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26006199572
LGTM — straightforward vLLM image bump matching the pattern from recent sibling PRs (#1404, #1394, #1416).
Overview
This PR is a mechanical Docker image version bump for the kimik2.5-int4-b300-vllm recipe in .github/configs/nvidia-master.yaml (vllm/vllm-openai:v0.20.0-cu130 → vllm/vllm-openai:v0.21.0), plus a corresponding entry appended to perf-changelog.yaml. Two files changed, ~8 lines added total.
Security risks
None. This is a config-only change pointing at a publicly published, pinned upstream image tag from the official vLLM project. No code, auth, crypto, or permission paths are touched.
Level of scrutiny
Low. This matches an established pattern visible in the recent commit history — three near-identical sibling PRs (#1404, #1394, #1416) bumped other recipes to v0.21.0 and were merged the same way. The PR is gated by the full-sweep-enabled label which triggers actual sweep validation, so functional correctness is verified by CI rather than by reading the diff.
Other factors
The bug hunter flagged a single cosmetic nit: the changelog description includes a (14d old) parenthetical that no other entry in perf-changelog.yaml has, and the relative age becomes meaningless after merge. It's a one-line description edit with no runtime impact — the inline comment captures the suggested fix and the author can adopt it or not without holding up the bump. Not blocking.
    - kimik2.5-int4-b300-vllm
  description:
    - "Update vLLM image from v0.20.0-cu130 (14d old) to v0.21.0"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1457
🟡 The changelog entry's description includes an inconsistent (14d old) parenthetical that doesn't appear in any other entry in perf-changelog.yaml. Every other description in the file uses the clean pattern Update {vLLM,SGLang,vLLM ROCm} image from vA to vB, and the 14d age is relative to PR creation time so it becomes meaningless after merge. Suggest changing the description to "Update vLLM image from v0.20.0-cu130 to v0.21.0" for consistency.
What the issue is
The new entry added at perf-changelog.yaml lines 2632–2637 contains the description:
- "Update vLLM image from v0.20.0-cu130 (14d old) to v0.21.0"
The (14d old) parenthetical is unique to this entry. Every other description in perf-changelog.yaml follows the clean pattern Update {vLLM,SGLang,vLLM ROCm} image from vA to vB with no age annotation. See, for example, the immediately preceding entries at lines 2581, 2587, 2593, 2599, 2605, 2611, 2618, 2624, and 2630.
Proof / verification
Running grep -nE '\([0-9]+d old\)' perf-changelog.yaml returns exactly one match: the new line 2636 added by this PR. The same (14d old) phrasing appears in the PR description ("Bumps kimik2.5-int4-b300-vllm from vllm/vllm-openai:v0.20.0-cu130 (14d old) to ..."), strongly suggesting it leaked into the changelog from the automated PR-generation pipeline that produces both the PR body and the changelog entry.
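For anyone who wants to repeat this check outside the shell, the same pattern match can be sketched in Python. This is a minimal illustration, not tooling from the repository: the function name `find_relative_age_lines` and the two-line sample text are assumptions made for the example, and a real check would read perf-changelog.yaml instead.

```python
import re

# Same pattern as the grep command: a parenthetical such as "(14d old)".
AGE_PATTERN = re.compile(r"\([0-9]+d old\)")


def find_relative_age_lines(text):
    """Return (line_number, line) pairs that contain a '(Nd old)' parenthetical."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if AGE_PATTERN.search(line)
    ]


# Hypothetical sample standing in for the changelog contents.
sample = "\n".join([
    '- "Update vLLM image from v0.20.0-cu130 (14d old) to v0.21.0"',
    '- "Update vLLM image from v0.20.0-cu130 to v0.21.0"',
])

matches = find_relative_age_lines(sample)
for number, line in matches:
    print(f"{number}: {line}")
```

On the sample, only the first line is reported, mirroring the single grep hit described above.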
Why it doesn't belong here
perf-changelog.yaml is a persistent, user-facing record of image bumps. "14 days old" is a value measured relative to PR creation time on 2026-05-17 — once merged, the relative age becomes meaningless and even misleading (a reader six months from now has no way to interpret it). The PR body is the right place for that context; the durable changelog should match the established pattern.
Impact
Cosmetic / consistency only — no runtime effect. Worth fixing because the changelog is the canonical reference for these bumps and it's a one-line edit.
Fix
Change line 2636 to:
- "Update vLLM image from v0.20.0-cu130 to v0.21.0"
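If the age annotation is produced by the PR-generation pipeline, one option is to strip it before the description is written to the changelog. The sketch below is an assumption about how that could look, not the pipeline's actual code: `strip_relative_age` is a hypothetical helper name, and it only handles the "(Nd old)" form seen in this PR.

```python
import re

# Matches the parenthetical plus the whitespace before it, e.g. " (14d old)".
AGE_PARENTHETICAL = re.compile(r"\s*\([0-9]+d old\)")


def strip_relative_age(description):
    """Remove any '(Nd old)' annotation from a changelog description."""
    return AGE_PARENTHETICAL.sub("", description)


before = "Update vLLM image from v0.20.0-cu130 (14d old) to v0.21.0"
after = strip_relative_age(before)
print(after)
```

Descriptions without the annotation pass through unchanged, so the helper could be applied to every generated entry.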
Summary
Bumps kimik2.5-int4-b300-vllm from vllm/vllm-openai:v0.20.0-cu130 (14d old) to vllm/vllm-openai:v0.21.0.
Test plan
full-sweep-enabled label.
🤖 Generated with Claude Code