retry sglang b300 #1171
Conversation
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack. |
1 similar comment
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack. |
| - config-keys: | ||
| - dsv4-fp4-b300-sglang | ||
| description: | ||
| - "Restore the recipe-per-CONC split (low-latency / balanced / max-throughput) on top of the low-latency-only fallback from #1143; the DeepEP FP8 weight-postprocess path is fixed, so the high-throughput scenario runs again" | ||
| - "Recipes from https://docs.sglang.io/cookbook/autoregressive/DeepSeek/DeepSeek-V4" | ||
| pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1158 |
There was a problem hiding this comment.
🟡 The new changelog entry at lines 1859-1864 is byte-identical to the existing entry at lines 1782-1787 (whose pr-link was just corrected to #1132 in this same PR), but attributes the same 'Restore the recipe-per-CONC split' description to PR #1158 — which was actually titled 'fix sgl b200/b300 dpsk-v4 script' and did not introduce that work. Per the precedent set by retrigger PRs #1148 (line 1827) and #1160 (line 1835), this PR (#1171, 'retry sglang b300') should use its own pr-link with a 'Retrigger dsv4-fp4-b300-sglang' description rather than copy-pasting from #1132.
Extended reasoning...
What the bug is
After this PR, perf-changelog.yaml contains two entries for dsv4-fp4-b300-sglang whose description arrays are byte-identical:
- Lines 1782-1787,
pr-link: .../pull/1132— the description was just corrected from fix sgl b200/b300 dpsk-v4 script #1158 to [NVIDIA] chore: B300 single node DeepSeek v4 SGLang #1132 in this PR (the existing recipe-per-CONC split work). - Lines 1859-1864,
pr-link: .../pull/1158— newly added by this PR with the same text, attributed to a different PR.
The same description text cannot truthfully describe two different PRs. Per git log 8a174e0, PR #1158 was titled "fix sgl b200/b300 dpsk-v4 script" — a script fix, not a recipe-per-CONC split restore on top of the #1143 fallback. The narrative attached to it in the new entry simply does not match the actual scope of #1158.
What this PR should have done
This PR is #1171, titled 'retry sglang b300'. The repo already has a clean precedent for retrigger PRs:
- Line 1827, PR Increase timeout #1148:
description: 'Retriggering dsv4-fp8-mi355x-sglang',pr-link: .../pull/1148 - Line 1835, PR mi355x dsv4 retry #1160:
description: 'Retrigger dsv4-fp8-mi355x-sglang',pr-link: .../pull/1160
Both use the current PR's own pr-link with a short Retrigger ... description. Following this precedent, the new entry should be:
- config-keys:
- dsv4-fp4-b300-sglang
description:
- "Retrigger dsv4-fp4-b300-sglang"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1171Step-by-step proof
- Read the diff: line 1787 changes
pr-linkfrom fix sgl b200/b300 dpsk-v4 script #1158 → [NVIDIA] chore: B300 single node DeepSeek v4 SGLang #1132 for the existing entry. This is correct — [NVIDIA] chore: B300 single node DeepSeek v4 SGLang #1132 actually did the recipe-per-CONC restore. - Read the diff: lines 1859-1864 add a new entry with the same description as lines 1782-1787, but pointing back to fix sgl b200/b300 dpsk-v4 script #1158.
- Compare to git log:
git log --onelineshows8a174e0 fix sgl b200/b300 dpsk-v4 script (#1158)— clearly not a recipe-split restore. - Compare to the retrigger pattern at lines 1827 and 1835: those entries describe what the retrigger PR itself did and link to the retrigger PR's own number. The new entry violates both halves of that pattern.
Addressing the refutation
One verifier noted that the duplicate-description pattern already exists at lines 1837-1846 (#1165) and 1848-1857 (#1170). That counter-example is not load-bearing here: #1165 and #1170 both describe the same DeepSeek-V4-Pro FP4 MI355X ATOM Day-0 marker work — two PRs that genuinely contributed to the same feature. In contrast, #1132 (recipe-per-CONC split restore) and #1158 (script fix) did structurally different things, so reusing #1132's description for #1158 is factual misattribution, not a stylistic duplicate.
Impact and fix
This is a documentation-only file with no runtime impact, so severity is nit. But changelog correctness is load-bearing for downstream attribution in this repo (e.g., #1161 was a dedicated cleanup PR for exactly this kind of misattribution), and the fix is trivial: replace the new entry's description with "Retrigger dsv4-fp4-b300-sglang" and its pr-link with .../pull/1171.
No description provided.