Skip to content

retry sglang b300 #1171

Merged
Qiaolin-Yu merged 2 commits intomainfrom
retry
Apr 26, 2026
Merged

retry sglang b300 #1171
Qiaolin-Yu merged 2 commits intomainfrom
retry

Conversation

@Qiaolin-Yu
Copy link
Copy Markdown
Collaborator

No description provided.

@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

1 similar comment
@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

@Qiaolin-Yu Qiaolin-Yu merged commit b1b0763 into main Apr 26, 2026
16 checks passed
@Qiaolin-Yu Qiaolin-Yu deleted the retry branch April 26, 2026 09:29
Comment thread perf-changelog.yaml
Comment on lines +1859 to +1864
- config-keys:
- dsv4-fp4-b300-sglang
description:
- "Restore the recipe-per-CONC split (low-latency / balanced / max-throughput) on top of the low-latency-only fallback from #1143; the DeepEP FP8 weight-postprocess path is fixed, so the high-throughput scenario runs again"
- "Recipes from https://docs.sglang.io/cookbook/autoregressive/DeepSeek/DeepSeek-V4"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1158
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The new changelog entry at lines 1859-1864 is byte-identical to the existing entry at lines 1782-1787 (whose pr-link was just corrected to #1132 in this same PR), but attributes the same 'Restore the recipe-per-CONC split' description to PR #1158 — which was actually titled 'fix sgl b200/b300 dpsk-v4 script' and did not introduce that work. Per the precedent set by retrigger PRs #1148 (line 1827) and #1160 (line 1835), this PR (#1171, 'retry sglang b300') should use its own pr-link with a 'Retrigger dsv4-fp4-b300-sglang' description rather than copy-pasting from #1132.

Extended reasoning...

What the bug is

After this PR, perf-changelog.yaml contains two entries for dsv4-fp4-b300-sglang whose description arrays are byte-identical:

The same description text cannot truthfully describe two different PRs. Per git log 8a174e0, PR #1158 was titled "fix sgl b200/b300 dpsk-v4 script" — a script fix, not a recipe-per-CONC split restore on top of the #1143 fallback. The narrative attached to it in the new entry simply does not match the actual scope of #1158.

What this PR should have done

This PR is #1171, titled 'retry sglang b300'. The repo already has a clean precedent for retrigger PRs:

Both use the current PR's own pr-link with a short Retrigger ... description. Following this precedent, the new entry should be:

- config-keys:
    - dsv4-fp4-b300-sglang
  description:
    - "Retrigger dsv4-fp4-b300-sglang"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1171

Step-by-step proof

  1. Read the diff: line 1787 changes pr-link from fix sgl b200/b300 dpsk-v4 script #1158[NVIDIA] chore: B300 single node DeepSeek v4 SGLang #1132 for the existing entry. This is correct — [NVIDIA] chore: B300 single node DeepSeek v4 SGLang #1132 actually did the recipe-per-CONC restore.
  2. Read the diff: lines 1859-1864 add a new entry with the same description as lines 1782-1787, but pointing back to fix sgl b200/b300 dpsk-v4 script #1158.
  3. Compare to git log: git log --oneline shows 8a174e0 fix sgl b200/b300 dpsk-v4 script (#1158) — clearly not a recipe-split restore.
  4. Compare to the retrigger pattern at lines 1827 and 1835: those entries describe what the retrigger PR itself did and link to the retrigger PR's own number. The new entry violates both halves of that pattern.

Addressing the refutation

One verifier noted that the duplicate-description pattern already exists at lines 1837-1846 (#1165) and 1848-1857 (#1170). That counter-example is not load-bearing here: #1165 and #1170 both describe the same DeepSeek-V4-Pro FP4 MI355X ATOM Day-0 marker work — two PRs that genuinely contributed to the same feature. In contrast, #1132 (recipe-per-CONC split restore) and #1158 (script fix) did structurally different things, so reusing #1132's description for #1158 is factual misattribution, not a stylistic duplicate.

Impact and fix

This is a documentation-only file with no runtime impact, so severity is nit. But changelog correctness is load-bearing for downstream attribution in this repo (e.g., #1161 was a dedicated cleanup PR for exactly this kind of misattribution), and the fix is trivial: replace the new entry's description with "Retrigger dsv4-fp4-b300-sglang" and its pr-link with .../pull/1171.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

1 participant