dsr1-sglang: extend low-conc sweep to include conc=1 and conc=2#1534
Conversation
…P/MTP) Lowers conc-start from 4 to 1 in all six DSR1 SGLang agg search-spaces re-run by #1502: dsr1-fp{4,8}-b{200,300}-sglang and dsr1-fp8-b{200,300}-sglang-mtp. With step-factor 2 this adds conc=1 and conc=2 data points; conc-end is unchanged.
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers. If additional help is needed, PR authors can reach out to core maintainers over Slack. |
| - dsr1-fp8-b300-sglang-mtp | ||
| description: | ||
| - "Extend low-conc sweep: lower conc-start from 4 to 1 in all search-spaces, adding conc=1 and conc=2 data points (conc-end unchanged, step factor 2)" | ||
| pr-link: TBD |
There was a problem hiding this comment.
🔴 The new perf-changelog entry uses pr-link: TBD, but the canonical placeholder is pr-link: XXX (documented in AGENTS.md:127 and used by existing placeholder entries at perf-changelog.yaml:2516 and :2570). utils/merge_with_reuse.sh:135-137 explicitly filters PR-side entries with if "XXX" not in link and not link.endswith(f"/pull/{pr}"): continue, so TBD matches neither and the entry will be silently skipped — causing the merge script to abort at lines 153-157 with "No PR contributions found...". Fix: change TBD to XXX.
Extended reasoning...
What the bug is
The new perf-changelog entry added at perf-changelog.yaml:3063 uses pr-link: TBD as a placeholder for the not-yet-known PR number. While the intent is reasonable (the PR description explicitly says "pr-link will be patched in a follow-up commit"), the literal string TBD is not the placeholder string that the merge automation recognizes.
The canonical convention
Two pieces of evidence establish that XXX is the canonical placeholder, not TBD:
AGENTS.md:127documents the convention as:pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX- Existing in-flight placeholder entries in this same file use
pr-link: XXX— seeperf-changelog.yaml:2516andperf-changelog.yaml:2570.
Why existing code doesn't handle TBD
utils/merge_with_reuse.sh:135-137 filters candidate PR-side entries with:
link = str(entry.get("pr-link") or "")
if "XXX" not in link and not link.endswith(f"/pull/{pr}"):
continueThe string TBD contains no substring XXX, and it doesn't end with /pull/<pr-number>. So the entry fails both conditions and is silently skipped.
Step-by-step proof
Walking through the merge_with_reuse flow for this PR (say PR #1534):
- The script loads
perf-changelog.yamlfrom the PR side and iterates over candidate entries. - It reaches the new entry at line 3063, reads
link = "TBD". - Check 1:
"XXX" not in "TBD"→ True (no match). - Check 2:
"TBD".endswith("/pull/1534")→ False. - Both conditions in the
ifare True, socontinuefires → entry is dropped fromcontribs. - Since this is the only added entry in the diff (verified —
perf-changelog.yamlhas exactly one added record),contribsis empty. - The script reaches lines 153-157 and aborts with: "No PR contributions found... (expected entry tagged with XXX or /pull/{pr} with new content)".
Impact and fix
When this PR is merged via the standard reuse-sweep flow that merge_with_reuse.sh implements, the merge will fail — or, in the no-conflict path, TBD will land on main verbatim because the automation won't recognize it as a placeholder to rewrite. The PR author's stated plan to "patch in a follow-up commit" doesn't help because the merge script aborts before that commit could be created.
Fix is trivial: change pr-link: TBD → pr-link: XXX on line 3063. The merge automation will then rewrite XXX to the real PR URL on merge, exactly as it does for the other in-flight placeholder entries.
| description: | ||
| - "Extend low-conc sweep: lower conc-start from 4 to 1 in all search-spaces, adding conc=1 and conc=2 data points (conc-end unchanged, step factor 2)" |
There was a problem hiding this comment.
🟡 The changelog entry says "lower conc-start from 4 to 1 in all search-spaces" and the PR description says "in every search-space", but four search-space entries in the listed configs still carry conc-start: 4 after this PR — all at isl: 8192: dsr1-fp4-b200-sglang (tp:4/ep:4, line 1690), dsr1-fp4-b300-sglang (tp:4/ep:4, line 1891), dsr1-fp8-b200-sglang (tp:4/ep:1, line 1977), and dsr1-fp8-b300-sglang (tp:4/ep:1, line 2000). The asymmetry within each isl: 8192 block (sibling tp:8 lowered to 1, tp:4 left at 4) is conspicuous — either soften the wording to "most search-spaces" with a note on why the tp:4 isl=8192 entries were skipped, or lower those four entries too if the omission was unintentional.
Extended reasoning...
What the bug is
The new perf-changelog entry (perf-changelog.yaml:3061-3062) and the PR description both make an absolute claim about scope:
- perf-changelog: "lower conc-start from 4 to 1 in all search-spaces"
- PR description: "Lowers
conc-startfrom4to1in every search-space of the six DSR1 SGLang agg configs"
The diff does not match those claims: four search-space entries in the listed configs still have conc-start: 4 after this PR, all in the isl: 8192 block. They are visible in the diff context (the lines printed by the unified-diff but not marked -/+).
The four missed entries
Reading .github/configs/nvidia-master.yaml at the post-PR positions:
| File:Line | Config | isl/osl | search-space entry |
|---|---|---|---|
| nvidia-master.yaml:1690 | dsr1-fp4-b200-sglang | 8192/1024 | { tp: 4, ep: 4, conc-start: 4, conc-end: 128 } |
| nvidia-master.yaml:1891 | dsr1-fp4-b300-sglang | 8192/1024 | { tp: 4, ep: 4, conc-start: 4, conc-end: 128 } |
| nvidia-master.yaml:1977 | dsr1-fp8-b200-sglang | 8192/1024 | { tp: 4, ep: 1, conc-start: 4, conc-end: 32 } |
| nvidia-master.yaml:2000 | dsr1-fp8-b300-sglang | 8192/1024 | { tp: 4, ep: 1, conc-start: 4, conc-end: 32 } |
Each appears as an unchanged context line in the diff hunks at lines 1682-1694, 1883-1895, 1969-1979, and 1992-2002 of the diff respectively.
Why this is conspicuous
For dsr1-fp4-b200-sglang and dsr1-fp4-b300-sglang, the sibling tp: 8 entry in the same isl: 8192 block was lowered from 4 to 1 in this PR (e.g. { tp: 8, ep: 8, conc-start: 4 → 1, conc-end: 16 }). For dsr1-fp8-b200-sglang and dsr1-fp8-b300-sglang, the same pattern holds: the tp: 8 entry at isl: 8192 was lowered, the tp: 4 entry next to it was not. So within a single isl: 8192 block we now have one entry starting at conc=1 and another starting at conc=4 — that asymmetry will read as an oversight to anyone diffing the file against the changelog.
Step-by-step proof
- Open the PR diff for
.github/configs/nvidia-master.yaml. - In the hunk starting at line 1682 (
dsr1-fp4-b200-sglang), theisl: 1024block has bothtp: 4andtp: 8entries changed fromconc-start: 4toconc-start: 1. Theisl: 8192block has only thetp: 8line changed; thetp: 4, ep: 4, conc-start: 4, conc-end: 128line appears unchanged (no leading+/-). - The same pattern repeats in the hunks for
dsr1-fp4-b300-sglang(line 1883+),dsr1-fp8-b200-sglang(line 1969+), anddsr1-fp8-b300-sglang(line 1992+): everytp: 4entry atisl: 8192is left atconc-start: 4. - Open
perf-changelog.yamlat line 3061: the description reads "lower conc-start from 4 to 1 in all search-spaces". Four counter-examples are visible in the same diff. - The MTP configs (
dsr1-fp8-b{200,300}-sglang-mtp) only have a singletp: 8entry per isl block, so they have no missed entries — the inaccuracy is confined to the four non-MTP configs.
Impact and fix
This is a changelog/PR-description accuracy issue, not a runtime defect — benchmark execution is unaffected. But the perf-changelog is the primary record of what each PR changed for downstream readers, and saying "all" when it's really "most" will mislead anyone scanning the changelog later. Two clean fixes:
- Soften the wording: change "in all search-spaces" to "in most search-spaces" (or list which were lowered) and add a one-line note on why the four
tp: 4isl: 8192entries were intentionally left atconc-start: 4(cost? not interesting at low conc?). The PR description should be updated to match. - Or lower the four missed entries to
conc-start: 1if the omission was an oversight, which would also restore symmetry within eachisl: 8192block.
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26187337365 |
Sets conc-start=1, conc-end=2 in every search-space across the six DSR1 SGLang agg configs (B200/B300, FP8/FP4, no-MTP/MTP). Also updates the perf-changelog entry to match.
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26194109243 |
# Conflicts: # perf-changelog.yaml
Summary
Lowers
conc-startfrom4to1in every search-space of the six DSR1 SGLang agg configs re-run by #1502 (dsr1-fp{4,8}-b{200,300}-sglanganddsr1-fp8-b{200,300}-sglang-mtp). With step factor 2 this addsconc=1andconc=2data points;conc-endis unchanged. Appends a corresponding perf-changelog entry (pr-linkwill be patched in a follow-up commit).