Skip to content

dsr1-sglang: extend low-conc sweep to include conc=1 and conc=2#1534

Merged
Ankur-singh merged 6 commits into
mainfrom
dsr1-conc-1-2
May 22, 2026
Merged

dsr1-sglang: extend low-conc sweep to include conc=1 and conc=2#1534
Ankur-singh merged 6 commits into
mainfrom
dsr1-conc-1-2

Conversation

@Ankur-singh
Copy link
Copy Markdown
Collaborator

Summary

Lowers conc-start from 4 to 1 in every search-space of the six DSR1 SGLang agg configs re-run by #1502 (dsr1-fp{4,8}-b{200,300}-sglang and dsr1-fp8-b{200,300}-sglang-mtp). With step factor 2 this adds conc=1 and conc=2 data points; conc-end is unchanged. Appends a corresponding perf-changelog entry (pr-link will be patched in a follow-up commit).

…P/MTP)

Lowers conc-start from 4 to 1 in all six DSR1 SGLang agg search-spaces
re-run by #1502: dsr1-fp{4,8}-b{200,300}-sglang and dsr1-fp8-b{200,300}-sglang-mtp.
With step-factor 2 this adds conc=1 and conc=2 data points; conc-end is
unchanged.
@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Comment thread perf-changelog.yaml Outdated
- dsr1-fp8-b300-sglang-mtp
description:
- "Extend low-conc sweep: lower conc-start from 4 to 1 in all search-spaces, adding conc=1 and conc=2 data points (conc-end unchanged, step factor 2)"
pr-link: TBD
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The new perf-changelog entry uses pr-link: TBD, but the canonical placeholder is pr-link: XXX (documented in AGENTS.md:127 and used by existing placeholder entries at perf-changelog.yaml:2516 and :2570). utils/merge_with_reuse.sh:135-137 explicitly filters PR-side entries with if "XXX" not in link and not link.endswith(f"/pull/{pr}"): continue, so TBD matches neither and the entry will be silently skipped — causing the merge script to abort at lines 153-157 with "No PR contributions found...". Fix: change TBD to XXX.

Extended reasoning...

What the bug is

The new perf-changelog entry added at perf-changelog.yaml:3063 uses pr-link: TBD as a placeholder for the not-yet-known PR number. While the intent is reasonable (the PR description explicitly says "pr-link will be patched in a follow-up commit"), the literal string TBD is not the placeholder string that the merge automation recognizes.

The canonical convention

Two pieces of evidence establish that XXX is the canonical placeholder, not TBD:

  1. AGENTS.md:127 documents the convention as: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX
  2. Existing in-flight placeholder entries in this same file use pr-link: XXX — see perf-changelog.yaml:2516 and perf-changelog.yaml:2570.

Why existing code doesn't handle TBD

utils/merge_with_reuse.sh:135-137 filters candidate PR-side entries with:

link = str(entry.get("pr-link") or "")
if "XXX" not in link and not link.endswith(f"/pull/{pr}"):
    continue

The string TBD contains no substring XXX, and it doesn't end with /pull/<pr-number>. So the entry fails both conditions and is silently skipped.

Step-by-step proof

Walking through the merge_with_reuse flow for this PR (say PR #1534):

  1. The script loads perf-changelog.yaml from the PR side and iterates over candidate entries.
  2. It reaches the new entry at line 3063, reads link = "TBD".
  3. Check 1: "XXX" not in "TBD" → True (no match).
  4. Check 2: "TBD".endswith("/pull/1534") → False.
  5. Both conditions in the if are True, so continue fires → entry is dropped from contribs.
  6. Since this is the only added entry in the diff (verified — perf-changelog.yaml has exactly one added record), contribs is empty.
  7. The script reaches lines 153-157 and aborts with: "No PR contributions found... (expected entry tagged with XXX or /pull/{pr} with new content)".

Impact and fix

When this PR is merged via the standard reuse-sweep flow that merge_with_reuse.sh implements, the merge will fail — or, in the no-conflict path, TBD will land on main verbatim because the automation won't recognize it as a placeholder to rewrite. The PR author's stated plan to "patch in a follow-up commit" doesn't help because the merge script aborts before that commit could be created.

Fix is trivial: change pr-link: TBDpr-link: XXX on line 3063. The merge automation will then rewrite XXX to the real PR URL on merge, exactly as it does for the other in-flight placeholder entries.

Comment thread perf-changelog.yaml Outdated
Comment on lines +3061 to +3062
description:
- "Extend low-conc sweep: lower conc-start from 4 to 1 in all search-spaces, adding conc=1 and conc=2 data points (conc-end unchanged, step factor 2)"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The changelog entry says "lower conc-start from 4 to 1 in all search-spaces" and the PR description says "in every search-space", but four search-space entries in the listed configs still carry conc-start: 4 after this PR — all at isl: 8192: dsr1-fp4-b200-sglang (tp:4/ep:4, line 1690), dsr1-fp4-b300-sglang (tp:4/ep:4, line 1891), dsr1-fp8-b200-sglang (tp:4/ep:1, line 1977), and dsr1-fp8-b300-sglang (tp:4/ep:1, line 2000). The asymmetry within each isl: 8192 block (sibling tp:8 lowered to 1, tp:4 left at 4) is conspicuous — either soften the wording to "most search-spaces" with a note on why the tp:4 isl=8192 entries were skipped, or lower those four entries too if the omission was unintentional.

Extended reasoning...

What the bug is

The new perf-changelog entry (perf-changelog.yaml:3061-3062) and the PR description both make an absolute claim about scope:

  • perf-changelog: "lower conc-start from 4 to 1 in all search-spaces"
  • PR description: "Lowers conc-start from 4 to 1 in every search-space of the six DSR1 SGLang agg configs"

The diff does not match those claims: four search-space entries in the listed configs still have conc-start: 4 after this PR, all in the isl: 8192 block. They are visible in the diff context (the lines printed by the unified-diff but not marked -/+).

The four missed entries

Reading .github/configs/nvidia-master.yaml at the post-PR positions:

File:Line Config isl/osl search-space entry
nvidia-master.yaml:1690 dsr1-fp4-b200-sglang 8192/1024 { tp: 4, ep: 4, conc-start: 4, conc-end: 128 }
nvidia-master.yaml:1891 dsr1-fp4-b300-sglang 8192/1024 { tp: 4, ep: 4, conc-start: 4, conc-end: 128 }
nvidia-master.yaml:1977 dsr1-fp8-b200-sglang 8192/1024 { tp: 4, ep: 1, conc-start: 4, conc-end: 32 }
nvidia-master.yaml:2000 dsr1-fp8-b300-sglang 8192/1024 { tp: 4, ep: 1, conc-start: 4, conc-end: 32 }

Each appears as an unchanged context line in the diff hunks at lines 1682-1694, 1883-1895, 1969-1979, and 1992-2002 of the diff respectively.

Why this is conspicuous

For dsr1-fp4-b200-sglang and dsr1-fp4-b300-sglang, the sibling tp: 8 entry in the same isl: 8192 block was lowered from 4 to 1 in this PR (e.g. { tp: 8, ep: 8, conc-start: 4 → 1, conc-end: 16 }). For dsr1-fp8-b200-sglang and dsr1-fp8-b300-sglang, the same pattern holds: the tp: 8 entry at isl: 8192 was lowered, the tp: 4 entry next to it was not. So within a single isl: 8192 block we now have one entry starting at conc=1 and another starting at conc=4 — that asymmetry will read as an oversight to anyone diffing the file against the changelog.

Step-by-step proof

  1. Open the PR diff for .github/configs/nvidia-master.yaml.
  2. In the hunk starting at line 1682 (dsr1-fp4-b200-sglang), the isl: 1024 block has both tp: 4 and tp: 8 entries changed from conc-start: 4 to conc-start: 1. The isl: 8192 block has only the tp: 8 line changed; the tp: 4, ep: 4, conc-start: 4, conc-end: 128 line appears unchanged (no leading +/-).
  3. The same pattern repeats in the hunks for dsr1-fp4-b300-sglang (line 1883+), dsr1-fp8-b200-sglang (line 1969+), and dsr1-fp8-b300-sglang (line 1992+): every tp: 4 entry at isl: 8192 is left at conc-start: 4.
  4. Open perf-changelog.yaml at line 3061: the description reads "lower conc-start from 4 to 1 in all search-spaces". Four counter-examples are visible in the same diff.
  5. The MTP configs (dsr1-fp8-b{200,300}-sglang-mtp) only have a single tp: 8 entry per isl block, so they have no missed entries — the inaccuracy is confined to the four non-MTP configs.

Impact and fix

This is a changelog/PR-description accuracy issue, not a runtime defect — benchmark execution is unaffected. But the perf-changelog is the primary record of what each PR changed for downstream readers, and saying "all" when it's really "most" will mislead anyone scanning the changelog later. Two clean fixes:

  • Soften the wording: change "in all search-spaces" to "in most search-spaces" (or list which were lowered) and add a one-line note on why the four tp: 4 isl: 8192 entries were intentionally left at conc-start: 4 (cost? not interesting at low conc?). The PR description should be updated to match.
  • Or lower the four missed entries to conc-start: 1 if the omission was an oversight, which would also restore symmetry within each isl: 8192 block.

@github-actions
Copy link
Copy Markdown
Contributor

Sets conc-start=1, conc-end=2 in every search-space across the six DSR1
SGLang agg configs (B200/B300, FP8/FP4, no-MTP/MTP). Also updates the
perf-changelog entry to match.
@github-actions
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Collaborator

@Oseltamivir Oseltamivir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@Ankur-singh Ankur-singh merged commit 6a4fe2a into main May 22, 2026
22 checks passed
@Ankur-singh Ankur-singh deleted the dsr1-conc-1-2 branch May 22, 2026 18:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

3 participants