[Klaud Cold] Update glm5-fp8-b200-sglang (+mtp) SGLang image to v0.5.12-cu130 #1447

Open
functionstackx wants to merge 4 commits into main from update-glm5-fp8-b200-sglang-v0.5.12

@functionstackx
Collaborator

Summary

  • Bumps glm5-fp8-b200-sglang and glm5-fp8-b200-sglang-mtp from lmsysorg/sglang:nightly-dev-cu13-20260317-1eea7448 (33/29 days old) to lmsysorg/sglang:v0.5.12-cu130 (matches other b200 cu130 sglang recipes on main).

Test plan

  • Full sweep passes with full-sweep-enabled label.

🤖 Generated with Claude Code

Drop the 33/29-day-old pinned nightly lmsysorg/sglang:nightly-dev-cu13-20260317-1eea7448
in favor of the stable v0.5.12-cu130 release tag used by other b200 cu130 sglang recipes
on main.
@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

@functionstackx changed the title from "Update glm5-fp8-b200-sglang (+mtp) SGLang image to v0.5.12-cu130" to "[Klaud Cold] Update glm5-fp8-b200-sglang (+mtp) SGLang image to v0.5.12-cu130" on May 17, 2026
Comment thread perf-changelog.yaml
Comment on lines +2633 to +2635
- config-keys:
    - glm5-fp8-b200-sglang
    - glm5-fp8-b200-sglang-mtp
Contributor

🔴 The PR also bumps the image for glm5-fp8-b200-sglang-agentic at nvidia-master.yaml:2237 (same change as the two siblings), but the new perf-changelog entry only lists glm5-fp8-b200-sglang and glm5-fp8-b200-sglang-mtp under config-keys. Add - glm5-fp8-b200-sglang-agentic to the list (and ideally mention it in the title/description) so the changelog reflects all three bumps and downstream changelog-driven sweep generation picks up the agentic variant on merge.

Extended reasoning...

What the bug is

This PR bumps the image tag for three config entries in .github/configs/nvidia-master.yaml from lmsysorg/sglang:nightly-dev-cu13-20260317-1eea7448 to lmsysorg/sglang:v0.5.12-cu130:

  • glm5-fp8-b200-sglang (line 2191)
  • glm5-fp8-b200-sglang-mtp (line 2210)
  • glm5-fp8-b200-sglang-agentic (line 2237)

However, the new perf-changelog.yaml entry (lines 2632-2638) only lists the first two under config-keys. The PR title ("Update glm5-fp8-b200-sglang (+mtp)"), description ("33/29 days old" — only two age values), and commit message also omit the agentic sibling, strongly suggesting this is an oversight rather than an intentional exclusion.

Why it matters

Per AGENTS.md lines 113-124, every image bump in a *-master.yaml must be paired with a perf-changelog.yaml entry ("required - triggers benchmarks"). The utils/process_changelog.py helper get_config_keys_from_master() resolves the listed config-keys and feeds them to generate_sweep_configs.py test-config --config-keys .... Configs that are missing from the config-keys list are simply skipped by the changelog-driven sweep generation.
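To make the skipping behavior concrete, here is a minimal Python sketch of how a changelog entry's config-keys could be turned into the sweep-generation command. The function name `build_sweep_command` and the way the keys are passed (one `--config-keys` flag followed by all keys) are assumptions for illustration; only the script name, the `test-config` subcommand, and the `--config-keys` flag come from the comment above. The actual code in utils/process_changelog.py may differ.

```python
import shlex


def build_sweep_command(entry: dict) -> list:
    """Hypothetical helper: build the sweep-generation argv for one changelog entry."""
    keys = entry.get("config-keys", [])
    if not keys:
        raise ValueError("changelog entry lists no config-keys")
    # Script name, subcommand, and flag are quoted from the review comment.
    # Passing every key after a single flag is an assumption.
    return ["python", "generate_sweep_configs.py", "test-config", "--config-keys", *keys]


# The entry as currently written in this PR: the agentic variant is absent.
entry = {
    "config-keys": ["glm5-fp8-b200-sglang", "glm5-fp8-b200-sglang-mtp"],
    "pr-link": "https://github.com/SemiAnalysisAI/InferenceX/pull/1447",
}

command = build_sweep_command(entry)
print(shlex.join(command))
```

Under these assumptions, nothing in the command refers to glm5-fp8-b200-sglang-agentic, so a sweep driven only by this entry would not benchmark it.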

The full-sweep-enabled label on this PR happens to cover all three configs for the PR-time sweep, so functionally the PR itself will benchmark the agentic variant. But the merge-time / post-merge baseline refresh path triggered by run-sweep.yml on paths: perf-changelog.yaml keys off the added entry's config-keys — so once merged, the agentic variant's image bump will not get a baseline benchmark refresh from this changelog entry, and any downstream consumer reading the changelog history (e.g. for release notes or change tracking) will miss it.

Step-by-step proof

  1. git show 09a23b0 -- .github/configs/nvidia-master.yaml — three blocks are modified, including the one at line 2237 for glm5-fp8-b200-sglang-agentic. The replacement string is identical to the one applied to the two siblings.
  2. perf-changelog.yaml lines 2632-2638 list only glm5-fp8-b200-sglang and glm5-fp8-b200-sglang-mtp under config-keys.
  3. Compare with the previous dsr1 entry at lines 2614-2619 (also a multi-sibling bump, from PR #1416, "Update dsr1-fp8-b200-sglang and -mtp SGLang image to v0.5.12-cu130"): it correctly lists both dsr1-fp8-b200-sglang and dsr1-fp8-b200-sglang-mtp. The convention is to list every config whose image was bumped.
  4. utils/process_changelog.py reads added entries from this file and uses config-keys directly as the --config-keys argument to generate_sweep_configs.py. A config absent from the list is silently not benchmarked from that entry.
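The comparison in the steps above can be sketched as a small consistency check: collect every config whose image changed, then subtract the keys listed in the changelog entry. This is an illustrative sketch only. The helper names `bumped_keys` and `missing_from_changelog` are hypothetical, and the in-memory dictionaries stand in for the parsed contents of nvidia-master.yaml and perf-changelog.yaml; no such check is known to exist in the repository.

```python
OLD_IMAGE = "lmsysorg/sglang:nightly-dev-cu13-20260317-1eea7448"
NEW_IMAGE = "lmsysorg/sglang:v0.5.12-cu130"

CONFIGS = (
    "glm5-fp8-b200-sglang",
    "glm5-fp8-b200-sglang-mtp",
    "glm5-fp8-b200-sglang-agentic",
)


def bumped_keys(before: dict, after: dict) -> set:
    """Return the config keys whose image differs between two snapshots."""
    return {key for key, image in after.items() if before.get(key) != image}


def missing_from_changelog(before: dict, after: dict, entry: dict) -> list:
    """Return bumped config keys that the changelog entry does not list."""
    listed = set(entry.get("config-keys", []))
    return sorted(bumped_keys(before, after) - listed)


# Hypothetical stand-ins for the master config before and after this PR.
before = {key: OLD_IMAGE for key in CONFIGS}
after = {key: NEW_IMAGE for key in CONFIGS}

# The changelog entry as currently written in this PR.
entry = {"config-keys": ["glm5-fp8-b200-sglang", "glm5-fp8-b200-sglang-mtp"]}

missing = missing_from_changelog(before, after, entry)
print(missing)  # ['glm5-fp8-b200-sglang-agentic']
```

A check along these lines could run in CI to flag a changelog entry that omits a bumped config, though whether that fits the repository's workflow is a maintainer decision.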

How to fix

Append - glm5-fp8-b200-sglang-agentic to the config-keys list in the new entry, and update the PR title/description (and optionally the changelog description string) to mention the agentic variant. The diff would be:

- config-keys:
    - glm5-fp8-b200-sglang
    - glm5-fp8-b200-sglang-mtp
    - glm5-fp8-b200-sglang-agentic
  description:
    - "Update SGLang image from nightly-dev-cu13-20260317-1eea7448 (33d/29d old) to v0.5.12-cu130"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1447
