[NV] H200 GLM5 fp8 update sglang container by hshrivastava-droid · Pull Request #1033 · SemiAnalysisAI/InferenceX

hshrivastava-droid · 2026-04-15T06:53:33Z

Summary

Update the GLM-5 FP8 H200 SGLang benchmark configuration and launch script:

- SGLang image: Change from `lmsysorg/sglang:v0.5.12-cu130` to `lmsysorg/sglang:v0.5.11-cu129`
- Runner: Switch from `h200` to `h200-dgxc`
- Launch option: Add `--enable-flashinfer-allreduce-fusion` to the server launch command for improved allreduce performance
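
For readers who want to see where the new flag sits, here is a minimal dry-run sketch of a launch command. It is not the contents of `benchmarks/single_node/glm5_fp8_h200.sh`: the model path, tensor-parallel size, and port are placeholder assumptions, and only `--enable-flashinfer-allreduce-fusion` comes from this PR. The script prints the command rather than starting a server.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the actual benchmark script from this PR.
set -euo pipefail

# Assumed placeholders: this PR page does not show these values.
MODEL_PATH="${MODEL_PATH:-/path/to/glm5-fp8}"
TP_SIZE="${TP_SIZE:-8}"
PORT="${PORT:-30000}"

# Build the command as an array so each argument stays a single word.
# The last entry is the flag this PR adds to the launch command.
launch_cmd=(
  python3 -m sglang.launch_server
  --model-path "$MODEL_PATH"
  --tp-size "$TP_SIZE"
  --port "$PORT"
  --enable-flashinfer-allreduce-fusion
)

# Dry run: print the command on one line instead of executing it.
print_launch_cmd() {
  printf '%s\n' "${launch_cmd[*]}"
}

print_launch_cmd
```

To actually launch, one would replace the final `print_launch_cmd` call with `"${launch_cmd[@]}"` inside the container named in the image change above.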

Changed Files

| File | Change |
| --- | --- |
| `.github/configs/nvidia-master.yaml` | Updated image tag and runner type for `glm5-fp8-h200-sglang` |
| `benchmarks/single_node/glm5_fp8_h200.sh` | Added `--enable-flashinfer-allreduce-fusion` flag |
| `perf-changelog.yaml` | Added changelog entry for these config changes |

Validation

Validation run

github-actions · 2026-04-15T06:53:42Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please open a PR there first, before we merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

github-actions · 2026-05-14T17:17:27Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25872417396
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25872417396

github-actions · 2026-05-14T17:26:14Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25872417396
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25872417396

github-actions · 2026-05-14T17:49:27Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25874676435
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25874676435

github-actions · 2026-05-14T17:59:53Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25875894361
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25875894361

github-actions · 2026-05-14T23:59:22Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25875894361
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25875894361

github-actions · 2026-05-15T18:16:41Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25875894361
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25875894361

github-actions · 2026-05-18T23:12:28Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25875894361
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=25875894361

github-actions · 2026-05-19T18:10:43Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26113503153
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26113503153

hshrivastava-droid · 2026-05-19T18:43:47Z

@claude add PR description

Klaud-Cold · 2026-05-19T18:44:21Z

Claude finished @hshrivastava-droid's task in 37s.

Adding PR Description

Gather context and analyze changes
Write and update PR description

Added a PR description covering the three key changes in this PR:

- SGLang image update (`v0.5.12-cu130` → `v0.5.11-cu129`)
- Runner switch (`h200` → `h200-dgxc`)
- Added `--enable-flashinfer-allreduce-fusion` launch flag

Also included links to the validation run and a summary table of changed files.

github-actions · 2026-05-19T19:31:42Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26118912054
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26118912054

hshrivastava-droid · 2026-05-19T20:21:36Z

SGLang cookbook: sgl-project/sgl-cookbook#276

hshrivastava-droid · 2026-05-19T20:22:04Z

@functionstackx could you please help review this?

functionstackx

Can you update the docs in the correct repo? https://github.com/sgl-project/sgl-cookbook is deprecated.

https://github.com/sgl-project/sglang/tree/main/docs_new is the new location.

hshrivastava-droid · 2026-05-19T22:41:03Z

Updated SGLang recipe: sgl-project/sglang#25814
@functionstackx

functionstackx · 2026-05-19T23:05:29Z

/reuse-sweep-run

github-actions · 2026-05-19T23:06:22Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26130688982
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26130688982

update sglang container

7b22742

hshrivastava-droid requested a review from a team April 15, 2026 06:53

hshrivastava-droid added the NVIDIA label Apr 15, 2026

hshrivastava-droid requested a review from kedarpotdar-nv as a code owner April 15, 2026 06:53

hshrivastava-droid added the sweep-enabled label Apr 15, 2026

hshrivastava-droid requested a review from jgangani as a code owner April 15, 2026 06:53

github-project-automation Bot added this to InferenceMAX Board Apr 15, 2026

update PR number

295e1df

hshrivastava-droid changed the title ~~[WIP][NV] update sglang container~~ [WIP][NV] GLM5 fp8 update sglang container Apr 16, 2026

Ankur-singh changed the title ~~[WIP][NV] GLM5 fp8 update sglang container~~ [Do Not Merge][NV] GLM5 fp8 update sglang container Apr 17, 2026

hshrivastava-droid added 4 commits April 23, 2026 16:47

update flashinfer

e2767e6

Merge branch 'main' into nv/glm5-fp8-h200-sglang-v2

de4242d

update contianer

3bb3b5a

Merge branch 'main' into nv/glm5-fp8-h200-sglang-v2

6c72388

hshrivastava-droid added full-sweep-enabled and removed sweep-enabled labels May 14, 2026

Update SGLang image and add launch option

6db622c

Updated SGLang image version and added server launch option.

hshrivastava-droid added 2 commits May 18, 2026 16:24

fix runner

03b8a8b

Merge branch 'main' into nv/glm5-fp8-h200-sglang-v2

829cbc5

hshrivastava-droid changed the title ~~[Do Not Merge][NV] GLM5 fp8 update sglang container~~ [NV] GLM5 fp8 update sglang container May 19, 2026

kedarpotdar-nv approved these changes May 19, 2026

kedarpotdar-nv changed the title ~~[NV] GLM5 fp8 update sglang container~~ [NV] H200 GLM5 fp8 update sglang container May 19, 2026

hshrivastava-droid changed the title ~~[NV] H200 GLM5 fp8 update sglang container~~ [WIP][NV] H200 GLM5 fp8 update sglang container May 19, 2026

image update

b24a545

faradawn mentioned this pull request May 19, 2026

Update GLM-5 H200 FP8 sgl-project/sgl-cookbook#276

Closed

Ankur-singh changed the title ~~[WIP][NV] H200 GLM5 fp8 update sglang container~~ [NV] H200 GLM5 fp8 update sglang container May 19, 2026

functionstackx requested changes May 19, 2026

faradawn mentioned this pull request May 19, 2026

Update GLM-5 H200 FP8 sgl-project/sglang#25814

Merged

Merge branch 'main' into nv/glm5-fp8-h200-sglang-v2

6063c6e

functionstackx merged commit 475218b into main May 19, 2026
3 of 5 checks passed

functionstackx deleted the nv/glm5-fp8-h200-sglang-v2 branch May 19, 2026 23:05

github-project-automation Bot moved this to Done in InferenceMAX Board May 19, 2026

4 participants
