[NV] Update: sglang v2 Qwen3.5 h200 MTP by hshrivastava-droid · Pull Request #1017 · SemiAnalysisAI/InferenceX

hshrivastava-droid · 2026-04-08T22:57:16Z

Summary

Enable SGLang speculative decoding v2 (SGLANG_ENABLE_SPEC_V2=1) for the Qwen3.5 FP8 H200 MTP benchmark configuration.

Changes

benchmarks/single_node/qwen3.5_fp8_h200_mtp.sh: Set the SGLANG_ENABLE_SPEC_V2=1 environment variable on the sglang.launch_server command to enable the v2 speculative decoding engine for EAGLE-based multi-token prediction (MTP).
perf-changelog.yaml: Add a changelog entry for qwen3.5-fp8-h200-sglang-mtp documenting the spec v2 enablement.

Context

The Qwen3.5 FP8 H200 MTP benchmark uses EAGLE speculative decoding (3 speculative steps, 4 draft tokens, topk=1). This PR enables SGLang's v2 speculative decoding implementation via the SGLANG_ENABLE_SPEC_V2=1 flag, which is expected to improve MTP performance.
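For readers who have not seen the script, the following is a minimal sketch of how the flag might be applied to a launch command. It is not the contents of qwen3.5_fp8_h200_mtp.sh. The MODEL_PATH default (including the Qwen/ organization prefix), the DRY_RUN switch, and the LAUNCH_PREVIEW variable are assumptions made for illustration, and the real script likely passes more arguments (parallelism, memory, port) that this PR page does not show. The speculative settings mirror the values quoted above: 3 steps, 4 draft tokens, topk=1.

```shell
#!/bin/sh
# Hypothetical sketch only; the real benchmark script may differ.
set -eu

# Assumed model identifier; the PR only names "Qwen3.5-397B-A17B-FP8".
MODEL_PATH="${MODEL_PATH:-Qwen/Qwen3.5-397B-A17B-FP8}"

# Build the server command in the positional parameters so quoting stays safe.
# Speculative settings follow the PR context: 3 steps, 4 draft tokens, topk=1.
set -- python3 -m sglang.launch_server \
  --model-path "$MODEL_PATH" \
  --speculative-algorithm EAGLE \
  --speculative-num-steps 3 \
  --speculative-num-draft-tokens 4 \
  --speculative-eagle-topk 1

# Human-readable preview of the command, with the env var this PR adds.
LAUNCH_PREVIEW="SGLANG_ENABLE_SPEC_V2=1 $*"

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Default: print the command instead of starting a server.
  printf '%s\n' "$LAUNCH_PREVIEW"
else
  # Scope the variable to the server process only.
  exec env SGLANG_ENABLE_SPEC_V2=1 "$@"
fi
```

Setting the variable through env on the launch command, rather than exporting it globally, keeps the change scoped to the server process, which matches the PR's description of setting it "on the sglang.launch_server command".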

Updated PR link for Qwen3.5-397B-A17B-FP8 H200 SGLang MTP entry.

github-actions · 2026-04-08T22:57:25Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR there first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

hshrivastava-droid · 2026-04-09T18:21:00Z

@claude add PR description

Klaud-Cold · 2026-04-09T18:21:25Z

Claude finished @hshrivastava-droid's task in 38s.

Adding PR Description

Gather context from changed files and diff
Write and update PR description

PR description has been added. It covers:

Summary: Enabling SGLang speculative decoding v2 for the Qwen3.5 FP8 H200 MTP benchmark
Changes: The two modified files and what each change does
Context: Background on the EAGLE speculative decoding config and the purpose of the SGLANG_ENABLE_SPEC_V2=1 flag

jgangani

LGTM

functionstackx

cookbook plz 😭

Enable SGLang speculative decoding v2 engine for the Qwen3.5 FP8 H200 + MTP combination, as validated in SemiAnalysisAI/InferenceX#1017. Signed-off-by: Faradawn Yang <73060648+faradawn@users.noreply.github.com>

hshrivastava-droid · 2026-04-14T16:54:14Z

SGLang cookbook: sgl-project/sgl-cookbook#240

hshrivastava-droid · 2026-04-14T17:25:51Z

@functionstackx - could you please review this?

Oseltamivir

lgtm

hshrivastava-droid added 7 commits April 2, 2026 21:24

add: qwen3.5 MTP

0fe7931

Change PR link in perf-changelog.yaml

545ece0

Updated PR link for Qwen3.5-397B-A17B-FP8 H200 SGLang MTP entry.

Merge branch 'main' into nv/qwen35_h200_v2

041dbe4

update PR number

6a2d113

Merge branch 'main' into nv/qwen35_h200_v2

58af92d

Merge branch 'main' into nv/qwen35_h200_v2

2f9991f

update sglang command

2ed78ca

hshrivastava-droid requested a review from a team April 8, 2026 22:57

github-project-automation bot added this to InferenceMAX Board Apr 8, 2026

claude bot reviewed Apr 8, 2026

Comment thread perf-changelog.yaml Outdated

update PR number

0c8cc35

hshrivastava-droid added the NVIDIA and sweep-enabled labels Apr 8, 2026

hshrivastava-droid added 2 commits April 13, 2026 13:02

Merge branch 'main' into nv/qwen35_h200_v2

e141a82

update image

05eccf7

hshrivastava-droid requested review from jgangani and kedarpotdar-nv as code owners April 13, 2026 20:05

jgangani approved these changes Apr 13, 2026

hshrivastava-droid changed the title from "[WIP][NV] Update: sglang v2 Qwen3.5 h200 MTP" to "[NV] Update: sglang v2 Qwen3.5 h200 MTP" Apr 13, 2026

kedarpotdar-nv approved these changes Apr 14, 2026

functionstackx requested changes Apr 14, 2026

Comment thread benchmarks/single_node/qwen3.5_fp8_h200_mtp.sh

This was referenced Apr 14, 2026

Add SGLANG_ENABLE_SPEC_V2=1 for Qwen3.5 FP8 H200 MTP faradawn/sgl-cookbook#2

Closed

Add SGLANG_ENABLE_SPEC_V2=1 for Qwen3.5 FP8 H200 MTP sgl-project/sgl-cookbook#240

Open

Merge branch 'main' into nv/qwen35_h200_v2

e50b993

hshrivastava-droid requested a review from functionstackx April 14, 2026 17:24

Merge branch 'main' into nv/qwen35_h200_v2

8cca51e

Oseltamivir approved these changes Apr 14, 2026

Oseltamivir merged commit 6cb8291 into main Apr 14, 2026
4 checks passed

Oseltamivir deleted the nv/qwen35_h200_v2 branch April 14, 2026 20:19

github-project-automation bot moved this to Done in InferenceMAX Board Apr 14, 2026

This was referenced Apr 18, 2026

Add B200 config: qwen3.5-fp4-sglang-mtp #1075

Merged

Add B300 config: qwen3.5-fp4-sglang-mtp #1083

Merged

Add B300 config: qwen3.5-bf16-sglang-mtp #1082

Merged

Oseltamivir merged 12 commits into main from nv/qwen35_h200_v2

6 participants
