[WIP] [NV] Qwen3.5 FP8 H200 SGLang #855

Merged: ankursingh-nv merged 14 commits into main from nv/h200-qwen35 on Mar 4, 2026

@kedarpotdar-nv
Collaborator

No description provided.

Comment on lines +32 to +55
--model "$MODEL" \
--host 0.0.0.0 \
--port "$PORT" \
--tp "$TP" \
--expert-parallel-size "$EP_SIZE" \
--reasoning-parser qwen3 \
--tool-call-parser qwen3_coder \
--enable-flashinfer-allreduce-fusion \
--max-running-requests 128 \
--chunked-prefill-size 16384 \
--decode-log-interval 1 \
--mem-fraction-static 0.8 \
--cuda-graph-max-bs "$CONC" \
--context-length "$MAX_SEQ_LEN" \
--max-prefill-tokens 16384 \
--kv-cache-dtype fp8_e4m3 \
--quantization fp8 \
--attention-backend flashinfer \
--stream-interval 50 \
--moe-runner-backend auto \
--tokenizer-worker-num 6 \
--mamba-ssm-dtype bfloat16 \
--disable-radix-cache \
--trust-remote-code \
Collaborator

Can you PR an SGLang cookbook for this? This is a lot of flags 😭
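
Until a cookbook entry exists, one way to keep this many flags reviewable is to group them into named Bash arrays and print the result before launching. This is a minimal sketch, not the PR's script. The flag names and literal values are copied from the excerpt above, but the grouping, the array names, and the placeholder defaults for `MODEL`, `PORT`, `TP`, `EP_SIZE`, `CONC`, and `MAX_SEQ_LEN` are assumptions made for illustration. The launcher command is outside the commented range, so it is left out here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder defaults (assumptions; the PR's real values are set elsewhere
# in the script and are not visible in the reviewed excerpt).
MODEL="${MODEL:-example-org/example-model}"
PORT="${PORT:-8000}"
TP="${TP:-8}"
EP_SIZE="${EP_SIZE:-1}"
CONC="${CONC:-64}"
MAX_SEQ_LEN="${MAX_SEQ_LEN:-8192}"

# Group names are hypothetical and exist only as a reading aid.
serving_args=(
  --model "$MODEL"
  --host 0.0.0.0
  --port "$PORT"
  --reasoning-parser qwen3
  --tool-call-parser qwen3_coder
  --tokenizer-worker-num 6
  --trust-remote-code
)
parallel_args=(
  --tp "$TP"
  --expert-parallel-size "$EP_SIZE"
  --enable-flashinfer-allreduce-fusion
  --moe-runner-backend auto
)
memory_args=(
  --mem-fraction-static 0.8
  --context-length "$MAX_SEQ_LEN"
  --disable-radix-cache
)
scheduler_args=(
  --max-running-requests 128
  --chunked-prefill-size 16384
  --max-prefill-tokens 16384
  --cuda-graph-max-bs "$CONC"
  --decode-log-interval 1
  --stream-interval 50
)
precision_args=(
  --kv-cache-dtype fp8_e4m3
  --quantization fp8
  --attention-backend flashinfer
  --mamba-ssm-dtype bfloat16
)

all_args=(
  "${serving_args[@]}"
  "${parallel_args[@]}"
  "${memory_args[@]}"
  "${scheduler_args[@]}"
  "${precision_args[@]}"
)

# Dry run: print one token per line instead of launching anything.
printf '%s\n' "${all_args[@]}"
```

To use it, the script's existing launch command would take `"${all_args[@]}"` in place of the backslash-continued flag list. Quoting the expansion keeps each value a single argument even if it contains spaces.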

@kedarpotdar-nv
Collaborator Author

@faradawn our SGL cookbook hero

@faradawn
faradawn commented Mar 4, 2026

SGLang cookbook PR created: sgl-project/sgl-cookbook#177. Let me know if there are any issues.

Collaborator

@functionstackx functionstackx left a comment

LGTM. Thanks, y'all!

@ankursingh-nv ankursingh-nv merged commit f65d6b4 into main Mar 4, 2026
95 of 100 checks passed
@ankursingh-nv ankursingh-nv deleted the nv/h200-qwen35 branch March 4, 2026 18:53

6 participants