[NV] H200 GLM5 fp8 update sglang container #1033
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR there first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25872417396 |
Updated SGLang image version and added server launch option.
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25874676435 |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=25875894361 |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26113503153 |
|
@claude add PR description |
|
Claude finished @hshrivastava-droid's task in 37s: Adding PR Description
Added a PR description covering the three key changes in this PR.
Also included links to the validation run and a summary table of changed files. |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26118912054 |
|
SGLang cookbook: sgl-project/sgl-cookbook#276 |
|
@functionstackx could you please help review this? |
functionstackx left a comment
Can you make the doc update in the correct repo? https://github.com/sgl-project/sgl-cookbook is deprecated; https://github.com/sgl-project/sglang/tree/main/docs_new is the new repo.
|
Updated SGLang recipe: sgl-project/sglang#25814 |
|
/reuse-sweep-run |
|
see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26130688982 |
Summary
Update the GLM-5 FP8 H200 SGLang benchmark configuration and launch script:
- SGLang image: lmsysorg/sglang:v0.5.12-cu130 to lmsysorg/sglang:v0.5.11-cu129
- h200 to h200-dgxc
- Add --enable-flashinfer-allreduce-fusion to the server launch command for improved allreduce performance
Changed Files
- .github/configs/nvidia-master.yaml: glm5-fp8-h200-sglang
- benchmarks/single_node/glm5_fp8_h200.sh: --enable-flashinfer-allreduce-fusion flag
- perf-changelog.yaml
Validation
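For readers who want to see roughly where the new launch flag sits, here is a minimal dry-run sketch. It is not the contents of benchmarks/single_node/glm5_fp8_h200.sh. The model path, tensor-parallel size, and port are hypothetical placeholders. Only the --enable-flashinfer-allreduce-fusion flag comes from this PR. The script prints the command instead of starting a server.

```shell
#!/usr/bin/env bash
# Hypothetical values for illustration only; none of these come from the PR.
MODEL_PATH="${MODEL_PATH:-/models/GLM-5-FP8}"
TP_SIZE="${TP_SIZE:-8}"
PORT="${PORT:-30000}"

# Print the SGLang launch command as a single line (dry run).
build_launch_cmd() {
  printf '%s ' python3 -m sglang.launch_server \
    --model-path "$MODEL_PATH" \
    --tp "$TP_SIZE" \
    --port "$PORT" \
    --enable-flashinfer-allreduce-fusion
  printf '\n'
}

build_launch_cmd
```

To actually launch, the printed command would be run inside the SGLang container on the H200 node. Check the real script for the full set of options it uses.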