We only run vLLM today. Add SGLang and TRT-LLM equivalents so we can compare engines.
Hardware
- 1× H100, 2× H100, 1× RTX 5090
Configs to add (in .github/configs/nvidia-master.yaml)
- gptoss-fp4-h100-1x-sglang
- dsr1qwen3-bf16-rtx5090-1x-sglang
- dsr1qwen3-fp8-rtx5090-1x-sglang
- TRT-LLM variants (for the RTX 5090, first confirm that TRT-LLM supports consumer Blackwell GPUs)
Each needs: config block + benchmark script + framework-appropriate image.
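
The config names listed above appear to follow a `model-precision-gpu-count-framework` pattern. As a rough sketch only, the helper below splits a name into those fields so that new entries can be checked for consistency. The field names and the `parse_config_name` helper are hypothetical and are not part of the repo. The sketch also assumes the framework suffix is a single token such as `sglang`, so a hyphenated suffix for TRT-LLM would need different handling.

```python
from typing import NamedTuple


class ConfigName(NamedTuple):
    # Hypothetical field names inferred from the listed config names.
    model: str
    precision: str
    gpu: str
    gpu_count: int
    framework: str


def parse_config_name(name: str) -> ConfigName:
    """Split a name like 'gptoss-fp4-h100-1x-sglang' into its five parts."""
    parts = name.split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 hyphen-separated fields, got {len(parts)}: {name!r}")
    model, precision, gpu, count, framework = parts
    # The GPU count field is written as '<n>x', for example '1x'.
    if not (count.endswith("x") and count[:-1].isdigit()):
        raise ValueError(f"GPU count must look like '1x', got {count!r}")
    return ConfigName(model, precision, gpu, int(count[:-1]), framework)


if __name__ == "__main__":
    for name in (
        "gptoss-fp4-h100-1x-sglang",
        "dsr1qwen3-bf16-rtx5090-1x-sglang",
        "dsr1qwen3-fp8-rtx5090-1x-sglang",
    ):
        print(parse_config_name(name))
```

A check like this could run before a benchmark job starts, so that a mistyped key fails early instead of partway through a run.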