(ci)(recipe): Add DeepSeek-R1 FP4 TP4 validation and DS recipe for SGLang-ATOM#614

Merged
valarLip merged 8 commits into main from yuhua/sgl-dsrecipe-fp4ci on May 12, 2026

Conversation

@zhuyuhua-v zhuyuhua-v commented Apr 20, 2026

Motivation

  • add DeepSeek-R1-FP4 TP4 coverage to SGLang-ATOM accuracy flows, including nightly/manual validation and dashboard metadata, with a 0.91 GSM8K threshold
  • align the DeepSeek-R1-FP8 TP4 GSM8K threshold to 0.91 across the ATOM SGLang PR and nightly accuracy workflows, to avoid spurious failures caused by run-to-run score fluctuation
  • add recipes/sglang_atom/DeepSeek-R1.md in the same style as the vLLM-ATOM recipe, covering server launch, benchmarking, accuracy validation, and profiling usage
  • update the aiter wheel download logic to align with PR #706 ([atom-vllm CI] align the aiter download logic with atom CI)

ATOM SGLang CI / Nightly / Benchmark Scope

| Scope | Workflow | Trigger | Cases | Purpose |
| --- | --- | --- | --- | --- |
| CI | .github/workflows/atom-sglang-test.yaml | PR to main (not draft, not closed) | 2 | PR SGLang accuracy smoke test |
| Nightly Accuracy | .github/workflows/atom-sglang-accuracy-validation.yaml | Daily at 18:00 UTC (02:00 Beijing time), or manual dispatch | 4 | Full SGLang GSM8K accuracy validation |
| Nightly Benchmark | .github/workflows/atom-sglang-benchmark.yaml | Daily at 17:00 UTC (01:00 Beijing time), or manual dispatch | nightly: 5 × 10 = 50 | SGLang serving performance benchmark |

Shared Accuracy Parameters

| Item | Value |
| --- | --- |
| SGLang ref | v0.5.10 |
| Task | gsm8k |
| Metric checked | results.gsm8k["exact_match,flexible-extract"] |
| Few-shot | 3 |
| LM Eval concurrency | 65 |
| Server args | --trust-remote-code --kv-cache-dtype fp8_e4m3 --mem-fraction-static 0.8 --page-size 1 --disable-radix-cache |
| Common env | SGLANG_AITER_FP8_PREFILL_ATTN=0, SGLANG_USE_AITER=1, ATOM_ENABLE_DS_QKNORM_QUANT_FUSION=1 |
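
The threshold check on the metric above can be sketched as follows. This is a minimal sketch, assuming the lm-eval output has been loaded into a dict with a top-level `results` key; the helper name `check_gsm8k` and the sample score are hypothetical and are not taken from the workflow.

```python
def check_gsm8k(results: dict, threshold: float) -> bool:
    """Return True when the flexible-extract exact-match score meets the threshold."""
    # Assumed layout: results -> gsm8k -> "exact_match,flexible-extract".
    score = results["results"]["gsm8k"]["exact_match,flexible-extract"]
    return score >= threshold


# Made-up score, for illustration only.
sample = {"results": {"gsm8k": {"exact_match,flexible-extract": 0.92}}}

print(check_gsm8k(sample, 0.91))  # compared against the TP4 threshold
print(check_gsm8k(sample, 0.93))  # compared against the TP8 threshold
```

With a hypothetical score of 0.92, the same run would pass at the TP4 threshold and fail at the TP8 threshold, which is why the thresholds are tracked per case.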

CI Cases

| Model | Weight | Runner | TP | Extra Args | Env Vars | Threshold |
| --- | --- | --- | --- | --- | --- | --- |
| DeepSeek-R1-FP8 TP4 | deepseek-ai/DeepSeek-R1-0528 | linux-atom-mi35x-4 | 4 | --tensor-parallel-size 4 | AITER_QUICK_REDUCE_QUANTIZATION=INT4; common env | 0.91 |
| DeepSeek-R1-FP4 TP4 | amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4 | linux-atom-mi35x-4 | 4 | --tensor-parallel-size 4 | AITER_QUICK_REDUCE_QUANTIZATION=INT4; common env | 0.91 |
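
To make the composition of shared and per-case settings concrete, here is a rough sketch that assembles the environment and command line for the FP4 TP4 case. The flags and variables are copied from the tables in this description; the `python3 -m sglang.launch_server --model-path` entrypoint and the helper name `build_launch` are assumptions, and the actual workflow may launch the server differently.

```python
import shlex

# Values copied from the "Shared Accuracy Parameters" table.
SHARED_SERVER_ARGS = (
    "--trust-remote-code --kv-cache-dtype fp8_e4m3 "
    "--mem-fraction-static 0.8 --page-size 1 --disable-radix-cache"
)
COMMON_ENV = {
    "SGLANG_AITER_FP8_PREFILL_ATTN": "0",
    "SGLANG_USE_AITER": "1",
    "ATOM_ENABLE_DS_QKNORM_QUANT_FUSION": "1",
}


def build_launch(weight: str, extra_args: str, extra_env: dict):
    """Merge common and per-case settings into (env, argv). Hypothetical helper."""
    env = {**COMMON_ENV, **extra_env}
    # Assumption: the server is started through sglang.launch_server.
    cmd = ["python3", "-m", "sglang.launch_server", "--model-path", weight]
    cmd += shlex.split(SHARED_SERVER_ARGS) + shlex.split(extra_args)
    return env, cmd


env, cmd = build_launch(
    "amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4",
    "--tensor-parallel-size 4",
    {"AITER_QUICK_REDUCE_QUANTIZATION": "INT4"},
)
print(" ".join(f"{k}={v}" for k, v in env.items()), shlex.join(cmd))
```

The sketch only builds and prints the command; it does not start a server.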

Nightly Accuracy Cases

| Model | Weight | Runner | TP | Extra Args | Threshold |
| --- | --- | --- | --- | --- | --- |
| DeepSeek-R1-FP8 TP4 | deepseek-ai/DeepSeek-R1-0528 | linux-atom-mi35x-4 | 4 | --tensor-parallel-size 4 | 0.91 |
| DeepSeek-R1-FP8 TP8 | deepseek-ai/DeepSeek-R1-0528 | linux-atom-mi35x-8 | 8 | --tensor-parallel-size 8 | 0.93 |
| DeepSeek-R1-FP4 TP4 | amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4 | linux-atom-mi35x-4 | 4 | --tensor-parallel-size 4 | 0.91 |
| DeepSeek-R1-FP4 TP8 | amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4 | linux-atom-mi35x-8 | 8 | --tensor-parallel-size 8 | 0.93 |

Benchmark Schedule

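The benchmark workflow currently supports two modes:

| Mode | Model Selection | Param Selection | Dashboard |
| --- | --- | --- | --- |
| Scheduled nightly | Automatically selects all 5 SGLang benchmark models | Default 10 param sets | Publishes by default |
| Manual dispatch | Models selected via checkboxes | param_lists input; defaults to 10 param sets | Controlled by publish_to_dashboard; default true |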

Schedule:

  • Cron: 0 17 * * *
  • Beijing time: 01:00 every night
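
The UTC-to-Beijing mapping for both schedules can be double-checked with a few lines of Python. This is a small sketch that assumes a fixed UTC+8 offset for Beijing time; the helper name `utc_hour_to_beijing` and the reference date are arbitrary.

```python
from datetime import datetime, timedelta, timezone

BEIJING = timezone(timedelta(hours=8))  # fixed UTC+8, no daylight saving


def utc_hour_to_beijing(hour: int) -> str:
    """Convert a UTC hour (as in the cron's hour field) to Beijing local time."""
    utc = datetime(2026, 1, 1, hour, 0, tzinfo=timezone.utc)  # arbitrary date
    local = utc.astimezone(BEIJING)
    day_shift = (local.date() - utc.date()).days
    return f"{local:%H:%M}" + (" (+1 day)" if day_shift else "")


print(utc_hour_to_beijing(17))  # benchmark cron: 0 17 * * *  -> 01:00 (+1 day)
print(utc_hour_to_beijing(18))  # accuracy nightly at 18:00 UTC -> 02:00 (+1 day)
```

Both jobs therefore run on the following calendar day in Beijing time.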

Benchmark Parameters

Default param sets:

| ISL | OSL | Concurrency | Random Range Ratio |
| --- | --- | --- | --- |
| 1024 | 1024 | 4, 8, 16, 32, 64 | 0.8 |
| 8192 | 1024 | 4, 8, 16, 32, 64 | 0.8 |

Benchmark command:

  • backend: sglang
  • dataset: random
  • num-prompts = concurrency * 10
  • num-warmups = 2 * concurrency
  • request-rate=inf
  • metrics: ttft,tpot,itl,e2el
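
The default parameter grid and the derived request counts can be enumerated as follows. This is an illustrative sketch only: the dictionary keys and the helper name `param_sets` are hypothetical, while the numeric values and the two formulas come from the description above.

```python
from itertools import product

# (ISL, OSL) pairs and concurrency levels from the default param sets.
SHAPES = [(1024, 1024), (8192, 1024)]
CONCURRENCIES = [4, 8, 16, 32, 64]
RANDOM_RANGE_RATIO = 0.8
NUM_BENCHMARK_MODELS = 5  # models selected by the scheduled nightly run


def param_sets() -> list:
    """Expand the grid and derive num-prompts / num-warmups per concurrency."""
    return [
        {
            "isl": isl,
            "osl": osl,
            "concurrency": c,
            "random_range_ratio": RANDOM_RANGE_RATIO,
            "num_prompts": c * 10,  # num-prompts = concurrency * 10
            "num_warmups": 2 * c,   # num-warmups = 2 * concurrency
        }
        for (isl, osl), c in product(SHAPES, CONCURRENCIES)
    ]


sets = param_sets()
print(len(sets))                         # 10 param sets
print(len(sets) * NUM_BENCHMARK_MODELS)  # 50 nightly benchmark cases
```

This reproduces the 5 × 10 = 50 case count listed for the nightly benchmark scope.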

Benchmark Models

| Model | Weight | Serve Args | Runner |
| --- | --- | --- | --- |
| DeepSeek-R1-0528 FP8 TP8 | deepseek-ai/DeepSeek-R1-0528 | --trust-remote-code --tensor-parallel-size 8 | atom-mi355-8gpu-oot-benchmark |
| DeepSeek-R1-0528 FP8 TP4 | deepseek-ai/DeepSeek-R1-0528 | --trust-remote-code --tensor-parallel-size 4 | atom-mi355-8gpu-oot-benchmark |
| DeepSeek-R1-0528-MXFP4 FP4 TP8 | amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4 | --trust-remote-code --tensor-parallel-size 8 | atom-mi355-8gpu-oot-benchmark |
| DeepSeek-R1-0528-MXFP4 FP4 TP4 | amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4 | --trust-remote-code --tensor-parallel-size 4 | atom-mi355-8gpu-oot-benchmark |
| DeepSeek-R1-0528-MXFP4 FP4 TP8 EP8 | amd/DeepSeek-R1-0528-MXFP4-MTP-MoEFP4 | --trust-remote-code --tensor-parallel-size 8 --expert-parallel-size 8 | atom-mi355-8gpu-oot-benchmark |

@ZLkanyo009 ZLkanyo009 marked this pull request as ready for review April 21, 2026 07:50
qichu-yun previously approved these changes Apr 21, 2026
wuhuikx previously approved these changes Apr 22, 2026
valarLip previously approved these changes Apr 23, 2026
@valarLip
Collaborator

image still wip?

Copilot AI review requested due to automatic review settings April 23, 2026 06:21
Contributor

Copilot AI left a comment


Pull request overview

Adds DeepSeek-R1 FP4 (MXFP4 weights) TP4 accuracy coverage to the ATOM SGLang CI/validation flows and documents how to run/benchmark/validate DeepSeek-R1 using the SGLang-ATOM backend.

Changes:

  • Add DeepSeek-R1 FP4 TP4 (MXFP4 checkpoint) to PR CI accuracy matrix and to nightly/manual accuracy validation matrix.
  • Align DeepSeek-R1 FP8 TP4 GSM8K accuracy threshold from 0.92 to 0.91 across workflows and dashboard model metadata.
  • Add an SGLang-ATOM DeepSeek-R1 recipe covering server launch, benchmarking, profiling, and GSM8K validation.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| recipes/sglang_atom/DeepSeek-R1.md | New SGLang-ATOM DeepSeek-R1 recipe (launch, benchmark, profiling, lm-eval). |
| .github/workflows/atom-sglang-test.yaml | Updates PR CI accuracy threshold and adds DeepSeek-R1 FP4 TP4 to the matrix. |
| .github/workflows/atom-sglang-accuracy-validation.yaml | Adds manual toggle + nightly coverage for DeepSeek-R1 FP4 TP4; aligns FP8 TP4 threshold. |
| .github/benchmark/sglang_models_accuracy.json | Adds/updates dashboard metadata for the two DeepSeek-R1 TP4 accuracy entries (thresholds, baseline fields). |


Comment thread .github/benchmark/sglang_models_accuracy.json
Comment thread recipes/atom_sglang/DeepSeek-R1.md Outdated
Comment thread recipes/atom_sglang/DeepSeek-R1.md
Comment thread .github/workflows/atom-sglang-test.yaml
Comment thread .github/workflows/atom-sglang-accuracy-validation.yaml
@zhuyuhua-v zhuyuhua-v dismissed stale reviews from wuhuikx, valarLip, and qichu-yun via 91f30ab April 23, 2026 09:18
@zhuyuhua-v zhuyuhua-v marked this pull request as draft April 24, 2026 05:24
@zhuyuhua-v zhuyuhua-v marked this pull request as ready for review April 24, 2026 05:26
Copilot AI review requested due to automatic review settings April 24, 2026 05:26
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.



Comment thread .github/workflows/atom-sglang-test.yaml Outdated
Comment thread .github/workflows/atom-sglang-accuracy-validation.yaml
Comment thread .github/workflows/atom-sglang-accuracy-validation.yaml
Comment thread .github/benchmark/sglang_models_accuracy.json
@zhuyuhua-v zhuyuhua-v marked this pull request as draft April 30, 2026 06:37
…Lang-ATOM

Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>
@zhuyuhua-v zhuyuhua-v force-pushed the yuhua/sgl-dsrecipe-fp4ci branch from f5d5175 to 1696e64 Compare May 11, 2026 06:31
Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>
Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>
@zhuyuhua-v zhuyuhua-v marked this pull request as ready for review May 11, 2026 07:13
Copilot AI review requested due to automatic review settings May 11, 2026 07:13
@zhuyuhua-v zhuyuhua-v requested a review from Yuechguo May 11, 2026 07:16
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Comment thread .github/workflows/atom-sglang-accuracy-validation.yaml
Comment thread .github/benchmark/sglang_models_accuracy.json
@zhuyuhua-v zhuyuhua-v marked this pull request as draft May 11, 2026 07:42
Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>
Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>
@zhuyuhua-v zhuyuhua-v marked this pull request as ready for review May 11, 2026 08:42
Copilot AI review requested due to automatic review settings May 11, 2026 08:42
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment thread .github/workflows/atom-sglang-test.yaml
Comment thread .github/benchmark/sglang_benchmark_models.json
@zhuyuhua-v
Collaborator Author

image still wip?

fixed in #747

@zhuyuhua-v zhuyuhua-v marked this pull request as ready for review May 11, 2026 09:00
Copilot AI review requested due to automatic review settings May 11, 2026 09:00
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Comment thread .github/workflows/atom-sglang-benchmark.yaml
Comment thread .github/benchmark/sglang_benchmark_models.json
Comment thread .github/workflows/atom-sglang-test.yaml
Comment thread .github/workflows/atom-sglang-test.yaml
Comment thread .github/benchmark/sglang_models_accuracy.json
Comment thread recipes/atom_sglang/DeepSeek-R1.md
Signed-off-by: zhuyuhua-v <yuhzhu@amd.com>
Copilot AI review requested due to automatic review settings May 12, 2026 09:00
@zhuyuhua-v zhuyuhua-v force-pushed the yuhua/sgl-dsrecipe-fp4ci branch from 476b5dd to ae99d0f Compare May 12, 2026 09:00
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.

Comment thread .github/workflows/atom-sglang-test.yaml
Comment thread .github/workflows/atom-sglang-test.yaml
Comment thread .github/workflows/atom-sglang-test.yaml
Comment thread .github/workflows/atom-sglang-accuracy-validation.yaml
Comment thread .github/workflows/atom-sglang-accuracy-validation.yaml
Comment thread recipes/atom_sglang/DeepSeek-R1.md
Comment thread recipes/atom_sglang/DeepSeek-R1.md
Comment thread .github/benchmark/sglang_models_accuracy.json
Comment thread .github/benchmark/sglang_benchmark_models.json
@zhuyuhua-v zhuyuhua-v requested review from valarLip and wuhuikx May 12, 2026 14:16
@valarLip valarLip merged commit c615b35 into main May 12, 2026
53 of 58 checks passed
@valarLip valarLip deleted the yuhua/sgl-dsrecipe-fp4ci branch May 12, 2026 14:18

5 participants