[CI] 【Hackathon 9th Sprint No.21】NO.21 Add supplementary unit tests for a functional module #5066
base: develop
Add unit tests for Triton fused MoE backends with stubs for GPU/operator functionality.
Thanks for your contribution!
Pull Request Overview
This PR adds comprehensive unit tests for the Triton fused MoE backend module (fused_moe_triton_backend.py), achieving 85% code coverage. The tests use lightweight stubs and mocks to simulate GPU operations and Triton kernels, enabling testing without actual CUDA hardware.
Key changes:
- Implemented stub/mock framework for GPU operations and Triton kernels
- Added tests for four quantization methods: weight-only (wint8), wfp8afp8, tensor-wise FP8, and block-wise FP8
- Covered weight creation, loading, processing, and inference execution paths
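The stub approach described above can be illustrated with a minimal, standard-library-only sketch. It is not the code from this PR: the module name `fake_gpu_ops` and the function `fused_topk` are hypothetical placeholders for a GPU-only dependency. The idea is to register a fake module in `sys.modules` before the code under test is imported, and to restore `sys.modules` afterwards so the stub does not leak into other tests.

```python
import importlib
import sys
import types
from unittest import mock


def install_stub_module(name: str, **attrs) -> types.ModuleType:
    """Register a fake module under `name` so later imports resolve to it."""
    module = types.ModuleType(name)
    for key, value in attrs.items():
        setattr(module, key, value)
    sys.modules[name] = module
    return module


# `fake_gpu_ops` and `fused_topk` are hypothetical names used only for illustration.
# patch.dict restores sys.modules on exit, which removes the stub again.
with mock.patch.dict(sys.modules):
    install_stub_module(
        "fake_gpu_ops",
        fused_topk=lambda scores, k: sorted(scores, reverse=True)[:k],
    )
    # The import machinery checks sys.modules first, so this returns the stub.
    gpu_ops = importlib.import_module("fake_gpu_ops")
    top2 = gpu_ops.fused_topk([0.1, 0.7, 0.2], 2)

stub_removed = "fake_gpu_ops" not in sys.modules
```

Scoping the stub with `mock.patch.dict` is one possible design choice; a test module could equally install stubs at import time and leave them in place for the whole session.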
    """Unit tests for the Triton fused MoE backends.

    These tests install lightweight GPU/operator stubs so the real
    ``fastdeploy.model_executor.layers.moe.fused_moe_triton_backend`` module can be
    imported and exercised without CUDA kernels. The suites cover the weight-only,
    wfp8afp8, tensor-wise fp8, and block-wise fp8 quantization helpers to ensure the
    most important control-flow branches are validated while keeping the numerics
    deterministic and CPU friendly.
    """
Copilot AI, Nov 15, 2025
The test file is missing the standard Apache 2.0 copyright header that is consistently used across the project. Please add the copyright header at the beginning of the file (before the module docstring) following this format:

    # Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    def __init__(self):
        self.calls: list[dict] = []

    def __getitem__(self, grid):  # noqa: D401 - behavior mirrors kernel launch
Copilot AI, Nov 15, 2025
The noqa comment references 'D401' which is typically for imperative mood in docstrings, but this method has no docstring. Either add a docstring describing what this method does (e.g., 'Mock kernel launch behavior by recording call parameters.') or remove the noqa comment as it serves no purpose without a docstring.
Suggested change:

    - def __getitem__(self, grid):  # noqa: D401 - behavior mirrors kernel launch
    + def __getitem__(self, grid):  # noqa: D401 - behavior mirrors kernel launch
    +     """Mock kernel launch behavior by recording call parameters."""
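For context, Triton kernels are launched with the `kernel[grid](*args, **kwargs)` syntax, which is why the stub overrides `__getitem__`. The PR excerpt shows only the method signature, so the body below is an assumption about how such a recorder could work, and the class name `KernelStub` and the argument names are hypothetical.

```python
class KernelStub:
    """Hypothetical recorder that mimics the `kernel[grid](...)` launch syntax."""

    def __init__(self):
        self.calls: list[dict] = []

    def __getitem__(self, grid):
        """Mock kernel launch behavior by recording call parameters."""

        def launch(*args, **kwargs):
            # Record the launch instead of running anything on a GPU.
            self.calls.append({"grid": grid, "args": args, "kwargs": kwargs})

        return launch


kernel = KernelStub()
# Placeholder arguments; a real launch would pass tensors and constexpr values.
kernel[(4, 1)]("a_ptr", "b_ptr", BLOCK_SIZE_M=16)
```

A test can then assert on `kernel.calls` to check that the backend chose the expected grid and compile-time parameters, without any numerical work taking place.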
    is_checkpoint_bf16: bool = False
    weight_block_size: tuple[int, int] = (2, 2)

    def name(self):  # noqa: D401 - mimic FastDeploy quant config API
Copilot AI, Nov 15, 2025
The noqa comment references 'D401' but the method lacks a docstring. Either add a proper docstring (e.g., 'Return the quantization configuration name.') or remove the noqa comment.
Suggested change:

    - def name(self):  # noqa: D401 - mimic FastDeploy quant config API
    + def name(self):  # noqa: D401 - mimic FastDeploy quant config API
    +     """Return the quantization configuration name."""
        super().__init__()
        self.num_experts = num_experts

    def forward(self, x):  # noqa: D401 - deterministic gating scores
Copilot AI, Nov 15, 2025
The noqa comment references 'D401' but there is no docstring. Add a docstring describing the method's behavior (e.g., 'Generate deterministic gating scores for testing.') or remove the unnecessary noqa comment.
Suggested change:

    - def forward(self, x):  # noqa: D401 - deterministic gating scores
    + def forward(self, x):  # noqa: D401 - deterministic gating scores
    +     """Generate deterministic gating scores for testing."""
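To show why a deterministic gate is useful in MoE tests, here is a framework-free sketch. The PR's gate appears to be a layer subclass whose `forward` body is not shown in this excerpt, so the class `DeterministicGate`, the scoring rule, and the helper `top_k_experts` below are all assumptions made for illustration. The scores depend only on the number of experts, so the routing decision is known in advance and assertions stay stable.

```python
class DeterministicGate:
    """Hypothetical stand-in for a learned gate; scores ignore the input values."""

    def __init__(self, num_experts: int):
        self.num_experts = num_experts

    def forward(self, x: list[list[float]]) -> list[list[float]]:
        """Generate deterministic gating scores for testing."""
        # Expert j always receives score (j + 1) / num_experts.
        row = [(j + 1) / self.num_experts for j in range(self.num_experts)]
        # One copy of the score row per input token.
        return [list(row) for _ in x]


def top_k_experts(scores: list[float], k: int) -> list[int]:
    """Return the indices of the k highest-scoring experts, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


gate = DeterministicGate(num_experts=4)
scores = gate.forward([[0.3, -1.2], [5.0, 0.0]])  # two tokens
selected = [top_k_experts(row, k=2) for row in scores]
```

With this rule every token is routed to the two highest-indexed experts, so a test can state the expected expert assignment directly.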
…yers/moe/fused_moe_triton_backend.py
…e-backend Add stubs for fused MoE Triton backend tests
Motivation
NO.21: add supplementary unit tests for the functional module fastdeploy/model_executor/layers/moe/fused_moe_triton_backend.py
Modifications
Add tests/model_executor/test_fused_moe_triton_backend.py.
Usage or Command
tests/model_executor/test_fused_moe_triton_backend.py
Accuracy Tests
tests/model_executor/test_fused_moe_triton_backend.py
Checklist
- Tag list: [FDConfig],[APIServer],[Engine],[Scheduler],[PD Disaggregation],[Executor],[Graph Optimization],[Speculative Decoding],[RL],[Models],[Quantization],[Loader],[OP],[KVCache],[DataProcessor],[BugFix],[Docs],[CI],[Optimization],[Feature],[Benchmark],[Others],[XPU],[HPU],[GCU],[DCU],[Iluvatar],[Metax]
- Run pre-commit before commit.
- For the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.