[Kernel][Moe Configs] Add more tuned triton configs for ExpertsInt8 and FP8 #25858

Josephasafg · 2025-09-29T06:24:39Z

Purpose

Added new tuned Triton configs and re-tuned some existing ones that caused Jamba to crash when they were used.

In particular, the file vllm/model_executor/layers/fused_moe/configs/E=16,N=7168,device_name=NVIDIA_H100_80GB_HBM3,dtype=int8_w8a16.json caused Jamba to crash, so it needed re-tuning.
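The config file name itself encodes which kernel shape and device a tuned config applies to. As a minimal sketch (not vLLM's own code), the name can be split into its fields as shown below. The helper parse_moe_config_name is hypothetical, and reading E as the expert count and N as an intermediate dimension is an assumption based on the naming, not something this PR states.

```python
import re
from typing import Dict, Optional, Union

# Pattern for names such as
# "E=16,N=7168,device_name=NVIDIA_H100_80GB_HBM3,dtype=int8_w8a16.json".
# The dtype field is treated as optional (an assumption).
_CONFIG_NAME = re.compile(
    r"^E=(?P<E>\d+),N=(?P<N>\d+),device_name=(?P<device_name>[^,]+?)"
    r"(?:,dtype=(?P<dtype>[^,]+?))?\.json$"
)


def parse_moe_config_name(path: str) -> Dict[str, Union[int, str, None]]:
    """Hypothetical helper: split a tuned-config file name into its fields."""
    name = path.rsplit("/", 1)[-1]  # drop any directory prefix
    match = _CONFIG_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized config file name: {name!r}")
    dtype: Optional[str] = match["dtype"]
    return {
        "E": int(match["E"]),
        "N": int(match["N"]),
        "device_name": match["device_name"],
        "dtype": dtype,
    }


fields = parse_moe_config_name(
    "vllm/model_executor/layers/fused_moe/configs/"
    "E=16,N=7168,device_name=NVIDIA_H100_80GB_HBM3,dtype=int8_w8a16.json"
)
print(fields)
```

A mismatch in any of these fields (for example a different GPU name) would mean this particular file does not apply to the run.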

Test Plan

Ran Jamba Mini and Jamba Large with these config files.
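Before a full model run, a cheap static check of each config file can catch malformed entries early. The sketch below is illustrative only: it assumes each file maps a batch-size string to a dict of positive-integer kernel parameters, and the field names in ASSUMED_FIELDS are an assumption about the file layout rather than something shown in this PR. The function check_config is hypothetical. A check like this cannot detect a config that is well formed but crashes the kernel at run time, so it complements the model runs rather than replacing them.

```python
import json
from typing import Any, Dict, List

# Assumed kernel-parameter names; adjust to match the actual config files.
ASSUMED_FIELDS = (
    "BLOCK_SIZE_M",
    "BLOCK_SIZE_N",
    "BLOCK_SIZE_K",
    "GROUP_SIZE_M",
    "num_warps",
    "num_stages",
)


def check_config(config: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems found in one config."""
    problems: List[str] = []
    for key, entry in config.items():
        # Keys are expected to be batch sizes written as decimal strings.
        if not key.isdigit() or int(key) <= 0:
            problems.append(f"key {key!r} is not a positive integer batch size")
            continue
        if not isinstance(entry, dict):
            problems.append(f"entry for batch size {key} is not an object")
            continue
        for field in ASSUMED_FIELDS:
            value = entry.get(field)
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                problems.append(f"batch size {key}: {field} is missing or invalid")
    return problems


# Made-up values for illustration; not taken from the PR's config files.
sample = json.loads(
    '{"1": {"BLOCK_SIZE_M": 16, "BLOCK_SIZE_N": 64, "BLOCK_SIZE_K": 128,'
    ' "GROUP_SIZE_M": 1, "num_warps": 4, "num_stages": 4}}'
)
print(check_config(sample))
```

In practice the dict would come from json.load on each file under the configs directory.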

Test Result

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>

mgoin · 2025-09-30T14:30:35Z

LGTM, thanks!

Josephasafg added 3 commits September 29, 2025 09:19

Add new moe triton configs for experts_int8 and fp8

5f9d8ad

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>

Fix whitespaces

61e57a0

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>

Updated moe config

65a11ac

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>

Josephasafg marked this pull request as ready for review September 29, 2025 06:50

Josephasafg requested a review from mgoin as a code owner September 29, 2025 06:50

mgoin approved these changes Sep 30, 2025

vllm-bot merged commit 35fe398 into vllm-project:main Sep 30, 2025
5 checks passed

pdasigi pushed a commit to pdasigi/vllm that referenced this pull request Oct 2, 2025

[Kernel][Moe Configs] Add more tuned triton configs for ExpertsInt8 a…

c13967c

…nd FP8 (vllm-project#25858) Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>

yewentao256 pushed a commit that referenced this pull request Oct 3, 2025

[Kernel][Moe Configs] Add more tuned triton configs for ExpertsInt8 a…

d9f8ded

…nd FP8 (#25858) Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>
