Josephasafg
Contributor

@Josephasafg Josephasafg commented Sep 29, 2025

Purpose

Added newly tuned Triton configs and re-tuned existing ones that caused Jamba to crash when used.

In particular, the file vllm/model_executor/layers/fused_moe/configs/E=16,N=7168,device_name=NVIDIA_H100_80GB_HBM3,dtype=int8_w8a16.json caused Jamba to crash, so it needed re-tuning.
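
For readers unfamiliar with the naming scheme, the config file name above encodes the tuning target as comma-separated key=value pairs. The sketch below splits such a name into its fields. The helper name parse_moe_config_name is hypothetical and not part of vLLM, and reading E as the expert count and N as the intermediate size is an assumption not stated in this PR.

```python
def parse_moe_config_name(path: str) -> dict:
    """Split a fused-MoE config file name into its key=value fields.

    Hypothetical helper for illustration only; it is not a vLLM API.
    """
    # Keep only the file name, then drop the .json suffix if present.
    name = path.rsplit("/", 1)[-1]
    if name.endswith(".json"):
        name = name[: -len(".json")]

    fields = {}
    for part in name.split(","):
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {part!r}")
        fields[key] = value

    # E and N are numeric in the file name seen in this PR
    # (assumed meaning: expert count and intermediate size).
    for key in ("E", "N"):
        if key in fields:
            fields[key] = int(fields[key])
    return fields


config_path = (
    "vllm/model_executor/layers/fused_moe/configs/"
    "E=16,N=7168,device_name=NVIDIA_H100_80GB_HBM3,dtype=int8_w8a16.json"
)
parsed = parse_moe_config_name(config_path)
print(parsed)
```

A config only applies when every field in its name matches the running setup, which is why a stale or badly tuned file for one exact combination can affect a specific model such as Jamba.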

Test Plan

Ran Jamba Mini and Jamba Large with the updated config files.

Test Result



Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
@Josephasafg Josephasafg marked this pull request as ready for review September 29, 2025 06:50
@Josephasafg Josephasafg requested a review from mgoin as a code owner September 29, 2025 06:50
@mgoin
Member

mgoin commented Sep 30, 2025

LGTM, thanks!

@vllm-bot vllm-bot merged commit 35fe398 into vllm-project:main Sep 30, 2025
5 checks passed
pdasigi pushed a commit to pdasigi/vllm that referenced this pull request Oct 2, 2025
…nd FP8 (vllm-project#25858)

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
…nd FP8 (#25858)

Signed-off-by: asafg <39553475+Josephasafg@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>