Add Kimi K2.5 INT4 vLLM benchmark for MI300X (TP8) by functionstackx · Pull Request #975 · SemiAnalysisAI/InferenceX

functionstackx · 2026-03-29T01:38:03Z

Closes #974

Add single-node MI300X config for Kimi K2.5 INT4 with vLLM ROCm v0.18.0, matching the existing MI325X recipe.

github-actions · 2026-03-29T01:38:12Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first-class so that the entire ML community can benefit from your hard work! Thank you.

functionstackx · 2026-03-29T02:21:48Z

@claude do v0.18.0 image..

Klaud-Cold · 2026-03-29T02:22:14Z

Claude finished @functionstackx's task in 1m 10s

Updated vLLM image to v0.18.0

- Read current files to identify image references
- Update image tag to v0.18.0 in amd-master.yaml
- Update perf-changelog.yaml description
- Commit and push changes

Updated the kimik2.5-int4-mi300x-vllm config image from vllm/vllm-openai-rocm:v0.16.0 to vllm/vllm-openai-rocm:v0.18.0 and updated the perf-changelog entry to match.
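For readers who want to reproduce this kind of tag bump locally, here is a minimal sketch in Python. It only does a guarded text substitution of the image reference. The YAML excerpt, its key names, and the helper `bump_image_tag` are hypothetical and are not taken from amd-master.yaml.

```python
import re


def bump_image_tag(config_text: str, repo: str, old_tag: str, new_tag: str) -> tuple[str, int]:
    """Replace `repo:old_tag` with `repo:new_tag`; return (new_text, replacements)."""
    # The negative lookahead keeps "v0.16.0" from matching inside a longer
    # tag such as "v0.16.0-rc1" or "v0.16.01".
    pattern = re.compile(re.escape(f"{repo}:{old_tag}") + r"(?![\w.-])")
    replacement = f"{repo}:{new_tag}"
    return pattern.subn(lambda _match: replacement, config_text)


# Hypothetical excerpt; the real layout of amd-master.yaml may differ.
sample = """\
kimik2.5-int4-mi300x-vllm:
  image: vllm/vllm-openai-rocm:v0.16.0
  tp: 8
"""

updated, count = bump_image_tag(sample, "vllm/vllm-openai-rocm", "v0.16.0", "v0.18.0")
print(f"replaced {count} image reference(s)")
print(updated)
```

Working on the raw text rather than parsing and re-dumping the YAML leaves comments and key ordering untouched, which keeps the resulting diff to the single changed line.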

benenzhu · 2026-03-29T02:43:41Z

/usr/local/lib/python3.12/dist-packages/huggingface_hub/file_download.py:805: UserWarning: Not enough free disk space to download the file. The expected file size is: 9809.05 MB. The target location /mnt/hf_hub_cache/models--moonshotai--Kimi-K2.5/blobs only has 12.58 MB free disk space.

Seems the disk is full.
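A pre-flight check along these lines could surface the problem before the download starts. This is a minimal sketch using only the Python standard library. The helper names are hypothetical, the use of decimal megabytes is an assumption, and the system temp directory stands in for the real cache path.

```python
import shutil
import tempfile
from pathlib import Path

BYTES_PER_MB = 1_000_000  # assumption: decimal megabytes


def free_space_mb(target_dir: str) -> float:
    """Free space, in MB, on the filesystem that would hold target_dir."""
    path = Path(target_dir).resolve()
    # shutil.disk_usage needs an existing path, so walk up to the nearest
    # ancestor that exists (the filesystem root always does).
    while not path.exists():
        path = path.parent
    return shutil.disk_usage(path).free / BYTES_PER_MB


def has_room_for(target_dir: str, expected_mb: float) -> bool:
    """True when target_dir's filesystem can hold a file of expected_mb."""
    return free_space_mb(target_dir) >= expected_mb


# Stand-in for the cache directory named in the warning above.
cache_dir = tempfile.gettempdir()
expected_mb = 9809.05  # expected file size reported in the warning

if has_room_for(cache_dir, expected_mb):
    print(f"{cache_dir}: enough space for a {expected_mb:.2f} MB file")
else:
    print(f"{cache_dir}: only {free_space_mb(cache_dir):.2f} MB free, "
          f"need {expected_mb:.2f} MB")
```

Running a check like this per shard would only confirm room for one file at a time. Checking against the total size of all files to be downloaded would be the stricter variant.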

functionstackx · 2026-03-29T02:46:48Z

> /usr/local/lib/python3.12/dist-packages/huggingface_hub/file_download.py:805: UserWarning: Not enough free disk space to download the file. The expected file size is: 9809.05 MB. The target location /mnt/hf_hub_cache/models--moonshotai--Kimi-K2.5/blobs only has 12.58 MB free disk space.
>
> Seems the disk is full.

@cquil11 can you clean up the storage / get more storage from AMD? :sad:

functionstackx requested a review from a team March 29, 2026 01:38

functionstackx requested review from billishyahao and chunfangamd as code owners March 29, 2026 01:38

github-project-automation bot added this to InferenceMAX Board Mar 29, 2026

functionstackx added the sweep-enabled label Mar 29, 2026

functionstackx mentioned this pull request Mar 29, 2026

Add Kimi-K2.5 INT4 vLLM v0.16.0 benchmark for MI300X #860

Closed

github-actions bot and others added 2 commits March 29, 2026 23:20

Add Kimi K2.5 INT4 vLLM benchmark for MI300X (TP8)

dc2f3d1

Add single-node MI300X config for Kimi K2.5 INT4 with vLLM ROCm v0.16.0, matching the existing MI325X recipe with AMD Andy Luo's optimizations. Closes #974 Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

Update Kimi K2.5 INT4 MI300X vLLM image from v0.16.0 to v0.18.0

49f1bd2

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

functionstackx force-pushed the claude/issue-974-20260329-0122 branch from 41ce571 to 49f1bd2 Compare March 30, 2026 03:20

functionstackx merged commit 2a2bb8c into main Mar 30, 2026
22 checks passed

functionstackx deleted the claude/issue-974-20260329-0122 branch March 30, 2026 07:51

github-project-automation bot moved this to Done in InferenceMAX Board Mar 30, 2026
