Add Kimi K2.5 INT4 vLLM benchmark for MI300X (TP8) #975
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you
@claude do v0.18.0 image..
Claude finished @functionstackx's task in 1m 10s
Updated vLLM image to v0.18.0
Updated the |
/usr/local/lib/python3.12/dist-packages/huggingface_hub/file_download.py:805: UserWarning: Not enough free disk space to download the file. The expected file size is: 9809.05 MB. The target location /mnt/hf_hub_cache/models--moonshotai--Kimi-K2.5/blobs only has 12.58 MB free disk space.
Seems the disk is full.
@cquil11 can u clean up the storage/ get more storage from AMD :sad: |
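For anyone hitting the same warning: a minimal sketch of a pre-flight check that could run before the model download, so the job fails fast instead of partway through. It uses only Python's standard `shutil.disk_usage`. The helper names `free_mb` and `has_room_for`, the 10% headroom, and the decimal-megabyte unit are assumptions for illustration, not part of this PR or of `huggingface_hub`.

```python
import shutil

# Assumption: sizes are compared in decimal megabytes (1 MB = 1,000,000 bytes).
BYTES_PER_MB = 1_000_000


def free_mb(path: str) -> float:
    """Free space, in MB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / BYTES_PER_MB


def has_room_for(path: str, expected_mb: float, headroom: float = 1.1) -> bool:
    """True if `path` has at least `expected_mb * headroom` MB free.

    `headroom` is a hypothetical safety factor, not a huggingface_hub setting.
    """
    return free_mb(path) >= expected_mb * headroom


if __name__ == "__main__":
    cache_dir = "."      # stand-in; the warning above refers to /mnt/hf_hub_cache
    needed_mb = 9809.05  # expected file size reported in the warning above
    if has_room_for(cache_dir, needed_mb):
        print(f"OK: {free_mb(cache_dir):.2f} MB free in {cache_dir}")
    else:
        print(f"Not enough space: {free_mb(cache_dir):.2f} MB free, "
              f"need about {needed_mb:.2f} MB")
```

Pointing `cache_dir` at the real cache mount on the runner would flag the full disk before the benchmark starts.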
Add single-node MI300X config for Kimi K2.5 INT4 with vLLM ROCm v0.16.0, matching the existing MI325X recipe with AMD Andy Luo's optimizations.
Closes #974
Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
41ce571 to 49f1bd2
Closes #974
Add single-node MI300X config for Kimi K2.5 INT4 with vLLM ROCm v0.18.0, matching the existing MI325X recipe.
Generated with Claude Code