Add haic0 patch for AMD kimi k2.5 MTP support #1108
haic0 wants to merge 2 commits into SemiAnalysisAI:main
Signed-off-by: haic0 <haichzha@gbt350-odcdh5-wbb3.png-odc.dcgpu>
chunfangamd left a comment
@haic0 could you please branch against the InferenceX repo directly and create the PR?
search-space:
- { tp: 8, conc-start: 4, conc-end: 64 }
- { tp: 4, conc-start: 4, conc-end: 64 }
- { tp: 8, conc-start: 4, conc-end: 64}
It's better to leave the unrelated lines unmodified
Thanks for the PR, @haic0. We haven't thought much about guidelines for MTP on models that don't natively ship with it, since we didn't expect to include it in InferenceX v3 and have previously rejected submissions for it (#1026 (review)). Do you think we should include it in InferenceX v3? The question that needs some thought is:
We already have 3 different models that natively ship with MTP (deepseek, glm5, qwen3.5). Is it worth spending time thinking about MTP for models that don't come natively with MTP?
kimik2.5-int4-mi300x-vllm-mtp:
  image: vllm/vllm-openai-rocm:v0.19.0
  model: moonshotai/Kimi-K2.5
  model-prefix: kimik2.5
  runner: mi300x
  precision: int4
  framework: vllm
  multinode: false
  seq-len-configs:
  - isl: 1024
    osl: 1024
    search-space:
    - { tp: 4, conc-start: 4, conc-end: 64, spec-decoding: mtp }
  - isl: 8192
    osl: 1024
    search-space:
    - { tp: 4, conc-start: 4, conc-end: 64, spec-decoding: mtp }

kimik2.5-int4-mi325x-vllm-mtp:
The corresponding file benchmarks/single_node/kimik2.5_int4_mi325x_mtp.sh is missing
    search-space:
    - { tp: 4, conc-start: 4, conc-end: 64, spec-decoding: mtp }

kimik2.5-int4-mi300x-vllm-mtp:
The corresponding file benchmarks/single_node/kimik2.5_int4_mi300x_mtp.sh is missing
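A rough sketch of how the two "missing file" findings could be caught before review. It assumes the script name is built as `<model-prefix>_<precision>_<runner>[_mtp].sh` under `benchmarks/single_node/`, which is inferred only from the two paths quoted in the comments and may not match the repo's actual convention. The helper names (`expected_script`, `uses_mtp`, `missing_scripts`) and the inline `EXAMPLE_CONFIGS` dict are hypothetical and stand in for the parsed YAML.

```python
from pathlib import Path

# Hypothetical stand-in for the parsed config file; values copied from the diff above.
EXAMPLE_CONFIGS = {
    "kimik2.5-int4-mi300x-vllm-mtp": {
        "model-prefix": "kimik2.5",
        "runner": "mi300x",
        "precision": "int4",
        "seq-len-configs": [
            {"isl": 1024, "osl": 1024, "search-space": [
                {"tp": 4, "conc-start": 4, "conc-end": 64, "spec-decoding": "mtp"}]},
        ],
    },
    "kimik2.5-int4-mi325x-vllm-mtp": {
        "model-prefix": "kimik2.5",
        "runner": "mi325x",
        "precision": "int4",
        "seq-len-configs": [
            {"isl": 8192, "osl": 1024, "search-space": [
                {"tp": 4, "conc-start": 4, "conc-end": 64, "spec-decoding": "mtp"}]},
        ],
    },
}


def uses_mtp(entry: dict) -> bool:
    """True if any search-space point in the entry sets spec-decoding: mtp."""
    return any(
        point.get("spec-decoding") == "mtp"
        for cfg in entry.get("seq-len-configs", [])
        for point in cfg.get("search-space", [])
    )


def expected_script(entry: dict) -> str:
    """Build the script path under the ASSUMED naming convention."""
    name = f"{entry['model-prefix']}_{entry['precision']}_{entry['runner']}"
    if uses_mtp(entry):
        name += "_mtp"
    return f"benchmarks/single_node/{name}.sh"


def missing_scripts(configs: dict, repo_root: str = ".") -> list[str]:
    """Return the expected script paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [
        path
        for path in (expected_script(entry) for entry in configs.values())
        if not (root / path).is_file()
    ]


if __name__ == "__main__":
    for path in missing_scripts(EXAMPLE_CONFIGS):
        print(f"missing: {path}")
```

Run from the repo root, this prints one line per config entry whose script is absent. Wiring it into CI would be a separate decision.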
No description provided.