[AMD/ROCM] qwen3.5 fp4 on mi355x, search space TP2/TP4#1022
[AMD/ROCM] qwen3.5 fp4 on mi355x, search space TP2/TP4#1022
Conversation
Signed-off-by: seungrokj <seungrok.jung@amd.com>
|
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you |
perf-changelog.yaml
Outdated
| - qwen3.5-fp4-mi355x-sglang | ||
| description: | ||
| - "TP2 seach space for Qwen3.5 fp4 on SGL" | ||
| pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/xxxx |
There was a problem hiding this comment.
🟡 The new perf-changelog.yaml entry uses a placeholder pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/xxxx instead of the actual PR number. This should be updated to pull/1022 before merging to maintain changelog traceability. The description also contains a typo: "seach" should be "search".
Extended reasoning...
The perf-changelog.yaml entry added by this PR (lines 1327-1330) contains two issues. First and more importantly, the pr-link field uses the literal placeholder value https://github.com/SemiAnalysisAI/InferenceX/pull/xxxx rather than the actual PR number. Based on PR metadata, this PR is #1022, so the link should read https://github.com/SemiAnalysisAI/InferenceX/pull/1022.
The changelog serves as an audit trail linking configuration changes to their originating PRs. A placeholder xxxx in this field means that anyone reading the changelog cannot navigate to the PR that introduced the TP2 search space change for qwen3.5-fp4-mi355x-sglang. This breaks the traceability the changelog is designed to provide.
The specific code path is straightforward: the diff adds these lines at the end of perf-changelog.yaml:
- config-keys:
- qwen3.5-fp4-mi355x-sglang
description:
- "TP2 seach space for Qwen3.5 fp4 on SGL"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/xxxxExisting code does not prevent this because there is no automated validation of pr-link values — the pattern of using XXX or xxxx placeholders is a recurring manual workflow issue visible in other entries in the same file (e.g., glm5-fp8-mi355x-sglang, minimaxm2.5-fp8-h200-vllm, qwen3.5-bf16-mi325x-sglang). This PR's placeholder differs from the typical uppercase XXX pattern but is functionally equivalent.
The impact is limited to documentation quality: the changelog link for the TP4→TP2 search space change on qwen3.5-fp4-mi355x-sglang will be non-functional if the PR is merged as-is. A reader trying to understand why TP was changed from 4 to 2 cannot follow the pr-link to find context.
Additionally, the description contains a typo: "seach" should be "search".
Proof: The diff clearly shows pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/xxxx in the added lines. The PR metadata confirms the PR number is #1022. Replacing xxxx with 1022 and fixing the typo resolves both issues.
Signed-off-by: seungrokj <seungrok.jung@amd.com>
Signed-off-by: seungrokj <seungrok.jung@amd.com>
|
e2e perf/accuracy |
Hi @functionstackx @cquil11
This PR includes TP2 search space to better utilized mi355x's HBM footprint.
@ajith-sirra-amd proposed this. Thanks Ajigth
cc. @andyluo7 @chunfangamd
Regards,
Seungrok