[Model] Pass param prefix to LLMHead #24862
Conversation
Signed-off-by: whx-sjtu <2952154980@qq.com>
Code Review
This pull request systematically addresses a potential bug related to quantization by ensuring the `prefix` parameter is passed to `ParallelLMHead` across all relevant models. The changes are consistent, straightforward, and correctly implemented. This is a valuable fix that enhances the robustness and correctness of the model quantization process. The changes look good.
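For reference, a minimal sketch of the pattern this PR applies, assuming vLLM's `ParallelLMHead` layer and `maybe_prefix` helper; the surrounding model class is illustrative, not taken from the diff:

```python
import torch.nn as nn

from vllm.model_executor.layers.vocab_parallel_embedding import ParallelLMHead
from vllm.model_executor.models.utils import maybe_prefix


class ExampleForCausalLM(nn.Module):  # hypothetical model, for illustration only
    def __init__(self, config, quant_config=None, prefix: str = ""):
        super().__init__()
        # Before this PR, many models built the head without a prefix, so
        # the quantization lookup saw an empty layer name:
        #   self.lm_head = ParallelLMHead(config.vocab_size,
        #                                 config.hidden_size,
        #                                 quant_config=quant_config)
        # After: the prefix is threaded through, e.g. "model.lm_head".
        self.lm_head = ParallelLMHead(
            config.vocab_size,
            config.hidden_size,
            quant_config=quant_config,
            prefix=maybe_prefix(prefix, "lm_head"),
        )
```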
Force-pushed from 3c216af to 80aea6b
The prefix is very useful in the custom quantization case.
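To make that concrete, here is a hypothetical custom quantization config that branches on the layer's prefix in `get_quant_method`, for example to leave `lm_head` unquantized; this assumes vLLM's `QuantizationConfig` base class and `UnquantizedEmbeddingMethod`, and it only works if the prefix actually reaches the layer:

```python
from typing import Optional

import torch

from vllm.model_executor.layers.quantization.base_config import (
    QuantizationConfig, QuantizeMethodBase)
from vllm.model_executor.layers.vocab_parallel_embedding import (
    UnquantizedEmbeddingMethod)


class MyCustomQuantConfig(QuantizationConfig):
    """Hypothetical config; the other abstract methods are omitted here."""

    def get_quant_method(self, layer: torch.nn.Module,
                         prefix: str) -> Optional[QuantizeMethodBase]:
        # If a model forgets to pass prefix to ParallelLMHead, this sees ""
        # and silently misclassifies the output head.
        if prefix.endswith("lm_head"):
            return UnquantizedEmbeddingMethod()
        return self._quantized_method_for(layer)  # hypothetical helper
```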
LGTM, thanks!
Signed-off-by: whx-sjtu <2952154980@qq.com>
Signed-off-by: whx-sjtu <2952154980@qq.com>
Signed-off-by: charlifu <charlifu@amd.com>
Purpose

`prefix` is an init parameter of `ParallelLMHead`, which `VocabParallelEmbedding` later uses in `get_quant_method` to resolve the quantization method. Currently only some models pass this parameter when initializing `ParallelLMHead`; omitting it can cause bugs when the quantization method is resolved. This PR completes the passing of the `prefix` parameter for all models. A simplified sketch of this lookup path follows.
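Roughly, the consumption side looks like the following simplified paraphrase of what `VocabParallelEmbedding.__init__` does with the prefix (a sketch, not the verbatim vLLM source):

```python
import torch.nn as nn


class VocabParallelEmbeddingSketch(nn.Module):
    """Simplified stand-in for vLLM's VocabParallelEmbedding."""

    def __init__(self, num_embeddings: int, embedding_dim: int,
                 quant_config=None, prefix: str = ""):
        super().__init__()
        if quant_config is not None:
            # The prefix (e.g. "model.lm_head") is the only identity the
            # quant config receives for this layer; an empty string here is
            # how the wrong quant method ends up being selected.
            self.quant_method = quant_config.get_quant_method(self, prefix=prefix)
        else:
            self.quant_method = None
```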
Test Plan

No new tests are needed.
Test Result
All existing tests should pass.
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.