Add final norm for LoRA models #1446
Conversation
Pull Request Overview
This PR fixes the handling of LoRA models by adding support for identifying the final normalization layer and by cleaning up the model builder classes. Key changes include renaming function parameters in the packed MatMul helpers so node naming stays consistent, and updating the final norm detection logic (has_final_norm) to handle PEFT models correctly.
Comments suppressed due to low confidence (1)
src/python/py/models/builder.py:920
- The parameter 'basename' replaces the previous 'name' parameter in this function; please ensure that all call sites are updated consistently so the MatMul node names do not mismatch.
def make_packed_matmul_fp16_or_fp32(self, q_matmul, k_matmul, v_matmul, basename, root_input, **kwargs):
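For context, here is a minimal, self-contained sketch of the naming convention the reviewer is pointing at: deriving all node names for a packed QKV MatMul from one shared basename, so call sites that pass a different value would produce mismatched names. The helper name make_packed_matmul_names and the suffix strings are assumptions for illustration, not the actual builder.py code.

```python
# Hypothetical helper illustrating why call sites must pass 'basename' consistently:
# every node name for the packed QKV MatMul is derived from the same prefix.
def make_packed_matmul_names(basename: str) -> dict:
    """Derive the node names a packed QKV MatMul might use from one basename."""
    return {
        "packed": basename,                # e.g. "/model/layers.0/attn/qkv_proj/MatMul"
        "q_slice": f"{basename}/Slice_Q",  # illustrative per-output names
        "k_slice": f"{basename}/Slice_K",
        "v_slice": f"{basename}/Slice_V",
    }

if __name__ == "__main__":
    for role, name in make_packed_matmul_names("/model/layers.0/attn/qkv_proj/MatMul").items():
        print(f"{role}: {name}")
```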
Address previous PR review comments from #1470 (#1473)
Address QNN specific regressions (#1470)
Fix array eos_token_id handling (#1463)
Constrained decoding integration (#1381)
Remove BF16 CPU from valid GQA configuration (#1469)
Avoid adding providers if not requested (#1464)
Persist provider options across ClearProviders, AppendProvider where possible (#1454)
Fix accuracy issues with Gemma models (#1448)
Add bfloat16 support in model builder (#1447)
Add final norm for LoRA models (#1446)
Update version to 0.8.0-rc3
---------
Co-authored-by: kunal-vaishnavi <115581922+kunal-vaishnavi@users.noreply.github.com>
Co-authored-by: Nenad Banfic <46795300+nenad1002@users.noreply.github.com>
Co-authored-by: Nenad Banfic <nebanfic@microsoft.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: Abhishek Jindal <abjindal@microsoft.com>
Co-authored-by: Ying Xiong <yingxiong@microsoft.com>
Co-authored-by: Michał Moskal <michal@moskal.me>
Co-authored-by: Kunal Vaishnavi <kvaishnavi@microsoft.com>
Description
This PR adds the missing pattern to identify the final norm layer in LoRA models. It also cleans up some of the classes in the model builder.
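As an illustration of the missing pattern, below is a minimal sketch of final-norm detection that also accepts PEFT-style module paths, which gain a base_model.model. prefix when a LoRA adapter wraps the base model. The function name and the exact path strings are assumptions for this sketch, not the code added in builder.py.

```python
# Hypothetical check for the final normalization layer. PEFT/LoRA wrapping
# (assumed here) prefixes module paths with "base_model.model.", so matching
# only "model.norm" would miss the final norm in LoRA models.
def is_final_norm(module_name: str) -> bool:
    candidates = (
        "model.norm",                   # plain Hugging Face transformer
        "base_model.model.model.norm",  # LoRA/PEFT-wrapped transformer (assumed path)
    )
    return module_name in candidates

print(is_final_norm("model.norm"))                      # True
print(is_final_norm("base_model.model.model.norm"))     # True
print(is_final_norm("model.layers.0.input_layernorm"))  # False
```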
Motivation and Context
The missing final norm layer in LoRA models caused the generated LoRA models to be incorrect.