Fix bf16 inference accuracy for mistral, phi3, dbrx #833
Conversation
Thanks @eaidova
Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
Force-pushed from 6cc926b to 492ad59
@echarlaix could you please merge? These changes also contribute to supporting the llama model changes in transformers 4.43.
A couple of tests are failing, can you take a look before we merge?
@echarlaix all of them are unrelated: gpt-bigcode failed due to an issue in the optimum patcher (not on our level), and optimum-cli failed due to the removal of the bloomz config from the default configs (which changes the group size, so the config is no longer suitable for testing with a small model) - I'll take a look separately at how we can update this test. For mistral/dbrx, I'm looking into it.
Thanks for fixing! Will take care of the fix for gpt-bigcode.
* Fix bf16 inference accuracy for mistral, phi3, dbrx
* reuse inv_freq
* Apply suggestions from code review
* make dim and base optional
* fix model patcher for dbrx and add bitwise fix for mistral

---------

Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
What does this PR do?
I found that the range of models requiring the changes from #783 is not limited to llama and gemma; the same issue, depending on the transformers version, also affects phi3, mistral, and dbrx.
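For context on why reusing a full-precision `inv_freq` matters: rotary-embedding frequencies lose precision when round-tripped through bfloat16, and the angle error grows with the token position. The following is a minimal pure-Python sketch of that effect (an illustration only, not the actual patch, which lives in this repo's model patchers); the `to_bf16` helper and the `dim`/`base` values are assumptions chosen to mirror typical RoPE defaults.

```python
import struct

def to_bf16(x: float) -> float:
    """Round a Python float to bfloat16 precision (float32 with a 7-bit mantissa)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # round-to-nearest-even on the 16 bits that bf16 discards
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

dim, base = 64, 10000.0

# inv_freq computed once in full precision (the value worth reusing)
inv_freq_fp32 = [base ** (-2 * i / dim) for i in range(dim // 2)]

# the same values after a round-trip through bf16
inv_freq_bf16 = [to_bf16(f) for f in inv_freq_fp32]

# at a large position the rotation-angle error becomes visible (in radians)
pos = 4096
worst = max(abs(pos * a - pos * b) for a, b in zip(inv_freq_fp32, inv_freq_bf16))
print(f"worst-case angle error at position {pos}: {worst:.4f} rad")
```

The per-element relative error of bf16 is only about 2^-9, but multiplied by a position index in the thousands it can shift rotation angles by whole radians, which is consistent with the accuracy degradation this PR addresses.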