Fix OV model for BLOOM architecture #340

Merged: 14 commits into main from fix-bloom on Jun 9, 2023
Conversation

@echarlaix (Collaborator) commented Jun 8, 2023

In this PR we patch the model before the ONNX / OpenVINO export for both the BLOOM and LLaMA architectures in order to fix the decoder attention mask generation.
Fix #327

cc @sammysun0711 @usstq
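For illustration, here is a minimal sketch of the general patching pattern described above: temporarily override the method that builds the decoder attention mask before tracing the model for export, then restore the original afterwards. The method name, replacement function, and export entry point below are illustrative assumptions, not the exact code added in this PR.

```python
from contextlib import contextmanager


@contextmanager
def patch_method(obj, name, replacement):
    """Temporarily bind `replacement` as `obj.<name>`, restoring the original afterwards."""
    had_instance_attr = name in obj.__dict__
    original = obj.__dict__.get(name)
    setattr(obj, name, replacement.__get__(obj))
    try:
        yield obj
    finally:
        if had_instance_attr:
            setattr(obj, name, original)
        else:
            delattr(obj, name)


def traceable_prepare_decoder_attention_mask(self, attention_mask, input_shape,
                                             inputs_embeds, past_key_values_length):
    # Hypothetical replacement: build the combined causal + padding mask without
    # data-dependent Python branching, so the traced graph stays correct when
    # past_key_values_length > 0 (the use_cache=True decoding path).
    raise NotImplementedError("illustrative placeholder")


# Usage (assuming `model.model` exposes `_prepare_decoder_attention_mask` as in
# the transformers LlamaModel of that era, and `export_fn` is the ONNX/OpenVINO
# export entry point being used):
# with patch_method(model.model, "_prepare_decoder_attention_mask",
#                   traceable_prepare_decoder_attention_mask):
#     export_fn(model)
```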

@HuggingFaceDocBuilderDev commented Jun 8, 2023

The documentation is not available anymore as the PR was closed or merged.

@echarlaix marked this pull request as ready for review June 9, 2023 12:35
@echarlaix merged commit 571f6c3 into main Jun 9, 2023 (12 checks passed)
@echarlaix deleted the fix-bloom branch June 9, 2023 16:59
@eaidova (Collaborator) commented Jun 13, 2023

@echarlaix I believe this fix is not complete. The function you patch for the LLaMA model can also be found in other model types, e.g. blenderbot and opt, so I assume more models may be affected here.

@echarlaix (Collaborator, Author) commented

Hi @eaidova! Sure, let me check it right now and add a fix! Currently, here are the model architectures we officially support; it could make sense to disable the export, or at least add a warning, when exporting other architectures.
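As a rough illustration of the warning suggested above, the sketch below checks the model type against a list of architectures the export patch was validated on. The function name and the architecture list are hypothetical, not optimum-intel's actual validation code.

```python
import warnings

# Illustrative list only; the set of officially supported decoder architectures
# is defined by optimum-intel itself.
PATCH_VALIDATED_ARCHS = {"bloom", "llama"}


def warn_if_unvalidated(config):
    """Warn when exporting an architecture the decoder-mask patch was not validated on."""
    model_type = getattr(config, "model_type", None)
    if model_type not in PATCH_VALIDATED_ARCHS:
        warnings.warn(
            f"Model type '{model_type}' has not been validated with the patched export; "
            "outputs generated with use_cache=True may differ from PyTorch."
        )
```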

Successfully merging this pull request may close these issues:
OV Bloomz with use_cache=True generated output differ from ORT and Pytorch