Add INT8 Stable Diffusion through Optimum #1324
base: main
Conversation
@patrickvonplaten please review this one. Thanks.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Discussed with @echarlaix offline, seems that the
Hi @hshen14, Let's wait for
Thanks @anton-l @echarlaix. Sure, let's do that.
@echarlaix waiting until you give me the green light here :-)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Optimum-Intel is currently being upgraded to the INC v2.0 API. We will revisit this PR after the upgrade is done.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
bump to keep issue open
@echarlaix, do you think it's a good time to revisit this? Thanks.
Sure, I will work on it and open a PR on
Great job.
That would work perfectly! Thanks @echarlaix |
Is int8 quantization still in the works? I would find this extremely helpful on some of the devices I'm trying to use, especially when running on cpu. |
cc @yiyixuxu @sayakpaul @DN6 here |
I think the better person to tag here would be @echarlaix.
8-bit quantization helps improve inference performance. This PR adds INT8 quantization for Stable Diffusion through the Optimum-Intel quantization API, built on top of Intel Neural Compressor. The sample code is implemented in Optimum-Intel.
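To illustrate the core idea behind the PR, here is a minimal, self-contained sketch of symmetric per-tensor INT8 quantization, the basic transformation that Intel Neural Compressor applies to model weights. This is illustrative only: the function names (`quantize_int8`, `dequantize`) are invented for this sketch, and the real Optimum-Intel quantization API handles calibration, operator selection, and per-channel scales automatically.

```python
# Illustrative sketch of symmetric per-tensor INT8 quantization.
# Not the Optimum-Intel API: function names here are hypothetical.

def quantize_int8(values):
    """Map floats to int8 codes using a single symmetric scale."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Each dequantized value is within one quantization step (scale) of the original,
# which is why INT8 inference can stay close to FP32 accuracy while using
# a quarter of the memory bandwidth per weight.
```

In practice the rounding error per weight is bounded by half the scale, which is the accuracy/performance trade-off the PR exploits for Stable Diffusion inference on CPU.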