
Add INT8 Stable Diffusion through Optimum #1324

Open
wants to merge 5 commits into base: main

Conversation

hshen14
Contributor

@hshen14 hshen14 commented Nov 17, 2022

8-bit quantization is useful for improving inference performance. This PR adds INT8 quantization for Stable Diffusion through the Optimum-Intel quantization API, which is built on top of Intel Neural Compressor. The sample code is implemented in Optimum-Intel.
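For context, the core idea behind 8-bit quantization is mapping floating-point values onto the int8 range using a scale and zero point. The sketch below illustrates that arithmetic in plain Python; it is only an illustration of the underlying math, not the Optimum-Intel or Intel Neural Compressor API (the real integration lives in the Optimum-Intel sample code referenced above).

```python
# Minimal sketch of affine (asymmetric) INT8 quantization: floats are
# mapped to int8 via a per-tensor scale and zero point, then mapped back.
# Illustration only -- NOT the Optimum-Intel / Neural Compressor API.

def quantize(values, qmin=-128, qmax=127):
    """Quantize a list of floats to int8 with a per-tensor scale/zero-point."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale for constants
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 representation."""
    return [(v - zero_point) * scale for v in q]

weights = [-1.5, -0.3, 0.0, 0.7, 2.0]
q, s, z = quantize(weights)
approx = dequantize(q, s, z)
```

The round trip loses at most about one quantization step of precision per value, which is why INT8 inference can stay close to FP32 accuracy while using a quarter of the memory bandwidth per weight.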

@hshen14
Contributor Author

hshen14 commented Nov 17, 2022

@patrickvonplaten please review this one. Thanks.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@anton-l
Member

anton-l commented Nov 17, 2022

cc @echarlaix @michaelbenayoun

@anton-l
Member

anton-l commented Nov 17, 2022

Discussed with @echarlaix offline; it seems the neural-compressor + optimum integration will refactor its API quite soon. Should we hold off on the promotion until then?

@echarlaix
Contributor

Hi @hshen14,

Let's wait for the neural-compressor and optimum-intel refactoring before increasing visibility!

@hshen14
Contributor Author

hshen14 commented Nov 18, 2022

Hi @hshen14,

Let's wait for the neural-compressor and optimum-intel refactoring before increasing visibility!

Thanks @anton-l @echarlaix. Sure, let's do that.

Contributor

@patrickvonplaten patrickvonplaten left a comment


@echarlaix waiting until you give me the green light here :-)

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the "stale" label (Issues that haven't received updates) on Dec 24, 2022
@hshen14
Contributor Author

hshen14 commented Dec 24, 2022

Optimum-Intel is currently being upgraded to the INC v2.0 API. We will revisit this PR after the upgrade is done.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@Thomas-MMJ

bump to keep issue open

@hshen14
Contributor Author

hshen14 commented Jan 19, 2023

@echarlaix, do you think it's a good time to revisit this? Thanks.

@echarlaix
Contributor

Sure, I will work on it and open a PR on diffusers once everything is finalized. Does that work for you, @hshen14?

@patrickvonplaten patrickvonplaten added the "wip" label and removed the "stale" label (Issues that haven't received updates) on Mar 2, 2023
@CrazyBoyM

great job.

@hshen14
Contributor Author

hshen14 commented May 26, 2023

Sure, I will work on it and open a PR on diffusers once everything is finalized. Does that work for you, @hshen14?

That would work perfectly! Thanks @echarlaix

@Ender436

Is INT8 quantization still in the works? I would find it extremely helpful on some of the devices I'm trying to use, especially when running on CPU.

@patrickvonplaten
Contributor

cc @yiyixuxu @sayakpaul @DN6 here

@sayakpaul
Member

I think the better person to tag here would be @echarlaix.


9 participants