-
The INT8 portions are interesting to me. That should make things faster and less memory-intensive for cards with Tensor cores, or for datacenter cards. Of course, on the CPU side it would help as well. Intel has been making their own tool to quantize to INT8 for their CPUs, and soon for Arc. ONNX seems to be a good intermediary format as well; TensorRT relies on it at the moment.
-
And this one: MultiDiffusion is now built into the latest diffusers. They've already added a "panorama" mode to the base diffusers library and are working on "region-based" composition, which will be another huge leap forward.
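For reference, a minimal sketch of the panorama mode, assuming a diffusers release recent enough to ship StableDiffusionPanoramaPipeline (the model id and sizes below are just the documented defaults):

```python
import torch
from diffusers import StableDiffusionPanoramaPipeline, DDIMScheduler

model_id = "stabilityai/stable-diffusion-2-base"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPanoramaPipeline.from_pretrained(
    model_id, scheduler=scheduler, torch_dtype=torch.float16
).to("cuda")

# MultiDiffusion fuses overlapping diffusion windows, so very wide canvases work
image = pipe("a photo of the dolomites", height=512, width=2048).images[0]
image.save("panorama.png")
```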
-
Perhaps these features can be used within webui as part of a diffusers extension.
Onnx Diffusers Pipeline
Lets you use SD models converted into ONNX format. ONNX is faster than PyTorch when running on CPU, and it also makes it easy to quantize your models: you trade some accuracy for lower compute and RAM requirements.
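A minimal sketch of loading such a pipeline, assuming diffusers plus onnxruntime are installed and the checkpoint ships pre-converted ONNX weights on its onnx revision:

```python
from diffusers import OnnxStableDiffusionPipeline

pipe = OnnxStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="onnx",                  # pre-converted ONNX weights
    provider="CPUExecutionProvider",  # pure-CPU inference
)
image = pipe("a photo of an astronaut riding a horse",
             width=256, height=384).images[0]
image.save("astronaut.png")
```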
Example:
256x384: takes 5 min 24 sec to generate on my Android phone (SoC: Snapdragon 865+), non-optimized, running in Termux.
Upscaled (with an upscaler app):
![Real-CUGAN-se_0109_220507](https://user-images.githubusercontent.com/98228077/211454794-4279bb06-1ae2-49d9-bc09-eda78dd21a5b.png)
264x312 (dimensions in multiples of 8 are supported):
![](https://user-images.githubusercontent.com/98228077/211454120-2cac2a7e-39ed-4ed3-8096-73d77b6f48c4.png)
INT8 example:
Quality will be less consistent. Some cherry-picked examples showing miniSD doing people (a quantization sketch follows after these examples):
192x256
256x256
miniSD will likely work on any 64-bit Android phone.
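For the INT8 step, a hedged sketch using onnxruntime's dynamic quantizer; the UNet path below is a made-up example of where an exported model might live:

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="sd_onnx/unet/model.onnx",        # hypothetical path to the exported UNet
    model_output="sd_onnx/unet/model-int8.onnx",
    weight_type=QuantType.QUInt8,                 # 8-bit weights: less RAM, some quality loss
)
```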
Image Variations Model and Versatile Diffusion
Variations of a submitted input image.
https://huggingface.co/spaces/lambdalabs/stable-diffusion-image-variations
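A minimal sketch, assuming the Space is backed by the lambdalabs/sd-image-variations-diffusers checkpoint and diffusers' StableDiffusionImageVariationPipeline:

```python
from PIL import Image
from diffusers import StableDiffusionImageVariationPipeline

pipe = StableDiffusionImageVariationPipeline.from_pretrained(
    "lambdalabs/sd-image-variations-diffusers"
)
init = Image.open("input.jpg").convert("RGB").resize((512, 512))  # placeholder input
variations = pipe(image=init, num_images_per_prompt=4).images    # 4 variations of the input
```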
Versatile Diffusion can support image-to-text, image-variation, text-to-image, and text-variation.
https://huggingface.co/spaces/shi-labs/Versatile-Diffusion
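Versatile Diffusion is also in diffusers as a single multi-task pipeline; a sketch, assuming the shi-labs/versatile-diffusion checkpoint behind the Space:

```python
from PIL import Image
from diffusers import VersatileDiffusionPipeline

pipe = VersatileDiffusionPipeline.from_pretrained("shi-labs/versatile-diffusion")

text_out = pipe.text_to_image("a red fox in the snow").images[0]
img = Image.open("fox.jpg").convert("RGB")                # placeholder input
var_out = pipe.image_variation(img).images[0]
dual_out = pipe.dual_guided(prompt="oil painting", image=img,
                            text_to_image_strength=0.75).images[0]
```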
PaintByExample
https://huggingface.co/spaces/Fantasy-Studio/Paint-by-Example
Exemplar-based Image Editing with Diffusion Models.
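A minimal sketch via diffusers' PaintByExamplePipeline, assuming the Fantasy-Studio/Paint-by-Example weights behind the Space; the file names are placeholders:

```python
from PIL import Image
from diffusers import PaintByExamplePipeline

pipe = PaintByExamplePipeline.from_pretrained("Fantasy-Studio/Paint-by-Example")

init_image = Image.open("scene.png").convert("RGB").resize((512, 512))
mask_image = Image.open("mask.png").convert("RGB").resize((512, 512))  # white = area to replace
example = Image.open("reference.jpg").convert("RGB")                   # object to paint in

result = pipe(image=init_image, mask_image=mask_image,
              example_image=example).images[0]
```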
unclip - karlo
https://huggingface.co/spaces/kakaobrain/karlo
Karlo is a text-conditional diffusion model based on unCLIP, composed of prior, decoder, and super-resolution modules.
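A minimal sketch through diffusers' UnCLIPPipeline, assuming the kakaobrain/karlo-v1-alpha checkpoint that backs the Space:

```python
import torch
from diffusers import UnCLIPPipeline

pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha",
                                      torch_dtype=torch.float16).to("cuda")
# prior -> decoder -> super-resolution all happen inside a single call
image = pipe("a high-resolution photograph of a big red frog on a green leaf").images[0]
image.save("karlo.png")
```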
Imagic
https://huggingface.co/spaces/fffiloni/imagic-stable-diffusion
![image](https://user-images.githubusercontent.com/98228077/211397108-9126dcc0-0583-4bf9-b32f-26d30020f0c9.png)
Tweaks a user-submitted image very specifically, according to a prompt.
(this is not cross-attention control/prompt2prompt)
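There is an "imagic_stable_diffusion" community pipeline in the diffusers examples; here is a rough, assumption-laden sketch of how it is wired up (the train/alpha interface may differ by version, so treat this as an outline rather than a recipe):

```python
from PIL import Image
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="imagic_stable_diffusion",  # community pipeline from the diffusers examples
).to("cuda")

init_image = Image.open("dog.jpg").convert("RGB")          # placeholder input
pipe.train("a photo of a dog jumping", image=init_image)   # fine-tunes an embedding and the model
edited = pipe(alpha=1.2).images[0]                         # alpha blends between original and edit
```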
clip guided diffusion
CLIP-guided stable diffusion can help generate more realistic images by guiding stable diffusion at every denoising step with an additional CLIP model.
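A sketch using the "clip_guided_stable_diffusion" community pipeline; the LAION CLIP checkpoint below is one common choice, not the only option:

```python
import torch
from transformers import CLIPFeatureExtractor, CLIPModel
from diffusers import DiffusionPipeline

clip_id = "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"
feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_id)
clip_model = CLIPModel.from_pretrained(clip_id, torch_dtype=torch.float16)

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="clip_guided_stable_diffusion",  # community pipeline
    clip_model=clip_model,
    feature_extractor=feature_extractor,
    torch_dtype=torch.float16,
).to("cuda")

# clip_guidance_scale controls how hard CLIP pushes each denoising step
image = pipe("fantasy landscape", clip_guidance_scale=100).images[0]
```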
vq diffusion
https://huggingface.co/spaces/williamberman/vq-diffusion
A model that does better at following your prompt, putting your subject, background, and features in the right places.
Example:
(I failed to get good results, but try it; I might be doing something wrong.)
Prompt: a person scratching nose with finger, beside a towel, in the bathroom
VQ:
![image](https://user-images.githubusercontent.com/98228077/211408361-b3627598-3233-455b-9ecc-750489b14740.png)
SD:
![image](https://user-images.githubusercontent.com/98228077/211408389-28cbaa8a-3324-4e31-bf77-1fc52c126078.png)
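A minimal sketch, assuming the microsoft/vq-diffusion-ithq checkpoint behind the Space:

```python
import torch
from diffusers import VQDiffusionPipeline

pipe = VQDiffusionPipeline.from_pretrained("microsoft/vq-diffusion-ithq",
                                           torch_dtype=torch.float16).to("cuda")
image = pipe("a person scratching nose with finger, beside a towel, in the bathroom").images[0]
image.save("vq_out.png")
```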
repaint (unknown)
https://martin-danelljan.github.io/publication/repaint/
Masking method that works well with no prompt.
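A sketch of that prompt-free masked inpainting with diffusers' RePaintPipeline, using the unconditional face model from the diffusers docs; the file names are placeholders:

```python
from PIL import Image
from diffusers import RePaintPipeline, RePaintScheduler

scheduler = RePaintScheduler.from_pretrained("google/ddpm-ema-celebahq-256")
pipe = RePaintPipeline.from_pretrained("google/ddpm-ema-celebahq-256", scheduler=scheduler)

original = Image.open("face.png").convert("RGB").resize((256, 256))
mask = Image.open("mask.png").convert("RGB").resize((256, 256))  # white = keep, black = repaint

result = pipe(image=original, mask_image=mask,
              num_inference_steps=250, jump_length=10, jump_n_sample=10).images[0]
```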