-
The INT8 portions are interesting to me. That should make things faster and less memory-intensive for cards with Tensor cores, or for datacenter cards. Of course, on the CPU side it would help as well. Intel has been making their own tool to quantize to INT8 for their CPUs, and soon for Arc. ONNX seems to be a good intermediary format as well; TensorRT relies on it at the moment.
-
And this one: MultiDiffusion is now built into the latest diffusers. They've already added a "panorama" mode to the base diffusers library and are working on "region-based" composition, which will be another huge leap forward.
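For reference, a minimal sketch of the panorama mode, assuming a diffusers release recent enough to ship StableDiffusionPanoramaPipeline (the model id and sizes below are just the documented defaults):

```python
import torch
from diffusers import StableDiffusionPanoramaPipeline, DDIMScheduler

model_id = "stabilityai/stable-diffusion-2-base"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPanoramaPipeline.from_pretrained(
    model_id, scheduler=scheduler, torch_dtype=torch.float16
).to("cuda")

# MultiDiffusion fuses overlapping diffusion windows, so very wide canvases work
image = pipe("a photo of the dolomites", height=512, width=2048).images[0]
image.save("panorama.png")
```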
-
Perhaps these features can be used within webui as part of a diffusers extension.
Onnx Diffusers Pipeline
Lets you use SD models converted into ONNX format. ONNX is faster than PyTorch when running on CPU, and it also makes it easy to quantize your models: you trade some accuracy for lower compute and RAM requirements.
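A minimal sketch of loading such a pipeline, assuming diffusers plus onnxruntime are installed and the checkpoint ships pre-converted ONNX weights on its onnx revision:

```python
from diffusers import OnnxStableDiffusionPipeline

pipe = OnnxStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    revision="onnx",                  # pre-converted ONNX weights
    provider="CPUExecutionProvider",  # pure-CPU inference
)
image = pipe("a photo of an astronaut riding a horse",
             width=256, height=384).images[0]
image.save("astronaut.png")
```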
Example:
256x384: takes 5 min 24 sec to generate on my Android phone (SoC: Snapdragon 865+), non-optimized, running in Termux.
Upscaled (with an upscaler app):
![Real-CUGAN-se_0109_220507](https://user-images.githubusercontent.com/98228077/211454794-4279bb06-1ae2-49d9-bc09-eda78dd21a5b.png)
264x312 (dimensions in multiples of 8 are supported):
![](https://user-images.githubusercontent.com/98228077/211454120-2cac2a7e-39ed-4ed3-8096-73d77b6f48c4.png)
INT8 example:
Quality will be less consistent. Some cherry-picked examples showing miniSD doing people (a quantization sketch follows after these examples):
192x256
256x256
miniSD will likely work on any 64-bit Android phone.
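For the INT8 step, a hedged sketch using onnxruntime's dynamic quantizer; the UNet path below is a made-up example of where an exported model might live:

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="sd_onnx/unet/model.onnx",        # hypothetical path to the exported UNet
    model_output="sd_onnx/unet/model-int8.onnx",
    weight_type=QuantType.QUInt8,                 # 8-bit weights: less RAM, some quality loss
)
```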
Image Variations Model and Versatile Diffusion
Variations of a submitted input image.
https://huggingface.co/spaces/lambdalabs/stable-diffusion-image-variations
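A minimal sketch, assuming the Space is backed by the lambdalabs/sd-image-variations-diffusers checkpoint and diffusers' StableDiffusionImageVariationPipeline:

```python
from PIL import Image
from diffusers import StableDiffusionImageVariationPipeline

pipe = StableDiffusionImageVariationPipeline.from_pretrained(
    "lambdalabs/sd-image-variations-diffusers"
)
init = Image.open("input.jpg").convert("RGB").resize((512, 512))  # placeholder input
variations = pipe(image=init, num_images_per_prompt=4).images    # 4 variations of the input
```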
Versatile Diffusion can support image-to-text, image-variation, text-to-image, and text-variation.
https://huggingface.co/spaces/shi-labs/Versatile-Diffusion
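Versatile Diffusion is also in diffusers as a single multi-task pipeline; a sketch, assuming the shi-labs/versatile-diffusion checkpoint behind the Space:

```python
from PIL import Image
from diffusers import VersatileDiffusionPipeline

pipe = VersatileDiffusionPipeline.from_pretrained("shi-labs/versatile-diffusion")

text_out = pipe.text_to_image("a red fox in the snow").images[0]
img = Image.open("fox.jpg").convert("RGB")                # placeholder input
var_out = pipe.image_variation(img).images[0]
dual_out = pipe.dual_guided(prompt="oil painting", image=img,
                            text_to_image_strength=0.75).images[0]
```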
PaintByExample
https://huggingface.co/spaces/Fantasy-Studio/Paint-by-Example
Exemplar-based Image Editing with Diffusion Models.
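A minimal sketch via diffusers' PaintByExamplePipeline, assuming the Fantasy-Studio/Paint-by-Example weights behind the Space; the file names are placeholders:

```python
from PIL import Image
from diffusers import PaintByExamplePipeline

pipe = PaintByExamplePipeline.from_pretrained("Fantasy-Studio/Paint-by-Example")

init_image = Image.open("scene.png").convert("RGB").resize((512, 512))
mask_image = Image.open("mask.png").convert("RGB").resize((512, 512))  # white = area to replace
example = Image.open("reference.jpg").convert("RGB")                   # object to paint in

result = pipe(image=init_image, mask_image=mask_image,
              example_image=example).images[0]
```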
unclip - karlo
https://huggingface.co/spaces/kakaobrain/karlo
Karlo is a text-conditional diffusion model based on unCLIP, composed of prior, decoder, and super-resolution modules.
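A minimal sketch through diffusers' UnCLIPPipeline, assuming the kakaobrain/karlo-v1-alpha checkpoint that backs the Space:

```python
import torch
from diffusers import UnCLIPPipeline

pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha",
                                      torch_dtype=torch.float16).to("cuda")
# prior -> decoder -> super-resolution all happen inside a single call
image = pipe("a high-resolution photograph of a big red frog on a green leaf").images[0]
image.save("karlo.png")
```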
Imagic
https://huggingface.co/spaces/fffiloni/imagic-stable-diffusion
![image](https://user-images.githubusercontent.com/98228077/211397108-9126dcc0-0583-4bf9-b32f-26d30020f0c9.png)
Tweaks a user-submitted image very specifically, according to a prompt.
(this is not cross-attention control/prompt2prompt)
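There is an "imagic_stable_diffusion" community pipeline in the diffusers examples; here is a rough, assumption-laden sketch of how it is wired up (the train/alpha interface may differ by version, so treat this as an outline rather than a recipe):

```python
from PIL import Image
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="imagic_stable_diffusion",  # community pipeline from the diffusers examples
).to("cuda")

init_image = Image.open("dog.jpg").convert("RGB")          # placeholder input
pipe.train("a photo of a dog jumping", image=init_image)   # fine-tunes an embedding and the model
edited = pipe(alpha=1.2).images[0]                         # alpha blends between original and edit
```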
clip guided diffusion
CLIP-guided stable diffusion can help generate more realistic images by guiding stable diffusion at every denoising step with an additional CLIP model.
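A sketch using the "clip_guided_stable_diffusion" community pipeline; the LAION CLIP checkpoint below is one common choice, not the only option:

```python
import torch
from transformers import CLIPFeatureExtractor, CLIPModel
from diffusers import DiffusionPipeline

clip_id = "laion/CLIP-ViT-B-32-laion2B-s34B-b79K"
feature_extractor = CLIPFeatureExtractor.from_pretrained(clip_id)
clip_model = CLIPModel.from_pretrained(clip_id, torch_dtype=torch.float16)

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    custom_pipeline="clip_guided_stable_diffusion",  # community pipeline
    clip_model=clip_model,
    feature_extractor=feature_extractor,
    torch_dtype=torch.float16,
).to("cuda")

# clip_guidance_scale controls how hard CLIP pushes each denoising step
image = pipe("fantasy landscape", clip_guidance_scale=100).images[0]
```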
vq diffusion
https://huggingface.co/spaces/williamberman/vq-diffusion
A model that does better at following your prompt, putting your subject, background, and features in the right places.
Example:
(I failed to get good results, but try it; I might be doing something wrong.)
Prompt: a person scratching nose with finger, beside a towel, in the bathroom
VQ:
![image](https://user-images.githubusercontent.com/98228077/211408361-b3627598-3233-455b-9ecc-750489b14740.png)
SD:
![image](https://user-images.githubusercontent.com/98228077/211408389-28cbaa8a-3324-4e31-bf77-1fc52c126078.png)
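A minimal sketch, assuming the microsoft/vq-diffusion-ithq checkpoint behind the Space:

```python
import torch
from diffusers import VQDiffusionPipeline

pipe = VQDiffusionPipeline.from_pretrained("microsoft/vq-diffusion-ithq",
                                           torch_dtype=torch.float16).to("cuda")
image = pipe("a person scratching nose with finger, beside a towel, in the bathroom").images[0]
image.save("vq_out.png")
```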
repaint (unknown)
https://martin-danelljan.github.io/publication/repaint/
Masking method that works well with no prompt.
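A sketch of that prompt-free masked inpainting with diffusers' RePaintPipeline, using the unconditional face model from the diffusers docs; the file names are placeholders:

```python
from PIL import Image
from diffusers import RePaintPipeline, RePaintScheduler

scheduler = RePaintScheduler.from_pretrained("google/ddpm-ema-celebahq-256")
pipe = RePaintPipeline.from_pretrained("google/ddpm-ema-celebahq-256", scheduler=scheduler)

original = Image.open("face.png").convert("RGB").resize((256, 256))
mask = Image.open("mask.png").convert("RGB").resize((256, 256))  # white = keep, black = repaint

result = pipe(image=original, mask_image=mask,
              num_inference_steps=250, jump_length=10, jump_n_sample=10).images[0]
```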