Release v0.8.0: Versatile Diffusion - Text, Images and Variations All in One Diffusion Model · huggingface/diffusers

🙆‍♀️ New Models

VersatileDiffusion

VersatileDiffusion, released by SHI-Labs, is a unified multi-flow multimodal diffusion model that can perform multiple tasks, such as text2image, image variations, dual-guided (text+image) image generation, and image2text.

[Versatile Diffusion] Add versatile diffusion model by @patrickvonplaten @anton-l #1283
Make sure to install transformers from "main":

pip install git+https://github.com/huggingface/transformers

Then you can run:

from diffusers import VersatileDiffusionPipeline
import torch
import requests
from io import BytesIO
from PIL import Image

pipe = VersatileDiffusionPipeline.from_pretrained("shi-labs/versatile-diffusion", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# initial image
url = "https://huggingface.co/datasets/diffusers/images/resolve/main/benz.jpg"
response = requests.get(url)
image = Image.open(BytesIO(response.content)).convert("RGB")

# prompt
prompt = "a red car"

# text to image
image = pipe.text_to_image(prompt).images[0]

# image variation
image = pipe.image_variation(image).images[0]

# dual-guided (text + image) generation
image = pipe.dual_guided(prompt, image).images[0]

AltDiffusion

AltDiffusion is a multilingual latent diffusion model that supports text-to-image generation for 9 different languages: English, Chinese, Spanish, French, Japanese, Korean, Arabic, Russian and Italian.

Add AltDiffusion by @patrickvonplaten @patil-suraj #1299

Stable Diffusion Image Variations

StableDiffusionImageVariationPipeline by @justinpinkney is a stable diffusion model that takes an image as an input and generates variations of that image. It is conditioned on CLIP image embeddings instead of text.

StableDiffusionImageVariationPipeline by @patil-suraj #1365

Safe Latent Diffusion

Safe Latent Diffusion (SLD), released by the ml-research@TUDarmstadt group, is a new practical and sophisticated approach to prevent unsolicited content from being generated by diffusion models. One of the authors of the research contributed their implementation to diffusers.

Add Safe Stable Diffusion Pipeline by @manuelbrack #1244

VQ-Diffusion with classifier-free sampling

vq diffusion classifier free sampling by @williamberman #1294

LDM super resolution

LDM super resolution is a latent 4x super-resolution diffusion model released by CompVis.

Add LDM Super Resolution pipeline by @duongna21 #1116

CycleDiffusion

CycleDiffusion is a method that uses text-to-image diffusion models for image-to-image editing. It is capable of:

Zero-shot image-to-image translation with text-to-image diffusion models such as Stable Diffusion.
Traditional unpaired image-to-image translation with diffusion models trained on two related domains.

Add CycleDiffusion pipeline using Stable Diffusion by @ChenWu98 #888

CLIPSeg + StableDiffusionInpainting

Uses CLIPSeg to automatically generate a mask using segmentation, and then applies Stable Diffusion in-painting.

K-Diffusion wrapper

K-Diffusion Pipeline is a community pipeline that lets you use any sampler from K-diffusion with diffusers models.

[Community Pipelines] K-Diffusion Pipeline by @patrickvonplaten #1360

🌀 New SOTA Scheduler

DPMSolverMultistepScheduler is the 🧨 diffusers implementation of DPM-Solver++, a state-of-the-art scheduler that was contributed by one of the authors of the paper. This scheduler is able to achieve great quality in as few as 20 steps. It's a drop-in replacement for the default Stable Diffusion scheduler, so you can use it to essentially halve generation times. It works so well that we adopted it for the Stable Diffusion demo Spaces: https://huggingface.co/spaces/stabilityai/stable-diffusion, https://huggingface.co/spaces/runwayml/stable-diffusion-v1-5.

You can use it like this:

from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

repo_id = "runwayml/stable-diffusion-v1-5"
scheduler = DPMSolverMultistepScheduler.from_pretrained(repo_id, subfolder="scheduler")
stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler)

🌐 Better scheduler API

The example above also demonstrates how to load schedulers using a new API that is consistent with model loading and therefore more natural and intuitive.

You can load a scheduler using from_pretrained, as demonstrated above, or you can instantiate one from an existing scheduler configuration. This is a way to replace the scheduler of a pipeline that was previously loaded:

from diffusers import DiffusionPipeline, EulerDiscreteScheduler

pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
# build the new scheduler from the configuration of the one already loaded
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)

Read more about these changes in the documentation. See also the community pipeline that allows using any of the K-diffusion samplers with diffusers, as mentioned above!

🎉 Performance

We work relentlessly to incorporate performance optimizations and memory reduction techniques into 🧨 diffusers. These are two of the most noteworthy additions in this release:

Enable memory-efficient attention by default if xFormers is installed.
Use batched matmuls when possible.
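
As a rough illustration of the batched matmul idea, the sketch below computes attention scores once with a Python loop over heads and once with a single `torch.bmm` call. The tensor sizes and variable names are assumptions made up for this example. It is not the code diffusers uses internally.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
heads, tokens, dim = 8, 16, 32
query = torch.randn(heads, tokens, dim)
key = torch.randn(heads, tokens, dim)
scale = dim ** -0.5

# One matmul per head, issued from a Python loop.
looped = torch.stack(
    [scale * (query[h] @ key[h].transpose(0, 1)) for h in range(heads)]
)

# The same scores from a single batched matmul over all heads.
batched = scale * torch.bmm(query, key.transpose(1, 2))

same = torch.allclose(looped, batched, atol=1e-5)
print(batched.shape, same)
```

Both versions produce the same scores. The batched form hands the whole computation to one call instead of issuing one matmul per head.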

🎁 Quality of Life improvements

Fix/Enable all schedulers for in-painting
Easier loading of local pipelines
CPU offloading: multi-GPU support

📝 Changelog

Add multistep DPM-Solver discrete scheduler by @LuChengTHU in #1132
Remove warning about half precision on MPS by @pcuenca in #1163
Fix typo latens -> latents by @duongna21 in #1171
Fix community pipeline links by @pcuenca in #1162
[Docs] Add loading script by @patrickvonplaten in #1174
Fix dtype safety checker inpaint legacy by @patrickvonplaten in #1137
Community pipeline img2img inpainting by @vvvm23 in #1114
[Community Pipeline] Add multilingual stable diffusion to community pipelines by @juancopi81 in #1142
[Flax examples] Load text encoder from subfolder by @duongna21 in #1147
Link to Dreambooth blog post instead of W&B report by @pcuenca in #1180
Fix small typo by @pcuenca in #1178
[DDIMScheduler] fix noise device in ddim step by @patil-suraj in #1189
MPS schedulers: don't use float64 by @pcuenca in #1169
Warning for invalid options without "--with_prior_preservation" by @shirayu in #1065
[ONNX] Improve ONNXPipeline scheduler compatibility, fix safety_checker by @anton-l in #1173
Restore compatibility with deprecated StableDiffusionOnnxPipeline by @pcuenca in #1191
Update pr docs actions by @mishig25 in #1194
handle dtype xformers attention by @patil-suraj in #1196
[Scheduler] Move predict epsilon to init by @patrickvonplaten in #1155
add licenses to pipelines by @natolambert in #1201
Fix cpu offloading by @anton-l in #1177
Fix slow tests by @patrickvonplaten in #1210
[Flax] fix extra copy pasta 🍝 by @camenduru in #1187
[CLIPGuidedStableDiffusion] support DDIM scheduler by @patil-suraj in #1190
Fix layer names convert LDM script by @duongna21 in #1206
[Loading] Make sure loading edge cases work by @patrickvonplaten in #1192
Add LDM Super Resolution pipeline by @duongna21 in #1116
[Conversion] Improve conversion script by @patrickvonplaten in #1218
DDIM docs by @patrickvonplaten in #1219
apply repeat_interleave fix for mps to stable diffusion image2image pipeline by @jncasey in #1135
Flax tests: don't hardcode number of devices by @pcuenca in #1175
Improve documentation for the LPW pipeline by @exo-pla-net in #1182
Factor out encode text with Copied from by @patrickvonplaten in #1224
Match the generator device to the pipeline for DDPM and DDIM by @anton-l in #1222
[Tests] Fix mps+generator fast tests by @anton-l in #1230
[Tests] Adjust TPU test values by @anton-l in #1233
Add a reference to the name 'Sampler' by @apolinario in #1172
Fix Flax usage comments by @pcuenca in #1211
[Docs] improve img2img example by @ruanrz in #1193
[Stable Diffusion] Fix padding / truncation by @patrickvonplaten in #1226
Finalize stable diffusion refactor by @patrickvonplaten in #1269
Edited attention.py for older xformers by @Lime-Cakes in #1270
Fix wrong link in text2img fine-tuning documentation by @daspartho in #1282
[StableDiffusionInpaintPipeline] fix batch_size for mask and masked latents by @patil-suraj in #1279
Add UNet 1d for RL model for planning + colab by @natolambert in #105
Fix documentation typo for UNet2DModel and UNet2DConditionModel by @xenova in #1275
add source link to composable diffusion model by @nanliu1 in #1293
Fix incorrect link to Stable Diffusion notebook by @dhruvrnaik in #1291
[dreambooth] link to bitsandbytes readme for installation by @0xdevalias in #1229
Add Scheduler.from_pretrained and better scheduler changing by @patrickvonplaten in #1286
Add AltDiffusion by @patrickvonplaten in #1299
Better error message for transformers dummy by @patrickvonplaten in #1306
Revert "Update pr docs actions" by @mishig25 in #1307
[AltDiffusion] add tests by @patil-suraj in #1311
Add improved handling of pil by @patrickvonplaten in #1309
cpu offloading: mutli GPU support by @dblunk88 in #1143
vq diffusion classifier free sampling by @williamberman in #1294
doc string args shape fix by @kamalkraj in #1243
[Community Pipeline] CLIPSeg + StableDiffusionInpainting by @unography in #1250
Temporary local test for PIL_INTERPOLATION by @pcuenca in #1317
Fix gpu_id by @anton-l in #1326
integrate ort by @prathikr in #1110
[Custom pipeline] Easier loading of local pipelines by @patrickvonplaten in #1327
[ONNX] Support Euler schedulers by @anton-l in #1328
img2text Typo by @patrickvonplaten in #1329
add docs for multi-modal examples by @natolambert in #1227
[Flax] Fix loading scheduler from subfolder by @skirsten in #1319
Fix/Enable all schedulers for in-painting by @patrickvonplaten in #1331
Correct path to schedlure by @patrickvonplaten in #1322
Avoid nested fix-copies by @anton-l in #1332
Fix img2img speed with LMS-Discrete Scheduler by @NotNANtoN in #896
Fix the order of casts for onnx inpainting by @anton-l in #1338
Legacy Inpainting Pipeline for Onnx Models by @ctsims in #1237
Jax infer support negative prompt by @entrpn in #1337
Update README.md: IMAGIC example code snippet misspelling by @ki-arie in #1346
Update README.md: Minor change to Imagic code snippet, missing dir error by @ki-arie in #1347
Handle batches and Tensors in pipeline_stable_diffusion_inpaint.py:prepare_mask_and_masked_image by @vict0rsch in #1003
change the sample model by @shunxing1234 in #1352
Add bit diffusion [WIP] by @kingstut in #971
perf: prefer batched matmuls for attention by @Birch-san in #1203
[Community Pipelines] K-Diffusion Pipeline by @patrickvonplaten in #1360
Add Safe Stable Diffusion Pipeline by @manuelbrack in #1244
[examples] fix mixed_precision arg by @patil-suraj in #1359
use memory_efficient_attention by default by @patil-suraj in #1354
Replace logger.warn by logger.warning by @regisss in #1366
Fix using non-square images with UNet2DModel and DDIM/DDPM pipelines by @jenkspt in #1289
handle fp16 in UNet2DModel by @patil-suraj in #1216
StableDiffusionImageVariationPipeline by @patil-suraj in #1365
