Release v0.4.0 Better, faster, stronger! · huggingface/diffusers

🚗 Faster

We have thoroughly profiled our codebase and applied a number of incremental improvements that, when combined, provide a speed improvement of almost 3x.

On top of that, we now default to using the float16 format. It's much faster than float32 and, according to our tests, produces images with no discernible difference in quality. This beats the use of autocast, so the resulting code is cleaner!
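As a minimal sketch of loading Stable Diffusion in half precision without autocast: it assumes a CUDA GPU, installed `torch` and `diffusers` packages, and that the model repository publishes an `fp16` weights revision. `load_fp16_pipeline` is a hypothetical helper name used only for illustration. The imports live inside the function so that defining it does not trigger a model download.

```python
def load_fp16_pipeline(model_id="CompVis/stable-diffusion-v1-4", device="cuda"):
    """Load a Stable Diffusion pipeline with float16 weights (hypothetical helper)."""
    # Imported lazily so this snippet can be defined without torch/diffusers present.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        revision="fp16",            # assumption: the repo has an fp16 weights branch
        torch_dtype=torch.float16,  # keep the weights in half precision
    )
    # fp16 pipelines are meant to run on an accelerator, not on CPU (see #749).
    return pipe.to(device)
```

Calling `load_fp16_pipeline()` and then `pipe("a photo of an astronaut riding a horse")` should run inference directly in float16, with no `autocast` context manager around the call.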

🔑 `use_auth_token` no more

The recently released version of huggingface-hub automatically uses your access token if you are logged in, so you don't need to put it everywhere in your code. All you need to do is authenticate once using `huggingface-cli login` in your terminal and you're all set.

- pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
+ pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

We bumped the huggingface-hub version to 0.10.0 in our dependencies to achieve this.

🎈More flexible APIs

Schedulers now share a common, simpler API. This has allowed us to remove many conditionals and special cases in the rest of the code, including the pipelines. This is very important for us and for the users of 🧨 diffusers: we all gain clarity and a solid abstraction for schedulers. See the description in #719 for more details.

Please update any custom Stable Diffusion pipelines accordingly:

- if isinstance(self.scheduler, LMSDiscreteScheduler):
-     latents = latents * self.scheduler.sigmas[0]
+ latents = latents * self.scheduler.init_noise_sigma

- if isinstance(self.scheduler, LMSDiscreteScheduler):
-     sigma = self.scheduler.sigmas[i]
-     latent_model_input = latent_model_input / ((sigma**2 + 1) ** 0.5)
+ latent_model_input = self.scheduler.scale_model_input(latent_model_input, t)

- if isinstance(self.scheduler, LMSDiscreteScheduler):
-     latents = self.scheduler.step(noise_pred, i, latents, **extra_step_kwargs).prev_sample
- else:
-     latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs).prev_sample
+ latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs).prev_sample
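To illustrate why the unified API removes the special cases, here is a minimal, self-contained sketch with two toy schedulers. `ToySigmaScheduler`, `ToyUnitScheduler`, `ToyStepOutput`, `denoise` and `toy_model` are hypothetical stand-ins written for this note, not diffusers classes, and they operate on plain floats rather than tensors. Only the member names `init_noise_sigma`, `scale_model_input` and `step(...).prev_sample` mirror the snippets above. The update rule in `step` is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class ToyStepOutput:
    """Hypothetical stand-in for a scheduler output exposing `prev_sample`."""
    prev_sample: float


class ToySigmaScheduler:
    """Toy sigma-based (LMS-like) scheduler; not a diffusers class."""

    def __init__(self, sigmas):
        self.sigmas = list(sigmas)
        self.timesteps = list(range(len(self.sigmas)))
        # Mirrors `latents * self.scheduler.sigmas[0]` from the old pipeline code.
        self.init_noise_sigma = self.sigmas[0]

    def scale_model_input(self, sample, timestep):
        # Mirrors `latent_model_input / ((sigma**2 + 1) ** 0.5)`.
        sigma = self.sigmas[timestep]
        return sample / ((sigma**2 + 1) ** 0.5)

    def step(self, model_output, timestep, sample):
        # Placeholder update, chosen only to keep the example runnable.
        return ToyStepOutput(prev_sample=sample - 0.1 * model_output)


class ToyUnitScheduler:
    """Toy scheduler that needs no input scaling; not a diffusers class."""

    def __init__(self, num_steps):
        self.timesteps = list(range(num_steps))
        self.init_noise_sigma = 1.0

    def scale_model_input(self, sample, timestep):
        return sample  # identity: nothing to rescale

    def step(self, model_output, timestep, sample):
        return ToyStepOutput(prev_sample=sample - 0.1 * model_output)


def denoise(scheduler, initial_noise, model):
    """Scheduler-agnostic loop: no isinstance checks are needed."""
    latents = initial_noise * scheduler.init_noise_sigma
    for t in scheduler.timesteps:
        model_input = scheduler.scale_model_input(latents, t)
        noise_pred = model(model_input, t)
        latents = scheduler.step(noise_pred, t, latents).prev_sample
    return latents


def toy_model(x, t):
    """Hypothetical "model" that returns its input unchanged."""
    return x


result_sigma = denoise(ToySigmaScheduler([3.0, 1.0]), 1.0, toy_model)
result_unit = denoise(ToyUnitScheduler(2), 1.0, toy_model)
```

The same `denoise` function drives both schedulers because each one owns its own scaling logic, which is the point of the refactor.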

Pipeline callbacks. As a community project (h/t @jamestiotio!), diffusers pipelines can now invoke a callback function during generation, providing the latents at each step of the process. This makes it easier to perform tasks such as visualization, inspection, explainability and others the community may invent.
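A rough sketch of what such a callback could look like. It assumes the pipeline invokes the callback as `callback(step, timestep, latents)` every `callback_steps` steps, following #521; check the pipeline docstring for your installed version. `LatentRecorder` and `run_fake_pipeline` are hypothetical names, and the latter is only a stand-in loop so that the example runs without a model.

```python
class LatentRecorder:
    """Collects the (step, timestep, latents) triples passed to the callback."""

    def __init__(self):
        self.history = []

    def __call__(self, step, timestep, latents):
        self.history.append((step, timestep, latents))


def run_fake_pipeline(timesteps, callback=None, callback_steps=1):
    """Hypothetical stand-in for a pipeline's denoising loop (not diffusers code)."""
    latents = 1.0
    for i, t in enumerate(timesteps):
        latents = latents * 0.5  # placeholder for the real denoising update
        # Invoke the callback only on every `callback_steps`-th iteration.
        if callback is not None and i % callback_steps == 0:
            callback(i, t, latents)
    return latents


recorder = LatentRecorder()
run_fake_pipeline([999, 749, 499, 249], callback=recorder, callback_steps=2)
```

With a real pipeline, the same object would be passed along the lines of `pipe(prompt, callback=recorder, callback_steps=2)`, and `latents` would be a tensor instead of a float.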

🛠️ More tasks

Building on top of the previous foundations, this release incorporates several new tasks that have been adapted from research papers or community projects. These include:

Textual inversion. Makes it possible to quickly train a new concept or style and incorporate it into the vocabulary of Stable Diffusion. Hundreds of people have already created theirs, and they can be shared and combined together. See the training Colab to get started.
Dreambooth. Similar goal to textual inversion, but instead of creating a new item in the vocabulary it fine-tunes the model to make it learn a new concept. Training Colab.
Negative prompts. Another community effort led by @shirayu. The Stable Diffusion pipeline can now receive both a positive prompt (the one you want to create), and a negative prompt (something you want to drive the model away from). This opens up a lot of creative possibilities!
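One way to picture the effect of a negative prompt is through classifier-free guidance. The sketch below assumes the negative prompt's prediction takes the place of the empty-prompt (unconditional) branch, so guidance pushes the result toward the positive prompt and away from the negative one. `guided_noise` is a hypothetical helper that works on plain Python lists; it is not part of diffusers, and the numbers are made up for illustration.

```python
def guided_noise(noise_negative, noise_positive, guidance_scale):
    """Classifier-free guidance on plain lists.

    noise_negative: prediction conditioned on the negative (or empty) prompt.
    noise_positive: prediction conditioned on the positive prompt.
    """
    return [
        neg + guidance_scale * (pos - neg)
        for neg, pos in zip(noise_negative, noise_positive)
    ]


# Toy numbers, chosen only for illustration.
noise_positive = [1.0, 2.0]
noise_negative = [0.5, 3.0]
guided = guided_noise(noise_negative, noise_positive, guidance_scale=7.5)
```

In the pipeline itself this is exposed through the `negative_prompt` argument added in #549, for example `pipe(prompt, negative_prompt="blurry")`.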

🏃‍♀️ Under the hood changes to support better fine-tuning

Gradient checkpointing and 8-bit optimizers have been successfully applied to achieve Dreambooth fine-tuning in a Colab notebook! These updates will make it easier for diffusers to support general-purpose fine-tuning (coming soon!).

⚠️ Experimental: community pipelines

This is big, but it's still an experimental feature that may change in the future.

We are constantly amazed at the amount of imagination and creativity in the diffusers community, so we've made it easy to create custom pipelines and share them with others. You can write your own pipeline code, store it in 🤗 Hub, GitHub or your local filesystem and StableDiffusionPipeline.from_pretrained will be able to load and run it. Read more in the documentation.

We can't wait to see what new tasks the community creates!

💪 Quality of life fixes

Bug fixes, improved documentation, and better tests are all important to ensure diffusers is a high-quality codebase, and we always put a lot of effort into them. Several first-time contributors have helped here, and we are very grateful for their efforts!

🙏 Significant community contributions

The following people have made significant contributions to the library over the last release:

@Victarry – Add training example for DreamBooth (#554)
@jamestiotio – Add callback parameters for Stable Diffusion pipelines (#521)
@jachiam – Allow resolutions that are not multiples of 64 (#505)
@johnowhitaker – Adding pred_original_sample to SchedulerOutput for some samplers (#614)
@keturn – Interesting discussions and insights on many topics.

✏️ Change list

[Docs] Correct links by @patrickvonplaten in #432
[Black] Update black by @patrickvonplaten in #433
use torch.matmul instead of einsum in attnetion. by @patil-suraj in #445
Renamed variables from single letter to better naming by @daspartho in #449
Docs: fix installation typo by @daspartho in #453
fix table formatting for stable diffusion pipeline doc (add blank line) by @natolambert in #471
update expected results of slow tests by @kashif in #268
[Flax] Make room for more frameworks by @patrickvonplaten in #494
Fix disable_attention_slicing in pipelines by @pcuenca in #498
Rename test_scheduler_outputs_equivalence in model tests. by @pcuenca in #451
Scheduler docs update by @natolambert in #464
Fix scheduler inference steps error with power of 3 by @natolambert in #466
initial flax pndm schedular by @kashif in #492
Fix vae tests for cpu and gpu by @kashif in #480
[Docs] Add subfolder docs by @patrickvonplaten in #500
docs: bocken doc links for relative links by @jjmachan in #504
Removing .float() (autocast in fp16 will discard this (I think)). by @Narsil in #495
Fix MPS scheduler indexing when using mps by @pcuenca in #450
[CrossAttention] add different method for sliced attention by @patil-suraj in #446
Implement FlaxModelMixin by @mishig25 in #493
Karras VE, DDIM and DDPM flax schedulers by @kashif in #508
[UNet2DConditionModel, UNet2DModel] pass norm_num_groups to all the blocks by @patil-suraj in #442
Add init_weights method to FlaxMixin by @mishig25 in #513
UNet Flax with FlaxModelMixin by @pcuenca in #502
Stable diffusion text2img conversion script. by @patil-suraj in #154
[CI] Add stalebot by @anton-l in #481
Fix is_onnx_available by @SkyTNT in #440
[Tests] Test attention.py by @sidthekidder in #368
Finally fix the image-based SD tests by @anton-l in #509
Remove the usage of numpy in up/down sample_2d by @ydshieh in #503
Fix typos and add Typo check GitHub Action by @shirayu in #483
Quick fix for the img2img tests by @anton-l in #530
[Tests] Fix spatial transformer tests on GPU by @anton-l in #531
[StableDiffusionInpaintPipeline] accept tensors for init and mask image by @patil-suraj in #439
adding more typehints to DDIM scheduler by @vishnu-anirudh in #456
Revert "adding more typehints to DDIM scheduler" by @patrickvonplaten in #533
Add LMSDiscreteSchedulerTest by @sidthekidder in #467
[Download] Smart downloading by @patrickvonplaten in #512
[Hub] Update hub version by @patrickvonplaten in #538
Unify offset configuration in DDIM and PNDM schedulers by @jonatanklosko in #479
[Configuration] Better logging by @patrickvonplaten in #545
make fixup support by @younesbelkada in #546
FlaxUNet2DConditionOutput @flax.struct.dataclass by @mishig25 in #550
[Flax] fix Flax scheduler by @kashif in #564
JAX/Flax safety checker by @pcuenca in #558
Flax: ignore dtype for configuration by @pcuenca in #565
Remove check_tf_utils to avoid an unnecessary TF import for now by @anton-l in #566
Fix _upsample_2d by @ydshieh in #535
[Flax] Add Vae for Stable Diffusion by @patrickvonplaten in #555
[Flax] Solve problem with VAE by @patrickvonplaten in #574
[Tests] Upload custom test artifacts by @anton-l in #572
[Tests] Mark the ncsnpp model tests as slow by @anton-l in #575
[examples/community] add CLIPGuidedStableDiffusion by @patil-suraj in #561
Fix CrossAttention._sliced_attention by @ydshieh in #563
Fix typos by @shirayu in #568
Add from_pt argument in .from_pretrained by @younesbelkada in #527
[FlaxAutoencoderKL] rename weights to align with PT by @patil-suraj in #584
Fix BaseOutput initialization from dict by @anton-l in #570
Add the K-LMS scheduler to the inpainting pipeline + tests by @anton-l in #587
[flax safety checker] Use FlaxPreTrainedModel for saving/loading by @patil-suraj in #591
FlaxDiffusionPipeline & FlaxStableDiffusionPipeline by @mishig25 in #559
[Flax] Fix unet and ddim scheduler by @patrickvonplaten in #594
Fix params replication when using the dummy checker by @pcuenca in #602
Allow dtype to be specified in Flax pipeline by @pcuenca in #600
Fix flax from_pretrained pytorch weight check by @mishig25 in #603
Mv weights name consts to diffusers.utils by @mishig25 in #605
Replace dropout_prob by dropout in vae by @younesbelkada in #595
Add smoke tests for the training examples by @anton-l in #585
Add torchvision to training deps by @anton-l in #607
Return Flax scheduler state by @pcuenca in #601
[ONNX] Collate the external weights, speed up loading from the hub by @anton-l in #610
docs: fix Berkeley ref by @ryanrussell in #611
Handle the PIL.Image.Resampling deprecation by @anton-l in #588
Make flax from_pretrained work with local subfolder by @mishig25 in #608
[flax] 'dtype' should not be part of self._internal_dict by @mishig25 in #609
[UNet2DConditionModel] add gradient checkpointing by @patil-suraj in #461
docs: fix stochastic_karras_ve ref by @ryanrussell in #618
Adding pred_original_sample to SchedulerOutput for some samplers by @johnowhitaker in #614
docs: .md readability fixups by @ryanrussell in #619
Flax documentation by @younesbelkada in #589
fix docs: change sample to images by @AbdullahAlfaraj in #613
refactor: pipelines readability improvements by @ryanrussell in #622
Allow passing session_options for ORT backend by @cloudhan in #620
Fix breaking error: "ort is not defined" by @pcuenca in #626
docs: src/diffusers readability improvements by @ryanrussell in #629
Fix formula for noise levels in Karras scheduler and tests by @sgrigory in #627
[CI] Fix onnxruntime installation order by @anton-l in #633
Warning for too long prompts in DiffusionPipelines (Resolve #447) by @shirayu in #472
Fix docs link to train_unconditional.py by @AbdullahAlfaraj in #642
Remove deprecated torch_device kwarg by @pcuenca in #623
refactor: custom_init_isort readability fixups by @ryanrussell in #631
Remove inappropriate docstrings in LMS docstrings. by @pcuenca in #634
Flax pipeline pndm by @pcuenca in #583
Fix SpatialTransformer by @ydshieh in #578
Add training example for DreamBooth. by @Victarry in #554
[Pytorch] Pytorch only schedulers by @kashif in #534
[examples/dreambooth] don't pass tensor_format to scheduler. by @patil-suraj in #649
[dreambooth] update install section by @patil-suraj in #650
[DDIM, DDPM] fix add_noise by @patil-suraj in #648
[Pytorch] add dep. warning for pytorch schedulers by @kashif in #651
[CLIPGuidedStableDiffusion] remove set_format from pipeline by @patil-suraj in #653
Fix onnx tensor format by @anton-l in #654
Fix main: stable diffusion pipelines cannot be loaded by @pcuenca in #655
Fix the LMS pytorch regression by @anton-l in #664
Added script to save during textual inversion training. Issue 524 by @isamu-isozaki in #645
[CLIPGuidedStableDiffusion] take the correct text embeddings by @patil-suraj in #667
Update index.mdx by @tmabraham in #670
[examples] update transfomers version by @patil-suraj in #665
[gradient checkpointing] lower tolerance for test by @patil-suraj in #652
Flax from_pretrained: clean up mismatched_keys. by @pcuenca in #630
trained_betas ignored in some schedulers by @vishnu-anirudh in #635
Renamed x -> hidden_states in resnet.py by @daspartho in #676
Optimize Stable Diffusion by @NouamaneTazi in #371
Allow resolutions that are not multiples of 64 by @jachiam in #505
refactor: update ldm-bert config.json url closes #675 by @ryanrussell in #680
[docs] fix table in fp16.mdx by @NouamaneTazi in #683
Fix slow tests by @NouamaneTazi in #689
Fix BibText citation by @osanseviero in #693
Add callback parameters for Stable Diffusion pipelines by @jamestiotio in #521
[dreambooth] fix applying clip_grad_norm_ by @patil-suraj in #686
Flax: add shape argument to set_timesteps by @pcuenca in #690
Fix type annotations on StableDiffusionPipeline.call by @tasercake in #682
Fix import with Flax but without PyTorch by @pcuenca in #688
[Support PyTorch 1.8] Remove inference mode by @patrickvonplaten in #707
[CI] Speed up slow tests by @anton-l in #708
[Utils] Add deprecate function and move testing_utils under utils by @patrickvonplaten in #659
Checkpoint conversion script from Diffusers => Stable Diffusion (CompVis) by @jachiam in #701
[Docs] fix docstring for issue #709 by @kashif in #710
Update schedulers README.md by @tmabraham in #694
add accelerate to load models with smaller memory footprint by @piEsposito in #361
Fix typos by @shirayu in #718
Add an argument "negative_prompt" by @shirayu in #549
Fix import if PyTorch is not installed by @pcuenca in #715
Remove comments no longer appropriate by @pcuenca in #716
[train_unconditional] fix applying clip_grad_norm_ by @patil-suraj in #721
renamed x to meaningful variable in resnet.py by @i-am-epic in #677
[Tests] Add accelerate to testing by @patrickvonplaten in #729
[dreambooth] Using already created Path in dataset by @DrInfiniteExplorer in #681
Include CLIPTextModel parameters in conversion by @kanewallmann in #695
Avoid negative strides for tensors by @shirayu in #717
[Pytorch] pytorch only timesteps by @kashif in #724
[Scheduler design] The pragmatic approach by @anton-l in #719
Removing autocast for 35-25% speedup. (autocast considered harmful). by @Narsil in #511
No more use_auth_token=True by @patrickvonplaten in #733
remove use_auth_token from remaining places by @patil-suraj in #737
Replace messages that have empty backquotes by @pcuenca in #738
[Docs] Advertise fp16 instead of autocast by @patrickvonplaten in #740
remove use_auth_token from for TI test by @patil-suraj in #747
allow multiple generations per prompt by @patil-suraj in #741
Add back-compatibility to LMS timesteps by @anton-l in #750
update the clip guided PR according to the new API by @patil-suraj in #751
Raise an error when moving an fp16 pipeline to CPU by @anton-l in #749
Better steps deprecation for LMS by @anton-l in #753
