diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
index 150464b09795..d2583121418e 100644
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -186,13 +186,21 @@
- sections:
- local: api/configuration
title: Configuration
- - local: api/loaders
- title: Loaders
- local: api/logging
title: Logging
- local: api/outputs
title: Outputs
title: Main Classes
+ - sections:
+ - local: api/loaders/lora
+ title: LoRA
+ - local: api/loaders/single_file
+ title: Single files
+ - local: api/loaders/textual_inversion
+ title: Textual Inversion
+ - local: api/loaders/unet
+ title: UNet
+ title: Loaders
- sections:
- local: api/models/overview
title: Overview
diff --git a/docs/source/en/api/loaders.md b/docs/source/en/api/loaders.md
deleted file mode 100644
index d81b0eb1abcb..000000000000
--- a/docs/source/en/api/loaders.md
+++ /dev/null
@@ -1,49 +0,0 @@
-
-
-# Loaders
-
-Adapters (textual inversion, LoRA, hypernetworks) allow you to modify a diffusion model to generate images in a specific style without training or finetuning the entire model. The adapter weights are very portable because they're typically only a tiny fraction of the pretrained model weights. 🤗 Diffusers provides an easy-to-use `LoaderMixin` API to load adapter weights.
-
-
-
-🧪 The `LoaderMixin`s are highly experimental and prone to future changes. To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log-in with `huggingface-cli login`.
-
-
-
-## UNet2DConditionLoadersMixin
-
-[[autodoc]] loaders.UNet2DConditionLoadersMixin
-
-## TextualInversionLoaderMixin
-
-[[autodoc]] loaders.TextualInversionLoaderMixin
-
-## StableDiffusionXLLoraLoaderMixin
-
-[[autodoc]] loaders.StableDiffusionXLLoraLoaderMixin
-
-## LoraLoaderMixin
-
-[[autodoc]] loaders.LoraLoaderMixin
-
-## FromSingleFileMixin
-
-[[autodoc]] loaders.FromSingleFileMixin
-
-## FromOriginalControlnetMixin
-
-[[autodoc]] loaders.FromOriginalControlnetMixin
-
-## FromOriginalVAEMixin
-
-[[autodoc]] loaders.FromOriginalVAEMixin
diff --git a/docs/source/en/api/loaders/lora.md b/docs/source/en/api/loaders/lora.md
new file mode 100644
index 000000000000..05ff11afc5d4
--- /dev/null
+++ b/docs/source/en/api/loaders/lora.md
@@ -0,0 +1,32 @@
+
+
+# LoRA
+
+LoRA is a fast and lightweight training method that inserts and trains a significantly smaller number of parameters instead of all the model parameters. This produces a smaller file (~100 MB) and makes it easier to quickly train a model to learn a new concept. LoRA weights are typically loaded into the UNet, text encoder, or both. There are two classes for loading LoRA weights:
+
+- [`LoraLoaderMixin`] provides functions for loading and unloading, fusing and unfusing, enabling and disabling, and otherwise managing LoRA weights. This class can be used with any model.
+- [`StableDiffusionXLLoraLoaderMixin`] is a [Stable Diffusion XL (SDXL)](../../api/pipelines/stable_diffusion/stable_diffusion_xl) version of the [`LoraLoaderMixin`] class for loading and saving LoRA weights. It can only be used with the SDXL model.
+
+
+
+To learn more about how to load LoRA weights, see the [LoRA](../../using-diffusers/loading_adapters#lora) loading guide.
+
+
+
+## LoraLoaderMixin
+
+[[autodoc]] loaders.lora.LoraLoaderMixin
+
+## StableDiffusionXLLoraLoaderMixin
+
+[[autodoc]] loaders.lora.StableDiffusionXLLoraLoaderMixin
\ No newline at end of file
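
Review note on the LoRA page above: the claim that LoRA trains "a significantly smaller number of parameters" can be made concrete with a little arithmetic. The sketch below is illustrative only and is not part of this patch or of the Diffusers API; the function names and the 768x768 layer size are assumptions chosen for the example. It compares the parameter count of one dense weight matrix with the count of a rank-`r` pair of down and up projections.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Number of parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Number of parameters in a rank-`rank` low-rank update.

    The update is the product of an up matrix of shape (d_out, rank)
    and a down matrix of shape (rank, d_in).
    """
    return d_out * rank + rank * d_in


if __name__ == "__main__":
    # Hypothetical layer size, chosen only for illustration.
    d_out, d_in, rank = 768, 768, 4
    full = full_param_count(d_out, d_in)
    lora = lora_param_count(d_out, d_in, rank)
    print(f"full: {full}, lora: {lora}, ratio: {lora / full:.4f}")
```

Because the low-rank count grows linearly in the layer width while the dense count grows quadratically, the saved file stays small even when many layers are adapted.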
diff --git a/docs/source/en/api/loaders/single_file.md b/docs/source/en/api/loaders/single_file.md
new file mode 100644
index 000000000000..52e44606455b
--- /dev/null
+++ b/docs/source/en/api/loaders/single_file.md
@@ -0,0 +1,37 @@
+
+
+# Single files
+
+Diffusers supports loading pretrained pipeline (or model) weights stored in a single file, such as a `ckpt` or `safetensors` file. These single file types are typically produced from community-trained models. There are three classes for loading single file weights:
+
+- [`FromSingleFileMixin`] supports loading pretrained pipeline weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
+- [`FromOriginalVAEMixin`] supports loading a pretrained [`AutoencoderKL`] from pretrained ControlNet weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
+- [`FromOriginalControlnetMixin`] supports loading pretrained ControlNet weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
+
+
+
+To learn more about how to load single file weights, see the [Load different Stable Diffusion formats](../../using-diffusers/other-formats) loading guide.
+
+
+
+## FromSingleFileMixin
+
+[[autodoc]] loaders.single_file.FromSingleFileMixin
+
+## FromOriginalVAEMixin
+
+[[autodoc]] loaders.single_file.FromOriginalVAEMixin
+
+## FromOriginalControlnetMixin
+
+[[autodoc]] loaders.single_file.FromOriginalControlnetMixin
\ No newline at end of file
diff --git a/docs/source/en/api/loaders/textual_inversion.md b/docs/source/en/api/loaders/textual_inversion.md
new file mode 100644
index 000000000000..28d38ddb5bf2
--- /dev/null
+++ b/docs/source/en/api/loaders/textual_inversion.md
@@ -0,0 +1,27 @@
+
+
+# Textual Inversion
+
+Textual Inversion is a training method for personalizing models by learning new text embeddings from a few example images. The file produced from training is extremely small (a few KBs) and the new embeddings can be loaded into the text encoder.
+
+[`TextualInversionLoaderMixin`] provides a function for loading Textual Inversion embeddings from Diffusers and Automatic1111 into the text encoder and loading a special token to activate the embeddings.
+
+
+
+To learn more about how to load Textual Inversion embeddings, see the [Textual Inversion](../../using-diffusers/loading_adapters#textual-inversion) loading guide.
+
+
+
+## TextualInversionLoaderMixin
+
+[[autodoc]] loaders.textual_inversion.TextualInversionLoaderMixin
\ No newline at end of file
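
Review note on the Textual Inversion page above: the description (a special token plus new embeddings loaded into the text encoder) can be pictured with a toy model. The sketch below is a simplified assumption, not the Diffusers implementation: the vocabulary is a plain dict, the embedding table is a list of lists, and `add_token` is a hypothetical helper named for this example only.

```python
from typing import Dict, List


def add_token(
    vocab: Dict[str, int],
    embeddings: List[List[float]],
    token: str,
    vector: List[float],
) -> int:
    """Register `token` in the toy vocabulary and append its embedding row.

    Returns the id assigned to the new token.
    """
    if token in vocab:
        raise ValueError(f"token {token!r} is already in the vocabulary")
    if embeddings and len(vector) != len(embeddings[0]):
        raise ValueError("embedding size does not match the existing table")
    token_id = len(embeddings)  # the next free row of the table
    vocab[token] = token_id
    embeddings.append(list(vector))
    return token_id


if __name__ == "__main__":
    vocab = {"a": 0, "cat": 1}
    embeddings = [[0.0, 0.1], [0.2, 0.3]]
    new_id = add_token(vocab, embeddings, "<cat-toy>", [0.5, -0.5])
    print(new_id, vocab, embeddings)
```

Only the new row is learned during training, which is consistent with the few-KB file size mentioned in the page.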
diff --git a/docs/source/en/api/loaders/unet.md b/docs/source/en/api/loaders/unet.md
new file mode 100644
index 000000000000..df896a065eb3
--- /dev/null
+++ b/docs/source/en/api/loaders/unet.md
@@ -0,0 +1,27 @@
+
+
+# UNet
+
+Some training methods - like LoRA and Custom Diffusion - typically target the UNet's attention layers, but these training methods can also target other non-attention layers. Instead of training all of a model's parameters, only a subset of the parameters are trained, which is faster and more efficient. This class is useful if you're *only* loading weights into a UNet. If you need to load weights into the text encoder or a text encoder and UNet, try using the [`~loaders.LoraLoaderMixin.load_lora_weights`] function instead.
+
+The [`UNet2DConditionLoadersMixin`] class provides functions for loading and saving weights, fusing and unfusing LoRAs, disabling and enabling LoRAs, and setting and deleting adapters.
+
+
+
+To learn more about how to load LoRA weights, see the [LoRA](../../using-diffusers/loading_adapters#lora) loading guide.
+
+
+
+## UNet2DConditionLoadersMixin
+
+[[autodoc]] loaders.unet.UNet2DConditionLoadersMixin
\ No newline at end of file
diff --git a/src/diffusers/loaders/lora.py b/src/diffusers/loaders/lora.py
index 532a59f3b9bd..d260da9c1298 100644
--- a/src/diffusers/loaders/lora.py
+++ b/src/diffusers/loaders/lora.py
@@ -68,8 +68,7 @@
class LoraLoaderMixin:
r"""
- Load LoRA layers into [`UNet2DConditionModel`] and
- [`CLIPTextModel`](https://huggingface.co/docs/transformers/model_doc/clip#transformers.CLIPTextModel).
+ Load LoRA layers into [`UNet2DConditionModel`] and [`~transformers.CLIPTextModel`].
"""
text_encoder_name = TEXT_ENCODER_NAME
unet_name = UNET_NAME
@@ -94,12 +93,28 @@ def load_lora_weights(
Parameters:
pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`):
- See [`~loaders.LoraLoaderMixin.lora_state_dict`].
+ A string (model id of a pretrained model hosted on the Hub), a path to a directory containing the model
+ weights, or a [torch state
+ dict](https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict).
kwargs (`dict`, *optional*):
See [`~loaders.LoraLoaderMixin.lora_state_dict`].
adapter_name (`str`, *optional*):
- Adapter name to be used for referencing the loaded adapter model. If not specified, it will use
- `default_{i}` where i is the total number of adapters being loaded.
+ Name for referencing the loaded adapter model. If not specified, it will use `default_{i}` where `i` is
+ the total number of adapters being loaded. Must have PEFT installed to use.
+
+ Example:
+
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16).to(
+ "cuda"
+ )
+ pipeline.load_lora_weights(
+ "Yntec/pineappleAnimeMix", weight_name="pineappleAnimeMix_pineapple10.1.safetensors", adapter_name="anime"
+ )
+ ```
"""
# First, ensure that the checkpoint is a compatible one and can be successfully loaded.
state_dict, network_alphas = self.lora_state_dict(pretrained_model_name_or_path_or_dict, **kwargs)
@@ -137,15 +152,7 @@ def lora_state_dict(
**kwargs,
):
r"""
- Return state dict for lora weights and the network alphas.
-
-
-
- We support loading A1111 formatted LoRA checkpoints in a limited capacity.
-
- This function is experimental and might change in the future.
-
-
+ Return state dict and network alphas of the LoRA weights.
Parameters:
pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`):
@@ -153,8 +160,7 @@ def lora_state_dict(
- A string, the *model id* (for example `google/ddpm-celebahq-256`) of a pretrained model hosted on
the Hub.
- - A path to a *directory* (for example `./my_model_directory`) containing the model weights saved
- with [`ModelMixin.save_pretrained`].
+ - A path to a *directory* (for example `./my_model_directory`) containing the model weights.
- A [torch state
dict](https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict).
@@ -190,7 +196,6 @@ def lora_state_dict(
Mirror source to resolve accessibility issues if you're downloading a model in China. We do not
guarantee the timeliness or safety of the source, and you should refer to the mirror site for more
information.
-
"""
# Load the main state dict first which has the LoRA layers for either of
# UNet and text encoder or both.
@@ -467,25 +472,27 @@ def load_lora_into_unet(
cls, state_dict, network_alphas, unet, low_cpu_mem_usage=None, adapter_name=None, _pipeline=None
):
"""
- This will load the LoRA layers specified in `state_dict` into `unet`.
+ Load LoRA layers specified in `state_dict` into `unet`.
Parameters:
state_dict (`dict`):
- A standard state dict containing the lora layer parameters. The keys can either be indexed directly
- into the unet or prefixed with an additional `unet` which can be used to distinguish between text
- encoder lora layers.
+ A standard state dict containing the LoRA layer parameters. The keys can either be indexed directly
+ into the `unet` or prefixed with an additional `unet`, which can be used to distinguish between text
+ encoder LoRA layers.
network_alphas (`Dict[str, float]`):
- See `LoRALinearLayer` for more details.
+ See
+ [`LoRALinearLayer`](https://github.com/huggingface/diffusers/blob/c697f524761abd2314c030221a3ad2f7791eab4e/src/diffusers/models/lora.py#L182)
+ for more details.
unet (`UNet2DConditionModel`):
The UNet model to load the LoRA layers into.
low_cpu_mem_usage (`bool`, *optional*, defaults to `True` if torch version >= 1.9.0 else `False`):
- Speed up model loading only loading the pretrained weights and not initializing the weights. This also
- tries to not use more than 1x model size in CPU memory (including peak memory) while loading the model.
- Only supported for PyTorch >= 1.9.0. If you are using an older version of PyTorch, setting this
- argument to `True` will raise an error.
+ Only load and not initialize the pretrained weights. This can speed up model loading and also tries to
+ not use more than 1x model size in CPU memory (including peak memory) while loading the model. Only
+ supported for PyTorch >= 1.9.0. If you are using an older version of PyTorch, setting this argument to
+ `True` will raise an error.
adapter_name (`str`, *optional*):
- Adapter name to be used for referencing the loaded adapter model. If not specified, it will use
- `default_{i}` where i is the total number of adapters being loaded.
+ Name for referencing the loaded adapter model. If not specified, it will use `default_{i}` where `i` is
+ the total number of adapters being loaded.
"""
low_cpu_mem_usage = low_cpu_mem_usage if low_cpu_mem_usage is not None else _LOW_CPU_MEM_USAGE_DEFAULT
# If the serialization format is new (introduced in https://github.com/huggingface/diffusers/pull/2918),
@@ -579,26 +586,27 @@ def load_lora_into_text_encoder(
_pipeline=None,
):
"""
- This will load the LoRA layers specified in `state_dict` into `text_encoder`
+ Load LoRA layers specified in `state_dict` into `text_encoder`.
Parameters:
state_dict (`dict`):
- A standard state dict containing the lora layer parameters. The key should be prefixed with an
- additional `text_encoder` to distinguish between unet lora layers.
+ A standard state dict containing the LoRA layer parameters. The key should be prefixed with an
+ additional `text_encoder` to distinguish between UNet LoRA layers.
network_alphas (`Dict[str, float]`):
- See `LoRALinearLayer` for more details.
+ See
+ [`LoRALinearLayer`](https://github.com/huggingface/diffusers/blob/c697f524761abd2314c030221a3ad2f7791eab4e/src/diffusers/models/lora.py#L182)
+ for more details.
text_encoder (`CLIPTextModel`):
The text encoder model to load the LoRA layers into.
prefix (`str`):
Expected prefix of the `text_encoder` in the `state_dict`.
lora_scale (`float`):
- How much to scale the output of the lora linear layer before it is added with the output of the regular
- lora layer.
+ Scale of `LoRALinearLayer`'s output before it is added with the output of the regular LoRA layer.
low_cpu_mem_usage (`bool`, *optional*, defaults to `True` if torch version >= 1.9.0 else `False`):
- Speed up model loading only loading the pretrained weights and not initializing the weights. This also
- tries to not use more than 1x model size in CPU memory (including peak memory) while loading the model.
- Only supported for PyTorch >= 1.9.0. If you are using an older version of PyTorch, setting this
- argument to `True` will raise an error.
+ Only load and not initialize the pretrained weights. This can speed up model loading and also tries to
+ not use more than 1x model size in CPU memory (including peak memory) while loading the model. Only
+ supported for PyTorch >= 1.9.0. If you are using an older version of PyTorch, setting this argument to
+ `True` will raise an error.
adapter_name (`str`, *optional*):
Adapter name to be used for referencing the loaded adapter model. If not specified, it will use
`default_{i}` where i is the total number of adapters being loaded.
@@ -883,11 +891,11 @@ def save_lora_weights(
safe_serialization: bool = True,
):
r"""
- Save the LoRA parameters corresponding to the UNet and text encoder.
+ Save the UNet and text encoder LoRA parameters.
Arguments:
save_directory (`str` or `os.PathLike`):
- Directory to save LoRA parameters to. Will be created if it doesn't exist.
+ Directory to save LoRA parameters to (will be created if it doesn't exist).
unet_lora_layers (`Dict[str, torch.nn.Module]` or `Dict[str, torch.Tensor]`):
State dict of the LoRA layers corresponding to the `unet`.
text_encoder_lora_layers (`Dict[str, torch.nn.Module]` or `Dict[str, torch.Tensor]`):
@@ -898,11 +906,30 @@ def save_lora_weights(
need to call this function on all processes. In this case, set `is_main_process=True` only on the main
process to avoid race conditions.
save_function (`Callable`):
- The function to use to save the state dictionary. Useful during distributed training when you need to
- replace `torch.save` with another method. Can be configured with the environment variable
+ The function to use to save the state dict. Useful during distributed training when you need to replace
+ `torch.save` with another method. Can be configured with the environment variable
`DIFFUSERS_SAVE_MODE`.
safe_serialization (`bool`, *optional*, defaults to `True`):
- Whether to save the model using `safetensors` or the traditional PyTorch way with `pickle`.
+ Whether to save the model using `safetensors` or with `pickle`.
+
+ Example:
+
+ ```py
+ from diffusers import StableDiffusionXLPipeline
+ from peft.utils import get_peft_model_state_dict
+ import torch
+
+ pipeline = StableDiffusionXLPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.fuse_lora()
+
+ # get and save unet state dict
+ unet_state_dict = get_peft_model_state_dict(pipeline.unet, adapter_name="pixel")
+ pipeline.save_lora_weights("fused-model", unet_lora_layers=unet_state_dict)
+ pipeline.load_lora_weights("fused-model", weight_name="pytorch_lora_weights.safetensors")
+ ```
"""
# Create a flat dictionary.
state_dict = {}
@@ -1138,14 +1165,19 @@ def _convert_kohya_lora_to_diffusers(cls, state_dict):
def unload_lora_weights(self):
"""
- Unloads the LoRA parameters.
+ Unload the LoRA parameters from a pipeline.
Examples:
- ```python
- >>> # Assuming `pipeline` is already loaded with the LoRA parameters.
- >>> pipeline.unload_lora_weights()
- >>> ...
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ pipeline = DiffusionPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.unload_lora_weights()
```
"""
if not USE_PEFT_BACKEND:
@@ -1174,7 +1206,7 @@ def fuse_lora(
safe_fusing: bool = False,
):
r"""
- Fuses the LoRA parameters into the original parameters of the corresponding blocks.
+ Fuse the LoRA parameters with the original parameters in their corresponding blocks.
@@ -1188,9 +1220,23 @@ def fuse_lora(
Whether to fuse the text encoder LoRA parameters. If the text encoder wasn't monkey-patched with the
LoRA parameters then it won't have any effect.
lora_scale (`float`, defaults to 1.0):
- Controls how much to influence the outputs with the LoRA parameters.
+ Controls LoRA influence on the outputs.
safe_fusing (`bool`, defaults to `False`):
- Whether to check fused weights for NaN values before fusing and if values are NaN not fusing them.
+ Whether to check the fused weights for `NaN` values before fusing, and skip fusing them if any are
+ found.
+
+ Example:
+
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ pipeline = DiffusionPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.fuse_lora(lora_scale=0.7)
+ ```
"""
if fuse_unet or fuse_text_encoder:
self.num_fused_loras += 1
@@ -1239,8 +1285,7 @@ def fuse_text_encoder_lora(text_encoder, lora_scale=1.0, safe_fusing=False):
def unfuse_lora(self, unfuse_unet: bool = True, unfuse_text_encoder: bool = True):
r"""
- Reverses the effect of
- [`pipe.fuse_lora()`](https://huggingface.co/docs/diffusers/main/en/api/loaders#diffusers.loaders.LoraLoaderMixin.fuse_lora).
+ Unfuse the LoRA parameters from the original parameters in their corresponding blocks.
@@ -1253,6 +1298,20 @@ def unfuse_lora(self, unfuse_unet: bool = True, unfuse_text_encoder: bool = True
unfuse_text_encoder (`bool`, defaults to `True`):
Whether to unfuse the text encoder LoRA parameters. If the text encoder wasn't monkey-patched with the
LoRA parameters then it won't have any effect.
+
+ Example:
+
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ pipeline = DiffusionPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.fuse_lora(lora_scale=0.7)
+ pipeline.unfuse_lora()
+ ```
"""
if unfuse_unet:
if not USE_PEFT_BACKEND:
@@ -1304,16 +1363,32 @@ def set_adapters_for_text_encoder(
text_encoder_weights: List[float] = None,
):
"""
- Sets the adapter layers for the text encoder.
+ Set the currently active adapter for use in the text encoder.
Args:
adapter_names (`List[str]` or `str`):
- The names of the adapters to use.
+ The adapter to activate.
text_encoder (`torch.nn.Module`, *optional*):
- The text encoder module to set the adapter layers for. If `None`, it will try to get the `text_encoder`
- attribute.
+ The text encoder module to activate the adapter layers for. If `None`, it will try to get the
+ `text_encoder` attribute.
text_encoder_weights (`List[float]`, *optional*):
The weights to use for the text encoder. If `None`, the weights are set to `1.0` for all the adapters.
+
+ Example:
+
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ pipeline = DiffusionPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.load_lora_weights(
+ "jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
+ )
+ pipeline.set_adapters_for_text_encoder("pixel")
+ ```
"""
if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.")
@@ -1341,12 +1416,25 @@ def process_weights(adapter_names, weights):
def disable_lora_for_text_encoder(self, text_encoder: Optional["PreTrainedModel"] = None):
"""
- Disables the LoRA layers for the text encoder.
+ Disable the text encoder's LoRA layers.
Args:
text_encoder (`torch.nn.Module`, *optional*):
The text encoder module to disable the LoRA layers for. If `None`, it will try to get the
`text_encoder` attribute.
+
+ Example:
+
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ pipeline = DiffusionPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.disable_lora_for_text_encoder()
+ ```
"""
if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.")
@@ -1358,12 +1446,25 @@ def disable_lora_for_text_encoder(self, text_encoder: Optional["PreTrainedModel"
def enable_lora_for_text_encoder(self, text_encoder: Optional["PreTrainedModel"] = None):
"""
- Enables the LoRA layers for the text encoder.
+ Enable the text encoder's LoRA layers.
Args:
text_encoder (`torch.nn.Module`, *optional*):
The text encoder module to enable the LoRA layers for. If `None`, it will try to get the `text_encoder`
attribute.
+
+ Example:
+
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ pipeline = DiffusionPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.enable_lora_for_text_encoder()
+ ```
"""
if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.")
@@ -1414,10 +1515,24 @@ def enable_lora(self):
def delete_adapters(self, adapter_names: Union[List[str], str]):
"""
+ Delete an adapter's LoRA layers from the UNet and text encoder(s).
+
Args:
- Deletes the LoRA layers of `adapter_name` for the unet and text-encoder(s).
adapter_names (`Union[List[str], str]`):
- The names of the adapter to delete. Can be a single string or a list of strings
+ The names (single string or list of strings) of the adapters to delete.
+
+ Example:
+
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ pipeline = DiffusionPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.delete_adapters("pixel")
+ ```
"""
if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.")
@@ -1437,7 +1552,7 @@ def delete_adapters(self, adapter_names: Union[List[str], str]):
def get_active_adapters(self) -> List[str]:
"""
- Gets the list of the current active adapters.
+ Get a list of currently active adapters.
Example:
@@ -1469,7 +1584,22 @@ def get_active_adapters(self) -> List[str]:
def get_list_adapters(self) -> Dict[str, List[str]]:
"""
- Gets the current list of all available adapters in the pipeline.
+ Get a list of all currently available adapters for each component in the pipeline.
+
+ Example:
+
+ ```py
+ from diffusers import DiffusionPipeline
+
+ pipeline = DiffusionPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0",
+ ).to("cuda")
+ pipeline.load_lora_weights(
+ "jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
+ )
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.get_list_adapters()
+ ```
"""
if not USE_PEFT_BACKEND:
raise ValueError(
@@ -1491,14 +1621,27 @@ def get_list_adapters(self) -> Dict[str, List[str]]:
def set_lora_device(self, adapter_names: List[str], device: Union[torch.device, str, int]) -> None:
"""
- Moves the LoRAs listed in `adapter_names` to a target device. Useful for offloading the LoRA to the CPU in case
- you want to load multiple adapters and free some GPU memory.
+ Move a LoRA to a target device. Useful for offloading a LoRA to the CPU in case you want to load multiple
+ adapters and free some GPU memory.
Args:
adapter_names (`List[str]`):
- List of adapters to send device to.
+ List of adapters to send to device.
device (`Union[torch.device, str, int]`):
- Device to send the adapters to. Can be either a torch device, a str or an integer.
+ Device (can be a `torch.device`, `str` or `int`) to place adapters on.
+
+ Example:
+
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ pipeline = DiffusionPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0",
+ ).to("cuda")
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.set_lora_device(["pixel"], device="cpu")
+ ```
"""
if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.")
@@ -1530,7 +1673,7 @@ def set_lora_device(self, adapter_names: List[str], device: Union[torch.device,
class StableDiffusionXLLoraLoaderMixin(LoraLoaderMixin):
- """This class overrides `LoraLoaderMixin` with LoRA loading/saving code that's specific to SDXL"""
+ """This class overrides [`LoraLoaderMixin`] with LoRA loading/saving code that's specific to SDXL."""
# Overrride to properly handle the loading and unloading of the additional text encoder.
def load_lora_weights(
@@ -1555,12 +1698,26 @@ def load_lora_weights(
Parameters:
pretrained_model_name_or_path_or_dict (`str` or `os.PathLike` or `dict`):
- See [`~loaders.LoraLoaderMixin.lora_state_dict`].
- adapter_name (`str`, *optional*):
- Adapter name to be used for referencing the loaded adapter model. If not specified, it will use
- `default_{i}` where i is the total number of adapters being loaded.
+ A string (model id of a pretrained model hosted on the Hub), a path to a directory containing the model
+ weights, or a [torch state
+ dict](https://pytorch.org/tutorials/beginner/saving_loading_models.html#what-is-a-state-dict).
kwargs (`dict`, *optional*):
See [`~loaders.LoraLoaderMixin.lora_state_dict`].
+ adapter_name (`str`, *optional*):
+ Name for referencing the loaded adapter model. If not specified, it will use `default_{i}` where `i` is
+ the total number of adapters being loaded. Must have PEFT installed to use.
+
+ Example:
+
+ ```py
+ from diffusers import StableDiffusionXLPipeline
+ import torch
+
+ pipeline = StableDiffusionXLPipeline.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ ```
"""
# We could have accessed the unet config from `lora_state_dict()` too. We pass
# it here explicitly to be able to tell that it's coming from an SDXL
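
Review note on the `fuse_lora` / `unfuse_lora` docstrings above: the relationship between `lora_scale`, `safe_fusing`, and unfusing is easier to follow with a small numeric sketch. The code below is a conceptual illustration in pure Python, under the assumption that fusing adds the scaled product of the up and down matrices to the original weight and that unfusing subtracts the same term. The helper names (`matmul`, `fuse`, `unfuse`) are hypothetical and are not the library's functions.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[t] * b[t][j] for t in range(inner)) for j in range(cols)]
        for row in a
    ]


def fuse(
    weight: Matrix,
    up: Matrix,
    down: Matrix,
    lora_scale: float = 1.0,
    safe_fusing: bool = False,
) -> Matrix:
    """Return weight + lora_scale * (up @ down) without modifying the inputs."""
    delta = matmul(up, down)
    fused = [
        [w + lora_scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]
    # With safe_fusing, refuse to return a result that contains NaN values.
    if safe_fusing and any(math.isnan(v) for row in fused for v in row):
        raise ValueError("fused weights contain NaN values; not fusing")
    return fused


def unfuse(fused: Matrix, up: Matrix, down: Matrix, lora_scale: float = 1.0) -> Matrix:
    """Undo `fuse` by subtracting the same scaled low-rank term."""
    delta = matmul(up, down)
    return [
        [f - lora_scale * d for f, d in zip(f_row, d_row)]
        for f_row, d_row in zip(fused, delta)
    ]


if __name__ == "__main__":
    weight = [[1.0, 2.0], [3.0, 4.0]]
    up = [[1.0], [2.0]]    # shape (2, 1): rank-1 update
    down = [[0.5, -1.0]]   # shape (1, 2)
    fused = fuse(weight, up, down, lora_scale=0.5)
    restored = unfuse(fused, up, down, lora_scale=0.5)
    print(fused, restored)
```

Unfusing only recovers the original weights if it uses the same `lora_scale` that was used for fusing, which is why the sketch passes the scale to both helpers.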
diff --git a/src/diffusers/loaders/single_file.py b/src/diffusers/loaders/single_file.py
index 8a4f1a0541fd..696be5fd3a65 100644
--- a/src/diffusers/loaders/single_file.py
+++ b/src/diffusers/loaders/single_file.py
@@ -288,12 +288,15 @@ def from_single_file(cls, pretrained_model_link_or_path, **kwargs):
class FromOriginalVAEMixin:
+ """
+ Load pretrained ControlNet weights saved in the `.ckpt` or `.safetensors` format into an [`AutoencoderKL`].
+ """
+
@classmethod
def from_single_file(cls, pretrained_model_link_or_path, **kwargs):
r"""
- Instantiate a [`AutoencoderKL`] from pretrained controlnet weights saved in the original `.ckpt` or
- `.safetensors` format. The pipeline is format. The pipeline is set in evaluation mode (`model.eval()`) by
- default.
+ Instantiate a [`AutoencoderKL`] from pretrained ControlNet weights saved in the original `.ckpt` or
+ `.safetensors` format. The pipeline is set in evaluation mode (`model.eval()`) by default.
Parameters:
pretrained_model_link_or_path (`str` or `os.PathLike`, *optional*):
@@ -348,8 +351,8 @@ def from_single_file(cls, pretrained_model_link_or_path, **kwargs):
- Make sure to pass both `image_size` and `scaling_factor` to `from_single_file()` if you want to load
- a VAE that does accompany a stable diffusion model of v2 or higher or SDXL.
+ Make sure to pass both `image_size` and `scaling_factor` to `from_single_file()` if you're loading
+ a VAE from SDXL or a Stable Diffusion v2 model or higher.
@@ -482,10 +485,14 @@ def from_single_file(cls, pretrained_model_link_or_path, **kwargs):
class FromOriginalControlnetMixin:
+ """
+ Load pretrained ControlNet weights saved in the `.ckpt` or `.safetensors` format into a [`ControlNetModel`].
+ """
+
@classmethod
def from_single_file(cls, pretrained_model_link_or_path, **kwargs):
r"""
- Instantiate a [`ControlNetModel`] from pretrained controlnet weights saved in the original `.ckpt` or
+ Instantiate a [`ControlNetModel`] from pretrained ControlNet weights saved in the original `.ckpt` or
`.safetensors` format. The pipeline is set in evaluation mode (`model.eval()`) by default.
Parameters:
diff --git a/src/diffusers/loaders/textual_inversion.py b/src/diffusers/loaders/textual_inversion.py
index 4890810d49a6..e36f03437a45 100644
--- a/src/diffusers/loaders/textual_inversion.py
+++ b/src/diffusers/loaders/textual_inversion.py
@@ -116,7 +116,7 @@ def load_textual_inversion_state_dicts(pretrained_model_name_or_paths, **kwargs)
class TextualInversionLoaderMixin:
r"""
- Load textual inversion tokens and embeddings to the tokenizer and text encoder.
+ Load Textual Inversion tokens and embeddings to the tokenizer and text encoder.
"""
def maybe_convert_prompt(self, prompt: Union[str, List[str]], tokenizer: "PreTrainedTokenizer"): # noqa: F821
@@ -276,7 +276,7 @@ def load_textual_inversion(
**kwargs,
):
r"""
- Load textual inversion embeddings into the text encoder of [`StableDiffusionPipeline`] (both 🤗 Diffusers and
+ Load Textual Inversion embeddings into the text encoder of [`StableDiffusionPipeline`] (both 🤗 Diffusers and
Automatic1111 formats are supported).
Parameters:
@@ -335,7 +335,7 @@ def load_textual_inversion(
Example:
- To load a textual inversion embedding vector in 🤗 Diffusers format:
+ To load a Textual Inversion embedding vector in 🤗 Diffusers format:
```py
from diffusers import StableDiffusionPipeline
@@ -352,7 +352,7 @@ def load_textual_inversion(
image.save("cat-backpack.png")
```
- To load a textual inversion embedding vector in Automatic1111 format, make sure to download the vector first
+ To load a Textual Inversion embedding vector in Automatic1111 format, make sure to download the vector first
(for example from [civitAI](https://civitai.com/models/3036?modelVersionId=9857)) and then load the vector
locally:
diff --git a/src/diffusers/loaders/unet.py b/src/diffusers/loaders/unet.py
index 3f63e73d9cec..9555ac9e7d8b 100644
--- a/src/diffusers/loaders/unet.py
+++ b/src/diffusers/loaders/unet.py
@@ -53,6 +53,10 @@
class UNet2DConditionLoadersMixin:
+ """
+ Load LoRA layers into a [`UNet2DConditionModel`].
+ """
+
text_encoder_name = TEXT_ENCODER_NAME
unet_name = UNET_NAME
@@ -107,6 +111,19 @@ def load_attn_procs(self, pretrained_model_name_or_path_or_dict: Union[str, Dict
guarantee the timeliness or safety of the source, and you should refer to the mirror site for more
information.
+ Example:
+
+ ```py
+ from diffusers import AutoPipelineForText2Image
+ import torch
+
+ pipeline = AutoPipelineForText2Image.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.unet.load_attn_procs(
+ "jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
+ )
+ ```
"""
from ..models.attention_processor import CustomDiffusionAttnProcessor
from ..models.lora import LoRACompatibleConv, LoRACompatibleLinear, LoRAConv2dLayer, LoRALinearLayer
@@ -393,12 +410,12 @@ def save_attn_procs(
**kwargs,
):
r"""
- Save an attention processor to a directory so that it can be reloaded using the
+ Save attention processor layers to a directory so that they can be reloaded with the
[`~loaders.UNet2DConditionLoadersMixin.load_attn_procs`] method.
Arguments:
save_directory (`str` or `os.PathLike`):
- Directory to save an attention processor to. Will be created if it doesn't exist.
+ Directory to save an attention processor to (will be created if it doesn't exist).
is_main_process (`bool`, *optional*, defaults to `True`):
Whether the process calling this is the main process or not. Useful during distributed training when you
need to call this function on all processes. In this case, set `is_main_process=True` only on the main
@@ -408,7 +425,21 @@ def save_attn_procs(
replace `torch.save` with another method. Can be configured with the environment variable
`DIFFUSERS_SAVE_MODE`.
safe_serialization (`bool`, *optional*, defaults to `True`):
- Whether to save the model using `safetensors` or the traditional PyTorch way with `pickle`.
+ Whether to save the model using `safetensors` or with `pickle`.
+
+ Example:
+
+ ```py
+ import torch
+ from diffusers import DiffusionPipeline
+
+ pipeline = DiffusionPipeline.from_pretrained(
+ "CompVis/stable-diffusion-v1-4",
+ torch_dtype=torch.float16,
+ ).to("cuda")
+ pipeline.unet.load_attn_procs("path-to-save-model", weight_name="pytorch_custom_diffusion_weights.bin")
+ pipeline.unet.save_attn_procs("path-to-save-model", weight_name="pytorch_custom_diffusion_weights.bin")
+ ```
"""
from ..models.attention_processor import (
CustomDiffusionAttnProcessor,
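The `is_main_process` argument of `save_attn_procs` described above is meant to be `True` on exactly one process when every process calls the method. The sketch below shows one way to derive that flag. It is an illustration under stated assumptions, not part of diffusers: `is_main_process` here is a hypothetical helper, and it assumes the launcher exports a `RANK` environment variable (as `torchrun` does), treating a missing `RANK` as a single-process run.

```python
import os
from typing import Mapping, Optional


def is_main_process(env: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper (not a diffusers API): True only on global rank 0."""
    env = os.environ if env is None else env
    # Assumption: the launcher sets RANK; without it we assume a single process.
    return int(env.get("RANK", "0")) == 0


# Every process would call the save method, but only rank 0 passes True, e.g.:
# pipeline.unet.save_attn_procs("path-to-save-model", is_main_process=is_main_process())
print(is_main_process({"RANK": "0"}))  # True
print(is_main_process({"RANK": "3"}))  # False
```

Passing a mapping instead of reading `os.environ` directly keeps the helper easy to check without launching a distributed job.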
@@ -507,14 +538,30 @@ def set_adapters(
weights: Optional[Union[List[float], float]] = None,
):
"""
- Sets the adapter layers for the unet.
+ Set the currently active adapters for use in the UNet.
Args:
adapter_names (`List[str]` or `str`):
The names of the adapters to use.
weights (`Union[List[float], float]`, *optional*):
The adapter(s) weights to use with the UNet. If `None`, the weights are set to `1.0` for all the
adapters.
+
+ Example:
+
+ ```py
+ from diffusers import AutoPipelineForText2Image
+ import torch
+
+ pipeline = AutoPipelineForText2Image.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights(
+ "jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
+ )
+ pipeline.load_lora_weights("nerijs/pixel-art-xl", weight_name="pixel-art-xl.safetensors", adapter_name="pixel")
+ pipeline.set_adapters(["cinematic", "pixel"], adapter_weights=[0.5, 0.5])
+ ```
"""
if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for `set_adapters()`.")
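The defaulting rule in the `set_adapters` docstring above (weights of `1.0` for all adapters when none are given) can be illustrated without loading a model. The following is a minimal sketch, not the diffusers implementation: `normalize_adapter_weights` is a hypothetical helper, and broadcasting a single number to every adapter and rejecting mismatched lengths are assumptions about reasonable behavior rather than documented guarantees.

```python
from typing import List, Optional, Tuple, Union


def normalize_adapter_weights(
    adapter_names: Union[List[str], str],
    weights: Optional[Union[List[float], float]] = None,
) -> Tuple[List[str], List[float]]:
    """Hypothetical helper (not a diffusers API) that pairs adapter names with weights."""
    # A single adapter name is treated as a one-element list.
    names = [adapter_names] if isinstance(adapter_names, str) else list(adapter_names)

    if weights is None:
        # Documented default: every adapter gets a weight of 1.0.
        resolved = [1.0] * len(names)
    elif isinstance(weights, (int, float)):
        # Assumption: a single number applies to every adapter.
        resolved = [float(weights)] * len(names)
    else:
        resolved = [float(w) for w in weights]

    # Assumption: one weight per adapter is required.
    if len(names) != len(resolved):
        raise ValueError(
            f"Got {len(names)} adapter names but {len(resolved)} weights."
        )
    return names, resolved


print(normalize_adapter_weights(["cinematic", "pixel"]))
# (['cinematic', 'pixel'], [1.0, 1.0])
print(normalize_adapter_weights("cinematic", 0.5))
# (['cinematic'], [0.5])
```

Resolving names and weights into two equal-length lists up front means later code can iterate over them in pairs without special-casing strings or `None`.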
@@ -535,7 +582,22 @@ def set_adapters(
def disable_lora(self):
"""
- Disables the active LoRA layers for the unet.
+ Disable the UNet's active LoRA layers.
+
+ Example:
+
+ ```py
+ from diffusers import AutoPipelineForText2Image
+ import torch
+
+ pipeline = AutoPipelineForText2Image.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights(
+ "jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
+ )
+ pipeline.disable_lora()
+ ```
"""
if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.")
@@ -543,7 +605,22 @@ def disable_lora(self):
def enable_lora(self):
"""
- Enables the active LoRA layers for the unet.
+ Enable the UNet's active LoRA layers.
+
+ Example:
+
+ ```py
+ from diffusers import AutoPipelineForText2Image
+ import torch
+
+ pipeline = AutoPipelineForText2Image.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights(
+ "jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
+ )
+ pipeline.enable_lora()
+ ```
"""
if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.")
@@ -551,10 +628,26 @@ def enable_lora(self):
def delete_adapters(self, adapter_names: Union[List[str], str]):
"""
+ Delete an adapter's LoRA layers from the UNet.
+
Args:
- Deletes the LoRA layers of `adapter_name` for the unet.
adapter_names (`Union[List[str], str]`):
- The names of the adapter to delete. Can be a single string or a list of strings
+ The names (single string or list of strings) of the adapters to delete.
+
+ Example:
+
+ ```py
+ from diffusers import AutoPipelineForText2Image
+ import torch
+
+ pipeline = AutoPipelineForText2Image.from_pretrained(
+ "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
+ ).to("cuda")
+ pipeline.load_lora_weights(
+ "jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors", adapter_name="cinematic"
+ )
+ pipeline.delete_adapters("cinematic")
+ ```
"""
if not USE_PEFT_BACKEND:
raise ValueError("PEFT backend is required for this method.")