2 changes: 2 additions & 0 deletions docs/source/en/_toctree.yml
@@ -19,6 +19,8 @@
    title: Train a diffusion model
  - local: tutorials/using_peft_for_inference
    title: Inference with PEFT
  - local: tutorials/custom_pipelines_components
    title: Working with fully custom pipelines and components
Member

I wonder if it may make more sense to include this doc as a section in the Load community pipelines page to prevent this topic from being spread out too much. We can rename the doc to "Load community pipelines and components" 🙂

I'd also consider whether this is a crucial topic we want new users to learn/understand (then it should be in the tutorials) or if it's something for more experienced Diffusers users (then it should be a guide).

Member Author

Nice points, Steven!

I think it caters to advanced users. So, I would prefer it to be a guide. In that case, where could it live?

Member

I think it can live in the Load community pipelines page, and then we restructure it to:

# Load community pipelines and components (update the toctree to this title as well)

## Community pipelines

## Community components

  title: Tutorials
- sections:
  - sections:
135 changes: 135 additions & 0 deletions docs/source/en/tutorials/custom_pipelines_components.md
@@ -0,0 +1,135 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Working with fully custom pipelines and components
Member

In previous versions, we used "custom" and "community" pipelines interchangeably, which I feel like could cause some confusion. It'd be better if we stayed consistent and referred to them as "community" (I think it's understood that community pipelines are already customized to some extent) pipelines.

Suggested change
# Working with fully custom pipelines and components
# Working with fully community pipelines and components


Diffusers supports the use [custom pipelines](../using-diffusers/contribute_pipeline) letting the users add any additional features on top of the [`DiffusionPipeline`]. However, it can get cumbersome if you're dealing with a custom pipeline where its components (such as the UNet, VAE, scheduler) are also custom.
Member

Suggested change
Diffusers supports the use [custom pipelines](../using-diffusers/contribute_pipeline) letting the users add any additional features on top of the [`DiffusionPipeline`]. However, it can get cumbersome if you're dealing with a custom pipeline where its components (such as the UNet, VAE, scheduler) are also custom.
Diffusers support [community pipelines](../using-diffusers/contribute_pipeline), allowing users to add any additional features on top of the [`DiffusionPipeline`]. However, it can get cumbersome if you're dealing with a community pipeline where its components - such as the UNet, VAE, and scheduler - have been customized.


We allow loading of such pipelines by exposing a `trust_remote_code` argument inside [`DiffusionPipeline`]. The advantage of `trust_remote_code` lies in its flexibility. You can have different levels of customizations for a pipeline. Following are a few examples:
Member

Suggested change
We allow loading of such pipelines by exposing a `trust_remote_code` argument inside [`DiffusionPipeline`]. The advantage of `trust_remote_code` lies in its flexibility. You can have different levels of customizations for a pipeline. Following are a few examples:
To load pipelines with custom components, you can use the `trust_remote_code` argument inside [`DiffusionPipeline`]. This argument is very flexible and supports different levels of customized components for a pipeline. For example:


* Only UNet is custom
* UNet and VAE both are custom
* Pipeline is custom
* UNet, VAE, scheduler, and pipeline are custom
Comment on lines +19 to +22
Member

Suggested change
* Only UNet is custom
* UNet and VAE both are custom
* Pipeline is custom
* UNet, VAE, scheduler, and pipeline are custom
* Only the UNet is custom
* The UNet and VAE are both custom
* The UNet, VAE, and scheduler are custom
* The pipeline contains entirely custom components


With `trust_remote_code=True`, you can achieve perform of the above!
Member

Suggested change
With `trust_remote_code=True`, you can achieve perform of the above!


This tutorial covers how to author your pipeline repository so that it becomes compatible with `trust_remote_code`. You'll use a custom UNet, a custom scheduler, and a custom pipeline for this purpose.
Member

Suggested change
This tutorial covers how to author your pipeline repository so that it becomes compatible with `trust_remote_code`. You'll use a custom UNet, a custom scheduler, and a custom pipeline for this purpose.
This guide covers how to initialize, serialize, and load a pipeline with your custom components using `trust_remote_code`. You'll use an example custom UNet, scheduler, and pipeline from [sayakpaul/custom_pipeline_remote_code](https://huggingface.co/sayakpaul/custom_pipeline_remote_code).


<Tip warning={true}>

You should use `trust_remote_code=True` _only_ when you fully trust the code and have verified its usage.
Member

Suggested change
You should use `trust_remote_code=True` _only_ when you fully trust the code and have verified its usage.
You should *only* use `trust_remote_code=True` if you fully trust the code and have verified the code is safe.


</Tip>

## Pipeline components

In the interest of brevity, you'll use the custom UNet, scheduler, and pipeline classes that we've already authored:
Member

Suggested change
In the interest of brevity, you'll use the custom UNet, scheduler, and pipeline classes that we've already authored:
Download the custom components:


```bash
# Custom UNet
wget https://huggingface.co/sayakpaul/custom_pipeline_remote_code/raw/main/unet/my_unet_model.py
# Custom scheduler
wget https://huggingface.co/sayakpaul/custom_pipeline_remote_code/raw/main/scheduler/my_scheduler.py
# Custom pipeline
wget https://huggingface.co/sayakpaul/custom_pipeline_remote_code/raw/main/my_pipeline.py
```

<Tip warning={true}>

The above classes are just for references. We encourage you to experiment with these classes for desired customizations.

</Tip>
Comment on lines +47 to +51
Member

Ok to remove this warning now that we've already clarified that this is just an example.

Suggested change
<Tip warning={true}>
The above classes are just for references. We encourage you to experiment with these classes for desired customizations.
</Tip>


Load the individual components, starting with the UNet:

```python
from my_unet_model import MyUNetModel

pretrained_id = "hf-internal-testing/tiny-sdxl-custom-all"
unet = MyUNetModel.from_pretrained(pretrained_id, subfolder="unet")
```

Then load the scheduler:

```python
from my_scheduler import MyScheduler

scheduler = MyScheduler.from_pretrained(pretrained_id, subfolder="scheduler")
```

Finally, the VAE and the text encoders:

```python
from transformers import CLIPTextModel, CLIPTextModelWithProjection, CLIPTokenizer
from diffusers import AutoencoderKL

text_encoder = CLIPTextModel.from_pretrained(pretrained_id, subfolder="text_encoder")
text_encoder_2 = CLIPTextModelWithProjection.from_pretrained(pretrained_id, subfolder="text_encoder_2")
tokenizer = CLIPTokenizer.from_pretrained(pretrained_id, subfolder="tokenizer")
tokenizer_2 = CLIPTokenizer.from_pretrained(pretrained_id, subfolder="tokenizer_2")

vae = AutoencoderKL.from_pretrained(pretrained_id, subfolder="vae")
```

`MyUNetModel`, `MyScheduler`, and `MyPipeline` use blocks that are already supported by Diffusers. If you are using any custom blocks make sure to put them in the module files themselves.
Member

I think it makes more sense to place this sentence after the code snippet with MyPipeline because it isn't introduced here yet

Suggested change
`MyUNetModel`, `MyScheduler`, and `MyPipeline` use blocks that are already supported by Diffusers. If you are using any custom blocks make sure to put them in the module files themselves.
`MyUNetModel`, `MyScheduler`, and `MyPipeline` use blocks that are already supported by Diffusers. If you are using any custom blocks, make sure to put them in the module files themselves.


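For illustration, here is a hypothetical sketch of what keeping a custom block inside the module file could look like. The block name, its arguments, and the model body below are made up for this example and are not part of the actual repository:

```python
# my_unet_model.py -- hypothetical sketch, not the contents of the example repo.
# The custom block is defined in the same file as the model that uses it, so the
# single module file that gets serialized carries everything needed to rebuild it.
import torch
import torch.nn as nn
from diffusers import UNet2DConditionModel


class MyCustomBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Simple residual projection, purely for illustration.
        return hidden_states + self.proj(hidden_states)


class MyUNetModel(UNet2DConditionModel):
    # Because MyCustomBlock lives in this file, no extra local imports are needed
    # when the module is shipped alongside the saved weights.
    ...
```
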
## Pipeline initialization and serialization
Member

No need to divide the doc so granularly :)

Suggested change
## Pipeline initialization and serialization


With all the components, you can now initialize the custom pipeline:
Member

Suggested change
With all the components, you can now initialize the custom pipeline:
With all the components, you can now initialize the pipeline:


```python
pipeline = MyPipeline(
    vae=vae,
    unet=unet,
    text_encoder=text_encoder,
    text_encoder_2=text_encoder_2,
    tokenizer=tokenizer,
    tokenizer_2=tokenizer_2,
    scheduler=scheduler,
)
```

Now, push the pipeline to the Hub:

```python
pipeline.push_to_hub("custom_pipeline_remote_code")
```

Since the `pipeline` itself is a custom pipeline, its corresponding Python module will also be pushed ([example](https://huggingface.co/sayakpaul/custom_pipeline_remote_code/blob/main/my_pipeline.py)). If the pipeline has any other custom components, they will be pushed as well ([UNet](https://huggingface.co/sayakpaul/custom_pipeline_remote_code/blob/main/unet/my_unet_model.py), [scheduler](https://huggingface.co/sayakpaul/custom_pipeline_remote_code/blob/main/scheduler/my_scheduler.py)).
Member

Suggested change
Since the `pipeline` itself is a custom pipeline, its corresponding Python module will also be pushed ([example](https://huggingface.co/sayakpaul/custom_pipeline_remote_code/blob/main/my_pipeline.py)). If the pipeline has any other custom components, they will be pushed as well ([UNet](https://huggingface.co/sayakpaul/custom_pipeline_remote_code/blob/main/unet/my_unet_model.py), [scheduler](https://huggingface.co/sayakpaul/custom_pipeline_remote_code/blob/main/scheduler/my_scheduler.py)).
Since the `pipeline` itself is a community pipeline, its corresponding [Python module](https://huggingface.co/sayakpaul/custom_pipeline_remote_code/blob/main/my_pipeline.py) is also pushed. If the pipeline has any other custom components, they are pushed as well ([UNet](https://huggingface.co/sayakpaul/custom_pipeline_remote_code/blob/main/unet/my_unet_model.py), [scheduler](https://huggingface.co/sayakpaul/custom_pipeline_remote_code/blob/main/scheduler/my_scheduler.py)).


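As a sanity check, you could list the pushed repository to confirm the Python modules went up alongside the weights and configs. A minimal sketch, where the expected file paths are assumed from the links above:

```python
# List the repository contents to verify the custom modules were uploaded.
from huggingface_hub import list_repo_files

files = list_repo_files("sayakpaul/custom_pipeline_remote_code")
print(sorted(f for f in files if f.endswith(".py")))
# Expected to include: my_pipeline.py, scheduler/my_scheduler.py, unet/my_unet_model.py
```
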
If you want to keep the pipeline local, then use the [`PushToHubMixin.save_pretrained`] method.
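For example, a minimal sketch (the directory name here is just a placeholder):

```python
# Serialize the pipeline and its components to a local folder instead of pushing to the Hub.
pipeline.save_pretrained("my-custom-sdxl-pipeline")
```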

## Pipeline loading

You can load this pipeline from the Hub by specifying `trust_remote_code=True`:

```python
import torch
from diffusers import DiffusionPipeline

reloaded_pipeline = DiffusionPipeline.from_pretrained(
    "sayakpaul/custom_pipeline_remote_code",
    torch_dtype=torch.float16,
    trust_remote_code=True,
).to("cuda")
```

And then perform inference:

```python
prompt = "hey"
num_inference_steps = 2

_ = reloaded_pipeline(prompt=prompt, num_inference_steps=num_inference_steps)[0]
```

For more complex pipelines, readers are welcome to check out [this comment](https://github.com/huggingface/diffusers/pull/5472#issuecomment-1775034461) on GitHub.
Member

Suggested change
For more complex pipelines, readers are welcome to check out [this comment](https://github.com/huggingface/diffusers/pull/5472#issuecomment-1775034461) on GitHub.
<Tip>
For more complex pipelines, check out this [comment](https://github.com/huggingface/diffusers/pull/5472#issuecomment-1775034461) on GitHub.
</Tip>

44 changes: 43 additions & 1 deletion src/diffusers/configuration_utils.py
@@ -21,6 +21,7 @@
import json
import os
import re
import sys
from collections import OrderedDict
from pathlib import PosixPath
from typing import Any, Dict, Tuple, Union
@@ -162,6 +163,30 @@ def save_config(self, save_directory: Union[str, os.PathLike], push_to_hub: bool
        self.to_json_file(output_config_file)
        logger.info(f"Configuration saved in {output_config_file}")

        # Additionally, save the implementation file too. It can happen for a pipeline, for a model, and
        # for a scheduler.

        # To avoid circular import problems.
        from .models import _import_structure as model_modules
        from .pipelines import _import_structure as pipeline_modules
        from .schedulers import _import_structure as scheduler_modules

        _all_available_pipelines_schedulers_model_classes = sum(
            (list(model_modules.values()) + list(scheduler_modules.values()) + list(pipeline_modules.values())), []
        )
        if self.__class__.__name__ not in _all_available_pipelines_schedulers_model_classes:
            module_to_save = self.__class__.__module__
            absolute_module_path = sys.modules[module_to_save].__file__
            try:
                with open(absolute_module_path, "r") as original_file:
                    content = original_file.read()
                path_to_write = os.path.join(save_directory, f"{module_to_save}.py")
                with open(path_to_write, "w") as new_file:
                    new_file.write(content)
                logger.info(f"{module_to_save}.py saved in {save_directory}")
            except Exception as e:
                logger.error(e)
Comment on lines +166 to +188
Contributor

Can we remove this for now and just merge the documentation?

Contributor

This touches too much internal functionality, which worries me a bit


        if push_to_hub:
            commit_message = kwargs.pop("commit_message", None)
            private = kwargs.pop("private", False)
@@ -567,7 +592,24 @@ def to_json_string(self) -> str:
                String containing all the attributes that make up the configuration instance in JSON format.
        """
        config_dict = self._internal_dict if hasattr(self, "_internal_dict") else {}
        config_dict["_class_name"] = self.__class__.__name__
        cls_name = self.__class__.__name__

        # Additionally, save the implementation file too. It can happen for a pipeline, for a model, and
        # for a scheduler.

        # To avoid circular import problems.
        from .models import _import_structure as model_modules
        from .pipelines import _import_structure as pipeline_modules
        from .schedulers import _import_structure as scheduler_modules

        _all_available_pipelines_schedulers_model_classes = sum(
            (list(model_modules.values()) + list(scheduler_modules.values()) + list(pipeline_modules.values())), []
        )

        if cls_name not in _all_available_pipelines_schedulers_model_classes:
            config_dict["_class_name"] = [str(self.__class__.__module__), cls_name]
        else:
            config_dict["_class_name"] = cls_name
        config_dict["_diffusers_version"] = __version__

        def to_json_saveable(value):
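To illustrate what the `to_json_string` change above produces, here is a hypothetical sketch of the serialized `_class_name` field. The module/class names and the version string are placeholders, not actual library output:

```python
# Hypothetical config contents after this change.
# A class that ships with Diffusers keeps a plain string:
builtin_config = {
    "_class_name": "DDIMScheduler",
    "_diffusers_version": "0.22.0",  # placeholder version
}
# A custom class is recorded as a [module, class name] pair so it can be
# re-imported from the serialized module file:
custom_config = {
    "_class_name": ["my_scheduler", "MyScheduler"],
    "_diffusers_version": "0.22.0",  # placeholder version
}
```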