Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions docs/source/en/api/pipelines/kandinsky.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,20 @@ t2i_pipe = DiffusionPipeline.from_pretrained("kandinsky-community/kandinsky-2-1"
t2i_pipe.to("cuda")
```

<Tip warning={true}>

By default, the text-to-image pipeline use [`DDIMScheduler`], you can change the scheduler to [`DDPMScheduler`]

```py
scheduler = DDPMScheduler.from_pretrained("kandinsky-community/kandinsky-2-1", subfolder="ddpm_scheduler")
t2i_pipe = DiffusionPipeline.from_pretrained(
"kandinsky-community/kandinsky-2-1", scheduler=scheduler, torch_dtype=torch.float16
)
t2i_pipe.to("cuda")
```

</Tip>

Now we pass the prompt through the prior to generate image embeddings. The prior
returns both the image embeddings corresponding to the prompt and negative/unconditional image
embeddings corresponding to an empty string.
Expand Down
9 changes: 3 additions & 6 deletions src/diffusers/pipelines/kandinsky/pipeline_kandinsky.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@
from ...models import UNet2DConditionModel, VQModel
from ...pipelines import DiffusionPipeline
from ...pipelines.pipeline_utils import ImagePipelineOutput
from ...schedulers import DDIMScheduler
from ...schedulers import DDIMScheduler, DDPMScheduler
from ...utils import (
is_accelerate_available,
is_accelerate_version,
Expand Down Expand Up @@ -88,7 +88,7 @@ class KandinskyPipeline(DiffusionPipeline):
Frozen text-encoder.
tokenizer ([`XLMRobertaTokenizer`]):
Tokenizer of class
scheduler ([`DDIMScheduler`]):
scheduler (Union[`DDIMScheduler`,`DDPMScheduler`]):
A scheduler to be used in combination with `unet` to generate image latents.
unet ([`UNet2DConditionModel`]):
Conditional U-Net architecture to denoise the image embedding.
Expand All @@ -101,7 +101,7 @@ def __init__(
text_encoder: MultilingualCLIP,
tokenizer: XLMRobertaTokenizer,
unet: UNet2DConditionModel,
scheduler: DDIMScheduler,
scheduler: Union[DDIMScheduler, DDPMScheduler],
movq: VQModel,
):
super().__init__()
Expand Down Expand Up @@ -439,9 +439,6 @@ def __call__(
noise_pred,
t,
latents,
# YiYi notes: only reason this pipeline can't work with unclip scheduler is that can't pass down this argument
# need to use DDPM scheduler instead
# prev_timestep=prev_timestep,
generator=generator,
).prev_sample
# post-processing
Expand Down