Add Guidance Rescaling to LatentConsistencyModelPipeline #5859

Open
dg845 wants to merge 4 commits into huggingface:main from dg845:lcm-pipeline-rescale-cfg

Conversation

@dg845
Collaborator

@dg845 dg845 commented Nov 18, 2023

What does this PR do?

This PR adds classifier-free guidance rescaling (introduced in this paper) to LatentConsistencyModelPipeline. Using guidance rescaling may improve the LCM sample quality, in particular when using zero terminal SNR (rescale_betas_zero_snr=True) in LCMScheduler.

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten
@sayakpaul
@luosiallen

@dg845
Collaborator Author

dg845 commented Nov 18, 2023

For the conditional noise prediction noise_pred_text ($x_{pos}$ in the paper) passed to rescale_noise_cfg, I am currently using the output of the unet on the same latents and prompt_embeds but with a guidance scale embedding corresponding to a guidance_scale of 1 (i.e., no CFG). While this should theoretically remove the unconditional component and leave only the conditional output, it's not obvious that this is the right thing to do, because the LCM may not have seen guidance scale values that low during training/distillation (during training/distillation, a random guidance scale is typically sampled in $[3, 15]$; see Appendix F of the LCM paper).
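For reference, the rescaling step itself follows Eq. (15) of the guidance rescale paper; here is a minimal NumPy sketch of it (the actual rescale_noise_cfg helper in the Stable Diffusion pipelines operates on torch tensors, but the math is the same):

```python
import numpy as np

def rescale_noise_cfg(noise_cfg, noise_pred_text, guidance_rescale=0.0):
    """Rescale noise_cfg toward the std of the conditional prediction (Eq. 15)."""
    # Per-sample standard deviation over all non-batch dimensions.
    axes = tuple(range(1, noise_pred_text.ndim))
    std_text = noise_pred_text.std(axis=axes, keepdims=True)
    std_cfg = noise_cfg.std(axis=axes, keepdims=True)
    # Rescale the CFG prediction so its std matches the conditional prediction's std.
    noise_pred_rescaled = noise_cfg * (std_text / std_cfg)
    # Interpolate between the rescaled and original predictions to avoid
    # overly "plain"-looking images at full rescaling.
    return guidance_rescale * noise_pred_rescaled + (1.0 - guidance_rescale) * noise_cfg
```

With guidance_rescale=0.0 this is a no-op, and with guidance_rescale=1.0 the output's per-sample std exactly matches that of noise_pred_text.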

@patrickvonplaten
Contributor

cc @patil-suraj feel free to merge if ok for you

@patil-suraj
Contributor

Do we have any results for this? And as you said, the model has not seen guidance scales below 3 during training, so I'm not sure if this makes a difference in results.

Also, we should support this in the base pipelines as well, since we can now use LCMs with the base pipelines.

@dg845
Collaborator Author

dg845 commented Nov 20, 2023

I haven't tested this implementation of guidance rescaling on a full LCM checkpoint yet. I think people have tried guidance rescaling with the LCM LoRA on pipelines that use CFG instead of a guidance scale embedding (which avoids the problem of what the proper $x_{pos}$ value should be).

@dg845
Collaborator Author

dg845 commented Nov 28, 2023

Here is a script to get some examples:

import torch
from diffusers import LatentConsistencyModelPipeline

seed = 0
device = "cuda"
torch_dtype = torch.float16
model_id_or_path = "SimianLuo/LCM_Dreamshaper_v7"
pipe = LatentConsistencyModelPipeline.from_pretrained(
    model_id_or_path,
    torch_dtype=torch_dtype,
)
pipe.to(torch_device=device, torch_dtype=torch_dtype)

generator = torch.manual_seed(seed)
image = pipe(
    prompt="Self-portrait oil painting, a beautiful cyborg with golden hair, 8k",
    num_inference_steps=4,
    guidance_scale=8.5,  # 7.5 in the original LCM paper CFG formulation
    generator=generator,
    guidance_rescale=0.7,  # The default suggested in the original guidance rescale paper
).images[0]

image.save(f"samples_seed_{seed}.png")

I ran the inference in mixed precision due to GPU memory constraints.

Here are some examples:

Seed 0:

samples

Seed 2937:

samples_seed_2937

Seed 3409:

samples_seed_3409

Seed 49283:

samples_seed_49283

The examples look pretty good to me, but I'm not sure whether they represent a noticeable improvement over samples without guidance rescaling. Curious what people think about the sample quality @luosiallen @patil-suraj @patrickvonplaten.

@dg845
Collaborator Author

dg845 commented Nov 29, 2023

After some further investigation, it seems that images generated with and without guidance rescale tend to be very similar. The CFG noise prediction noise_pred_cfg and the non-CFG (conditional) noise prediction noise_pred_cond have very similar standard deviations throughout sampling, so the rescaled noise prediction ends up very close to the original CFG noise prediction (at least for the prompts I've tested so far with the SimianLuo/LCM_Dreamshaper_v7 checkpoint). Note that when this is the case, increasing the guidance_rescale factor doesn't have much effect, because we're interpolating between two very similar noise predictions.

In particular, I believe the above samples generated using guidance rescale are very similar to those generated without guidance rescale. They're not necessarily visually indistinguishable; my experience so far is that images generated with guidance rescale tend to be a little darker than without (because samples generated without CFG [e.g., guidance_scale = 1.0] tend to be darker).
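To illustrate why similar standard deviations make rescaling nearly a no-op, here is a small numeric check on hypothetical tensors (not real UNet outputs): when the per-sample stds nearly match, the rescale factor std_cond / std_cfg is approximately 1, so the rescaled prediction barely differs from the original one.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical noise predictions with nearly equal per-sample stds, mimicking
# what was observed with the SimianLuo/LCM_Dreamshaper_v7 checkpoint.
noise_pred_cond = rng.normal(size=(1, 4, 64, 64))
noise_pred_cfg = noise_pred_cond + 0.01 * rng.normal(size=noise_pred_cond.shape)

axes = (1, 2, 3)
ratio = noise_pred_cond.std(axis=axes, keepdims=True) / noise_pred_cfg.std(axis=axes, keepdims=True)
noise_pred_rescaled = noise_pred_cfg * ratio

# Even at guidance_rescale=1.0, the result barely moves, because the
# rescale factor std_cond / std_cfg is already ~1.
max_rel_change = np.abs(noise_pred_rescaled - noise_pred_cfg).max() / np.abs(noise_pred_cfg).max()
print(f"rescale factor: {float(ratio.squeeze()):.5f}, max relative change: {max_rel_change:.2e}")
```

Since any guidance_rescale value only interpolates between noise_pred_cfg and noise_pred_rescaled, it can never move the output further from noise_pred_cfg than this already tiny gap.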

@sayakpaul
Member

@patil-suraj a gentle ping.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions Bot added the stale Issues that haven't received updates label Jan 16, 2024
@patrickvonplaten patrickvonplaten removed the stale Issues that haven't received updates label Jan 17, 2024
@patrickvonplaten
Contributor

@patil-suraj can you please check here again?

@dg845
Collaborator Author

dg845 commented Jan 18, 2024

To summarize my testing regarding this PR, it seems that the scales of the CFG estimate noise_pred_cfg and the conditional noise prediction noise_pred_cond are very similar for the SimianLuo/LCM_Dreamshaper_v7 LCM checkpoint. Thus using guidance rescaling typically alters the final sample only a little bit (see #5859 (comment)).

In the original guidance rescale paper, the authors observe that as the terminal SNR goes to 0 (at timesteps near num_train_timesteps $T$) and at high guidance weights, noise_pred_cfg becomes large and can result in saturated images, and they propose guidance rescaling to fix this. I am not sure the same conditions hold in general for LCM models, since they don't normally perform CFG.
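For intuition on the failure mode the paper targets, here is a toy check with made-up standard-normal tensors (not real model outputs): under the standard CFG combination used by non-LCM pipelines, the standard deviation of the combined prediction grows roughly linearly in the guidance weight when the two predictions are weakly correlated, which is what produces over-saturated samples.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up unit-variance conditional / unconditional noise predictions.
noise_pred_uncond = rng.normal(size=(4, 64, 64))
noise_pred_text = rng.normal(size=(4, 64, 64))

for w in (1.0, 7.5, 15.0):
    # Standard CFG combination used by non-LCM pipelines.
    noise_pred_cfg = noise_pred_uncond + w * (noise_pred_text - noise_pred_uncond)
    print(f"w={w:>4}: std={float(noise_pred_cfg.std()):.2f}")
```

An LCM conditions on a guidance scale embedding instead of computing this combination at inference time, so it's unclear whether its predictions blow up in the same way.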

I guess the advantage of merging the PR would be that guidance rescaling would be available for the LCM pipelines as a feature. (I believe it's already available for other pipelines compatible with LCMScheduler such as StableDiffusionPipeline because CFG and guidance rescaling are already implemented for those pipelines.)

The downsides are as follows:

  • With current LCM checkpoints, guidance rescaling does not seem to have a big effect on the samples
  • The LCM pipelines are more complex with guidance rescaling implemented (pipelines which implement CFG basically get guidance rescaling for free, but the LCM pipelines don't use CFG normally so a CFG-like implementation is currently used in order to support guidance rescaling)
  • Guidance rescaling is not currently theoretically well-justified for LCM models

@patrickvonplaten
Contributor

cc @patil-suraj again

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions Bot added the stale Issues that haven't received updates label Feb 12, 2024
@github-actions github-actions Bot closed this Feb 21, 2024
@yiyixuxu yiyixuxu removed the stale Issues that haven't received updates label Feb 21, 2024
@yiyixuxu yiyixuxu reopened this Feb 21, 2024
@yiyixuxu
Collaborator

gentle ping @patil-suraj

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions Bot added the stale Issues that haven't received updates label Mar 17, 2024
@sayakpaul sayakpaul removed the stale Issues that haven't received updates label Mar 17, 2024
@sayakpaul
Member

@patil-suraj could you give this a look?

@yiyixuxu
Collaborator

Interesting experiment!
In general, I think we should not add a feature unless there is a use case for it.

I will leave this PR open so more people can test it out and see whether they find this feature helpful.

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions Bot added the stale Issues that haven't received updates label Apr 13, 2024

Labels

contributions-welcome, stale (Issues that haven't received updates)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants