vq diffusion classifier free sampling #1294

Contributor

@williamberman williamberman commented Nov 15, 2022

Adds classifier free sampling to VQ diffusion. This results in significantly better image quality.

The pipeline now has a default guidance_scale of 5.0.

Additionally, the model trained on the ithq dataset uses a learned parameter for the classifier free embeddings. We modify the convert script to add this parameter to the ported model, so the weights will have to be reuploaded.
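
For readers unfamiliar with the technique, here is a minimal sketch of the guidance step, assuming the usual classifier-free formulation applied to per-class log-probabilities followed by renormalization. The function names `logsumexp` and `guide_log_probs` are hypothetical, the inputs are assumed to be finite, and the PR's actual code works on batched tensors and may differ in detail.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of finite floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def guide_log_probs(log_p_cond, log_p_uncond, guidance_scale=5.0):
    """Blend conditional and unconditional log-probabilities for one latent pixel.

    Hypothetical helper: pushes the prediction away from the unconditional
    output and toward the text-conditioned one, then renormalizes so the
    result is again a valid log-distribution over classes.
    """
    guided = [
        u + guidance_scale * (c - u)
        for c, u in zip(log_p_cond, log_p_uncond)
    ]
    norm = logsumexp(guided)
    return [g - norm for g in guided]


# Illustrative values only, not taken from the model.
cond = [math.log(p) for p in (0.6, 0.3, 0.1)]
uncond = [math.log(p) for p in (0.4, 0.4, 0.2)]
guided = guide_log_probs(cond, uncond, guidance_scale=5.0)
```

With a guidance_scale of 1.0 this reduces to the conditional prediction; larger values put more weight on the classes the text prompt favors relative to the unconditional prediction.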

Prompts: "teddy bear playing in the pool" and "horse"

Diffusers VQ diffusion with classifier free sampling

[image: classifier_free_sampling]

Diffusers VQ diffusion without classifier free sampling

[image: no_classifier_free_sampling]

Original VQ diffusion implementation with classifier free sampling

[image: orig_classifier_free_sampling]

Original VQ diffusion implementation without classifier free sampling

[image: orig_no_classifier_free_sampling]

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@williamberman williamberman marked this pull request as draft November 15, 2022 07:28
@williamberman williamberman force-pushed the will/vq-diffusion-classifier-free-guidance branch from d0d5beb to 4ee1e06 Compare November 15, 2022 21:01
@williamberman williamberman force-pushed the will/vq-diffusion-classifier-free-guidance branch from 4ee1e06 to cac658d Compare November 15, 2022 22:19
@williamberman williamberman force-pushed the will/vq-diffusion-classifier-free-guidance branch from cac658d to f5fbbbe Compare November 15, 2022 22:27
@williamberman williamberman force-pushed the will/vq-diffusion-classifier-free-guidance branch from f5fbbbe to 8746de8 Compare November 15, 2022 22:52
@williamberman williamberman force-pushed the will/vq-diffusion-classifier-free-guidance branch from 8746de8 to fe8db41 Compare November 16, 2022 00:04
@williamberman williamberman force-pushed the will/vq-diffusion-classifier-free-guidance branch from fe8db41 to bfc4459 Compare November 16, 2022 00:13
@williamberman williamberman force-pushed the will/vq-diffusion-classifier-free-guidance branch from bfc4459 to 40dc3ff Compare November 16, 2022 00:28
@williamberman williamberman force-pushed the will/vq-diffusion-classifier-free-guidance branch from 40dc3ff to 10e2ea3 Compare November 16, 2022 01:17
@williamberman williamberman force-pushed the will/vq-diffusion-classifier-free-guidance branch from 10e2ea3 to 08984ab Compare November 16, 2022 01:29

Comment on lines 240 to 241
"https://huggingface.co/datasets/williamberman/misc/resolve/main"
"/vq_diffusion/teddy_bear_pool_classifier_free_sampling.png"
Contributor Author

Should be moved to the huggingface testing dataset. FWIW you might have to also regenerate the image because I get a different image on VQDiffusionPipelineIntegrationtests#test_vq_diffusion.

Contributor

Very cool :-) I'll move it!

@williamberman williamberman marked this pull request as ready for review November 16, 2022 02:59
@williamberman williamberman changed the title from "[wip] vq diffusion classifier free sampling" to "vq diffusion classifier free sampling" Nov 16, 2022
Comment on lines 207 to 226
class LearnedClassifierFreeSamplingEmbeddings(ModelMixin, ConfigMixin):
    """
    Utility class for storing learned text embeddings for classifier free sampling
    """

    @register_to_config
    def __init__(self, learnable: bool, hidden_size: Optional[int] = None, length: Optional[int] = None):
        super().__init__()

        self.learnable = learnable

        if self.learnable:
            assert hidden_size is not None, "learnable=True requires `hidden_size` to be set"
            assert length is not None, "learnable=True requires `length` to be set"

            embeddings = torch.zeros(length, hidden_size)
        else:
            embeddings = None

        self.embeddings = nn.Parameter(embeddings)
Contributor Author

Not sure if this is the preferred way to add the learned embeddings to the pipeline. An alternative might be to add the additional vector to the scheduler instead

Contributor

It's very model specific, so moving it to the pipeline here directly :-)
Think that's a bit cleaner! The model works much better now though - thanks!
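
To illustrate the design being discussed, here is a rough sketch of how a pipeline might pick the unconditional embeddings, using plain Python lists in place of tensors. Everything here is hypothetical: `LearnedEmbeddings` is a stand-in for the class above, `encode_prompt` is an assumed text-encoder callable, and falling back to an empty prompt when no learned parameter exists is an assumption rather than something stated in this thread.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Embedding = List[List[float]]  # shape: (length, hidden_size)


@dataclass
class LearnedEmbeddings:
    """Hypothetical stand-in for LearnedClassifierFreeSamplingEmbeddings."""
    learnable: bool
    embeddings: Optional[Embedding] = None


def unconditional_embeddings(
    learned: LearnedEmbeddings,
    encode_prompt: Callable[[str], Embedding],
    batch_size: int,
) -> List[Embedding]:
    """Return one unconditional embedding per batch item."""
    if learned.learnable:
        # Checkpoints with a learned parameter (e.g. ithq) use it directly.
        base = learned.embeddings
    else:
        # Assumption: otherwise encode an empty prompt with the text encoder.
        base = encode_prompt("")
    # Copy per batch item so callers cannot mutate the shared parameter.
    return [[row[:] for row in base] for _ in range(batch_size)]
```

Keeping this choice inside the pipeline means the scheduler does not need to know whether a checkpoint ships a learned parameter.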

@@ -64,6 +65,7 @@ def __init__(
        tokenizer: CLIPTokenizer,
        transformer: Transformer2DModel,
        scheduler: VQDiffusionScheduler,
        learned_classifier_free_sampling_embeddings: LearnedClassifierFreeSamplingEmbeddings,
Contributor

That's definitely the right way to do it - it's quite specific to vq-diffusion IMO though, so will move it here :-)

@patrickvonplaten patrickvonplaten merged commit f1fcfde into huggingface:main Nov 16, 2022
@patrickvonplaten

Very nice job @williamberman !

yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* vq diffusion classifier free sampling

* correct

* uP

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>