
Conversation

yiyixuxu
Collaborator

Part of my embedding refactor, separated by model/pipeline so it is easier to work with.

This PR focuses on the embeddings used only in PixArt-Alpha, i.e. CombinedTimestepSizeEmbeddings and CaptionProjection:

  1. Rename them to PixArtAlphaCombinedTimestepSizeEmbeddings and PixArtAlphaTextProjection so it is clear that these embeddings are only used in PixArt-Alpha (see the import sketch below).
  2. Remove code that's not needed (let me know if I got anything wrong here).
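For reference, a minimal sketch of what the rename means for downstream imports; the module path diffusers.models.embeddings is an assumption based on where these classes currently live, not something stated in this thread:

# Before this PR (old names):
# from diffusers.models.embeddings import CombinedTimestepSizeEmbeddings, CaptionProjection

# After this PR (renamed, PixArt-Alpha-specific):
from diffusers.models.embeddings import (
    PixArtAlphaCombinedTimestepSizeEmbeddings,
    PixArtAlphaTextProjection,
)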

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

aspect_ratio, batch_size=batch_size, embedder=self.aspect_ratio_embedder
)
conditioning = timesteps_emb + torch.cat([resolution, aspect_ratio], dim=1)
resolution_emb = self.additional_condition_proj(resolution.flatten()).to(hidden_dtype)
Contributor

very nice refactor!

Member

@sayakpaul sayakpaul left a comment

LGTM if the SLOW tests pass. Could you please run the slow tests with these changes as well?

Comment on lines +857 to +859
if do_classifier_free_guidance:
    resolution = torch.cat([resolution, resolution], dim=0)
    aspect_ratio = torch.cat([aspect_ratio, aspect_ratio], dim=0)
Member

Seems like a new addition?

Collaborator Author

@yiyixuxu yiyixuxu Dec 18, 2023

Not really - it gets duplicated later inside the embedding:

if size.shape[0] != batch_size:
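A rough, self-contained sketch of the duplication path being discussed - the branch the old embedding ran when the size tensor had fewer rows than the (classifier-free-guidance-doubled) batch. The helper name and shapes are illustrative, not the exact diffusers code:

import torch

def maybe_duplicate_for_cfg(size: torch.Tensor, batch_size: int) -> torch.Tensor:
    # If the conditioning (resolution or aspect ratio) was passed once per prompt
    # but the model runs a doubled cond/uncond batch, repeat it to match.
    if size.shape[0] != batch_size:
        size = size.repeat(batch_size // size.shape[0], 1)
    return size

resolution = torch.tensor([[1024.0, 1024.0]])                    # one prompt
resolution = maybe_duplicate_for_cfg(resolution, batch_size=2)   # cond + uncond
print(resolution.shape)  # torch.Size([2, 2])

In other words, the explicit torch.cat in the pipeline (quoted above) replaces the implicit duplication the embedding used to do.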

Member

Very nice <3

Collaborator Author

@sayakpaul
Fast tests fail because there are some randomly initialized weights in some components. I think we need to put torch.manual_seed(0) before creating each component, e.g.

vae = AutoencoderKL()
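Spelled out, a minimal sketch of that seeding pattern for a fast-test setup (component classes and default arguments here are illustrative, not the actual test config):

import torch
from diffusers import AutoencoderKL

# Seed immediately before constructing each randomly initialized component so
# the fast tests get deterministic weights (and therefore deterministic outputs).
torch.manual_seed(0)
vae = AutoencoderKL()

torch.manual_seed(0)
# transformer = Transformer2DModel(...)  # repeat for every component built from scratch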

Should we open a new PR to only update the tests, and then I rebase this one after that? I'm not comfortable updating the tests directly from this PR since I also updated the code.

Member

But that wasn't the case before. Wonder what changed.

Member

@@ -235,7 +235,7 @@ def __init__(

         self.caption_projection = None
         if caption_channels is not None:
-            self.caption_projection = CaptionProjection(in_features=caption_channels, hidden_size=inner_dim)
+            self.caption_projection = PixArtAlphaTextProjection(in_features=caption_channels, hidden_size=inner_dim)
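For context, a rough sketch of what a caption/text projection module of this shape typically looks like - two linear layers around a GELU. This is an assumption about the general pattern, not the exact PixArtAlphaTextProjection implementation:

import torch.nn as nn

class TextProjectionSketch(nn.Module):
    # Hypothetical stand-in: projects caption embeddings (in_features, e.g. the
    # text-encoder hidden size) into the transformer's inner_dim so they can be
    # consumed as cross-attention context.
    def __init__(self, in_features: int, hidden_size: int):
        super().__init__()
        self.linear_1 = nn.Linear(in_features, hidden_size)
        self.act_1 = nn.GELU(approximate="tanh")
        self.linear_2 = nn.Linear(hidden_size, hidden_size)

    def forward(self, caption):
        return self.linear_2(self.act_1(self.linear_1(caption)))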
Collaborator

Might actually be worth breaking Transformer2D up into a dedicated one for PixArt.

Member

For a future PR, yeah? I am happy to work on it once this is merged.

Collaborator Author

@yiyixuxu yiyixuxu Dec 19, 2023

Definitely for a future PR

But I think we should refactor transformers and UNet after we clean up all the lower-level classes and make such decisions for all models/pipelines at once so it will be consistent

Member

Yes 100 percent!

@yiyixuxu yiyixuxu merged commit 3e71a20 into main Dec 19, 2023
@yiyixuxu yiyixuxu deleted the refactor-embeddings branch December 19, 2023 17:07
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
pixart-alpha

Co-authored-by: yiyixuxu <yixu310@gmail.com>
donhardman pushed a commit to donhardman/diffusers that referenced this pull request Dec 29, 2023
pixart-alpha

Co-authored-by: yiyixuxu <yixu310@gmail.com>
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
pixart-alpha

Co-authored-by: yiyixuxu <yixu310@gmail.com>