sayakpaul
Member

What does this PR do?

Introduces resolution binning as originally done by @lawrence-cj in #5716.

This is particularly useful for generating non-square images.
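As a rough illustration of the idea, the sketch below snaps a requested (height, width) to the closest entry in a table of aspect-ratio bins. The function name `classify_to_bin` and the `EXAMPLE_BINS` table are hypothetical and written for this example; only the (832, 1216) bin size comes from the discussion in this thread, and the pipeline's real bin table and helper may differ.

```python
from typing import Dict, Tuple

# Illustrative bin table: aspect ratio (height / width) -> (height, width).
# Only the (832, 1216) size is mentioned in this thread; the ratio keys and
# the other entries are assumptions made for this sketch.
EXAMPLE_BINS: Dict[str, Tuple[int, int]] = {
    "0.5": (704, 1408),
    "0.68": (832, 1216),
    "1.0": (1024, 1024),
    "2.0": (1408, 704),
}


def classify_to_bin(
    height: int, width: int, bins: Dict[str, Tuple[int, int]]
) -> Tuple[int, int]:
    """Return the bin size whose aspect ratio is closest to height / width."""
    aspect_ratio = height / width
    closest_key = min(bins, key=lambda ratio: abs(float(ratio) - aspect_ratio))
    return bins[closest_key]


# A tiny request is still mapped to a full-size bin.
print(classify_to_bin(32, 48, EXAMPLE_BINS))
# A tall, non-square request falls into the tallest bin of this table.
print(classify_to_bin(1536, 720, EXAMPLE_BINS))
```

Under this scheme the model would generate at the binned size, and the result would then be resized and cropped back to the requested size.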

Quality comparison

1024x2048: No Binning vs. Binning (images not shown)

1536x720: No Binning vs. Binning (images not shown)

Code

import torch

from diffusers import PixArtAlphaPipeline

prompts = [
    "A small cactus with a happy face in the Sahara desert.",
    "Pirate ship trapped in a cosmic maelstrom nebula, rendered in cosmic beach whirlpool engine, volumetric lighting, spectacular, ambient lights, light pollution, cinematic atmosphere, art nouveau style, illustration art artwork by SenseiJaye, intricate detail.",
    "stars, water, brilliantly, gorgeous large scale scene, a little girl, in the style of dreamy realism, light gold and amber, blue and pink, brilliantly illuminated in the background.",
    "nature vs human nature, surreal, UHD, 8k, hyper details, rich colors, photograph.",
]

pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS", torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()

no_resolution_binning = []
resolution_binning = []

for i, prompt in enumerate(prompts):
    # Generate at the requested size directly, with binning disabled.
    generator = torch.manual_seed(2344)
    image = pipe(prompt, height=1536, width=720, generator=generator, use_resolution_binning=False).images[0]
    image.save(f"no_resolution_binning_{i}_1536x720.png")
    no_resolution_binning.append(image)

    # Re-seed so both runs start from the same noise, then use the default (binning enabled).
    generator = torch.manual_seed(2344)
    image = pipe(prompt, height=1536, width=720, generator=generator).images[0]
    image.save(f"resolution_binning_{i}_1536x720.png")
    resolution_binning.append(image)

sayakpaul and others added 2 commits November 10, 2023 10:02
Co-authored-by: lawrence-cj <jschen@mail.dlut.edu.cn>
@sayakpaul
Member Author

sayakpaul commented Nov 10, 2023

@DN6 we're introducing a new dependency here - torchvision. Could you please help review whether I added all the necessary guardrails to not break anything?

@lawrence-cj
Contributor

No problem on my side.

Contributor

@patrickvonplaten patrickvonplaten left a comment

Very nice addition! Let's add a test as well and please no torchvision dependency 🙏

@sayakpaul
Member Author

Eliminated torchvision as a dependency and the results look fine:

1024x2048: No Binning vs. Binning (images not shown)

1536x720: No Binning vs. Binning (images not shown)

@sayakpaul
Member Author

sayakpaul commented Nov 11, 2023

@patrickvonplaten I have added a pure PyTorch implementation.
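For readers wondering what a torchvision-free resize-and-crop step involves, here is a minimal sketch of just the geometry: scale the generated image so it covers the requested size, then center-crop. The function name `resize_and_crop_geometry` is hypothetical, and the binned size (1408, 704) used in the example is an assumption rather than a value taken from this PR. The tensor resize itself could be done with, for example, `torch.nn.functional.interpolate` followed by slicing, which is left out so the snippet runs without PyTorch.

```python
from typing import Tuple

Size = Tuple[int, int]            # (height, width)
Box = Tuple[int, int, int, int]   # (top, left, bottom, right)


def resize_and_crop_geometry(
    src_h: int, src_w: int, dst_h: int, dst_w: int
) -> Tuple[Size, Box]:
    """Compute the intermediate size and center-crop box.

    The source is scaled by the larger of the two per-axis ratios so that
    it covers the target in both dimensions, then cropped around the center.
    """
    scale = max(dst_h / src_h, dst_w / src_w)
    # max(...) guards against rounding leaving the image one pixel too small.
    resized_h = max(dst_h, round(src_h * scale))
    resized_w = max(dst_w, round(src_w * scale))
    top = (resized_h - dst_h) // 2
    left = (resized_w - dst_w) // 2
    return (resized_h, resized_w), (top, left, top + dst_h, left + dst_w)


# Assumed binned size (1408, 704) brought back to the requested (1536, 720).
resized, box = resize_and_crop_geometry(1408, 704, 1536, 720)
print(resized, box)
```

Scaling by the larger ratio means the crop only discards pixels; it never has to pad.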

The reason we cannot add a fast test for resolution binning is that, even for very low resolutions, classify_height_width_bin() returns resolutions that are too high to run on a CPU. For example, the (32, 48) resolution is mapped to (832, 1216), which is quite high for a CPU.

This is why I chose to add a slow test for it and include that in #5752 (as it requires changing the assertion values anyway). Is that okay?
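To put the size jump in numbers, the short calculation below compares the pixel counts of the requested and binned resolutions quoted above. The variable names are only for illustration, and pixel count is used as a rough proxy for cost, not a measurement.

```python
# Requested and binned (height, width), as quoted in the comment above.
requested = (32, 48)
binned = (832, 1216)

requested_pixels = requested[0] * requested[1]
binned_pixels = binned[0] * binned[1]
growth = binned_pixels / requested_pixels

print(f"requested: {requested_pixels} px")   # 1536 px
print(f"binned:    {binned_pixels} px")      # 1011712 px
print(f"growth:    about {growth:.0f}x")     # about 659x
```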

@sayakpaul sayakpaul mentioned this pull request Nov 13, 2023
Contributor

@patrickvonplaten patrickvonplaten left a comment

Clean!

@sayakpaul sayakpaul merged commit ed759f0 into main Nov 14, 2023
@sayakpaul sayakpaul deleted the feat/resolution-bin-pixart-alpha branch November 14, 2023 03:05
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* feat: add resolution binning

Co-authored-by: lawrence-cj <jschen@mail.dlut.edu.cn>

* rename

* debug

* add :test

* remove unused variable

* set resolution_binning to False.

---------

Co-authored-by: lawrence-cj <jschen@mail.dlut.edu.cn>
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024