SanaCombinedTimestepGuidanceEmbeddings do not work with SanaPipeline #12540

@frutiemax92

Describe the bug

When building a SanaTransformer2DModel, the guidance_embeds config option selects which time_embed module is used. The default, AdaLayerNormSingle, works fine when calling the pipeline. The other choice, SanaCombinedTimestepGuidanceEmbeddings (selected with guidance_embeds=True), fails with a TypeError.
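A quick way to check which keyword arguments a given time_embed accepts is to inspect its forward signature. The snippet below is only a sketch: `accepts_kwarg`, `EmbedWithBatchSize`, and `EmbedWithGuidance` are hypothetical names invented for illustration, and the two classes are plain stand-ins, not the diffusers modules.

```python
import inspect


def accepts_kwarg(fn, name):
    """Return True if callable `fn` can be called with keyword argument `name`."""
    params = inspect.signature(fn).parameters
    param = params.get(name)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if param is not None and param.kind in keyword_kinds:
        return True
    # A **kwargs catch-all also accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins (NOT the real diffusers classes).
class EmbedWithBatchSize:
    def forward(self, timestep, batch_size=None, hidden_dtype=None):
        return timestep, timestep


class EmbedWithGuidance:
    def forward(self, timestep, guidance=None, hidden_dtype=None):
        return timestep, timestep


print(accepts_kwarg(EmbedWithBatchSize().forward, "batch_size"))
print(accepts_kwarg(EmbedWithGuidance().forward, "batch_size"))
```

On a real model the same helper could presumably be pointed at `new_transformer.time_embed.forward`, assuming the attribute name shown in the traceback (`self.time_embed`).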

Reproduction

from diffusers import SanaTransformer2DModel, SanaPAGPipeline
import torch

config = SanaTransformer2DModel.load_config('Efficient-Large-Model/Sana_600M_512px_diffusers', subfolder='transformer')
config['num_layers'] = 14
config['patch_size'] = 2
config['interpolation_scale'] = 1.0
config['guidance_embeds'] = True

new_transformer = SanaTransformer2DModel.from_config(config)

total_params = sum(p.numel() for p in new_transformer.parameters())
print(f"Total number of parameters: {total_params:,}")

pipe = SanaPAGPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_512px_MultiLing_diffusers",
    pag_applied_layers=["transformer_blocks.8"],
    transformer=new_transformer,
    torch_dtype=torch.bfloat16,
    vae=None,
).to('cuda')

# Run a short generation; this call raises the TypeError shown under Logs.
with torch.no_grad():
    image = pipe(
        prompt='cat',
        output_type='latent',
        height=768,
        width=512,
        pag_scale=1.0,
        guidance_scale=4.0,
        num_inference_steps=20,
    )[0]

Logs

File "/home/lucas/Documents/YAT/create_sana_transformer.py", line 23, in <module>
    image = pipe(
            ^^^^^
  File "/home/lucas/Documents/YAT/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/lucas/Documents/YAT/.venv/lib/python3.12/site-packages/diffusers/pipelines/pag/pipeline_pag_sana.py", line 877, in __call__
    noise_pred = self.transformer(
                 ^^^^^^^^^^^^^^^^^
  File "/home/lucas/Documents/YAT/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lucas/Documents/YAT/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lucas/Documents/YAT/.venv/lib/python3.12/site-packages/diffusers/models/transformers/sana_transformer.py", line 539, in forward
    timestep, embedded_timestep = self.time_embed(
                                  ^^^^^^^^^^^^^^^^
  File "/home/lucas/Documents/YAT/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lucas/Documents/YAT/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: SanaCombinedTimestepGuidanceEmbeddings.forward() got an unexpected keyword argument 'batch_size'
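
The traceback suggests the transformer's forward calls self.time_embed with a batch_size keyword that SanaCombinedTimestepGuidanceEmbeddings.forward() does not accept. The mismatch can be sketched in isolation as follows. Everything here is an assumption for illustration: `GuidanceStyleEmbed` and `call_like_transformer` are hypothetical stand-ins, not the diffusers implementation.

```python
class GuidanceStyleEmbed:
    """Hypothetical stand-in: forward takes `guidance`, not `batch_size`."""

    def forward(self, timestep, guidance=None, hidden_dtype=None):
        return timestep, timestep


def call_like_transformer(time_embed, timestep, batch_size=1):
    # Assumption based on the traceback: the caller always passes batch_size.
    return time_embed.forward(timestep, batch_size=batch_size, hidden_dtype=None)


def reproduce_error():
    """Return the TypeError message, or None if the call succeeds."""
    try:
        call_like_transformer(GuidanceStyleEmbed(), timestep=0.5)
    except TypeError as exc:
        return str(exc)
    return None


print(reproduce_error())
```

If this reading is right, the failure does not depend on the prompt, resolution, or PAG settings; it only depends on which keyword arguments the caller passes to the embedder.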

System Info

I am on Ubuntu 24.04 LTS.

pip show diffusers
Name: diffusers
Version: 0.35.2
Summary: State-of-the-art diffusion in PyTorch and JAX.
Home-page: https://github.com/huggingface/diffusers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/diffusers/graphs/contributors)
Author-email: diffusers@huggingface.co
License: Apache 2.0 License
Location: /home/lucas/Documents/YAT/.venv/lib/python3.12/site-packages
Requires: filelock, huggingface-hub, importlib_metadata, numpy, Pillow, regex, requests, safetensors
Required-by:

Who can help?

No response
