
🌟 [CVPR 2026] Training-free Mixed-Resolution Latent Upsampling for Spatially Accelerated Diffusion Transformers

📑 Introduction

Training-free Mixed-Resolution Latent Upsampling for Spatially Accelerated Diffusion Transformers

Wongi Jeong*, Kyungryeol Lee*, Hoigi Seo, Se Young Chun (*co-first)

📚 arXiv

This paper proposes Region-Adaptive Latent Upsampling (RALU), a training-free framework that accelerates Diffusion Transformers along the spatial dimension. RALU selectively upsamples only edge-sensitive regions during denoising to suppress artifacts while preserving the model's semantic fidelity and visual quality. It further introduces a noise-timestep rescheduling strategy that ensures stable generation across resolution transitions, making it compatible with temporal acceleration methods.
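The core idea above can be sketched in a few lines. This is an illustrative sketch, not the repository's actual implementation: the edge score (a finite-difference gradient magnitude) and all function names here are assumptions chosen to show how an `up_ratio` fraction of edge-sensitive latent tokens could be selected for early upsampling.

```python
import numpy as np

def edge_scores(latent: np.ndarray) -> np.ndarray:
    """Per-token edge strength of an (H, W, C) latent via finite differences."""
    gy = np.abs(np.diff(latent, axis=0, prepend=latent[:1]))
    gx = np.abs(np.diff(latent, axis=1, prepend=latent[:, :1]))
    return (gx + gy).sum(axis=-1)  # (H, W) edge magnitude per token

def select_edge_tokens(latent: np.ndarray, up_ratio: float) -> np.ndarray:
    """Boolean mask of the `up_ratio` fraction of tokens with strongest edges."""
    scores = edge_scores(latent)
    k = max(1, int(up_ratio * scores.size))
    thresh = np.partition(scores.ravel(), -k)[-k]  # k-th largest score
    return scores >= thresh  # True = upsample this region early

latent = np.random.randn(16, 16, 4)
mask = select_edge_tokens(latent, up_ratio=0.3)
print(mask.mean())  # fraction of selected tokens, approximately up_ratio
```

In the actual method, the masked regions would be upsampled to the target resolution earlier than the rest of the latent, concentrating compute where artifacts are most visible.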

🔥 Recent News

  • [2026.02.20] RALU is accepted at CVPR 2026!
  • [2025.08.07] RALU code has been released.
  • [2025.07.11] RALU is on arXiv.

🚀 Usage

  1. Environment Setup

    Make sure your environment is capable of running FLUX. Only a few additional packages need to be installed.

    Configure Parameters

    • use_RALU_default: Use the predefined configurations (4× or 7× speedup) as described in the RALU paper.
    • level: When using --use_RALU_default, specify the desired acceleration level (either 4 or 7).
    • N: A list of denoising step counts for each of the three stages.
    • e: A list of end timesteps for each stage. The last value must be 1.0, as it denotes the final timestep.
    • up_ratio: The ratio of tokens to be early upsampled in Stage 2.
  2. Run the Example

    Execute the RALU_inference.py script.

    Option 1: Using the default RALU setting (4× or 7× speedup)

    python RALU_inference.py --use_RALU_default --level 4

    Option 2: Using custom N and e values

    python RALU_inference.py --N 4 5 6 --e 0.3 0.45 1.0   # for N=[4, 5, 6], e=[0.3, 0.45, 1.0]

    Note: The last value of e must always be 1.0, indicating the end of the diffusion process.
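The schedule constraints above can be checked before launching a run. The following is a minimal sketch (an assumption for illustration, not the repository's actual argument parser) of validating custom `N` and `e` values for the three-stage schedule:

```python
def validate_schedule(N, e):
    """Validate a RALU stage schedule: per-stage step counts N and end timesteps e.

    Hypothetical helper: checks the constraints stated in the usage notes
    (three stages, positive step counts, increasing end timesteps, e[-1] == 1.0).
    """
    assert len(N) == 3 and len(e) == 3, "RALU uses a three-stage schedule"
    assert all(n > 0 for n in N), "each stage needs at least one denoising step"
    assert e[-1] == 1.0, "the last end timestep must be 1.0 (end of diffusion)"
    assert all(e[i] < e[i + 1] for i in range(2)), \
        "end timesteps must be strictly increasing"
    return list(zip(N, e))  # (steps, end timestep) per stage

print(validate_schedule([4, 5, 6], [0.3, 0.45, 1.0]))
# [(4, 0.3), (5, 0.45), (6, 1.0)]
```

With the default 4× setting, the script would fill in a predefined schedule instead of user-supplied `N` and `e`.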

📸 Example Outputs

The images below compare the results of applying 4× and 7× acceleration via naive reduction of num_inference_steps in FLUX.1-dev vs. using RALU with the same speedup factors.

πŸ™ Acknowledgments

This code is based on the FLUX pipeline implementation provided by Diffusers.
