Implementing for flow models? #12

@zaptrem

I tried to implement this for flow models as described in the appendix, but the result is complete collapse (exploding images). Did I make a mistake, or is this technique fundamentally incompatible with flow models (which have no renoising step)? Also, the paper doesn't define v lambda.

import torch

def euler_cfgpp_update(
    x_t: torch.Tensor,
    t: float,
    dt: float,
    v_u: torch.Tensor,
    v_c: torch.Tensor,
    lambda_val: float,
) -> torch.Tensor:
    # Unconditional velocity at (x_t, t)
    # v_u = model_uncond(x_t, t)
    # Conditional velocity at (x_t, t)
    # v_c = model_cond(x_t, t)

    # Unconditional “Tweedie” (clean-sample) estimate.
    # The appendix form is x_t - t * v_u; this code assumes t = 1 is the data end, hence (1 - t).
    x_null = x_t + (1 - t) * v_u
    # Conditional “Tweedie” (clean-sample) estimate, same convention.
    x_cond = x_t + (1 - t) * v_c

    # normal cfg prediction
    # x_cfg = x_t + (1 - t) * (v_u + 2.3 * (v_c - v_u))

    # CFG++ “Tweedie” estimate (interpolation):
    # x̃ₐ⁽λ⁾ = (1-λ)* x̃ₐ⁽∅⁾  +  λ * x̃ₐ⁽ᶜ⁾
    x_cfgpp = x_null + lambda_val * (x_cond - x_null)
    # Next time = t + dt
    t_next = t + dt

    # Euler step for CFG++ (appendix form: x_{t1} = x_cfgpp + (x_t - x_null) / t0 * t1),
    # rewritten for this code's time convention, where (1 - t) plays the role of t0:
    # (Make sure t != 1 to avoid divide-by-zero!)
    # eps = 1e-12
    x_next = x_cfgpp + (x_t - x_null) * ((1 - t_next) / (1 - t))

    # vanilla cfg
    # x_next = x_cfg + (x_t - x_cfg) * ((1 - t_next) / (1 - (t + eps)))

    return x_next

@geonyeong-park @CFGpp-diffusion @jeongsol-kim
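
For reference, here is a small sanity check of what the update above computes once expanded. This is only algebra on the posted code, not something taken from the paper: substituting `x_null` and `x_cfgpp` gives `x_next = x_t + dt * v_u + lambda_val * (1 - t) * (v_c - v_u)`, so the guidance term is scaled by `(1 - t)` rather than by `dt`. The third function, `interpolation_update`, is a hypothetical alternative that assumes a rectified-flow path `x_t = (1 - t) * noise + t * data`; it is a sketch for comparison, not the authors' method. Plain floats are used so the check runs without torch; the arithmetic is elementwise, so tensors should behave the same way.

```python
import math


def posted_update(x_t, t, dt, v_u, v_c, lam):
    """The update exactly as written in the snippet above."""
    x_null = x_t + (1 - t) * v_u
    x_cond = x_t + (1 - t) * v_c
    x_cfgpp = x_null + lam * (x_cond - x_null)
    t_next = t + dt
    return x_cfgpp + (x_t - x_null) * ((1 - t_next) / (1 - t))


def expanded_update(x_t, t, dt, v_u, v_c, lam):
    """Algebraic expansion of posted_update: Euler step plus a guidance term."""
    return x_t + dt * v_u + lam * (1 - t) * (v_c - v_u)


def interpolation_update(x_t, t, dt, v_u, v_c, lam):
    """Hypothetical variant (an assumption, not from the paper): re-interpolate
    between the guided data estimate and the unconditional noise estimate,
    assuming x_t = (1 - t) * noise + t * data."""
    x_data = x_t + (1 - t) * (v_u + lam * (v_c - v_u))  # guided data estimate
    noise_null = x_t - t * v_u                          # unconditional noise estimate
    t_next = t + dt
    return t_next * x_data + (1 - t_next) * noise_null


args = (0.5, 0.2, 0.1, 1.0, 2.0, 0.6)  # x_t, t, dt, v_u, v_c, lambda
print(posted_update(*args), expanded_update(*args), interpolation_update(*args))
```

With `lam = 0` both versions reduce to a plain Euler step with `v_u`. For `lam > 0` the guidance contribution of the posted update does not shrink as `dt` shrinks, so its total over a trajectory grows with the number of steps. Whether that accounts for the collapse is not established here.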
