# DDIM Formulas (Condensed)

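DDIM (Denoising Diffusion Implicit Models) keeps the same forward noising marginals $q(x_t\mid x_0)$ as DDPM, but its **reverse sampling process is non-Markovian and can be made deterministic**; the parameter $\eta$ controls how much randomness is injected.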

## 1) Forward diffusion (same as DDPM)
For $t=1,\dots,T$:

$$
\beta_t \in (0,1),\quad \alpha_t=1-\beta_t,\quad \bar{\alpha}_t=\prod_{i=1}^t \alpha_i
$$

$$
q(x_t\mid x_0)=\mathcal{N}\!\left(\sqrt{\bar{\alpha}_t}\,x_0,\; (1-\bar{\alpha}_t)I\right)
$$

Equivalently, a sample can be drawn as:

$$
x_t=\sqrt{\bar{\alpha}_t}\,x_0+\sqrt{1-\bar{\alpha}_t}\,\epsilon,\quad \epsilon\sim\mathcal{N}(0,I)
$$
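
As a quick sanity check on the closed form above, the sketch below draws many scalar samples of $x_t$ and compares their empirical mean and variance with $\sqrt{\bar{\alpha}_t}\,x_0$ and $1-\bar{\alpha}_t$. It uses plain Python scalars so the arithmetic is easy to follow. The linear schedule values, the helper names `make_alpha_bar` and `q_sample`, and the choice of $x_0$ and $t$ are assumptions made for illustration.

```python
import math
import random

def make_alpha_bar(beta_start=1e-4, beta_end=0.02, T=1000):
    """Cumulative products abar_1..abar_T for a linear beta schedule (assumed values)."""
    alpha_bar, prod = [], 1.0
    for i in range(T):
        beta = beta_start + (beta_end - beta_start) * i / (T - 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar

def q_sample(x0, abar_t, rng):
    """Draw x_t ~ q(x_t | x_0) for a scalar x0 via the reparameterised form."""
    eps = rng.gauss(0.0, 1.0)
    return math.sqrt(abar_t) * x0 + math.sqrt(1.0 - abar_t) * eps

alpha_bar = make_alpha_bar()
rng = random.Random(0)
x0, t = 2.0, 300             # t is a 1-based timestep; alpha_bar[t - 1] stores abar_t
abar_t = alpha_bar[t - 1]
samples = [q_sample(x0, abar_t, rng) for _ in range(20000)]

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"empirical mean {mean:.3f} vs sqrt(abar_t) * x0 = {math.sqrt(abar_t) * x0:.3f}")
print(f"empirical var  {var:.3f} vs 1 - abar_t       = {1.0 - abar_t:.3f}")
```

The same expressions apply elementwise to image tensors, with $\bar{\alpha}_t$ broadcast over the batch.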

---

## 2) Reverse sampling (the core of DDIM)
The model predicts the noise $\epsilon_\theta(x_t,t)$, from which $x_0$ is estimated:

$$
\hat{x}_0=\frac{x_t-\sqrt{1-\bar{\alpha}_t}\,\epsilon_\theta(x_t,t)}{\sqrt{\bar{\alpha}_t}}
$$
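
The estimate above simply inverts the forward sampling equation. A minimal scalar sketch, with an assumed $\bar{\alpha}_t$ and a hypothetical helper `predict_x0`, shows that a perfect noise prediction recovers $x_0$ exactly, while an error in the predicted noise is scaled by $\sqrt{(1-\bar{\alpha}_t)/\bar{\alpha}_t}$:

```python
import math

abar_t = 0.4             # assumed value of abar_t, for illustration only
x0, eps = 1.5, -0.7      # a clean scalar "sample" and the noise actually added

# Forward step: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps
x_t = math.sqrt(abar_t) * x0 + math.sqrt(1.0 - abar_t) * eps

def predict_x0(x_t, eps_pred, abar_t):
    """Invert the forward equation using a predicted noise value."""
    return (x_t - math.sqrt(1.0 - abar_t) * eps_pred) / math.sqrt(abar_t)

x0_exact = predict_x0(x_t, eps, abar_t)         # perfect noise prediction
x0_off = predict_x0(x_t, eps + 0.1, abar_t)     # prediction off by 0.1

print("recovered x0:", x0_exact)
print("error in x0 for a 0.1 noise error:", x0_off - x0)
```

Because the scale factor grows as $\bar{\alpha}_t \to 0$, the estimate $\hat{x}_0$ is most sensitive to prediction errors at high noise levels.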

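Define the noise scale $\sigma_t$, with the convention $\bar{\alpha}_0=1$. Sampling may use fewer steps than training; in that case $t-1$ below refers to the previous timestep in the chosen subsequence: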

$$
\sigma_t=\eta\,\sqrt{\frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}}\,\sqrt{1-\frac{\bar{\alpha}_t}{\bar{\alpha}_{t-1}}}
$$
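
For consecutive timesteps, $1-\bar{\alpha}_t/\bar{\alpha}_{t-1}=1-\alpha_t=\beta_t$, so with $\eta=1$ the definition gives $\sigma_t^2=\frac{1-\bar{\alpha}_{t-1}}{1-\bar{\alpha}_t}\beta_t$, which is the DDPM posterior variance. The sketch below checks this numerically on a 10-step linear schedule. The schedule values and the helper name `ddim_sigma` are assumptions made for illustration.

```python
import math

def ddim_sigma(abar_t, abar_prev, eta):
    """sigma_t as defined above; abar_prev plays the role of abar_{t-1}."""
    return (eta
            * math.sqrt((1.0 - abar_prev) / (1.0 - abar_t))
            * math.sqrt(1.0 - abar_t / abar_prev))

# Linear beta schedule with 10 steps (assumed values, chosen only for illustration)
betas = [1e-4 + (0.02 - 1e-4) * i / 9 for i in range(10)]
alpha_bar, prod = [], 1.0
for beta in betas:
    prod *= 1.0 - beta
    alpha_bar.append(prod)

i = 5                                    # 0-based index of the step being examined
abar_t, abar_prev = alpha_bar[i], alpha_bar[i - 1]

sigma_eta1 = ddim_sigma(abar_t, abar_prev, eta=1.0)
beta_tilde = (1.0 - abar_prev) / (1.0 - abar_t) * betas[i]   # DDPM posterior variance

print(sigma_eta1 ** 2, beta_tilde)               # the two agree up to rounding
print(ddim_sigma(abar_t, abar_prev, eta=0.0))    # eta = 0 injects no noise
```

Values of $\eta$ between 0 and 1 interpolate between the deterministic sampler and this DDPM-like noise level.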

DDIM update rule:

$$
x_{t-1}=\sqrt{\bar{\alpha}_{t-1}}\,\hat{x}_0+\sqrt{1-\bar{\alpha}_{t-1}-\sigma_t^2}\,\epsilon_\theta(x_t,t)+\sigma_t\,z,
$$

where $z\sim\mathcal{N}(0,I)$. When $\eta=0$, $\sigma_t=0$ and the update is **deterministic**.
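
One way to build intuition for the update: if the predicted noise equals the noise that actually produced $x_t$, a deterministic step ($\eta=0$) lands exactly on $\sqrt{\bar{\alpha}_{t-1}}\,x_0+\sqrt{1-\bar{\alpha}_{t-1}}\,\epsilon$, that is, the same $x_0$ and $\epsilon$ at the lower noise level. The scalar sketch below checks this. The $\bar{\alpha}$ values and the helper name `ddim_step` are assumptions made for illustration.

```python
import math

def ddim_step(x_t, eps_pred, abar_t, abar_prev, eta=0.0, z=0.0):
    """One DDIM update for scalars; z is the fresh N(0, 1) draw (no effect when eta == 0)."""
    x0_hat = (x_t - math.sqrt(1.0 - abar_t) * eps_pred) / math.sqrt(abar_t)
    sigma = (eta
             * math.sqrt((1.0 - abar_prev) / (1.0 - abar_t))
             * math.sqrt(1.0 - abar_t / abar_prev))
    direction = math.sqrt(1.0 - abar_prev - sigma ** 2) * eps_pred
    return math.sqrt(abar_prev) * x0_hat + direction + sigma * z

abar_t, abar_prev = 0.4, 0.7     # assumed values; abar_prev > abar_t (less noise)
x0, eps = 1.5, -0.7

x_t = math.sqrt(abar_t) * x0 + math.sqrt(1.0 - abar_t) * eps
x_prev = ddim_step(x_t, eps, abar_t, abar_prev, eta=0.0)
expected = math.sqrt(abar_prev) * x0 + math.sqrt(1.0 - abar_prev) * eps

print(x_prev, expected)          # equal up to floating-point rounding
```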

---

## 3) Training objective
Training still uses the DDPM noise-prediction loss:

$$
\mathcal{L}=\mathbb{E}_{t,x_0,\epsilon}\left[\|\epsilon-\epsilon_\theta(x_t,t)\|^2\right]
$$
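
A minimal PyTorch sketch of one evaluation of this loss is given below. The network `TinyEps` is a hypothetical stand-in for $\epsilon_\theta$ (a small MLP on flattened inputs plus a scaled timestep), the data is random, and the schedule values are assumed. The mean over elements is used in place of the squared norm, which differs only by a constant factor.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

T = 1000
betas = torch.linspace(1e-4, 0.02, T)              # assumed linear schedule
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

class TinyEps(nn.Module):
    """Hypothetical stand-in for eps_theta: an MLP on x_t plus a scaled timestep."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim + 1, 64), nn.SiLU(), nn.Linear(64, dim))

    def forward(self, x_t, t):
        t_feat = (t.float() / T).unsqueeze(1)      # shape (B, 1)
        return self.net(torch.cat([x_t, t_feat], dim=1))

def ddpm_loss(model, x0):
    """Sample t and eps, form x_t in closed form, and regress the noise."""
    B = x0.shape[0]
    t = torch.randint(0, T, (B,))                  # 0-based index into alphas_cumprod
    eps = torch.randn_like(x0)
    abar = alphas_cumprod[t].unsqueeze(1)          # shape (B, 1), broadcasts over features
    x_t = abar.sqrt() * x0 + (1.0 - abar).sqrt() * eps
    return ((eps - model(x_t, t)) ** 2).mean()

model = TinyEps(dim=8)
x0 = torch.randn(16, 8)                            # placeholder "data"
loss = ddpm_loss(model, x0)
loss.backward()                                    # gradients for an optimizer step
print("loss", loss.item())
```

Nothing here is specific to DDIM: a network trained this way can be sampled with either the DDPM or the DDIM update.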

In [3]:
# DDIM sampling coefficients (no model involved)
import torch

beta_start = 1e-4
beta_end = 0.02
num_timesteps = 10
eta = 0.2  # eta=0 -> deterministic sampling

betas = torch.linspace(beta_start, beta_end, num_timesteps)
alphas = 1.0 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)

# Compute sigma_t for every t (alpha_bar_prev uses the convention alpha_bar_0 = 1)
alpha_bar_t = alphas_cumprod
alpha_bar_prev = torch.cat([alphas_cumprod.new_ones(1), alphas_cumprod[:-1]], dim=0)

sigma_t = eta * torch.sqrt((1 - alpha_bar_prev) / (1 - alpha_bar_t)) * torch.sqrt(1 - alpha_bar_t / alpha_bar_prev)

print("alphas_cumprod", alphas_cumprod)
print("alpha_bar_prev", alpha_bar_prev)
print("sigma_t", sigma_t)

alphas_cumprod tensor([0.9999, 0.9976, 0.9931, 0.9864, 0.9776, 0.9667, 0.9537, 0.9389, 0.9222,
        0.9037])
alpha_bar_prev tensor([1.0000, 0.9999, 0.9976, 0.9931, 0.9864, 0.9776, 0.9667, 0.9537, 0.9389,
        0.9222])
sigma_t tensor([0.0000, 0.0020, 0.0079, 0.0117, 0.0147, 0.0173, 0.0196, 0.0217, 0.0236,
        0.0254])


In [2]:
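# Numerical demo of a single DDIM update (a random tensor stands in for the model output)
import torch

B, C, H, W = 2, 3, 4, 4
x_t = torch.randn(B, C, H, W)

# Pretend this is the model output eps_theta(x_t, t)
eps_theta = torch.randn_like(x_t)

# Reuse alpha_bar_t / alpha_bar_prev / sigma_t from the coefficient cell above.
# Those are length-10 vectors, so pick one timestep index to get scalars
# that broadcast against the (B, C, H, W) tensors.
t = 5
a_t, a_prev, s_t = alpha_bar_t[t], alpha_bar_prev[t], sigma_t[t]

# Estimate x0
x0_hat = (x_t - torch.sqrt(1 - a_t) * eps_theta) / torch.sqrt(a_t)

# Sample the noise z
z = torch.randn_like(x_t)

x_prev = (
    torch.sqrt(a_prev) * x0_hat
    + torch.sqrt(1 - a_prev - s_t**2) * eps_theta
    + s_t * z
)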

print("x_prev shape", x_prev.shape)

x_prev shape torch.Size([2, 3, 4, 4])


In [4]:
# DDIM sampling on a subsequence of timesteps (50 steps chosen out of 1000)
import torch

num_train_steps = 1000
num_sample_steps = 50
eta = 0.0

betas = torch.linspace(1e-4, 0.02, num_train_steps)
alphas = 1.0 - betas
alphas_cumprod = torch.cumprod(alphas, dim=0)

# Select a sparse set of timestep indices (uniformly spaced)
steps = torch.linspace(0, num_train_steps - 1, num_sample_steps, dtype=torch.long)
steps = steps.flip(0)  # reverse sampling: T-1 -> 0

print(steps)
# Initial noise
B, C, H, W = 2, 3, 32, 32
x = torch.randn(B, C, H, W)

for i in range(len(steps)):
    t = steps[i]
    has_next = i + 1 < len(steps)

    alpha_bar_t = alphas_cumprod[t]
    # After the last selected step we jump to x_0, where alpha_bar is defined as 1
    alpha_bar_prev = alphas_cumprod[steps[i + 1]] if has_next else torch.tensor(1.0)

    # Random noise stands in for eps_theta(x_t, t) here
    eps_theta = torch.randn_like(x)

    # Estimate x0
    x0_hat = (x - torch.sqrt(1 - alpha_bar_t) * eps_theta) / torch.sqrt(alpha_bar_t)

    # DDIM sigma_t
    sigma_t = eta * torch.sqrt((1 - alpha_bar_prev) / (1 - alpha_bar_t)) * torch.sqrt(1 - alpha_bar_t / alpha_bar_prev)

    # Random term (has no effect when eta=0)
    z = torch.randn_like(x)

    x = (
        torch.sqrt(alpha_bar_prev) * x0_hat
        + torch.sqrt(1 - alpha_bar_prev - sigma_t**2) * eps_theta
        + sigma_t * z
    )

print("final sample shape", x.shape)

tensor([999, 978, 958, 937, 917, 897, 876, 856, 835, 815, 795, 774, 754, 733,
        713, 693, 672, 652, 632, 611, 591, 570, 550, 530, 509, 489, 468, 448,
        428, 407, 387, 366, 346, 326, 305, 285, 265, 244, 224, 203, 183, 163,
        142, 122, 101,  81,  61,  40,  20,   0])
final sample shape torch.Size([2, 3, 32, 32])
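
Because the loop above substitutes fresh random noise for the model output, it cannot show the determinism of $\eta=0$. The scalar sketch below swaps in a deterministic function, `toy_eps`, as a hypothetical stand-in for $\epsilon_\theta$ (it is not a trained model), and runs the same 50-of-1000 schedule from one fixed starting value with two different random seeds. The helper names and schedule values are assumptions made for illustration.

```python
import math
import random

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative alpha products for a linear beta schedule (assumed values)."""
    alpha_bar, prod = [], 1.0
    for i in range(T):
        prod *= 1.0 - (beta_start + (beta_end - beta_start) * i / (T - 1))
        alpha_bar.append(prod)
    return alpha_bar

def toy_eps(x, abar_t):
    """Hypothetical deterministic stand-in for eps_theta(x_t, t); not a trained model."""
    return math.sqrt(1.0 - abar_t) * x

def ddim_sample(x_T, alpha_bar, steps, eta, rng):
    """Run the DDIM update over a decreasing list of 0-based timestep indices."""
    x = x_T
    for i, t in enumerate(steps):
        abar_t = alpha_bar[t]
        abar_prev = alpha_bar[steps[i + 1]] if i + 1 < len(steps) else 1.0
        eps = toy_eps(x, abar_t)
        x0_hat = (x - math.sqrt(1.0 - abar_t) * eps) / math.sqrt(abar_t)
        sigma = (eta * math.sqrt((1.0 - abar_prev) / (1.0 - abar_t))
                 * math.sqrt(1.0 - abar_t / abar_prev))
        # max(...) guards against a tiny negative value from floating-point rounding
        coeff = math.sqrt(max(1.0 - abar_prev - sigma ** 2, 0.0))
        x = math.sqrt(abar_prev) * x0_hat + coeff * eps + sigma * rng.gauss(0.0, 1.0)
    return x

alpha_bar = make_alpha_bar()
n = 50
steps = [round(k * 999 / (n - 1)) for k in range(n)][::-1]   # 999 -> 0

x_T = 0.8
a = ddim_sample(x_T, alpha_bar, steps, eta=0.0, rng=random.Random(1))
b = ddim_sample(x_T, alpha_bar, steps, eta=0.0, rng=random.Random(2))
c = ddim_sample(x_T, alpha_bar, steps, eta=0.5, rng=random.Random(1))
d = ddim_sample(x_T, alpha_bar, steps, eta=0.5, rng=random.Random(2))
print("eta=0.0:", a, b)   # same result regardless of the seed
print("eta=0.5:", c, d)   # results depend on the seed
```

With $\eta=0$ the output is a function of the starting noise alone, which is what makes DDIM usable for reproducible sampling from a fixed latent.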
