# Image Generation

## 1.1 Generative adversarial network
#### In this exercise, you will implement a Deep Convolutional Generative Adversarial Network (DCGAN) to synthesize images using the provided anime faces dataset.

---


- Construct a <font color=red>$\text{DCGAN}$</font> with the GAN objective below. You can refer to the [tutorial website](https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html) provided by PyTorch for implementation details.
    \begin{equation*} \begin{aligned}
    &\max _{D} \mathcal{L}(D) =\mathbb{E}_{\boldsymbol{x} \sim p_{\text {data }}} \log D(\boldsymbol{x})+\mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}} \log (1-D(G(\boldsymbol{z}))) \\
    &\min _{G} \mathcal{L}(G) =\mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}} \log (1-D(G(\boldsymbol{z})))
    \end{aligned} \end{equation*}
- <font color=red>Draw</font> some samples generated from your generator at <font color=red>different training stages</font>. For example, you may show the results at the $5^{\text{th}}$ epoch and at the final epoch (100). (10\%)
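
The two objectives above can be written as PyTorch losses. The following is a minimal sketch, assuming the discriminator ends in a sigmoid so that its outputs are probabilities. The function names are hypothetical and not part of the assignment. The third function is the non-saturating generator loss, a commonly used alternative to the minimax form written above (the linked PyTorch tutorial trains the generator this way).

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def discriminator_loss(d_real, d_fake):
    # Maximizing log D(x) + log(1 - D(G(z))) is the same as minimizing
    # binary cross-entropy with label 1 for real and label 0 for fake.
    real_loss = bce(d_real, torch.ones_like(d_real))
    fake_loss = bce(d_fake, torch.zeros_like(d_fake))
    return real_loss + fake_loss

def generator_loss_minimax(d_fake):
    # The generator objective exactly as written: minimize E[log(1 - D(G(z)))].
    return torch.log(1 - d_fake).mean()

def generator_loss_non_saturating(d_fake):
    # Alternative (assumption, not the formula above): maximize log D(G(z)),
    # which gives stronger gradients early in training.
    return bce(d_fake, torch.ones_like(d_fake))

# Toy discriminator outputs, for illustration only.
d_real = torch.tensor([0.9, 0.8])
d_fake = torch.tensor([0.1, 0.2])
print(discriminator_loss(d_real, d_fake).item())
print(generator_loss_minimax(d_fake).item())
print(generator_loss_non_saturating(d_fake).item())
```

In a training loop, `d_fake` should be computed from `G(z).detach()` for the discriminator update so that no gradient flows into the generator during that step.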


<img src="https://i.imgur.com/tnRR3tr.png" width="350px" />
<img src="https://i.imgur.com/g9AnDwN.png" width="350px" />





In [None]:
# Download and unzip data
!gdown 1K1oB7GOUerTCIa68bbxETcGajLeE_5j1
!unzip resized_64x64.zip

### Please write the gan code here

In [None]:
# Please write the GAN code here
# Note: In our experience, training on about 10000 images is enough to get acceptable results.

### 1.1.a 
### Draw some samples generated from your generator at different training stages. For example, you may show the results at the 5th epoch and at the final epoch (100).

### 1.1.b
### The Helvetica scenario often happens during GAN training. Please explain why this problem occurs and how to avoid it. We suggest reading the original paper before writing your discussion.


## 1.2 Denoising Diffusion Probabilistic Model (30%)

#### In this exercise, you will implement a <font color=red>Denoising Diffusion Probabilistic Model (DDPM)</font> to generate images using the provided <font color=red>anime faces dataset</font>. The figure below shows the diffusion process. It consists of a forward process, which gradually adds noise to the data, and a reverse process, which transforms the noise back into a sample from the target distribution. See [link1](https://lilianweng.github.io/posts/2021-07-11-diffusion-models/) and [link2](https://www.youtube.com/watch?v=azBugJzmz-o&t=190s) for a detailed introduction to diffusion models.

<img src="https://i.imgur.com/BqpRi4v.png"/>
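
The forward process has a closed form, so a noisy sample at any step can be drawn directly from the clean image without iterating. The following is a minimal sketch, assuming a linear noise schedule and the standard cumulative product $\bar\alpha_t$ stored at 0-based index `t`. `forward_diffuse` is a hypothetical helper, and the provided training code builds its coefficient table slightly differently.

```python
import torch

def forward_diffuse(x0, t, alpha_bar):
    """Sample x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps in one shot."""
    eps = torch.randn_like(x0)
    signal = alpha_bar[t].sqrt().reshape(-1, 1, 1, 1)       # weight of the clean image
    noise = (1 - alpha_bar[t]).sqrt().reshape(-1, 1, 1, 1)  # weight of the noise
    return signal * x0 + noise * eps, eps

T = 500
betas = torch.linspace(1e-4, 2e-2, T)        # linear schedule (assumption)
alpha_bar = torch.cumprod(1 - betas, dim=0)  # shrinks toward 0 as t grows

x0 = torch.rand(4, 3, 64, 64) * 2 - 1        # placeholder images in [-1, 1]
t = torch.tensor([0, 99, 249, 499])          # one step index per image
xt, eps = forward_diffuse(x0, t, alpha_bar)
print(xt.shape, alpha_bar[t])
```

As `t` grows, `alpha_bar[t]` shrinks, so `xt` keeps less of the image and more of the noise. The network is trained to predict `eps` from `xt` and the step index.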

1. Construct the <font color='blue'>DDPM</font> by completing the <font color='red'>2 TODOs</font> and following the instructions. Note that you are not allowed to call a library or API directly to load the model. The total number of epochs is 10. (20\%)

  (a) **Draw** some generated samples based on diffusion steps $T = 500$ and $T = 1000$. We provide the **pre-trained weights**, which were trained with 500 and 1000 steps. Hint: In the paper, the steps start at 1.

  (b) **Discuss** the result based on different diffusion steps.

### Training (You can skip this)

- Because training a diffusion model is computationally expensive, Colab may not be suitable. The training code is therefore provided for reference only.

In [None]:
!gdown 1E8yulcTDMk9dvz2dJ_TniLKdU4n6AFwa
!gdown 1g_RYSP1A2rXg_ud18ARlWXK8BWhiHdjV

In [None]:
!pip install torchmetrics

In [None]:
import torch, sys
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR
from tqdm import tqdm, trange
from model import Unet
from torchmetrics import MeanMetric
from dataloader import get_loader

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(device) #make sure this is cuda

In [None]:
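T = 500
# Per-step alphas: alpha_t = 1 - beta_t, with beta_t on a linear schedule from 1e-4 to 2e-2.
ALPHA = 1-torch.linspace(1e-4, 2e-2, T)
def alpha(t):
    # Cumulative product of the first t alphas (equal to 1 for t = 0).
    at = torch.prod(ALPHA[:t]).reshape((1, ))
    # Returns [sqrt(abar), sqrt(1 - abar)]: the coefficients of X_0 and of the noise.
    return torch.sqrt(torch.cat((at, 1-at)))
# Shape (T, 2): row t holds the two forward-process coefficients for index t.
ALPHA_bar = torch.stack([alpha(t) for t in range(T)]).to(device)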

batch_size = 64
update_step = 1
save_step = 2
save_step_ = 20
num_workers = 6
epochs = 100
loss_func = torch.nn.MSELoss()
lr = 5e-4
model = Unet(
    in_channels=3
)
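# Optionally resume from an earlier checkpoint; skipped when none exists yet.
import os
if os.path.exists('checkpoint.pth'):
    state_dict = torch.load('checkpoint.pth', map_location='cpu')
    model.load_state_dict(state_dict['model_state_dict'])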
optimizer = Adam(model.parameters(), lr=lr)
scheduler = StepLR(optimizer, step_size=10, gamma=0.5)

In [None]:
def train(model, data_loader):
    # Running mean of the loss; it accumulates across update() calls until reset().
    running_loss = MeanMetric()

    model.train()
    optimizer.zero_grad()

    for epoch in (overall:=trange(1, epochs+1, position=1, desc='[Overall]')):
        running_loss.reset()

        for i, X_0 in enumerate(bar := tqdm(data_loader, position=0, desc=f'[Train {epoch:3d}] lr={scheduler.get_last_lr()[0]:.2e}'), start=1):
            X_0 = X_0.to(device)
            eps = torch.randn(X_0.shape, device=device)
            t = torch.randint(0, T, (X_0.shape[0], ), device=device)

            # Closed-form forward process: X_t = sqrt(abar_t) * X_0 + sqrt(1 - abar_t) * eps
            with torch.no_grad():
                X_noise = ALPHA_bar[t, 0].reshape(-1, 1, 1, 1)*X_0 + ALPHA_bar[t, 1].reshape(-1, 1, 1, 1)*eps
            
            pred = model(X_noise, t+1)

            loss = loss_func(eps, pred)
            loss.backward()

            if i%update_step == 0 or i == bar.total:
                optimizer.step()
                optimizer.zero_grad()

            running_loss.update(loss.item())
            bar.set_postfix_str(f'loss {running_loss.compute():.2e}')

        scheduler.step()
        tqdm.write('\r\033[K', end='')

        if epoch % save_step == 0:
            save_checkpoint(epoch, model, optimizer, 'checkpoint.pth')
        if epoch % save_step_ == 0:
            save_checkpoint(epoch, model, optimizer, f'checkpoint_{epoch}.pth')    

def save_checkpoint(epoch, model, optimizer, path):
    torch.save({
        'epoch': epoch,
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        'scheduler_state_dict': scheduler.state_dict()
    }, path)
    tqdm.write('Save checkpoint')

In [None]:
# Training start
train_loader = get_loader(
    'resized_64x64/',
    batch_size, 
    num_workers
)
model = model.to(device)
train(model, train_loader)

### Sampling

In [None]:
!gdown 1E8yulcTDMk9dvz2dJ_TniLKdU4n6AFwa 
# dataloader.py
!gdown 1g_RYSP1A2rXg_ud18ARlWXK8BWhiHdjV 
# model.py
!gdown 1n9K-HSY3GJKTS4HkHTCA_AJT0q1cKBZ1 
# checkpoint_epoch100_T1000.pth
!gdown 1jPycQFo_f_fPRUg6OuauTrsXbdibTvKI 
# checkpoint_epoch100_T500.pth

In [None]:
import os
import sys
import torch
from tqdm import tqdm, trange
from torchvision.utils import save_image
from torchvision.transforms import ColorJitter
from model import Unet

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(device) # CUDA is recommended
T = 1000   # 500 or 1000
model = Unet(in_channels=3).to(device)
# map_location lets the checkpoint load on CPU-only machines as well.
state_dict = torch.load('checkpoint_100epoch_T1000.pth', map_location=device)
model.load_state_dict(state_dict['model_state_dict'])

In [None]:
ALPHA = (1-torch.linspace(1e-4, 2e-2, T)).reshape((-1, 1)).to(device)

@torch.no_grad()
def generate_and_save(model, gen_N, chan=3, resolu=(28, 28)):
    model.eval()
    

    L = []

    # ----- TODO ----- #
    # Sample gaussian noise X_T (5%). Please see DDPM paper https://arxiv.org/pdf/2006.11239.pdf or the link in the homework pdf file 
    #                  #
    X_T = ?
    #                  #
    #                  #
    #                  #
    # ----- TODO ----- #

    for t in (bar := trange(T-1, -1, -1)):

        bar.set_description(f'[Denoising] step: {t}')

        # ----- TODO ----- #
        # Sampling: Please see DDPM paper https://arxiv.org/pdf/2006.11239.pdf or the link in the homework pdf file 
        #                  #
        #                  #
        #                  #
        #                  #
        #                  #
        # ----- TODO ----- #

        if t < 1:
            L.append(X_T)
    
    save_image(torch.cat(L)/2+0.5, 'L.jpg')

### 1.2.a 
#### **Draw** some generated samples based on diffusion steps $T = 500$ and $T = 1000$. We provide the **pre-trained weights**, which were trained with 500 and 1000 steps. Hint: In the paper, the steps start at 1.
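
For reference, the sampling procedure in Algorithm 2 of the DDPM paper is restated below with the paper's 1-indexed steps. Here $\alpha_t = 1-\beta_t$ and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$. The choice $\sigma_t^2 = \beta_t$ is one of the two variance options the paper discusses and is an assumption here. How these indices map onto the 0-indexed `ALPHA` tensor and the loop variable `t` is left to the implementation.

\begin{equation*} \begin{aligned}
&\boldsymbol{x}_{T} \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{I}) \\
&\boldsymbol{x}_{t-1} = \frac{1}{\sqrt{\alpha_{t}}}\left(\boldsymbol{x}_{t}-\frac{1-\alpha_{t}}{\sqrt{1-\bar{\alpha}_{t}}}\, \boldsymbol{\epsilon}_{\theta}(\boldsymbol{x}_{t}, t)\right)+\sigma_{t} \boldsymbol{z}, \quad t = T, \ldots, 1 \\
&\boldsymbol{z} \sim \mathcal{N}(\boldsymbol{0}, \boldsymbol{I}) \text{ if } t>1, \quad \boldsymbol{z}=\boldsymbol{0} \text{ if } t=1
\end{aligned} \end{equation*}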

In [None]:
# gen_N sets the number of output images.
# resolu sets the output image resolution; you should not change it.
# This function saves the sample images automatically; you only need to display them here.
generate_and_save(model, gen_N=64, resolu=(64, 64))

### 1.2.b
#### **Discuss** the result based on different diffusion steps.

## 1.3 Comparison between GAN and DDPM (10%)
#### (a) Both GAN and DDPM are generative models. The following figures are randomly generated results by using GAN (left) and DDPM (right). Please describe the pros and cons of the two models. (10%)

<img src="https://i.imgur.com/pU77cfa.jpg" width="600px"/>

### 1.3.a
### Both GAN and DDPM are generative models. The figures are randomly generated results by using GAN (left) and DDPM (right). Please describe the pros and cons of the two models based on your observation.