<a href="https://colab.research.google.com/github/kwanhong66/PyTorchKaggle/blob/master/GAN_implementation_from_scratch_using_PyTorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## PyTorch x Kaggle

- kaggle: Cats faces 64x64 (For generative models)
  - https://www.kaggle.com/spandan2/cats-faces-64x64-for-generative-models

- notebook
  - https://www.kaggle.com/bunnyyy/gan-implementation-from-scratch-using-pytorch

- GAN
  - https://dreamgonfly.github.io/blog/gan-explained/#gan-%EC%A7%81%EC%A0%91-%EB%A7%8C%EB%93%A4%EC%96%B4%EB%B3%B4%EA%B8%B0

## Dataset with Kaggle API

In [None]:
!pip install -q kaggle 

In [None]:
!wget 'https://raw.githubusercontent.com/kwanhong66/KaggleShoveling/master/token/kaggle.json'

--2020-12-15 07:53:13--  https://raw.githubusercontent.com/kwanhong66/KaggleShoveling/master/token/kaggle.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 63 [text/plain]
Saving to: ‘kaggle.json’


2020-12-15 07:53:14 (2.01 MB/s) - ‘kaggle.json’ saved [63/63]



In [None]:
!mkdir ~/.kaggle
!cp kaggle.json ~/.kaggle

In [None]:
!chmod 600 ~/.kaggle/kaggle.json

In [None]:
!kaggle datasets download spandan2/cats-faces-64x64-for-generative-models

Downloading cats-faces-64x64-for-generative-models.zip to /content
 93% 89.0M/96.0M [00:00<00:00, 84.4MB/s]
100% 96.0M/96.0M [00:00<00:00, 123MB/s] 


In [None]:
!mkdir input

In [None]:
!unzip '*.zip' -d ./input/

The streamed output was long, so the last 5000 lines were removed.
  inflating: ./input/cats/cats/5499.jpg  
  inflating: ./input/cats/cats/55.jpg  
  inflating: ./input/cats/cats/550.jpg  
  inflating: ./input/cats/cats/5500.jpg  
  inflating: ./input/cats/cats/5501.jpg  
  inflating: ./input/cats/cats/5502.jpg  
  inflating: ./input/cats/cats/5503.jpg  
  inflating: ./input/cats/cats/5504.jpg  
  inflating: ./input/cats/cats/5505.jpg  
  inflating: ./input/cats/cats/5506.jpg  
  inflating: ./input/cats/cats/5507.jpg  
  inflating: ./input/cats/cats/5508.jpg  
  inflating: ./input/cats/cats/5509.jpg  
  inflating: ./input/cats/cats/551.jpg  
  inflating: ./input/cats/cats/5510.jpg  
  inflating: ./input/cats/cats/5511.jpg  
  inflating: ./input/cats/cats/5512.jpg  
  inflating: ./input/cats/cats/5513.jpg  
  inflating: ./input/cats/cats/5514.jpg  
  inflating: ./input/cats/cats/5515.jpg  
  inflating: ./input/cats/cats/5516.jpg  
  inflating: ./input/cats/cats/5517.jpg  
  inflating: ./input/cats/cats

In [None]:
import numpy as np # linear algebra
import pandas as pd

import os
from tqdm.notebook import tqdm
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as tt

import cv2

from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
from torchvision.utils import save_image
from torchvision.utils import make_grid

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [None]:
input_dir = './input/cats'

In [None]:
print(os.listdir(input_dir+ '/cats')[:5])

['9440.jpg', '14771.jpg', '339.jpg', '5787.jpg', '10283.jpg']


* The `pin_memory` option of `DataLoader` copies each batch into CUDA pinned (page-locked) host memory, which speeds up the later transfer to the GPU
  - https://discuss.pytorch.org/t/when-to-set-pin-memory-to-true/19723
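
As a minimal sketch of how pinned memory is typically used, the loader below enables `pin_memory` only when a GPU is present and moves each batch with `non_blocking=True`. The random `dummy_images` tensor and the names `dummy_loader` and `last_batch_shape` are assumptions for illustration and stand in for the cat dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Pinned memory only helps when batches are copied to a GPU.
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Hypothetical stand-in for the cat images: 16 random 3 x 64 x 64 tensors.
dummy_images = torch.randn(16, 3, 64, 64)
dummy_loader = DataLoader(TensorDataset(dummy_images), batch_size=8,
                          pin_memory=use_cuda)

for (batch,) in dummy_loader:
    # With a pinned source tensor, non_blocking=True allows an asynchronous copy.
    batch = batch.to(device, non_blocking=True)

last_batch_shape = tuple(batch.shape)
print(last_batch_shape)
```

On a CPU-only machine the flag is simply left off, so the same cell runs in both settings.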

In [None]:
image_size = 64
batch_size = 128
latent_size = 128
stats = (0.5, 0.5, 0.5), (0.5, 0.5, 0.5)

img_folder_dataset = ImageFolder(input_dir, transform=tt.Compose([tt.Resize(image_size),
                                                                  tt.CenterCrop(image_size),
                                                                  tt.ToTensor(),
                                                                  tt.Normalize(*stats)]))
train_dataloader = DataLoader(img_folder_dataset, batch_size, shuffle=True, num_workers=3, pin_memory=True)

## Generator and Discriminator

* [PyTorch] The relationship between ConvTranspose2d and Conv2d
  - https://simonjisu.github.io/python/2019/10/27/convtranspose2d.html

* Batch Normalization (and CNN)
  - https://shuuki4.wordpress.com/2016/01/13/batch-normalization-%EC%84%A4%EB%AA%85-%EB%B0%8F-%EA%B5%AC%ED%98%84/

### Generator

The generator takes a batch of latent tensors of shape N x 128 x 1 x 1 and outputs a batch of 3 x 64 x 64 tensors, each of which is an image. GAN training tends to be unstable because the generator starts from random noise, so batch normalization is applied after every transposed convolution except the last to keep the intermediate activations normalized.

In [None]:
generator = nn.Sequential(
    # in: latent_size x 1 x 1
    
    nn.ConvTranspose2d(latent_size, 512, kernel_size=4, stride=1, padding=0, bias=False),
    nn.BatchNorm2d(512),
    nn.ReLU(True),
    # out: 512 x 4 x 4

    nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(256),
    nn.ReLU(True),
    # out: 256 x 8 x 8

    nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(128),
    nn.ReLU(True),
    # out: 128 x 16 x 16

    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.ReLU(True),
    # out: 64 x 32 x 32

    nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1, bias=False),
    nn.Tanh()
    # out: 3 x 64 x 64
)
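
The shape comments in the generator can be checked by hand. The sketch below applies the output-size formula for `nn.ConvTranspose2d`, assuming `dilation=1` and `output_padding=0` (the defaults used above). The helper name `conv_transpose2d_out` is hypothetical.

```python
def conv_transpose2d_out(size, kernel_size, stride, padding):
    """Output height/width of nn.ConvTranspose2d (dilation=1, output_padding=0)."""
    return (size - 1) * stride - 2 * padding + kernel_size

# (kernel_size, stride, padding) of the five generator layers, in order
generator_layers = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]

size = 1  # the latent input is 1 x 1 spatially
generator_sizes = []
for kernel_size, stride, padding in generator_layers:
    size = conv_transpose2d_out(size, kernel_size, stride, padding)
    generator_sizes.append(size)

print(generator_sizes)  # [4, 8, 16, 32, 64]
```

Each stride-2 layer doubles the spatial size, which is how a 1 x 1 latent vector grows into a 64 x 64 image.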

### Discriminator

The discriminator takes a 3 x 64 x 64 image and outputs a single score between 0 and 1 (a 1 x 1 x 1 tensor, flattened to shape N x 1 for a batch). The score is the estimated probability that the input is real rather than generated.

In [None]:
discriminator = nn.Sequential(
    # in: 3 x 64 x 64

    nn.Conv2d(3, 64, kernel_size=4, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(64),
    nn.LeakyReLU(0.2, inplace=True),
    # out: 64 x 32 x 32

    nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(128),
    nn.LeakyReLU(0.2, inplace=True),
    # out: 128 x 16 x 16

    nn.Conv2d(128, 256, kernel_size=4, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(256),
    nn.LeakyReLU(0.2, inplace=True),
    # out: 256 x 8 x 8 

    nn.Conv2d(256, 512, kernel_size=4, stride=2, padding=1, bias=False),
    nn.BatchNorm2d(512),
    nn.LeakyReLU(0.2, inplace=True),
    # out: 512 x 4 x 4

    nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=0, bias=False),
    # out: 1 x 1 x 1

    nn.Flatten(),
    nn.Sigmoid()
)
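
The discriminator mirrors the generator, halving the spatial size at each stride-2 convolution. As a rough check of the shape comments, the sketch below uses the `nn.Conv2d` output-size formula, assuming `dilation=1`. The helper name `conv2d_out` is hypothetical.

```python
def conv2d_out(size, kernel_size, stride, padding):
    """Output height/width of nn.Conv2d (dilation=1)."""
    return (size + 2 * padding - kernel_size) // stride + 1

# (kernel_size, stride, padding) of the five discriminator layers, in order
discriminator_layers = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 0)]

size = 64  # input images are 64 x 64
discriminator_sizes = []
for kernel_size, stride, padding in discriminator_layers:
    size = conv2d_out(size, kernel_size, stride, padding)
    discriminator_sizes.append(size)

print(discriminator_sizes)  # [32, 16, 8, 4, 1]
```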

The `denorm` function undoes the normalization applied in the data pipeline, mapping generator outputs from [-1, 1] back to [0, 1] so they can be saved and displayed as images.

In [None]:
def denorm(img_tensors):
  return img_tensors * stats[1][0] + stats[0][0]
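
To see why `denorm` multiplies by the standard deviation and adds the mean, here is a small sketch on plain floats. It assumes the same mean and standard deviation of 0.5 per channel. The names `normalize_value` and `denorm_value` are hypothetical, and the same arithmetic applies elementwise to tensors.

```python
mean, std = 0.5, 0.5  # same values as the per-channel stats above

def normalize_value(x):
    # What tt.Normalize computes per element: maps [0, 1] to [-1, 1]
    return (x - mean) / std

def denorm_value(y):
    # Inverse mapping: maps [-1, 1] back to [0, 1]
    return y * std + mean

print(normalize_value(0.0), normalize_value(1.0))  # -1.0 1.0
print(denorm_value(-1.0), denorm_value(1.0))       # 0.0 1.0
round_trip = denorm_value(normalize_value(0.25))
print(round_trip)                                  # 0.25
```

The [-1, 1] range matches the generator's final `Tanh`, so real and generated images share the same scale.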

In [23]:
# To save the samples produced during epochs

sample_dir = 'generated'
os.makedirs(sample_dir, exist_ok=True)

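def save_samples(index, latent_tensors, show=True):
  # Generate images; the output already lives on the generator's device
  fake_images = generator(latent_tensors)
  # The file extension lets save_image infer the image format
  fake_fname = 'generated-images-{0:0=4d}.png'.format(index)
  save_image(denorm(fake_images), os.path.join(sample_dir, fake_fname), nrow=8)
  print('Saving', fake_fname)
  if show:
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.set_xticks([])
    ax.set_yticks([])
    # Denormalize to [0, 1] before plotting so imshow does not clip negative values
    ax.imshow(make_grid(denorm(fake_images).cpu().detach(), nrow=8).permute(1, 2, 0))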

## Training setup with generator and discriminator

### Function for training the discriminator
  - The discriminator is trained on both real and fake images to tell them apart
  - Real images get target 1 and fake images get target 0

  - The discriminator should output a probability close to 1 for a real image and close to 0 for a fake one.
  - Its loss is therefore the sum of two terms: how far its output is from 1 on real images, and how far its output is from 0 on fake images (both measured here with binary cross-entropy).
  - The discriminator's parameters are updated in the direction that minimizes this loss.
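
As a worked example of the two-term loss, the sketch below computes binary cross-entropy by hand for one real and one fake prediction. The prediction values 0.9 and 0.2 are made up for illustration, and the helper `bce` is a hypothetical scalar version of `F.binary_cross_entropy`.

```python
import math

def bce(pred, target):
    """Binary cross-entropy for a single prediction in (0, 1)."""
    return -(target * math.log(pred) + (1 - target) * math.log(1 - pred))

real_pred = 0.9  # hypothetical discriminator output on a real image
fake_pred = 0.2  # hypothetical discriminator output on a fake image

real_loss = bce(real_pred, 1.0)  # -log(0.9), about 0.105
fake_loss = bce(fake_pred, 0.0)  # -log(0.8), about 0.223
d_loss = real_loss + fake_loss   # about 0.329

print(round(real_loss, 3), round(fake_loss, 3), round(d_loss, 3))
```

Both terms shrink toward 0 as the discriminator becomes more confident and correct.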

In [25]:
def train_discriminator(real_images, opt_d):

  # Clear discriminator gradients
  opt_d.zero_grad()

  # Pass real images through discriminator
  real_preds = discriminator(real_images).to(device)
  real_targets = torch.ones(real_images.size(0), 1).to(device)
  real_loss = F.binary_cross_entropy(real_preds, real_targets)
  real_score = torch.mean(real_preds).item()

  # Generate fake images
  latent = torch.randn(batch_size, latent_size, 1, 1).to(device)
  fake_images = generator(latent).to(device)

  # Pass fake images through discriminator
  fake_targets = torch.zeros(fake_images.size(0), 1).to(device)
  fake_preds = discriminator(fake_images).to(device)
  fake_loss = F.binary_cross_entropy(fake_preds, fake_targets)
  fake_score = torch.mean(fake_preds).item()

  # Update discriminator weights
  loss = real_loss + fake_loss
  loss.backward()
  opt_d.step()

  return loss.item(), real_score, fake_score

### Function for training the generator

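- The generator's goal is to fool the discriminator. In other words, when a fake image it produced is fed to the discriminator, the output should be close to 1.
- How far that output falls from 1 is the generator's loss, and the generator is trained to minimize it.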
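
Written out, and assuming the binary cross-entropy with all-ones targets used in this notebook, the generator loss over a batch of $N$ latent vectors $z_i \sim \mathcal{N}(0, I)$ is:

$$
L_G = -\frac{1}{N} \sum_{i=1}^{N} \log D\big(G(z_i)\big)
$$

Here $G$ is the generator and $D$ the discriminator. The loss approaches 0 as $D(G(z_i))$ approaches 1, that is, as the discriminator is fooled.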

In [26]:
def train_generator(opt_g):

  # Clear generator gradients
  opt_g.zero_grad()

  # Generate fake images
  latent = torch.randn(batch_size, latent_size, 1, 1).to(device)
  fake_images = generator(latent).to(device)

  # Try to fool the discriminator
  preds = discriminator(fake_images).to(device)
  targets = torch.ones(fake_images.size(0), 1).to(device)
  loss = F.binary_cross_entropy(preds, targets)

  # Update generator weights
  loss.backward()
  opt_g.step()

  return loss.item(), latent

## Training model

In [27]:
def fit(epochs, lr, start_idx=1):
  torch.cuda.empty_cache()

  # Losses & scores
  losses_g = []
  losses_d = []
  real_scores = []
  fake_scores = []

  # Create optimizer
  opt_d = torch.optim.Adam(discriminator.to(device).parameters(), lr=lr, betas=(0.5, 0.999))
  opt_g = torch.optim.Adam(generator.to(device).parameters(), lr=lr, betas=(0.5, 0.999))

  for epoch in range(epochs):
    for real_images, _ in tqdm(train_dataloader):

      # Train discriminator
      real_images = real_images.to(device)
      loss_d, real_score, fake_score = train_discriminator(real_images, opt_d)

      # Train generator
      loss_g, latent = train_generator(opt_g)

    # Record losses and scores
    losses_g.append(loss_g)
    losses_d.append(loss_d)
    real_scores.append(real_score)
    fake_scores.append(fake_score)

    # Log losses & scores (last batch)
    print("Epoch [{}/{}], loss_g: {}, loss_d: {}, real_score: {}, fake_score: {}".format(
        epoch+1, epochs, loss_g, loss_d, real_score, fake_score))

    # Save generated images
    save_samples(epoch+start_idx, latent, show=False)
    
  # Return after all epochs have finished, not inside the epoch loop
  return losses_g, losses_d, latent, fake_scores

In [None]:
# fit() returns a tuple (losses_g, losses_d, latent, fake_scores), which has no .to() method
history = fit(epochs=10, lr=0.0002)