<a href="https://colab.research.google.com/github/yohki/GAN-Cookbook/blob/master/GAN_Cookbook_DCGAN_(2).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Generator

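The code is somewhat longer than before because it can switch between the plain GAN used so far and the DCGAN. The DCGAN model (`dc_model`) is on lines 34-54 of the cell.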

![alt text](https://drive.google.com/uc?id=1C2Z03je-tWywYQMhAXfqzTt6SJGvDwLQ)

In [10]:
#!/usr/bin/env python3
import sys
import numpy as np
from keras.layers import Dense, Reshape, Input, BatchNormalization
from keras.layers.core import Activation
from keras.layers.convolutional import UpSampling2D, Convolution2D, MaxPooling2D,Deconvolution2D
from keras.layers.advanced_activations import LeakyReLU
from keras.models import Sequential, Model
from keras.optimizers import Adam, SGD, Nadam,Adamax
from keras import initializers
from keras.utils import plot_model

class Generator(object):
    def __init__(self, width = 28, height= 28, channels = 1, latent_size=100, model_type = 'simple'):
        
        self.W = width
        self.H = height
        self.C = channels
        self.LATENT_SPACE_SIZE = latent_size
        self.latent_space = np.random.normal(0,1,(self.LATENT_SPACE_SIZE,))

        if model_type=='simple':
            self.Generator = self.model()
            self.OPTIMIZER = Adam(lr=0.0002, decay=8e-9)
            self.Generator.compile(loss='binary_crossentropy', optimizer=self.OPTIMIZER)
        elif model_type=='DCGAN':
            self.Generator = self.dc_model()
            self.OPTIMIZER = Adam(lr=1e-4, beta_1=0.2)
            self.Generator.compile(loss='binary_crossentropy', optimizer=self.OPTIMIZER,metrics=['accuracy'])
        self.save_model()
        self.summary()
        
    # DCGAN model
    def dc_model(self):

        model = Sequential()

        model.add(Dense(256*8*8,activation=LeakyReLU(0.2), input_dim=self.LATENT_SPACE_SIZE))
        model.add(BatchNormalization())

        model.add(Reshape((8, 8, 256)))
        model.add(UpSampling2D())

        model.add(Convolution2D(128, 5, 5, border_mode='same',activation=LeakyReLU(0.2)))
        model.add(BatchNormalization())
        model.add(UpSampling2D())

        model.add(Convolution2D(64, 5, 5, border_mode='same',activation=LeakyReLU(0.2)))
        model.add(BatchNormalization())
        model.add(UpSampling2D())

        model.add(Convolution2D(self.C, 5, 5, border_mode='same', activation='tanh'))
        
        return model

    # Plain (fully connected) GAN model
    def model(self, block_starting_size=128,num_blocks=4):
        model = Sequential()
        
        block_size = block_starting_size 
        model.add(Dense(block_size, input_shape=(self.LATENT_SPACE_SIZE,)))
        model.add(LeakyReLU(alpha=0.2))
        model.add(BatchNormalization(momentum=0.8))

        for i in range(num_blocks-1):
            block_size = block_size * 2
            model.add(Dense(block_size))
            model.add(LeakyReLU(alpha=0.2))
            model.add(BatchNormalization(momentum=0.8))

        model.add(Dense(self.W * self.H * self.C, activation='tanh'))
        model.add(Reshape((self.W, self.H, self.C)))
        
        return model

    def summary(self):
        return self.Generator.summary()

    def save_model(self):
        plot_model(self.Generator.model, to_file='DCGAN/Generator_Model.png')


Using TensorFlow backend.


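* l.38: A `Dense` layer expands the 100-dimensional latent noise vector to 16384 (= 256 x 8 x 8) units.
* l.41: Those units are reshaped into an 8x8x256 3D volume.
* l.42: The volume is upsampled by 2x in height and width, giving 16x16x256.
* l.44: The 16x16x256 input is convolved with 5x5x256 filters, producing 128 feature maps of size 16x16.
* A convolution normally shrinks the output, but `border_mode='same'` pads the border with zeros so that the output keeps the same size as the input (zero padding).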

![alt text](https://deepage.net/img/convolutional_neural_network/zero_padding.jpg)
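To make the zero-padding idea concrete, here is a minimal pure-Python sketch (not the Keras implementation) of a stride-1 convolution on a single-channel image. The helper names `pad_zeros`, `conv2d_valid`, and `conv2d_same` are made up for this illustration, and the sketch assumes a square, odd-sized kernel.

```python
def pad_zeros(image, pad):
    """Surround a 2-D list with `pad` rows and columns of zeros."""
    width = len(image[0]) + 2 * pad
    rows = [[0.0] * pad + list(row) + [0.0] * pad for row in image]
    border = [[0.0] * width for _ in range(pad)]
    return border + rows + [[0.0] * width for _ in range(pad)]


def conv2d_valid(image, kernel):
    """Naive stride-1 convolution (cross-correlation) without padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[y + i][x + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def conv2d_same(image, kernel):
    """'same' mode for a square odd-sized kernel: zero-pad, then convolve."""
    pad = len(kernel) // 2
    return conv2d_valid(pad_zeros(image, pad), kernel)


image = [[1.0] * 8 for _ in range(8)]    # toy 8x8 input filled with ones
kernel = [[1.0] * 5 for _ in range(5)]   # 5x5 filter, the size used in dc_model

valid = conv2d_valid(image, kernel)
same = conv2d_same(image, kernel)
print(len(valid), len(valid[0]))   # 4 4: the output shrinks by kernel - 1
print(len(same), len(same[0]))     # 8 8: padding keeps the input size
print(same[0][0], same[4][4])      # 9.0 25.0: corners see mostly padding
```

The corner value is smaller than the center value because most of the 5x5 window at the corner covers padded zeros.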

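* l.46: Upsample by 2x again, giving 32x32x128.
* l.48: The second convolution. The input is 32x32x128, the filter size is 5x5x128, and there are 64 filters, so the output is 32x32x64.
* l.50: Upsample once more. The size is now 64x64x64.
* l.52: A final convolution produces the output image. Input size = 64x64x64, filter size = 5x5x64, number of filters = 3 (`self.C` in this run), output size = 64x64x3. This yields a 64x64 RGB image.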
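The shape walk-through above can be checked with plain arithmetic. The following is a sketch rather than Keras code: the helper functions are hypothetical, and it assumes `latent_size=100`, `channels=3`, 5x5 kernels, and four parameters per feature for `BatchNormalization` (gamma, beta, moving mean, moving variance).

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batch_norm_params(n_features):
    """gamma, beta, moving mean and moving variance per feature."""
    return 4 * n_features


def upsample(shape, factor=2):
    """UpSampling2D: repeat rows and columns, channels unchanged."""
    h, w, c = shape
    return (h * factor, w * factor, c)


def conv_same(shape, filters, kernel=5):
    """'same' convolution: spatial size kept, channel count becomes `filters`."""
    h, w, c = shape
    params = kernel * kernel * c * filters + filters
    return (h, w, filters), params


latent_size, channels = 100, 3
total = dense_params(latent_size, 256 * 8 * 8) + batch_norm_params(256 * 8 * 8)
shape = (8, 8, 256)                      # after Reshape

# (filters, followed by BatchNormalization?) for the three conv layers
for filters, with_bn in [(128, True), (64, True), (channels, False)]:
    shape = upsample(shape)
    shape, params = conv_same(shape, filters)
    total += params + (batch_norm_params(filters) if with_bn else 0)
    print(shape)                         # (16, 16, 128), (32, 32, 64), (64, 64, 3)

print(total)                             # 2750083
```

The total agrees with the parameter count reported for `sequential_1` in the model summary printed when training starts.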



# Discriminator

![alt text](https://drive.google.com/uc?id=1hP6PuCuwHefJhvbnTmJ5t1qyZqM17f5B)

In [0]:
#!/usr/bin/env python3
import sys
import numpy as np
from keras.layers import Input, Dense, Reshape, Flatten, Dropout, BatchNormalization, Lambda, concatenate
from keras.layers.core import Activation
from keras.layers.convolutional import Convolution2D
from keras.layers.advanced_activations import LeakyReLU
from keras.models import Sequential, Model
from keras.optimizers import Adam, SGD,Nadam, Adamax
import keras.backend as K
from keras.utils import plot_model


class Discriminator(object):
    def __init__(self, width = 28, height= 28, channels = 1, latent_size=100,model_type = 'simple'):
        self.W = width
        self.H = height
        self.C = channels
        self.CAPACITY = width*height*channels
        self.SHAPE = (width,height,channels)
        
        if model_type=='simple':
            self.Discriminator = self.model()
            self.OPTIMIZER = Adam(lr=0.0002, decay=8e-9)
            self.Discriminator.compile(loss='binary_crossentropy', optimizer=self.OPTIMIZER, metrics=['accuracy'] )
        elif model_type=='DCGAN':
            self.Discriminator = self.dc_model()
            self.OPTIMIZER = Adam(lr=1e-4, beta_1=0.2)
            self.Discriminator.compile(loss='binary_crossentropy', optimizer=self.OPTIMIZER, metrics=['accuracy'] )

        self.save_model()
        self.summary()

    def dc_model(self):
        model = Sequential()
        model.add(Convolution2D(64, 5, 5, subsample=(2,2), input_shape=(self.W,self.H,self.C), border_mode='same',activation=LeakyReLU(alpha=0.2)))
        model.add(Dropout(0.3))
        model.add(BatchNormalization())
        model.add(Convolution2D(128, 5, 5, subsample=(2,2), border_mode='same',activation=LeakyReLU(alpha=0.2)))
        model.add(Dropout(0.3))
        model.add(BatchNormalization())
        model.add(Flatten())
        model.add(Dense(1, activation='sigmoid'))
        return model

    def model(self):
        model = Sequential()
        model.add(Flatten(input_shape=self.SHAPE))
        model.add(Dense(self.CAPACITY, input_shape=self.SHAPE))
        model.add(LeakyReLU(alpha=0.2))
        model.add(Dense(int(self.CAPACITY/2)))
        model.add(LeakyReLU(alpha=0.2))
        model.add(Dense(1, activation='sigmoid'))
        return model

    def summary(self):
        return self.Discriminator.summary()

    def save_model(self):
        plot_model(self.Discriminator.model, to_file='DCGAN/Discriminator_Model.png')



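- `subsample=(2,2)` in `Convolution2D` is the Keras 1 name for the stride. It is equivalent to `Conv2D(..., strides=(2,2))` in Keras 2, and it is what performs the downsampling.
- The second `Convolution2D` also combines `subsample=(2,2)` with `border_mode='same'`. With a stride of 2, `'same'` does not keep the output the same size as the input: it only pads so that the output is ceil(input / stride), so the feature map still halves (32x32 to 16x16 in the model summary).
- l.38, 41: Opinions differ on whether Batch Normalization should be used here. (→ [Writing a DCGAN in Keras (in Japanese)](https://qiita.com/t-ae/items/236457c29ba85a7579d5))
- l.37, 40: `Dropout()` randomly drops a fixed fraction of the units (30% here) during training to reduce overfitting.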
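The effect of the stride on the Discriminator's feature-map sizes can be verified by hand. This is a small arithmetic sketch, not Keras code: `same_output` and `valid_output` are hypothetical helpers that apply the usual output-size formulas, and the trace assumes a 64x64x3 input as in the training run.

```python
import math


def same_output(size, stride):
    """Output length of a 'same'-padded convolution."""
    return math.ceil(size / stride)


def valid_output(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


size, channels = 64, 3
total = 0
for filters in (64, 128):
    total += 5 * 5 * channels * filters + filters   # Convolution2D weights + biases
    total += 4 * filters                            # BatchNormalization
    size = same_output(size, stride=2)              # stride 2 halves the map
    channels = filters
    print(size, size, channels)                     # 32 32 64, then 16 16 128

flat = size * size * channels                       # Flatten
total += flat + 1                                   # Dense(1): weights + bias
print(flat, total)                                  # 32768 243329
print(valid_output(64, kernel=5, stride=2))         # 30: without padding
```

Both numbers on the second-to-last line match the model summary: `flatten_1` has 32768 outputs and `sequential_2` has 243,329 parameters.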

# GAN

This is the same class that was used in the previous chapter.

In [0]:
#!/usr/bin/env python3
import sys
import numpy as np
from keras.models import Sequential, Model
from keras.optimizers import Adam, SGD
from keras.utils import plot_model

class GAN(object):
    def __init__(self,discriminator,generator):
        self.OPTIMIZER = SGD(lr=2e-4,nesterov=True)
        self.Generator = generator

        self.Discriminator = discriminator
        self.Discriminator.trainable = False
        
        self.gan_model = self.model()
        self.gan_model.compile(loss='binary_crossentropy', optimizer=self.OPTIMIZER)
        self.save_model()
        self.summary()

    def model(self):
        model = Sequential()
        model.add(self.Generator)
        model.add(self.Discriminator)
        return model

    def summary(self):
        return self.gan_model.summary()

    def save_model(self):
        plot_model(self.gan_model.model, to_file='DCGAN/GAN_Model.png')


# Trainer

In [0]:
from keras.datasets import mnist
from random import randint
import numpy as np
import matplotlib.pyplot as plt
from copy import deepcopy
import time
import datetime

class Trainer:
    def __init__(self, width = 28, height= 28, channels = 1, latent_size=100, epochs =50000, batch=32, checkpoint=50,model_type=-1,data_path = ''):
        self.W = width
        self.H = height
        self.C = channels
        self.EPOCHS = epochs
        self.BATCH = batch
        self.CHECKPOINT = checkpoint
        self.model_type=model_type

        self.LATENT_SPACE_SIZE = latent_size

        self.generator = Generator(height=self.H, width=self.W, channels=self.C, latent_size=self.LATENT_SPACE_SIZE,model_type = 'DCGAN')
        self.discriminator = Discriminator(height=self.H, width=self.W, channels=self.C,model_type = 'DCGAN')
        self.gan = GAN(generator=self.generator.Generator, discriminator=self.discriminator.Discriminator)

        #self.load_MNIST()
        self.load_npy(data_path)

    def load_npy(self,npy_path):
        self.X_train = np.load(npy_path)
        self.X_train = self.X_train[:int(0.25*float(len(self.X_train)))]
        self.X_train = (self.X_train.astype(np.float32) - 127.5)/127.5
        self.X_train = np.expand_dims(self.X_train, axis=3)
        return

    def load_MNIST(self,model_type=3):
        allowed_types = [-1,0,1,2,3,4,5,6,7,8,9]
        if self.model_type not in allowed_types:
            print('ERROR: Only Integer Values from -1 to 9 are allowed')

        (self.X_train, self.Y_train), (_, _) = mnist.load_data()
        if self.model_type!=-1:
            self.X_train = self.X_train[np.where(self.Y_train==int(self.model_type))[0]]
        
        # Rescale -1 to 1
        # Find Normalize Function from CV Class  
        self.X_train = ( np.float32(self.X_train) - 127.5) / 127.5
        self.X_train = np.expand_dims(self.X_train, axis=3)
        return

    def train(self):
        for e in range(self.EPOCHS):
            e_start = time.time()
            b = 0
            X_train_temp = deepcopy(self.X_train)
            while self.BATCH < len(X_train_temp):
                b_start = time.time()
                # Keep track of Batches
                b=b+1

                # Train Discriminator
                # Each batch is either all real or all generated, chosen by a coin flip
                # Grab Real Images for this training batch
                if self.flipCoin():
                    count_real_images = int(self.BATCH)
                    starting_index = randint(0, (len(X_train_temp)-count_real_images))
                    real_images_raw = X_train_temp[ starting_index : (starting_index + count_real_images) ]
                    #self.plot_check_batch(b,real_images_raw)
                    # Delete the images used until we have none left
                    X_train_temp = np.delete(X_train_temp,range(starting_index,(starting_index + count_real_images)),0)
                    x_batch = real_images_raw.reshape( count_real_images, self.W, self.H, self.C )
                    y_batch = np.ones([count_real_images,1])
                else:
                    # Grab Generated Images for this training batch
                    latent_space_samples = self.sample_latent_space(self.BATCH)
                    x_batch = self.generator.Generator.predict(latent_space_samples)
                    y_batch = np.zeros([self.BATCH,1])

                # Now, train the discriminator with this batch
                discriminator_loss = self.discriminator.Discriminator.train_on_batch(x_batch,y_batch)[0]
            
                # In practice, flipping the label when training the generator improves convergence
                if self.flipCoin(chance=0.9):
                    y_generated_labels = np.ones([self.BATCH,1])
                else:
                    y_generated_labels = np.zeros([self.BATCH,1])
                x_latent_space_samples = self.sample_latent_space(self.BATCH)
                generator_loss = self.gan.gan_model.train_on_batch(x_latent_space_samples,y_generated_labels)
    
                b_elapsed = time.time() - b_start
                if b % self.CHECKPOINT == 0:
                    print('Batch: ' + str(int(b)) + 
                       ', [Discriminator :: Loss: ' + str(discriminator_loss) + 
                       '], [ Generator :: Loss: '+str(generator_loss) + 
                       '], {0}s'.format(b_elapsed))
                    label = str(e)+'_'+str(b)
                    self.plot_checkpoint(label)
            e_elapsed = time.time() - e_start
            print ('Epoch: '+str(int(e)) + 
                   ', [Discriminator :: Loss: ' + str(discriminator_loss) + 
                   '], [ Generator :: Loss: ' + str(generator_loss) + 
                   '], {0}s'.format(e_elapsed))
                        
            if e % self.CHECKPOINT == 0 :
                self.plot_checkpoint(e)
        return

    def flipCoin(self,chance=0.5):
        return np.random.binomial(1, chance)

    def sample_latent_space(self, instances):
        return np.random.normal(0, 1, (instances,self.LATENT_SPACE_SIZE))

    def plot_checkpoint(self,e):
        filename = "DCGAN/" + str(datetime.date.today()) + "/" + str(e) + ".png"

        noise = self.sample_latent_space(16)
        images = self.generator.Generator.predict(noise)
        
        plt.figure(figsize=(10,10))
        for i in range(images.shape[0]):
            plt.subplot(4, 4, i+1)
            if self.C==1:
                image = images[i, :, :]
                image = np.reshape(image, [self.H,self.W])
                image = (255*(image - np.min(image))/np.ptp(image)).astype(int)
                plt.imshow(image,cmap='gray')
            elif self.C==3:
                image = images[i, :, :, :]
                image = np.reshape(image, [self.H,self.W,self.C])
                image = (255*(image - np.min(image))/np.ptp(image)).astype(int)
                plt.imshow(image)
            
            plt.axis('off')
        plt.tight_layout()
        plt.savefig(filename)
        plt.close('all')
        return

    def plot_check_batch(self,b,images):
        filename = "DCGAN/batch_check_"+str(b)+".png"
        subplot_size = int(np.sqrt(images.shape[0]))
        plt.figure(figsize=(10,10))
        for i in range(images.shape[0]):
            plt.subplot(subplot_size, subplot_size, i+1)
            image = images[i, :, :, :]
            image = np.reshape(image, [self.H,self.W,self.C])
            plt.imshow(image)
            plt.axis('off')
        plt.tight_layout()
        plt.savefig(filename)
        plt.close('all')
        return


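* l.54: At the start of each epoch, all real images are copied into `X_train_temp`.
* Previously, half of each batch (of size 32, 128, and so on) was drawn at random from the real images and the other half was produced by the Generator, and the two halves were combined into a single training batch.
* This time each batch (128 images) consists entirely of real images or entirely of generated images, and this is repeated until the real image data runs out. This reportedly improves performance and helps keep the network from diverging.
* l.82: When training the Generator, the opposite label (= fake) is given 10% of the time. This reportedly makes convergence easier.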
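The batching and label-flipping scheme described above can be sketched without Keras. In this illustration integers stand in for real images, and the helpers `next_discriminator_batch` and `generator_target` are hypothetical names that mirror the logic of `train()`; they are not part of the notebook's classes.

```python
import random


def next_discriminator_batch(pool, batch_size, make_fake, rng):
    """Return (samples, label, remaining_pool) for one discriminator step.

    With probability 0.5 a contiguous run of real samples is taken and
    removed from the pool (label 1); otherwise the whole batch is fake
    (label 0) and the pool is left untouched.
    """
    if rng.random() < 0.5:
        start = rng.randint(0, len(pool) - batch_size)
        batch = pool[start:start + batch_size]
        remaining = pool[:start] + pool[start + batch_size:]
        return batch, 1, remaining
    return make_fake(batch_size), 0, pool


def generator_target(rng, keep_chance=0.9):
    """Label for the generator step: 1 ('real') 90% of the time, else 0."""
    return 1 if rng.random() < keep_chance else 0


rng = random.Random(0)
pool = list(range(1000))          # stand-ins for 1000 real images
batch_size = 128
steps = real_steps = flipped = 0

while batch_size < len(pool):     # same stopping rule as train()
    batch, label, pool = next_discriminator_batch(
        pool, batch_size, lambda n: ["fake"] * n, rng)
    real_steps += label
    flipped += 1 - generator_target(rng)
    steps += 1

print(steps, real_steps, flipped, len(pool))
```

Whatever the random seed, the loop consumes exactly 7 real batches here and leaves 104 images unused, because only real batches shrink the pool. The number of fake batches and flipped labels varies from run to run.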

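# Running the training

For the relationship between batch size and number of epochs, see:
- [How to choose batch size, number of iterations, and number of epochs in machine learning / deep learning (in Japanese)](https://qiita.com/kenta1984/items/bad75a37d552510e4682)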
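For this particular trainer the usual "iterations = images / batch size" rule needs a small adjustment, because only about half of the iterations consume real images. The sketch below estimates the iteration count per epoch under that assumption. The function names are hypothetical and the dataset size of 30000 is a made-up example, not the size of the LSUN subset used here.

```python
import math


def real_batches_per_epoch(n_images, batch_size):
    """Real batches drawn before the loop `while batch < len(pool)` stops."""
    return max(math.ceil(n_images / batch_size) - 1, 0)


def expected_iterations_per_epoch(n_images, batch_size, p_real=0.5):
    """Each iteration uses real data with probability p_real, so on average
    1 / p_real iterations are needed per real batch."""
    return real_batches_per_epoch(n_images, batch_size) / p_real


n_images = 30000     # hypothetical dataset size
batch_size = 128     # BATCH in the cell below

print(real_batches_per_epoch(n_images, batch_size))          # 234
print(expected_iterations_per_epoch(n_images, batch_size))   # 468.0
```

With a conventional loop that always feeds real data, the same dataset would take about 234 iterations per epoch, so the coin flip roughly doubles the number of steps in what this trainer calls an epoch.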

In [14]:
import datetime, os

%cd /content
from google.colab import drive
drive.mount('./gdrive', force_remount=True)

!mkdir -p '/content/gdrive/My Drive/Colab Notebooks/DCGAN'
dir = '/content/gdrive/My Drive/Colab Notebooks/DCGAN/' + str(datetime.date.today())
os.makedirs(dir, exist_ok=True)

/content
Mounted at ./gdrive


In [15]:
%cd '/content/gdrive/My Drive/Colab Notebooks/'

HEIGHT  = 64
WIDTH   = 64
CHANNEL = 3
LATENT_SPACE_SIZE = 100
EPOCHS = 100
BATCH = 128
CHECKPOINT = 100
PATH = "lsun/data/church_outdoor_train_lmdb_color.npy"

trainer = Trainer(height=HEIGHT,\
                 width=WIDTH,\
                 channels=CHANNEL,\
                 latent_size=LATENT_SPACE_SIZE,\
                 epochs =EPOCHS,\
                 batch=BATCH,\
                 checkpoint=CHECKPOINT,\
                 model_type='DCGAN',\
                 data_path=PATH)
                 
trainer.train()

/content/gdrive/My Drive/Colab Notebooks



_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 16384)             1654784   
_________________________________________________________________
batch_normalization_1 (Batch (None, 16384)             65536     
_________________________________________________________________
reshape_1 (Reshape)          (None, 8, 8, 256)         0         
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 16, 16, 256)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 16, 16, 128)       819328    
_________________________________________________________________
batch_normalization_2 (Batch (None, 16, 16, 128)       512       
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 32, 32, 128)       0         
__________



_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_4 (Conv2D)            (None, 32, 32, 64)        4864      
_________________________________________________________________
dropout_1 (Dropout)          (None, 32, 32, 64)        0         
_________________________________________________________________
batch_normalization_4 (Batch (None, 32, 32, 64)        256       
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 16, 16, 128)       204928    
_________________________________________________________________
dropout_2 (Dropout)          (None, 16, 16, 128)       0         
_________________________________________________________________
batch_normalization_5 (Batch (None, 16, 16, 128)       512       
_________________________________________________________________
flatten_1 (Flatten)          (None, 32768)             0         
__________



_________________________________________________________________
Layer (type)                 Output Shape              Param #   
sequential_1 (Sequential)    (None, 64, 64, 3)         2750083   
_________________________________________________________________
sequential_2 (Sequential)    (None, 1)                 243329    
Total params: 2,993,412
Trainable params: 2,716,931
Non-trainable params: 276,481
_________________________________________________________________


  'Discrepancy between trainable weights and collected trainable'
  'Discrepancy between trainable weights and collected trainable'


Batch: 100, [Discriminator :: Loss: 0.10093285], [ Generator :: Loss: 3.3650723], 0.1575636863708496s
Batch: 200, [Discriminator :: Loss: 0.06930032], [ Generator :: Loss: 0.058838014], 0.16363024711608887s
Batch: 300, [Discriminator :: Loss: 0.11418487], [ Generator :: Loss: 0.036510292], 0.38782310485839844s
Batch: 400, [Discriminator :: Loss: 0.022124382], [ Generator :: Loss: 5.0902863], 0.1645643711090088s
Epoch: 0, [Discriminator :: Loss: 0.050763927], [ Generator :: Loss: 4.750782], 174.83482813835144s
Batch: 100, [Discriminator :: Loss: 0.012861535], [ Generator :: Loss: 5.845931], 0.16380810737609863s
Batch: 200, [Discriminator :: Loss: 0.095501855], [ Generator :: Loss: 4.580192], 0.5424139499664307s
Batch: 300, [Discriminator :: Loss: 0.044095334], [ Generator :: Loss: 5.332164], 0.16562438011169434s
Batch: 400, [Discriminator :: Loss: 0.052587934], [ Generator :: Loss: 5.133879], 0.2949178218841553s
Batch: 500, [Discriminator :: Loss: 0.014289669], [ Generator :: Loss: 5.87

KeyboardInterrupt: ignored

# Training results

## Epoch 1
![alt text](https://drive.google.com/uc?id=1-G76knnWkU-CCnDFF8vaCskcWZFPpbMz)

## Epoch 10
![alt text](https://drive.google.com/uc?id=11l7mlpUW1SMKpo69us-fc1FZ7U34OrmP)

## Epoch 20
![alt text](https://drive.google.com/uc?id=15E4p2BeY7bzLBTZEweq4P3SkcLe-9VEQ)

## Epoch 30
![alt text](https://drive.google.com/uc?id=181o0lskvCHSCZ6de-LjNSdshAE2lQGnH)

## Epoch 40
![alt text](https://drive.google.com/uc?id=1Aa9aaSoFa9BnbIjQ0jLxh-PUYMaI0z4Q)

## Epoch 50
![alt text](https://drive.google.com/uc?id=1DEwm8NZxqALTWz83z70MQV3b29CIwLzH)

## Epoch 60
![alt text](https://drive.google.com/uc?id=1GKt6h9MYz9xV2L8YFg12Um8TCGZLQIiv)

## Epoch 64
![alt text](https://drive.google.com/uc?id=1HjSAMIo7IQwhH1KrRkUt-Vd6yHKcbnMQ)