# Transposed convolution arithmetic

A guide to convolution arithmetic for deep learning

---

## 3. Training DCGANs with Keras and TensorFlow

A DCGAN implementation that uses transposed convolutions, built with the [Keras](https://keras.io/) library.


### 1. Load data

#### Load libraries

In [1]:
import numpy as np

%matplotlib inline
import matplotlib.pyplot as plt

In [2]:
from keras.datasets import mnist
from keras.models import Sequential, Model
from keras.layers import Input, Dense, LeakyReLU, BatchNormalization
from keras.layers import Conv2D, Conv2DTranspose, Reshape, Flatten
from keras.optimizers import Adam
from keras import initializers
from keras.utils import plot_model
from keras import backend as K

Using TensorFlow backend.


#### Getting the data

In [3]:
# load dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()

#### Reshaping and normalizing the inputs

In [4]:
print('X_train.shape', X_train.shape)

if K.image_data_format() == 'channels_first':
    X_train = X_train.reshape(X_train.shape[0], 1, 28, 28)
    X_test = X_test.reshape(X_test.shape[0], 1, 28, 28)
    input_shape = (1, 28, 28)
else:
    X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
    X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
    input_shape = (28, 28, 1)

# The generator ends in a tanh activation, so we scale the image data
# to the range [-1, 1] to match its output range.

X_train = np.float32(X_train)
X_train = (X_train / 255 - 0.5) * 2
X_train = np.clip(X_train, -1, 1)

print('X_train reshape:', X_train.shape)

X_train.shape (60000, 28, 28)
X_train reshape: (60000, 28, 28, 1)
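The scaling step can be checked in isolation on a handful of pixel values. The sketch below repeats the same formula on a tiny array and adds an inverse mapping; `to_tanh_range` and `to_pixel_range` are hypothetical helper names introduced here for illustration and are not part of the notebook.

```python
import numpy as np

def to_tanh_range(pixels):
    """Map uint8 pixel values in [0, 255] to float32 values in [-1, 1]."""
    return (np.float32(pixels) / 255 - 0.5) * 2

def to_pixel_range(scaled):
    """Hypothetical inverse: map values in [-1, 1] back to uint8 [0, 255]."""
    return np.round((scaled / 2 + 0.5) * 255).astype(np.uint8)

pixels = np.array([0, 64, 128, 255], dtype=np.uint8)
scaled = to_tanh_range(pixels)      # black maps to -1.0, white maps to 1.0
restored = to_pixel_range(scaled)   # round trip back to the original values
```

An inverse of this kind is handy when generated images need to be saved as ordinary 8-bit images rather than only displayed with `plt.imshow`.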


### 2. Define model

#### Generator

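Our generator upsamples with **transposed convolutions**. A transposed convolution is not a true inverse of convolution: it restores the spatial shape that a convolution would have reduced, not the original values.

In between layers, BatchNormalization stabilizes learning.

Each hidden layer is followed by a LeakyReLU activation.

The last layer uses a tanh activation, and its output is the fake image.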

![generator model](../img/generative.png)
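To make the name "transposed" concrete, here is a minimal 1-D sketch in NumPy. It writes a strided convolution as a matrix product and then applies the transpose of that matrix. The helper `conv_matrix` and the toy kernel are assumptions made for illustration; Keras does not build such a matrix explicitly.

```python
import numpy as np

def conv_matrix(kernel, input_size, stride):
    """Matrix C such that C @ x is the strided 'valid' 1-D convolution
    (cross-correlation, as used in deep learning) of x with kernel."""
    k = len(kernel)
    output_size = (input_size - k) // stride + 1
    C = np.zeros((output_size, input_size))
    for row in range(output_size):
        start = row * stride
        C[row, start:start + k] = kernel
    return C

kernel = np.array([1.0, 2.0])
C = conv_matrix(kernel, input_size=6, stride=2)  # maps 6 values to 3 values

x = np.arange(1.0, 7.0)  # [1, 2, 3, 4, 5, 6]
y = C @ x                # forward convolution, shape (3,)
x_up = C.T @ y           # transposed convolution, shape (6,)
```

Here `x_up` has the same shape as `x` but different values, which is why the operation recovers the shape of the input rather than the input itself.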

In [51]:
# latent space dimension
latent_dim = 100

# image dimension 28x28
img_dim = 784

init = initializers.RandomNormal(stddev=0.02)

# Generator network
generator = Sequential()

# FC: 7x7x256
generator.add(Dense(7*7*256, input_shape=(latent_dim,), kernel_initializer=init))
generator.add(Reshape((7, 7, 256)))
generator.add(BatchNormalization())
generator.add(LeakyReLU(0.2))

# Conv 1: 14x14x128
generator.add(Conv2DTranspose(128, kernel_size=2, strides=2, padding='valid'))
generator.add(BatchNormalization())
generator.add(LeakyReLU(0.2))

# # Conv 2: 28x28x64
# generator.add(Conv2DTranspose(64, kernel_size=3, strides=2, padding='same'))
# generator.add(BatchNormalization())
# generator.add(LeakyReLU(0.2))

# # Conv 3: 28x28x32
# generator.add(Conv2DTranspose(32, kernel_size=3, strides=1, padding='same'))
# generator.add(BatchNormalization())
# generator.add(LeakyReLU(0.2))

# # Conv 4: 28x28x1
# generator.add(Conv2DTranspose(1, kernel_size=3, strides=1, padding='same',
#                               activation='tanh'))

generator.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_43 (Dense)             (None, 12544)             1266944   
_________________________________________________________________
reshape_43 (Reshape)         (None, 7, 7, 256)         0         
_________________________________________________________________
batch_normalization_89 (Batc (None, 7, 7, 256)         1024      
_________________________________________________________________
leaky_re_lu_92 (LeakyReLU)   (None, 7, 7, 256)         0         
_________________________________________________________________
conv2d_transpose_51 (Conv2DT (None, 14, 14, 128)       131200    
_________________________________________________________________
batch_normalization_90 (Batc (None, 14, 14, 128)       512       
_________________________________________________________________
leaky_re_lu_93 (LeakyReLU)   (None, 14, 14, 128)       0         
Total para
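The output shapes and parameter counts in the summary can be re-derived by hand. The sketch below assumes the output-size rule that Keras applies to `Conv2DTranspose` when no `output_padding` is given (input size times stride, plus `max(kernel_size - stride, 0)` for `'valid'` padding); the helper names are hypothetical.

```python
def conv_transpose_output_size(size, kernel_size, stride, padding):
    """Spatial output size of a transposed convolution (no output_padding)."""
    if padding == 'same':
        return size * stride
    if padding == 'valid':
        return size * stride + max(kernel_size - stride, 0)
    raise ValueError('unsupported padding: %r' % padding)

def conv_transpose_params(kernel_size, in_channels, out_channels):
    """Kernel weights plus one bias per output channel."""
    return kernel_size ** 2 * in_channels * out_channels + out_channels

latent_dim = 100

# Dense layer: weights plus biases for 7*7*256 units
dense_params = latent_dim * 7 * 7 * 256 + 7 * 7 * 256   # 1266944

# BatchNormalization: gamma, beta, moving mean, moving variance per channel
batch_norm_params = 4 * 256                              # 1024

# Conv 1: 7x7x256 -> 14x14x128
conv1_size = conv_transpose_output_size(7, kernel_size=2, stride=2, padding='valid')  # 14
conv1_params = conv_transpose_params(2, 256, 128)        # 131200

# Effect of the stride on a later 'same' layer
doubled = conv_transpose_output_size(14, 3, 2, 'same')   # 28
kept = conv_transpose_output_size(28, 3, 1, 'same')      # 28
overshoot = conv_transpose_output_size(28, 3, 2, 'same') # 56
```

The last three lines show why the stride matters once the feature map is already 28x28: a stride of 1 keeps the size, while a stride of 2 would produce a 56x56 output that no longer matches the MNIST images.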

#### Generator model visualization

In [None]:
# prints a summary representation of your model
# generator.summary()

#### Discriminator

Our discriminator is a **convolutional neural network** that takes a 28x28 image with 1 channel. The pixel values are expected to be between -1 and 1.

It takes a digit image and classifies it as real (1) or fake (0).

The last activation is a sigmoid, which outputs the probability that the input image is real.

![discriminator model](../img/discriminative.png)
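As a small illustration of how a sigmoid output is read as a probability, the sketch below applies a sigmoid to a few made-up scores and turns the result into real/fake decisions. The scores and the 0.5 cut-off are assumptions chosen for the example, not values taken from the notebook.

```python
import numpy as np

def sigmoid(logits):
    """Squash real-valued scores into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-logits))

# Made-up pre-activation scores for three images
logits = np.array([-4.0, 0.0, 4.0])
probs = sigmoid(logits)               # probability that each image is real

# Illustrative decision rule: call an image real when the probability is >= 0.5
labels = (probs >= 0.5).astype(int)   # 1 = real, 0 = fake
```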

In [55]:
# Discriminator network
discriminator = Sequential()

# Conv 1: 14x14x32
discriminator.add(Conv2D(32, kernel_size=3, strides=2, padding='same',
                         input_shape=(28, 28, 1), kernel_initializer=init))
discriminator.add(LeakyReLU(0.2))

# Conv 2:
# discriminator.add(Conv2D(64, kernel_size=3, strides=2, padding='same'))
# discriminator.add(LeakyReLU(0.2))

# # Conv 3: 
# discriminator.add(Conv2D(128, kernel_size=3, strides=2, padding='same'))
# discriminator.add(LeakyReLU(0.2))

# # Conv 4:
# discriminator.add(Conv2D(512, kernel_size=3, strides=1, padding='same'))
# discriminator.add(LeakyReLU(0.2))

# # FC
# discriminator.add(Flatten())
# discriminator.add(LeakyReLU(0.2))

# # Output
# discriminator.add(Dense(1, activation='sigmoid'))

discriminator.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_7 (Conv2D)            (None, 14, 14, 32)        320       
_________________________________________________________________
leaky_re_lu_97 (LeakyReLU)   (None, 14, 14, 32)        0         
Total params: 320
Trainable params: 320
Non-trainable params: 0
_________________________________________________________________
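The numbers in this summary can be verified the same way. With `'same'` padding, a strided convolution produces an output of size `ceil(input / stride)`. The sketch below checks the first layer and traces the sizes that the commented-out layers would give if they were enabled; the helper names are hypothetical.

```python
import math

def conv_same_output_size(size, stride):
    """Spatial output size of a strided convolution with 'same' padding."""
    return math.ceil(size / stride)

def conv_params(kernel_size, in_channels, out_channels):
    """Kernel weights plus one bias per output channel."""
    return kernel_size ** 2 * in_channels * out_channels + out_channels

# Conv 1: 28x28x1 -> 14x14x32
conv1_size = conv_same_output_size(28, stride=2)   # 14
conv1_params = conv_params(3, 1, 32)               # 320

# Sizes after the commented-out layers (strides 2, 2, 1), if they were enabled
sizes = [conv1_size]
for stride in (2, 2, 1):
    sizes.append(conv_same_output_size(sizes[-1], stride))
# sizes is now [14, 7, 4, 4]
```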


#### Discriminator model visualization

In [None]:
# prints a summary representation of your model
discriminator.summary()

### 3. Compile model

#### Compile discriminator

In [None]:
# Optimizer
opt = Adam(lr=0.0002, beta_1=0.5)

discriminator.compile(opt, loss='binary_crossentropy',
                      metrics=['binary_accuracy'])

#### Combined network

We connect the generator and the discriminator to make a DCGAN.

In [None]:
# d_g = discriminator(generator(z))
discriminator.trainable = False

z = Input(shape=(latent_dim,))
img = generator(z)
decision = discriminator(img)
d_g = Model(inputs=z, outputs=decision)

d_g.compile(opt, loss='binary_crossentropy',
            metrics=['binary_accuracy'])

#### GAN model visualization

In [None]:
# prints a summary representation of your model
d_g.summary()

### 4. Fit model

We train the discriminator and the generator in turn, in a loop:

1. Set the discriminator to trainable.
2. Train the discriminator on real digit images and on images produced by the generator, so that it learns to tell real from fake.
3. Set the discriminator to non-trainable.
4. Train the generator as part of the GAN: feed latent samples into the GAN, let the generator produce digit images, and use the discriminator to classify them.
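The training cell below also multiplies the real labels by `(1 - smooth)`, a trick known as one-sided label smoothing. The NumPy sketch that follows shows its effect on the binary cross-entropy loss. The `binary_crossentropy` helper is a simplified stand-in written for this example, not the Keras implementation, and the predictions are made up.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy (simplified stand-in for the Keras loss)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

batch_size = 4
smooth = 0.1

real = np.ones((batch_size, 1))
smoothed_real = real * (1 - smooth)   # targets of 0.9 instead of 1.0

# A discriminator that is almost certain the images are real
overconfident = np.full((batch_size, 1), 0.999)
# A discriminator whose prediction matches the smoothed target
calibrated = np.full((batch_size, 1), 0.9)

loss_hard = binary_crossentropy(real, overconfident)
loss_smooth = binary_crossentropy(smoothed_real, overconfident)
loss_calibrated = binary_crossentropy(smoothed_real, calibrated)
```

With hard labels the overconfident prediction is barely penalized, whereas with smoothed labels it costs more than a prediction of 0.9, which discourages the discriminator from becoming too certain about real images.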

In [None]:
epochs = 100
batch_size = 64
smooth = 0.1

real = np.ones(shape=(batch_size, 1))
fake = np.zeros(shape=(batch_size, 1))

d_loss = []
g_loss = []

for e in range(epochs + 1):
    for i in range(len(X_train) // batch_size):
        
        # Train Discriminator weights
        discriminator.trainable = True
        
        # Real samples
        X_batch = X_train[i*batch_size:(i+1)*batch_size]
        d_loss_real = discriminator.train_on_batch(x=X_batch,
                                                   y=real * (1 - smooth))
        
        # Fake Samples
        z = np.random.normal(loc=0, scale=1, size=(batch_size, latent_dim))
        X_fake = generator.predict_on_batch(z)
        d_loss_fake = discriminator.train_on_batch(x=X_fake, y=fake)
         
        # Discriminator loss
        d_loss_batch = 0.5 * (d_loss_real[0] + d_loss_fake[0])
        
        # Train Generator weights
        discriminator.trainable = False
        g_loss_batch = d_g.train_on_batch(x=z, y=real)

        print(
            'epoch = %d/%d, batch = %d/%d, d_loss=%.3f, g_loss=%.3f' % (e + 1, epochs, i, len(X_train) // batch_size, d_loss_batch, g_loss_batch[0]),
            100*' ',
            end='\r'
        )
    
    d_loss.append(d_loss_batch)
    g_loss.append(g_loss_batch[0])
    print('epoch = %d/%d, d_loss=%.3f, g_loss=%.3f' % (e + 1, epochs, d_loss[-1], g_loss[-1]), 100*' ')

    if e % 10 == 0:
        samples = 10
        x_fake = generator.predict(np.random.normal(loc=0, scale=1, size=(samples, latent_dim)))

        for k in range(samples):
            plt.subplot(2, 5, k+1)
            plt.imshow(x_fake[k].reshape(28, 28), cmap='gray')
            plt.xticks([])
            plt.yticks([])

        plt.tight_layout()
        plt.show()

### 5. Evaluate model

In [None]:
# plotting the metrics
plt.plot(d_loss)
plt.plot(g_loss)
plt.title('Model loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Discriminator', 'Adversarial'], loc='center right')
plt.show()

## References

* [Generative Adversarial Networks or GANs](https://arxiv.org/abs/1406.2661)
* [How to Train a GAN? Tips and tricks to make GANs work](https://github.com/soumith/ganhacks)
* [The MNIST database of handwritten digits](http://yann.lecun.com/exdb/mnist/)
* [Convolution](https://devblogs.nvidia.com/deep-learning-nutshell-core-concepts/)