image, in a one-shot manner.
A traditional convolutional neural network for image classification and related tasks uses
pooling layers to downsample its input. For example, an average pooling or max pooling layer
typically halves each dimension of the feature maps from a convolutional layer, resulting in
an output that is one quarter the area of the input. Convolutional layers themselves also
perform a form of downsampling: each filter is applied across the input images or feature maps,
and the resulting output feature map is smaller because of border effects. Padding is often
used to counter this effect. The generator model in a GAN requires the inverse of the pooling
operation in a traditional convolutional network. It needs a layer that translates coarse
salient features into a denser, more detailed output.
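
To make the downsampling arithmetic concrete, here is a minimal NumPy sketch of 2 × 2 max and average pooling. The 4 × 4 feature map and the variable names are made up for illustration, and a real pooling layer also handles batches and channels.

```python
import numpy as np

# A hypothetical 4x4 feature map, used only for illustration.
feature_map = np.arange(1, 17, dtype=float).reshape(4, 4)

# View the map as non-overlapping 2x2 blocks:
# axes are (block row, row in block, block column, column in block).
blocks = feature_map.reshape(2, 2, 2, 2)

# Pool over the two "within block" axes.
max_pooled = blocks.max(axis=(1, 3))
avg_pooled = blocks.mean(axis=(1, 3))

print(max_pooled)
print(avg_pooled)
# Each dimension is halved, so the area is one quarter of the input.
print(max_pooled.size / feature_map.size)

# An unpadded, stride-1 convolution also shrinks the map: size - kernel + 1.
unpadded_conv_size = feature_map.shape[0] - 3 + 1
print(unpadded_conv_size)
```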
A simple version of an unpooling, or opposite pooling, layer is called an upsampling layer.
It works by repeating the rows and columns of the input. A more elaborate approach is to
perform a backwards convolutional operation. This was originally referred to as a
deconvolution, which is incorrect; it is more commonly referred to as a fractionally strided
convolutional layer or a transposed convolutional layer. Both of these layers can be used in
a GAN to perform the required upsampling operation that transforms a small input into a large
image output. In the following sections, we will take a closer look at each and develop an
intuition for how they work so that we can use them effectively in our GAN models.

In [5]:
import numpy as np
import tensorflow
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import UpSampling2D

X = np.array([[1,2],[3,4]])

# We must add a channel dimension (e.g. grayscale) and a sample dimension (we have 1 sample)
# so that the array can be passed as input to the model.
# The data dimensions, in order, are: samples, rows, columns, and channels.

X = X.reshape((1,2,2,1))

In [6]:
model = Sequential()
model.add(UpSampling2D(input_shape=(2,2,1)))
model.summary()


Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
up_sampling2d (UpSampling2D) (None, 4, 4, 1)           0         
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________


In [7]:
# The layer has no parameters or model weights because it does not learn anything;
# it simply repeats each row and column of the input.
yhat = model.predict(X)
yhat

array([[[[1.],
         [1.],
         [2.],
         [2.]],

        [[1.],
         [1.],
         [2.],
         [2.]],

        [[3.],
         [3.],
         [4.],
         [4.]],

        [[3.],
         [3.],
         [4.],
         [4.]]]], dtype=float32)

In [8]:
# reshape the output to remove the sample and channel dimensions, which makes printing easier
yhat = yhat.reshape((4,4))
print(yhat)

[[1. 1. 2. 2.]
 [1. 1. 2. 2.]
 [3. 3. 4. 4.]
 [3. 3. 4. 4.]]
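
The same result can be reproduced without Keras, which shows that nearest-neighbor upsampling is plain row and column repetition. This is an illustrative NumPy sketch, not how the layer is implemented internally.

```python
import numpy as np

X = np.array([[1, 2], [3, 4]], dtype=float)

# Repeat every row twice, then every column twice.
upsampled = np.repeat(np.repeat(X, 2, axis=0), 2, axis=1)
print(upsampled)
```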


In [None]:
# By default, the UpSampling2D layer will double each input dimension. This is defined by the
# size argument, which is set to the tuple (2,2). You may want to use different factors on each
# dimension, such as doubling the height and tripling the width. This could be achieved by
# setting the size argument to (2, 3); the factors apply to rows and columns, in that order.

# example of using different scale factors for each dimension
model.add(UpSampling2D(size=(2, 3)))
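
As a quick check on which factor applies to which axis, the following NumPy sketch mimics size=(2, 3) with row and column repetition. The names row_factor and col_factor are illustrative only.

```python
import numpy as np

X = np.array([[1, 2], [3, 4]], dtype=float)

# Mirrors size=(2, 3): the first factor scales rows, the second scales columns.
row_factor, col_factor = 2, 3
upsampled = np.repeat(np.repeat(X, row_factor, axis=0), col_factor, axis=1)

print(upsampled.shape)
print(upsampled)
```

The 2 × 2 input becomes 4 rows by 6 columns: the height is doubled and the width is tripled.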


In [None]:
# Additionally, by default, the UpSampling2D layer will use a nearest neighbor algorithm to
# fill in the new rows and columns. This has the effect of simply repeating rows and columns, as
# described, and is specified by the interpolation argument set to 'nearest'. Alternatively, a
# bilinear interpolation method can be used, which draws upon multiple surrounding points. This
# can be specified by setting the interpolation argument to 'bilinear'.


model.add(UpSampling2D(interpolation='bilinear'))
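
To see the difference between the two modes, the sketch below upsamples a single row of two values in one dimension. It uses np.interp for the linear case and assumes a half-pixel-centre sampling grid with edge clamping; that convention is an assumption made for illustration, so the exact values produced by the Keras layer may differ.

```python
import numpy as np

row = np.array([1.0, 3.0])
factor = 2

# Nearest neighbor: each value is simply repeated.
nearest = np.repeat(row, factor)

# Linear interpolation: each new value blends the two closest inputs.
# Assumed sampling grid: output pixel centres mapped back to input coordinates.
src_pos = np.arange(len(row))
dst_pos = (np.arange(len(row) * factor) + 0.5) / factor - 0.5
linear = np.interp(dst_pos, src_pos, row)  # values outside the range are clamped

print(nearest)
print(linear)
```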


In [11]:
# Simple generator model with the UpSampling2D layer

# To be useful in a GAN, each UpSampling2D layer must be followed by a Conv2D layer that will learn to interpret the doubled
# input and be trained to translate it into meaningful detail.

from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Reshape
from tensorflow.keras.layers import UpSampling2D
from tensorflow.keras.layers import Conv2D

model = Sequential()
# define the input shape and output enough activations for 128 feature maps of 5x5
model.add(Dense(128 * 5 * 5, input_dim=100))
# reshape the vector of activations into 128 feature maps of 5x5
model.add(Reshape((5, 5, 128)))
# double the width and height, from 128 5x5 to 128 10x10 feature maps
model.add(UpSampling2D())
# fill in detail in the upsampled feature maps and output a single image
model.add(Conv2D(1, (3, 3), padding='same'))

model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_2 (Dense)              (None, 3200)              323200    
_________________________________________________________________
reshape_1 (Reshape)          (None, 5, 5, 128)         0         
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 10, 10, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 10, 10, 1)         1153      
Total params: 324,353
Trainable params: 324,353
Non-trainable params: 0
_________________________________________________________________
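
The parameter counts in the summary can be checked by hand. The following sketch recomputes them from the layer shapes; the variable names are illustrative.

```python
# Dense layer: one weight per (input, output) pair plus one bias per output.
latent_dim = 100
dense_units = 128 * 5 * 5
dense_params = latent_dim * dense_units + dense_units

# Conv2D layer: kernel height * kernel width * input channels * filters, plus one bias per filter.
kernel_h, kernel_w, in_channels, filters = 3, 3, 128, 1
conv_params = kernel_h * kernel_w * in_channels * filters + filters

# Reshape and UpSampling2D contribute no parameters.
total_params = dense_params + conv_params

print(dense_params, conv_params, total_params)
```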


In [None]:
# How to Use a Transposed Convolutional Layer



We can now define our model. The model has only the Conv2DTranspose layer, which
takes 2 × 2 grayscale images as input directly and outputs the result of the operation. The
Conv2DTranspose both upsamples and performs a convolution. As such, we must specify both
the number of filters and the size of the filters as we do for Conv2D layers. Additionally, we
must specify a stride of (2,2) because the upsampling is achieved by the stride behavior of the
convolution on the input. Specifying a stride of (2,2) has the effect of spacing out the input.
Specifically, rows and columns of 0.0 values are inserted to achieve the desired stride. In this
example, we will use one filter, with a 1 × 1 kernel and a stride of 2 × 2 so that the 2 × 2 input
image is upsampled to 4 × 4.

In [14]:
from tensorflow.keras.layers import Conv2DTranspose
model = Sequential()
model.add(Conv2DTranspose(1,(1,1),strides=(2,2),input_shape=(2,2,1)))
model.summary()

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_transpose (Conv2DTran (None, 4, 4, 1)           2         
Total params: 2
Trainable params: 2
Non-trainable params: 0
_________________________________________________________________


In [16]:
# To make it clear what the Conv2DTranspose layer is doing, fix the single kernel weight to 1.0
# and the bias to 0.0, so that the output shows only the effect of the stride.


weights = [np.asarray([[[[1]]]]),np.asarray([0])]
model.set_weights(weights)

In [17]:
yhat = model.predict(X)

In [19]:
yhat = yhat.reshape((4,4))
print(yhat)

[[1. 0. 2. 0.]
 [0. 0. 0. 0.]
 [3. 0. 4. 0.]
 [0. 0. 0. 0.]]
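
The pattern of zeros can be reproduced with a naive, single-channel transposed convolution written in NumPy. This is a sketch for building intuition, not the Keras implementation: conv_transpose_2d is a hypothetical helper, it ignores bias, padding, and multiple channels, and its output size follows the rule input * stride + max(kernel - stride, 0).

```python
import numpy as np

def conv_transpose_2d(x, kernel, stride):
    """Naive single-channel transposed convolution (no bias, no padding)."""
    in_h, in_w = x.shape
    k_h, k_w = kernel.shape
    out_h = in_h * stride + max(k_h - stride, 0)
    out_w = in_w * stride + max(k_w - stride, 0)
    out = np.zeros((out_h, out_w))
    # Each input value "stamps" a scaled copy of the kernel onto the output,
    # with the stamps spaced `stride` positions apart.
    for i in range(in_h):
        for j in range(in_w):
            r, c = i * stride, j * stride
            out[r:r + k_h, c:c + k_w] += x[i, j] * kernel
    return out

X = np.array([[1, 2], [3, 4]], dtype=float)
kernel = np.ones((1, 1))  # a single weight of 1.0, as in the example above
result = conv_transpose_2d(X, kernel, stride=2)
print(result)
```

With a 1 × 1 kernel the stamps never overlap, so the input values land on every second row and column and the remaining positions stay at 0.0.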


The Conv2DTranspose layer is more complex than the UpSampling2D layer, but it is also effective
when used in GAN models, specifically the generator model. Either approach can be used,
although the Conv2DTranspose layer is often preferred.

Our small GAN generator model must produce a 10 × 10 image and take a
100-element vector from the latent space as input, as in the previous UpSampling2D example.
First, a Dense fully connected layer can be used to interpret the input vector and create a
sufficient number of activations (outputs) that can be reshaped into a low-resolution version of
our output image, in this case, 128 versions of a 5 × 5 image.

Next, the 5 × 5 feature maps can be upsampled to a 10 × 10 feature map. We will use a
3 × 3 kernel size for the single filter. With the default 'valid' padding, this would result in
a slightly larger than doubled width and height in the output feature map (11 × 11).
Therefore, we will set the padding argument to 'same' to ensure the output dimensions are
10 × 10 as required.
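
The 11 × 11 versus 10 × 10 arithmetic can be written as a small helper. The function below is a hypothetical sketch of the output-length rule for a transposed convolution, assuming no dilation and no explicit output padding.

```python
def conv_transpose_output_length(input_length, kernel_size, stride, padding):
    """Output length of a transposed convolution along one dimension (sketch)."""
    if padding == "same":
        # The output is exactly the input scaled by the stride.
        return input_length * stride
    if padding == "valid":
        # The kernel overhangs the last stride step by (kernel - stride) when it is larger.
        return input_length * stride + max(kernel_size - stride, 0)
    raise ValueError(f"unsupported padding: {padding}")

valid_size = conv_transpose_output_length(5, kernel_size=3, stride=2, padding="valid")
same_size = conv_transpose_output_length(5, kernel_size=3, stride=2, padding="same")
print(valid_size, same_size)
```

The same rule also accounts for the earlier 1 × 1 kernel example, where a 2 × 2 input with a stride of 2 gives a 4 × 4 output.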

In [22]:
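model = Sequential()
# define the input shape and output enough activations for 128 feature maps of 5x5
model.add(Dense(128 * 5 * 5, input_dim=100))
# reshape the vector of activations into 128 feature maps of 5x5
model.add(Reshape((5, 5, 128)))
# upsample to 10x10 with a stride of 2 and output a single image
model.add(Conv2DTranspose(1, (3, 3), strides=(2, 2), padding='same'))
model.summary()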

Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_5 (Dense)              (None, 3200)              323200    
_________________________________________________________________
reshape_3 (Reshape)          (None, 5, 5, 128)         0         
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 10, 10, 1)         1153      
Total params: 324,353
Trainable params: 324,353
Non-trainable params: 0
_________________________________________________________________
