In this exercise, we practice using convolution operations.
The output shape after applying a convolution depends on its parameters, such as the filter (kernel) size, the number of channels, and the stride.
This exercise therefore aims to make readers familiar with conv operations.

First, let's import TensorFlow.

In [1]:
import tensorflow as tf

Most convnets, such as VGG-Net and ResNet, rely heavily on convolution and max- or average-pooling operations.

Let's assume that the inputs to our model have the shape [batch_size, height, width, number of channels], in this case [None, 32, 32, 3].

In [2]:
x = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])
x

<tf.Tensor 'Placeholder:0' shape=(?, 32, 32, 3) dtype=float32>

The documentation for the conv2d operation is available at https://www.tensorflow.org/api_docs/python/tf/layers/conv2d.

Let's investigate how the tensor shape changes after applying a basic 3x3 convolution with filters=16 and strides=1.

In [3]:
tf.layers.conv2d(x, filters=16, kernel_size=3, strides=1)

W0801 15:00:11.167540 139799745857280 deprecation.py:323] From <ipython-input-3-74add751a9a4>:1: conv2d (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.keras.layers.Conv2D` instead.
W0801 15:00:11.172342 139799745857280 deprecation.py:506] From /home/wykgroup/appl/anaconda3/envs/ML_study/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor


<tf.Tensor 'conv2d/BiasAdd:0' shape=(?, 30, 30, 16) dtype=float32>

Since we apply the 3x3 convolution without zero padding (padding='valid', the default of tf.layers.conv2d), the height and width of the input images shrink from 32 to 30, and the number of channels changes from 3 to 16.
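The drop from 32 to 30 can be checked by counting where a 3x3 window fits entirely inside the input. The following is a minimal sketch in plain Python; `valid_positions` is a hypothetical helper written for this illustration and is not part of TensorFlow.

```python
def valid_positions(size, kernel_size, stride=1):
    """Return the start indices at which a kernel fits entirely inside the input.

    Hypothetical helper for illustration only; it models one spatial axis
    of a convolution without zero padding.
    """
    return list(range(0, size - kernel_size + 1, stride))


positions = valid_positions(32, 3)
print(len(positions))               # number of output pixels along one axis
print(positions[0], positions[-1])  # first and last window start index
```

Each start index produces one output pixel, so a 32-pixel axis yields 30 outputs for a kernel of size 3.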

Most convnets aim to keep the height and width of the feature maps unchanged in convolutions, changing only the number of channels, and reduce the spatial size of the feature maps with pooling operations.
For this purpose, we apply zero padding in the convolutions.

![](convolution_padding.png)

In [4]:
tf.layers.conv2d(x, filters=16, kernel_size=3, strides=1, padding='same')

<tf.Tensor 'conv2d_1/BiasAdd:0' shape=(?, 32, 32, 16) dtype=float32>

We can check that the height and width are unchanged.

What happens if we apply a convolution with strides=2?

In [5]:
tf.layers.conv2d(x, filters=16, kernel_size=3, strides=2)

<tf.Tensor 'conv2d_2/BiasAdd:0' shape=(?, 15, 15, 16) dtype=float32>

In [6]:
tf.layers.conv2d(x, filters=16, kernel_size=3, strides=2, padding='same')

<tf.Tensor 'conv2d_3/BiasAdd:0' shape=(?, 16, 16, 16) dtype=float32>

We can see that applying a convolution with strides=2 and zero padding halves the height and width. Some models down-sample feature maps by applying a convolution with strides=2.
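The four shapes observed so far follow the output-size rules that TensorFlow documents for 'valid' and 'same' padding. Here is a small sketch of those rules for one spatial axis; the function name `conv_output_size` is an assumption made for this example.

```python
import math


def conv_output_size(size, kernel_size, stride=1, padding="valid"):
    """Spatial output size of a convolution along one axis.

    Illustrative helper (not a TensorFlow function):
      'valid': ceil((size - kernel_size + 1) / stride)
      'same' : ceil(size / stride)
    """
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return math.ceil((size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'valid' or 'same'")


# Reproduce the spatial sizes of the four conv2d calls on a 32x32 input.
for stride in (1, 2):
    for padding in ("valid", "same"):
        print(stride, padding, conv_output_size(32, 3, stride, padding))
```

The number of output channels is not part of this calculation, since it is set directly by the `filters` argument.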

![](convolution_pooling.jpeg)

Next, we look at the effect of pooling operations, which are mostly used for down-sampling.
The documentation for max- and average-pooling is available at:
Max-pooling2d: https://www.tensorflow.org/api_docs/python/tf/layers/max_pooling2d
Average-pooling2d: https://www.tensorflow.org/api_docs/python/tf/layers/average_pooling2d

With the same parameter settings, max- and average-pooling change the tensor shape in the same way.

To down-sample an input image to half of its original size, max- or average-pooling with pool_size=2 and strides=2 is usually used. Let's see how the shape changes.

In [7]:
tf.layers.max_pooling2d(x, pool_size=2, strides=2, padding='same')

W0801 15:00:11.582760 139799745857280 deprecation.py:323] From <ipython-input-7-99515e918c53>:1: max_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.MaxPooling2D instead.


<tf.Tensor 'max_pooling2d/MaxPool:0' shape=(?, 16, 16, 3) dtype=float32>

In [8]:
tf.layers.average_pooling2d(x, pool_size=2, strides=2, padding='same')

W0801 15:00:11.704526 139799745857280 deprecation.py:323] From <ipython-input-8-245eb0b817e4>:1: average_pooling2d (from tensorflow.python.layers.pooling) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.AveragePooling2D instead.


<tf.Tensor 'average_pooling2d/AvgPool:0' shape=(?, 16, 16, 3) dtype=float32>
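To see what the two pooling operations compute, and why they produce the same output shape, here is a minimal sketch on a single 4x4 channel in plain Python. The helpers `pool2d` and `mean` are hypothetical names for this illustration. The sketch uses no padding, which gives the same result as padding='same' whenever the input size is divisible by the stride, as it is here.

```python
def pool2d(image, pool_size=2, stride=2, reduce_fn=max):
    """Pool a 2-D list of numbers with the given window and stride (no padding)."""
    height, width = len(image), len(image[0])
    output = []
    for top in range(0, height - pool_size + 1, stride):
        row = []
        for left in range(0, width - pool_size + 1, stride):
            # Collect the values covered by the current window.
            window = [image[top + i][left + j]
                      for i in range(pool_size)
                      for j in range(pool_size)]
            row.append(reduce_fn(window))
        output.append(row)
    return output


def mean(values):
    """Arithmetic mean, used as the reduction for average pooling."""
    return sum(values) / len(values)


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

max_pooled = pool2d(image, reduce_fn=max)
avg_pooled = pool2d(image, reduce_fn=mean)
print(max_pooled)  # [[6, 8], [14, 16]]
print(avg_pooled)  # [[3.5, 5.5], [11.5, 13.5]]
```

Both results are 2x2: only the reduction inside each window differs, not the window layout.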

We have now seen how convolution and pooling operations work.

As an example, let's build VGG-Net as in the original version used for the ImageNet challenge.

![](vggnet.png)

VGG-Net is composed of a sequence of convolution, pooling, and dense layers.
For an easier implementation, let us define those operations as functions.

In [9]:
def conv(x, filters, kernel_size=3, strides=1, padding='same', activation=tf.nn.relu, use_bias=True):
    return tf.layers.conv2d(x,
                            filters=filters,
                            kernel_size=kernel_size,
                            strides=strides,
                            padding=padding,
                            activation=activation,
                            use_bias=use_bias)

In [10]:
def pool(x, pool_size=2, strides=2, padding='same'):
    return tf.layers.max_pooling2d(x, 
                                   pool_size=pool_size, 
                                   strides=strides, 
                                   padding=padding)

In [11]:
def dense(x, hidden_dim, activation=None):
    return tf.layers.dense(x, 
                           units=hidden_dim, 
                           activation=activation)

In [12]:
def vgg_net(x):
    filters_list = [64, 128, 256, 512]
    hidden_dim_list = [4096, 1000]
    
    x = conv(x, filters_list[0])
    print (x)
    x = conv(x, filters_list[0])
    print (x)
    x = pool(x)
    print (x)

    
    x = conv(x, filters_list[1])
    print (x)
    x = conv(x, filters_list[1])
    print (x)
    x = pool(x)
    print (x)
    
    x = conv(x, filters_list[2])
    print (x)
    x = conv(x, filters_list[2])
    print (x)
    x = conv(x, filters_list[2])
    print (x)
    x = pool(x)
    print (x)
    
    x = conv(x, filters_list[3])
    print (x)
    x = conv(x, filters_list[3])
    print (x)
    x = conv(x, filters_list[3])
    print (x)
    x = pool(x)
    print (x)
    
    x = conv(x, filters_list[3])
    print (x)
    x = conv(x, filters_list[3])
    print (x)
    x = conv(x, filters_list[3])
    print (x)
    x = pool(x)
    print (x)
    
    x = tf.layers.flatten(x)
    print (x)
    x = dense(x, hidden_dim_list[0], activation=tf.nn.relu)
    print (x)
    x = dense(x, hidden_dim_list[0], activation=tf.nn.relu)
    print (x)
    logits = dense(x, hidden_dim_list[1])
    return logits

In [13]:
x_vgg = tf.placeholder(tf.float32, [None, 224, 224, 3])
x_vgg

<tf.Tensor 'Placeholder_1:0' shape=(?, 224, 224, 3) dtype=float32>

In [14]:
vgg_net(x_vgg)

W0801 15:00:11.973469 139799745857280 deprecation.py:323] From <ipython-input-12-417687362aea>:47: flatten (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.flatten instead.


Tensor("conv2d_4/Relu:0", shape=(?, 224, 224, 64), dtype=float32)
Tensor("conv2d_5/Relu:0", shape=(?, 224, 224, 64), dtype=float32)
Tensor("max_pooling2d_1/MaxPool:0", shape=(?, 112, 112, 64), dtype=float32)
Tensor("conv2d_6/Relu:0", shape=(?, 112, 112, 128), dtype=float32)
Tensor("conv2d_7/Relu:0", shape=(?, 112, 112, 128), dtype=float32)
Tensor("max_pooling2d_2/MaxPool:0", shape=(?, 56, 56, 128), dtype=float32)
Tensor("conv2d_8/Relu:0", shape=(?, 56, 56, 256), dtype=float32)
Tensor("conv2d_9/Relu:0", shape=(?, 56, 56, 256), dtype=float32)
Tensor("conv2d_10/Relu:0", shape=(?, 56, 56, 256), dtype=float32)
Tensor("max_pooling2d_3/MaxPool:0", shape=(?, 28, 28, 256), dtype=float32)
Tensor("conv2d_11/Relu:0", shape=(?, 28, 28, 512), dtype=float32)
Tensor("conv2d_12/Relu:0", shape=(?, 28, 28, 512), dtype=float32)
Tensor("conv2d_13/Relu:0", shape=(?, 28, 28, 512), dtype=float32)
Tensor("max_pooling2d_4/MaxPool:0", shape=(?, 14, 14, 512), dtype=float32)
Tensor("conv2d_14/Relu:0", shape=(?, 14

W0801 15:00:12.150186 139799745857280 deprecation.py:323] From <ipython-input-11-6686e59b8b73>:4: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.


Tensor("flatten/Reshape:0", shape=(?, 25088), dtype=float32)
Tensor("dense/Relu:0", shape=(?, 4096), dtype=float32)
Tensor("dense_1/Relu:0", shape=(?, 4096), dtype=float32)


<tf.Tensor 'dense_2/BiasAdd:0' shape=(?, 1000) dtype=float32>

We have successfully implemented VGG-Net as shown in the figure above and confirmed how the tensor shapes change. The output logits of the network can now be fed into the objective function for optimization.

For the last example, we will implement the residual block of PreActResNet, one of the variants of the original residual network first proposed by Kaiming He et al.
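The printed shapes can be cross-checked without building a graph. The sketch below traces only the shapes through the same sequence of layers, assuming 'same' padding for every convolution and 2x2 pooling with stride 2, as in the `conv` and `pool` helpers defined earlier. The function `trace_vgg_shapes` and its block list are written for this illustration.

```python
import math


def trace_vgg_shapes(height, width, channels=3):
    """Trace (height, width, channels) through the VGG-style conv/pool stack.

    Illustrative shape bookkeeping only; no tensors are created.
    """
    # (number of filters, number of conv layers) for each block.
    conv_blocks = [(64, 2), (128, 2), (256, 3), (512, 3), (512, 3)]
    shapes = []
    for filters, num_convs in conv_blocks:
        for _ in range(num_convs):
            # 'same' padding with stride 1 keeps height and width unchanged.
            channels = filters
            shapes.append((height, width, channels))
        # 2x2 pooling with stride 2 halves height and width.
        height, width = math.ceil(height / 2), math.ceil(width / 2)
        shapes.append((height, width, channels))
    flattened = height * width * channels
    return shapes, flattened


shapes, flattened = trace_vgg_shapes(224, 224)
print(shapes[-1])  # shape after the last pooling layer
print(flattened)   # size of the flattened feature vector
```

The final feature map is 7x7x512, which gives the 25088 units seen in the `flatten/Reshape` output.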

![](preact_resnet.png)

In [15]:
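def residual_block(x0, filters):
    # Pre-activation ordering: BN -> ReLU -> conv, applied twice.
    # Note: tf.layers.batch_normalization defaults to training=False.
    x = tf.layers.batch_normalization(x0)
    x = tf.nn.relu(x)
    x = conv(x, filters, activation=None)
    x = tf.layers.batch_normalization(x)
    x = tf.nn.relu(x)
    x = conv(x, filters, activation=None)

    # Skip connection: x and x0 must have the same shape.
    return x + x0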

In [16]:
x = tf.placeholder(tf.float32, shape=[None, 32, 32, 3]) # ex) CIFAR-10, 100
x

<tf.Tensor 'Placeholder_2:0' shape=(?, 32, 32, 3) dtype=float32>

In [17]:
x_conv = conv(x, filters=64, activation=None, use_bias=False)
x_conv

<tf.Tensor 'conv2d_17/Conv2D:0' shape=(?, 32, 32, 64) dtype=float32>

In [18]:
x_residual = residual_block(x_conv, 64)
x_residual

W0801 15:00:12.495867 139799745857280 deprecation.py:323] From <ipython-input-15-f82e21716cf3>:2: batch_normalization (from tensorflow.python.layers.normalization) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.BatchNormalization instead.  In particular, `tf.control_dependencies(tf.GraphKeys.UPDATE_OPS)` should not be used (consult the `tf.keras.layers.batch_normalization` documentation).


<tf.Tensor 'add:0' shape=(?, 32, 32, 64) dtype=float32>

We have successfully built the basic residual block!
But we would like to raise a question: what happens if the dimensionalities of x0 and x in a residual block are different? This can happen if we want to change the shape of the feature maps.
How can we use the residual block in such cases?
We leave this question as homework.
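To make the question concrete, the following sketch compares the shape of the block input with the shape of the residual branch in two cases. It only does shape bookkeeping in plain Python. The helper `same_conv_shape` and the variant with strides=2 and 128 filters are assumptions chosen for illustration, and the sketch deliberately does not answer the homework.

```python
import math


def same_conv_shape(shape, filters, stride=1):
    """Output (height, width, channels) of a conv with 'same' padding."""
    height, width, _ = shape
    return (math.ceil(height / stride), math.ceil(width / stride), filters)


x0_shape = (32, 32, 64)

# Case 1: the block above, two stride-1 convolutions with 64 filters.
matching = same_conv_shape(same_conv_shape(x0_shape, 64), 64)

# Case 2 (hypothetical): the first conv uses strides=2 and 128 filters.
mismatched = same_conv_shape(same_conv_shape(x0_shape, 128, stride=2), 128)

print(matching == x0_shape)    # True: x + x0 is well defined
print(mismatched == x0_shape)  # False: the two shapes differ
print(mismatched, x0_shape)
```

In the second case the residual branch has shape (16, 16, 128) while x0 still has shape (32, 32, 64), so the element-wise sum in `residual_block` cannot be formed as written.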

In the next practice, we will implement the overall pipeline for building classification models based on one of the variants of residual networks, Wide Residual Networks (WRN), with the CIFAR-10 dataset.

We recommend the following references as background reading.
* Original resnet: https://arxiv.org/abs/1512.03385
* Further study on skip-connection (PreActResNet): https://arxiv.org/abs/1603.05027
* Wide ResNet: https://arxiv.org/abs/1605.07146