# Basic convolution in TensorFlow

Some of the arguments of ``conv2d`` may not be immediately clear. This notebook demonstrates some of their effects. It assumes basic knowledge of convolution. To get started with 2D convolution in TensorFlow, see the documentation at
https://www.tensorflow.org/api_guides/python/nn#Convolution and
https://www.tensorflow.org/api_docs/python/tf/nn/conv2d.

Note that the convolution operators in TensorFlow do not, strictly speaking, implement convolution but _cross-correlation_.
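
The difference between the two operations can be illustrated with a small NumPy sketch. This is not TensorFlow's implementation; the helper names `cross_correlate2d` and `convolve2d` are hypothetical and only handle a single channel without padding. Cross-correlation slides the kernel as it is, while true convolution flips it along both axes first.

```python
import numpy as np

def cross_correlate2d(image, kernel):
    """Slide the kernel over the image without flipping it (no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Elementwise product of the kernel with the patch under it
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def convolve2d(image, kernel):
    """True convolution: cross-correlation with the kernel flipped on both axes."""
    return cross_correlate2d(image, kernel[::-1, ::-1])

image = np.arange(9, dtype=float).reshape(3, 3)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

xcorr = cross_correlate2d(image, kernel)
conv = convolve2d(image, kernel)
print("cross-correlation:\n", xcorr)
print("convolution:\n", conv)
```

For a kernel that is symmetric under this flip, both operations give the same result. For learned filters the distinction rarely matters, because the network can simply learn the flipped weights.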

In [None]:
import tensorflow as tf
import numpy as np

## 1x1 convolution

Some deep learning architectures use 1x1 convolutions. What happens in this "degenerate" case? Let's consider a 3x3 image with 3 channels and a 1x1 filter with 3 input channels and 1 output channel. The resulting image will be 3x3 with 1 channel (size 1x3x3x1), where the value of each pixel is the dot product across channels of the filter with the corresponding pixel in the input image.

In [None]:
input  = tf.Variable(tf.random_normal([1,3,3,3])) # input: [batch, in_height, in_width, in_channels]
filter = tf.Variable(tf.random_normal([1,1,3,1])) # filter: [filter_height, filter_width, in_channels, out_channels]
output = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')

In [None]:
session = tf.Session()
with session.as_default(): # this way the session is not closed after the block
    session.run(tf.global_variables_initializer())
    print("input")
    print(input.eval())
    print("filter")
    print(filter.eval())
    print("result")
    result = session.run(output)
    print(result)

Let's check the result. Each output at position `(x,y)` should be the scalar product of the input channels at `(x,y)` and the channels of the 1x1 filter, which has only the single spatial position `(0,0)`. We inspect output position `(0,0)`:

In [None]:
a = input.eval(session=session)[0,0,0]
print("\ninput at position (0,0), all channels:", a)
b = filter.eval(session=session)[0,0,:,0]
print("filter at position (0,0), all channels:", b)

print("check at (0,0)", np.dot(a,b))

A typical application of 1x1 convolution is changing the number of channels (e.g., for dimensionality reduction).
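
As a rough illustration of the channel change, a 1x1 convolution can be written in NumPy as a matrix product over the channel axis at every pixel. The shapes below (3 input channels reduced to 2 output channels) and the variable names are assumptions chosen for this sketch, not part of the cells above.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.normal(size=(1, 3, 3, 3))    # [batch, height, width, in_channels]
weights = rng.normal(size=(1, 1, 3, 2))  # [1, 1, in_channels, out_channels]

# At every pixel, multiply the channel vector by the (in_channels x out_channels) matrix.
reduced = np.einsum('bhwc,co->bhwo', image, weights[0, 0])

print("input shape: ", image.shape)
print("output shape:", reduced.shape)
```

The spatial size is unchanged; only the last dimension goes from 3 to 2.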

## Strides 

The strides, which determine how much the filter is shifted in each dimension during convolution/cross-correlation, are defined by four integers `[batch, in_height, in_width, in_channels]`.

The first element, `batch`, is the stride over the examples in the batch and is typically set to 1, because one normally does not want to skip examples.

The last element, `in_channels`, is the stride over the input channels. Skipping channels is typically not desired, so it is also set to 1.

The middle arguments refer to the shifts in the two image dimensions.
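
To make the effect of the two middle strides concrete, the following sketch lists the positions at which the top-left corner of the filter is placed when no padding is used. The helper `strided_positions` and the 7x7 image size are hypothetical and chosen only for illustration.

```python
import numpy as np

def strided_positions(in_size, filter_size, stride):
    """Start indices of the filter along one image dimension (no padding)."""
    return list(range(0, in_size - filter_size + 1, stride))

strides = [1, 2, 2, 1]  # [batch, in_height, in_width, in_channels]

# Hypothetical 7x7 image and 3x3 filter
rows = strided_positions(in_size=7, filter_size=3, stride=strides[1])
cols = strided_positions(in_size=7, filter_size=3, stride=strides[2])

print("filter rows start at:   ", rows)
print("filter columns start at:", cols)
print("output size:", (len(rows), len(cols)))
```

With a stride of 2 in both image dimensions, the filter visits every second position, so the output has 3x3 instead of 5x5 pixels.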


## 3x3 convolution

### Padding ``VALID``

In [None]:
filter = tf.Variable(tf.random_normal([3,3,3,1])) # filter: [filter_height, filter_width, in_channels, out_channels]
output = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')

Using padding `VALID`, the dimensions of the output are reduced. When applying a 3x3 filter to a 3x3 image with 3 channels (features), there is only one position where the filter can be placed without crossing the image boundaries. Thus, we get a single pixel output.
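
The single output pixel can be reproduced in NumPy as the sum over the elementwise product of image and filter. This is a sketch with freshly drawn random values, so the numbers will not match the TensorFlow cell; the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.normal(size=(3, 3, 3))   # [height, width, in_channels]
kernel = rng.normal(size=(3, 3, 3))  # [filter_height, filter_width, in_channels]

# Without padding and with stride 1, the output size is in_size - filter_size + 1.
out_h = image.shape[0] - kernel.shape[0] + 1
out_w = image.shape[1] - kernel.shape[1] + 1

# Only one placement exists, so the output is one number.
single_pixel = np.sum(image * kernel)

print("output size:", (out_h, out_w))
print("output value:", single_pixel)
```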

Before running the code: How many parameters does the filter operation have?

In [None]:
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    print("input")
    print(input.eval())
    print("filter")
    print(filter.eval())
    print("result")
    result = session.run(output)
    print(result)

### Padding ``SAME``

Padding ``SAME`` zero-pads the input so that, with stride 1, the dimensions of the output match those of the input.
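
The zero-padding can be sketched in NumPy with `np.pad`. The helper `same_cross_correlate` is hypothetical and assumes an odd filter size, stride 1, and a single output channel; it is meant as an illustration, not as a description of how TensorFlow computes the result internally.

```python
import numpy as np

def same_cross_correlate(image, kernel):
    """Zero-pad a [height, width, channels] image so the output keeps its spatial size.

    Assumes an odd filter size and stride 1.
    """
    kh, kw, _ = kernel.shape
    pad_h, pad_w = kh // 2, kw // 2
    # Pad only the two image dimensions with zeros, not the channel axis.
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w), (0, 0)), mode='constant')
    out = np.zeros(image.shape[:2])
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw, :] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(3, 3, 3))   # [height, width, in_channels]
kernel = rng.normal(size=(3, 3, 3))  # [filter_height, filter_width, in_channels]

same_out = same_cross_correlate(image, kernel)
print("output shape:", same_out.shape)
print(same_out)
```

The center pixel sees the whole image, while border pixels are computed partly from padded zeros.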

In [None]:
filter = tf.Variable(tf.random_normal([3,3,3,1])) # filter: [filter_height, filter_width, in_channels, out_channels]
output = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    print("input")
    print(input.eval())
    print("filter")
    print(filter.eval())
    print("result")
    result = session.run(output)
    print(result)

If the number of output channels (features) of the filter is increased to 3, how many parameters does the resulting filter operation have?

In [None]:
filter = tf.Variable(tf.random_normal([3,3,3,3])) # filter: [filter_height, filter_width, in_channels, out_channels]
output = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    print("input")
    print(input.eval())
    print("filter")
    print(filter.eval())
    print("result")
    result = session.run(output)
    print(result)
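
To check the answers to the parameter questions above, the number of weights can be counted directly from the filter shape. The helper `conv2d_parameter_count` is a hypothetical name for this sketch; it counts only the filter weights, since `tf.nn.conv2d` itself does not add a bias term.

```python
import numpy as np

def conv2d_parameter_count(filter_shape):
    """Number of weights in a filter of shape
    [filter_height, filter_width, in_channels, out_channels]."""
    return int(np.prod(filter_shape))

one_output_channel = conv2d_parameter_count([3, 3, 3, 1])
three_output_channels = conv2d_parameter_count([3, 3, 3, 3])

print("3x3 filter, 3 -> 1 channels:", one_output_channel)
print("3x3 filter, 3 -> 3 channels:", three_output_channels)
```

The count does not depend on the image size or the padding mode, only on the filter shape.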