## Determine the dimensions of the output based on the input size and the filter size
new_height = (input_height - filter_height + 2 * P)/S + 1

new_width = (input_width - filter_width + 2 * P)/S + 1

Here `P` is the padding added to each side of the input and `S` is the stride.
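As a quick sanity check, the formula above can be written as a small Python helper. This is a minimal sketch: the function name `conv_output_size` is hypothetical, and it assumes floor division when the stride does not divide evenly, which the formula itself does not specify.

```python
def conv_output_size(input_size, filter_size, padding, stride):
    """Output length along one dimension: (input - filter + 2 * P) / S + 1.

    Assumption: floor division is used when the stride does not divide evenly.
    """
    return (input_size - filter_size + 2 * padding) // stride + 1


# A 4x4 input with a 2x2 filter, no padding (P = 0), and a stride of 2
new_height = conv_output_size(4, 2, padding=0, stride=2)
new_width = conv_output_size(4, 2, padding=0, stride=2)
print(new_height, new_width)  # 2 2
```

The same helper is used for height and width because the formula treats each spatial dimension independently.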

**Instructions:**
1. Finish off each TODO in the `conv2d` function.
2. Set up the strides, padding, and filter weights/bias (`F_W` and `F_b`) such that the output shape is (1, 2, 2, 3). Note that the filter weights and bias should be TensorFlow variables; the strides are a plain list and the padding is a string.

In [5]:
"""
Set up the strides, padding and filter weight/bias such that
the output shape is (1, 2, 2, 3).
"""
import tensorflow as tf
import numpy as np

# `tf.nn.conv2d` requires the input be 4D (batch_size, height, width, depth)
# (1, 4, 4, 1)
x = np.array([
    [0, 1, 0.5, 10],
    [2, 2.5, 1, -8],
    [4, 0, 5, 6],
    [15, 1, 2, 3]], dtype=np.float32).reshape((1, 4, 4, 1))

X = tf.constant(x)

For the 'SAME' padding, the output height and width are computed as:
> out_height = ceil(float(in_height) / float(strides[1]))

> out_width  = ceil(float(in_width) / float(strides[2]))

For the 'VALID' padding, the output height and width are computed as:
> out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))

> out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))
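The two padding formulas above can be checked without TensorFlow. The sketch below is an illustration only: `same_output_size` and `valid_output_size` are hypothetical helper names, and the values plugged in (a 4x4 input, a 2x2 filter, and strides of `[1, 2, 2, 1]`) are the ones used in this exercise.

```python
import math


def same_output_size(in_size, stride):
    """Output length along one dimension for 'SAME' padding."""
    return math.ceil(float(in_size) / float(stride))


def valid_output_size(in_size, filter_size, stride):
    """Output length along one dimension for 'VALID' padding."""
    return math.ceil(float(in_size - filter_size + 1) / float(stride))


strides = [1, 2, 2, 1]  # (batch_size, height, width, depth)
in_height = in_width = 4
filter_height = filter_width = 2

same_shape = (same_output_size(in_height, strides[1]),
              same_output_size(in_width, strides[2]))
valid_shape = (valid_output_size(in_height, filter_height, strides[1]),
               valid_output_size(in_width, filter_width, strides[2]))

print(same_shape)   # (2, 2)
print(valid_shape)  # (2, 2)
```

For these particular values both padding modes give a 2x2 spatial output, so either one reaches the target shape once the output depth is set to 3.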

In [13]:
def conv2d(input):
    # Filter (weights and bias)
    # The shape of the filter weight is (height, width, input_depth, output_depth)
    # The shape of the filter bias is (output_depth,)
    # TODO: Define the filter weights `F_W` and filter bias `F_b`.
    # NOTE: Remember to wrap them in `tf.Variable`; they are trainable parameters after all.
    F_W = tf.Variable(tf.random_normal([2, 2, 1, 3]))
    F_b = tf.Variable(tf.random_normal([3]))
    # TODO: Set the stride for each dimension (batch_size, height, width, depth)
    strides = [1, 2, 2, 1]
    # TODO: set the padding, either 'VALID' or 'SAME'.
    padding = 'SAME'
    # https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#conv2d
    # `tf.nn.conv2d` does not include the bias computation so we have to add it ourselves after.
    return tf.nn.conv2d(input, F_W, strides, padding) + F_b

out = conv2d(X)
out

<tf.Tensor 'add_5:0' shape=(1, 2, 2, 3) dtype=float32>

## Using Pooling Layers in TensorFlow

In the exercise below, you'll set up the dimensions of the pooling filters, the strides, and the appropriate padding. Review the TensorFlow documentation for `tf.nn.max_pool()` before you start. Padding works the same way as it does for a convolution.

**Instructions:**
1. Finish off each TODO in the `maxpool` function.
2. Set up the strides, padding, and ksize such that the output shape after pooling is (1, 2, 2, 1).

In [16]:
"""
Set the values of `strides` and `ksize` such that
the output shape after pooling is (1, 2, 2, 1).
"""
import tensorflow as tf
import numpy as np

# `tf.nn.max_pool` requires the input be 4D (batch_size, height, width, depth)
# (1, 4, 4, 1)
x = np.array([
    [0, 1, 0.5, 10],
    [2, 2.5, 1, -8],
    [4, 0, 5, 6],
    [15, 1, 2, 3]], dtype=np.float32).reshape((1, 4, 4, 1))
X = tf.constant(x)

def maxpool(input):
    # TODO: Set the ksize (filter size) for each dimension (batch_size, height, width, depth)
    ksize = [1, 2, 2, 1]
    # TODO: Set the stride for each dimension (batch_size, height, width, depth)
    strides = [1, 2, 2, 1]
    # TODO: set the padding, either 'VALID' or 'SAME'.
    padding = 'VALID'
    # https://www.tensorflow.org/versions/r0.11/api_docs/python/nn.html#max_pool
    return tf.nn.max_pool(input, ksize, strides, padding)
    
out = maxpool(X)
out

<tf.Tensor 'MaxPool_1:0' shape=(1, 2, 2, 1) dtype=float32>

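I want to transform the input shape (1, 4, 4, 1) to (1, 2, 2, 1). I chose 'VALID' for the padding algorithm because I find it simpler to understand and it achieves the result I'm looking for. With `ksize` and `strides` both set to [1, 2, 2, 1], the filter is 2x2 and the stride is 2 in each spatial dimension. The 'VALID' output size is: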

> out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))

> out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))

Plugging in the values:

> out_height = ceil(float(4 - 2 + 1) / float(2)) = ceil(1.5) = 2

> out_width  = ceil(float(4 - 2 + 1) / float(2)) = ceil(1.5) = 2
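To see what the pooled tensor would contain, and not only its shape, here is a minimal pure-Python sketch of 2x2 max pooling with a stride of 2 and no padding. The helper `max_pool_2d` is hypothetical (it is not part of TensorFlow), and it handles a single 2D channel rather than a 4D tensor.

```python
def max_pool_2d(grid, ksize, stride):
    """Max pooling over a 2D list of numbers with no padding ('VALID')."""
    out_height = (len(grid) - ksize) // stride + 1
    out_width = (len(grid[0]) - ksize) // stride + 1
    return [
        [
            # Maximum over the ksize x ksize window whose top-left corner
            # is at (r * stride, c * stride)
            max(
                grid[r * stride + i][c * stride + j]
                for i in range(ksize)
                for j in range(ksize)
            )
            for c in range(out_width)
        ]
        for r in range(out_height)
    ]


# Same values as the 4x4 input used in the exercise
x = [
    [0, 1, 0.5, 10],
    [2, 2.5, 1, -8],
    [4, 0, 5, 6],
    [15, 1, 2, 3],
]

pooled = max_pool_2d(x, ksize=2, stride=2)
print(pooled)  # [[2.5, 10], [15, 6]]
```

Each output value is the maximum of one non-overlapping 2x2 window, which is why the 4x4 input shrinks to 2x2.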
