# Convolutional Neural Networks
- Course: Self Driving Car Nanodegree
- Lesson 8: Convolutional Neural Networks
- Topic: Basic Convnet operations with TF

### Setup
H = height, W = width, D = depth

- We have an input of shape 32x32x3 (HxWxD)
- 20 filters of shape 8x8x3 (HxWxD)
- A stride of 2 for both the height and width (S)
- With padding of size 1 (P)

Formula for calculating the height and width of the output:

    new_height = (input_height - filter_height + 2 * P)/S + 1
    new_width = (input_width - filter_width + 2 * P)/S + 1
    new_height = new_width = (32 - 8 + 2 * 1)/2 + 1 = 14
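
As a quick check of the arithmetic, the formula can be written as a small Python function. This is only a sketch: the name conv_output_size is hypothetical and not part of TensorFlow, and it assumes the division is rounded down when it is not exact.

```python
def conv_output_size(input_size, filter_size, stride, padding):
    """Output height or width from the formula above (hypothetical helper).

    Integer division is an assumption for cases where the stride does not
    divide the numerator evenly.
    """
    return (input_size - filter_size + 2 * padding) // stride + 1


# Values from the setup: 32x32 input, 8x8 filter, stride 2, padding 1
new_height = conv_output_size(32, 8, 2, 1)
new_width = conv_output_size(32, 8, 2, 1)
print(new_height, new_width)  # 14 14
```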

In [None]:
import tensorflow as tf  # this notebook uses the TensorFlow 1.x API (tf.placeholder)

input = tf.placeholder(tf.float32, (None, 32, 32, 3))
filter_weights = tf.Variable(tf.truncated_normal((8, 8, 3, 20))) # (height, width, input_depth, output_depth)
filter_bias = tf.Variable(tf.zeros(20))
strides = [1, 2, 2, 1] # (batch, height, width, depth)
padding = 'SAME'
conv = tf.nn.conv2d(input, filter_weights, strides, padding) + filter_bias

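For padding = SAME, TF chooses the amount of padding itself and the output height and width are ceil(input / stride), so conv has the shape [batch, 16, 16, 20] and not 14x14. For padding = VALID, no padding is added and the output height and width are ceil((input - filter + 1) / stride), so the shape would be [batch, 13, 13, 20].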
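
The two shapes can be reproduced with the output-size rules TensorFlow documents for its padding modes: ceil(input / stride) for SAME and ceil((input - filter + 1) / stride) for VALID. The helper names below are hypothetical and only illustrate the arithmetic.

```python
import math


def same_output_size(input_size, stride):
    # 'SAME': TensorFlow pads as needed so that output = ceil(input / stride)
    return math.ceil(input_size / stride)


def valid_output_size(input_size, filter_size, stride):
    # 'VALID': no padding, output = ceil((input - filter + 1) / stride)
    return math.ceil((input_size - filter_size + 1) / stride)


print(same_output_size(32, 2))      # 16
print(valid_output_size(32, 8, 2))  # 13
```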

### Convnets with TF
TF provides the following functions to create convolutional layers: **tf.nn.conv2d()** and **tf.nn.bias_add()**.
The stride has the following syntax:
    
    stride = [batch, input_height, input_width, input_channels]
Generally, the batch and input_channels strides are set to 1, while the input_height and input_width strides set how far the filter moves across the image at each step.
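
To make the effect of the height and width strides concrete, the sketch below lists the positions a filter visits along one axis when no padding is used. The function window_starts is a hypothetical helper, and the sizes (a 10-pixel axis and a 5-pixel filter) are chosen only for illustration.

```python
def window_starts(input_size, filter_size, stride):
    """Offsets where the filter is placed along one axis, without padding."""
    return list(range(0, input_size - filter_size + 1, stride))


print(window_starts(10, 5, 1))  # [0, 1, 2, 3, 4, 5]
print(window_starts(10, 5, 2))  # [0, 2, 4]
```

A larger stride skips positions, so the filter is applied fewer times and the output is smaller.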

In [None]:
# Output depth
k_output = 64

# Image Properties
image_width = 10
image_height = 10
color_channels = 3

# Convolution filter
filter_size_width = 5
filter_size_height = 5

# Input/Image
input = tf.placeholder(
    tf.float32,
    shape=[None, image_height, image_width, color_channels])

# Weight and bias
weight = tf.Variable(tf.truncated_normal(
    [filter_size_height, filter_size_width, color_channels, k_output]))
bias = tf.Variable(tf.zeros(k_output))

# Apply Convolution
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
# Add bias
conv_layer = tf.nn.bias_add(conv_layer, bias)
# Apply activation function
conv_layer = tf.nn.relu(conv_layer)

### Max Pooling
This technique reduces the spatial size of its input. A small window slides over the input, and only the maximum value inside each window is kept. For example, [[1, 0], [4, 6]] becomes 6, because 6 is the maximum value in this set.
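
The idea can be sketched in plain Python, without TensorFlow. The function max_pool_2d is a hypothetical helper that works on a single 2D list and uses no padding; it only illustrates the operation and is not how TF implements it.

```python
def max_pool_2d(matrix, size=2, stride=2):
    """Max pooling over a 2D list of numbers, without padding (illustrative only)."""
    rows, cols = len(matrix), len(matrix[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect the values inside the current window and keep the largest
            window = [matrix[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


print(max_pool_2d([[1, 0], [4, 6]]))  # [[6]]

image = [
    [1, 0, 2, 3],
    [4, 6, 6, 8],
    [3, 1, 1, 0],
    [1, 2, 2, 4],
]
print(max_pool_2d(image))  # [[6, 8], [3, 4]]
```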

TF provides **tf.nn.max_pool()** to apply max pooling to convolutional layers.

ksize represents the filter (window) size. Both ksize and strides commonly use 2x2 for the height and width, as the example below shows.

This technique decreases the size of the output, which reduces the number of parameters in later layers and can therefore help reduce overfitting.

In [None]:
...
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
conv_layer = tf.nn.bias_add(conv_layer, bias)
conv_layer = tf.nn.relu(conv_layer)
# Apply Max Pooling
conv_layer = tf.nn.max_pool(
    conv_layer,
    ksize=[1, 2, 2, 1],
    strides=[1, 2, 2, 1],
    padding='SAME')
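
The shapes produced by this cell can be traced by hand. The sketch below assumes the elided setup matches the previous cell (a 10x10 image and k_output = 64) and uses TensorFlow's rule for padding='SAME', output = ceil(input / stride). The helper name same_output_size is hypothetical.

```python
import math


def same_output_size(input_size, stride):
    # With padding='SAME', TensorFlow's output size is ceil(input / stride)
    return math.ceil(input_size / stride)


# Assumed to match the earlier cell
image_height, image_width, k_output = 10, 10, 64

# Convolution with strides=[1, 2, 2, 1]
conv_h = same_output_size(image_height, 2)
conv_w = same_output_size(image_width, 2)

# Max pooling with strides=[1, 2, 2, 1]
pool_h = same_output_size(conv_h, 2)
pool_w = same_output_size(conv_w, 2)

print((conv_h, conv_w, k_output))  # (5, 5, 64)
print((pool_h, pool_w, k_output))  # (3, 3, 64)
```

The depth stays at 64 throughout, because max pooling is applied to each channel separately.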