# Exercise 1 - Pooling

## Objective

In this exercise, you will implement a simplified version of the max pooling layer.

## Details

You will implement two functions and a small script. The first is a padding function. 
Given the input size and the pooling layer parameters (stride and filter size), it finds 
the width and height padding, `wpad` and `hpad`, to add to the input so that the pooling 
windows fit within the padded array.

The next function calculates the output dimensions after pooling given the padded array
dimensions and the pooling parameters (stride and filter size).
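
As a rough illustration of the output-size arithmetic along a single spatial axis, the sketch below counts how many window positions fit in a padded length. The helper name `pooled_length` is hypothetical and not part of the exercise files; it assumes the padded length is at least as large as the filter size.

```python
def pooled_length(padded_length, pool_size, pool_stride):
    """Count the window positions along one axis (hypothetical helper)."""
    # The first window starts at index 0; each further window starts
    # pool_stride elements later, as long as it still fits entirely.
    return (padded_length - pool_size) // pool_stride + 1


print(pooled_length(225, 3, 3))  # 75
print(pooled_length(7, 3, 2))    # 3
```

With a 3x3 filter and a stride of 3, a padded axis of length 225 therefore yields 75 output positions.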

Finally, the script calculates the pooling layer output.

You can run `python pooling.py` to check your implementation. Note that the output check requires a 3x3 filter and a stride of 3.

## Tips

Pooling only affects the spatial dimensions and preserves the batch size (first axis of the padded array) 
and the number of channels (last axis).
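
One way to see this tip in action is the small sketch below. It assumes the special case where the stride equals the filter size and both spatial dimensions are already multiples of it, so the windows can be formed with a reshape. This is only an illustration of the shape behavior, not the approach the exercise asks for.

```python
import numpy as np

# Toy input: batch of 2, 6x6 spatial grid, 4 channels.
x = np.random.rand(2, 6, 6, 4)
k = 3  # filter size, also used as the stride (assumption of this sketch)

n, w, h, c = x.shape
# Split each spatial axis into (number of windows, window size),
# then take the maximum over the two window-size axes.
pooled = x.reshape(n, w // k, k, h // k, k, c).max(axis=(2, 4))

print(pooled.shape)  # (2, 2, 2, 4)
```

The first and last axes (batch size 2 and 4 channels) are unchanged, while each spatial axis shrinks from 6 to 2.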

In [1]:
import numpy as np

In [2]:
input_array = np.random.rand(1, 224, 224, 16)
pool_size = 3
pool_stride = 3

In [3]:
input_array.shape

(1, 224, 224, 16)

In [7]:
_, w, h, _ = input_array.shape
wpad = (w // pool_stride) * pool_stride + pool_size - w
hpad = (h // pool_stride) * pool_stride + pool_size - h
paddings = [[0, 0], [0, wpad], [0, hpad], [0, 0]]
paddings

[[0, 0], [0, 1], [0, 1], [0, 0]]
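
To get a feel for how the padding expression in the cell above behaves, the sketch below evaluates it for a few input lengths. The helper name `pad_amount` is hypothetical. Note that, as written, the expression also adds padding when the length is already a multiple of the stride.

```python
def pad_amount(length, pool_size, pool_stride):
    """Padding for one axis, using the same expression as the cell above."""
    return (length // pool_stride) * pool_stride + pool_size - length


for length in (223, 224, 225):
    pad = pad_amount(length, pool_size=3, pool_stride=3)
    print(length, pad, length + pad)
# 223 2 225
# 224 1 225
# 225 3 228
```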

In [19]:
input_array = np.random.rand(3, 3)
padded = np.pad(input_array, [[1, 2], [0, 1]], mode='constant', constant_values=0)
padded

array([[0.        , 0.        , 0.        , 0.        ],
       [0.79399574, 0.44335109, 0.0400669 , 0.        ],
       [0.04497678, 0.43997423, 0.48284877, 0.        ],
       [0.83902768, 0.2055138 , 0.96229641, 0.        ],
       [0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        ]])

In [21]:
import argparse

import numpy as np

from utils import check_output


def get_paddings(array, pool_size, pool_stride):
    """ 
    get padding sizes 
    args:
    - array [array]: input np array NxWxHxC
    - pool_size [int]: window size
    - pool_stride [int]: stride
    returns:
    - paddings [list[list]]: paddings in np.pad format
    """
    # IMPLEMENT THIS FUNCTION
    _, w, h, _ = array.shape
    wpad = (w // pool_stride) * pool_stride + pool_size - w
    hpad = (h // pool_stride) * pool_stride + pool_size - h
    return [[0, 0], [0, wpad], [0, hpad], [0, 0]]


def get_output_size(shape, pool_size, pool_stride):
    """
    given input shape, pooling window and stride, compute the output shape
    args:
    - shape [list]: input shape
    - pool_size [int]: window size
    - pool_stride [int]: stride
    returns:
    - output_shape [list]: output array shape
    """
    # IMPLEMENT THIS FUNCTION
    _, w, h, _ = shape
    # integer division keeps the output dimensions as ints
    new_w = (w - pool_size) // pool_stride + 1
    new_h = (h - pool_size) // pool_stride + 1
    return [shape[0], new_w, new_h, shape[3]]


if __name__ == '__main__':
#     parser = argparse.ArgumentParser(description='Download and process tf files')
#     parser.add_argument('-f', '--pool_size', required=True, type=int, default=3,
#                         help='pool filter size')
#     parser.add_argument('-s', '--stride', required=True, type=int, default=3,
#                         help='stride size')
#     args = parser.parse_args()

    input_array = np.random.rand(1, 224, 224, 16)
    pool_size = 3
    pool_stride = 3

    # pad the input layer
    paddings = get_paddings(input_array, pool_size, pool_stride)
    padded = np.pad(input_array, paddings, mode='constant', constant_values=0)

    # get output size
    output_size = get_output_size(padded.shape, pool_size, pool_stride)
    output = np.zeros(output_size)

    # IMPLEMENT THE POOLING CALCULATION 
    check_output(output)

Success!
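
The script above leaves the pooling calculation itself unimplemented. A minimal sketch of one possible implementation is shown below, assuming a padded `NxWxHxC` array and plain loops over the output positions. The function name `max_pool` is hypothetical, and the source does not say what `check_output` verifies, so treat this as an illustration rather than the expected solution.

```python
import numpy as np


def max_pool(padded, pool_size, pool_stride):
    """Max pooling over the two spatial axes of an NxWxHxC array (sketch)."""
    n, w, h, c = padded.shape
    out_w = (w - pool_size) // pool_stride + 1
    out_h = (h - pool_size) // pool_stride + 1
    output = np.zeros((n, out_w, out_h, c))
    for i in range(out_w):
        for j in range(out_h):
            ws = i * pool_stride  # window start along the width axis
            hs = j * pool_stride  # window start along the height axis
            window = padded[:, ws:ws + pool_size, hs:hs + pool_size, :]
            # Reduce over the spatial axes only; batch and channels are kept.
            output[:, i, j, :] = window.max(axis=(1, 2))
    return output


# Small worked example: one 4x4 image with a single channel.
x = np.arange(16, dtype=float).reshape(1, 4, 4, 1)
result = max_pool(x, pool_size=2, pool_stride=2)
print(result[0, :, :, 0].tolist())  # [[5.0, 7.0], [13.0, 15.0]]
```

Each output value is the maximum of one 2x2 window, and the batch and channel axes pass through untouched.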
