<h1><font color = 'brown' size = '6'>
<b>
Pooling Layer in CNNs
</b>
</font>
</h1>

<h1>
<ul>
<font color = 'brown green' size = '5'>
<b>

<li>
Convolutional layers in a convolutional neural network summarize the presence of features in an input image.
</li><br>

<li>
A problem with the output feature maps is that they are sensitive to the location of the features in the input.
</li><br>

<li>
One approach to address this sensitivity is to downsample the feature maps.
</li><br>

<li>
This has the effect of making the resulting downsampled feature maps more
robust to changes in the position of the feature in the image, referred to by the technical phrase <i>local translation invariance</i>.
</li><br>

<li>
Pooling layers provide an approach to downsampling feature maps by summarizing the presence of features in patches of the feature map.
</li><br>

<li>
Two common pooling methods are average pooling and max pooling, which summarize the average presence of a feature and the most activated presence of a feature, respectively.
</li><br>

<li>
With the most common configuration, a 2 × 2 pooling window applied with a stride of 2, the pooling layer reduces each dimension of a feature map by a factor of 2, so the number of pixels or values in each feature map drops to one quarter of the original. Other window sizes and strides give other reduction factors.
</li><br>

<li>
For example, a 2 × 2 pooling layer with a stride of 2 applied to a feature map of 6 × 6 (36 pixels) will result in an output pooled feature map of 3 × 3 (9 pixels). The pooling operation is specified rather than learned, so the layer has no trainable parameters.
</li><br>

<li>
Two common functions used in the pooling operation are:

<ul>
<li>
Average Pooling: Calculate the average value for each patch on the feature map.</li>
<br>
<li>
Maximum Pooling (or Max Pooling): Calculate the maximum value for each patch of
the feature map.
</li>

</ul>
</li><br>

</b>
</font>
</ul>
</h1>
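The size arithmetic above can be checked with a short sketch. The helper name pooled_size is hypothetical (it is not part of the notebook), and the formula assumes pooling without padding.

```python
def pooled_size(n, pool=2, stride=2):
    """Output length along one dimension for pooling without padding.

    Hypothetical helper for illustration only.
    """
    return (n - pool) // stride + 1


# 2 x 2 window, stride 2: a 6 x 6 map becomes 3 x 3
print(pooled_size(6))                    # 3

# A different window and stride gives a different reduction
print(pooled_size(6, pool=3, stride=1))  # 4
```

The second call shows that halving each dimension is a property of the 2 × 2, stride 2 configuration, not of pooling in general.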

<h1><font color = 'brown' size = '6'>
<b>
Average Pooling Layer
</b>
</font>
</h1>

<h1>
<ul>
<font color = 'brown green' size = '5'>
<b>

<li>
On two-dimensional feature maps, pooling is typically applied in 2 × 2 patches of the feature map with a stride of (2,2).
</li><br>

<li>
Average pooling involves calculating the average for each patch of the feature map.
</li><br>

<li>
This means that each 2 × 2 square of the feature map is downsampled to the
average value in the square.
</li><br>

</b>
</font>
</ul>
</h1>
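As a rough illustration of the patch averaging described above, the following NumPy sketch pools a small 4 × 4 feature map. The function name average_pool_2x2 and the sample values are assumptions made for this example; the notebook itself uses the Keras AveragePooling2D layer.

```python
import numpy as np


def average_pool_2x2(feature_map):
    """Average each non-overlapping 2 x 2 patch (stride 2, no padding)."""
    rows, cols = feature_map.shape
    # Drop a trailing odd row or column so the map splits evenly into patches
    trimmed = feature_map[:rows // 2 * 2, :cols // 2 * 2]
    # Axes 1 and 3 index the position inside each 2 x 2 patch
    patches = trimmed.reshape(rows // 2, 2, cols // 2, 2)
    return patches.mean(axis=(1, 3))


# Illustrative 4 x 4 feature map holding the values 1..16
feature_map = np.arange(1, 17, dtype=float).reshape(4, 4)
pooled = average_pool_2x2(feature_map)
print(pooled)
# [[ 3.5  5.5]
#  [11.5 13.5]]
```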

1. Importing the required libraries

In [1]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import AveragePooling2D

from numpy import asarray

2. Define input data

In [2]:
data = [
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0]
        ]

# Keras expects input shaped as [samples, rows, columns, channels]
data = asarray(data)
data = data.reshape(1, 8, 8, 1)
data.shape

(1, 8, 8, 1)

3. Create the model

In [3]:
model = Sequential()
model.add(Conv2D(filters = 1, kernel_size = (3, 3), activation = 'relu', input_shape = (8, 8, 1)))
# Defaults: pool_size = (2, 2), strides equal to pool_size, padding = 'valid'
model.add(AveragePooling2D())

4. Summarize the model

In [4]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 6, 6, 1)           10        
                                                                 
 average_pooling2d (Average  (None, 3, 3, 1)           0         
 Pooling2D)                                                      
                                                                 
Total params: 10 (40.00 Byte)
Trainable params: 10 (40.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


5. Define a vertical line detector

In [5]:
detector = [
            [[[0]],[[1]],[[0]]],
            [[[0]],[[1]],[[0]]],
            [[[0]],[[1]],[[0]]]
            ]
weights = [asarray(detector), asarray([0.0])]
weights

[array([[[[0]],
 
         [[1]],
 
         [[0]]],
 
 
        [[[0]],
 
         [[1]],
 
         [[0]]],
 
 
        [[[0]],
 
         [[1]],
 
         [[0]]]]),
 array([0.])]
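The nesting of the detector list can be hard to read. A minimal sketch of its layout follows, assuming the usual Keras Conv2D weight order of [kernel, bias], with the kernel shaped (rows, columns, input channels, filters).

```python
import numpy as np

detector = [
    [[[0]], [[1]], [[0]]],
    [[[0]], [[1]], [[0]]],
    [[[0]], [[1]], [[0]]],
]

kernel = np.asarray(detector)
bias = np.asarray([0.0])

# (rows, columns, input channels, filters)
print(kernel.shape)        # (3, 3, 1, 1)
print(bias.shape)          # (1,)

# The single 3 x 3 filter viewed as a plain matrix: ones down the middle column
print(kernel[:, :, 0, 0])
# [[0 1 0]
#  [0 1 0]
#  [0 1 0]]
```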

6. Store the weights in the model

In [6]:
model.set_weights(weights)

7. Apply the filter to the input data

In [7]:
yhat = model.predict(data)



8. Print the pooled output row by row

In [8]:
for r in range(yhat.shape[1]):
  print([yhat[0, r, c, 0] for c in range(yhat.shape[2])])

[0.0, 3.0, 0.0]
[0.0, 3.0, 0.0]
[0.0, 3.0, 0.0]
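To see where these values come from, the same computation can be reproduced by hand in NumPy. This is only a sketch under the assumption that the convolution is an unpadded sliding-window sum (as the 6 × 6 output shape in the model summary suggests); it does not call Keras.

```python
import numpy as np

# 8 x 8 input with a vertical line two pixels wide (columns 3 and 4)
image = np.zeros((8, 8))
image[:, 3:5] = 1.0

# Vertical line detector: ones down the middle column
kernel = np.array([[0, 1, 0],
                   [0, 1, 0],
                   [0, 1, 0]], dtype=float)

# Slide the 3 x 3 kernel over the image without padding -> 6 x 6 feature map
feature_map = np.zeros((6, 6))
for i in range(6):
    for j in range(6):
        feature_map[i, j] = np.sum(image[i:i + 3, j:j + 3] * kernel)

# ReLU leaves these non-negative values unchanged
feature_map = np.maximum(feature_map, 0.0)
print(feature_map[0])   # [0. 0. 3. 3. 0. 0.]

# 2 x 2 average pooling with stride 2 -> 3 x 3 output
pooled = feature_map.reshape(3, 2, 3, 2).mean(axis=(1, 3))
print(pooled)
# [[0. 3. 0.]
#  [0. 3. 0.]
#  [0. 3. 0.]]
```

Each middle patch contains four values of 3, so its average is 3; the outer patches contain only zeros.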


<h1><font color = 'brown' size = '6'>
<b>
Max Pooling Layer
</b>
</font>
</h1>

<h1>
<ul>
<font color = 'brown green' size = '5'>
<b>

<li>
Maximum pooling, or max pooling, is a pooling operation that calculates the maximum, or largest, value in each patch of each feature map.
</li><br>

<li>
The results are downsampled or pooled feature maps that highlight the most present feature in the patch, not the average presence of the features, as in the case of average pooling.
</li><br>

<li>
Max pooling has often been found to work better in practice than average pooling for computer vision tasks such as image classification.
</li><br>

</b>
</font>
</ul>
</h1>
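The max pooling operation described above can be sketched in NumPy as follows. The function name max_pool_2x2 and the 4 × 4 sample values are assumptions for illustration; the notebook itself uses the Keras MaxPooling2D layer.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """Keep the largest value of each non-overlapping 2 x 2 patch."""
    rows, cols = feature_map.shape
    # Drop a trailing odd row or column so the map splits evenly into patches
    trimmed = feature_map[:rows // 2 * 2, :cols // 2 * 2]
    patches = trimmed.reshape(rows // 2, 2, cols // 2, 2)
    return patches.max(axis=(1, 3))


# Illustrative 4 x 4 feature map holding the values 1..16
feature_map = np.arange(1, 17, dtype=float).reshape(4, 4)
pooled = max_pool_2x2(feature_map)
print(pooled)
# [[ 6.  8.]
#  [14. 16.]]
```

Each output value is the strongest activation in its patch, whereas averaging the same patches would give 3.5, 5.5, 11.5 and 13.5.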

1. Importing the required libraries

In [9]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D

from numpy import asarray

2. Define input data

In [10]:
data = [
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0]
       ]
# Keras expects input shaped as [samples, rows, columns, channels]
data = asarray(data)
data = data.reshape(1, 8, 8, 1)
data.shape

(1, 8, 8, 1)

3. Create the model

In [11]:
model = Sequential()
model.add(Conv2D(filters = 1, kernel_size = (3, 3), input_shape = (8, 8, 1)))
# Defaults: pool_size = (2, 2), strides equal to pool_size, padding = 'valid'
model.add(MaxPooling2D())

4. Summarize the model

In [12]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_1 (Conv2D)           (None, 6, 6, 1)           10        
                                                                 
 max_pooling2d (MaxPooling2  (None, 3, 3, 1)           0         
 D)                                                              
                                                                 
Total params: 10 (40.00 Byte)
Trainable params: 10 (40.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


5. Define a vertical line detector

In [13]:
detector = [
            [[[0]],[[1]],[[0]]],
            [[[0]],[[1]],[[0]]],
            [[[0]],[[1]],[[0]]]
           ]

weights = [asarray(detector), asarray([0.0])]

6. Store the weights in the model

In [14]:
model.set_weights(weights)

7. Apply the filter to the input data

In [15]:
yhat = model.predict(data)



8. Print the pooled output row by row

In [16]:
for r in range(yhat.shape[1]):
  print([yhat[0, r, c, 0] for c in range(yhat.shape[2])])

[0.0, 3.0, 0.0]
[0.0, 3.0, 0.0]
[0.0, 3.0, 0.0]
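For this particular input, max pooling and average pooling produce identical outputs, because the detected line happens to fill whole 2 × 2 patches. The sketch below is an illustrative variant that is not part of the original notebook: it shifts the line one pixel to the left so that it straddles two patches, and the two pooling methods then differ. The helper names line_feature_map and pool are hypothetical.

```python
import numpy as np


def line_feature_map(first_col):
    """6 x 6 response of the vertical line detector to a two-pixel-wide line."""
    image = np.zeros((8, 8))
    image[:, first_col:first_col + 2] = 1.0
    fmap = np.zeros((6, 6))
    for i in range(6):
        for j in range(6):
            # The detector has ones only in its middle column, so each output
            # is the sum of the middle column of the 3 x 3 window
            fmap[i, j] = image[i:i + 3, j + 1].sum()
    return fmap


def pool(fmap, reduce):
    """2 x 2 pooling with stride 2, using np.mean or np.max as the reducer."""
    return reduce(fmap.reshape(3, 2, 3, 2), axis=(1, 3))


centered = line_feature_map(3)   # line in columns 3 and 4, as in the notebook
shifted = line_feature_map(2)    # same line moved one pixel to the left

print(pool(centered, np.mean)[0], pool(centered, np.max)[0])
# [0. 3. 0.] [0. 3. 0.]
print(pool(shifted, np.mean)[0], pool(shifted, np.max)[0])
# [1.5 1.5 0. ] [3. 3. 0.]
```

With the shifted line, average pooling dilutes the response to 1.5 in two patches, while max pooling still reports the full activation of 3 in both.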
