### How Max Pooling Works:

Max pooling is a downsampling operation that reduces the spatial dimensions (height and width) of its input. In convolutional neural networks (CNNs) it is typically applied to feature maps, which shrinks them and lowers the computation required by later layers.


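In max pooling, a window (often called a filter or kernel, although it has no learnable weights) slides over the input feature map and, for each patch it covers, outputs the maximum value. Each patch collapses to a single number, so the output has one entry per window position rather than one per input element.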
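The sliding-window description above can be sketched in plain Python. This is a minimal illustration rather than how any library implements pooling: the function name `max_pool2d_naive`, the parameter names, and the 4x4 values are made up for the example, and padding is assumed to be zero.

```python
def max_pool2d_naive(feature_map, kernel_size, stride):
    """Naive 2D max pooling over a list of lists (assumes no padding)."""
    height, width = len(feature_map), len(feature_map[0])
    # Number of window positions that fit along each dimension
    out_h = (height - kernel_size) // stride + 1
    out_w = (width - kernel_size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Gather the patch whose top-left corner is (i * stride, j * stride)
            patch = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(kernel_size)
                for dj in range(kernel_size)
            ]
            row.append(max(patch))
        pooled.append(row)
    return pooled


# Illustrative 4x4 feature map (values chosen by hand)
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 0],
    [7, 2, 9, 8],
    [4, 1, 3, 5],
]
pooled = max_pool2d_naive(feature_map, kernel_size=2, stride=2)
print(pooled)  # [[6, 4], [7, 9]]
```

Each 2x2 patch contributes exactly one value, so the 4x4 input becomes a 2x2 output.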

### Common Parameters in Max Pooling:

-   **Kernel size** (`k`): The side length of the window that slides over the input. For example, `k = 2` gives a 2x2 window and `k = 3` a 3x3 window.
-   **Stride** (`s`): How far the window moves at each step. Typically this equals the kernel size (for example, stride 2 for a 2x2 window), so that consecutive windows do not overlap.
-   **Padding** (`p`): The number of rows and columns added to each side of the input before pooling. Usually this is 0, meaning no padding.
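To see how these three parameters interact, the sketch below lists which positions each window covers along a single dimension. The helper `window_spans` is a hypothetical name introduced here for illustration. Indices are given in the coordinates of the unpadded input, so a negative index means the window starts inside the padding.

```python
def window_spans(length, k, s, p=0):
    """Return (start, end) pairs, end exclusive, for every window position.

    Coordinates refer to the unpadded input; the padded input spans
    the range [-p, length + p).
    """
    spans = []
    start = -p
    while start + k <= length + p:
        spans.append((start, start + k))
        start += s
    return spans


# Stride equal to kernel size: windows tile the input without overlap
print(window_spans(6, k=2, s=2))       # [(0, 2), (2, 4), (4, 6)]

# Stride smaller than kernel size: neighbouring windows overlap
print(window_spans(6, k=3, s=1))       # [(0, 3), (1, 4), (2, 5), (3, 6)]

# One element of padding on each side: the first window starts at -1
print(window_spans(6, k=3, s=2, p=1))  # [(-1, 2), (1, 4), (3, 6)]
```

The number of spans returned is the output size along that dimension.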

### Formula for Output Size (for 2D input):

If the input size is $H \times W$ (height by width), the output size $OH \times OW$ (output height by output width) after applying max pooling is:

$OH= \left\lfloor \frac{H - k + 2p}{s} \right\rfloor + 1$

$OW= \left\lfloor \frac{W - k + 2p}{s} \right\rfloor + 1$

Where:

-   $H$ = Height of the input
-   $W$ = Width of the input
-   $k$ = Side length of the pooling kernel (for example, $k = 2$ for a 2x2 window)
-   $s$ = Stride
-   $p$ = Padding (usually 0)
-   $⌊x⌋$ = Floor function (rounds down to the nearest integer)

### Example:

Suppose you have an input of size $6 \times 6$ and apply a $2 \times 2$ max pooling with a stride of 2 and no padding $(p = 0)$:

$OH = \left\lfloor \frac{6 - 2 + 2(0)}{2} \right\rfloor + 1 = \left\lfloor \frac{4}{2} \right\rfloor + 1 = 2 + 1 = 3$
 
$OW = \left\lfloor \frac{6 - 2 + 2(0)}{2} \right\rfloor + 1 = \left\lfloor \frac{4}{2} \right\rfloor + 1 = 2 + 1 = 3$

So the output size would be $3 \times 3$.
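In the example above the division is exact, so the floor has no visible effect. The sketch below, with the hypothetical helpers `pooled_size` and `uncovered_tail`, shows what happens when it is not: the rows or columns at the far edge that no window reaches are simply left out. It assumes the floor-based formula given earlier and is easiest to read with `p = 0`.

```python
def pooled_size(n, k, s, p=0):
    """Output size along one dimension, using the floor-based formula."""
    return (n - k + 2 * p) // s + 1


def uncovered_tail(n, k, s, p=0):
    """Count positions at the far edge that no pooling window covers."""
    out = pooled_size(n, k, s, p)
    covered = (out - 1) * s + k  # extent from the first window to the last
    return n + 2 * p - covered


# (input size, kernel size, stride) triples chosen for illustration
for n, k, s in [(6, 2, 2), (7, 2, 2), (8, 3, 2)]:
    print(f"n={n}, k={k}, s={s}: output={pooled_size(n, k, s)}, "
          f"uncovered={uncovered_tail(n, k, s)}")
# n=6, k=2, s=2: output=3, uncovered=0
# n=7, k=2, s=2: output=3, uncovered=1
# n=8, k=3, s=2: output=3, uncovered=1
```

A $7 \times 7$ input therefore yields the same $3 \times 3$ output as the $6 \times 6$ one, because the seventh row and column never fall inside a complete window.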



In [1]:
import torch
import torch.nn as nn

# Input tensor of size (batch_size, channels, height, width)
x = torch.randn(1, 1, 6, 6)

# Define a 2x2 max pooling layer with stride 2
maxpool = nn.MaxPool2d(kernel_size=2, stride=2)

# Apply the max pooling operation
output = maxpool(x)

# Print the input and output tensors along with their shapes
print(f"Random tensor: \n{x}")
print(f"Random tensor shape: {x.shape}")

print(f"Output tensor: \n{output}")
print(f"Output tensor shape: {output.shape}")

Random tensor: 
tensor([[[[-2.2014, -0.7309,  0.0099, -1.0123,  1.5577,  2.3575],
          [ 0.2920, -0.0343,  1.0708,  1.5329,  1.5917,  0.4434],
          [-0.9276, -0.4189, -0.5363,  1.3055, -1.0608,  0.0906],
          [ 0.4941, -1.7198, -1.3727,  0.0968,  0.5845,  1.8343],
          [ 0.1267,  1.5990,  0.0188, -0.3399,  1.2139, -0.8310],
          [-0.5266, -0.7266,  0.4163, -2.3195,  1.1874, -1.2139]]]])
Random tensor shape: torch.Size([1, 1, 6, 6])
Output tensor: 
tensor([[[[0.2920, 1.5329, 2.3575],
          [0.4941, 1.3055, 1.8343],
          [1.5990, 0.4163, 1.2139]]]])
Output tensor shape: torch.Size([1, 1, 3, 3])
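The printed result can be checked by hand. The sketch below copies the 6x6 values shown above into a plain Python list and takes the maximum of each non-overlapping 2x2 block. It does not use PyTorch and is meant only as a cross-check of the output above.

```python
# Values copied from the printed input tensor above
x = [
    [-2.2014, -0.7309,  0.0099, -1.0123,  1.5577,  2.3575],
    [ 0.2920, -0.0343,  1.0708,  1.5329,  1.5917,  0.4434],
    [-0.9276, -0.4189, -0.5363,  1.3055, -1.0608,  0.0906],
    [ 0.4941, -1.7198, -1.3727,  0.0968,  0.5845,  1.8343],
    [ 0.1267,  1.5990,  0.0188, -0.3399,  1.2139, -0.8310],
    [-0.5266, -0.7266,  0.4163, -2.3195,  1.1874, -1.2139],
]

k = s = 2  # 2x2 window, stride 2

# For each window origin (r, c), take the maximum over the k x k block
pooled = [
    [
        max(x[r + dr][c + dc] for dr in range(k) for dc in range(k))
        for c in range(0, len(x[0]) - k + 1, s)
    ]
    for r in range(0, len(x) - k + 1, s)
]

# Values copied from the printed output tensor above
expected = [
    [0.2920, 1.5329, 2.3575],
    [0.4941, 1.3055, 1.8343],
    [1.5990, 0.4163, 1.2139],
]
print(pooled == expected)  # True
```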


In [7]:
import ipywidgets as widgets
from IPython.display import display

# Function to calculate output height (OH)
# Integer floor division (//) plays the role of the floor in the formula
def calculate_output_height(H, k, p, s):
    return (H - k + 2 * p) // s + 1

# Create widgets for the user input
height_input = widgets.IntSlider(value=28, min=16, max=64, step=1, description="Input Height (H):")
kernel_size_input = widgets.IntSlider(value=2, min=1, max=10, step=1, description="Kernel Size (k):")
padding_input = widgets.IntSlider(value=0, min=0, max=10, step=1, description="Padding (p):")
stride_input = widgets.IntSlider(value=2, min=1, max=10, step=1, description="Stride (s):")

# Output widget to display the result
output_display = widgets.Output()

# Function to update the output display when widgets change
def update_output(change):
    with output_display:
        # Get the input values
        H = height_input.value
        k = kernel_size_input.value
        p = padding_input.value
        s = stride_input.value
        
        # Calculate the output height (OH)
        OH = calculate_output_height(H, k, p, s)
        
        # Clear output and display result
        output_display.clear_output(wait=True)
        print(f"Output Height (OH): {OH}")

# Set up event listeners for each widget to update output
height_input.observe(update_output, names='value')
kernel_size_input.observe(update_output, names='value')
padding_input.observe(update_output, names='value')
stride_input.observe(update_output, names='value')

# Display the widgets and the output area
display(height_input, kernel_size_input, padding_input, stride_input, output_display)

# Initial output
update_output(None)
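When the notebook is viewed without a live kernel, the sliders do not respond. As a static alternative, the sketch below tabulates the same calculation for a few settings. The input height of 28 matches the slider default. The particular kernel sizes and strides are chosen only for illustration, and `calculate_output_height` is redefined so the snippet runs on its own.

```python
def calculate_output_height(H, k, p, s):
    # Same formula as in the widget cell
    return (H - k + 2 * p) // s + 1


H, p = 28, 0
table = {}
for k in (2, 3):
    for s in (1, 2, 3):
        table[(k, s)] = calculate_output_height(H, k, p, s)
        print(f"H={H}, k={k}, p={p}, s={s} -> OH={table[(k, s)]}")
# H=28, k=2, p=0, s=1 -> OH=27
# H=28, k=2, p=0, s=2 -> OH=14
# H=28, k=2, p=0, s=3 -> OH=9
# H=28, k=3, p=0, s=1 -> OH=26
# H=28, k=3, p=0, s=2 -> OH=13
# H=28, k=3, p=0, s=3 -> OH=9
```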

