# Conv2d Parameters

`Conv2d`: Applies a 2D convolution over an input signal composed of several input planes.

`class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None)`



In [4]:
import torch
import torch.nn as nn

## 1. Padding
Purpose: Adds pixels (usually zeros) to the borders of the input feature map to control the output size.

Common Settings:

`padding=0`: No padding ("Valid" convolution).

`padding=1` (with a 3×3 kernel and `stride=1`): Preserves the input spatial dimensions in the output.

`padding='same'`: Automatically calculates the padding so that the output size matches the input size. PyTorch supports this only with `stride=1`.

`padding='valid'`: The same as no padding (`padding=0`).

In [7]:
# padding=0
m = nn.Conv2d(16, 33, 5, stride=1,padding=0)
input = torch.randn(20, 16, 50, 100)
output = m(input)
print(output.size())

torch.Size([20, 33, 46, 96])


In [8]:
# padding='valid' is equivalent to padding=0
m = nn.Conv2d(16, 33, 5, stride=1, padding='valid')
input = torch.randn(20, 16, 50, 100)
output = m(input)
print(output.size())

torch.Size([20, 33, 46, 96])


In [10]:
# padding=1 with a 5×5 kernel: each spatial dimension still shrinks by 2
m = nn.Conv2d(16, 33, 5, stride=1, padding=1)
input = torch.randn(20, 16, 50, 100)
output = m(input)
print(output.size())

torch.Size([20, 33, 48, 98])


In [11]:
# padding='same': the output keeps the input's 50×100 spatial size
m = nn.Conv2d(16, 33, 5, stride=1, padding='same')
input = torch.randn(20, 16, 50, 100)
output = m(input)
print(output.size())

torch.Size([20, 33, 50, 100])
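As a minimal sketch of how `padding='same'` behaves, the cell below checks that the spatial size is preserved for several odd kernel sizes, and that combining it with a stride other than 1 is rejected. The kernel sizes and the variable name `same_with_stride_ok` are illustrative choices. The `ValueError` is what recent PyTorch releases raise; treat the exact exception type as an assumption on other versions.

```python
import torch
import torch.nn as nn

x = torch.randn(2, 16, 50, 100)

# With stride=1, 'same' keeps H and W unchanged for each of these kernel sizes
for k in (3, 5, (3, 7)):
    conv = nn.Conv2d(16, 33, k, stride=1, padding='same')
    print(k, tuple(conv(x).shape))

# Assumption: a strided layer with padding='same' is rejected at construction
try:
    nn.Conv2d(16, 33, 3, stride=2, padding='same')
    same_with_stride_ok = True
except ValueError as err:
    same_with_stride_ok = False
    print("rejected:", err)
```

To downsample while keeping a predictable size, an explicit integer padding together with `stride=2` is the usual alternative.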


## 2. Stride
Purpose: The step size with which the kernel slides across the input.

Common Settings:

`stride=1`: Default value, moves one pixel at a time.

`stride=2`: Performs downsampling, roughly halving each spatial dimension of the output.

`stride > 2`: More aggressive downsampling.

In [13]:
# stride=2 in both height and width
m = nn.Conv2d(16, 33, 5, stride=2, padding=0)
input = torch.randn(20, 16, 50, 100)
output = m(input)
print(output.size())

torch.Size([20, 33, 23, 48])


In [15]:
# stride=(1, 2): step 1 along the height, step 2 along the width
m = nn.Conv2d(16, 33, 5, stride=(1, 2), padding=0)
input = torch.randn(20, 16, 50, 100)
output = m(input)
print(output.size())

torch.Size([20, 33, 46, 48])


## 3. Kernel Size
Purpose: The spatial extent of each filter, determining how large an area of the input each output value sees (its local receptive field).

Common Settings:

`1×1`: Used for increasing or decreasing dimensionality (channel-wise), as in Inception networks.

`3×3`: The most common size, offers a good balance of effectiveness and computational cost.

`5×5, 7×7`: Larger receptive fields, but more parameters.

In [16]:
# 3×3 kernel: each spatial dimension shrinks by 2
m = nn.Conv2d(16, 33, 3, stride=1, padding=0)
input = torch.randn(20, 16, 50, 100)
output = m(input)
print(output.size())

torch.Size([20, 33, 48, 98])


In [17]:
# 5×5 kernel: each spatial dimension shrinks by 4
m = nn.Conv2d(16, 33, 5, stride=1, padding=0)
input = torch.randn(20, 16, 50, 100)
output = m(input)
print(output.size())

torch.Size([20, 33, 46, 96])
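To make the "more parameters" point concrete, here is a small sketch that counts the learnable values of a layer for several kernel sizes. The helper `count_params` is a hypothetical name introduced for this illustration, and the channel counts (16 in, 33 out) simply reuse the ones from the cells above.

```python
import torch.nn as nn

def count_params(layer):
    """Total number of learnable values (weights plus biases)."""
    return sum(p.numel() for p in layer.parameters())

for k in (1, 3, 5, 7):
    conv = nn.Conv2d(16, 33, k)
    # weight: out_channels * in_channels * k * k, bias: out_channels
    expected = 33 * 16 * k * k + 33
    print(f"{k}x{k}: {count_params(conv)} parameters (formula: {expected})")
```

The weight count grows with the square of the kernel size, which is one reason stacks of 3×3 layers are often preferred over a single large kernel.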


## 4. Number of Kernels / Filters
Purpose: Determines the number of channels in the output feature map, i.e., how many different features are learned.

Characteristic: Typically increases as the network gets deeper (e.g., 64 → 128 → 256 → 512).

In [18]:
m = nn.Conv2d(16, 48, 3, stride=1,padding=0)
input = torch.randn(20, 16, 50, 100)
output = m(input)
print(output.size())

torch.Size([20, 48, 48, 98])


In [19]:
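# Second layer applied to the previous cell's output: in_channels (48) must match the previous out_channels
m = nn.Conv2d(48, 128, 3, stride=1, padding=0)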
output = m(output)
print(output.size())

torch.Size([20, 128, 46, 96])


## 5. Groups
`groups` controls the connections between input and output channels. `in_channels` and `out_channels` must both be divisible by `groups`. For example:

`groups=1`: All inputs are convolved to all outputs.

`groups=2`: The operation is equivalent to two conv layers side by side, each seeing half the input channels and producing half the output channels, with the two outputs then concatenated.

`groups=in_channels`: Each input channel is convolved with its own set of filters (of size $\frac{out\_channels}{in\_channels}$). This is commonly called a depthwise convolution.
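The effect of `groups` is easiest to see in the weight tensor. The sketch below is only an illustration: it uses 32 output channels (an assumption made here, since the 33 used in earlier cells is not divisible by 2 or 16) and prints the weight and output shapes for three settings.

```python
import torch
import torch.nn as nn

x = torch.randn(20, 16, 50, 100)

for g in (1, 2, 16):
    conv = nn.Conv2d(16, 32, 3, groups=g)
    # weight shape is (out_channels, in_channels // groups, kH, kW)
    print(f"groups={g}: weight {tuple(conv.weight.shape)}, output {tuple(conv(x).shape)}")
```

The output shape is the same in all three cases. Only the number of input channels each filter sees, and therefore the parameter count, changes.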

## 6. Padding Mode

`padding_mode` (str, optional): Selects how the padded border values are filled: `'zeros'`, `'reflect'`, `'replicate'` or `'circular'`. Default: `'zeros'`.
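The four modes differ only in the values written into the border. As a rough illustration, the sketch below pads a single row with `torch.nn.functional.pad`, assuming that `Conv2d`'s `'zeros'` corresponds to `F.pad`'s `'constant'` mode with value 0, and then checks that the output shape of a layer does not depend on the mode. The dictionary name `padded` is an illustrative choice.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

row = torch.tensor([[[1., 2., 3., 4.]]])  # shape (N=1, C=1, W=4)

# Pad one element on each side of the last dimension with each mode
padded = {
    'zeros': F.pad(row, (1, 1), mode='constant', value=0.0),
    'reflect': F.pad(row, (1, 1), mode='reflect'),
    'replicate': F.pad(row, (1, 1), mode='replicate'),
    'circular': F.pad(row, (1, 1), mode='circular'),
}
for name, t in padded.items():
    print(f"{name:9s}", t.flatten().tolist())

# The output shape is the same for every padding_mode
x = torch.randn(20, 16, 50, 100)
for mode in ('zeros', 'reflect', 'replicate', 'circular'):
    conv = nn.Conv2d(16, 33, 3, padding=1, padding_mode=mode)
    assert tuple(conv(x).shape) == (20, 33, 50, 100)
```

`'reflect'` mirrors the interior without repeating the edge value, `'replicate'` repeats the edge value, and `'circular'` wraps around to the opposite side.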

## Output Size Calculation Formula

General formula (assuming `dilation=1`; Padding and Stride are the values for the corresponding dimension):

    Output Height = ⌊(Input Height + 2×Padding - Kernel Height) / Stride⌋ + 1

    Output Width = ⌊(Input Width + 2×Padding - Kernel Width) / Stride⌋ + 1
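The formula can be checked against the shapes printed earlier with a few lines of plain Python. The function name `conv_output_size` is hypothetical. The `dilation` argument goes beyond the formula as written in this section: it follows the effective kernel size `dilation × (kernel − 1) + 1` from the PyTorch documentation and reduces to the formula above when `dilation=1`.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution."""
    # With dilation=1 the effective kernel is just the kernel size
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1

# Input 50×100 with a 5×5 kernel, as in the cells above
print(conv_output_size(50, 5), conv_output_size(100, 5))                        # padding=0
print(conv_output_size(50, 5, padding=1), conv_output_size(100, 5, padding=1))  # padding=1
print(conv_output_size(50, 5, stride=2), conv_output_size(100, 5, stride=2))    # stride=2
```

These three lines reproduce the 46×96, 48×98 and 23×48 sizes reported by the padding and stride cells.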