# Basic Convolution

Convolution combines the values inside a sliding window of data into a single output value by taking a weighted sum with a filter (kernel). It can be applied in 1D or 2D. Here we demonstrate with simple code how convolution works and how to write it with plain element-wise operations.

In [1]:
import numpy as np
import torch
import torch.nn as nn

## Mathematics

For 1D convolution:

$$ y[n] = \sum_{k=0}^{M-1} h[k] \cdot x[n-k]$$

Here $M$ is the length of the filter $h$, so the sum runs over a finite number of filter taps.


For 2D convolution:

$$ Y[i,j] = \sum_{m=0}^{K-1}\sum_{n=0}^{K-1} H[m,n] \cdot X[i-m,j-n]$$

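When an index falls outside the bounds of $x$ or $X$, the input is treated as padded (with zeros in the examples here).

Note that the code below, like PyTorch's `nn.Conv1d` and `nn.Conv2d`, computes cross-correlation rather than true convolution: the kernel is not flipped, so the input is indexed as $x[n+k]$ and $X[i+m, j+n]$ instead of $x[n-k]$ and $X[i-m, j-n]$.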
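
To make the index convention in these formulas concrete, here is a minimal NumPy sketch. The arrays `x` and `h` are made-up illustration values, not data from this notebook. `np.convolve` follows the formula above and flips the kernel, while `np.correlate` slides the kernel without flipping, which is the operation deep learning libraries usually call "convolution".

```python
import numpy as np

# Illustration values only (assumed for this sketch)
x = np.array([1.0, 2.0, 3.0, 4.0])
h = np.array([1.0, 0.0, -1.0])

# True convolution: y[n] = sum_k h[k] * x[n - k] (kernel is flipped)
conv = np.convolve(x, h, mode="valid")

# Cross-correlation: y[n] = sum_k h[k] * x[n + k] (kernel is not flipped)
corr = np.correlate(x, h, mode="valid")

# Cross-correlating with the flipped kernel reproduces the convolution
corr_flipped = np.correlate(x, h[::-1], mode="valid")

print(conv)          # [2. 2.]
print(corr)          # [-2. -2.]
print(corr_flipped)  # [2. 2.]
```

For a symmetric kernel the two operations coincide; for an antisymmetric kernel like this one they differ only in sign.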

## Code for Simple Convolution

### For 1D Convolution:

In [2]:
# Setup for simple 1D convolution

# Simple vector for 1D convolution
# Zero-padded on both ends
vector_x = torch.tensor([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 0.0])

# Simple 1x3 kernel
kernel = torch.tensor([-1.0, 0.0, 1.0])

# Simple convolution function
def conv1d(x, k):
    # Get the length of the input and kernel
    x_len = len(x)
    k_len = len(k)

    # Calculate the length of the output
    out_len = x_len - k_len + 1

    # Initialize the output tensor
    out = torch.zeros(out_len)

    # Perform the convolution operation
    for i in range(out_len):
        for j in range(k_len):
            out[i] += x[i + j] * k[j]
    return out

# Perform the convolution
out = conv1d(vector_x, kernel)
print("Convolution output:", out)

Convolution output: tensor([ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2., -9.])
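
The output length `x_len - k_len + 1` used in `conv1d` is the special case of stride 1 with no extra padding. A minimal sketch of the more general formula, following the one documented for PyTorch's `nn.Conv1d`, is shown here. The helper name `conv1d_out_len` is hypothetical and introduced only for illustration.

```python
def conv1d_out_len(in_len, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1D convolution (hypothetical helper for illustration)."""
    # A dilated kernel spans dilation * (kernel_size - 1) + 1 input positions
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (in_len + 2 * padding - effective_kernel) // stride + 1

print(conv1d_out_len(12, 3))             # 10, same as x_len - k_len + 1
print(conv1d_out_len(10, 3, padding=1))  # 10, padding of 1 preserves the length
print(conv1d_out_len(12, 3, stride=2))   # 5
```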


In [3]:
# 1D Convolution using PyTorch
vector_x = torch.tensor([[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 0.0]])
vector_x = vector_x.unsqueeze(0)  # Add the batch dimension; shape becomes (batch=1, channels=1, length=12)

# Reshape kernel to match PyTorch's expected input
kernel = torch.tensor([[[-1.0, 0.0, 1.0]]])  # (out_channels=1, in_channels=1, kernel_size=3)

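# Named conv1d_layer so it does not shadow the conv1d function defined above
conv1d_layer = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, bias=False)
with torch.no_grad():
    conv1d_layer.weight.copy_(kernel)  # set the weights manually
pytorch_1d = conv1d_layer(vector_x)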

print("PyTorch 1D output:", pytorch_1d)
print("PyTorch 1D size:", pytorch_1d.size())

# Size is batch x output_channels x length

PyTorch 1D output: tensor([[[ 2.,  2.,  2.,  2.,  2.,  2.,  2.,  2.,  2., -9.]]],
       grad_fn=<ConvolutionBackward0>)
PyTorch 1D size: torch.Size([1, 1, 10])
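
The input above was zero-padded by hand. As a sketch of an alternative, the functional API `torch.nn.functional.conv1d` can apply the same zero padding through its `padding` argument, and it takes the weight tensor directly instead of requiring a layer. The variable names `x`, `w`, and `y` are chosen for this illustration.

```python
import torch
import torch.nn.functional as F

# Unpadded signal 1..10, shaped (batch=1, channels=1, length=10)
x = torch.arange(1.0, 11.0).reshape(1, 1, -1)

# Same kernel as before, shaped (out_channels=1, in_channels=1, kernel_size=3)
w = torch.tensor([[[-1.0, 0.0, 1.0]]])

# padding=1 adds one zero on each side, like the hand-padded vector
y = F.conv1d(x, w, padding=1)

print(y.shape)      # torch.Size([1, 1, 10])
print(y.flatten())  # values: 2 (nine times), then -9
```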


### For 2D Convolution:

In [4]:
# Simple 2D matrix with zero-padding already applied
matrix_x = torch.tensor([
    [0.0,  0.0,  0.0,  0.0, 0.0],
    [0.0,  1.0,  2.0,  3.0, 0.0],
    [0.0,  4.0,  5.0,  6.0, 0.0],
    [0.0,  7.0,  8.0,  9.0, 0.0],
    [0.0,  0.0,  0.0,  0.0, 0.0]
])

# Simple 3x3 kernel (e.g., Sobel-like edge detector)
kernel_2d = torch.tensor([
    [-1.0, 0.0, 1.0],
    [-2.0, 0.0, 2.0],
    [-1.0, 0.0, 1.0]
])

# 2D convolution function (no batch or channels)
def conv2d(x, k):
    IX, IY = x.shape
    KX, KY = k.shape
    OX = IX - KX + 1
    OY = IY - KY + 1

    out = torch.zeros((OX, OY))

    for ox in range(OX):
        for oy in range(OY):
            for kx in range(KX):
                for ky in range(KY):
                    out[ox, oy] += x[ox + kx, oy + ky] * k[kx, ky]
    return out

# Perform 2D convolution
output_2d = conv2d(matrix_x, kernel_2d)

print("2D Convolution Output:\n", output_2d)

2D Convolution Output:
 tensor([[  9.,   6.,  -9.],
        [ 20.,   8., -20.],
        [ 21.,   6., -21.]])
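
The two innermost loops of `conv2d` compute a weighted sum over one window. A minimal NumPy sketch of the same idea, with the window sum written as one element-wise multiply and the zero padding produced by `np.pad` rather than typed by hand, could look like this. The function name `conv2d_windowed` is hypothetical.

```python
import numpy as np

def conv2d_windowed(x, k):
    """2D cross-correlation with the per-window sum vectorized (illustrative sketch)."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Multiply the window element-wise with the kernel, then sum
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

image = np.arange(1.0, 10.0).reshape(3, 3)  # the unpadded 3x3 values 1..9
padded = np.pad(image, 1)                   # one ring of zeros around the image
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])

result = conv2d_windowed(padded, kernel)
print(result)
```

The printed values should match the 3x3 output of the loop-based version above.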


In [5]:
# Input 2D matrix with padding already applied
matrix_x = torch.tensor([
    [0.0,  0.0,  0.0,  0.0, 0.0],
    [0.0,  1.0,  2.0,  3.0, 0.0],
    [0.0,  4.0,  5.0,  6.0, 0.0],
    [0.0,  7.0,  8.0,  9.0, 0.0],
    [0.0,  0.0,  0.0,  0.0, 0.0]
])

# Reshape input to match PyTorch's 2D conv format: (batch, channels, height, width)
matrix_x = matrix_x.unsqueeze(0).unsqueeze(0)  # shape: (1, 1, 5, 5)

# Define a 3x3 kernel, same as in the manual version
kernel_2d = torch.tensor([[[[-1.0, 0.0, 1.0],
                            [-2.0, 0.0, 2.0],
                            [-1.0, 0.0, 1.0]]]])  # shape: (out_channels=1, in_channels=1, 3, 3)

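# Set up Conv2d layer (named conv2d_layer so it does not shadow the conv2d function above)
conv2d_layer = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, bias=False)
with torch.no_grad():
    conv2d_layer.weight.copy_(kernel_2d)  # manually set the weights

# Perform the convolution
pytorch_2d = conv2d_layer(matrix_x)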

print("PyTorch 2D output:\n", pytorch_2d)
print("PyTorch 2D size:", pytorch_2d.size())  # should be (1, 1, H_out, W_out)

PyTorch 2D output:
 tensor([[[[  9.,   6.,  -9.],
          [ 20.,   8., -20.],
          [ 21.,   6., -21.]]]], grad_fn=<ConvolutionBackward0>)
PyTorch 2D size: torch.Size([1, 1, 3, 3])


## Code for Multiple Channel Convolution

### For 1D with multiple input and output channels:

In [6]:
# === Inputs ===
# Let's say we have 3 input channels (C_in=3), each of length 12
vector_x = torch.tensor([
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 0.0],   # Channel 1
    [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,  1.0, 0.0],   # Channel 2
    [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,  0.0, 0.0]    # Channel 3
])  # Shape: (in_channels=3, length=12)

# === Kernel ===
# We have 3 output channels. Each output channel has 3 input-channel filters.
# So the full kernel shape is (out_channels=3, in_channels=3, kernel_size=3)
kernel = torch.tensor([
    [[-1.0,  0.0,  1.0],    [0.5,  0.0, -0.5],  [ 1.0,  1.0, 1.0]],    # Kernel for output channel 0
    [[ 1.0,  1.0, -1.0],    [0.0,  1.0,  0.0],  [-1.0,  0.0, 1.0]],    # Kernel for output channel 1
    [[ 0.0,  1.0,  0.0],    [1.0, -1.0,  1.0],  [ 0.0, -1.0, 0.0]]     # Kernel for output channel 2
])  # Shape: (3, 3, 3)

# === Convolution function with multi-input and multi-output ===
def conv1d_multi(x, k):
    in_channels, x_len = x.shape
    out_channels, _, k_len = k.shape
    out_len = x_len - k_len + 1
    out = torch.zeros(out_channels, out_len)

    for oc in range(out_channels):
        for ox in range(out_len):
            for ic in range(in_channels):
                for kx in range(k_len):
                    out[oc, ox] += x[ic, ox + kx] * k[oc, ic, kx]
    return out

# === Perform convolution ===
out = conv1d_multi(vector_x, kernel)

print("Convolution output shape:", out.shape)  # Should be (3, output_len)
print("Convolution output:\n", out)

Convolution output shape: torch.Size([3, 10])
Convolution output:
 tensor([[ 2.5000,  3.0000,  3.0000,  3.0000,  3.0000,  3.0000,  3.0000,  3.0000,
          3.0000, -8.5000],
        [ 1.0000,  1.0000,  1.0000,  4.0000,  4.0000,  4.0000,  7.0000,  7.0000,
          7.0000, 20.0000],
        [ 1.0000,  2.0000,  4.0000,  5.0000,  5.0000,  7.0000,  8.0000,  8.0000,
         10.0000, 10.0000]])


In [7]:

# === Input: 3 input channels, 1 batch ===
# Shape: (batch_size=1, in_channels=3, length=12)
vector_x = torch.tensor([[
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 0.0],    # Channel 1
    [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,  1.0, 0.0],    # Channel 2
    [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,  0.0, 0.0]     # Channel 3
]])

# === Kernel: 3 output channels, 3 input channels, kernel size 3 ===
kernel = torch.tensor([
    [[-1.0,  0.0,  1.0],   [0.5,  0.0, -0.5],  [ 1.0,  1.0, 1.0]],    # Kernel for output channel 0
    [[ 1.0,  1.0, -1.0],   [0.0,  1.0,  0.0],  [-1.0,  0.0, 1.0]],    # Kernel for output channel 1
    [[ 0.0,  1.0,  0.0],   [1.0, -1.0,  1.0],  [ 0.0, -1.0, 0.0]]     # Kernel for output channel 2
])

# === Create Conv1d layer ===
conv1d = nn.Conv1d(in_channels=3, out_channels=3, kernel_size=3, bias=False)

# Set weights manually
with torch.no_grad():
    conv1d.weight.copy_(kernel)

# === Apply convolution ===
output = conv1d(vector_x)

print("PyTorch Convolution Output Shape:", output.shape)  # (1, 3, 10)
print("PyTorch Convolution Output:\n", output[0])  # Remove batch dimension for display

PyTorch Convolution Output Shape: torch.Size([1, 3, 10])
PyTorch Convolution Output:
 tensor([[ 2.5000,  3.0000,  3.0000,  3.0000,  3.0000,  3.0000,  3.0000,  3.0000,
          3.0000, -8.5000],
        [ 1.0000,  1.0000,  1.0000,  4.0000,  4.0000,  4.0000,  7.0000,  7.0000,
          7.0000, 20.0000],
        [ 1.0000,  2.0000,  4.0000,  5.0000,  5.0000,  7.0000,  8.0000,  8.0000,
         10.0000, 10.0000]], grad_fn=<SelectBackward0>)
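
The four nested loops of the multi-channel 1D case can also be written as a single contraction. The following is a sketch under the assumption that extracting all sliding windows with `Tensor.unfold` and summing with `torch.einsum` is acceptable for small inputs; it uses random illustration data rather than the tensors above and checks the result against `torch.nn.functional.conv1d`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(3, 12)     # (in_channels, length), random illustration data
k = torch.randn(3, 3, 3)   # (out_channels, in_channels, kernel_size)

# unfold extracts every sliding window: (in_channels, out_len, kernel_size)
windows = x.unfold(1, k.shape[-1], 1)

# Sum over input channels (c) and kernel positions (k) for each output channel (o)
out = torch.einsum("clk,ock->ol", windows, k)

# Reference result from PyTorch; add and remove the batch dimension
reference = F.conv1d(x.unsqueeze(0), k)[0]

print(out.shape)  # torch.Size([3, 10])
print(torch.allclose(out, reference, atol=1e-5))
```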


### For 2D convolutions:

In [9]:
# === Input tensor: (in_channels=3, H=5, W=5), already padded ===
input_tensor = torch.stack([
    torch.tensor([
        [0.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 2.0, 3.0, 0.0],
        [0.0, 4.0, 5.0, 6.0, 0.0],
        [0.0, 7.0, 8.0, 9.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 0.0]
    ]),
    torch.ones((5, 5)),               # 2nd channel: all 1s
    torch.eye(5)                      # 3rd channel: identity matrix
])  # Shape: (3, 5, 5)

# === Kernel: (out_channels=2, in_channels=3, kernel_size=3x3)
kernel = torch.tensor([
    [  # Kernel for output channel 0
        [[-1.0, 0.0, 1.0],
         [-2.0, 0.0, 2.0],
         [-1.0, 0.0, 1.0]],
        [[0.0, 0.0, 0.0],
         [0.5, 0.5, 0.5],
         [0.0, 0.0, 0.0]],
        [[1.0, 0.0, -1.0],
         [0.0, 0.0,  0.0],
         [-1.0, 0.0, 1.0]]
    ],
    [  # Kernel for output channel 1
        [[1.0, 1.0, 1.0],
         [0.0, 0.0, 0.0],
         [-1.0, -1.0, -1.0]],
        [[0.0, 0.5, 0.0],
         [0.0, 0.5, 0.0],
         [0.0, 0.5, 0.0]],
        [[0.0, 0.0, 0.0],
         [1.0, -1.0, 1.0],
         [0.0, 0.0, 0.0]]
    ]
])  # Shape: (2, 3, 3, 3)

def conv2d_multi(x, k):
    in_channels, IX, IY = x.shape
    out_channels, _, KX, KY = k.shape
    OX, OY = IX - KX + 1, IY - KY + 1
    out = torch.zeros((out_channels, OX, OY))

    for oc in range(out_channels):
        for ox in range(OX):
            for oy in range(OY):
                for ic in range(in_channels):
                    for kx in range(KX):
                        for ky in range(KY):
                            out[oc, ox, oy] += x[ic, ox + kx, oy + ky] * k[oc, ic, kx, ky]
    return out

# Run convolution
manual_output = conv2d_multi(input_tensor, kernel)

print("Manual Multi-Channel 2D Convolution Output:\n", manual_output)
print("Shape:", manual_output.shape)  # Expected: (2, 3, 3)


Manual Multi-Channel 2D Convolution Output:
 tensor([[[ 12.5000,   7.5000,  -8.5000],
         [ 21.5000,  11.5000, -18.5000],
         [ 21.5000,   7.5000, -17.5000]],

        [[ -8.5000, -12.5000,  -9.5000],
         [ -9.5000, -17.5000,  -9.5000],
         [ 10.5000,  17.5000,  11.5000]]])
Shape: torch.Size([2, 3, 3])


In [10]:
# Reshape input to PyTorch format: (batch_size=1, in_channels=3, H=5, W=5)
input_tensor_pt = input_tensor.unsqueeze(0)

# Create Conv2d layer: 3 in channels → 2 out channels
conv2d = nn.Conv2d(in_channels=3, out_channels=2, kernel_size=3, bias=False)

# Set weights manually to match
with torch.no_grad():
    conv2d.weight.copy_(kernel)

# Apply convolution
pytorch_output = conv2d(input_tensor_pt)

print("PyTorch Conv2d Output:\n", pytorch_output[0])  # remove batch dimension for display
print("Shape:", pytorch_output.shape)  # Expected: (1, 2, 3, 3)


PyTorch Conv2d Output:
 tensor([[[ 12.5000,   7.5000,  -8.5000],
         [ 21.5000,  11.5000, -18.5000],
         [ 21.5000,   7.5000, -17.5000]],

        [[ -8.5000, -12.5000,  -9.5000],
         [ -9.5000, -17.5000,  -9.5000],
         [ 10.5000,  17.5000,  11.5000]]], grad_fn=<SelectBackward0>)
Shape: torch.Size([1, 2, 3, 3])
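
Each output channel of a multi-channel convolution is the sum, over input channels, of single-channel convolutions. A minimal sketch that checks this decomposition on random data (illustration values, seeded for reproducibility) with `torch.nn.functional.conv2d`:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 3, 5, 5)   # (batch, in_channels, H, W)
w = torch.randn(2, 3, 3, 3)   # (out_channels, in_channels, kH, kW)

# Full multi-channel convolution in one call: (1, 2, 3, 3)
full = F.conv2d(x, w)

# Rebuild it from single-channel convolutions
per_channel_sum = torch.zeros_like(full)
for oc in range(w.shape[0]):
    for ic in range(w.shape[1]):
        x_single = x[:, ic:ic + 1]            # (1, 1, 5, 5)
        w_single = w[oc:oc + 1, ic:ic + 1]    # (1, 1, 3, 3)
        per_channel_sum[:, oc:oc + 1] += F.conv2d(x_single, w_single)

print(torch.allclose(full, per_channel_sum, atol=1e-5))
```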


## Convolutions when batch size is 2

### For 1D Convolution:

In [12]:
# === Manual Conv1D: With 2 samples in the batch ===

# Shape: (batch_size=2, in_channels=3, length=12)
vector_x = torch.tensor([
    [  # Sample 1
        [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 0.0],
        [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,  1.0, 0.0],
        [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0,  0.0, 0.0]
    ],
    [  # Sample 2
        [ 0.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.0, 0.0],
        [ 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        [ 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
    ]   
])

# Kernel shape: (out_channels=3, in_channels=3, kernel_size=3)
kernel = torch.tensor([
    [[-1.0,  0.0,  1.0],   [0.5, 0.0, -0.5], [1.0, 1.0, 1.0]],
    [[1.0,  1.0, -1.0],    [0.0, 1.0, 0.0],  [-1.0, 0.0, 1.0]],
    [[0.0,  1.0,  0.0],    [1.0, -1.0, 1.0], [0.0, -1.0, 0.0]]
])

# Manual multi-batch convolution
def conv1d_batch(x, k):
    batch_size, in_channels, x_len = x.shape
    out_channels, _, k_len = k.shape
    out_len = x_len - k_len + 1
    out = torch.zeros((batch_size, out_channels, out_len))

    for b in range(batch_size):
        for oc in range(out_channels):
            for i in range(out_len):
                for ic in range(in_channels):
                    for j in range(k_len):
                        out[b, oc, i] += x[b, ic, i + j] * k[oc, ic, j]
    return out

# Run it
out_manual = conv1d_batch(vector_x, kernel)

print("Manual Conv Output Shape:", out_manual.shape)  # (2, 3, 10)
print("Manual Conv Output:\n", out_manual)

Manual Conv Output Shape: torch.Size([2, 3, 10])
Manual Conv Output:
 tensor([[[ 2.5000,  3.0000,  3.0000,  3.0000,  3.0000,  3.0000,  3.0000,
           3.0000,  3.0000, -8.5000],
         [ 1.0000,  1.0000,  1.0000,  4.0000,  4.0000,  4.0000,  7.0000,
           7.0000,  7.0000, 20.0000],
         [ 1.0000,  2.0000,  4.0000,  5.0000,  5.0000,  7.0000,  8.0000,
           8.0000, 10.0000, 10.0000]],

        [[ 9.0000, -1.0000,  0.0000, -1.0000,  0.0000, -1.0000,  0.0000,
          -1.0000,  0.0000,  0.0000],
         [ 2.0000, 10.0000,  9.0000,  8.0000,  7.0000,  6.0000,  5.0000,
           4.0000,  3.0000,  1.0000],
         [ 9.0000,  7.0000,  7.0000,  5.0000,  5.0000,  3.0000,  3.0000,
           1.0000,  1.0000, -1.0000]]])


In [13]:
# === Input: (batch=2, channels=3, length=12)
vector_x = torch.tensor([
    [  # Sample 1
        [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 0.0],
        [0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0],
        [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
    ],
    [  # Sample 2
        [0.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
    ]
])  # Shape: (2, 3, 12)

# Same kernel as manual
kernel = torch.tensor([
    [[-1.0,  0.0,  1.0],   [0.5, 0.0, -0.5], [1.0, 1.0, 1.0]],
    [[1.0,  1.0, -1.0],    [0.0, 1.0, 0.0],  [-1.0, 0.0, 1.0]],
    [[0.0,  1.0,  0.0],    [1.0, -1.0, 1.0], [0.0, -1.0, 0.0]]
])

# Define conv layer
conv1d = nn.Conv1d(in_channels=3, out_channels=3, kernel_size=3, bias=False)

# Manually assign weights
with torch.no_grad():
    conv1d.weight.copy_(kernel)

# Run convolution
out_pytorch = conv1d(vector_x)

print("PyTorch Conv Output Shape:", out_pytorch.shape)  # (2, 3, 10)
print("PyTorch Conv Output:\n", out_pytorch)

PyTorch Conv Output Shape: torch.Size([2, 3, 10])
PyTorch Conv Output:
 tensor([[[ 2.5000,  3.0000,  3.0000,  3.0000,  3.0000,  3.0000,  3.0000,
           3.0000,  3.0000, -8.5000],
         [ 1.0000,  1.0000,  1.0000,  4.0000,  4.0000,  4.0000,  7.0000,
           7.0000,  7.0000, 20.0000],
         [ 1.0000,  2.0000,  4.0000,  5.0000,  5.0000,  7.0000,  8.0000,
           8.0000, 10.0000, 10.0000]],

        [[ 9.0000, -1.0000,  0.0000, -1.0000,  0.0000, -1.0000,  0.0000,
          -1.0000,  0.0000,  0.0000],
         [ 2.0000, 10.0000,  9.0000,  8.0000,  7.0000,  6.0000,  5.0000,
           4.0000,  3.0000,  1.0000],
         [ 9.0000,  7.0000,  7.0000,  5.0000,  5.0000,  3.0000,  3.0000,
           1.0000,  1.0000, -1.0000]]], grad_fn=<ConvolutionBackward0>)
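
Comparing printed tensors by eye is error-prone, so a programmatic check can be useful. The following is a minimal sketch that compares a loop-based batched convolution against `torch.nn.functional.conv1d` with `torch.allclose`. It uses random illustration data, and the function name `conv1d_batch_loops` is hypothetical.

```python
import torch
import torch.nn.functional as F

def conv1d_batch_loops(x, k):
    """Loop-based batched 1D cross-correlation (illustrative sketch)."""
    batch_size, in_channels, x_len = x.shape
    out_channels, _, k_len = k.shape
    out_len = x_len - k_len + 1
    out = torch.zeros((batch_size, out_channels, out_len))
    for b in range(batch_size):
        for oc in range(out_channels):
            for i in range(out_len):
                # Weighted sum over all input channels and kernel positions
                out[b, oc, i] = (x[b, :, i:i + k_len] * k[oc]).sum()
    return out

torch.manual_seed(0)
x = torch.randn(2, 3, 12)   # (batch, in_channels, length)
k = torch.randn(3, 3, 3)    # (out_channels, in_channels, kernel_size)

manual = conv1d_batch_loops(x, k)
builtin = F.conv1d(x, k)

# Largest element-wise difference; expected to be at floating-point noise level
max_diff = (manual - builtin).abs().max().item()
print(torch.allclose(manual, builtin, atol=1e-5), max_diff)
```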


### For 2D Convolutions:

In [12]:
# === Input tensor: (batch=2, channels=3, H=5, W=5)
# Create two different 3-channel 5x5 images
input_batch = torch.stack([
    torch.stack([
        torch.tensor([
            [0.0, 0.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 2.0, 3.0, 0.0],
            [0.0, 4.0, 5.0, 6.0, 0.0],
            [0.0, 7.0, 8.0, 9.0, 0.0],
            [0.0, 0.0, 0.0, 0.0, 0.0]
        ]),
        torch.ones((5, 5)),                # All 1s
        torch.eye(5)                       # Identity
    ]),
    torch.stack([
        torch.full((5, 5), 2.0),           # All 2s
        torch.arange(25).view(5, 5).float(),
        torch.zeros((5, 5))                # All 0s
    ])
])  # Shape: (2, 3, 5, 5)

# === Kernel: (out_channels=2, in_channels=3, 3x3)
kernel = torch.tensor([
    [  # Kernel for output channel 0
        [[-1.0, 0.0, 1.0],
         [-2.0, 0.0, 2.0],
         [-1.0, 0.0, 1.0]],
        [[0.0, 0.0, 0.0],
         [0.5, 0.5, 0.5],
         [0.0, 0.0, 0.0]],
        [[1.0, 0.0, -1.0],
         [0.0, 0.0,  0.0],
         [-1.0, 0.0, 1.0]]
    ],
    [  # Kernel for output channel 1
        [[1.0, 1.0, 1.0],
         [0.0, 0.0, 0.0],
         [-1.0, -1.0, -1.0]],
        [[0.0, 0.5, 0.0],
         [0.0, 0.5, 0.0],
         [0.0, 0.5, 0.0]],
        [[0.0, 0.0, 0.0],
         [1.0, -1.0, 1.0],
         [0.0, 0.0, 0.0]]
    ]
])  # Shape: (2, 3, 3, 3)

def conv2d_multi_batch(x, k):
    B, C, H, W = x.shape
    OC, IC, KX, KY = k.shape
    OX, OY = H - KX + 1, W - KY + 1
    out = torch.zeros((B, OC, OX, OY))

    for b in range(B):
        for oc in range(OC):
            for ox in range(OX):
                for oy in range(OY):
                    for ic in range(IC):
                        for kx in range(KX):
                            for ky in range(KY):
                                out[b, oc, ox, oy] += x[b, ic, ox + kx, oy + ky] * k[oc, ic, kx, ky]
    return out

# Run manual convolution
manual_output = conv2d_multi_batch(input_batch, kernel)
print("Manual Batch Output Shape:", manual_output.shape)
print("Manual Batch Output:\n", manual_output)

Manual Batch Output Shape: torch.Size([2, 2, 3, 3])
Manual Batch Output:
 tensor([[[[ 12.5000,   7.5000,  -8.5000],
          [ 21.5000,  11.5000, -18.5000],
          [ 21.5000,   7.5000, -17.5000]],

         [[ -8.5000, -12.5000,  -9.5000],
          [ -9.5000, -17.5000,  -9.5000],
          [ 10.5000,  17.5000,  11.5000]]],


        [[[  9.0000,  10.5000,  12.0000],
          [ 16.5000,  18.0000,  19.5000],
          [ 24.0000,  25.5000,  27.0000]],

         [[  9.0000,  10.5000,  12.0000],
          [ 16.5000,  18.0000,  19.5000],
          [ 24.0000,  25.5000,  27.0000]]]])


In [13]:
# Same input as above (2, 3, 5, 5)
input_batch_pt = input_batch.clone()

# Create Conv2d: 3 input → 2 output channels
conv2d = nn.Conv2d(in_channels=3, out_channels=2, kernel_size=3, bias=False)

# Set the weights to match
with torch.no_grad():
    conv2d.weight.copy_(kernel)

# Run PyTorch convolution
output_pt = conv2d(input_batch_pt)

print("PyTorch Batch Output Shape:", output_pt.shape)
print("PyTorch Batch Output:\n", output_pt)

PyTorch Batch Output Shape: torch.Size([2, 2, 3, 3])
PyTorch Batch Output:
 tensor([[[[ 12.5000,   7.5000,  -8.5000],
          [ 21.5000,  11.5000, -18.5000],
          [ 21.5000,   7.5000, -17.5000]],

         [[ -8.5000, -12.5000,  -9.5000],
          [ -9.5000, -17.5000,  -9.5000],
          [ 10.5000,  17.5000,  11.5000]]],


        [[[  9.0000,  10.5000,  12.0000],
          [ 16.5000,  18.0000,  19.5000],
          [ 24.0000,  25.5000,  27.0000]],

         [[  9.0000,  10.5000,  12.0000],
          [ 16.5000,  18.0000,  19.5000],
          [ 24.0000,  25.5000,  27.0000]]]], grad_fn=<ConvolutionBackward0>)
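
The batch dimension is handled independently: every sample is convolved with the same kernel, and no information flows between samples. A minimal sketch on random illustration data that checks this by convolving each sample separately and concatenating the results:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(2, 3, 5, 5)   # (batch, in_channels, H, W)
w = torch.randn(2, 3, 3, 3)   # (out_channels, in_channels, kH, kW)

# Convolve the whole batch at once: (2, 2, 3, 3)
batched = F.conv2d(x, w)

# Convolve each sample on its own, keeping a batch dimension of 1, then concatenate
one_by_one = torch.cat([F.conv2d(x[b:b + 1], w) for b in range(x.shape[0])], dim=0)

print(batched.shape)  # torch.Size([2, 2, 3, 3])
print(torch.allclose(batched, one_by_one, atol=1e-5))
```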


## Some Notes
- The input, filters, and output share some indices. The number of input channels and the number of output channels are independent of each other as far as the input and output data are concerned, but the filter tensor is indexed by both.
- In the batched 2D convolution, the loop order used here is:
    - Batch > output channel > output x > output y > input channel > kernel x > kernel y
    - The input index is inferred from the output and kernel indices
    - This is *output-centric* indexing
- The innermost loops iterate over the kernel indices, so the kernel element changes on every loop iteration.
- With *output-centric* indexing, the output index changes the slowest: each output element is fully accumulated before the loops move on to the next one. This is similar to an output-stationary dataflow.
- The spatial size of the output (the dimensions after the channel dimension) is determined in advance by the input size and the kernel size.
- There are always $n_\textrm{chin} \times n_\textrm{chout}$ filters. In other words, every output channel has $n_\textrm{chin}$ filters, one per input channel, and each of these filters has its own weights.
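
The loop order affects only the order in which partial sums are accumulated, not the result. As a sketch, the following compares the output-centric order used in this notebook with an order that puts the kernel index outermost, so each kernel tap is held fixed while it is applied to every output. The function names are hypothetical, and the single-channel 1D case is used to keep the example short.

```python
import numpy as np

def conv1d_output_centric(x, k):
    out = np.zeros(len(x) - len(k) + 1)
    for i in range(len(out)):       # output index: outer loop, changes slowest
        for j in range(len(k)):     # kernel index: inner loop, changes fastest
            out[i] += x[i + j] * k[j]
    return out

def conv1d_kernel_outermost(x, k):
    out = np.zeros(len(x) - len(k) + 1)
    for j in range(len(k)):         # each kernel tap is held fixed...
        for i in range(len(out)):   # ...while it is applied across all outputs
            out[i] += x[i + j] * k[j]
    return out

# Same padded vector and kernel as the first 1D example
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 0.0])
k = np.array([-1.0, 0.0, 1.0])

a = conv1d_output_centric(x, k)
b = conv1d_kernel_outermost(x, k)
print(a)
print(np.array_equal(a, b))  # True
```

In the second ordering every output element is revisited once per kernel tap, which is why the choice of loop order matters for memory access patterns even though the numerical result is the same.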