# 2D Convolution
This notebook builds an understanding of 2D convolution by defining convolution functions in NumPy that support varying stride and varying numbers of input and output channels (called layers below), and by checking the accuracy of each one against PyTorch's convolution function.

### Imports
Import the libraries needed to define and check the convolution functions.

In [1]:
import numpy as np
import torch

### Define PyTorch Convolution
Define a function that uses PyTorch to convolve an image with a set of kernel weights.

In [2]:
# Set print size of PyTorch output
np.set_printoptions(precision=3, floatmode="fixed")
torch.set_printoptions(precision=3)

# Perform convolution using PyTorch
def conv_pytorch(image, conv_weights, stride=1, pad=1):
  # Convert image and kernel to tensors
  image_tensor = torch.from_numpy(image) # (batchSize, channelsIn, imageHeightIn, imageWidthIn)
  conv_weights_tensor = torch.from_numpy(conv_weights) # (channelsOut, channelsIn, kernelHeight, kernelWidth)
  # Perform the convolution
  output_tensor = torch.nn.functional.conv2d(image_tensor, conv_weights_tensor, stride=stride, padding=pad)
  # Convert back from PyTorch to NumPy and return
  return output_tensor.numpy() # (batchSize, channelsOut, imageHeightOut, imageWidthOut)

### Define Numpy Convolution
Define a function that uses NumPy to perform convolution on a single-channel image with a stride of 1.

In [3]:
def conv_numpy_1(image, weights, pad=1):
    # Perform zero padding
    if pad != 0:
        image = np.pad(image, ((0, 0), (0 ,0), (pad, pad), (pad, pad)),'constant')

    # Determine sizes of image array and kernel weights
    batchSize,  channelsIn, imageHeightIn, imageWidthIn = image.shape
    channelsOut, channelsIn, kernelHeight, kernelWidth = weights.shape

    # Determine size of output arrays
    imageHeightOut = np.floor(1 + imageHeightIn - kernelHeight).astype(int)
    imageWidthOut = np.floor(1 + imageWidthIn - kernelWidth).astype(int)

    # Define an array structure for the output of the CNN
    out = np.zeros((batchSize, channelsOut, imageHeightOut, imageWidthOut), dtype=np.float32)

    # For the location of each pixel on the final image
    for c_y in range(imageHeightOut):
      for c_x in range(imageWidthOut):
        # For the location of each pixel on the kernel
        for c_kernel_y in range(kernelHeight):
          for c_kernel_x in range(kernelWidth):
            # Compute the value of the pixel and the value of the corresponding kernel weight
            this_pixel_value = image[0, 0, c_kernel_y+c_y, c_kernel_x+c_x]
            this_weight = weights[0, 0, c_kernel_y, c_kernel_x]

            # Multiply the value of the kernel by the value at the location of the pixel to compute the activation map
            out[0, 0, c_y, c_x] += np.sum(this_pixel_value * this_weight)

    return out
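
To make the inner loops concrete, the sketch below computes a single output pixel by hand. It assumes a small hypothetical 3x3 image and 2x2 kernel (values chosen only for illustration) and no padding. Note that the kernel is not flipped: like `torch.nn.functional.conv2d`, the loop above computes what is strictly a cross-correlation.

```python
import numpy as np

# Hypothetical 3x3 image and 2x2 kernel, chosen only for illustration
image = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 9.0]])
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

# Output pixel (0, 0): overlay the kernel on the top-left 2x2 patch,
# multiply element-wise, and sum. The kernel is not flipped.
patch = image[0:2, 0:2]
pixel_00 = np.sum(patch * kernel)  # 1*1 + 2*0 + 4*0 + 5*(-1) = -4

# The same value accumulated one product at a time, as in the loops above
pixel_00_loop = 0.0
for c_kernel_y in range(2):
    for c_kernel_x in range(2):
        pixel_00_loop += image[c_kernel_y, c_kernel_x] * kernel[c_kernel_y, c_kernel_x]

print(pixel_00, pixel_00_loop)
```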

### Initialize Hyperparameters
Initialize hyperparameters for the convolution function

In [4]:
np.random.seed(1)
n_batch = 1
image_height = 4
image_width = 6
channels_in = 1
kernel_size = 3
channels_out = 1

### Define Input 
Define a random image as the input for the convolution function

In [5]:
input_image= np.random.normal(size=(n_batch, channels_in, image_height, image_width))

### Initialize Kernel Weights
Initialize the weights of the kernel for the convolution function

In [6]:
conv_weights = np.random.normal(size=(channels_out, channels_in, kernel_size, kernel_size))

### Convolution (PyTorch)
Perform PyTorch's convolution function on the input image and print the feature map (output of the convolution function)

In [7]:
conv_results_pytorch = conv_pytorch(input_image, conv_weights, stride=1, pad=1)
print("PyTorch Results")
print(conv_results_pytorch)

PyTorch Results
[[[[-0.92877074 -2.76029052  0.71617666  0.11420725  0.5596409
    -0.38718181]
   [-1.51472398  0.28314694  1.00809468  0.46584486 -1.0939884
     2.00448021]
   [-1.63398828  3.55485207 -2.15395759 -0.89198292 -1.85607063
     2.29928377]
   [ 0.56543283 -0.94654296 -0.62942895  2.99601051 -1.81129165
    -0.53341   ]]]]


### Convolution (Numpy)
Run the NumPy convolution function defined above on the input image and print the feature map

In [8]:
print("Numpy results")
conv_results_numpy = conv_numpy_1(input_image, conv_weights)
print(conv_results_numpy)

Numpy results
[[[[-0.9287708  -2.7602906   0.7161767   0.1142073   0.5596409
    -0.38718182]
   [-1.5147239   0.28314698  1.0080947   0.46584484 -1.0939885
     2.0044804 ]
   [-1.6339881   3.554852   -2.1539576  -0.8919829  -1.8560706
     2.2992837 ]
   [ 0.56543285 -0.946543   -0.629429    2.9960103  -1.8112916
    -0.53341   ]]]]
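
The two printouts agree to roughly seven significant digits. A likely reason for the small differences is precision: `conv_numpy_1` accumulates into a `float32` output array, while the input passed to PyTorch is `float64`. Rather than comparing by eye, a tolerance-based check can be used. The sketch below applies `np.allclose` to the first row of each printout above; the tolerance `atol=1e-5` is an assumption, not a value from the notebook.

```python
import numpy as np

# First row of each feature map, copied from the printed results above
row_pytorch = np.array([-0.92877074, -2.76029052, 0.71617666,
                        0.11420725, 0.5596409, -0.38718181])
row_numpy = np.array([-0.9287708, -2.7602906, 0.7161767,
                      0.1142073, 0.5596409, -0.38718182], dtype=np.float32)

# atol=1e-5 is an assumed tolerance suited to float32 accumulation
rows_match = np.allclose(row_pytorch, row_numpy, atol=1e-5)
max_abs_diff = np.max(np.abs(row_pytorch - row_numpy))
print(rows_match, max_abs_diff)
```

In the notebook itself the same check would take the full arrays, for example `np.allclose(conv_results_pytorch, conv_results_numpy, atol=1e-5)`.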


### Define Numpy Convolution (Stride)
Define a function that uses NumPy to perform convolution on a single-channel image, now with a configurable stride for the kernel

In [9]:
def conv_numpy_2(image, weights, stride=1, pad=1):

    # Perform zero padding
    if pad != 0:
        image = np.pad(image, ((0, 0), (0 ,0), (pad, pad), (pad, pad)),'constant')

    # Determine sizes of image array and kernel weights
    batchSize,  channelsIn, imageHeightIn, imageWidthIn = image.shape
    channelsOut, channelsIn, kernelHeight, kernelWidth = weights.shape

    # Determine size of output arrays
    imageHeightOut = np.floor(1 + (imageHeightIn - kernelHeight) / stride).astype(int)
    imageWidthOut = np.floor(1 + (imageWidthIn - kernelWidth) / stride).astype(int)

    # Define an array structure for the output of the CNN
    out = np.zeros((batchSize, channelsOut, imageHeightOut, imageWidthOut), dtype=np.float32)

    # For the location of each pixel on the final image
    for c_y in range(imageHeightOut):
      for c_x in range(imageWidthOut):
        # For the location of each pixel on the kernel
        for c_kernel_y in range(kernelHeight):
          for c_kernel_x in range(kernelWidth):
            # Compute the value of the pixel and the value of the corresponding kernel weight
            # Incorporate the stride into the location of the pixel
            this_pixel_value = image[0, 0, c_kernel_y+c_y*stride, c_kernel_x+c_x*stride]
            this_weight = weights[0, 0, c_kernel_y, c_kernel_x]

            # Multiply the value of the kernel by the value at the location of the pixel to compute the activation map
            out[0, 0, c_y, c_x] += np.sum(this_pixel_value * this_weight)

    return out
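
The output-size computation above can be isolated and checked on its own. The helper below, `conv_output_size` (a hypothetical name, not part of the notebook), takes the unpadded input length along one axis and applies the same formula as `conv_numpy_2`, floor((size + 2*pad - kernel) / stride) + 1, using integer division.

```python
def conv_output_size(size, kernel, stride=1, pad=1):
    """Output length along one axis for a given unpadded input length."""
    padded = size + 2 * pad
    return (padded - kernel) // stride + 1

# A 12x10 image with a 3x3 kernel, stride 2, and padding 1
print(conv_output_size(12, 3, stride=2, pad=1))  # 6
print(conv_output_size(10, 3, stride=2, pad=1))  # 5
```

These values match the 6x5 feature map that the stride-2 example in the following cells produces.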

### Initialize Hyperparameters
Initialize hyperparameters for the convolution function

In [10]:
np.random.seed(1)
n_batch = 1
image_height = 12
image_width = 10
channels_in = 1
kernel_size = 3
channels_out = 1
stride = 2

### Define Input 
Define a random image as the input for the convolution function

In [11]:
input_image= np.random.normal(size=(n_batch, channels_in, image_height, image_width))

### Initialize Kernel Weights
Initialize the weights of the kernel for the convolution function

In [12]:
conv_weights = np.random.normal(size=(channels_out, channels_in, kernel_size, kernel_size))

### Convolution (PyTorch)
Perform PyTorch's convolution function on the input image and print the feature map

In [13]:
conv_results_pytorch = conv_pytorch(input_image, conv_weights, stride, pad=1)
print("PyTorch Results")
print(conv_results_pytorch)

PyTorch Results
[[[[-0.80936502 -4.55000504 -5.48644408 -9.50590007 -4.5119328 ]
   [-0.05546526  1.14484207 -5.38831003 -3.9102454   0.09650551]
   [-0.18614325  0.65958396  1.62966335  2.27526627  4.87444365]
   [ 2.38590982 -0.22541781  3.28823207 -4.23915446 -1.4026945 ]
   [ 0.82496306  1.71026929 -3.24597272  3.24605864  1.7087309 ]
   [ 0.8088478   3.69534498  3.49057599 -2.11278279 -2.71367209]]]]


### Convolution (Numpy)
Run the NumPy convolution function defined above on the input image and print the feature map

In [14]:
print("Numpy results")
conv_results_numpy = conv_numpy_2(input_image, conv_weights, stride, pad=1)
print(conv_results_numpy)

Numpy results
[[[[-0.8093651  -4.550005   -5.4864435  -9.505899   -4.511933  ]
   [-0.05546521  1.1448419  -5.3883104  -3.9102452   0.09650555]
   [-0.18614332  0.65958387  1.6296635   2.2752664   4.8744435 ]
   [ 2.38591    -0.22541776  3.288232   -4.2391543  -1.4026946 ]
   [ 0.82496303  1.7102693  -3.2459726   3.2460587   1.7087309 ]
   [ 0.8088478   3.6953452   3.4905758  -2.112783   -2.713672  ]]]]


### Define Numpy Convolution (Input and Output Layers)
Define a function that uses NumPy to perform convolution, now supporting multiple input layers (channels) of the image and multiple output layers (channels) of the feature map

In [15]:
def conv_numpy_3(image, weights, stride=1, pad=1):
    # Perform zero padding
    if pad != 0:
        image = np.pad(image, ((0, 0), (0 ,0), (pad, pad), (pad, pad)),'constant')

    # Determine sizes of image array and kernel weights
    batchSize,  channelsIn, imageHeightIn, imageWidthIn = image.shape
    channelsOut, channelsIn, kernelHeight, kernelWidth = weights.shape

    # Determine size of output arrays
    imageHeightOut = np.floor(1 + (imageHeightIn - kernelHeight) / stride).astype(int)
    imageWidthOut = np.floor(1 + (imageWidthIn - kernelWidth) / stride).astype(int)

    # Define an array structure for the output of the CNN
    out = np.zeros((batchSize, channelsOut, imageHeightOut, imageWidthOut), dtype=np.float32)

    # For the location of each pixel on the final image
    for c_y in range(imageHeightOut):
      for c_x in range(imageWidthOut):
        # For each input layer and output layer
        for c_channel_out in range(channelsOut):
          for c_channel_in in range(channelsIn):
            # For the location of each pixel on the kernel
            for c_kernel_y in range(kernelHeight):
              for c_kernel_x in range(kernelWidth):
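                  # Compute the value of the pixel and the value of the corresponding kernel weight for each input and output layer
                  # Incorporate the stride into the location of the pixel so the indexing matches the output size computed above
                  this_pixel_value = image[0, c_channel_in, c_kernel_y+c_y*stride, c_kernel_x+c_x*stride]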
                  this_weight = weights[c_channel_out, c_channel_in, c_kernel_y, c_kernel_x]

                  # Multiply the value of the kernel by the value at the location of the pixel to compute the activation map
                  out[0, c_channel_out, c_y, c_x] += np.sum(this_pixel_value * this_weight)
    return out
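
With several channels, each output value is a sum over every input channel as well as over the kernel window. As a rough illustration, the sketch below computes all output channels for one window position in a single call, using `np.einsum` in place of the four innermost loops. The array shapes and the random seed are assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for the demonstration
channels_in, channels_out, kernel_size = 5, 2, 3

# One window of the padded image: (channelsIn, kernelHeight, kernelWidth)
patch = rng.normal(size=(channels_in, kernel_size, kernel_size))
# Kernel weights: (channelsOut, channelsIn, kernelHeight, kernelWidth)
weights = rng.normal(size=(channels_out, channels_in, kernel_size, kernel_size))

# Explicit loops, mirroring the four innermost loops of conv_numpy_3
out_loop = np.zeros(channels_out)
for c_channel_out in range(channels_out):
    for c_channel_in in range(channels_in):
        for c_kernel_y in range(kernel_size):
            for c_kernel_x in range(kernel_size):
                out_loop[c_channel_out] += (
                    patch[c_channel_in, c_kernel_y, c_kernel_x]
                    * weights[c_channel_out, c_channel_in, c_kernel_y, c_kernel_x]
                )

# Same sum in one call: contract over the input-channel and kernel axes
out_einsum = np.einsum("ikl,oikl->o", patch, weights)

print(np.allclose(out_loop, out_einsum))
```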

### Initialize Hyperparameters
Initialize the hyperparameters for the convolution function

In [18]:
np.random.seed(1)
n_batch = 1
image_height = 4
image_width = 6
channels_in = 5
kernel_size = 3
channels_out = 2

### Define Input
Define a random image as the input for the convolution function

In [19]:
input_image= np.random.normal(size=(n_batch, channels_in, image_height, image_width))

### Initialize Kernel Weights
Initialize the weights of the kernel for the convolution function

In [20]:
conv_weights = np.random.normal(size=(channels_out, channels_in, kernel_size, kernel_size))

### Convolution (PyTorch)
Perform PyTorch's convolution function on the input image and print the feature map

In [21]:
conv_results_pytorch = conv_pytorch(input_image, conv_weights, stride=1, pad=1)
print("PyTorch Results")
print(conv_results_pytorch)

PyTorch Results
[[[[ -0.78493005   5.46317843  -2.48025844   5.02562432  -3.59404407
      7.78488108]
   [ -6.74353604   2.53411115  -0.66394352   7.14885746  -9.83852552
      7.8488439 ]
   [ -4.79449808  14.0742739   -1.06040823   2.70604367 -10.18183894
      2.0036554 ]
   [  1.80853737   0.28671588   4.64779395  -1.83962197   3.25877536
      1.0733081 ]]

  [[  4.14990184   5.37204436   1.69949782   0.49957395   0.5894419
      4.36072584]
   [ -4.12345821   5.1360658    4.67700784  -3.89519639  -4.99028029
      2.54604406]
   [  3.99092569   5.76840128  -2.31524793   8.47292739   1.7520073
      2.76562704]
   [  1.52850472   0.3179325   11.51848119  -5.44439849  -2.29293586
      1.26966775]]]]


### Convolution (Numpy)
Run the NumPy convolution function defined above on the input image and print the feature map

In [22]:
print("Numpy results")
conv_results_numpy = conv_numpy_3(input_image, conv_weights, stride=1, pad=1)
print(conv_results_numpy)

Numpy results
[[[[ -0.78493017   5.4631777   -2.4802582    5.025625    -3.594044
      7.784882  ]
   [ -6.7435346    2.5341108   -0.6639434    7.1488576   -9.838525
      7.8488436 ]
   [ -4.794498    14.074275    -1.0604087    2.7060435  -10.181838
      2.003656  ]
   [  1.8085374    0.2867157    4.647794    -1.8396223    3.2587752
      1.0733079 ]]

  [[  4.149902     5.3720446    1.6994978    0.4995738    0.5894427
      4.360726  ]
   [ -4.123459     5.136065     4.6770077   -3.8951955   -4.990279
      2.546044  ]
   [  3.9909263    5.768401    -2.3152485    8.472927     1.7520071
      2.765627  ]
   [  1.5285047    0.317932    11.518481    -5.4443984   -2.2929363
      1.2696676 ]]]]


### Define Numpy Convolution (Stride + Input and Output Layer)
Define a function that uses NumPy to perform convolution on a batch of images, combining a stride for the kernel with multiple input layers of the image and multiple output layers of the feature map

In [23]:
def conv_numpy_4(image, weights, stride=1, pad=1):
    # Perform zero padding
    if pad != 0:
        image = np.pad(image, ((0, 0), (0 ,0), (pad, pad), (pad, pad)),'constant')

    # Determine sizes of image array and kernel weights
    batchSize,  channelsIn, imageHeightIn, imageWidthIn = image.shape
    channelsOut, channelsIn, kernelHeight, kernelWidth = weights.shape

    # Determine size of output arrays
    imageHeightOut = np.floor(1 + (imageHeightIn - kernelHeight) / stride).astype(int)
    imageWidthOut = np.floor(1 + (imageWidthIn - kernelWidth) / stride).astype(int)

    # Define an array structure for the output of the CNN
    out = np.zeros((batchSize, channelsOut, imageHeightOut, imageWidthOut), dtype=np.float32)

    # For each image in the batch
    for c_batch in range(batchSize):
      # For the location of each pixel on the final image
      for c_y in range(imageHeightOut):
        for c_x in range(imageWidthOut):
          # For each input and output layer
          for c_channel_out in range(channelsOut):
            for c_channel_in in range(channelsIn):
              # For the location of each pixel on the kernel
              for c_kernel_y in range(kernelHeight):
                for c_kernel_x in range(kernelWidth):
                    # Compute the value of the pixel and the value of the corresponding kernel weight for each input and output layer
                    # Incorporate the stride into the location of the pixel
                    this_pixel_value = image[c_batch, c_channel_in, c_kernel_y+c_y*stride, c_kernel_x+c_x*stride]
                    this_weight = weights[c_channel_out, c_channel_in, c_kernel_y, c_kernel_x]

                    # Multiply the value of the kernel by the value at the location of the pixel to compute the activation map
                    out[c_batch, c_channel_out, c_y, c_x] += np.sum(this_pixel_value * this_weight)
    return out
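
The seven nested loops are easy to follow but slow for larger inputs. One possible vectorized alternative, sketched below, builds all kernel windows with `numpy.lib.stride_tricks.sliding_window_view` (available in NumPy 1.20 and later) and contracts them against the weights with `np.einsum`. The function name `conv_numpy_vectorized` is hypothetical and the function is not part of the original notebook; it is meant to follow the same padding and stride conventions as `conv_numpy_4`, except that it returns the input's dtype rather than `float32`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_numpy_vectorized(image, weights, stride=1, pad=1):
    # Zero-pad the two spatial axes only
    if pad != 0:
        image = np.pad(image, ((0, 0), (0, 0), (pad, pad), (pad, pad)), "constant")
    kernel_height, kernel_width = weights.shape[2:]
    # All stride-1 windows: (batch, channelsIn, H', W', kernelHeight, kernelWidth)
    windows = sliding_window_view(image, (kernel_height, kernel_width), axis=(2, 3))
    # Keep every stride-th window along each spatial axis
    windows = windows[:, :, ::stride, ::stride]
    # Sum over input channels and kernel positions for every output channel
    return np.einsum("bihwkl,oikl->bohw", windows, weights)

# All-ones image and kernel: each output counts the non-padded pixels under the kernel
image = np.ones((1, 1, 4, 4))
weights = np.ones((1, 1, 3, 3))
out = conv_numpy_vectorized(image, weights, stride=1, pad=1)
print(out[0, 0])
```

With an all-ones image and kernel, corner outputs cover 4 real pixels, edge outputs 6, and interior outputs 9, which gives a quick sanity check on the padding and indexing.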

### Initialize Hyperparameters

In [24]:
np.random.seed(1)
n_batch = 2
image_height = 4
image_width = 6
channels_in = 5
kernel_size = 3
channels_out = 2

### Define Input Image

In [25]:
input_image= np.random.normal(size=(n_batch, channels_in, image_height, image_width))

### Initialize Kernel Weights

In [26]:
conv_weights = np.random.normal(size=(channels_out, channels_in, kernel_size, kernel_size))

### Convolution (PyTorch)
Perform PyTorch's convolution function on the input image and print the feature map

In [27]:
conv_results_pytorch = conv_pytorch(input_image, conv_weights, stride=1, pad=1)
print("PyTorch Results")
print(conv_results_pytorch)

PyTorch Results
[[[[ -3.63266998  -1.64414063   0.16871541  -1.16683534  -3.86470503
      6.04548497]
   [ -9.00441048   7.30303198   4.41414413   0.36094645  -6.73911902
      3.93877819]
   [ -1.39066168  13.50223385   3.80719001  -9.37925647   3.99065754
      5.44193873]
   [  2.80529354   6.87386967  -9.28708597  -4.4677558   -1.50120554
      4.60699744]]

  [[  1.93991531  -1.40994827   2.39733412  -0.23498018  -0.39417446
     -1.48273797]
   [  5.04926785  -3.33537397  -7.59638391  -1.58618293   3.04942231
     -1.85714215]
   [  3.51356752   0.47452556  -1.95244809  -1.29143814  -0.58875724
     -0.94794943]
   [  6.52359837  -0.01988546  -3.29757656  -1.24783696   3.24882183
     -2.67951281]]]


 [[[  4.15358547  -4.76441146  11.63518341   0.50610631  -4.01175338
     -2.08113863]
   [ -1.12514613  -0.67652481  16.74850235  -7.03000422  -5.97797666
     -2.6288367 ]
   [  0.77808623  -3.98359269 -10.28402713   1.57539725  -8.88848701
      1.16275197]
   [  0.55557572  -2.

### Convolution (Numpy)
Run the NumPy convolution function defined above on the input image and print the feature map

In [28]:
print("Numpy results")
conv_results_numpy = conv_numpy_4(input_image, conv_weights, stride=1, pad=1)
print(conv_results_numpy)

Numpy results
[[[[ -3.63267     -1.6441412    0.16871582  -1.1668352   -3.8647041
      6.0454845 ]
   [ -9.004409     7.3030324    4.414145     0.36094612  -6.7391195
      3.9387782 ]
   [ -1.3906618   13.5022335    3.80719     -9.379255     3.9906573
      5.4419394 ]
   [  2.8052933    6.873869    -9.287086    -4.4677553   -1.5012054
      4.606998  ]]

  [[  1.9399152   -1.4099485    2.3973339   -0.23498023  -0.3941749
     -1.4827379 ]
   [  5.0492682   -3.3353736   -7.5963855   -1.5861826    3.0494225
     -1.8571423 ]
   [  3.5135674    0.47452566  -1.9524485   -1.2914382   -0.58875686
     -0.9479495 ]
   [  6.5235977   -0.0198854   -3.297577    -1.2478366    3.2488215
     -2.6795127 ]]]


 [[[  4.153585    -4.7644105   11.635186     0.5061073   -4.0117536
     -2.0811388 ]
   [ -1.1251465   -0.67652464  16.7485      -7.0300035   -5.9779778
     -2.6288366 ]
   [  0.77808625  -3.9835927  -10.284027     1.5753975   -8.888488
      1.1627522 ]
   [  0.5555758   -2.290444     1.