In this problem, you need to implement a simple 2D convolutional layer as a Python function. The function processes an input matrix using a specified convolutional kernel, padding, and stride.

Example:

Input:

import numpy as np


input_matrix = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16]
])

kernel = np.array([
    [1, 0],
    [-1, 1]
])

padding = 1
stride = 2

output = simple_conv2d(input_matrix, kernel, padding, stride)
print(output)

Output:

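 [[ 1.  1. -4.]
  [ 9.  7. -4.]
  [ 0. 14. 16.]]

Reasoning:

With padding = 1, the 4×4 input becomes a 6×6 matrix whose outer ring is zeros. A 2×2 kernel moved with stride 2 fits at 3 positions along each axis, since (6 − 2) // 2 + 1 = 3, so the output is 3×3. Each output entry is the sum of the element-wise product of the kernel and the window beneath it. For example, the centre entry covers the window [[6, 7], [10, 11]] and equals 1·6 + 0·7 − 1·10 + 1·11 = 7.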


In [25]:
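import numpy as np


def CNN(input_matrix, kernel, padding=1, stride=1):
    """Slide `kernel` over the zero-padded `input_matrix`; return nested lists."""
    k_h, k_w = kernel.shape

    # Zero-pad `padding` rows and columns on every side.
    padded_matrix = np.pad(input_matrix, ((padding, padding), (padding, padding)), mode='constant')

    # Output size: number of stride-spaced positions where the kernel fits entirely.
    out_h = (padded_matrix.shape[0] - k_h) // stride + 1
    out_w = (padded_matrix.shape[1] - k_w) // stride + 1

    output = []
    # Visit the top-left corner of each window, `stride` elements apart.
    for row in range(0, out_h * stride, stride):
        pixel = []
        for col in range(0, out_w * stride, stride):
            # Element-wise product of kernel and window, summed to one value.
            window = padded_matrix[row:row + k_h, col:col + k_w]
            pixel.append(np.sum(kernel * window))
        output.append(pixel)

    return output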

In [26]:
print(CNN(np.array([ [1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16] ]),np.array([ [1, 0], [-1, 1] ]),1,2))

[[1, 1, -4], [9, 7, -4], [0, 14, 16]]


The convolutional layer is a fundamental building block of computer vision models. Its key parameters are:

Parameters

1.	input_matrix:

A 2D NumPy array representing the input data, such as an image. Each element in this array corresponds to a pixel or a feature value in the input space. The dimensions of the input matrix are typically written as height × width.

2.	kernel:

Another 2D NumPy array representing the convolutional filter. The kernel is smaller than the input matrix and slides over it to perform the convolution operation. Each element in the kernel serves as a weight that modifies the input during convolution. The kernel size is denoted as kernel_height × kernel_width.

3.	padding:

An integer specifying how many rows and columns of zeros are added on each side of the input matrix, so a padding of p increases both the height and the width by 2p. Padding controls the spatial dimensions of the output: it lets the kernel be centred on edge elements and can be chosen so that the output keeps the original input size.

4.	stride:

An integer giving the number of elements the kernel shifts, horizontally and vertically, between successive applications. A stride greater than one reduces the output size, because the kernel skips over positions.
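To make the effect of stride concrete, the sketch below lists the top-left offsets a kernel visits along one axis. The helper name `window_starts` is hypothetical and introduced only for illustration; the sizes match the 4-wide example input with padding 1 and a kernel of width 2.

```python
# Hypothetical helper: offsets at which the kernel is applied along one axis.
def window_starts(padded_size, kernel_size, stride):
    # The last valid start is padded_size - kernel_size, so the window
    # never runs past the edge of the padded input.
    return list(range(0, padded_size - kernel_size + 1, stride))


padded_size = 4 + 2 * 1  # input width 4, padding 1 on each side
starts_stride_1 = window_starts(padded_size, 2, 1)
starts_stride_2 = window_starts(padded_size, 2, 2)
print(starts_stride_1)  # [0, 1, 2, 3, 4]
print(starts_stride_2)  # [0, 2, 4]
```

With stride 2 the kernel is applied at three offsets instead of five, which is why the output shrinks.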

Implementation

1.	Padding the Input:

The input matrix is padded with zeros based on the specified padding value. This increases the input size and enables the kernel to cover elements at the borders and corners.
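As a small illustration of this step, the sketch below pads the 4×4 example matrix with `np.pad`. Passing a single integer as `pad_width` pads every side of every axis by that amount, which is equivalent to the explicit tuple form for a 2D array.

```python
import numpy as np

input_matrix = np.arange(1, 17).reshape(4, 4)  # the 4x4 example matrix, values 1..16

# Add one ring of zeros around the matrix.
padded = np.pad(input_matrix, pad_width=1, mode="constant", constant_values=0)

print(padded.shape)  # (6, 6)
print(padded)
```

The original values sit in `padded[1:5, 1:5]`, and the first and last rows and columns are all zeros.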

2.	Calculating Output Dimensions:

The height and width of the output matrix are calculated using the following formulas:

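$$\text{output\_height} = \left\lfloor \frac{\text{input\_height}_{\text{padded}} - \text{kernel\_height}}{\text{stride}} \right\rfloor + 1$$

$$\text{output\_width} = \left\lfloor \frac{\text{input\_width}_{\text{padded}} - \text{kernel\_width}}{\text{stride}} \right\rfloor + 1$$

Here the padded sizes are the input sizes plus twice the padding, and the floor corresponds to the integer division (//) used in the code.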
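The output-size calculation can be checked with a few lines of arithmetic. The function name `conv_output_size` below is hypothetical; it assumes the same padding on both sides and uses integer division, as the implementation above does.

```python
# Hypothetical helper: output length along one axis.
def conv_output_size(input_size, kernel_size, padding, stride):
    padded_size = input_size + 2 * padding  # zeros are added on both sides
    return (padded_size - kernel_size) // stride + 1


# Example from the problem: 4x4 input, 2x2 kernel, padding 1, stride 2.
out_example = conv_output_size(4, 2, padding=1, stride=2)  # (6 - 2) // 2 + 1
print(out_example)  # 3

# A 3x3 kernel with padding 1 and stride 1 keeps the input size.
out_same = conv_output_size(4, 3, padding=1, stride=1)  # (6 - 3) // 1 + 1
print(out_same)  # 4
```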

3.	Performing Convolution:

o	A nested loop iterates over each position where the kernel can be applied to the padded input matrix.

o	At each position, a region of the input matrix, matching the size of the kernel, is selected.

o	Element-wise multiplication between the kernel and the input region is performed, followed by summing the results to produce a single value. This value is then stored in the corresponding position of the output matrix.
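A single iteration of this loop can be sketched in isolation. The snippet below computes one output value for the example data, taking the window whose top-left corner is at row 2, column 2 of the padded matrix; the variable names are chosen for illustration.

```python
import numpy as np

kernel = np.array([[1, 0], [-1, 1]])
# np.pad uses zero (constant) padding by default.
padded = np.pad(np.arange(1, 17).reshape(4, 4), 1)

row, col = 2, 2
k_h, k_w = kernel.shape

# Select the region under the kernel: [[6, 7], [10, 11]].
region = padded[row:row + k_h, col:col + k_w]

# Multiply element-wise and sum: 1*6 + 0*7 - 1*10 + 1*11.
value = np.sum(kernel * region)
print(value)  # 7
```

This is the centre entry of the 3×3 result printed by the function call above.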

4.	Output:

The function returns the output matrix, which contains the results of the convolution operation performed across the entire input.
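One way to cross-check the loop-based result is a vectorized variant. The sketch below is an alternative technique, not the implementation above: it builds all windows with NumPy's `sliding_window_view` (available in NumPy 1.20 and later), keeps every `stride`-th window, and contracts each window with the kernel via `np.einsum`. The function name `conv2d_windows` is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_windows(input_matrix, kernel, padding=0, stride=1):
    """Vectorized cross-check; returns a NumPy array rather than nested lists."""
    padded = np.pad(input_matrix, padding, mode="constant")
    # All kernel-sized windows at stride 1: shape (H - k_h + 1, W - k_w + 1, k_h, k_w).
    windows = sliding_window_view(padded, kernel.shape)
    # Keep only the windows that a stride-spaced loop would visit.
    windows = windows[::stride, ::stride]
    # Sum of element-wise products of each window with the kernel.
    return np.einsum("ijkl,kl->ij", windows, kernel)


result = conv2d_windows(
    np.arange(1, 17).reshape(4, 4),
    np.array([[1, 0], [-1, 1]]),
    padding=1,
    stride=2,
)
print(result.tolist())  # [[1, 1, -4], [9, 7, -4], [0, 14, 16]]
```

The values agree with the output of the loop-based function for the same inputs.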
