# Convolutional Neural Networks (CNNs)
*Prepared by Daniel Ramirez, Source: ChatGPT*


### **What are CNNs?**

CNNs are a type of deep learning model primarily used for processing grid-structured data such as images. They are particularly powerful for computer vision tasks (e.g., image classification, object detection, and video analysis).

![CNN visualized](./image_ref/pic1.jpeg)

### **Key Components of CNNs**

- Convolutional Layers: The core building blocks of CNNs. They apply a number of filters to the input to create feature maps that capture spatial hierarchies and patterns in data.

- ReLU (Rectified Linear Unit): A non-linear activation function, typically applied after each convolution operation, that lets the model learn more complex patterns. It is defined as `ReLU(x) = max(0, x)`.

- Pooling Layers: Used to reduce the spatial dimensions (width, height) of the input volume for the next convolutional layer. Common types include Max Pooling and Average Pooling.

- Fully Connected Layers: Traditional neural network layers in which every neuron is connected to all neurons in the previous layer. They are typically used at the end of the network for classification.
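As a quick illustration of the ReLU bullet above, here is a minimal NumPy sketch. The `relu` helper and the `feature_map` values are made up for this example and are not taken from any particular library.

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: negative values become 0, others pass through."""
    return np.maximum(0, x)

# Hypothetical feature-map values, chosen only for illustration
feature_map = np.array([
    [-2.0, 0.5],
    [3.0, -0.1],
])

activated = relu(feature_map)
print(activated)
```

Both negative entries are clipped to zero while the positive entries are unchanged.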

```python
import numpy as np

def max_pooling(input_matrix, pool_size, stride):
    """
    Apply max pooling to a 2D input matrix.

    :param input_matrix: 2D array of input values.
    :param pool_size: Size of the pooling window (e.g., 2 for 2x2).
    :param stride: Stride with which the window moves across the input.
    :return: 2D array after max pooling.
    """
    # Determine the size of the output matrix
    output_shape = (
        (input_matrix.shape[0] - pool_size) // stride + 1,
        (input_matrix.shape[1] - pool_size) // stride + 1
    )

    # Initialize the output matrix
    output_matrix = np.zeros(output_shape)

    # Apply max pooling
    for i in range(0, input_matrix.shape[0] - pool_size + 1, stride):
        for j in range(0, input_matrix.shape[1] - pool_size + 1, stride):
            output_matrix[i // stride, j // stride] = np.max(
                input_matrix[i:i + pool_size, j:j + pool_size]
            )

    return output_matrix

# Example input matrix
input_matrix = np.array([
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [9, 11, 10, 12],
    [13, 15, 14, 16]
])

# Applying 2x2 max pooling with stride 2
pooled_matrix = max_pooling(input_matrix, pool_size=2, stride=2)

print('Input Matrix: \n{0}\n\n Output Matrix: \n {1}'.format(input_matrix, pooled_matrix))
```

```
Input Matrix: 
[[ 1  3  2  4]
 [ 5  7  6  8]
 [ 9 11 10 12]
 [13 15 14 16]]

 Output Matrix: 
 [[ 7.  8.]
 [15. 16.]]
```


### How CNNs Work

- CNNs take an input image, apply a series of convolutional, non-linear, pooling (downsampling), and fully connected layers, and produce an output.

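- Each layer extracts different features from the input. Early layers tend to respond to simple patterns such as edges and textures, while deeper layers combine them into larger structures such as object parts. This hierarchical feature extraction is what lets the final layers make a prediction.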
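The pipeline described above can be sketched end to end in NumPy. This is a toy forward pass under stated assumptions, not a trained model: the weights are random, the 6x6 input and the three output classes are hypothetical, and the helper names (`conv2d`, `max_pool2x2`, `softmax`) are made up for this example. It relies on `sliding_window_view`, which requires NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d(image, kernel):
    """Stride-1 convolution without padding (cross-correlation, as in most DL libraries)."""
    windows = sliding_window_view(image, kernel.shape)  # (out_h, out_w, kh, kw)
    return (windows * kernel).sum(axis=(2, 3))

def relu(x):
    return np.maximum(0, x)

def max_pool2x2(x):
    """2x2 max pooling with stride 2; trailing odd rows/columns are dropped."""
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:h * 2, :w * 2].reshape(h, 2, w, 2).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
image = rng.random((6, 6))             # stand-in for a small grayscale image
kernel = rng.standard_normal((3, 3))   # untrained (random) filter

# Convolution -> non-linearity -> pooling: 6x6 -> 4x4 -> 4x4 -> 2x2
features = max_pool2x2(relu(conv2d(image, kernel)))

# Flatten and apply a fully connected layer with 3 hypothetical classes
flat = features.reshape(-1)
weights = rng.standard_normal((3, flat.size))
bias = np.zeros(3)
probs = softmax(weights @ flat + bias)

print(features.shape, probs)
```

A real CNN stacks several such convolution, activation, and pooling stages with many filters per layer, and learns the filter and weight values by training.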

### Understanding Filters (Kernels) and Feature Maps

- Filters are small matrices of learned weights that slide over the input data. They detect patterns such as edges, corners, and regions of color.

- When a filter is applied to the input, it creates a feature map that gives the activation of the filter at every spatial position.
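To make the link between a filter and its feature map concrete, here is a small sketch that slides a vertical-edge filter over a toy image. The filter values are written by hand for illustration (in a trained CNN they would be learned), and `apply_filter` is a hypothetical helper name.

```python
import numpy as np

def apply_filter(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    record the filter's response at every position."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Toy 4x6 image: dark left half (0), bright right half (1)
image = np.array([
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
], dtype=float)

# Hand-written vertical-edge filter (illustrative values only)
vertical_edge = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

feature_map = apply_filter(image, vertical_edge)
print(feature_map)
# [[0. 3. 3. 0.]
#  [0. 3. 3. 0.]]
```

The feature map is zero over the uniform regions and large where the window straddles the dark-to-bright boundary, which is what "detecting an edge" means here.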

### Stride and Padding

- Stride: The number of pixels by which we slide the filter over the input matrix. A larger stride means the feature map will be smaller.

- Padding: Extra values (usually zeros) added around the border of the input matrix. With a stride of 1 and a suitable amount of padding, the feature map keeps the same size as the input.
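The effect of stride and padding on the feature map size can be checked with a little arithmetic. The sketch below uses the standard output-size formula, `floor((n + 2p - k) / s) + 1`, for one spatial dimension; `conv_output_size` is a hypothetical helper and the 28-pixel input is only an example.

```python
import numpy as np

def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one dimension: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# 28-pixel-wide input, 3x3 filter
print(conv_output_size(28, 3))                       # 26: no padding shrinks the map
print(conv_output_size(28, 3, padding=1))            # 28: padding of 1 preserves the size
print(conv_output_size(28, 3, stride=2, padding=1))  # 14: stride 2 roughly halves it

# Zero padding in practice: add a one-pixel border of zeros around a 4x4 input
padded = np.pad(np.ones((4, 4)), pad_width=1, mode="constant", constant_values=0)
print(padded.shape)  # (6, 6)
```

For an odd kernel size `k` and stride 1, a padding of `(k - 1) // 2` keeps the output the same size as the input.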

### Advanced Layers

- Dropout Layer: Helps with regularization by randomly setting a fraction of input units to 0 at each update during training, which helps to prevent overfitting.

- Normalization Layers: Layers such as Batch Normalization rescale the activations of a layer to a consistent mean and variance, which helps stabilize and speed up training.
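Both layers above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions: the dropout shown is the common "inverted" variant, which rescales the surviving units so the expected activation is unchanged (the rescaling is an addition to the description above), and the batch normalization covers only the training-time computation, without the running statistics used at inference. The function names and the random batch are hypothetical.

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        return x  # dropout is disabled at inference time
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature (column) over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(8, 4))  # 8 samples, 4 features

normalized = batch_norm(batch, gamma=np.ones(4), beta=np.zeros(4))
dropped = dropout(normalized, rate=0.5, rng=rng)

print(normalized.mean(axis=0))  # close to 0 for every feature
print(normalized.std(axis=0))   # close to 1 for every feature
print(dropped)
```

In a real network, `gamma` and `beta` are learned parameters, and deep learning frameworks provide both layers ready-made.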