## Convolutional Neural Networks (CNNs)

This notebook provides an introduction to Convolutional Neural Networks (CNNs), covering their components, mathematical foundations, and applications in image processing and computer vision.

---

### Table of Contents
1. **Introduction to CNNs**
2. **CNNs and Their Components**
3. **Convolution Operation**
4. **Pooling Layers**
5. **Fully Connected Layers**
6. **Applications in Image Processing and Computer Vision**
7. **Example: Building a CNN with TensorFlow/Keras**

---

## 1. Introduction to CNNs

Convolutional Neural Networks (CNNs) are a class of deep learning models designed for processing grid-like data, such as images. They learn a spatial hierarchy of features directly from data: early layers respond to simple patterns such as edges, and deeper layers combine these into textures, shapes, and object parts.

### Why CNNs for Images?
- **Local Receptive Fields:** CNNs focus on small regions of an image, capturing local patterns like edges, textures, and shapes.
- **Parameter Sharing:** Convolutional filters are shared across the image, reducing the number of parameters compared to fully connected networks.
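- **Translation Invariance:** Convolution is translation-equivariant (shifting the input shifts the feature map by the same amount), and pooling adds tolerance to small shifts, so CNNs can recognize patterns largely regardless of their position in the image.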
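
To make the parameter-sharing point concrete, here is a rough sketch that counts the parameters of one 3×3 convolutional layer and of a fully connected layer producing an output of the same size. The 32×32×3 input, the 32 filters, and the helper names `conv_params` and `dense_params` are illustrative assumptions, not part of the text above.

```python
# Illustrative parameter count: convolutional vs. fully connected layer.
# Assumed setup: 32x32 RGB input, 32 filters of size 3x3, stride 1, no padding.

def conv_params(kernel_size, in_channels, filters):
    """Filter weights plus one bias per filter; independent of image size."""
    return kernel_size * kernel_size * in_channels * filters + filters

def dense_params(n_inputs, n_outputs):
    """One weight per input-output pair plus one bias per output."""
    return n_inputs * n_outputs + n_outputs

height = width = 32
channels = 3
filters = 32
kernel_size = 3

out_side = height - kernel_size + 1          # 30 with stride 1 and no padding
n_inputs = height * width * channels         # 3072 input values
n_outputs = out_side * out_side * filters    # 28800 output values

conv_count = conv_params(kernel_size, channels, filters)
dense_count = dense_params(n_inputs, n_outputs)
print("Conv layer parameters: ", conv_count)    # 896
print("Dense layer parameters:", dense_count)   # 88502400
```

Under these assumptions the convolutional layer needs several orders of magnitude fewer parameters, because the same 3×3 weights are reused at every image position.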

---

## 2. CNNs and Their Components

CNNs consist of the following key components:
1. **Convolutional Layers:** Extract features from input images using filters.
2. **Pooling Layers:** Reduce spatial dimensions while retaining important information.
3. **Fully Connected Layers:** Combine features for final classification or regression.

---

## 3. Convolution Operation

### Mathematical Explanation
The convolution operation slides a small filter (kernel) over the input image and, at each position, sums the element-wise products of the filter and the image patch beneath it. The results form a feature map. For a \(k \times k\) filter with stride 1 and no padding, it is expressed as:

\[
\text{Output}(i, j) = \sum_{m=0}^{k-1} \sum_{n=0}^{k-1} \text{Input}(i+m, j+n) \cdot \text{Filter}(m, n)
\]

Where:
- \( \text{Input} \) is the input image.
- \( \text{Filter} \) is the kernel (e.g., a 3×3 or 5×5 matrix).
- \( k \) is the size of the filter.

Strictly speaking, this formula is cross-correlation, since the kernel is not flipped; deep learning libraries conventionally call it convolution. For an \(N \times N\) input, the output has size \((N - k + 1) \times (N - k + 1)\).


In [4]:
import numpy as np

# Input image (5x5)
input_image = np.array([
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20],
    [21, 22, 23, 24, 25]
])

# Filter (3x3)
filter_kernel = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1]
])

# Convolution function
def convolution2d(image, kernel):
    kernel_size = kernel.shape[0]
    output_size = image.shape[0] - kernel_size + 1
    output = np.zeros((output_size, output_size))
    
    for i in range(output_size):
        for j in range(output_size):
            output[i, j] = np.sum(image[i:i+kernel_size, j:j+kernel_size] * kernel)
    
    return output

# Apply convolution
feature_map = convolution2d(input_image, filter_kernel)
print("Feature Map:\n", feature_map)

Feature Map:
 [[-6. -6. -6.]
 [-6. -6. -6.]
 [-6. -6. -6.]]


In [5]:
def convolution2d_with_padding(image, kernel, padding=1):
    kernel_size = kernel.shape[0]
    padded_image = np.pad(image, pad_width=padding, mode='constant', constant_values=0)
    # Equals the input size when padding == (kernel_size - 1) // 2
    output_size = image.shape[0] + 2 * padding - kernel_size + 1
    output = np.zeros((output_size, output_size))
    
    for i in range(output_size):
        for j in range(output_size):
            output[i, j] = np.sum(padded_image[i:i+kernel_size, j:j+kernel_size] * kernel)
    
    return output

# Apply convolution with zero padding
feature_map_padded = convolution2d_with_padding(input_image, filter_kernel)
print("Feature Map with Zero Padding:\n", feature_map_padded)

Feature Map with Zero Padding:
 [[ -9.  -4.  -4.  -4.  13.]
 [-21.  -6.  -6.  -6.  27.]
 [-36.  -6.  -6.  -6.  42.]
 [-51.  -6.  -6.  -6.  57.]
 [-39.  -4.  -4.  -4.  43.]]
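
The two cells above differ only in padding, which is why one produces a 3×3 map and the other a 5×5 map. As a small sketch, the output size for a square input can be computed from the commonly used formula \( \lfloor (n + 2p - k) / s \rfloor + 1 \), where \(n\) is the input size, \(k\) the kernel size, \(p\) the padding, and \(s\) the stride. The helper name `conv_output_size` is hypothetical, and the stride parameter is an addition: the cells above always use a stride of 1.

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Spatial output size for an n x n input and a k x k kernel."""
    return (n + 2 * padding - k) // stride + 1

print(conv_output_size(5, 3))              # 3: no padding, as in the first cell
print(conv_output_size(5, 3, padding=1))   # 5: zero padding of 1 keeps the size
print(conv_output_size(5, 3, stride=2))    # 2: a stride of 2 skips every other position
```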


#### **Example Scenario: Edge Detection**
- We take a **3×3 Sobel filter** for detecting horizontal edges.
- The **input matrix** represents a 5×5 grayscale image.

#### **Input Image (5×5)**
\[
\begin{bmatrix}
10 & 10 & 20 & 30 & 40 \\
20 & 30 & 40 & 50 & 60 \\
30 & 40 & 50 & 60 & 70 \\
40 & 50 & 60 & 70 & 80 \\
50 & 60 & 70 & 80 & 90
\end{bmatrix}
\]

#### **Filter (3×3 Sobel Horizontal Edge Detector)**
\[
\begin{bmatrix}
-1 & -2 & -1 \\
0 & 0 & 0 \\
1 & 2 & 1
\end{bmatrix}
\]

#### **Convolution Process (Stride = 1)**
We slide the 3×3 filter over the input image and, at each position, multiply the filter element-wise with the region beneath it and sum the results. Let's compute the first step:

- **Step 1: Top-left region**
  - Extracted region:
    \[
    \begin{bmatrix}
    10 & 10 & 20 \\
    20 & 30 & 40 \\
    30 & 40 & 50
    \end{bmatrix}
    \]
  - Compute convolution:
    \[
    (10 \times -1) + (10 \times -2) + (20 \times -1) + (20 \times 0) + (30 \times 0) + (40 \times 0) + (30 \times 1) + (40 \times 2) + (50 \times 1) = -10 - 20 - 20 + 30 + 80 + 50 = 110
    \]
  - Result for this region: **110**

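Applying this to every position gives a **3×3 feature map**. For example, the top-middle region (rows 1–3, columns 2–4) yields \(-(10 + 2 \cdot 20 + 30) + (40 + 2 \cdot 50 + 60) = 120\). Every position in the second and third output rows yields 80, because the image rows two apart differ by a constant 20 there and the filter weights in each nonzero row sum to 4.

#### **Feature Map (After Convolution)**
\[
\begin{bmatrix}
110 & 120 & 120 \\
80 & 80 & 80 \\
80 & 80 & 80
\end{bmatrix}
\]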
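
The worked example can be checked numerically. The sketch below recomputes the feature map for the same 5×5 input and Sobel filter using NumPy's `sliding_window_view`, which assumes NumPy 1.20 or newer; the variable names are illustrative. The top-left entry, `sobel_map[0, 0]`, equals 110 and matches Step 1.

```python
import numpy as np

image = np.array([
    [10, 10, 20, 30, 40],
    [20, 30, 40, 50, 60],
    [30, 40, 50, 60, 70],
    [40, 50, 60, 70, 80],
    [50, 60, 70, 80, 90],
])

sobel_horizontal = np.array([
    [-1, -2, -1],
    [ 0,  0,  0],
    [ 1,  2,  1],
])

# All 3x3 windows at stride 1; shape (3, 3, 3, 3) = (out_row, out_col, k_row, k_col)
windows = np.lib.stride_tricks.sliding_window_view(image, sobel_horizontal.shape)

# Multiply each window element-wise by the filter and sum over the window axes
sobel_map = np.einsum("ijmn,mn->ij", windows, sobel_horizontal)
print(sobel_map)
```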

---





## 4. Pooling Layers

Pooling layers reduce the spatial dimensions of feature maps while retaining important information. Common types include:
- **Max Pooling:** Selects the maximum value in each window.
- **Average Pooling:** Computes the average value in each window.

In [13]:
def max_pooling2d(image, pool_size=2, stride=2):
    output_size = (image.shape[0] - pool_size) // stride + 1
    output = np.zeros((output_size, output_size))
    
    for i in range(0, output_size):
        for j in range(0, output_size):
            output[i, j] = np.max(image[i*stride:i*stride+pool_size, j*stride:j*stride+pool_size])
    
    return output


print("Input Image:\n", input_image)
# Apply max pooling
pooled_output = max_pooling2d(input_image)
print("Pooled Output:\n", pooled_output)

Input Image:
 [[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]
 [21 22 23 24 25]]
Pooled Output:
 [[ 7.  9.]
 [17. 19.]]
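
Average pooling, the other type listed above, can be sketched the same way by replacing the maximum with the mean. The function name `average_pooling2d` is hypothetical and mirrors the structure of `max_pooling2d`; the 5×5 input is rebuilt here so the block runs on its own.

```python
import numpy as np

def average_pooling2d(image, pool_size=2, stride=2):
    """Average pooling over a square 2D array, mirroring max_pooling2d."""
    output_size = (image.shape[0] - pool_size) // stride + 1
    output = np.zeros((output_size, output_size))
    for i in range(output_size):
        for j in range(output_size):
            window = image[i*stride:i*stride+pool_size, j*stride:j*stride+pool_size]
            output[i, j] = window.mean()
    return output

# Same 5x5 input as above (values 1..25)
image = np.arange(1, 26).reshape(5, 5)
avg_pooled = average_pooling2d(image)
print("Average Pooled Output:\n", avg_pooled)
# [[ 4.  6.]
#  [14. 16.]]
```

As with the max pooling cell, the last row and column of the 5×5 input are dropped because no full 2×2 window fits there at a stride of 2.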


### Example: Max Pooling

Pooling reduces the dimensions of the feature map while preserving important information.

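#### **Max Pooling (2×2, Stride = 1)**
We take the maximum value in each 2×2 region. A stride of 1 is used here because the feature map is only 3×3: with a stride of 2, only one 2×2 window would fit and the output would be 1×1.

#### **Feature Map (Before Pooling)**
This is the result of applying the Sobel filter to the 5×5 input in the edge detection example.
\[
\begin{bmatrix}
110 & 120 & 120 \\
80 & 80 & 80 \\
80 & 80 & 80
\end{bmatrix}
\]

#### **After Max Pooling (2×2)**
\[
\begin{bmatrix}
120 & 120 \\
80 & 80
\end{bmatrix}
\]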


In summary:

1. **Convolution extracts features (edges, textures, etc.).**
2. **Pooling reduces spatial dimensions while preserving important features.**

---

## 5. Fully Connected Layers

Fully connected layers combine the features extracted by convolutional and pooling layers to produce the final output. Each neuron in a fully connected layer is connected to every neuron in the previous layer.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Define a fully connected layer
fc_layer = layers.Dense(units=10, activation='softmax')
```
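
To show what such a layer computes, here is a minimal NumPy sketch of a fully connected layer with a softmax activation: every output unit is a weighted sum of all inputs plus a bias. The sizes (64 input features, 10 classes) and the random weights are assumptions for illustration; this is not how Keras implements `Dense` internally.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Numerically stable softmax over the last axis."""
    shifted = z - z.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical sizes: 64 flattened input features, 10 output classes
n_features, n_classes = 64, 10
x = rng.normal(size=(1, n_features))                      # one input example
W = rng.normal(scale=0.1, size=(n_features, n_classes))   # one weight per input-output pair
b = np.zeros(n_classes)                                   # one bias per output unit

probs = softmax(x @ W + b)
print(probs.shape)   # (1, 10)
print(probs.sum())   # approximately 1.0, since softmax outputs form a probability distribution
```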

---

## 6. Applications in Image Processing and Computer Vision

CNNs are widely used in:
- **Image Classification:** Assigning labels to images (e.g., CIFAR-10, ImageNet).
- **Object Detection:** Identifying and localizing objects in images (e.g., YOLO, Faster R-CNN).
- **Image Segmentation:** Partitioning images into regions (e.g., U-Net, Mask R-CNN).
- **Face Recognition:** Identifying individuals from images (e.g., FaceNet).

---

## 7. Example: Building a CNN with TensorFlow/Keras

Let's build a simple CNN for image classification using the CIFAR-10 dataset.

```python
import tensorflow as tf
from tensorflow.keras import datasets, layers, models

# Load CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to [0, 1]
train_images, test_images = train_images / 255.0, test_images / 255.0

# Define CNN architecture
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

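# Train the model (the test set doubles as validation data here for simplicity;
# use a separate validation split when tuning hyperparameters)
model.fit(train_images, train_labels, epochs=10, validation_data=(test_images, test_labels))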

# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"Test Accuracy: {test_acc}")
```
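
To see how the pieces fit together, the following sketch traces the output shape and parameter count of each layer in the model above without needing TensorFlow. It assumes the Keras defaults used in that model: stride 1 and no padding for `Conv2D`, and a stride equal to the pool size for `MaxPooling2D`. The helper functions are hypothetical stand-ins, not Keras APIs.

```python
# Trace layer output shapes and parameter counts for the CNN defined above.

def conv2d(shape, filters, k=3):
    """k x k convolution, stride 1, no padding."""
    h, w, c = shape
    params = k * k * c * filters + filters   # weights + one bias per filter
    return (h - k + 1, w - k + 1, filters), params

def max_pool(shape, pool=2):
    """Pooling with stride equal to the pool size; no trainable parameters."""
    h, w, c = shape
    return ((h - pool) // pool + 1, (w - pool) // pool + 1, c), 0

def dense(n_in, units):
    """Fully connected layer: one weight per input-output pair plus biases."""
    return units, n_in * units + units

shape = (32, 32, 3)   # CIFAR-10 image: 32x32 pixels, 3 color channels
total_params = 0
conv_stack = [
    ("Conv2D(32)", lambda s: conv2d(s, 32)),
    ("MaxPooling2D", max_pool),
    ("Conv2D(64)", lambda s: conv2d(s, 64)),
    ("MaxPooling2D", max_pool),
    ("Conv2D(64)", lambda s: conv2d(s, 64)),
]
for name, layer in conv_stack:
    shape, params = layer(shape)
    total_params += params
    print(f"{name:<14} output {shape}  params {params}")

n_features = shape[0] * shape[1] * shape[2]   # Flatten: 4 * 4 * 64 = 1024
print(f"{'Flatten':<14} output ({n_features},)  params 0")
for units in (64, 10):
    n_features, params = dense(n_features, units)
    total_params += params
    print(f"{'Dense(' + str(units) + ')':<14} output ({n_features},)  params {params}")

print("Total parameters:", total_params)   # 122570
```

Calling `model.summary()` on the Keras model is the usual way to get the same information directly from the framework.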

---

### Summary
This notebook introduced the fundamental concepts of CNNs, including their components, mathematical foundations, and applications. We also implemented a simple CNN using TensorFlow/Keras for image classification.

---

### Next Steps
- Experiment with different CNN architectures.
- Explore advanced topics like transfer learning and data augmentation.
- Apply CNNs to real-world datasets and problems.

---