## CNN Lecture 6: Architecture and LeNet-5

### I. The Basic CNN Architecture

A Convolutional Neural Network (CNN) architecture is built by integrating the core concepts of convolution, padding/strides, and pooling.

#### A. Standard Architecture Flow
The most common CNN architecture follows a standard sequential pattern:

1.  **Input:** The process starts with an input image, typically an RGB image (e.g., $32 \times 32 \times 3$).
2.  **Convolutional Layer (Conv Layer):** The input is passed through a convolutional layer, which utilizes filters (or kernels).
3.  **Non-Linear Activation:** Immediately after convolution, a non-linear activation function (like ReLU) is usually applied element-wise to the resulting feature map.
4.  **Pooling Layer (Pool Layer):** The feature map is then passed through a pooling layer.
5.  **Repetition:** The combination of (Convolution + Activation + Pooling) can be repeated multiple times (Conv2, Pool2, etc.).
6.  **Flattening:** The final 3-dimensional tensor that results from the convolution/pooling stages is converted into a 1-dimensional vector of numbers using the **Flatten** operation.
7.  **Fully Connected Layers (FC Layers):** This 1D vector is then passed through one or more fully connected layers.
8.  **Output Layer:** The final layer is the output layer, which may use a single node (for binary classification) or multiple nodes (for multi-class classification) to produce the output.
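
The flow above can be sketched in plain Python on a toy input. This is a minimal illustration, not a real implementation: it assumes a square single-channel image, a single filter, and max pooling, and the `image` and `kernel` values are made up for the example. Like most deep learning libraries, it slides the kernel without flipping it.

```python
def conv2d(image, kernel):
    """Stride-1 convolution with no padding on a square 2D list."""
    k = len(kernel)
    out = len(image) - k + 1  # output shrinks by (kernel size - 1)
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(k) for b in range(k))
            for j in range(out)
        ]
        for i in range(out)
    ]


def relu(fmap):
    """Element-wise non-linear activation: negative values become 0."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling (stride equal to the pool size)."""
    n = len(fmap) // size
    return [
        [
            max(fmap[i * size + a][j * size + b]
                for a in range(size) for b in range(size))
            for j in range(n)
        ]
        for i in range(n)
    ]


def flatten(fmap):
    """Convert the 2D feature map into a 1D list for the FC layers."""
    return [v for row in fmap for v in row]


# Hypothetical 6x6 single-channel input and 3x3 vertical-edge filter.
image = [[(i + j) % 3 for j in range(6)] for i in range(6)]
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

conv_out = conv2d(image, kernel)          # 4x4 feature map
pooled = max_pool(relu(conv_out))         # 2x2 after 2x2 pooling
features = flatten(pooled)                # 4 values, ready for an FC layer
```

In a real network each convolutional layer applies many filters across all input channels, and the flattened vector feeds the fully connected layers.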

#### B. Architectural Variation
While the above structure is the basic idea, various CNN architectures are created by introducing variations in several parameters:

*   Number of convolutional layers.
*   Number of filters in the convolutional layers.
*   Value of the stride.
*   Presence or absence of padding.
*   Number of fully connected layers and the number of nodes within them.
*   Choice of activation function.
*   Use of techniques like Dropout or Batch Normalization.
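
Kernel size, stride, and padding together determine the spatial size of each layer's output. The sketch below uses the standard formula $\lfloor (n + 2p - k) / s \rfloor + 1$; the function name `conv_output_size` is a hypothetical helper chosen for this example.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output size along one axis for a convolution or pooling window.

    n: input size, kernel: window size, stride: step, padding: zeros per side.
    """
    return (n + 2 * padding - kernel) // stride + 1


no_padding = conv_output_size(32, kernel=5)               # 28
same_padding = conv_output_size(32, kernel=5, padding=2)  # 32
strided = conv_output_size(32, kernel=5, stride=2)        # 14
pooled = conv_output_size(28, kernel=2, stride=2)         # 14
```

The same formula applies to pooling layers, since a pooling window slides over the input in the same way as a filter.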

### II. Foundational Architecture: LeNet-5

LeNet-5 is a historically important CNN architecture, often cited as one of the **first successful CNNs**. It was created by Yann LeCun (often called the father of CNNs) and colleagues and published in 1998, building on earlier LeNet versions from 1989.

#### A. Purpose and Naming
*   **Purpose:** LeNet-5 was designed to recognize handwritten digits, such as the ZIP codes (pin codes) read for the US Postal Service.
*   **Naming:** The architecture is named **LeNet-5** because **five layers** are counted: two convolutional layers and three fully connected layers (including the output layer). The input, pooling, and flattening stages are not counted.

#### B. LeNet-5 Detailed Architecture

LeNet-5 expects a $32 \times 32$ grayscale (single-channel) input image.

| Stage | Operation / Layer | Configuration Details | Output Size | Notes |
| :--- | :--- | :--- | :--- | :--- |
| **Input** | Image | $32 \times 32$, 1 channel (grayscale) | $32 \times 32 \times 1$ | |
| **Layer 1** | **Conv Layer 1** | 6 filters; Kernel size $5 \times 5$; Stride 1, No padding. | $28 \times 28 \times 6$ | Activation: **Tanh** (Hyperbolic Tangent), a standard choice in 1998, before ReLU became common. |
| **Layer 2** | **Average Pooling 1** | Pool size $2 \times 2$; Stride 2. | $14 \times 14 \times 6$ | Uses **Average Pooling**, not Max Pooling. |
| **Layer 3** | **Conv Layer 2** | 16 filters; Kernel size $5 \times 5$; Stride 1, No padding. | $10 \times 10 \times 16$ | |
| **Layer 4** | **Average Pooling 2** | Pool size $2 \times 2$; Stride 2. | $5 \times 5 \times 16$ | |
| **---** | **Flatten** | $5 \times 5 \times 16$ tensor is flattened. | 400 | $5 \times 5 \times 16 = 400$ values. |
| **Layer 5** | **Fully Connected 1** | 120 neurons. | 120 | |
| **Layer 6** | **Fully Connected 2** | 84 neurons. | 84 | |
| **Layer 7** | **Output Layer** | Softmax activation (as in modern re-implementations; the 1998 paper used RBF output units). | 10 | 10 nodes are used for classifying 10 digits. |
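
The output sizes in the table can be checked by tracing the spatial size through each stage. This is a small sketch assuming no padding anywhere and a single-channel input; `out_size` and the `layers` list are hypothetical names used only for this example.

```python
def out_size(n, kernel, stride):
    """Output size along one axis with no padding."""
    return (n - kernel) // stride + 1


# (name, kernel size, stride, number of filters or None for pooling)
layers = [
    ("conv1", 5, 1, 6),
    ("pool1", 2, 2, None),
    ("conv2", 5, 1, 16),
    ("pool2", 2, 2, None),
]

size, channels = 32, 1  # 32x32 grayscale input
shapes = []
for name, kernel, stride, filters in layers:
    size = out_size(size, kernel, stride)
    if filters is not None:  # pooling keeps the channel count unchanged
        channels = filters
    shapes.append((name, size, size, channels))

flat = size * size * channels  # length of the flattened vector

for shape in shapes:
    print(shape)
print("flatten:", flat)
# ('conv1', 28, 28, 6)
# ('pool1', 14, 14, 6)
# ('conv2', 10, 10, 16)
# ('pool2', 5, 5, 16)
# flatten: 400
```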

**Note on Layer Counting:** In this context, Layers 1, 3, 5, 6, and 7 are often counted as the five main layers; sometimes each Conv/Pool combination is grouped into a single layer instead. **Pooling layers themselves have zero trainable parameters.** The full model has approximately 60,000 trainable parameters.
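
The parameter figure can be reproduced approximately from the table. The sketch below assumes a single-channel input, one bias per filter or neuron, parameter-free pooling, and every filter in Conv Layer 2 connected to all 6 input feature maps; the 1998 design used a sparser connection scheme, so its exact count differs slightly. The helper names are hypothetical.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per neuron."""
    return (inputs + 1) * units


counts = {
    "conv1": conv_params(5, 1, 6),      # 156
    "conv2": conv_params(5, 6, 16),     # 2416
    "fc1": dense_params(400, 120),      # 48120
    "fc2": dense_params(120, 84),       # 10164
    "output": dense_params(84, 10),     # 850
}
total = sum(counts.values())
print(total)  # 61706
```

Most of the parameters sit in the first fully connected layer, which is typical for architectures that flatten a feature map into dense layers.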

#### C. Evolution of CNN Architectures
The principles established by LeNet are used as building blocks for the more complex architectures that emerged later, especially after the ImageNet competition brought deep CNNs to prominence. Later lectures will cover architectures such as **AlexNet**, **VGGNet**, **GoogLeNet** (built from **Inception modules**), and **ResNet**.

---