## Lecture Notes: Comparing CNN vs. ANN

### I. Introduction and Context

*   This topic continues the deep learning playlist.
*   Understanding the similarities and differences between CNNs and ANNs is **crucial** for upcoming videos, particularly the explanation of backpropagation in CNNs.
*   Drawing parallels between the two architectures can help in applying concepts learned from backpropagation in ANNs to CNNs.

### II. Limitations of ANNs in Image Processing

When using ANNs for image classification tasks, three major disadvantages arise:

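1.  **High Computational Cost:** Every input pixel connects to every unit in the first hidden layer, so the number of weights (and the computation needed to train them) grows quickly with image size.
2.  **Overfitting:** The large number of trainable weights increases the chance of overfitting.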
3.  **Loss of Important Features:** Features like the **spatial arrangement of pixels** are lost. This happens because the 2D image data must be flattened (brought to 1D) before being used as input by the ANN.
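
The loss of spatial arrangement can be seen in a minimal sketch. It assumes a toy 4x4 array standing in for a real image (the array and variable names are illustrative, not from the lecture) and shows that two vertically adjacent pixels end up a full row width apart after flattening.

```python
import numpy as np

# Toy 4x4 "image" standing in for a 28x28 one (hypothetical data).
image = np.arange(16).reshape(4, 4)

# Flatten to 1D, as an ANN input layer requires.
flat = image.reshape(-1)

# Pixel (1, 1) and the pixel directly below it, (2, 1), are neighbours in 2D.
width = image.shape[1]
idx_a = 1 * width + 1  # position of pixel (1, 1) in the flat vector
idx_b = 2 * width + 1  # position of pixel (2, 1) in the flat vector

print(flat.shape)     # (16,)
print(idx_b - idx_a)  # 4: the two neighbours are now one row width apart
```

For a 28x28 image the same two neighbours would sit 28 positions apart in the 784-element input, and nothing in the ANN's input layer records that they were adjacent.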

### III. Processing Image Data: ANN vs. CNN

The lecture uses a 28x28 image (like those in the MNIST dataset) to illustrate the processing difference.

| Feature | Artificial Neural Network (ANN) | Convolutional Neural Network (CNN) |
| :--- | :--- | :--- |
| **Input Format** | The 28x28 image is flattened (brought to 1D). | Uses the image in its original 2D form. |
| **Input Size** | 784 inputs (28 multiplied by 28). | 28x28. |
| **Architecture** | Input layer leads to hidden layers (fully connected layers). | **Convolution:** Filters are applied, creating a feature map. |
| **Post-Processing** | The hidden-layer output passes directly to the softmax layer. | ReLU is applied to the feature map, followed by pooling (Max Pooling), then **flattening**. |
| **Final Layer** | Softmax layer. | Flattened data goes to fully connected layers, then a Softmax layer. |
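
To make the two pipelines in the table concrete, the sketch below traces only the data shapes. The 3x3 filter, stride of 1, no padding, 2x2 pooling, and 8 filters are assumptions chosen for illustration; the lecture does not fix these values.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Side length of a feature map after convolving an n x n input with a k x k filter."""
    return (n + 2 * padding - k) // stride + 1


def pool_output_size(n, pool, stride=None):
    """Side length after pooling; the stride defaults to the pool size."""
    stride = pool if stride is None else stride
    return (n - pool) // stride + 1


side = 28        # MNIST-sized input
num_filters = 8  # assumed value, for illustration only

# ANN path: flatten immediately.
ann_inputs = side * side                         # 784

# CNN path: convolution -> ReLU (shape unchanged) -> max pooling -> flatten.
conv_side = conv_output_size(side, 3)            # 26
pool_side = pool_output_size(conv_side, 2)       # 13
cnn_flat = pool_side * pool_side * num_filters   # 1352

print(ann_inputs, conv_side, pool_side, cnn_flat)
```

In both paths the data is eventually flattened; the difference is that the CNN flattens only after the filters have seen the image in 2D.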

### IV. Fundamental Similarities: The Core Principle

Despite architectural differences, the basic working principle of CNN filters and ANN nodes is similar.

*   **ANN Node Operation:** Inputs ($X_1, X_2$, etc.) are multiplied by their respective weights ($W_1, W_2$, etc.) and the products are summed, which is a **dot product**. A bias is added, and the result is passed through an **activation function** (e.g., ReLU).
*   **CNN Filter Operation:** When a filter convolves, it performs the same mathematical operations on the input pixels within its scope. It calculates the **dot product** of the input pixel values and the filter’s internal values (which serve as the weights, $W_{i}$). It then adds the filter's bias and passes the result through an activation function (like ReLU).

**Key Similarity Takeaway:** Both architectures execute the same core process: **dot product + bias + activation function**.
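
As a minimal sketch of this shared process, the functions below (hypothetical names, with random values standing in for real pixels and learned weights) compute one ANN node and one filter position. When the node receives the flattened patch and the flattened filter values as weights, the two outputs match.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def ann_node(x, w, b):
    """One ANN node: dot product of inputs and weights, plus bias, then ReLU."""
    return relu(np.dot(w, x) + b)


def filter_at_patch(patch, kernel, b):
    """One filter position: elementwise multiply and sum, plus bias, then ReLU."""
    return relu(np.sum(patch * kernel) + b)


rng = np.random.default_rng(0)
patch = rng.random((3, 3))            # 3x3 chunk of pixels under the filter
kernel = rng.standard_normal((3, 3))  # filter values acting as the weights
bias = 0.1

node_out = ann_node(patch.reshape(-1), kernel.reshape(-1), bias)
filter_out = filter_at_patch(patch, kernel, bias)

print(np.isclose(node_out, filter_out))  # True
```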

**Difference in Application:**

*   Each ANN node works on **all** of its inputs at once.
*   Each CNN filter operates on one input **chunk** at a time, sliding a window across the image, which lets it capture the 2D spatial arrangement.
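
The window movement can be sketched with two loops. This is an illustrative, unoptimized version that assumes a single-channel image, stride 1, and no padding; like most deep learning libraries it does not flip the filter, so strictly speaking it is a cross-correlation. The function name and the averaging filter are made up for the example.

```python
import numpy as np


def conv2d_single_filter(image, kernel, bias=0.0):
    """Slide one filter over a 2D image (stride 1, no padding), then apply ReLU."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]            # the current chunk
            out[i, j] = np.sum(patch * kernel) + bias    # dot product + bias
    return np.maximum(out, 0.0)                          # ReLU


image = np.random.default_rng(0).random((28, 28))  # stand-in for an MNIST image
kernel = np.ones((3, 3)) / 9.0                     # simple averaging filter

feature_map = conv2d_single_filter(image, kernel)
print(feature_map.shape)  # (26, 26)
```

The same nine filter values are reused at every window position, which is why the filter's weight count does not grow with the image.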

### V. Key Differences: Trainable Parameters and Computational Cost

The primary difference lies in how the number of trainable parameters relates to the input image size, which explains the computational advantage of CNNs.

#### CNN Parameters (Input Size Independent)

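*   **Computational Efficiency:** Because each filter's weights are reused at every window position, convolutional layers need far fewer trainable parameters than fully connected layers on the same image.
*   The number of trainable parameters in a convolutional layer **does not depend on the height or width of the input image**. It only depends on:
    1.  The size of the filter (its depth matches the number of input channels).
    2.  The number of filters implemented.
*   The fully connected layers that follow flattening are ordinary ANN layers, so their weight count still depends on the size of the flattened feature map.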

*   **Example Calculation:** For a 28x28x3 image using 50 filters of size 3x3x3:
    *   Weights per filter: $3 \times 3 \times 3 = 27$.
    *   Total weights: $27 \times 50 = 1350$.
    *   Total biases (one per filter): 50.
    *   **Total Trainable Parameters: 1400**.
    *   If the image size were dramatically increased (e.g., 1080x1080x3), the number of trainable parameters would **remain 1400**, because the filter size and number of filters are unchanged.
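
The calculation above can be reproduced with a small helper (the function name is hypothetical). Note that the image height and width are not arguments at all, which is the point of the example.

```python
def conv_layer_params(filter_h, filter_w, in_channels, num_filters):
    """Trainable parameters in one convolutional layer: weights plus one bias per filter."""
    weights = filter_h * filter_w * in_channels * num_filters
    biases = num_filters
    return weights + biases


# 50 filters of size 3x3x3; the result is the same for a 28x28x3 or a 1080x1080x3 image.
print(conv_layer_params(3, 3, 3, 50))  # 1400
```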

#### ANN Parameters (Input Size Dependent)

*   The number of weights in an ANN grows in direct proportion to the input size.
*   The number of weights between the input layer and the first hidden layer is Input Units $\times$ Hidden Units. A 1080x1080x3 image flattens to 3,499,200 inputs, so even a hidden layer of 100 units would need roughly 350 million weights.
*   This dependence on input size is why ANNs have a higher computational cost and slower training when dealing with large images.
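
For comparison with the CNN example, the sketch below counts the parameters of a single fully connected layer. The hidden layer size of 100 units is an assumption chosen for illustration, and the function name is hypothetical.

```python
def dense_layer_params(num_inputs, num_hidden):
    """Trainable parameters in one fully connected layer: weights plus one bias per unit."""
    return num_inputs * num_hidden + num_hidden


hidden_units = 100  # assumed value, for illustration only

small = dense_layer_params(28 * 28 * 3, hidden_units)
large = dense_layer_params(1080 * 1080 * 3, hidden_units)

print(small)  # 235300
print(large)  # 349920100
```

Unlike the convolutional layer, whose count stayed at 1400, the fully connected layer grows from about 235 thousand to about 350 million parameters when the image grows from 28x28x3 to 1080x1080x3.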

### VI. Conclusion

The architectural design of CNNs allows them to capture the 2D spatial arrangement of pixels while keeping the number of trainable weights in the convolutional layers fixed regardless of image input size, thereby mitigating the computational cost and overfitting issues faced by ANNs. This context sets the stage for training CNNs using the backpropagation algorithm.

***