### 5. Comparison and Conclusions

#### Comparison with Custom CNN
Comparing this Transfer Learning experiment to my custom CNN from the previous exercise revealed a clear trade-off between model capacity and training cost. My custom CNN was lightweight and designed specifically for $28 \times 28$ inputs, so it trained very quickly. In contrast, ResNet-18 is a much heavier architecture with roughly 11 million parameters. Even though I reduced training to just **7 epochs**, the **total training time was longer** than for my custom model, because each individual step required significantly more computation and the images also had to be resized to $64 \times 64$. The pre-trained weights let the model reach high accuracy in fewer epochs, but the wall-clock time to get there was higher. This suggests that for simple datasets like FashionMNIST, a large pre-trained model can be computationally inefficient compared to a smaller, custom-built solution.
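As a rough back-of-the-envelope check on the resizing overhead, the sketch below compares how many input values each setup processes per image. It assumes the custom CNN consumed single-channel $28 \times 28$ images and that ResNet-18 received three-channel $64 \times 64$ images (after channel duplication). The helper name `input_values` is only illustrative, and input size is a crude proxy for compute, not a timing measurement.

```python
def input_values(height: int, width: int, channels: int) -> int:
    """Number of scalar values in one input image."""
    return height * width * channels


# Assumed input shapes: 1x28x28 for the custom CNN, 3x64x64 for ResNet-18.
custom_cnn_input = input_values(28, 28, 1)
resnet_input = input_values(64, 64, 3)

# Ratio of values entering the first layer per image.
input_ratio = resnet_input / custom_cnn_input

print(f"Custom CNN input values: {custom_cnn_input}")
print(f"ResNet-18 input values:  {resnet_input}")
print(f"Ratio: {input_ratio:.1f}x")
```

Even before counting the deeper network, each image fed to ResNet-18 carries roughly an order of magnitude more values than the original FashionMNIST input, which is consistent with the slower per-step time.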

#### The Tuning Process & Unfreezing Layers
Getting the ResNet model to perform well required significant iteration on hyperparameters. Initially, I treated ResNet strictly as a fixed feature extractor, freezing all pre-trained layers and using a standard learning rate. However, the model struggled to converge smoothly. The turning point came when I switched to the **Adam optimizer** and significantly **lowered the starting learning rate to 0.0001**. This smaller step size allowed the model to fine-tune the weights without destroying the pre-learned features. Additionally, **unfreezing Layer 4** (the final convolutional block) was crucial. The original weights in the deeper layers appeared to be too specialized for ImageNet objects; by unfreezing them, I allowed the model to repurpose those high-level filters to recognize clothing-specific features like sleeves and zippers, which immediately broke the performance plateau.
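The freeze-then-unfreeze strategy can be sketched as below. This is a minimal illustration, not the exact training code: it assumes a PyTorch-style model whose `named_parameters()` yields `(name, parameter)` pairs, with the final block and classifier head named `layer4` and `fc` (as in torchvision's ResNet-18), and it assumes the classifier head stays trainable. The `set_trainable_blocks` helper and the tiny `StandInResNet` class are hypothetical, used only so the example runs without downloading weights.

```python
from types import SimpleNamespace


def set_trainable_blocks(model, trainable_prefixes=("layer4.", "fc.")):
    """Freeze all parameters, then re-enable gradients for the chosen blocks.

    Works with any object exposing named_parameters(), such as a
    torch.nn.Module. Returns the names of the parameters left trainable.
    """
    prefixes = tuple(trainable_prefixes)
    trainable_names = []
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith(prefixes)
        if param.requires_grad:
            trainable_names.append(name)
    return trainable_names


class StandInResNet:
    """Hypothetical stand-in that mimics ResNet-18 parameter naming."""

    def __init__(self):
        names = [
            "conv1.weight", "layer1.0.conv1.weight", "layer3.1.bn2.bias",
            "layer4.0.conv1.weight", "layer4.1.bn2.bias",
            "fc.weight", "fc.bias",
        ]
        self._params = {n: SimpleNamespace(requires_grad=True) for n in names}

    def named_parameters(self):
        return self._params.items()


model = StandInResNet()
trainable = set_trainable_blocks(model)
print("Trainable parameters:", trainable)
```

With a real model, the same helper would typically be followed by an optimizer restricted to the trainable parameters, for example `torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=1e-4)`, matching the 0.0001 learning rate described above.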

#### The Challenge of Domain Mismatch
One key finding from this assignment is that Transfer Learning is not a "magic bullet" when the source and target domains are vastly different. ResNet was trained on ImageNet: high-resolution, colorful, real-world photography. FashionMNIST, on the other hand, consists of tiny, grayscale, low-contrast icons. This "domain gap" explains why the model didn't immediately achieve near-perfect results. Unlike a scenario where we transfer from "photos of cats" to "photos of dogs," here we transferred from "photos of the world" to "pixelated icons of clothes." This mismatch made the data preprocessing (duplicating channels, resizing to $64 \times 64$) and the fine-tuning of deeper layers absolutely critical to the model's success.
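To make the preprocessing concrete, here is a dependency-free sketch of what "duplicating channels" and "resizing to 64x64" do to one image. It is an assumption-laden illustration: the real pipeline most likely used library transforms with smoother interpolation, whereas this version uses nearest-neighbor sampling for simplicity, and the function names and synthetic `sample` image are hypothetical.

```python
def resize_nearest(image, out_h, out_w):
    """Nearest-neighbor resize of a 2D list of pixel values."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def gray_to_three_channels(image):
    """Copy one grayscale plane into three identical channels (C, H, W)."""
    return [[row[:] for row in image] for _ in range(3)]


def preprocess(image, size=64):
    """Upscale a grayscale image and replicate it across three channels."""
    return gray_to_three_channels(resize_nearest(image, size, size))


# Synthetic 28x28 grayscale image (a diagonal gradient), not real data.
sample = [[r + c for c in range(28)] for r in range(28)]
prepared = preprocess(sample)

print("channels:", len(prepared))
print("height:  ", len(prepared[0]))
print("width:   ", len(prepared[0][0]))
```

The three channels are exact copies, so no new information is added; the step only makes the tensor shape compatible with a network whose first convolution expects RGB input.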


---


### Visual Comparison: Custom CNN (HW4) vs. Transfer Learning (HW5)

This section compares the performance and internal representations of the custom-built CNN versus the pre-trained ResNet-18 model.

#### 1. Training Dynamics: Loss & Accuracy Curves
The graphs below illustrate the distinct learning behaviors of the two architectures.

| Custom CNN (Homework 4) | Transfer Learning (Homework 5) |
| :---: | :---: |
| ![HW4 Training Plot](<../data/models results images/homework_4_training_plot.png>) | ![HW5 Training Plot](<../data/models results images/homework_5_training_plot.png>) |

**Observation:** The **Custom CNN** begins with low accuracy and high loss, showing a steep and steady improvement curve as it learns features from scratch. In contrast, the **ResNet-18** (Transfer Learning) starts at a much higher baseline due to its pre-trained weights, requiring only minor fine-tuning to reach convergence.

---

#### 2. Quantitative Performance: Classification Reports
A side-by-side look at the precision, recall, and F1-scores for each class.

| Custom CNN (Homework 4) | Transfer Learning (Homework 5) |
| :---: | :---: |
| ![HW4 Report](<../data/models results images/homework_4_classification_report.png>) | ![HW5 Report](<../data/models results images/homework_5_classification_report.png>) |

**Observation:** Surprisingly, the final performance metrics are remarkably close. This suggests that for a low-resolution dataset like FashionMNIST, a well-tuned lightweight model can compete effectively with a much larger, industrial-grade architecture.
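For readers who want to see how the per-class numbers in such reports are derived, the sketch below computes precision, recall, and F1 from a confusion matrix. The `per_class_metrics` helper is illustrative, and the 3-class matrix is a toy example with made-up counts, not output from either model.

```python
def per_class_metrics(confusion):
    """Precision, recall, and F1 per class from a square confusion matrix.

    Rows are true classes; columns are predicted classes.
    """
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[row][k] for row in range(n))
        actual_k = sum(confusion[k])
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


# Toy 3-class counts for illustration only, not results from either model.
toy_confusion = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]

toy_metrics = per_class_metrics(toy_confusion)
for k, m in enumerate(toy_metrics):
    print(f"class {k}: P={m['precision']:.2f} "
          f"R={m['recall']:.2f} F1={m['f1']:.2f}")
```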

---

#### 3. Error Analysis: Confusion Matrices
Visualizing where the models make mistakes.

| Custom CNN (Homework 4) | Transfer Learning (Homework 5) |
| :---: | :---: |
| ![HW4 Confusion Matrix](<../data/models results images/homework_4_confusion_matrix.png>) | ![HW5 Confusion Matrix](<../data/models results images/homework_5_confusion_matix.png>) |

**Observation:** Both models struggle with similar "hard" classes (like Shirt vs. T-shirt/top), which suggests that the difficulty lies in the visual ambiguity of the data rather than in a deficiency specific to either model.
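One way to read "hard" class pairs off a confusion matrix programmatically is to rank its off-diagonal cells. The sketch below does this with a hypothetical `most_confused_pairs` helper; the counts are toy values chosen for illustration and do not come from the matrices shown above.

```python
def most_confused_pairs(confusion, labels, top_k=3):
    """Return the top_k off-diagonal cells as (true, predicted, count).

    Rows are true classes; columns are predicted classes.
    """
    pairs = [
        (labels[t], labels[p], confusion[t][p])
        for t in range(len(labels))
        for p in range(len(labels))
        if t != p
    ]
    # Largest misclassification counts first.
    pairs.sort(key=lambda item: item[2], reverse=True)
    return pairs[:top_k]


# Toy counts for illustration only, not results from either model.
toy_labels = ["T-shirt/top", "Shirt", "Sneaker"]
toy_confusion = [
    [80, 18, 2],
    [15, 84, 1],
    [0, 1, 99],
]

top_pairs = most_confused_pairs(toy_confusion, toy_labels, top_k=2)
for true_label, predicted_label, count in top_pairs:
    print(f"{true_label} predicted as {predicted_label}: {count}")
```

Running the same ranking on both models' matrices and comparing the top entries would be a direct way to check whether they fail on the same class pairs.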

---

#### 4. Interpretability: Learned Filters
Comparing the visual patterns learned by the models.

| Custom CNN (Homework 4) | Transfer Learning (Homework 5) |
| :---: | :---: |
| ![HW4 Filters](<../data/models results images/homework_4_filters.png>) | ![HW5 Filters](<../data/models results images/homework_5_filters.png>) |