### DS 542 Midterm Challenge Report
Ann Liang <br>
March 30, 2025 <br>
Kaggle User Name: aaaaaliang <br>

#### AI Disclosure Statement 
This machine learning challenge was completed using a Pretrained CNN model (ResNet50) and an AI tool (ChatGPT). The following list details how ChatGPT was used throughout the completion of this challenge: 
- Model architecture comparison and selection, including recommendations to adjust the number of convolutional layers and to use ResNet18 and ResNet50 while unfreezing the later layers. 
- Troubleshooting the SCC setup, creating batch jobs, writing shell scripts, and determining the best GPU usage. 
- Data augmentation recommendations: random horizontal flip, color jitter, and random erasing. 
- Hyperparameter tuning recommendations: adjusting the optimizer, learning rate scheduler, and weight decay. 
- Use of WandB for experiment tracking. 
- Organizing notes on experiment observations and helping interpret results for the final report. 
1. Written by Me: 
- Basic structure of CNN models, training and testing loops, data loading, model initialization, loss function, optimizer, and scheduler. 
- Basic structure of batch job shell scripts. 
- Basic data transformation steps, such as random rotation, random cropping, and normalization. 
2. Written with AI Assistance: 
- Detailed structure of the ResNet18 and ResNet50 functions. 
- Refined data augmentation choices: random horizontal flip, color jitter, and random erasing. 
- Refined hyperparameter tuning choices: optimizer, learning rate scheduler, and weight decay. 
- Code organization and CUDA settings for running on SCC. 
- Organizing notes on experiment observations and helping interpret results for the final report. 
3. Code Comment: 
- All code is documented with detailed comments in the respective .py files. 

#### Model Description and Justification
There are three primary CNN models implemented: 
1. **Simple CNN**: 
- Architecture: A convolutional neural network with four layers. 
- Why Simple CNN? To provide an initial benchmark performance for comparison. 
2. **Sophisticated CNN (ResNet18)**: 
- Architecture: A more refined convolutional neural network using ResNet18, built from residual blocks with skip connections. 
- Why ResNet18? To extract features better and improve on the Simple CNN's performance. 
3. **Pretrained CNN (ResNet50)**: 
- Architecture: A deeper residual network (ResNet50) pretrained on ImageNet. 
- Why ResNet50? To fine-tune on the CIFAR-100 dataset for the best model performance, with the following advantages: 
    - A deeper network can capture more complicated patterns.
    - A pretrained model provides stronger feature extraction. 
    - Unfreezing the later layers lets the model adapt to CIFAR-100 faster.

#### Result Analysis 

1. **Simple CNN (Epochs=30)**: 
- Optimizer: SGD 
- Batch Size: 32 
- Learning Rate: 0.01 
- Epochs: 30 
- Test Accuracy: 61.16% 
- Kaggle Accuracy: 45.23% (File: submission_ood (3).csv)
- Analysis: The Simple CNN established a baseline, but this run used 30 epochs rather than the number specified in the Kaggle instructions, so it was rerun with 5 epochs (see the next run). The CIFAR-100 test accuracy was much higher than the Kaggle accuracy, showing that the model generalized poorly. 

2. **Simple CNN (Epochs=5)**: 
- Optimizer: SGD 
- Batch Size: 32 
- Learning Rate: 0.01 
- Epochs: 5 
- Test Accuracy: 32.89%
- Kaggle Accuracy: 24.80% (File: submission_ood_simple_cnn.csv)
- Analysis: After reducing the number of epochs from 30 to 5, the Kaggle accuracy dropped from 45.23% to 24.80%. This shows that the number of epochs determines how many passes the model makes over the dataset and therefore strongly affects performance. 

3. **Sophisticated CNN (ResNet18)**: 
- Optimizer: SGD 
- Batch Size: 32 
- Learning Rate: 0.01 
- Epochs: 5 
- Test Accuracy: 35.24% 
- Kaggle Accuracy: 29.72% (File: submission_ood_soph_cnn (1).csv)
- Analysis: Training with ResNet18 improved Kaggle accuracy by ~5 percentage points over the 5-epoch Simple CNN. This highlights the advantage of training deeper networks and leaves room for further improvement. 

4. **Pretrained CNN (ResNet50)**: 
- Optimizer: SGD 
- Batch Size: 64 
- Learning Rate: 0.003 
- Epochs: 50 
- Test Accuracy: 56.98% 
- Kaggle Accuracy: 44.95% (File: submission_ood_pretrained_cnn.csv)
- Analysis: Training with ResNet50 improved Kaggle accuracy by a further ~15 percentage points over ResNet18. This suggests that the deeper, pretrained network captures more features. Although its Kaggle accuracy is slightly lower than that of the Simple CNN (30 epochs), it generalized better, with a smaller gap between test and Kaggle accuracy (12 vs. 16 percentage points). 

Since the Pretrained CNN (ResNet50) has the best overall balance, the following sections focus on explaining its design. 
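
As a quick check, the test-to-Kaggle gaps quoted in the analysis can be recomputed from the reported accuracies. The snippet below is only bookkeeping on the numbers already listed in this section; the variable names `runs` and `gaps` are illustrative.

```python
# (test accuracy %, Kaggle accuracy %) as reported in the result analysis.
runs = {
    "Simple CNN (30 epochs)": (61.16, 45.23),
    "Simple CNN (5 epochs)": (32.89, 24.80),
    "Sophisticated CNN (ResNet18)": (35.24, 29.72),
    "Pretrained CNN (ResNet50)": (56.98, 44.95),
}

# Gap in percentage points between CIFAR-100 test accuracy and Kaggle accuracy.
gaps = {name: round(test - kaggle, 2) for name, (test, kaggle) in runs.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:.2f} points")
```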

#### Hyperparameter Tuning
Based on the prior results, hyperparameters were tuned to find the best-performing model. The list below highlights the changes from the Sophisticated CNN (ResNet18) to the Pretrained CNN (ResNet50). 
- **Optimizer**: Changed from SGD to AdamW for decoupled weight decay and adaptive per-parameter learning rates.
- **Batch Size**: Increased from 32 to 64 for less noisy gradient estimates. 
- **Learning Rate**: Decreased from 0.01 to 0.003 so that updates do not overshoot and disrupt the pretrained weights.
- **Learning Rate Scheduler**: Changed from StepLR to CosineAnnealingLR to reduce the learning rate gradually rather than in steps. 
- **Epochs**: Set to 50 to allow more thorough fine-tuning of the model. 
- **Weight Decay**: Decreased from 5e-4 to 2e-4 to balance regularization against learning capacity. 
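
To illustrate how a cosine schedule reduces the learning rate gradually, the sketch below evaluates the standard closed-form cosine annealing formula, which is the curve `CosineAnnealingLR` follows when stepped once per epoch. The base learning rate of 0.003 comes from the report. The period `t_max=50` (one cycle over all 50 epochs) and `eta_min=0.0` are assumptions, since the report does not state them.

```python
import math


def cosine_annealing_lr(epoch, base_lr=0.003, t_max=50, eta_min=0.0):
    """Closed-form cosine annealing: starts at base_lr, ends at eta_min after t_max epochs."""
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cosine)


# Learning rate at the start of each epoch, plus the final value at epoch 50.
schedule = [cosine_annealing_lr(epoch) for epoch in range(51)]

for epoch in (0, 10, 25, 40, 50):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6f}")
```

Unlike StepLR, which drops the rate abruptly at fixed intervals, this curve falls slowly at first, fastest around the midpoint, and slowly again near the end.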

#### Regularization Techniques
Regularization techniques were incorporated to reduce model complexity and increase generalization. 
- **Weight Decay**: It stabilizes training by preventing the weights from growing too large, which helps generalization. 
- **Label Smoothing**: It softens the one-hot training targets so the model does not become overconfident, which improves robustness to noisy labels. 
- **Dropout**: It randomly deactivates neurons during training to prevent overfitting and improve generalization.
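
As an illustration of what label smoothing does to the training targets, the sketch below builds a smoothed target distribution for a 100-class problem and compares the resulting cross-entropy with the hard-label case. It is a plain-Python sketch of the usual formulation (each class receives `epsilon / num_classes`, and the true class receives the remaining mass). The value `epsilon = 0.1` and the example logits are assumptions, as the report does not give the smoothing factor.

```python
import math


def smoothed_targets(true_class, num_classes, epsilon):
    """Spread epsilon uniformly over all classes; the true class keeps the rest."""
    targets = [epsilon / num_classes] * num_classes
    targets[true_class] += 1.0 - epsilon
    return targets


def cross_entropy(logits, targets):
    """Cross-entropy between a target distribution and softmax(logits)."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return -sum(t * (x - log_norm) for x, t in zip(logits, targets))


num_classes, true_class, epsilon = 100, 3, 0.1  # epsilon is an assumed value

# A very confident prediction for the correct class.
logits = [0.0] * num_classes
logits[true_class] = 10.0

hard_loss = cross_entropy(logits, smoothed_targets(true_class, num_classes, 0.0))
smooth_loss = cross_entropy(logits, smoothed_targets(true_class, num_classes, epsilon))
print(f"hard-label loss: {hard_loss:.4f}, smoothed loss: {smooth_loss:.4f}")
```

The smoothed loss stays noticeably above zero even for a confident, correct prediction, which is how the technique discourages overconfident outputs.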

#### Data Augmentation Strategy
Data augmentation was used to generate random variations, prevent over-fitting, and improve model generalization. The key strategies include: 
- **To Tensor**: It converts images to tensors so the model can process them in batches. 
- **Random Horizontal Flip**: It mirrors images at random to add variation in orientation. 
- **Random Rotation**: It exposes the model to a wider range of image orientations. 
- **Color Jitter**: It randomly changes the brightness, contrast, and saturation of the images. 
- **Random Cropping with Padding**: It shifts the image content so the model sees varied compositions. 
- **Random Erasing**: It erases random patches of the image so the model cannot rely on any single region. 
- **Normalize**: It standardizes each image channel using a fixed mean and standard deviation.


#### Experiment Tracking Summary
Weights and Biases (WandB) was used to track training and validation metrics for each model run. Below is the dashboard of successful runs; failed and crashed runs from the early setup stage are excluded: 
- **Training and Validation Loss**: In an ideal model, both losses decrease and remain low. Some runs show this pattern, meaning they are learning well. In other runs, the validation loss remains high or fluctuates, meaning they are not generalizing well to unseen data. 
- **Training and Validation Accuracy**: In an ideal model, both accuracies increase together. Some runs show good progress, while others show signs of overfitting. 

![Image](wandb_tracking.png)

#### Areas for Improvement 
The following aspects can be addressed to further improve CNN models:  
- Experiment with larger batch sizes. 
- Fine-tune more layers and train for more epochs to increase accuracy. 
- Try out different optimizers and schedulers.
- Explore other hyperparameter settings, regularization techniques, and data augmentation strategies for higher model accuracy. 

#### Conclusion
This challenge reinforced the effectiveness of using a pretrained CNN model, such as ResNet50, over a simpler CNN model. Adjusting hyperparameters and data augmentation strategies produced considerable improvements in both CIFAR-100 test accuracy and Kaggle accuracy.