
# Homework 7 - Convolutional Neural Networks

## Exploring Advanced CNN Architectures on CIFAR-10

Last week, we built a relatively simple CNN using only NumPy for the MNIST digit recognition task. This week, we move on to more complex CNN architectures and the more challenging CIFAR-10 dataset.

### Task Overview:

CIFAR-10 is a standard benchmark dataset of 60,000 32 x 32 color images in 10 classes, split into 50,000 training images and 10,000 test images. Your objective is to build and train convolutional neural networks that classify CIFAR-10 images.

### Objective:

This week's assignment is intentionally open-ended to encourage creativity and exploration. There are numerous tutorials online that describe how to train a convolutional network on CIFAR-10. We encourage you to seek out these resources and utilize them. A great starting point is the [CIFAR-10 tutorial from PyTorch](https://colab.research.google.com/github/pytorch/tutorials/blob/gh-pages/_downloads/cifar10_tutorial.ipynb), which includes code for loading the dataset and evaluating model performance. You are welcome to use any other resources, but please make sure to cite them.

### Guidelines:

1. **Frameworks:** You may use frameworks such as PyTorch or TensorFlow to build and train your models. You are not limited to NumPy.
2. **Dataset:** You may train your model only on the CIFAR-10 training set. Pre-trained models and additional training datasets are not allowed.
3. **Compute Resource:** You must train your model on the free Colab GPU or TPU. This means you can only train the model for about an hour or so, which is significantly less compute than typically used for training CIFAR-10 models. Therefore, this assignment is as much about building an efficient model as it is about building an accurate one.
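
One way to respect the compute limit is to cap training by wall-clock time rather than by a fixed number of epochs. The following is a minimal sketch in PyTorch, not a required implementation; the function name `train_with_time_budget` and its parameters (`budget_seconds`, `max_epochs`) are illustrative assumptions.

```python
import time

import torch
from torch import nn


def train_with_time_budget(model, loader, optimizer, budget_seconds,
                           device="cpu", max_epochs=100):
    """Train until the time budget or max_epochs is reached; return epochs completed.

    Hypothetical helper for illustration: names and defaults are assumptions.
    """
    criterion = nn.CrossEntropyLoss()
    model.to(device)
    model.train()
    start = time.monotonic()
    epochs_done = 0
    for _ in range(max_epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        epochs_done += 1
        # Check the clock once per epoch so every epoch sees the full training set.
        if time.monotonic() - start >= budget_seconds:
            break
    return epochs_done
```

For example, passing `budget_seconds=50 * 60` would stop after the first epoch that ends past the 50-minute mark, leaving some margin for evaluation within the roughly one-hour limit.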

### Evaluation:

To evaluate the performance of your model, you will need to implement a function that computes the accuracy on the test set. Refer to the linked tutorial for guidance.
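
As a starting point, an accuracy function might look like the sketch below. It assumes a PyTorch model that returns one row of class scores per image and a data loader that yields `(images, labels)` batches; the name `compute_accuracy` is illustrative and not taken from the tutorial.

```python
import torch
from torch import nn


@torch.no_grad()  # gradients are not needed for evaluation
def compute_accuracy(model, loader, device="cpu"):
    """Return the fraction of correctly classified examples in `loader`."""
    model.to(device)
    was_training = model.training
    model.eval()  # use inference behavior for dropout and batch norm
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        predictions = model(images).argmax(dim=1)  # highest-scoring class
        correct += (predictions == labels).sum().item()
        total += labels.size(0)
    model.train(was_training)  # restore the previous mode
    return correct / total
```

Calling `model.eval()` matters once your network contains dropout or batch normalization, since both behave differently during training and inference.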

### Challenge:

There have been numerous advances in the field that allow for the efficient training of CIFAR-10 models. For context, [this approach](https://github.com/davidcpage/cifar10-fast/tree/master) achieves 96% accuracy in under a minute on a single GPU!

### Strategies to Explore:

There are various strategies you can employ to enhance the accuracy and efficiency of your model. Here are some ideas:

1. **Deeper Models:** Explore deeper architectures to capture more complex features.
2. **Residual Connections:** Implement residual connections to improve gradient flow.
3. **[Data Augmentation and Normalization](https://d2l.ai/chapter_computer-vision/kaggle-cifar10.html#image-augmentation):** Use techniques such as rotation, flipping, and scaling to augment your dataset, and normalize the input images (for example, by per-channel mean and standard deviation).
4. **Regularization and Dropout:** Apply regularization techniques and dropout layers to prevent overfitting.
5. **[Learning Rate Schedules:](https://d2l.ai/chapter_optimization/lr-scheduler.html)** Experiment with different learning rate schedules to optimize training.
6. **Different Forms of Normalization:** Investigate various normalization methods, such as batch normalization and layer normalization.
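
To make a few of these ideas concrete, here is a minimal PyTorch sketch of two of them: a simple augmentation (random crop after padding, plus random horizontal flip) and a residual block with batch normalization. The names `augment_batch` and `ResidualBlock`, the padding of 4 pixels, and the layer layout are illustrative assumptions, not a prescribed design.

```python
import torch
from torch import nn
import torch.nn.functional as F


def augment_batch(images, padding=4):
    """Random crop (after zero padding) and random horizontal flip.

    Expects a float tensor of shape (N, C, H, W); returns the same shape.
    """
    n, _, h, w = images.shape
    padded = F.pad(images, (padding, padding, padding, padding))
    out = torch.empty_like(images)
    for i in range(n):
        top = torch.randint(0, 2 * padding + 1, (1,)).item()
        left = torch.randint(0, 2 * padding + 1, (1,)).item()
        crop = padded[i, :, top:top + h, left:left + w]
        if torch.rand(1).item() < 0.5:
            crop = crop.flip(-1)  # flip along the width axis
        out[i] = crop
    return out


class ResidualBlock(nn.Module):
    """Two 3x3 conv + batch-norm layers with an identity skip connection."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # the skip connection adds the input back
```

Because the block keeps the number of channels and the spatial size unchanged, several of them can be stacked to build a deeper model. Apply augmentation to training batches only, never to the test set.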

### Expected Performance:

Even without any of the advanced techniques, you should be able to achieve at least 50% accuracy. The basic AlexNet architecture discussed in class should get you to about 60%, and with some tweaking you can probably reach around 75%. If you're up for a challenge, aim for 96% by employing the strategies listed above!

### Submission:

Submit your Google Colab notebook with all cells executed, and include the generated plots. Ensure that your code is well-documented, and provide clear explanations for each step.

