<a target="_blank" rel="noopener noreferrer" href="https://colab.research.google.com/github/epacuit/introduction-machine-learning/blob/main/tutorials/tutorial6_release.ipynb">![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)</a>


(tutorial6=)
# CIFAR-10 CNN Assignment

In this assignment, you will build a Convolutional Neural Network (CNN) using TensorFlow/Keras to classify images from the CIFAR-10 dataset. The dataset contains 60,000 color images of size 32×32 in 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.

**Outline:**

1. Load and explore CIFAR-10.
2. Split the training data into training and validation sets (40,000 / 10,000).
3. Visualize sample images with their class names.
4. Build your CNN model.
5. Compile the model.
6. Train the model (for 3 epochs to keep training time short).
7. Evaluate the model on the test set (10,000 images).

## 1. Load CIFAR-10 Dataset

The cell below loads the CIFAR-10 dataset using TensorFlow’s `tf.keras.datasets.cifar10.load_data()`. The dataset comes already split into a training set (50,000 images) and a test set (10,000 images).

In [1]:
# This cell is provided for you: load and normalize CIFAR-10
import tensorflow as tf
import numpy as np

# Load dataset
(x_train_full, y_train_full), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Normalize the images to [0, 1]
x_train_full = x_train_full.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

print('Full training set shape:', x_train_full.shape)
print('Test set shape:', x_test.shape)

# The CIFAR-10 classes
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
print('Classes:', class_names)

Full training set shape: (50000, 32, 32, 3)
Test set shape: (10000, 32, 32, 3)
Classes: ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']


## 2. Split Data into Training and Validation Sets

Using the full training set (50,000 images), split the data into:

- **Training set:** 40,000 images
- **Validation set:** 10,000 images

You can use scikit-learn’s `train_test_split` function to perform this split.
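If scikit-learn is not installed in your environment, the same kind of random split can be sketched with a shuffled index in NumPy. The example below runs on a small synthetic array rather than CIFAR-10; `split_train_val` and its parameters are hypothetical names, not part of any library. It returns its results in the same order as `train_test_split` (train inputs, held-out inputs, train labels, held-out labels).

```python
import numpy as np

def split_train_val(x, y, val_size, seed=0):
    """Shuffle the indices once, then hold out the last `val_size` examples."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    cut = len(x) - val_size
    train_idx, val_idx = order[:cut], order[cut:]
    # Index images and labels with the same indices so they stay aligned.
    return x[train_idx], x[val_idx], y[train_idx], y[val_idx]

# Synthetic stand-in for the image data: 50 tiny "images" with labels of shape (n, 1).
x_demo = np.arange(50 * 4, dtype="float32").reshape(50, 2, 2, 1)
y_demo = np.arange(50).reshape(50, 1)

x_tr, x_va, y_tr, y_va = split_train_val(x_demo, y_demo, val_size=10)
print(x_tr.shape, x_va.shape)
```

Fixing the seed makes the split reproducible, which helps when comparing runs.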

In [2]:
# TODO: Split x_train_full and y_train_full into training and validation sets
from sklearn.model_selection import train_test_split

# YOUR CODE HERE
raise NotImplementedError()

print('Training set shape:', x_train.shape)
print('Validation set shape:', x_val.shape)


## 3. Visualize Sample Images

To get a sense of the data, plot a 3×3 grid of sample images from the training set, each titled with its class name. Each label is an integer index into the `class_names` list defined earlier.
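Keras returns the CIFAR-10 labels as a column of shape `(n, 1)`, so each `y_train[i]` is a one-element array rather than a plain integer. A minimal sketch of the lookup on made-up labels (the `y_demo` array is an illustration, not real data):

```python
import numpy as np

class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Made-up labels with the same (n, 1) layout that load_data() returns.
y_demo = np.array([[6], [9], [3]], dtype="uint8")

# Index the single column to get a scalar, then convert it to a Python int.
names = [class_names[int(label[0])] for label in y_demo]
print(names)  # ['frog', 'truck', 'cat']
```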

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(10,10))
for i in range(9):
    plt.subplot(3, 3, i+1)
    plt.imshow(x_train[i])
    # y_train[i] is a one-element array; .item() extracts the integer label
    plt.title(class_names[y_train[i].item()])
    plt.axis('off')
plt.show()

## 4. Build Your CNN Model

Construct a simple CNN using TensorFlow/Keras with the following guidelines:

- **Input Layer:** Your model will accept images of shape (32, 32, 3).
- **Convolutional Layers:** Include at least two convolutional layers (each followed by a pooling layer).
- **Flattening:** Use a `Flatten` layer to convert the 2D feature maps to a 1D vector.
- **Dense Layers:** Add one or more dense layers to learn non-linear combinations of the features.
- **Output Layer:** An output dense layer with 10 units (one for each class) using softmax activation, so that the output is a probability distribution over classes.
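To see how the layers in this list interact, it can help to trace the feature-map size by hand before calling `model.summary()`. The sketch below assumes 3×3 convolutions with no padding, 2×2 max pooling with stride 2, and 64 filters in the second convolutional layer. Those choices and the helper names are assumptions for illustration, not requirements of the assignment.

```python
def conv_out(size, kernel=3, stride=1):
    """Output width/height of a convolution with no ('valid') padding."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2):
    """Output width/height of non-overlapping pooling (stride equals pool size)."""
    return (size - pool) // pool + 1

size = 32                      # CIFAR-10 images are 32x32
trace = [size]
for _ in range(2):             # two conv + pool stages
    size = conv_out(size)
    trace.append(size)
    size = pool_out(size)
    trace.append(size)

last_filters = 64              # assumed filter count of the second conv layer
flat_features = size * size * last_filters
print(trace)                   # [32, 30, 15, 13, 6]
print(flat_features)           # 2304
```

Under these assumptions the `Flatten` layer would hand 2,304 features to the first dense layer; a different kernel size, padding mode, or filter count changes that number.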

Write your model architecture below.

In [None]:
# TODO: Build your CNN model using tf.keras
from tensorflow import keras
from tensorflow.keras import layers

# YOUR CODE HERE
raise NotImplementedError()

model.summary()

## 5. Compile the Model

Compile your model using the `sparse_categorical_crossentropy` loss function.
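The "sparse" variant of this loss expects integer class labels like the ones CIFAR-10 provides, so no one-hot encoding is needed. The NumPy sketch below computes the same quantity by hand on made-up probabilities to show what the loss measures; it illustrates the formula and is not Keras's implementation.

```python
import numpy as np

def sparse_cce(labels, probs):
    """Mean negative log-probability assigned to the true class."""
    rows = np.arange(len(labels))
    return float(-np.log(probs[rows, labels]).mean())

# Made-up softmax outputs for 2 examples over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])          # integer labels, not one-hot

# The one-hot form gives the same value; the sparse form just skips the encoding.
one_hot = np.eye(3)[labels]
dense_cce = float(-(one_hot * np.log(probs)).sum(axis=1).mean())

print(round(sparse_cce(labels, probs), 4), round(dense_cce, 4))
```

The loss shrinks toward 0 as the probability assigned to the correct class approaches 1.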

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

## 6. Train the Model

Train your model for 3 epochs. Training on the 40,000-image training split may take a while, so 3 epochs is all that is required. If you'd like to see how much accuracy you can achieve, feel free to train for longer.

Do you end up overfitting the data? Underfitting the data?
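One way to answer these questions is to compare training and validation accuracy epoch by epoch. The sketch below assumes a dictionary shaped like the `history` attribute of the object returned by `model.fit` when the model is compiled with `metrics=['accuracy']` and given validation data. The numbers, the helper name `diagnose_fit`, and both thresholds are made up for illustration.

```python
def diagnose_fit(history, gap_tol=0.05, low_acc=0.5):
    """Rough heuristic based on the final epoch; the thresholds are arbitrary."""
    train_acc = history["accuracy"][-1]
    val_acc = history["val_accuracy"][-1]
    if train_acc - val_acc > gap_tol:
        return "possible overfitting"
    if train_acc < low_acc:
        return "possible underfitting"
    return "no clear sign of either"

# Made-up per-epoch accuracies for a 3-epoch run.
demo_history = {
    "accuracy":     [0.42, 0.58, 0.71],
    "val_accuracy": [0.45, 0.55, 0.60],
}
print(diagnose_fit(demo_history))
```

A widening gap between the two curves suggests the model is memorizing the training set, while low accuracy on both suggests it has not yet learned enough.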

In [None]:
# TODO: Train your model

epochs = 3

# YOUR CODE HERE
raise NotImplementedError()
      

## 7. Evaluate on Test Data

Evaluate the performance of your trained model on the CIFAR-10 test set (10,000 images). This will give you a clear picture of how well your model generalizes to unseen data.
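Test accuracy is the fraction of test images whose highest-probability class matches the true label. A minimal NumPy sketch on made-up predictions (the arrays stand in for softmax outputs and `y_test`; they are not real results):

```python
import numpy as np

# Made-up softmax outputs for 4 images over 3 classes.
probs = np.array([[0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3],
                  [0.1, 0.2, 0.7],
                  [0.4, 0.5, 0.1]])
y_true = np.array([[0], [1], [2], [0]])    # labels in (n, 1) layout, as in CIFAR-10

y_pred = probs.argmax(axis=1)              # predicted class per image
accuracy = float((y_pred == y_true.reshape(-1)).mean())
print(accuracy)  # 0.75
```

Flattening `y_true` before the comparison matters: comparing a `(4,)` array with a `(4, 1)` array would broadcast to a `(4, 4)` matrix and give a misleading mean.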

In [None]:
# YOUR CODE HERE
raise NotImplementedError()