# Advanced Image Classification with CIFAR-100

In this assignment, you will be asked to develop a convolutional neural network (CNN) to classify images from the CIFAR-100 dataset. At each step, you'll be guided through the process of developing a model architecture to solve a problem. Your goal is to create a CNN that attains at least 55% accuracy on the validation set.

### The CIFAR-100 Dataset

The [CIFAR-100 dataset](https://www.cs.toronto.edu/~kriz/cifar.html) consists of 60000 32x32 colour images in 100 classes, with 600 images per class. There are 50000 training images and 10000 test images, so each class has exactly 500 training images and 100 test images. The 100 classes are grouped into 20 superclasses, and each image comes with a "fine" label (the class it belongs to) and a "coarse" label (its superclass).

### Tools

You will use Keras with TensorFlow to develop your CNN. For this assignment, it's strongly recommended that you use a GPU to accelerate your training, or else you might find it difficult to train your network in a reasonable amount of time. If you have a computer with a GPU that you wish to use, you can follow the [TensorFlow instructions](https://www.tensorflow.org/install/) for installing TensorFlow with GPU support. Otherwise, you can use [Google Colab](https://colab.research.google.com/) to complete this assignment. Colab provides free access to GPU-enabled machines. If you run into any issues, please contact us as soon as possible so that we can help you resolve them.

## Task 1: Data Exploration and Preprocessing (Complete or Incomplete)
### 1a: Load and Explore the Dataset
- Use the code below to download the dataset.
- Explore the dataset: examine the shape of the training and test sets, the dimensions of the images, and the number of classes. Show a few examples from the training set.

In [31]:
from keras.datasets import cifar100

# Load the CIFAR-100 dataset
(x_train, y_train), (x_test, y_test) = cifar100.load_data(label_mode='fine')

In [32]:
class_names = [
    "apple", "aquarium_fish", "baby", "bear", "beaver", "bed", "bee", "beetle", "bicycle", "bottle",
    "bowl", "boy", "bridge", "bus", "butterfly", "camel", "can", "castle", "caterpillar", "cattle",
    "chair", "chimpanzee", "clock", "cloud", "cockroach", "couch", "crab", "crocodile", "cup", "dinosaur",
    "dolphin", "elephant", "flatfish", "forest", "fox", "girl", "hamster", "house", "kangaroo", "keyboard",
    "lamp", "lawn_mower", "leopard", "lion", "lizard", "lobster", "man", "maple_tree", "motorcycle", "mountain",
    "mouse", "mushroom", "oak_tree", "orange", "orchid", "otter", "palm_tree", "pear", "pickup_truck", "pine_tree",
    "plain", "plate", "poppy", "porcupine", "possum", "rabbit", "raccoon", "ray", "road", "rocket",
    "rose", "sea", "seal", "shark", "shrew", "skunk", "skyscraper", "snail", "snake", "spider",
    "squirrel", "streetcar", "sunflower", "sweet_pepper", "table", "tank", "telephone", "television", "tiger", "tractor",
    "train", "trout", "tulip", "turtle", "wardrobe", "whale", "willow_tree", "wolf", "woman", "worm"
]

In [6]:
# Examine the shapes of the training and test arrays

print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)

(50000, 32, 32, 3)
(50000, 1)
(10000, 32, 32, 3)
(10000, 1)


In [9]:
# Look at the data types 
print(x_train.dtype)
print(y_train.dtype)

uint8
int32


In [20]:
# Look at sample values 
print(x_train[0, :5, :5, 0])  
print(y_train)

[[255 255 255 255 255]
 [255 254 254 254 254]
 [255 254 255 255 255]
 [255 254 255 255 255]
 [255 254 255 255 255]]
[[19]
 [29]
 [ 0]
 ...
 [ 3]
 [ 7]
 [73]]


In [27]:
# Find the unique values 
import numpy as np

unique_values_x = np.unique(x_train)
print(unique_values_x)
unique_values_y = np.unique(y_train)
print(unique_values_y)

[  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35
  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53
  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71
  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89
  90  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 106 107
 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125
 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161
 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179
 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197
 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215
 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233
 234 235 236 237 238 239 240 241 242 243 244 245 24
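
One way to check the number of classes and how evenly they are represented is to count the labels with `np.bincount`. The sketch below is only an illustration: it uses a small synthetic label array as a stand-in for `y_train` (an assumption), and the helper name `class_counts` is hypothetical.

```python
import numpy as np

def class_counts(labels, num_classes):
    """Return how many samples carry each class index."""
    # Keras returns labels with shape (N, 1); flatten to (N,) before counting.
    flat = np.asarray(labels).ravel()
    return np.bincount(flat, minlength=num_classes)

# Synthetic stand-in for y_train: 4 classes, 3 samples each, shape (12, 1).
demo_labels = np.repeat(np.arange(4), 3).reshape(-1, 1)
counts = class_counts(demo_labels, num_classes=4)
print(counts)                                   # per-class sample counts
print(len(counts), counts.min(), counts.max())  # number of classes, smallest and largest class
```

On the real data, `class_counts(y_train, 100)` should report the same count for every class if the training set is balanced.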

### 1b: Data Preprocessing
- With the data downloaded, it's time to preprocess it. Start by normalizing the images so that they all have pixel values in the range [0, 1].
- Next, convert the labels to one-hot encoded vectors.
- Finally, split the training set into training and validation sets. Use 80% of the training set for training and the remaining 20% for validation.

In [33]:
# Normalize the images 
x_train_n = x_train.astype('float32') / 255.0
x_test_n = x_test.astype('float32') / 255.0
print(x_train_n)
print(x_test_n)

[[[[1.         1.         1.        ]
   [1.         1.         1.        ]
   [1.         1.         1.        ]
   ...
   [0.7647059  0.8039216  0.75686276]
   [0.83137256 0.8784314  0.8       ]
   [0.7137255  0.7607843  0.654902  ]]

  [[1.         1.         1.        ]
   [0.99607843 0.99607843 0.99607843]
   [0.99607843 0.99607843 0.99607843]
   ...
   [0.6666667  0.6901961  0.5882353 ]
   [0.6313726  0.65882355 0.50980395]
   [0.57254905 0.6039216  0.44313726]]

  [[1.         1.         1.        ]
   [0.99607843 0.99607843 0.99607843]
   [1.         1.         1.        ]
   ...
   [0.7411765  0.78039217 0.6627451 ]
   [0.6509804  0.69803923 0.50980395]
   [0.4745098  0.52156866 0.34117648]]

  ...

  [[0.5803922  0.7254902  0.30980393]
   [0.5568628  0.7137255  0.22352941]
   [0.54901963 0.7019608  0.23529412]
   ...
   [0.11764706 0.06666667 0.00392157]
   [0.25490198 0.24313726 0.05882353]
   [0.29803923 0.3019608  0.07843138]]

  [[0.47843137 0.6156863  0.25882354]
   [0.4

In [34]:
from tensorflow.keras.utils import to_categorical

# Perform one hot encoding on the labels 
y_train_oh = to_categorical(y_train, num_classes=100)
y_test_oh = to_categorical(y_test, num_classes=100)
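
To make the encoding step concrete, here is a rough NumPy equivalent of what one-hot encoding does. It is a sketch for illustration, not the Keras implementation; the function name `one_hot` is hypothetical, and the three labels are the first three values printed for `y_train` above.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels of shape (N,) or (N, 1) into an (N, num_classes) float32 array."""
    flat = np.asarray(labels).ravel()
    encoded = np.zeros((flat.size, num_classes), dtype="float32")
    # Put a 1 in the column given by each label; every other entry stays 0.
    encoded[np.arange(flat.size), flat] = 1.0
    return encoded

demo = one_hot(np.array([[19], [29], [0]]), num_classes=100)
print(demo.shape)            # (3, 100)
print(demo.argmax(axis=1))   # recovers the original labels
```

Taking `argmax` along the class axis reverses the encoding, which is handy later when comparing predictions with true labels.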

In [36]:
# Split the training set into training and validation sets 
from sklearn.model_selection import train_test_split

x_train_new, x_test_new, y_train_new, y_test_new = train_test_split(
    x_train_n, 
    y_train_oh,
    test_size=0.2, # 20% of the training data is held out for validation
    random_state=42 # Providing a value here means getting the same "random" split every time
)

In [None]:
# Verify the data has been split correctly 
print(f'x_train shape: {x_train_new.shape}')
print(f'y_train shape: {y_train_new.shape}')
print(f'x_test shape: {x_test_new.shape}')
print(f'y_test shape: {y_test_new.shape}')

x_train shape: (40000, 32, 32, 3)
y_train shape: (40000, 100)
x_test shape: (10000, 32, 32, 3)
y_test shape: (10000, 100)
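
The split above is purely random, so the number of validation images per class can vary a little. If an exactly balanced validation set is wanted, a stratified split is one option (`train_test_split` also accepts a `stratify` argument for this). The sketch below shows the idea in plain NumPy on synthetic labels; `stratified_split_indices` is a hypothetical helper, not part of the assignment.

```python
import numpy as np

def stratified_split_indices(labels, val_fraction=0.2, seed=42):
    """Return (train_idx, val_idx) with the same class proportions in both parts."""
    rng = np.random.default_rng(seed)
    flat = np.asarray(labels).ravel()
    train_idx, val_idx = [], []
    for cls in np.unique(flat):
        members = np.flatnonzero(flat == cls)   # positions of this class
        rng.shuffle(members)                    # shuffle within the class
        n_val = int(round(len(members) * val_fraction))
        val_idx.extend(members[:n_val])
        train_idx.extend(members[n_val:])
    return np.array(train_idx), np.array(val_idx)

# Synthetic labels: 5 classes with 10 samples each (stand-in for y_train).
demo_labels = np.repeat(np.arange(5), 10)
train_idx, val_idx = stratified_split_indices(demo_labels)
print(len(train_idx), len(val_idx))        # 40 10
print(np.bincount(demo_labels[val_idx]))   # 2 validation samples per class
```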


## Task 2: Model Development (Complete or Incomplete)
### Task 2a: Create a Baseline CNN Model
- Design a CNN architecture. Your architecture should use convolutional layers, max pooling layers, and dense layers. You can use any number of layers, and you can experiment with different numbers of filters, filter sizes, strides, padding, etc. The design doesn't need to be perfect, but it should be unique to you.
- Print out the model summary.

In [45]:
# Create a CNN model 
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential()

# Add convolution layers 
model.add(Conv2D(32, (3,3), activation='relu', input_shape=(32, 32, 3)))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3,3), activation='relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.25))

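# Add fully connected layers
model.add(Flatten())
# Output layer with one unit per class. Note: it uses ReLU here; softmax is the usual
# choice with categorical cross-entropy, which expects a probability distribution.
model.add(Dense(100, activation='relu'))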

model.summary()
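
The summary output is not shown here, so as a cross-check the layer shapes and parameter counts of this baseline can be worked out by hand. The sketch below assumes the Keras defaults used above (no padding, convolution stride 1, pooling stride equal to the pool size); the helper names are hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of an unpadded ('valid') convolution or pooling layer."""
    return (size - kernel) // stride + 1

def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

size = 32
size = conv_out(size, 3)            # Conv2D(32, 3x3)  -> 30
size = conv_out(size, 2, stride=2)  # MaxPooling2D(2)  -> 15
size = conv_out(size, 3)            # Conv2D(64, 3x3)  -> 13
size = conv_out(size, 2, stride=2)  # MaxPooling2D(2)  -> 6
flat = size * size * 64             # Flatten          -> 2304

total = (
    conv_params(3, 3, 32)           # first conv layer:  896
    + conv_params(3, 32, 64)        # second conv layer: 18496
    + flat * 100 + 100              # Dense(100):        230500
)
print(size, flat, total)            # 6 2304 249892
```

Dropout and pooling layers add no trainable parameters. If `model.summary()` reports different numbers, the layer settings likely differ from the defaults assumed here.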

### Task 2b: Compile the model

- Select an appropriate loss function and optimizer for your model. These can be ones we have looked at already, or they can be different. 
- Briefly explain your choices (one or two sentences each).
- <b>Loss function:</b> ______
- <b>Optimizer:</b> ______

In [68]:
from tensorflow.keras.metrics import TopKCategoricalAccuracy

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy', TopKCategoricalAccuracy(k=5, name='top_5_accuracy')]
)

# Categorical cross-entropy was selected because this is a multi-class classification problem with one-hot encoded labels.
# The Adam optimizer was selected because it adapts the learning rate per parameter, which usually converges faster than plain SGD with little tuning.
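
To illustrate what the chosen loss measures, here is a small NumPy sketch of categorical cross-entropy on made-up predictions. It is an illustration of the formula, not the Keras implementation; the function name and the toy arrays are assumptions.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -log(probability assigned to the true class)."""
    # Clip to avoid log(0) when a probability is exactly zero.
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

y_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
confident = np.array([[0.05, 0.90, 0.05], [0.80, 0.10, 0.10]])
uniform = np.full((2, 3), 1.0 / 3.0)

loss_confident = categorical_crossentropy(y_true, confident)
loss_uniform = categorical_crossentropy(y_true, uniform)
print(loss_confident)   # small: the true classes get high probability
print(loss_uniform)     # log(3), about 1.0986
```

For 100 equally likely classes the same formula gives log(100), about 4.61, which is a useful reference point when reading the first-epoch loss.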

## Task 3: Model Training and Evaluation (Complete or Incomplete)
### Task 3a: Train the Model

- Train your model for an appropriate number of epochs. Explain your choice of the number of epochs used - you can change this number before submitting your assignment.
- Use a batch size of 32.
- Use the validation set for validation.
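
One hedged way to justify the number of epochs is to look at where the validation loss stops improving. The sketch below applies a simple patience rule to a list of per-epoch validation losses, such as `history.history['val_loss']` after training. The function name `best_epoch` is hypothetical and the loss values are invented for illustration.

```python
def best_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Scans the per-epoch validation losses and stops once `patience`
    epochs have passed without a new minimum.
    """
    best_loss = float("inf")
    best_at = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_at = loss, epoch
        elif epoch - best_at >= patience:
            return epoch, best_at
    return len(val_losses), best_at

# Hypothetical validation losses, not taken from the runs in this notebook.
demo_val_loss = [3.1, 2.7, 2.6, 2.55, 2.6, 2.7, 2.9, 3.2]
stop_epoch, best_at = best_epoch(demo_val_loss, patience=3)
print(stop_epoch, best_at)   # 7 4
```

Keras offers the same idea as a callback (`EarlyStopping` with `monitor`, `patience` and `restore_best_weights`), which can be passed to `model.fit` instead of fixing the epoch count by hand.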

In [70]:
history = model.fit(
    x_train_new, # Training data
    y_train_new, # Training labels
    epochs=30, # Number of epochs -- use a larger number due to the dataset being larger 
    batch_size=32, # Number of samples per batch
    validation_data=(x_test_new, y_test_new)
)

Epoch 1/30
1250/1250 ━━━━━━━━━━━━━━━━━━━━ 35s 28ms/step - accuracy: 0.1731 - loss: 3.4732 - top_5_accuracy: 0.4366 - val_accuracy: 0.2457 - val_loss: 3.0891 - val_top_5_accuracy: 0.5354
Epoch 2/30
1250/1250 ━━━━━━━━━━━━━━━━━━━━ 35s 28ms/step - accuracy: 0.2906 - loss: 2.8536 - top_5_accuracy: 0.5894 - val_accuracy: 0.3222 - val_loss: 2.7160 - val_top_5_accuracy: 0.6262
Epoch 3/30
1250/1250 ━━━━━━━━━━━━━━━━━━━━ 36s 29ms/step - accuracy: 0.3621 - loss: 2.5079 - top_5_accuracy: 0.6669 - val_accuracy: 0.3493 - val_loss: 2.5973 - val_top_5_accuracy: 0.6493
Epoch 4/30
1250/1250 ━━━━━━━━━━━━━━━━━━━━ 35s 28ms/step - accuracy: 0.4161 - loss: 2.2629 - top_5_accuracy: 0.7208 - val_accuracy: 0.3619 - val_loss: 2.5664 - val_top_5_accuracy: 0.6641
Epoch 5/30
1250/1250 ━━━━━━━━━━━━━━━━━━━━ 29s 24ms/step - accuracy: 0.4546 - loss: 2.0

### Task 3b: Accuracy and other relevant metrics on the test set

- Report the accuracy of your model on the test set.
- While accuracy is a good metric, there are many other ways to numerically evaluate a model. Report at least one other metric, and explain what it measures and how it is calculated.

- <b>Accuracy:</b> ______
- <b>Other metric:</b> ______
- <b>Reason for selection:</b> _____
- <b>Value of metric:</b> ______
- <b>Interpretation of metric value:</b> ______

In [71]:
# Note: x_test_new / y_test_new are the validation split created in Task 1b.
# The held-out CIFAR-100 test set is x_test_n / y_test_oh.
loss, accuracy, top_k_accuracy = model.evaluate(x_test_new, y_test_new)

print(f'Loss:     {loss:.2f}')
print(f'Accuracy: {accuracy*100:.2f}%')
print(f'Top-5 Accuracy: {top_k_accuracy*100:.2f}%')

313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.3584 - loss: 4.4786 - top_5_accuracy: 0.6393
Loss:     4.45
Accuracy: 35.72%
Top-5 Accuracy: 63.94%


In [None]:
# Top-k accuracy is the fraction of samples whose true class is among the k classes with the highest predicted probabilities. With k=5, a prediction counts as correct if the true class appears anywhere in the model's top five guesses.
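
As an illustration of how this metric is calculated, the following NumPy sketch computes top-k accuracy on a tiny made-up example. It is not the Keras `TopKCategoricalAccuracy` implementation; the function name and toy data are assumptions.

```python
import numpy as np

def top_k_accuracy(y_true_onehot, y_prob, k=5):
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    true_class = np.argmax(y_true_onehot, axis=1)
    # Indices of the k largest probabilities per row (order within the k is irrelevant).
    top_k = np.argpartition(y_prob, -k, axis=1)[:, -k:]
    hits = (top_k == true_class[:, None]).any(axis=1)
    return float(hits.mean())

y_true = np.eye(4)[[0, 1, 2, 3]]
y_prob = np.array([
    [0.70, 0.10, 0.10, 0.10],   # true class 0 ranked 1st
    [0.40, 0.30, 0.20, 0.10],   # true class 1 ranked 2nd
    [0.50, 0.30, 0.15, 0.05],   # true class 2 ranked 3rd
    [0.40, 0.30, 0.20, 0.10],   # true class 3 ranked 4th
])
print(top_k_accuracy(y_true, y_prob, k=1))  # 0.25
print(top_k_accuracy(y_true, y_prob, k=2))  # 0.5
```

With `k=1` the metric reduces to ordinary accuracy, and it can only stay the same or increase as `k` grows.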

### Task 3c: Visualize the model's learning

- Plot the training accuracy and validation accuracy with respect to epochs.
- Select an image that the model correctly classified in the test set, and an image that the model incorrectly classified in the test set. Plot the images and report the model's classification probabilities for each.
- Briefly discuss the results. What do the plots show? Do the results make sense? What do the classification probabilities indicate?

In [66]:
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 4))


<Figure size 1200x400 with 0 Axes>
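
The cell above only creates an empty figure. For the second bullet of this task, one possible approach is to compare the predicted and true classes and pick one index of each kind. The sketch below uses synthetic probabilities as a stand-in for the output of `model.predict` (an assumption); `pick_examples` is a hypothetical helper.

```python
import numpy as np

def pick_examples(y_true_onehot, y_prob):
    """Return (index of first correct prediction, index of first wrong one); None if absent."""
    true_class = np.argmax(y_true_onehot, axis=1)
    pred_class = np.argmax(y_prob, axis=1)
    correct = np.flatnonzero(pred_class == true_class)
    wrong = np.flatnonzero(pred_class != true_class)
    first_correct = int(correct[0]) if correct.size else None
    first_wrong = int(wrong[0]) if wrong.size else None
    return first_correct, first_wrong

# Synthetic probabilities for 3 samples over 3 classes.
y_true = np.eye(3)[[2, 0, 1]]
y_prob = np.array([
    [0.2, 0.5, 0.3],   # predicts class 1, truth is 2 -> wrong
    [0.6, 0.3, 0.1],   # predicts class 0, truth is 0 -> correct
    [0.1, 0.8, 0.1],   # predicts class 1, truth is 1 -> correct
])
ok, bad = pick_examples(y_true, y_prob)
print(ok, y_prob[ok].max())     # 1 0.6
print(bad, y_prob[bad].max())   # 0 0.5
```

The selected indices can then be used to display the images with `plt.imshow` and to report the corresponding rows of probabilities. For the first bullet, the per-epoch curves are available as `history.history['accuracy']` and `history.history['val_accuracy']`.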


## Task 4: Model Enhancement (Complete or Incomplete)
### Task 4a: Implementation of at least one advanced technique

- Now it's time to improve your model. Implement at least one technique to improve your model's performance. You can use any of the techniques we have covered in class, or you can use a technique that we haven't covered. If you need inspiration, you can refer to the [Keras documentation](https://keras.io/).
- Explain the technique you used and why you chose it.
- If you used a technique that requires tuning, explain how you selected the values for the hyperparameters.
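
As one example of a technique that could be tried here, data augmentation creates slightly altered copies of the training images so the model sees more variety. The sketch below shows random horizontal flips and small shifts in plain NumPy on synthetic images. It is an illustration only and is not the technique used in the cell that follows; `augment_batch` and its `max_shift` parameter are hypothetical.

```python
import numpy as np

def augment_batch(images, rng, max_shift=2):
    """Randomly flip each image left-right and shift it by up to max_shift pixels."""
    out = np.empty_like(images)
    for i, img in enumerate(images):
        if rng.random() < 0.5:
            img = img[:, ::-1, :]          # horizontal flip (width axis)
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        # np.roll wraps pixels around the border: a crude stand-in for padding and cropping.
        out[i] = np.roll(img, shift=(dy, dx), axis=(0, 1))
    return out

rng = np.random.default_rng(0)
batch = rng.random((8, 32, 32, 3)).astype("float32")   # synthetic images in [0, 1)
augmented = augment_batch(batch, rng)
print(augmented.shape, augmented.dtype)
```

Keras also provides preprocessing layers such as `RandomFlip` and `RandomTranslation` that apply comparable transformations inside the model during training.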

In [None]:
# Create a CNN model 
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model2 = Sequential()

# Add convolution layers 
model2.add(Conv2D(32, (3,3), activation='relu', input_shape=(32, 32, 3)))
model2.add(MaxPooling2D((2,2)))
model2.add(Dropout(0.25))

model2.add(Conv2D(64, (3,3), activation='relu'))
model2.add(MaxPooling2D((2,2)))
model2.add(Dropout(0.25))

# Add fully connected layers
model2.add(Flatten())
model2.add(Dense(512, activation='relu'))
model2.add(Dense(100, activation='softmax'))

model2.summary()



### Task 4b: Evaluation of the enhanced model

- Re-train your model using the same number of epochs as before.
- Compare the accuracy and other selected metric on the test set to the results you obtained before.
- As before, plot the training accuracy and validation accuracy with respect to epochs, and select an image that the model correctly classified in the test set, and an image that the model incorrectly classified in the test set. Plot the images and report the model's classification probabilities for each.

In [74]:
from tensorflow.keras.optimizers import Adam

model2.compile(
    optimizer=Adam(learning_rate=0.0001),
    loss='categorical_crossentropy',
    metrics=['accuracy', TopKCategoricalAccuracy(k=5, name='top_5_accuracy')]
)

history2 = model2.fit(
    x_train_new, # Training data
    y_train_new, # Training labels
    epochs=30, # Number of epochs -- use a larger number due to the dataset being larger 
    batch_size=32, # Number of samples per batch
    validation_data=(x_test_new, y_test_new)
)

loss, accuracy, top_k_accuracy = model2.evaluate(x_test_new, y_test_new)

print(f'Loss:     {loss:.2f}')
print(f'Accuracy: {accuracy*100:.2f}%')
print(f'Top-5 Accuracy: {top_k_accuracy*100:.2f}%') 

Epoch 1/30
1250/1250 ━━━━━━━━━━━━━━━━━━━━ 34s 25ms/step - accuracy: 0.0351 - loss: 4.4441 - top_5_accuracy: 0.1248 - val_accuracy: 0.1171 - val_loss: 3.9091 - val_top_5_accuracy: 0.3222
Epoch 2/30
1250/1250 ━━━━━━━━━━━━━━━━━━━━ 26s 21ms/step - accuracy: 0.1332 - loss: 3.7954 - top_5_accuracy: 0.3451 - val_accuracy: 0.1632 - val_loss: 3.6175 - val_top_5_accuracy: 0.4141
Epoch 3/30
1250/1250 ━━━━━━━━━━━━━━━━━━━━ 33s 26ms/step - accuracy: 0.1726 - loss: 3.5462 - top_5_accuracy: 0.4177 - val_accuracy: 0.1948 - val_loss: 3.4722 - val_top_5_accuracy: 0.4534
Epoch 4/30
1250/1250 ━━━━━━━━━━━━━━━━━━━━ 34s 27ms/step - accuracy: 0.1975 - loss: 3.4159 - top_5_accuracy: 0.4517 - val_accuracy: 0.2180 - val_loss: 3.3331 - val_top_5_accuracy: 0.4822
Epoch 5/30
1250/1250 ━━━━━━━━━━━━━━━━━━━━ 33s 26ms/step - accuracy: 0.2257 - loss: 3.2

### Task 4c: Discussion of the results

- Briefly discuss the results. 
- Did the model's performance improve? 
- Why do you think this is?
- Do you think there is room for further improvement? Why or why not?
- What other techniques might you try in the future?
- Your answer should be no more than 200 words.

My model's performance improved after the enhancements were made. I think it improved because increasing the number of neurons in the dense layer helps the model learn more complex patterns. In addition, adding a softmax activation to the output layer may have helped by turning the outputs into class probabilities, which is what the categorical cross-entropy loss expects. Based on the current accuracy level, I think there is still room for further improvement.
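
To illustrate the point about the output activation, the sketch below compares what a ReLU and a softmax produce for the same made-up scores. It is a NumPy illustration under assumed inputs, not a reproduction of either model.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

logits = np.array([[2.0, -1.0, 0.5]])    # hypothetical raw scores for 3 classes
relu_out = np.maximum(logits, 0.0)       # what a ReLU output layer would emit
probs = softmax(logits)

print(relu_out, relu_out.sum())          # unnormalised; the negative score becomes 0
print(probs, probs.sum())                # non-negative values that sum to 1
```

Softmax keeps the ranking of the scores while producing a valid probability distribution, whereas ReLU outputs can sum to any non-negative value.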

## Criteria

|Criteria|Complete|Incomplete|
|----|----|----|
|Task 1|The task has been completed successfully and there are no errors.|The task is still incomplete and there is at least one error.|
|Task 2|The task has been completed successfully and there are no errors.|The task is still incomplete and there is at least one error.|
|Task 3|The task has been completed successfully and there are no errors.|The task is still incomplete and there is at least one error.|
|Task 4|The task has been completed successfully and there are no errors.|The task is still incomplete and there is at least one error.|

## Submission Information

🚨 **Please review our [Assignment Submission Guide](https://github.com/UofT-DSI/onboarding/blob/main/onboarding_documents/submissions.md)** 🚨 for detailed instructions on how to format, branch, and submit your work. Following these guidelines is crucial for your submissions to be evaluated correctly.

### Submission Parameters:
* Submission Due Date: `HH:MM AM/PM - DD/MM/YYYY`
* The branch name for your repo should be: `assignment-1`
* What to submit for this assignment:
    * This Jupyter Notebook (assignment_1.ipynb) should be populated and should be the only change in your pull request.
* What the pull request link should look like for this assignment: `https://github.com/<your_github_username>/deep_learning/pull/<pr_id>`
    * Open a private window in your browser. Copy and paste the link to your pull request into the address bar. Make sure you can see your pull request properly. This helps the technical facilitator and learning support staff review your submission easily.

Checklist:
- [ ] Created a branch with the correct naming convention.
- [ ] Ensured that the repository is public.
- [ ] Reviewed the PR description guidelines and adhered to them.
- [ ] Verified that the link is accessible in a private browser window.

If you encounter any difficulties or have questions, please don't hesitate to reach out to our team via our Slack at `#cohort-3-help`. Our Technical Facilitators and Learning Support staff are here to help you navigate any challenges.