---
---
# Advanced Convolutional Neural Network
---
---

In this notebook, we will delve into advanced Convolutional Neural Network (CNN) architectures for complex image classification tasks. Building upon our previous experience with simpler datasets like Fashion-MNIST, we'll now tackle the more challenging CIFAR-100 dataset, which consists of 100 classes of various objects and animals.

## Libraries Imports

In [None]:
#import tensorflow and keras
import tensorflow as tf
from keras import layers, models, datasets

# import NumPy, pandas, and matplotlib for array handling and accuracy visualization
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## Loading Data

CNNs truly shine on large images, and especially on multi-channel images such as RGB photos.

**The CIFAR-100** dataset is a well-known dataset used in machine learning and computer vision for evaluating image recognition algorithms. It is a more complex and diverse dataset compared to its counterpart, CIFAR-10, primarily due to the larger number of classes.

![Cifar100](https://datasets.activeloop.ai/wp-content/uploads/2022/09/CIFAR-100-dataset-Activeloop-Platform-visualization-image.webp)

>- CIFAR-100 contains 60,000 32x32 color images.
>- The images are divided into 100 classes, each containing 600 images.
>- The 100 classes are grouped into 20 superclasses.
>- Each superclass encompasses several classes that are more specific; for example, the "aquatic mammals" superclass includes classes like "beaver", "dolphin", and "otter".
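
The figures in the list above fit together as simple arithmetic. The short sketch below works them out; the 500/100 per-class train/test split is the standard CIFAR-100 split, and the variable names are only illustrative.

```python
# Dataset bookkeeping for CIFAR-100, using the figures listed above.
num_classes = 100
images_per_class = 600
num_superclasses = 20

total_images = num_classes * images_per_class                      # 60,000
classes_per_superclass = num_classes // num_superclasses           # 5
images_per_superclass = classes_per_superclass * images_per_class  # 3,000

# Standard split: 500 training and 100 test images per class.
train_images = num_classes * 500   # 50,000
test_images = num_classes * 100    # 10,000

print(total_images, classes_per_superclass, images_per_superclass)
print(train_images, test_images)
```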

In [None]:
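# Loading the dataset from Keras
# label_mode='coarse' returns the 20 superclass labels instead of the 100 fine-grained classes
cifar = datasets.cifar100
(X_train_full, y_train_full), (X_test, y_test) = cifar.load_data(label_mode='coarse')

# Splitting the training dataset into train and validation sets, scaling pixels to [0, 1]
X_valid, X_train = X_train_full[:4000] / 255.0, X_train_full[4000:] / 255.0
y_valid, y_train = y_train_full[:4000], y_train_full[4000:]

# Scale the test set the same way so evaluation sees the same input range as training
X_test = X_test / 255.0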
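
To see what the split and scaling do without downloading anything, here is a minimal sketch on a small synthetic array. The random images, the tiny sample counts, and the names `n_valid` and `X_test_scaled` are assumptions made for this illustration only. The point it shows is that every split should go through the same scaling, so the model sees inputs in the same range at training and evaluation time.

```python
import numpy as np

# Synthetic stand-in for CIFAR-style uint8 images (10 "train" and 4 "test" samples);
# the real arrays come from cifar.load_data().
rng = np.random.default_rng(0)
X_train_full = rng.integers(0, 256, size=(10, 32, 32, 3), dtype=np.uint8)
X_test = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)

n_valid = 3  # hypothetical validation size for this toy example

# Scale every split with the same factor so all inputs lie in [0, 1].
X_valid = X_train_full[:n_valid] / 255.0
X_train = X_train_full[n_valid:] / 255.0
X_test_scaled = X_test / 255.0

print(X_train.shape, X_valid.shape, X_test_scaled.shape)
print(X_train.dtype, float(X_test_scaled.min()), float(X_test_scaled.max()))
```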

In [None]:
X_train_full.shape

(50000, 32, 32, 3)

In [None]:
np.unique(y_train)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])
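
The 20 distinct values above are what fix the size of the final softmax layer. A small sketch of that check follows, using synthetic labels in place of the real `y_train` (Keras returns the labels with shape `(n, 1)`, which the stand-in mimics); the name `num_outputs` is illustrative.

```python
import numpy as np

# Synthetic stand-in for coarse labels: 20 classes, 5 samples each, shape (n, 1).
y_train = np.repeat(np.arange(20), 5).reshape(-1, 1)

# Distinct labels and how often each one occurs.
classes, counts = np.unique(y_train, return_counts=True)
num_outputs = len(classes)  # number of units needed in the final softmax layer

print(num_outputs)
print(counts.min(), counts.max())
```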

## Modeling

In [None]:
model = models.Sequential([
                          layers.Flatten(input_shape=[32, 32, 3]),
                          layers.Dense(16, activation="relu"),
                          layers.Dense(8, activation="relu"),
                          layers.Dense(20, activation="softmax")
                          ])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.summary()

Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten_8 (Flatten)         (None, 3072)              0         
                                                                 
 dense_24 (Dense)            (None, 16)                49168     
                                                                 
 dense_25 (Dense)            (None, 8)                 136       
                                                                 
 dense_26 (Dense)            (None, 20)                180       
                                                                 
Total params: 49484 (193.30 KB)
Trainable params: 49484 (193.30 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
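
The parameter counts in the summary can be reproduced by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a name introduced for this sketch, not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

flat = 32 * 32 * 3                 # Flatten output: 3072 values per image
layer_params = [
    dense_params(flat, 16),        # 49168
    dense_params(16, 8),           # 136
    dense_params(8, 20),           # 180
]
total = sum(layer_params)          # 49484

print(flat, layer_params, total)
```

Almost all of the parameters sit in the first dense layer, because flattening discards the spatial structure and connects every pixel value to every unit.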


In [None]:
#Fit or Train the model
history = model.fit(X_train, y_train, batch_size=1000, epochs=10, validation_data=(X_valid, y_valid))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
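
With batches this large, each epoch contains only a few dozen weight updates. The sketch below counts them for the batch size used here (1000) and for the 2000 used by the CNN cell later in this notebook; the helper `steps_per_epoch` is a name introduced for this example.

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of gradient updates in one pass over the data."""
    return math.ceil(n_samples / batch_size)

n_train = 50_000 - 4_000  # 46,000 samples left after the 4,000-image validation split
epochs = 10

for batch_size in (1000, 2000):
    steps = steps_per_epoch(n_train, batch_size)
    print(f"batch_size={batch_size}: {steps} steps/epoch, {steps * epochs} updates in total")
```

A few hundred updates is a small training budget, so a smaller batch size or more epochs may be worth trying when tuning; whether it helps here is something to verify experimentally.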


## Model Evaluation

In [None]:
# Evaluate the model on the test set
model.evaluate(X_test, y_test)



[100.51434326171875, 0.10509999841451645]

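As observed, the fully connected baseline reaches only about 10.5% test accuracy across the 20 superclasses, so this traditional approach does not work well with complex images. Now let's try the CNN approach.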
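
As a rough sanity check on the numbers returned by `evaluate`, it helps to know what a model that guesses uniformly would score. The sketch below computes that reference point for 20 classes, assuming the classes are balanced.

```python
import math

num_classes = 20  # coarse CIFAR-100 labels

# A classifier that assigns probability 1/20 to every class:
chance_accuracy = 1 / num_classes      # 0.05, assuming balanced classes
chance_loss = math.log(num_classes)    # cross-entropy of a uniform prediction

print(chance_accuracy, round(chance_loss, 3))
```

A test loss far above this value, such as the roughly 100 recorded above, means the model is confidently wrong on many samples. One common cause worth checking is that the evaluation inputs are on a different scale from the training inputs, for example raw 0-255 pixels versus pixels divided by 255.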

In [None]:
model = models.Sequential([
                          layers.Conv2D(32, kernel_size=(4, 4), activation='relu', input_shape=(32, 32,3)),
                          layers.MaxPool2D(2,2),
                          layers.Flatten(),
                          layers.Dense(16, activation="relu"),
                          layers.Dense(8, activation="relu"),
                          layers.Dense(20, activation="softmax")
                          ])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.summary()

Model: "sequential_10"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_5 (Conv2D)           (None, 29, 29, 32)        1568      
                                                                 
 max_pooling2d_5 (MaxPoolin  (None, 14, 14, 32)        0         
 g2D)                                                            
                                                                 
 flatten_10 (Flatten)        (None, 6272)              0         
                                                                 
 dense_30 (Dense)            (None, 16)                100368    
                                                                 
 dense_31 (Dense)            (None, 8)                 136       
                                                                 
 dense_32 (Dense)            (None, 20)                180       
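
The shapes and parameter counts in this summary follow from the usual formulas for an unpadded convolution and a pooling layer. The helpers `conv_output_size` and `pool_output_size` are names introduced for this sketch.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_output_size(size, pool, stride=None):
    """Spatial output size of a pooling layer; stride defaults to the pool size."""
    stride = stride or pool
    return (size - pool) // stride + 1

conv_hw = conv_output_size(32, 4)        # 29: a 4x4 kernel without padding
pool_hw = pool_output_size(conv_hw, 2)   # 14: 2x2 max pooling
flat = pool_hw * pool_hw * 32            # 6272 values after Flatten

conv_params = 4 * 4 * 3 * 32 + 32        # 1568: kernel weights plus one bias per filter
first_dense_params = flat * 16 + 16      # 100368

print(conv_hw, pool_hw, flat, conv_params, first_dense_params)
```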
                                                     

In [None]:
# Train the model
model.fit(X_train, y_train, batch_size=2000, epochs=10, validation_data=(X_valid, y_valid))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7bada0368a30>

In [None]:
# Evaluate the model on the test set
model.evaluate(X_test, y_test)



[280.3564453125, 0.18209999799728394]

As you can see, test accuracy improved from about 10.5% to about 18.2% with the CNN approach.

It's also worth noting that this is a simple CNN design; in practice, neural networks are much more complex and perform much better.
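
To give an idea of what "more complex" can look like, here is a minimal sketch of a somewhat deeper CNN for the same 32x32x3 inputs and 20 output classes. The function `build_deeper_cnn`, the filter counts, and the dropout rate are illustrative assumptions, not a tuned architecture, and no accuracy is claimed for it.

```python
from keras import layers, models

def build_deeper_cnn(num_classes=20, input_shape=(32, 32, 3)):
    """Hypothetical deeper CNN; layer sizes are illustrative, not tuned."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        # Block 1: two 3x3 convolutions, then halve the spatial resolution.
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(2),
        # Block 2: more filters on the smaller feature maps.
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(2),
        # Classifier head with dropout for regularization.
        layers.Flatten(),
        layers.Dropout(0.3),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])

deeper_model = build_deeper_cnn()
deeper_model.compile(optimizer="adam",
                     loss="sparse_categorical_crossentropy",
                     metrics=["accuracy"])
deeper_model.summary()
```

It can be trained with the same `fit` call as the models above; stacking small 3x3 convolutions before each pooling step is a common design choice, but how much it helps on this data has to be measured.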

### Challenge 1: Optimize the above CNN for the CIFAR-10 dataset.

1. How does your optimized model compare to the baseline in terms of accuracy and F1-score?
2. Which techniques had the most significant impact on model performance?
3. How might you further improve the model's performance?

Remember to document your experimental process, including any architectures or techniques you tried that didn't work well. This is an important part of the machine learning process!
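
The models in this notebook only report accuracy, so the F1-score asked for above has to be computed separately. Below is a minimal NumPy sketch of a macro-averaged F1 (the unweighted mean of per-class F1); the function name `macro_f1` and the choice of macro averaging are assumptions, since the challenge does not say which averaging to use. Classes that never appear in either the labels or the predictions get an F1 of 0 here.

```python
import numpy as np

def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    y_true = np.asarray(y_true).ravel()  # Keras labels have shape (n, 1)
    y_pred = np.asarray(y_pred).ravel()
    f1_scores = []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # true positives
        fp = np.sum((y_pred == c) & (y_true != c))  # false positives
        fn = np.sum((y_pred != c) & (y_true == c))  # false negatives
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(f1_scores))

# Toy example with three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
print(round(macro_f1(y_true, y_pred, num_classes=3), 4))
```

For a trained model, the predicted labels can be obtained with `model.predict(X_test).argmax(axis=1)` and passed in together with `y_test`.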

### Challenge 2 (Bonus)

Can you achieve an F1-score of over 0.85 on the test set? Share your best model architecture and training process with your classmates!