---
---
# Convolutional Neural Network
---
---

In this notebook, we explore and compare architectural approaches for image processing tasks, focusing on traditional fully connected networks versus Convolutional Neural Networks (CNNs). We'll work with multiple image datasets to gain hands-on experience and see how these models perform in practice.

## Library Imports

In [None]:
#import tensorflow and keras
import tensorflow as tf
from keras import layers, models, datasets

#import numpy, pandas and matplotlib for data handling and accuracy visualization
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## Loading Data

### Fashion MNIST
Fashion MNIST is a modern alternative to the traditional MNIST dataset, a staple in the machine learning community for handwritten digit recognition. Developed as a more challenging benchmark for machine learning algorithms, it comprises grayscale images of fashion products from 10 categories.

### Key Features of Fashion MNIST:

- **Dataset Size:** Fashion MNIST consists of a total of 70,000 images, divided into a training set of 60,000 images and a test set of 10,000 images.
- **Image Details:** Each image in the dataset is 28x28 pixels in size, represented in grayscale (1 channel).
- **Categories:** The dataset includes 10 types of fashion items:
>- T-shirt/top
>- Trouser
>- Pullover
>- Dress
>- Coat
>- Sandal
>- Shirt
>- Sneaker
>- Bag
>- Ankle boot
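
The labels stored in the dataset are integers from 0 to 9. A minimal sketch of a lookup from label to category name, assuming the categories are indexed in the order listed above (the names `CLASS_NAMES` and `label_to_name` are illustrative and not part of Keras):

```python
# Category names indexed by integer label (order assumed to match the list above)
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(label):
    """Return the category name for an integer label in the range 0-9."""
    return CLASS_NAMES[int(label)]

print(label_to_name(0))
print(label_to_name(9))
```

A lookup like this is handy when titling plotted images or reading predictions.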

### Purpose and Applications:

Fashion MNIST is specifically designed to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits but involves a different, arguably more complex, set of images.



In [None]:
# Loading the Dataset from Keras
fashion_mnist = datasets.fashion_mnist
(X_train_full, y_train_full), (X_test, y_test) = fashion_mnist.load_data()

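#Splitting the training dataset into validation (first 4,000 images) and training sets, scaling pixels to [0, 1]
#Note: X_test is not scaled here, so later test metrics are computed on raw 0-255 pixel values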
X_valid, X_train = X_train_full[:4000] / 255.0, X_train_full[4000:] / 255.0
y_valid, y_train = y_train_full[:4000], y_train_full[4000:]
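
The cell above scales only the training and validation images, while `X_test` keeps its raw 0-255 values. Below is a minimal sketch of applying the same scaling to all three splits. It uses small synthetic arrays instead of the real download, and the helper `split_and_scale` and the `_demo` variable names are hypothetical:

```python
import numpy as np

# Synthetic stand-in for the real arrays (random uint8 pixels), so no download is needed
rng = np.random.default_rng(0)
X_train_full_demo = rng.integers(0, 256, size=(600, 28, 28), dtype=np.uint8)
X_test_demo = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)

def split_and_scale(X_full, X_test, n_valid):
    """Hold out the first n_valid images for validation and scale every split to [0, 1]."""
    X_valid = X_full[:n_valid] / 255.0
    X_train = X_full[n_valid:] / 255.0
    X_test_scaled = X_test / 255.0  # same scaling as the training data
    return X_train, X_valid, X_test_scaled

X_train_demo, X_valid_demo, X_test_scaled_demo = split_and_scale(
    X_train_full_demo, X_test_demo, n_valid=40
)
print(X_train_demo.shape, X_valid_demo.shape, X_test_scaled_demo.shape)
print(X_test_scaled_demo.min() >= 0.0, X_test_scaled_demo.max() <= 1.0)
```

Feeding the model test inputs on the same scale it was trained on should make the reported test loss comparable to the validation loss.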

## Modeling

Now let's try a traditional fully connected (dense) neural network.

In [None]:
model = models.Sequential([
                          layers.Flatten(input_shape=[28, 28]),
                          layers.Dense(16, activation="relu"),
                          layers.Dense(8, activation="relu"),
                          layers.Dense(10, activation="softmax")
                          ])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.summary()

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten_5 (Flatten)         (None, 784)               0         
                                                                 
 dense_15 (Dense)            (None, 16)                12560     
                                                                 
 dense_16 (Dense)            (None, 8)                 136       
                                                                 
 dense_17 (Dense)            (None, 10)                90        
                                                                 
Total params: 12786 (49.95 KB)
Trainable params: 12786 (49.95 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
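
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. A small sketch of that arithmetic (the helper `dense_params` is illustrative, not a Keras function):

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

layer_sizes = [28 * 28, 16, 8, 10]  # flattened input, then the three Dense layers
counts = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
print(counts)       # [12560, 136, 90]
print(sum(counts))  # 12786
```

Almost all of the parameters sit in the first layer, because every one of the 784 pixels connects to every unit.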


In [None]:
#Fit or Train the model
history = model.fit(X_train, y_train, batch_size=1000, epochs=10, validation_data=(X_valid, y_valid))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
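
With `batch_size=1000`, each epoch performs relatively few weight updates. A quick sketch of the arithmetic, assuming the 56,000 training images left after the validation split. The helper `steps_per_epoch` is hypothetical, and 32 is used for comparison as the Keras default batch size:

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of gradient updates in one pass over the training data."""
    return math.ceil(n_samples / batch_size)

n_train = 60_000 - 4_000  # training images left after the validation hold-out
print(steps_per_epoch(n_train, 1000))  # 56
print(steps_per_epoch(n_train, 32))    # 1750
```

Large batches make each epoch fast but give the optimizer fewer updates, so a model may need more epochs to reach the same accuracy.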


## Model Evaluation
Once the model reaches satisfactory validation accuracy, evaluate it on the test set to estimate the generalization error before deploying it to production. The `evaluate()` method does this and accepts arguments such as `batch_size` and `sample_weight`; see the Keras documentation for details.

In [None]:
#Model evaluating
model.evaluate(X_test, y_test)



[68.99422454833984, 0.8152999877929688]
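
`evaluate()` returns the loss followed by each compiled metric, so the two numbers above are the test loss and the test accuracy. Here is a rough NumPy sketch of how these two quantities are defined for predicted class probabilities. It is an illustration, not Keras's exact implementation; the function name and the clipping constant `1e-7` are assumptions:

```python
import numpy as np

def sparse_cce_and_accuracy(probs, labels):
    """Mean sparse categorical cross-entropy and accuracy for predicted probabilities.

    probs:  array of shape (n_samples, n_classes), each row summing to 1
    labels: integer class labels of shape (n_samples,)
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Probability assigned to the true class of each sample
    p_true = probs[np.arange(len(labels)), labels]
    # Clip to avoid log(0); the epsilon value is an assumption
    loss = -np.mean(np.log(np.clip(p_true, 1e-7, 1.0)))
    accuracy = np.mean(np.argmax(probs, axis=1) == labels)
    return loss, accuracy

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
labels = np.array([0, 1, 2])
loss, acc = sparse_cce_and_accuracy(probs, labels)
print(round(loss, 4), round(acc, 4))
```

The loss penalizes confident wrong predictions heavily, so it can be large even when accuracy looks reasonable.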

**This looks relatively good (about 81.5% test accuracy), but let's try a more advanced approach.**



## Introducing Convolutional Neural Networks (CNNs)

Convolutional Neural Networks (CNNs) are a class of deep neural networks that are especially effective for analyzing visual imagery. They have become a cornerstone of computer vision, excelling at image and video recognition, image classification, and applications such as medical image analysis and autonomous driving.
![CNN](https://saturncloud.io/images/blog/a-cnn-sequence-to-classify-handwritten-digits.webp)
### Key Components of CNNs:

1. **Convolutional Layers:**
   - These are the core building blocks of a CNN. The convolutional layer applies a number of filters to the input. Each filter detects different features such as edges, colors, or more complex shapes. The output of this layer is called a feature map, which highlights the areas of the input image most activated by the filter.

2. **ReLU Layer (Activation):**
   - After each convolution operation, an activation function such as the Rectified Linear Unit (ReLU) is applied to introduce non-linearity into the model. Non-linearity is crucial as it helps the network learn more complex patterns in the data.

3. **Pooling Layers:**
   - Pooling (also known as subsampling or downsampling) reduces the spatial dimensions of each feature map while retaining the most important information. Max pooling, a common choice, keeps only the maximum value within each small window (for example, 2x2) of the previous layer's output, shrinking the input to the next layer.

4. **Fully Connected Layers:**
   - After several convolutional and pooling layers, the high-level reasoning in the neural network is done via fully connected layers. Neurons in a fully connected layer have full connections to all activations in the previous layer. This part of the network is typically responsible for assembling the features extracted by the convolutional layers and pooling layers to form the final outputs.

5. **Output Layer:**
   - The final layer uses an activation function such as softmax (for classification tasks) to map the output of the last fully connected layer to probability distributions over classes.
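
The components above can be sketched in plain NumPy for a single grayscale image and a single filter. This is a simplified illustration, not how Keras implements these layers: the function names are hypothetical, the filter is random rather than learned, and the convolution uses no padding and a stride of 1.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Replace negative values with zero."""
    return np.maximum(x, 0.0)

def max_pool2d(x, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]  # drop leftover rows/columns
    return x.reshape(h, size, w, size).max(axis=(1, 3))

def softmax(z):
    """Turn a vector of scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
image = rng.random((28, 28))          # stand-in for one grayscale image
kernel = rng.standard_normal((4, 4))  # one 4x4 filter (random here; learned in a real CNN)

feature_map = relu(conv2d_valid(image, kernel))
pooled = max_pool2d(feature_map, size=2)
print(feature_map.shape, pooled.shape)  # (25, 25) (12, 12)
```

A real convolutional layer applies many such filters in parallel and stacks the resulting feature maps along a channel axis.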

In [None]:
model = models.Sequential([
                          layers.Conv2D(32, kernel_size=(4, 4), activation='relu', input_shape=(28, 28,1)),
                          layers.MaxPool2D(2,2),
                          layers.Flatten(),
                          layers.Dense(16, activation="relu"),
                          layers.Dense(8, activation="relu"),
                          layers.Dense(10, activation="softmax")
                          ])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.summary()

Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_3 (Conv2D)           (None, 25, 25, 32)        544       
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 12, 12, 32)        0         
 g2D)                                                            
                                                                 
 flatten_6 (Flatten)         (None, 4608)              0         
                                                                 
 dense_18 (Dense)            (None, 16)                73744     
                                                                 
 dense_19 (Dense)            (None, 8)                 136       
                                                                 
 dense_20 (Dense)            (None, 10)                90        
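
The shapes and parameter counts in this summary follow from simple arithmetic. A short sketch, assuming no padding and a stride of 1 for the convolution and non-overlapping 2x2 pooling (the helper names are illustrative):

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial output size of a convolution without padding ('valid')."""
    return (size - kernel) // stride + 1

def pool_output_size(size, pool):
    """Spatial output size of non-overlapping pooling; leftover rows/columns are dropped."""
    return size // pool

def conv2d_params(kernel, in_channels, filters):
    """One kernel x kernel x in_channels weight block plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

conv = conv_output_size(28, 4)        # 25
pooled = pool_output_size(conv, 2)    # 12
flat = pooled * pooled * 32           # 4608
print(conv, pooled, flat)
print(conv2d_params(4, 1, 32))        # 544
print(flat * 16 + 16)                 # 73744
```

The convolutional layer itself is tiny (544 parameters); most of the parameters again sit in the first Dense layer after flattening.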
                                                      

### How CNNs Process an Image:
![CNN process](https://i.stack.imgur.com/bN2iA.png)
- **Input:** The process begins with an input image, which is passed through a series of convolutional, nonlinear, and pooling layers.
- **Feature Learning:** Through the convolutional layers, the network learns to identify various features of the image. Early layers might detect simple features like edges and textures, while deeper layers can identify more complex features like patterns or objects.
- **Classification:** After feature extraction, the network uses fully connected layers to determine the content of the image based on the presence of learned features, culminating in a classification decision.
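
`Conv2D` layers expect each input image as (height, width, channels), which is why `input_shape` is `(28, 28, 1)` above even though the loaded grayscale arrays have shape `(28, 28)` per image. Depending on the Keras version, the missing channel axis may need to be added explicitly. A minimal sketch with a stand-in array (the variable names are illustrative):

```python
import numpy as np

# Stand-in batch of five 28x28 grayscale images (zeros instead of real data)
batch = np.zeros((5, 28, 28), dtype=np.float32)

# Append a channel axis so each image has shape (28, 28, 1)
batch_with_channel = batch[..., np.newaxis]
print(batch.shape, "->", batch_with_channel.shape)
```

Color images such as CIFAR already carry a channel axis of size 3, so no reshaping is needed there.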

In [None]:
# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, batch_size=2000, epochs=10, validation_data=(X_valid, y_valid))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7bada1a4fee0>

## Model Evaluation
As before, we evaluate the trained CNN on the held-out test set with `evaluate()`.

In [None]:
#Model evaluating
model.evaluate(X_test, y_test)



[58.213592529296875, 0.8378000259399414]

### Advantages of CNNs:

- **Automatic Feature Extraction:** Unlike traditional approaches that rely on hand-crafted features, CNNs learn their feature detectors directly from the data, which makes them effective for tasks involving complex visual inputs.
- **Spatial Hierarchies:** Stacked convolutional layers learn spatial hierarchies of features, which gives CNNs some robustness to variations in an object's position and appearance.
- **Efficiency:** Once trained, CNNs can make predictions rapidly, making them suitable for applications requiring real-time processing.

CNNs are foundational to modern image recognition systems and continue to push the boundaries in fields reliant on image understanding.

# Now let's try a more challenging problem

## Loading Data

The CNN approach above truly shines on larger images, and especially on multi-channel (color) images.

**CIFAR-100** is a well-known dataset used in machine learning and computer vision for evaluating image recognition algorithms. It is more complex and diverse than its counterpart, CIFAR-10, primarily because of its larger number of classes. In this notebook we load it with the 20 coarse (superclass) labels.

![Cifar100](https://datasets.activeloop.ai/wp-content/uploads/2022/09/CIFAR-100-dataset-Activeloop-Platform-visualization-image.webp)

>- CIFAR-100 contains 60,000 32x32 color images.
>- The images are divided into 100 classes, each containing 600 images.
>- The 100 classes are grouped into 20 superclasses.
>- Each superclass encompasses several classes that are more specific; for example, the "aquatic mammals" superclass includes classes like "beaver", "dolphin", and "otter".
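
The figures above imply a few sizes that are useful to keep in mind. A quick arithmetic sketch, assuming the 50,000-image training set that Keras returns (its shape is shown in the output below) and the 4,000-image validation hold-out used in this notebook:

```python
n_classes, images_per_class = 100, 600
n_superclasses = 20

total_images = n_classes * images_per_class             # 60,000
classes_per_superclass = n_classes // n_superclasses    # 5
images_per_superclass = total_images // n_superclasses  # 3,000

n_train_full = 50_000                  # training images in the Keras split
n_test = total_images - n_train_full   # 10,000
n_train = n_train_full - 4_000         # 46,000 after the validation hold-out

print(total_images, classes_per_superclass, images_per_superclass)
print(n_train, n_test)
```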

In [None]:
# Loading the Dataset from Keras
cifar = datasets.cifar100
(X_train_full, y_train_full), (X_test, y_test) = cifar.load_data(label_mode='coarse')

#Splitting the training dataset into train and validation sets (as with Fashion MNIST, X_test is left unscaled)
X_valid, X_train = X_train_full[:4000] / 255.0, X_train_full[4000:] / 255.0
y_valid, y_train = y_train_full[:4000], y_train_full[4000:]

In [None]:
X_train_full.shape

(50000, 32, 32, 3)

In [None]:
np.unique(y_train)

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19])

## Modeling

In [None]:
model = models.Sequential([
                          layers.Flatten(input_shape=[32, 32, 3]),
                          layers.Dense(16, activation="relu"),
                          layers.Dense(8, activation="relu"),
                          layers.Dense(20, activation="softmax")
                          ])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.summary()

Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten_8 (Flatten)         (None, 3072)              0         
                                                                 
 dense_24 (Dense)            (None, 16)                49168     
                                                                 
 dense_25 (Dense)            (None, 8)                 136       
                                                                 
 dense_26 (Dense)            (None, 20)                180       
                                                                 
Total params: 49484 (193.30 KB)
Trainable params: 49484 (193.30 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [None]:
#Fit or Train the model
history = model.fit(X_train, y_train, batch_size=1000, epochs=10, validation_data=(X_valid, y_valid))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


## Model Evaluation

In [None]:
#Model evaluating
model.evaluate(X_test, y_test)



[100.51434326171875, 0.10509999841451645]

As observed, the traditional approach does not work well on complex images: it reaches only about 10.5% test accuracy across 20 classes. Now let's try the CNN approach.

In [None]:
model = models.Sequential([
                          layers.Conv2D(32, kernel_size=(4, 4), activation='relu', input_shape=(32, 32,3)),
                          layers.MaxPool2D(2,2),
                          layers.Flatten(),
                          layers.Dense(16, activation="relu"),
                          layers.Dense(8, activation="relu"),
                          layers.Dense(20, activation="softmax")
                          ])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.summary()

Model: "sequential_10"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_5 (Conv2D)           (None, 29, 29, 32)        1568      
                                                                 
 max_pooling2d_5 (MaxPoolin  (None, 14, 14, 32)        0         
 g2D)                                                            
                                                                 
 flatten_10 (Flatten)        (None, 6272)              0         
                                                                 
 dense_30 (Dense)            (None, 16)                100368    
                                                                 
 dense_31 (Dense)            (None, 8)                 136       
                                                                 
 dense_32 (Dense)            (None, 20)                180       
                                                     

In [None]:
# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(X_train, y_train, batch_size=2000, epochs=10, validation_data=(X_valid, y_valid))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7bada0368a30>

In [None]:
#Model evaluating
model.evaluate(X_test, y_test)



[280.3564453125, 0.18209999799728394]

As you can see, test accuracy improved with the CNN approach, from about 10.5% to about 18.2%.
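
To put the four test accuracies side by side, the sketch below collects the values reported by `evaluate()` in this notebook (rounded to four decimals) and compares each gain with the accuracy of uniform random guessing. The dictionary layout and the helper `summarize` are illustrative:

```python
# Test accuracies as reported by evaluate() in this notebook (rounded to 4 decimals)
results = {
    "Fashion MNIST": {"dense": 0.8153, "cnn": 0.8378, "n_classes": 10},
    "CIFAR-100 (coarse)": {"dense": 0.1051, "cnn": 0.1821, "n_classes": 20},
}

def summarize(results):
    """Return, per dataset, the CNN's absolute accuracy gain and the chance-level accuracy."""
    summary = {}
    for name, r in results.items():
        summary[name] = {
            "gain": round(r["cnn"] - r["dense"], 4),
            "chance": 1 / r["n_classes"],
        }
    return summary

for name, row in summarize(results).items():
    print(f"{name}: gain={row['gain']:.4f}, chance={row['chance']:.2f}")
```

The gain is larger on the color dataset, although both CIFAR models remain far from strong in absolute terms.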

It's also worth noting that this is a very simple CNN design; real-world networks are much deeper and more complex, and they perform much better.