
Importing libraries

ResNet (Residual Networks): Introduces residual connections that allow gradients to flow more easily through deeper networks, making it easier to train very deep models.

Inception modules: Combine multiple convolution operations with different filter sizes in parallel, allowing the network to learn multi-scale features.

DenseNet (Densely Connected Networks): Connects each layer to all subsequent layers in a feed-forward fashion, so every layer receives the feature maps of all preceding layers. This improves feature reuse and gradient flow during training.

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

#ResNet block, Inception module, and DenseNet block functions
#(Same as in the previous example)

...

Ellipsis

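The ellipsis (...) is a placeholder indicating that the functions for these modules or blocks are not shown in this cell. Because ... is a valid Python expression, the cell evaluates it and displays Ellipsis as its output. The functions themselves define the specific layers, activations, and connections for each of these architectures.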

ResNet Block

In [None]:
#ResNet block
def resnet_block(inputs, filters, kernel_size=3, strides=1, conv_shortcut=False):
    x = layers.Conv2D(filters, kernel_size, strides=strides, padding='same')(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(filters, kernel_size, padding='same')(x)
    x = layers.BatchNormalization()(x)

    if conv_shortcut:
        shortcut = layers.Conv2D(filters, 1, strides=strides)(inputs)
        shortcut = layers.BatchNormalization()(shortcut)
    else:
        shortcut = inputs
    x = layers.add([x, shortcut])
    x = layers.ReLU()(x)
    return x


This defines a ResNet block, which is a key building block in ResNet architectures. Here's how it works:

Input and Initial Convolution: The function takes an input tensor (inputs) and applies a 2D convolution (Conv2D) with the specified number of filters, kernel size, and strides. Batch normalization is applied after each convolution to help with training stability, followed by a ReLU activation to introduce non-linearity.

Second Convolution: Another convolutional layer is applied to further process the features, followed by batch normalization.

Shortcut Connection: The ResNet block uses a "shortcut" connection to bypass the two convolutions. This mitigates the vanishing gradient problem and enables the training of very deep networks. If conv_shortcut is True, a 1x1 convolution with the same strides, followed by batch normalization, is applied to the input so that the shortcut matches the main path in both number of filters and spatial dimensions. Otherwise, the shortcut is just the original input, which only works when filters equals the number of input channels and strides is 1.

Adding the Shortcut: The output of the two convolutions (x) is added to the shortcut connection (either the transformed input or the original input), creating the residual connection that allows gradients to flow more easily during training.

Final ReLU Activation: The block ends with a ReLU activation, which applies non-linearity to the result of the added residual connection.

This block is designed to allow information to "skip" layers, thus improving gradient flow and enabling the network to train efficiently even with many layers.
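The shape requirement behind the shortcut can be checked with plain arithmetic. The sketch below is illustrative only: same_padding_size and residual_shapes are hypothetical helpers, not part of Keras, and they assume the Keras rule that a layer with padding='same' produces ceil(size / stride) outputs per spatial dimension.

```python
import math

def same_padding_size(size, stride):
    """Spatial size after a conv with padding='same': ceil(size / stride)."""
    return math.ceil(size / stride)

def residual_shapes(input_shape, filters, strides=1, conv_shortcut=False):
    """Return (main_path_shape, shortcut_shape) for a block like resnet_block.

    input_shape is (height, width, channels). Hypothetical helper for illustration.
    """
    h, w, _ = input_shape
    out_h = same_padding_size(h, strides)
    out_w = same_padding_size(w, strides)
    # Main path: the first conv applies the stride, the second keeps the size.
    main = (out_h, out_w, filters)
    if conv_shortcut:
        # A 1x1 conv with the same stride also yields ceil(size / stride).
        shortcut = (out_h, out_w, filters)
    else:
        shortcut = tuple(input_shape)  # identity: shape is unchanged
    return main, shortcut

# Identity shortcut works when filters and resolution are unchanged.
main, shortcut = residual_shapes((8, 8, 64), filters=64)
print(main == shortcut)  # True

# Changing filters and stride without a projection makes the addition impossible.
main, shortcut = residual_shapes((8, 8, 64), filters=128, strides=2)
print(main, shortcut)  # (4, 4, 128) (8, 8, 64)

# With conv_shortcut=True the projection brings both paths to the same shape.
main, shortcut = residual_shapes((8, 8, 64), filters=128, strides=2, conv_shortcut=True)
print(main == shortcut)  # True
```

The second case is the one that would raise a shape error in layers.add, which is why a block that changes the filter count or stride needs conv_shortcut=True.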



Inception Module

In [None]:
#Inception module
def inception_module(inputs, filters):
    conv1x1 = layers.Conv2D(filters, 1, padding='same', activation='relu')(inputs)
    conv3x3 = layers.Conv2D(filters, 3, padding='same', activation='relu')(inputs)
    conv5x5 = layers.Conv2D(filters, 5, padding='same', activation='relu')(inputs)
    pool = layers.MaxPooling2D(3, strides=1, padding='same')(inputs)
    pool = layers.Conv2D(filters, 1, padding='same', activation='relu')(pool)
    output = layers.concatenate([conv1x1, conv3x3, conv5x5, pool], axis=-1)
    return output


This code defines an Inception module, a key component of the Inception architecture, which combines multiple convolutional filters and pooling operations in parallel to learn multi-scale features from the input data. Here's how it works:

1x1 Convolution: A convolution with a 1x1 filter is applied to the input. It combines information across channels at each pixel position without looking at neighboring pixels, so it captures features at the finest spatial scale. The result is passed through a ReLU activation function for non-linearity.

3x3 Convolution: Another convolution is applied with a 3x3 kernel, allowing the model to learn slightly larger features, with padding to keep the output size the same as the input. Again, ReLU activation is applied.

5x5 Convolution: A larger 5x5 convolution captures even broader features, providing a wider receptive field, and also applies padding and ReLU.

Max Pooling: A 3x3 max pooling operation with strides=1 and padding='same' is applied to the input. Because the stride is 1, the pooling keeps the spatial dimensions unchanged, which is required for the later concatenation. A 1x1 convolution is then applied to the pooled output so that this branch has the same number of filters as the others. The result is also passed through a ReLU activation.

Concatenation: The outputs from the four branches (1x1 convolution, 3x3 convolution, 5x5 convolution, and pooled output) are concatenated along the last axis (channel dimension), combining features learned at different scales.

The Inception module allows the model to capture various types of features (fine, medium, and broad) simultaneously, which enhances its ability to learn diverse representations of the data.
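To see what each branch costs, the parameter counts can be worked out by hand. This is a rough sketch: conv2d_params and inception_summary are hypothetical helpers, and the count assumes a standard convolution with one bias per output filter (weights = kernel height x kernel width x input channels x filters).

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a standard 2D convolution with a square kernel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels

def inception_summary(in_channels, filters):
    """Parameter count per branch and total output channels (hypothetical helper)."""
    branches = {
        "conv1x1": conv2d_params(1, in_channels, filters),
        "conv3x3": conv2d_params(3, in_channels, filters),
        "conv5x5": conv2d_params(5, in_channels, filters),
        # Max pooling has no weights; only its 1x1 projection does.
        "pool_proj": conv2d_params(1, in_channels, filters),
    }
    out_channels = 4 * filters  # four branches concatenated along the channel axis
    return branches, out_channels

branches, out_channels = inception_summary(in_channels=128, filters=128)
for name, count in branches.items():
    print(f"{name}: {count:,} parameters")
print(f"total: {sum(branches.values()):,} | output channels: {out_channels}")
```

With 128 input channels and filters=128, the 5x5 branch accounts for roughly 69% of the module's 590,336 parameters. The original Inception design places 1x1 bottleneck convolutions before the 3x3 and 5x5 branches to reduce this cost; the simplified module here omits them.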

DenseNet Block

In [None]:
#DenseNet block
def densenet_block(inputs, growth_rate):
    x = layers.BatchNormalization()(inputs)
    x = layers.ReLU()(x)
    x = layers.Conv2D(growth_rate, 3, padding='same')(x)
    x = layers.concatenate([inputs, x], axis=-1)
    return x

The provided code defines a DenseNet block, which is an essential building block of DenseNet architectures. Here's how it works:

Batch Normalization - The input tensor is first passed through a BatchNormalization layer, which normalizes the activations, ensuring stable and faster training.

ReLU Activation - A ReLU activation function is applied to introduce non-linearity, enabling the model to learn more complex patterns.

Convolution Layer - A 3x3 convolution is applied to the tensor, producing a new feature map with growth_rate number of filters. This operation helps the model learn more detailed representations of the data.

Concatenation - After the convolution, the output is concatenated with the original input along the last axis (channel dimension), so the block returns growth_rate more channels than it received. This is a key feature of DenseNet: the outputs of previous layers are carried forward unchanged and reused, which helps with feature propagation.

The idea is to retain features learned in earlier layers and allow each layer to directly access all the previous features, improving the flow of information throughout the network.
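The effect of repeated concatenation on the channel count can be sketched in a few lines. dense_channels is a hypothetical helper, and the starting value of 512 channels is an assumption chosen to match an Inception module with four branches of 128 filters.

```python
def dense_channels(in_channels, growth_rate, num_blocks):
    """Channel count after each densenet_block-style concatenation."""
    counts = []
    channels = in_channels
    for _ in range(num_blocks):
        channels += growth_rate  # concatenation appends growth_rate new feature maps
        counts.append(channels)
    return counts

print(dense_channels(in_channels=512, growth_rate=32, num_blocks=2))  # [544, 576]
```

The growth is linear in the number of blocks, which is why DenseNet can use a small growth_rate and still give later layers access to many features.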

Build the model

In [None]:
#Build the model
def build_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)

    #ResNet-like architecture
    x = layers.Conv2D(64, 7, strides=2, padding='same')(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.MaxPooling2D(3, strides=2, padding='same')(x)

    x = resnet_block(x, 64)
    x = resnet_block(x, 64)
    x = resnet_block(x, 128, strides=2, conv_shortcut=True)
    x = resnet_block(x, 128)

    #Inception module
    x = inception_module(x, 128)

    #DenseNet block
    x = densenet_block(x, 32)
    x = densenet_block(x, 32)

    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation='softmax')(x)

    model = keras.Model(inputs, outputs)
    return model


This code defines a function build_model() that creates a deep neural network with a combination of ResNet-like architecture, Inception modules, and DenseNet blocks. Here's a breakdown of how the model is structured:

Input Layer - The model starts with an input layer (keras.Input) that accepts an input shape (input_shape) and prepares the data for further processing.

Initial Convolution and Pooling - A convolutional layer (Conv2D) with 64 filters, a kernel size of 7, and strides=2 is applied, followed by batch normalization and ReLU activation.

A max pooling layer (MaxPooling2D) with a 3x3 window and strides=2 then downsamples the feature map again. For a 32x32 input, these two layers reduce the resolution to 8x8.

ResNet Blocks - Four ResNet-like blocks follow. Each consists of two convolutional layers with batch normalization, a ReLU between them, a residual shortcut connection, and a final ReLU:
The first block uses 64 filters.
The second block also uses 64 filters.
The third block increases the number of filters to 128 and uses strides=2 for downsampling. Because both the channel count and the resolution change, it sets conv_shortcut=True so that the shortcut is projected to the same shape as the main path.
The fourth block uses 128 filters.

Inception Module - The Inception module then learns multi-scale features using parallel 1x1, 3x3, and 5x5 convolutions alongside a max pooling branch. The outputs of the four branches, 128 channels each, are concatenated into 512 channels.

DenseNet Blocks - Two DenseNet-like blocks follow to encourage feature reuse. Each block concatenates its input with 32 newly computed feature maps (growth_rate=32), so the channel count grows from 512 to 544 and then to 576.

Global Average Pooling - A GlobalAveragePooling2D layer averages each feature map over its spatial dimensions, reducing it to a single value per channel and yielding one feature vector per image.

Output Layer - The final output is passed through a fully connected (Dense) layer with num_classes units, and the softmax activation function is used to output the probability distribution over the classes.

Return the Model - The model is constructed using keras.Model() by specifying the input and output layers, and then returned.

This architecture combines several advanced deep learning techniques, allowing the model to extract complex features at different scales while maintaining efficient training through residual connections and feature reuse.
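The breakdown above can be followed as a shape trace. The sketch below is pure arithmetic rather than a Keras call: same and trace_shapes are hypothetical helpers that assume the ceil(size / stride) rule for padding='same' layers and the layer order of build_model.

```python
import math

def same(size, stride):
    """Spatial size after a padding='same' layer: ceil(size / stride)."""
    return math.ceil(size / stride)

def trace_shapes(height, width):
    """Follow (height, width, channels) through the layers of build_model."""
    steps = []
    h, w = same(height, 2), same(width, 2)
    steps.append(("conv 7x7, stride 2", (h, w, 64)))
    h, w = same(h, 2), same(w, 2)
    steps.append(("max pool 3x3, stride 2", (h, w, 64)))
    steps.append(("resnet_block x2, 64 filters", (h, w, 64)))
    h, w = same(h, 2), same(w, 2)
    steps.append(("resnet_block, 128 filters, stride 2", (h, w, 128)))
    steps.append(("resnet_block, 128 filters", (h, w, 128)))
    steps.append(("inception_module, 4 x 128", (h, w, 512)))
    steps.append(("densenet_block x2, growth 32", (h, w, 576)))
    steps.append(("global average pooling", (576,)))
    return steps

for name, shape in trace_shapes(32, 32):
    print(f"{name:38s} -> {shape}")
```

For 32x32 CIFAR-10 images the feature maps are already 4x4 when they reach the Inception module, so the 5x5 convolution there spans more than the whole map and relies on zero padding at the borders. A stem designed for larger images, as used here, may therefore downsample small inputs more aggressively than is ideal.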

Load and preprocess the CIFAR-10 dataset

In [None]:
#Load and preprocess the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 36s 0us/step


This code loads the CIFAR-10 dataset, which contains 60,000 32x32 color images across 10 classes (50,000 for training and 10,000 for testing), and preprocesses the data for training.

It normalizes the image pixel values to the range [0, 1] by dividing by 255.0 and converts the labels into one-hot encoded vectors using keras.utils.to_categorical().

This prepares the data for use in training a neural network, ensuring the images are in the correct format and the labels are suitable for multi-class classification.
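The two preprocessing steps can be written out by hand on toy values to show what they produce. This is a plain-Python illustration, not how Keras implements them; scale_pixels and one_hot are hypothetical helpers.

```python
def scale_pixels(pixels):
    """Map 8-bit intensities (0 to 255) to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]

def one_hot(label, num_classes):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector

print(scale_pixels([0, 51, 255]))  # [0.0, 0.2, 1.0]
print(one_hot(3, 10))  # 1.0 at index 3, 0.0 everywhere else
```

One-hot labels are what the categorical_crossentropy loss expects; integer labels would instead call for sparse_categorical_crossentropy.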

In [None]:
#Build and compile the model
input_shape = (32, 32, 3)
num_classes = 10
model = build_model(input_shape, num_classes)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

This code builds and compiles a model for classifying CIFAR-10 images. The model is created using the build_model() function with an input shape of (32, 32, 3) and 10 output classes.

It is compiled with the Adam optimizer, categorical_crossentropy loss function for multi-class classification, and tracks the accuracy metric during training. The model is now ready for training on the CIFAR-10 dataset.
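For a single sample, categorical cross-entropy is the negative log of the probability assigned to the true class. The sketch below computes it by hand under that definition; the function is a simplified stand-in, and Keras additionally averages over the batch and guards against taking the log of zero.

```python
import math

def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy for one sample: -sum(t * log(p)) over the classes."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

# Fairly confident, correct prediction for the third of three classes.
print(round(categorical_crossentropy([0, 0, 1], [0.1, 0.2, 0.7]), 4))  # 0.3567

# A uniform guess over 10 classes costs ln(10), the chance-level loss for CIFAR-10.
uniform_prediction = [0.1] * 10
true_label = [1] + [0] * 9
print(round(categorical_crossentropy(true_label, uniform_prediction), 4))  # 2.3026
```

The value ln(10) ≈ 2.30 is a useful reference point: a loss near it means the model is doing no better than guessing uniformly.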

In [None]:
#Train the model
batch_size = 128
epochs = 10
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)

Epoch 1/10
352/352 ━━━━━━━━━━━━━━━━━━━━ 32s 42ms/step - accuracy: 0.4236 - loss: 1.6718 - val_accuracy: 0.3946 - val_loss: 1.6689
Epoch 2/10
352/352 ━━━━━━━━━━━━━━━━━━━━ 5s 13ms/step - accuracy: 0.6395 - loss: 1.0014 - val_accuracy: 0.5102 - val_loss: 1.4817
Epoch 3/10
352/352 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.7202 - loss: 0.7855 - val_accuracy: 0.5952 - val_loss: 1.2495
Epoch 4/10
352/352 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.7735 - loss: 0.6424 - val_accuracy: 0.7260 - val_loss: 0.8092
Epoch 5/10
352/352 ━━━━━━━━━━━━━━━━━━━━ 6s 15ms/step - accuracy: 0.8120 - loss: 0.5325 - val_accuracy: 0.6982 - val_loss: 0.8924
Epoch 6/10
352/352 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.8457 - loss: 0.4402 - val_accuracy: 0.6680 - val_loss: 1.0269
Epoch 7/10
352/35

<keras.src.callbacks.history.History at 0x780488b61d90>

This output shows the training progress of the model, with training and validation metrics for each epoch. Training was configured for 10 epochs, but the captured log is cut off during Epoch 7. Each epoch runs 352 steps because validation_split=0.1 holds out 5,000 of the 50,000 training images, and the remaining 45,000 images at a batch size of 128 take 352 batches.

Epoch 1 ends with a training accuracy of 42.36% and a validation accuracy of 39.46%. The model initially performs poorly, which is expected after a single pass over the data.

Epoch 2 sees a significant improvement in training accuracy (63.95%), and the validation accuracy also rises to 51.02%, showing that the model is learning.

From Epoch 3 to Epoch 5, the training accuracy continues to improve, reaching 81.20% by Epoch 5. The validation accuracy peaks at 72.60% in Epoch 4 and then drops to 69.82% in Epoch 5.

In Epoch 6, the training accuracy rises to 84.57% while the validation accuracy falls further to 66.80%. The metrics for Epochs 7 to 10 are not visible in the captured output.

Key insights:

Validation Loss and Accuracy: The validation loss falls from 1.6689 to 0.8092 over the first four epochs and then rises to 1.0269 by Epoch 6, while validation accuracy peaks at 72.60% in Epoch 4. In the portion of the log shown, generalization to unseen data therefore stops improving after Epoch 4.

Training Progress: The training loss decreases from 1.6718 to 0.4402 over the first six epochs, and the training accuracy improves consistently. Combined with the worsening validation metrics, this widening gap points to overfitting.

The History object at the end stores the training and validation metrics (accuracy and loss) for each epoch, which can be further analyzed to visualize the training progress or to detect potential overfitting.
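A simple way to analyze these metrics is to find the epoch with the lowest validation loss and the gap between training and validation accuracy. The sketch below uses the six complete epochs from the log above, typed in by hand; in a live session the same lists would be available as history.history['val_loss'] and so on if the return value of model.fit() were assigned to a variable named history. best_epoch and accuracy_gaps are hypothetical helpers.

```python
# Metrics copied from the first six epochs of the training log.
val_loss = [1.6689, 1.4817, 1.2495, 0.8092, 0.8924, 1.0269]
train_acc = [0.4236, 0.6395, 0.7202, 0.7735, 0.8120, 0.8457]
val_acc = [0.3946, 0.5102, 0.5952, 0.7260, 0.6982, 0.6680]

def best_epoch(losses):
    """1-based index of the epoch with the lowest loss."""
    return losses.index(min(losses)) + 1

def accuracy_gaps(train, val):
    """Training minus validation accuracy per epoch, rounded to 4 places."""
    return [round(t - v, 4) for t, v in zip(train, val)]

print(best_epoch(val_loss))  # 4
print(accuracy_gaps(train_acc, val_acc))
```

A gap that keeps growing after the best epoch, as it does here from Epoch 4 to Epoch 6, is the usual signature of overfitting and a common trigger for early stopping.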

Evaluation

The following cell evaluates the trained model on the test data (x_test and y_test). The evaluate() function computes the loss and accuracy of the model on the test set:

test_loss: This value represents how well the model's predictions match the actual labels on the test set, measured with the loss function the model was compiled with (categorical cross-entropy here).

test_acc: This value represents the accuracy of the model on the test set, which is the percentage of correctly predicted labels out of all the test samples.

After evaluation, the test accuracy (test_acc) is printed, which helps you assess how well the model performs on unseen data, giving you an indication of its generalization ability.

In [None]:
#Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test)
print('Test accuracy:', test_acc)

313/313 ━━━━━━━━━━━━━━━━━━━━ 4s 9ms/step - accuracy: 0.6349 - loss: 1.5676
Test accuracy: 0.6262999773025513


This output shows the evaluation results of the trained model on the test set:

accuracy: 0.6349: This is the accuracy shown on the progress bar during evaluation, approximately 63.49%. It represents the proportion of correctly classified samples in the test set.

loss: 1.5676: This is the test loss shown on the progress bar, which quantifies how well the model's predictions align with the actual labels in the test set. Lower loss indicates better performance.

The printed Test accuracy: 0.6262999773025513 is the value returned by evaluate(), so the model's accuracy on the test set is about 62.63%. The progress bar shows values logged while evaluation is running, so it can differ slightly from the returned figure. This accuracy is well above the 10% expected from random guessing across 10 classes, but it trails the training accuracy seen in the log by a wide margin, which suggests overfitting and leaves room for improvement.
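The accuracy metric itself is simple to reproduce: a sample counts as correct when the class with the highest predicted probability matches the labeled class. The sketch below shows this on made-up toy data; argmax and accuracy are hypothetical helpers, not Keras functions.

```python
def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(y_true_one_hot, y_pred_probs):
    """Fraction of samples whose predicted class matches the labeled class."""
    correct = sum(
        argmax(t) == argmax(p) for t, p in zip(y_true_one_hot, y_pred_probs)
    )
    return correct / len(y_true_one_hot)

# Four toy samples over three classes (made-up values for illustration).
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.5, 0.3, 0.2], [0.2, 0.6, 0.2]]
print(accuracy(labels, probs))  # 0.75
```

The third sample is labeled as class 2 but predicted as class 0, so three of the four predictions are correct.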