
# Custom ResNet-50 v2 Model Training and Evaluation

This notebook demonstrates the process of training and evaluating a custom ResNet-50 v2 model with an additional stage. The workflow includes data preprocessing, model architecture definition, training, evaluation, and visualization of results.

## Workflow Overview

1. **Data Loading and Preprocessing**:
    - Load images and labels from `.npy` files.
    - Balance the dataset by augmenting underrepresented classes.
    - Split the dataset into training and validation sets.

2. **Model Architecture**:
    - Define a custom ResNet-50 v2 model with an additional stage (Stage 3.5).
    - Use identity and convolutional blocks to build the model.

3. **Training**:
    - Compile the model with the Adam optimizer and sparse categorical cross-entropy loss.
    - Train the model using the balanced dataset with learning rate scheduling.

4. **Evaluation**:
    - Calculate metrics such as accuracy, precision, recall, F1-score, and confusion matrix.
    - Visualize training and validation accuracy/loss over epochs.
    - Plot the confusion matrix and ROC curves for multi-class classification.

5. **Saving Results**:
    - Save the trained model and metrics to files for future use.

## Key Features

- **Data Augmentation**: Random flipping, brightness, and contrast adjustments to improve model generalization.
- **Custom Model Architecture**: ResNet-50 v2 with an additional stage for enhanced feature extraction.
- **Performance Metrics**: Comprehensive evaluation using multiple metrics and visualizations.
- **Model Saving**: Save the trained model and metrics for reproducibility.


### Data Loading and ResNet-50 v2 Block Definitions

This section includes the following steps:

1. **Data Loading**:
    - Images and labels are loaded from `.npy` files using `numpy`.
    - The shapes of the loaded images and labels are printed to verify the data.

2. **Identity Block (v2)**:
    - Implements the identity block for ResNet-50 v2, used when the input and output shapes are the same.
    - The block includes:
        - Three convolutional layers with the specified filters and kernel sizes, each preceded by batch normalization and ReLU activation (pre-activation ordering).
        - A shortcut connection that adds the unmodified input to the output of the main path.

3. **Convolutional Block (v2)**:
    - Implements the convolutional block for ResNet-50 v2, used when the spatial size or channel count changes.
    - The block includes:
        - Three convolutional layers with the specified filters and kernel sizes, each preceded by batch normalization and ReLU activation.
        - A shortcut connection with a 1×1 convolution followed by batch normalization, so that its shape matches the main path.
        - A stride `s` (default 2) on the first main-path convolution and on the shortcut convolution to downsample the input.

These blocks are essential components of the ResNet-50 v2 architecture, enabling deep feature extraction and efficient gradient flow through the network.
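
To make the shortcut arithmetic concrete, here is a minimal NumPy sketch (not the Keras code used in this notebook) of why the convolutional block needs a projection on its shortcut while the identity block does not. It treats a 1×1 convolution with stride `s` and `valid` padding as subsampling followed by a per-pixel matrix product. The helper name `conv1x1`, the input shape, and the random weights are illustrative assumptions, and the main path is reduced to a single convolution for brevity.

```python
import numpy as np

def conv1x1(x, weights, stride=1):
    """1x1 convolution with 'valid' padding on an NHWC array.

    A 1x1 kernel only mixes channels, so a stride of s amounts to keeping
    every s-th row and column and then applying a matrix product per pixel.
    """
    return x[:, ::stride, ::stride, :] @ weights

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 56, 56, 256))  # hypothetical stage input (NHWC)

# Identity block: the main path ends with as many channels as the input,
# so the input can be added back unchanged.
main_identity = conv1x1(x, rng.normal(size=(256, 256)))
identity_out = main_identity + x

# Convolutional block with s=2: the main path halves H and W and changes the
# channel count, so the shortcut needs its own strided 1x1 projection.
main_conv = conv1x1(x, rng.normal(size=(256, 512)), stride=2)
shortcut = conv1x1(x, rng.normal(size=(256, 512)), stride=2)
conv_out = main_conv + shortcut

print(identity_out.shape)  # (1, 56, 56, 256)
print(conv_out.shape)      # (1, 28, 28, 512)
```

Adding `x` directly to `main_conv` would fail because the shapes differ, which is the case the convolutional block's shortcut convolution handles.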


In [None]:
import numpy as np
import tensorflow as tf
from collections import Counter
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, Add, Input, ZeroPadding2D, MaxPooling2D, AveragePooling2D, Flatten, Dense
from tensorflow.keras.models import Model

# Load the images and labels
images = np.load('images.npy')
labels = np.load('labels.npy')

print("Images shape:", images.shape)
print("Labels shape:", labels.shape)

def identity_block_v2(X, f, filters, stage, block):
    # Define name basis
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    # Retrieve Filters
    F1, F2, F3 = filters

    # Save the input value
    X_shortcut = X

    # First component of main path
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)
    X = Conv2D(F1, (1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2a')(X)

    # Second component of main path
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)
    X = Conv2D(F2, (f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b')(X)

    # Third component of main path
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)
    X = Activation('relu')(X)
    X = Conv2D(F3, (1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2c')(X)

    # Final step: Add shortcut value to main path
    X = Add()([X, X_shortcut])

    return X

def convolutional_block_v2(X, f, filters, stage, block, s=2):
    # Define name basis
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    # Retrieve Filters
    F1, F2, F3 = filters

    # Save the input value
    X_shortcut = X

    # First component of main path
    X = BatchNormalization(axis=3, name=bn_name_base + '2a')(X)
    X = Activation('relu')(X)
    X = Conv2D(F1, (1, 1), strides=(s, s), padding='valid', name=conv_name_base + '2a')(X)

    # Second component of main path
    X = BatchNormalization(axis=3, name=bn_name_base + '2b')(X)
    X = Activation('relu')(X)
    X = Conv2D(F2, (f, f), strides=(1, 1), padding='same', name=conv_name_base + '2b')(X)

    # Third component of main path
    X = BatchNormalization(axis=3, name=bn_name_base + '2c')(X)
    X = Activation('relu')(X)
    X = Conv2D(F3, (1, 1), strides=(1, 1), padding='valid', name=conv_name_base + '2c')(X)

    # Shortcut path
    X_shortcut = Conv2D(F3, (1, 1), strides=(s, s), padding='valid', name=conv_name_base + '1')(X_shortcut)
    X_shortcut = BatchNormalization(axis=3, name=bn_name_base + '1')(X_shortcut)

    # Final step: Add shortcut value to main path
    X = Add()([X, X_shortcut])

    return X

### Custom ResNet-50 v2 Model Architecture

This section defines the architecture of a custom ResNet-50 v2 model with an additional stage (Stage 3.5). The model is implemented using TensorFlow and Keras, and it includes the following components:

1. **Input Layer**:
    - Accepts input images with a default shape of `(224, 224, 3)`; the training cell passes `(256, 256, 3)` instead.

2. **Stage 1**:
    - Zero-padding is applied to the input.
    - A convolutional layer with a kernel size of `(7, 7)` and stride `(2, 2)` is used.
    - Batch normalization and ReLU activation are applied.
    - Max pooling is performed to reduce spatial dimensions.

3. **Stage 2**:
    - A convolutional block is followed by two identity blocks.
    - Each block uses filters `[64, 64, 256]`.

4. **Stage 3**:
    - A convolutional block is followed by three identity blocks.
    - Each block uses filters `[128, 128, 512]`.

5. **New Stage 3.5**:
    - An additional stage is introduced to enhance feature extraction.
    - A convolutional block is followed by two identity blocks.
    - Each block uses filters `[128, 128, 512]`.

6. **Stage 4**:
    - A convolutional block is followed by five identity blocks.
    - Each block uses filters `[256, 256, 1024]`.

7. **Stage 5**:
    - A convolutional block is followed by two identity blocks.
    - Each block uses filters `[512, 512, 2048]`.

8. **Average Pooling**:
    - A 2×2 average pooling layer with `same` padding halves the remaining spatial dimensions.

9. **Output Layer**:
    - The pooled feature map is flattened and passed to a fully connected layer with a softmax activation that outputs predictions for `classes` categories (7 by default).

The model is created using the `Model` API from Keras, and the function `ResNet50_v2` returns the complete model.
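
The effect of the extra stage on the feature-map size can be checked with plain arithmetic. The sketch below traces the spatial size through the layers as they are configured in the code cell that follows (`valid` padding for the convolutions and max pooling, strides of 1, 2, 2, 2, 2 for the stage-opening convolutional blocks, and 2×2 average pooling with `same` padding). The helper names `conv_out` and `trace_resnet` are illustrative and not part of the notebook.

```python
def conv_out(size, kernel, stride, padding=0):
    """Output length of a 'valid' convolution or pooling window after explicit padding."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_resnet(size):
    """Spatial size after each stage, plus the flattened feature length."""
    trace = {}
    size = conv_out(size, kernel=7, stride=2, padding=3)  # ZeroPadding2D(3) + 7x7 conv, stride 2
    size = conv_out(size, kernel=3, stride=2)             # 3x3 max pooling, stride 2
    trace["stage1"] = size
    # (stage name, stride of the stage's convolutional block, output channels)
    stages = [("stage2", 1, 256), ("stage3", 2, 512), ("stage3.5", 2, 512),
              ("stage4", 2, 1024), ("stage5", 2, 2048)]
    for name, stride, channels in stages:
        size = conv_out(size, kernel=1, stride=stride)    # strided 1x1 conv sets the stage's size
        trace[name] = size
    pooled = -(-size // 2)                                # 2x2 average pooling, 'same': ceil(size / 2)
    trace["avg_pool"] = pooled
    trace["flatten"] = pooled * pooled * channels
    return trace

print(trace_resnet(224))
```

Because Stage 3.5 opens with a stride-2 block, it adds one more halving compared with a standard ResNet-50, so a 224×224 input reaches Stage 5 at 4×4 rather than 7×7.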


In [None]:
from tensorflow.keras.layers import Input, ZeroPadding2D, Conv2D, BatchNormalization, Activation, MaxPooling2D, AveragePooling2D, Flatten, Dense
from tensorflow.keras.models import Model

def ResNet50_v2(input_shape=(224, 224, 3), classes=7):
    # Define the input as a tensor with shape input_shape
    X_input = Input(input_shape)

    # Stage 1
    X = ZeroPadding2D((3, 3))(X_input)
    X = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(X)
    X = BatchNormalization(axis=3, name='bn_conv1')(X)
    X = Activation('relu')(X)
    X = MaxPooling2D((3, 3), strides=(2, 2))(X)

    # Stage 2
    X = convolutional_block_v2(X, f=3, filters=[64, 64, 256], stage=2, block='a', s=1)
    X = identity_block_v2(X, 3, [64, 64, 256], stage=2, block='b')
    X = identity_block_v2(X, 3, [64, 64, 256], stage=2, block='c')

    # Stage 3
    X = convolutional_block_v2(X, f=3, filters=[128, 128, 512], stage=3, block='a', s=2)
    X = identity_block_v2(X, 3, [128, 128, 512], stage=3, block='b')
    X = identity_block_v2(X, 3, [128, 128, 512], stage=3, block='c')
    X = identity_block_v2(X, 3, [128, 128, 512], stage=3, block='d')

    # New Stage 3.5 (additional stage)
    X = convolutional_block_v2(X, f=3, filters=[128,128,512], stage=3.5, block='a', s=2)
    X = identity_block_v2(X, 3, [128,128,512], stage=3.5, block='b')
    X = identity_block_v2(X, 3, [128,128,512], stage=3.5, block='c')

    # Stage 4
    X = convolutional_block_v2(X, f=3, filters=[256, 256, 1024], stage=4, block='a', s=2)
    X = identity_block_v2(X, 3, [256, 256, 1024], stage=4, block='b')
    X = identity_block_v2(X, 3, [256, 256, 1024], stage=4, block='c')
    X = identity_block_v2(X, 3, [256, 256, 1024], stage=4, block='d')
    X = identity_block_v2(X, 3, [256, 256, 1024], stage=4, block='e')
    X = identity_block_v2(X, 3, [256, 256, 1024], stage=4, block='f')

    # Stage 5
    X = convolutional_block_v2(X, f=3, filters=[512, 512, 2048], stage=5, block='a', s=2)
    X = identity_block_v2(X, 3, [512, 512, 2048], stage=5, block='b')
    X = identity_block_v2(X, 3, [512, 512, 2048], stage=5, block='c')

    # Average Pooling
    X = AveragePooling2D(pool_size=(2, 2), padding='same')(X)

    # Output layer
    X = Flatten()(X)
    X = Dense(classes, activation='softmax', name='fc' + str(classes))(X)

    # Create model
    model = Model(inputs=X_input, outputs=X, name='ResNet50_v2_stage')

    return model


### Data Augmentation, Dataset Preparation, and Model Training

This section describes the process of augmenting the dataset, balancing class sizes, preparing training and validation datasets, and training the custom ResNet-50 v2 model.

1. **Data Augmentation**:
    - A function `augment_image` is defined to apply random transformations to images:
        - Random horizontal and vertical flips.
        - Random brightness and contrast adjustments.
    - This helps improve the model's generalization by introducing variability in the training data.

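2. **Balancing Class Sizes**:
    - The dataset is balanced by augmenting underrepresented classes.
    - For each class:
        - Images and labels are extracted.
        - If the class is smaller than the largest class, augmentation is applied and the class dataset is repeated, then truncated to exactly the maximum class size. The largest class is left unaugmented.

3. **Dataset Preparation**:
    - All class datasets are combined into a single balanced dataset by sampling from them at random.
    - The combined dataset is shuffled, repeated indefinitely, and split by position:
        - The first 20% of the balanced total (`val_size` elements) is taken for validation.
        - The training stream skips that many elements and uses what follows.
    - Because the stream is resampled and reshuffled on every pass, the two sets are not guaranteed to be disjoint, so validation metrics may be optimistic.
    - The datasets are batched and prefetched for efficient training.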

4. **Model Initialization**:
    - A custom ResNet-50 v2 model with an additional stage is initialized using the `ResNet50_v2` function.
    - The model summary is printed to display the number of parameters.

5. **FLOPs Calculation**:
    - A function `get_flops` is defined to calculate the number of floating-point operations (FLOPs) for the model.
    - The FLOPs value is printed to assess the computational complexity of the model.

6. **Model Compilation**:
    - The model is compiled with:
        - Adam optimizer (learning rate: 0.00001).
        - Sparse categorical cross-entropy loss.
        - Accuracy as the evaluation metric.

7. **Callbacks**:
    - Two callbacks are defined:
        - `ModelCheckpoint`: Saves the best model based on validation loss.
        - `ReduceLROnPlateau`: Reduces the learning rate if validation loss does not improve for 3 consecutive epochs.

8. **Model Training**:
    - The model is trained using the `fit` method with:
        - Training and validation datasets.
        - 25 epochs.
        - Steps per epoch and validation steps calculated based on dataset size.
        - The defined callbacks for saving the best model and learning rate scheduling.
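
The balancing and splitting steps can also be expressed on array indices, which makes the sizes easy to verify. The sketch below is a variant, not what the cell below does: it holds out the validation indices per class first and only then oversamples the training indices, so the two sets cannot share a source image. The function name `split_then_balance` and the class sizes are made up for illustration.

```python
import numpy as np

def split_then_balance(labels, val_fraction=0.2, seed=0):
    """Hold out a validation share of each class, then oversample the training
    indices so that every class reaches the size of the largest one."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_parts, val_parts = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_val = int(round(val_fraction * len(idx)))
        val_parts.append(idx[:n_val])
        train_parts.append(idx[n_val:])
    max_count = max(len(part) for part in train_parts)
    balanced = []
    for part in train_parts:
        # Repeat the class's indices and truncate, mirroring repeat(...).take(max_count).
        reps = max_count // len(part) + 1
        balanced.append(np.tile(part, reps)[:max_count])
    train_idx = rng.permutation(np.concatenate(balanced))
    val_idx = np.concatenate(val_parts)
    return train_idx, val_idx

labels_demo = np.array([0] * 10 + [1] * 5 + [2] * 20)  # made-up class sizes
train_idx, val_idx = split_then_balance(labels_demo)
steps_per_epoch = max(1, len(train_idx) // 32)         # same formula as the notebook, batch size 32
print(len(train_idx), len(val_idx), steps_per_epoch)
```

With this ordering, augmentation would be applied only to the images selected by `train_idx`, and the validation set keeps the original class proportions.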


In [None]:
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

# Determine the maximum number of images in any class
label_counts = Counter(labels)
max_count = max(label_counts.values())

# Define a function for augmenting images
def augment_image(image):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.image.random_contrast(image, lower=0.9, upper=1.1)
    return image

# Create datasets for each class with augmentation to balance the class sizes
datasets = []
for label, count in label_counts.items():
    class_images = images[labels == label]
    class_labels = labels[labels == label]
    
    class_dataset = tf.data.Dataset.from_tensor_slices((class_images, class_labels))
    
    # Apply augmentation if the class is underrepresented
    if count < max_count:
        class_dataset = class_dataset.map(lambda x, y: (augment_image(x), y))
        class_dataset = class_dataset.repeat((max_count // count) + 1).take(max_count)
    
    datasets.append(class_dataset)

# Combine all class datasets into one balanced dataset
balanced_dataset = tf.data.Dataset.sample_from_datasets(datasets)

# Shuffle the combined dataset and repeat it indefinitely
balanced_dataset = balanced_dataset.shuffle(buffer_size=1000).repeat()

# Split the balanced dataset into training and validation sets.
# Note: skip/take split by position in a stream that is resampled and reshuffled
# on every iteration, so the two sets are not guaranteed to be disjoint.
val_size = int(0.2 * max_count * len(label_counts))  # 20% of the total balanced dataset
train_dataset = balanced_dataset.skip(val_size)
val_dataset = balanced_dataset.take(val_size)

# Define batch size
batch_size = 32

# Batch and prefetch the datasets
train_dataset = train_dataset.batch(batch_size).prefetch(buffer_size=tf.data.AUTOTUNE)
val_dataset = val_dataset.batch(batch_size).prefetch(buffer_size=tf.data.AUTOTUNE)

# Calculate steps per epoch
steps_per_epoch = max(1, (max_count * len(label_counts) - val_size) // batch_size)
validation_steps = max(1, val_size // batch_size)

# Initialize the custom ResNet-50 v2 model
num_classes = 7
model = ResNet50_v2(input_shape=(256, 256, 3), classes=num_classes)
# Print model summary to get number of parameters
model.summary()

# Function to calculate FLOPs
def get_flops(model):
    concrete_func = tf.function(lambda inputs: model(inputs))
    concrete_func = concrete_func.get_concrete_function(
        tf.TensorSpec([1] + list(model.input_shape[1:]))
    )
    frozen_func = convert_variables_to_constants_v2(concrete_func)
    run_meta = tf.compat.v1.RunMetadata()
    opts = tf.compat.v1.profiler.ProfileOptionBuilder.float_operation()    
    flops = tf.compat.v1.profiler.profile(graph=frozen_func.graph, run_meta=run_meta, options=opts)
    
    return flops.total_float_ops

# Get FLOPs and print
flops = get_flops(model)
print(f"FLOPs: {flops}")

# Compile the model
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.00001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Define callbacks: checkpoint the best model and reduce the learning rate on plateaus
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau

checkpoint = ModelCheckpoint("resnet-50-v2_custom_stage.keras", save_best_only=True, monitor="val_loss")
lr_scheduler = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, min_lr=1e-7)
callbacks = [checkpoint, lr_scheduler]

# Pass both callbacks to fit; the original call passed only the scheduler,
# so the checkpoint was never written
history = model.fit(train_dataset,
                    validation_data=val_dataset,
                    epochs=25,
                    steps_per_epoch=steps_per_epoch,
                    validation_steps=validation_steps,
                    callbacks=callbacks)



### Training and Validation Curves

The following cell **visualizes the training and validation performance of the model** over the epochs:

1. **Left Plot**: Displays the training and validation accuracy for each epoch, showing how well the model is learning and generalizing.
2. **Right Plot**: Shows the training and validation loss for each epoch, showing convergence and any overfitting.

A validation curve that flattens or worsens while the training curve keeps improving indicates overfitting; two curves that both stay poor indicate underfitting.
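
The same reading of the curves can be done numerically. The helper below is a small sketch that works on a dictionary shaped like `history.history`; the name `summarize_history` and the numbers in `example` are invented for illustration and are not results from this notebook.

```python
def summarize_history(history):
    """Report the epoch with the lowest validation loss and the final train/validation gap."""
    val_loss = history["val_loss"]
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    return {
        "best_epoch": best_epoch + 1,  # 1-based, as Keras prints epochs
        "best_val_loss": val_loss[best_epoch],
        "final_accuracy_gap": history["accuracy"][-1] - history["val_accuracy"][-1],
        "val_loss_rose_after_best": val_loss[-1] > val_loss[best_epoch],
    }

# Made-up numbers for illustration only; in the notebook, pass history.history.
example = {
    "loss": [1.20, 0.80, 0.50, 0.30],
    "val_loss": [1.10, 0.90, 0.95, 1.05],
    "accuracy": [0.55, 0.70, 0.82, 0.91],
    "val_accuracy": [0.58, 0.66, 0.65, 0.63],
}
summary = summarize_history(example)
print(summary)
```

A large positive accuracy gap together with a validation loss that rises after its best epoch is the numeric counterpart of the overfitting pattern in the plots.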

In [None]:
import matplotlib.pyplot as plt

# Plot training & validation accuracy values
plt.figure(figsize=(12, 6))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(['Train', 'Validation'], loc='upper left')

# Plot training & validation loss values
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend(['Train', 'Validation'], loc='upper left')

plt.tight_layout()
plt.show()


### Confusion Matrix

The following cell covers **model evaluation and visualization** on the validation dataset:

1. **Save the Final Model**:
    - The trained ResNet-50 v2 model is saved to a file named `"resnet_50-v2.keras"` for future use.

2. **Import Required Libraries**:
    - `numpy`, `matplotlib.pyplot`, and `sklearn.metrics` are imported for metrics computation and visualization.

3. **Define a Function to Get True and Predicted Labels**:
    - The function `get_true_and_predicted_labels` iterates over the validation dataset, takes the arg-max of the model's predicted probabilities for each batch, and collects the true and predicted labels.

4. **Compute True and Predicted Labels**:
    - The function is applied to the validation dataset; the resulting `true_labels` and `predicted_labels` are reused by the metrics cell later on.

5. **Calculate and Plot Confusion Matrix**:
    - A confusion matrix is computed using `confusion_matrix` and visualized using `ConfusionMatrixDisplay`.

Per-class ROC curves with `roc_curve` and `auc` are not produced by this cell: they need the full predicted probabilities and binarized true labels (for example via `to_categorical`), whereas the helper keeps only the arg-max labels.
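
Per-class ROC curves for a multi-class model are computed one-vs-rest from the predicted probabilities. The sketch below is a NumPy-only stand-in for scikit-learn's `roc_curve` and `auc`, written to show the mechanics rather than to replace them. The function names and the small `probs_demo` array are assumptions for illustration; in the notebook the probabilities would come from `model.predict` before the arg-max is taken.

```python
import numpy as np

def roc_points(is_positive, scores):
    """FPR/TPR pairs obtained by sweeping a threshold from high to low.

    Tied scores are stepped through one at a time, so results can differ
    slightly from scikit-learn when positives and negatives share a score.
    """
    order = np.argsort(-scores, kind="stable")
    hits = is_positive[order].astype(float)
    tpr = np.concatenate([[0.0], np.cumsum(hits)]) / hits.sum()
    fpr = np.concatenate([[0.0], np.cumsum(1.0 - hits)]) / (1.0 - hits).sum()
    return fpr, tpr

def one_vs_rest_auc(y_true, probs):
    """Area under the ROC curve of each class against all other classes."""
    num_classes = probs.shape[1]
    onehot = np.eye(num_classes, dtype=bool)[y_true]  # binarized labels, like to_categorical
    aucs = []
    for cls in range(num_classes):
        fpr, tpr = roc_points(onehot[:, cls], probs[:, cls])
        # Trapezoid rule over the ROC points
        aucs.append(float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)))
    return aucs

# Made-up labels and probabilities in which every class is ranked perfectly.
y_demo = np.array([0, 1, 2, 1, 0, 2])
probs_demo = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.3, 0.6, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
])
aucs_demo = one_vs_rest_auc(y_demo, probs_demo)
print(aucs_demo)
```

Plotting `fpr` against `tpr` for each class gives the ROC curves, and the per-class area can be shown in the legend.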

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Save the final trained model for future use
model.save("resnet_50-v2.keras")

# Function to get true labels and predicted labels from the model
def get_true_and_predicted_labels(model, dataset):
    true_labels = []
    predicted_labels = []
    for x, y_true in dataset:
        # Predicted class = index of the highest softmax probability
        y_pred = np.argmax(model.predict(x, verbose=0), axis=-1)
        true_labels.extend(y_true.numpy())
        predicted_labels.extend(y_pred)
    return true_labels, predicted_labels

# Get true and predicted labels for validation dataset
true_labels, predicted_labels = get_true_and_predicted_labels(model, val_dataset)

# Calculate confusion matrix (fix the label order so it matches display_labels)
cm = confusion_matrix(true_labels, predicted_labels, labels=list(range(num_classes)))

# Plot on an explicitly sized axis; disp.plot() would otherwise open its own
# figure and leave the one created by plt.figure() empty
fig, ax = plt.subplots(figsize=(8, 6))
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=range(num_classes))
disp.plot(cmap=plt.cm.Blues, ax=ax)
ax.set_title('Confusion Matrix')
ax.set_xlabel('Predicted Label')
ax.set_ylabel('True Label')
plt.show()


### Model Evaluation Metrics

This section evaluates the trained ResNet-50 v2 model on the validation predictions. The following metrics are computed:

1. **Accuracy**: The fraction of predictions that match the true label.
2. **Precision (Macro)**: For each class, the fraction of predictions of that class that are correct, averaged over classes with equal weight; it penalizes false positives.
3. **Recall (Macro)**: For each class, the fraction of its true instances that are found, averaged over classes with equal weight; it penalizes false negatives.
4. **F1-Score (Macro)**: The harmonic mean of precision and recall per class, averaged over classes.
5. **Confusion Matrix**: A table of counts with one row per true class and one column per predicted class; correct predictions lie on the diagonal.
6. **MCC (Matthews Correlation Coefficient)**: A correlation-style score between predicted and true labels computed from the whole confusion matrix, with 1 for perfect prediction and values near 0 for chance-level prediction.
7. **Cohen's Kappa**: Measures the agreement between predicted and true labels, adjusted for chance.
8. **Hamming Loss**: The fraction of incorrectly predicted labels; for single-label classification this is 1 minus accuracy.
9. **Jaccard Score (Macro)**: For each class, the intersection over union of the predicted and true sets of samples, averaged over classes.

Together, these metrics cover overall correctness, per-class behavior, and chance-corrected agreement.
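
Several of these metrics can be derived by hand from the confusion matrix, which helps when checking the scikit-learn output. The sketch below uses a made-up 3-class matrix (not results from this model) and assumes every class appears at least once among both the true and the predicted labels, so no division by zero occurs.

```python
import numpy as np

# Hypothetical confusion matrix: rows are true classes, columns are predictions.
cm_demo = np.array([[5, 1, 0],
                    [2, 3, 1],
                    [0, 0, 4]])

true_positives = np.diag(cm_demo)
predicted_totals = cm_demo.sum(axis=0)  # how often each class was predicted
true_totals = cm_demo.sum(axis=1)       # how often each class actually occurs

precision = true_positives / predicted_totals
recall = true_positives / true_totals
f1 = 2 * precision * recall / (precision + recall)
jaccard = true_positives / (predicted_totals + true_totals - true_positives)

accuracy_demo = true_positives.sum() / cm_demo.sum()
hamming_demo = 1.0 - accuracy_demo  # single-label case: Hamming loss is the error rate

# Macro averaging gives every class the same weight, regardless of its size.
macro_demo = {
    "precision": precision.mean(),
    "recall": recall.mean(),
    "f1": f1.mean(),
    "jaccard": jaccard.mean(),
}
print(accuracy_demo, hamming_demo)
print(macro_demo)
```

Macro averages treat a rare class the same as a frequent one, which is why they can differ noticeably from accuracy on imbalanced validation sets.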

In [None]:
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, roc_auc_score, log_loss, matthews_corrcoef, cohen_kappa_score, hamming_loss, jaccard_score
y_true = true_labels
y_pred = predicted_labels
# Calculate metrics
accuracy = accuracy_score(y_true, y_pred)
precision_macro = precision_score(y_true, y_pred, average='macro')
recall_macro = recall_score(y_true, y_pred, average='macro')
f1_macro = f1_score(y_true, y_pred, average='macro')
cm = confusion_matrix(y_true, y_pred)
mcc = matthews_corrcoef(y_true, y_pred)
kappa = cohen_kappa_score(y_true, y_pred)
hamming = hamming_loss(y_true, y_pred)
jaccard_macro = jaccard_score(y_true, y_pred, average='macro')

# Output the metrics
metrics = {
    "Accuracy": accuracy,
    "Precision (Macro)": precision_macro,
    "Recall (Macro)": recall_macro,
    "F1-Score (Macro)": f1_macro,
    "Confusion Matrix": cm,
    "MCC": mcc,
    "Cohen's Kappa": kappa,
    "Hamming Loss": hamming,
    "Jaccard Score (Macro)": jaccard_macro
}

for metric, value in metrics.items():
    print(f"{metric}: {value}")


### Save Training and Validation Metrics

The following code extracts the training and validation metrics (loss and accuracy) from the `history` object and saves them into a CSV file for further analysis:

1. **Extract Metrics**:
    - `train_loss` and `train_accuracy` are extracted for the training dataset.
    - `val_loss` and `val_accuracy` are extracted for the validation dataset.

2. **Create a DataFrame**:
    - A pandas DataFrame is created to organize the metrics into columns: `Train Loss`, `Train Accuracy`, `Validation Loss`, and `Validation Accuracy`.

3. **Save to CSV**:
    - The DataFrame is saved to a file named `Resnet-50-custom.csv`, which can be used for visualization or reporting purposes.


In [None]:
import pandas as pd

# Extract all metrics from history
train_loss = history.history['loss']
train_accuracy = history.history['accuracy']
val_loss = history.history['val_loss']
val_accuracy = history.history['val_accuracy']

# Create a DataFrame to store all metrics
metrics_df = pd.DataFrame({
    'Train Loss': train_loss,
    'Train Accuracy': train_accuracy,
    'Validation Loss': val_loss,
    'Validation Accuracy': val_accuracy
})

# Save metrics to CSV
metrics_df.to_csv('Resnet-50-custom.csv', index=False)
