## Component Two (Vehicle Damage Insurance Claim Verification)

In this notebook we develop a CNN model that supports insurance claim verification (fraudulent versus non-fraudulent claims) by classifying images of damaged vehicles.

We have the following categories of Damaged Vehicles:

1. Crack
2. Scratch
3. Tire flat
4. Dent
5. Glass shatter
6. Lamp broken

We will also explain and justify our choices of:
* Architecture
* Regularisation
* Hyperparameter tuning

First we will import our libraries.

In [2]:
import os
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, AveragePooling2D, BatchNormalization, GlobalAveragePooling2D
from tensorflow.keras.optimizers import Adam, SGD
from tensorflow.keras.utils import image_dataset_from_directory
from sklearn.utils.class_weight import compute_class_weight
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.regularizers import l2 
from tensorflow.keras.metrics import Precision, Recall, AUC
from sklearn.metrics import classification_report, confusion_matrix, f1_score, ConfusionMatrixDisplay, roc_curve, auc
from sklearn.preprocessing import label_binarize
from tensorflow.keras.callbacks import TensorBoard

BATCH_SIZE = 32

We use Keras' **flow_from_dataframe()** to read the file names and labels from the CSV and load the corresponding images from our data folder. (https://vijayabhaskar96.medium.com/tutorial-on-keras-flow-from-dataframe-1fd4493d237c)

In [None]:
train_df = pd.read_csv('data/insurance/train/train.csv')
# Shift the labels down by one so that class indices start at 0
train_df['label'] = train_df['label'] - 1

In [None]:
train_df.label.value_counts()

### Image Preprocessing & Augmentation

Before training a deep learning model on image data, it's important to preprocess and augment the data to improve model performance and generalization. Here's what each part of the code does:


#### `train_datagen` — Training Image Generator

We use `ImageDataGenerator` from Keras to apply real-time data augmentation:


In [None]:
train_datagen = ImageDataGenerator(
    rescale=1./255.,
    rotation_range=10,  
    width_shift_range=0.1,  # Reduced shift
    height_shift_range=0.1,  # Reduced shift
    horizontal_flip=True,
    shear_range=0.1,  # Reduced shear
    zoom_range=0.1  # Reduced zoom
)

In [None]:
# Convert labels to str
train_df['label'] = train_df['label'].astype(str)

For the test and validation sets, we avoid any kind of augmentation. We only normalize the pixel values so they're in the same scale as the training images.

In [None]:
test_datagen = ImageDataGenerator(rescale=1./255.)

### Splitting the Dataset: Train, Validation, and Test Sets

To ensure the model is trained and evaluated properly, we split the original dataset into three parts:
* **Training**
* **Validation**
* **Test**

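#### Create Training, Validation, and Test Sets

We first hold out 30% of the data as a temporary set, stratified by label, and then split that temporary set in half. This gives a 70% / 15% / 15% train / validation / test split with the class proportions preserved in each part.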

In [None]:
train_split, temp_split = train_test_split(
    train_df,
    test_size=0.3,
    stratify=train_df['label'],
    random_state=42
)

valid_split, test_split = train_test_split(
    temp_split,
    test_size=0.5,
    stratify=temp_split['label'],
    random_state=42
)


### Creating Image Generators from DataFrames

Once the dataset is split into training, validation, and test sets, we use Keras' `flow_from_dataframe()` to load and preprocess images directly from file paths listed in a DataFrame. This is efficient and flexible for handling image data.

#### Training Generator

The training generator uses the training split and includes data augmentation. This helps the model generalize better by exposing it to a variety of slightly altered versions of the training images. The images are also resized to a consistent shape, and their pixel values are scaled to the range [0, 1]. Shuffling is enabled to ensure that the model doesn’t see the same order of images in every epoch.

#### Validation Generator

The validation generator uses the validation split. No augmentation is applied here — only rescaling is done. This allows us to evaluate how well the model is performing during training without introducing any randomness. Shuffling is turned off to keep the validation evaluation consistent across epochs.

#### Test Generator

The test generator is used to evaluate the final model performance on completely unseen data. Like the validation generator, it only applies rescaling. Shuffling is also turned off to ensure reproducibility and maintain order when generating predictions.

Using these generators ensures that the data is efficiently loaded, consistently preprocessed, and correctly formatted for training and evaluating a deep learning model.


In [None]:
train_generator = train_datagen.flow_from_dataframe(
    dataframe=train_split,
    directory="data/insurance/train/images/",
    x_col="filename",
    y_col="label",
    batch_size=BATCH_SIZE,
    seed=42,
    shuffle=True,
    class_mode="categorical",
    target_size=(192, 192)
)

valid_generator = test_datagen.flow_from_dataframe(
    dataframe=valid_split,
    directory="data/insurance/train/images/",
    x_col="filename",
    y_col="label",
    batch_size=BATCH_SIZE,
    seed=42,
    shuffle=False,
    class_mode="categorical",
    target_size=(192, 192)
)

test_generator = test_datagen.flow_from_dataframe(
    dataframe=test_split,
    directory="data/insurance/train/images/",
    x_col="filename",
    y_col="label",
    batch_size=BATCH_SIZE,
    seed=42,
    shuffle=False,
    class_mode="categorical",
    target_size=(192, 192)
)


In [None]:
# Preview test generator
images, labels = next(test_generator)
image = images[0]
label = labels[0]
plt.imshow(image)
plt.axis('off') 
plt.title(label)
plt.tight_layout()
plt.savefig('demo imagetest', bbox_inches='tight')
plt.show()

Since we have class imbalance we will apply `class_weight` from sklearn.

### Handling Class Imbalance with Class Weights

When working with classification problems, especially in cases where some classes have significantly more samples than others, it's important to address class imbalance. One effective way to do this is by computing **class weights**, which can be passed to the model during training to give more importance to underrepresented classes.

#### Step 1: Get the Mapping of Labels to Indices

The training generator stores a mapping from the original class labels (here the strings `'0'` to `'5'`) to the internal numeric indices it uses during training. This mapping is accessed through `class_indices`.

Inverting this mapping gives the reverse lookup, from internal index back to the original label, which is useful when interpreting predictions.

#### Step 2: Extract Actual Labels

The `train_generator.classes` attribute provides the class index (as an integer) for each image in the training set. These labels are used to compute the distribution of classes.
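The relationship between labels, `class_indices`, and `classes` can be illustrated without Keras. The sketch below uses a hypothetical list of string labels and assumes, as the Keras documentation describes for classes inferred from `y_col`, that class names are ordered alphanumerically before indices are assigned. It mimics the bookkeeping only and is not the generator's actual implementation.

```python
# Hypothetical string labels, similar to train_df['label'].astype(str)
labels = ["3", "0", "5", "1", "0", "2", "4", "1", "1"]

# Assumed behaviour: class names are sorted, then numbered from 0
class_indices = {name: idx for idx, name in enumerate(sorted(set(labels)))}

# Inverted mapping: internal index -> original label
index_to_label = {idx: name for name, idx in class_indices.items()}

# Per-image class index, analogous to train_generator.classes
classes = [class_indices[name] for name in labels]

print(class_indices)
print(classes)
```

With six single-digit labels the sorted order matches the numeric order, so index `i` corresponds to label `str(i)`. With ten or more classes, string sorting would place `'10'` before `'2'`, which is worth checking before hard-coding class names.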

#### Step 3: Compute Class Weights

Using `compute_class_weight` from `sklearn.utils.class_weight`, we calculate weights for each class based on how frequently they appear in the training set. Less frequent classes are assigned higher weights to balance their impact during training.

The method requires:
- The strategy (`'balanced'`), which adjusts weights inversely proportional to class frequencies.
- The unique class labels.
- The array of labels used in training.

#### Step 4: Create a Dictionary for Class Weights

The computed weights are zipped with their corresponding class indices to create a dictionary. This `class_weight_dict` can be passed to the model during training using the `class_weight` parameter in `model.fit()`.

This helps the model treat each class more fairly, improving performance especially on minority classes.
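To show what the `'balanced'` strategy computes, here is a minimal plain-Python sketch of the formula documented for scikit-learn, `n_samples / (n_classes * count_of_class)`. The function name `balanced_class_weights` and the toy label list are illustrative assumptions, not part of the notebook.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Return {class: weight} using n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in sorted(counts)}

# Toy, imbalanced label list: class 0 is rare, class 1 is common
toy_labels = [0] * 2 + [1] * 6 + [2] * 4
weights = balanced_class_weights(toy_labels)

print(weights)  # class 0 gets the largest weight, class 1 the smallest
```

A class whose frequency equals the average frequency receives a weight of 1.0; rarer classes receive proportionally larger weights.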


In [None]:
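# Mapping from original label (string) -> internal index used by the generator
class_indices = train_generator.class_indices

# Invert the mapping to go from internal index -> original label
index_to_label = {index: label for label, index in class_indices.items()}
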
# Get the actual labels from the generator
train_labels = train_generator.classes  # These will be 0-based integers

# Compute weights on the actual used labels (integers from 0–5)
weights = compute_class_weight(
    class_weight='balanced',
    classes=np.unique(train_labels),
    y=train_labels
)

# Create a dictionary with index-based keys
class_weight_dict = dict(zip(np.unique(train_labels), weights))

print(class_weight_dict)

In [None]:
# Print class weights
class_weight_dict

In [None]:
train_df['label'].value_counts()

### Training the Model and Evaluating Performance

This function handles the training, validation, and evaluation of a Keras deep learning model using image data generators.

#### Early Stopping

An **EarlyStopping** callback is used to monitor the validation loss and stop training when it stops improving. The `patience=5` means training will stop after 5 consecutive epochs without improvement, and `restore_best_weights=True` ensures the model reverts to its best state.
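The patience rule can be illustrated with a small simulation. This is a simplified sketch of the idea, not Keras's implementation; the function name `simulate_early_stopping` and the validation-loss values are made up for illustration.

```python
def simulate_early_stopping(val_losses, patience=5):
    """Return (stopped_epoch, best_epoch), both 1-based.

    Training stops once `patience` consecutive epochs pass without a new
    lowest validation loss. `best_epoch` is the epoch whose weights would
    be restored with restore_best_weights=True.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch

# Hypothetical validation losses: the best value is reached at epoch 3
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.74, 0.73, 0.9]
stopped_epoch, best_epoch = simulate_early_stopping(losses, patience=5)
print(stopped_epoch, best_epoch)  # 8 3
```

In this toy run the ninth epoch is never executed, and the model would be rolled back to the weights from epoch 3.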

#### Compiling the Model

The model is compiled with:
- A **categorical cross-entropy loss**, which is standard for multi-class classification.
- Several **evaluation metrics**:
  - `accuracy`: overall correctness
  - `precision`: how many predicted positives are correct
  - `recall`: how many actual positives are identified
  - `AUC`: quality of the classifier across thresholds

#### Model Training

The model is trained using:
- The training and validation generators.
- The specified number of epochs.
- The early stopping callback.
- The number of steps per epoch and validation steps, which Keras infers from the generator lengths.
- The class weights supplied through the `class_weights` argument, which are forwarded to `model.fit()`.

#### Model Evaluation

After training, the model is passed to a separate `evaluate_metrics` function which:
- Evaluates the model on the test set.
- Returns a single-row DataFrame of key performance metrics such as accuracy, precision, recall, and AUC.

This approach ensures the model is trained with care (early stopping), evaluated thoroughly, and that all key training settings are tracked (architecture, layers, learning rate, etc.).


In [None]:
def save_and_plot_history(history, model_name):
    """
    Plots the training and validation loss and accuracy, and saves the figure.

    Parameters:
    ----------
    history : History
        The history object returned by model.fit() containing the training and validation metrics.
    model_name : str
        The name of the model architecture for tracking purposes.
    """
    history_dict = history.history
    epochs = range(1, len(history.history['loss']) + 1)
    
    # Create a figure
    plt.figure(figsize=(12, 6))

    # Plot Loss
    plt.subplot(1, 2, 1)
    plt.plot(epochs, history_dict['loss'], label='Training Loss')
    plt.plot(epochs, history_dict['val_loss'], label='Validation Loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    title = f'{model_name} - Loss'
    plt.title(title, fontsize=10)


    # Plot Accuracy
    plt.subplot(1, 2, 2)
    plt.plot(epochs, history_dict['accuracy'], label='Training Accuracy')
    plt.plot(epochs, history_dict['val_accuracy'], label='Validation Accuracy')
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy')
    plt.legend()
    title = f'{model_name} - Accuracy'
    plt.title(title, fontsize=10)

    # Save the figure as a .png file
    plt.tight_layout()
    os.makedirs("Vehicle Plots", exist_ok=True)  # create the output folder if needed
    plt.savefig(f"Vehicle Plots/{model_name}.png")
    plt.show()

In [None]:
def fit_model(train_generator, model, optimizer, valid_generator, test_generator, arch, dense, lr, epochs, class_weights):
    """
    Trains a deep learning model using the provided training and validation generators,
    with early stopping to prevent overfitting. After training, the function evaluates
    the model and returns relevant performance metrics.

    Parameters:
    ----------
    train_generator : DirectoryIterator
        Generator for training data with labels.
    model : keras.Model
        The Keras model to be trained. It is compiled inside this function.
    optimizer : keras.optimizers.Optimizer
        Optimizer to be used during training (e.g., Adam, SGD).
    valid_generator : DirectoryIterator
        Generator for validation data.
    test_generator : DirectoryIterator
        Generator for test data (used during final evaluation).
    arch : str
        Name of the model architecture (for tracking in metrics).
    dense : int
        Number of dense units (for tracking in metrics).
    lr : float
        Learning rate used in the optimizer (for tracking in metrics).
    epochs : int
        Maximum number of training epochs.
    class_weights : dict
        Class weights to handle class imbalance. Can be passed to model.fit().

    Returns:
    -------
    metrics : pandas.DataFrame
        A single-row DataFrame of evaluation metrics such as accuracy, precision, recall, and AUC.
    """
    early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

    model.compile(optimizer=optimizer, loss='categorical_crossentropy',
                  metrics=['accuracy',
                            Precision(name='precision'),
                            Recall(name='recall'),
                            AUC(name='auc')])
    # Keras infers steps per epoch and validation steps from the generator lengths
    history = model.fit(
        train_generator,
        epochs=epochs,
        validation_data=valid_generator,
        class_weight=class_weights,
        callbacks=[early_stopping],
    )
    save_and_plot_history(history, arch)

    # Evaluate the trained model and return the resulting metrics
    metrics = evaluate_metrics(model, valid_generator, test_generator, arch, dense, lr, epochs)
    return metrics


### Evaluating the Model: Metrics, Reports, and Visualizations

This function evaluates the trained model on the test dataset. The `valid_gen` argument is accepted but not used in the current version; validation performance is tracked during training through the `val_*` entries of the training history. The function performs several key steps and saves important outputs for analysis and comparison.

#### 1. Test Evaluation

- The model is evaluated on the test dataset.
- The returned metrics include:
  - `loss`
  - `accuracy`
  - `precision`
  - `recall`
  - `AUC (Area Under Curve)`

#### 2. Predictions & F1 Score (Test)

- Predictions are made on the test set.
- Ground truth labels are retrieved from the generator.
- **F1 scores** are calculated using:
  - `macro`: unweighted average of the per-class F1 scores
  - `micro`: F1 computed from the total true positives, false positives, and false negatives across all classes
  - `weighted`: average of the per-class F1 scores, weighted by the support of each class

#### 3. Classification Report & Confusion Matrix

- A **classification report** is generated using `classification_report()` for the test data.
- It is saved as a CSV:
  - Appends if the file already exists.
  - Otherwise, creates a new file.
- A **confusion matrix** and **ROC-AUC curves** are saved using the `save_confusion_matrix()` and `save_roc_auc_curve()` functions.

#### 4. Saving Final Metrics

- A dictionary is created to store all key metrics:
  - Test-set metrics
  - Architecture, dense units, learning rate, epochs
- This dictionary is converted to a DataFrame and stored as `metrics.csv`:
  - Appends if the file exists
  - Creates a new file otherwise

#### 5. Return Value

- The function returns a **single-row DataFrame** containing all important metrics.
- This is useful for comparing models across different configurations or hyperparameters.
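The three F1 averaging modes can be illustrated by computing them by hand. The sketch below is plain Python on made-up labels for three classes; the notebook itself relies on `f1_score` from scikit-learn, and the helper name `f1_summary` is hypothetical.

```python
def f1_summary(y_true, y_pred, n_classes):
    """Return (macro, micro, weighted) F1 for single-label predictions."""
    per_class, supports = [], []
    tp_total = fp_total = fn_total = 0
    for c in range(n_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
        supports.append(tp + fn)  # number of true samples of class c
        tp_total, fp_total, fn_total = tp_total + tp, fp_total + fp, fn_total + fn

    macro = sum(per_class) / n_classes
    micro = 2 * tp_total / (2 * tp_total + fp_total + fn_total)
    weighted = sum(f * s for f, s in zip(per_class, supports)) / sum(supports)
    return macro, micro, weighted

# Toy labels: class 1 is rare and poorly predicted
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 0, 1, 2, 2, 2, 2, 0]

f1_macro, f1_micro, f1_weighted = f1_summary(y_true, y_pred, n_classes=3)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f1_macro, f1_micro, f1_weighted, accuracy)
```

For single-label multi-class problems the micro F1 equals the accuracy, so the macro score is usually the more informative one when classes are imbalanced: here the weak minority class pulls the macro F1 below the micro F1.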


In [None]:
def evaluate_metrics(model, valid_gen, test_gen, arch, dense, kr, epochs):
    """
    Evaluates a trained deep learning model on the test dataset.
    Computes standard metrics, generates a classification report, a confusion matrix, and
    ROC curves, and stores results in CSV files for further analysis.

    Parameters:
    ----------
    model : keras.Model
        The trained Keras model to be evaluated.
    valid_gen : DirectoryIterator
        Generator containing validation data (currently unused by this function).
    test_gen : DirectoryIterator
        Generator containing test data.
    arch : str
        Name of the model architecture (used for saving reports).
    dense : int
        Number of dense units in the model (for tracking in saved metrics).
    kr : float
        Learning rate used (for tracking in saved metrics).
    epochs : int
        Number of training epochs (for tracking in saved metrics).

    Returns:
    -------
    metrics_df : pandas.DataFrame
        A single-row DataFrame containing all the evaluation metrics for both validation and test sets.
    """

    class_names = ['Crack', 'Scratch', 'Tire flat', 'Dent', 'GS', 'LB']

    # Step 5: Evaluate model on test data (new addition)
    test_results = model.evaluate(test_gen)
    
    # Unpacking test evaluation results
    test_loss = test_results[0]
    test_accuracy = test_results[1]
    test_precision = test_results[2]
    test_recall = test_results[3]
    test_auc = test_results[4]
    
    # Get predictions and true labels for test data
    y_pred_probs_test = model.predict(test_gen)
    y_pred_test = np.argmax(y_pred_probs_test, axis=1)
    y_true_test = test_gen.classes

    # Step 6: F1 Scores on test data
    f1_macro_test = f1_score(y_true_test, y_pred_test, average='macro')
    f1_micro_test = f1_score(y_true_test, y_pred_test, average='micro')
    f1_weighted_test = f1_score(y_true_test, y_pred_test, average='weighted')

    # Step 7: Classification report for test data
    print("\nClassification Report (Test):\n")
    class_report_test = classification_report(y_true_test, y_pred_test, target_names=class_names, output_dict=True)
    print(classification_report(y_true_test, y_pred_test, target_names=class_names))

    # Convert classification report for test data to DataFrame for easy saving as CSV
    class_report_df_test = pd.DataFrame(class_report_test).transpose()

    # Save or append the classification report for test data as CSV
    class_report_filename_test = f"{arch}_test_classification_report.csv"
    
    if os.path.exists(class_report_filename_test):
        class_report_df_test.to_csv(class_report_filename_test, mode='a', header=False)
        print(f"Appended to existing test classification report: {class_report_filename_test}")
    else:
        class_report_df_test.to_csv(class_report_filename_test, mode='w', header=True)
        print(f"Test classification report saved as {class_report_filename_test}")

    # Save confusion matrix and ROC curve for test data
    save_confusion_matrix(y_true_test, y_pred_test, class_names, arch)
    save_roc_auc_curve(y_true_test, y_pred_probs_test, class_names, arch)

    # Step 8: Return metrics dictionary (test metrics plus run configuration)
    metrics_dict = {
        "Architecture": arch,
        "test_loss": test_loss,
        "test_accuracy": test_accuracy,
        "test_precision": test_precision,
        "test_recall": test_recall,
        "test_auc": test_auc,
        "f1_macro_test": f1_macro_test,
        "f1_micro_test": f1_micro_test,
        "f1_weighted_test": f1_weighted_test,
        "Dense": dense,
        "learning_rate": kr,
        "epochs": epochs
    }

    # Convert the dictionary into a DataFrame (1 row, columns as keys)
    metrics_df = pd.DataFrame([metrics_dict])

    # Define the file path for saving the metrics
    metrics_filename = "metrics.csv"
    
    if os.path.exists(metrics_filename):
        metrics_df.to_csv(metrics_filename, mode='a', header=False, index=False)
        print(f"Appended to existing metrics: {metrics_filename}")
    else:
        metrics_df.to_csv(metrics_filename, mode='w', header=True, index=False)
        print(f"Metrics saved as {metrics_filename}")

    return metrics_df


### Saving the Confusion Matrix

The `save_confusion_matrix` function generates and saves a **confusion matrix plot** for evaluating the performance of a model. Here's a breakdown of the steps:

#### 1. **Computing the Confusion Matrix**

- The confusion matrix is computed using the true labels (`y_true`) and the predicted labels (`y_pred`).
- The confusion matrix is a table used to evaluate the performance of a classification algorithm by showing the number of correct and incorrect predictions.

#### 2. **Displaying the Matrix**

- A `ConfusionMatrixDisplay` object is created to visually display the confusion matrix.
- The matrix is plotted using a red color map (`cmap='Reds'`) with integer values formatted for clarity.

#### 3. **Adding Labels and Title**

- The axes are labeled as "Predicted Label" and "True Label" for clear understanding.
- The title is dynamically set using the model architecture name (`arch`).

#### 4. **Saving the Plot**

- The confusion matrix plot is saved as a **PNG file** under the `Vehicle Plots/` directory.
- The filename is based on the model architecture (`arch`) to ensure that plots from different models are distinguishable.

#### 5. **Closing the Plot**

- The plot is closed after saving to free up memory for subsequent operations.
  
#### 6. **Print Confirmation**

- A print statement confirms the filename of the saved confusion matrix plot, ensuring the user knows where the plot is stored.
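As a reminder of what the matrix contains, the following plain-Python sketch counts predictions by hand, using the same convention as scikit-learn's `confusion_matrix` (rows are true classes, columns are predicted classes). The helper name `confusion_counts` and the toy labels are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Return a nested list cm where cm[i][j] counts true class i predicted as j."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

# Toy integer labels for three classes
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 0, 1, 2, 2, 2, 2, 0]

cm = confusion_counts(y_true, y_pred, n_classes=3)
for row in cm:
    print(row)

# Correct predictions sit on the diagonal
correct = sum(cm[i][i] for i in range(3))
```

Each row sums to the number of true samples of that class, so off-diagonal entries in a row show which classes a given damage type is being confused with.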


In [None]:
def save_confusion_matrix(y_true, y_pred, class_names, arch):
    """
    Computes and saves the confusion matrix plot for the model predictions.

    Parameters:
    ----------
    y_true : array-like
        The true labels of the dataset.
    y_pred : array-like
        The predicted labels from the model.
    class_names : list
        A list of class labels to display on the axes of the confusion matrix.
    arch : str
        The model architecture name (used for saving the plot with a relevant filename).

    Returns:
    -------
    None
        Saves the confusion matrix plot as a PNG file in the specified directory.
    """
    # Compute confusion matrix
    cm = confusion_matrix(y_true, y_pred)
    print(cm)
    # Create ConfusionMatrixDisplay object
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=class_names)

    # Plot confusion matrix on an explicitly sized figure
    fig, ax = plt.subplots(figsize=(10, 8))
    disp.plot(ax=ax, cmap='Reds', values_format='d')

    # Set labels and title for better clarity
    ax.set_xlabel("Predicted Label")
    ax.set_ylabel("True Label")
    ax.set_title(f"ConfMatrix - {arch}", loc='center', fontsize=10)

    # Save confusion matrix plot as PNG file
    os.makedirs('Vehicle Plots', exist_ok=True)
    cm_plot_filename = f"{arch}_confusion_matrix_with_weights.png"
    fig.savefig(os.path.join('Vehicle Plots', cm_plot_filename))

    # Close plot to free up memory
    plt.close(fig)

    # Print the filename where the plot is saved
    print(f"Confusion matrix plot saved as {cm_plot_filename}")

### `save_roc_auc_curve` Function

This function generates and saves a multi-class ROC-AUC curve plot for the model's predictions.

#### **Parameters:**
- `y_true`: **Ground truth class labels** (integers).
  - These are the true class labels of the data.
  
- `y_pred_probs`: **Predicted probabilities** from the model.
  - These are the probability scores predicted by the model for each class.
  
- `class_names`: **List of class names**.
  - A list containing the names of all the classes in the classification task.
  
- `arch`: **Architecture name** (string).
  - The name of the model architecture used, which will be included in the filename when saving the plot.

#### **Steps:**
1. **Binarize the labels**:
   - Converts the true class labels (`y_true`) into binary format for each class, as required for ROC curve computation.

2. **Calculate the ROC curve for each class**:
   - For each class, it computes the **False Positive Rate (FPR)** and **True Positive Rate (TPR)**, as well as the **AUC (Area Under the Curve)** score.

3. **Plot the ROC curve**:
   - For each class, the ROC curve is plotted with the calculated FPR and TPR values, and the corresponding AUC score is displayed in the legend.
   - A diagonal line (`k--`) representing a random classifier is also plotted.

4. **Save the ROC curve plot**:
   - The ROC curve plot is saved as a PNG file with the architecture name in the filename (e.g., `architecture_name_roc_curve.png`).

#### **Output:**
- The function saves the ROC curve plot as a PNG file in the `Vehicle Plots/` directory.
- Prints the filename where the plot has been saved.
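The one-vs-rest idea behind the per-class curves can be sketched in plain Python. The notebook's function uses `roc_curve` and `auc` from scikit-learn; the sketch below instead uses the equivalent pairwise-ranking view of AUC (the probability that a positive sample scores higher than a negative one, with ties counted as one half). The helper names and the toy probabilities are assumptions made for illustration.

```python
def binarize(y_true, n_classes):
    """One-vs-rest encoding: row i has a 1 in the column of its true class."""
    return [[1 if t == c else 0 for c in range(n_classes)] for t in y_true]

def pairwise_auc(binary_labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(binary_labels, scores) if y == 1]
    neg = [s for y, s in zip(binary_labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data: 6 samples, 3 classes, each row of probabilities sums to 1
y_true = [0, 1, 2, 1, 0, 2]
y_pred_probs = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.5, 0.25, 0.25],  # a class-1 sample the model is unsure about
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
]

y_true_bin = binarize(y_true, n_classes=3)
auc_per_class = []
for c in range(3):
    labels_c = [row[c] for row in y_true_bin]
    scores_c = [row[c] for row in y_pred_probs]
    auc_per_class.append(pairwise_auc(labels_c, scores_c))

print(auc_per_class)  # [1.0, 0.75, 1.0]
```

Only class 1 falls below 1.0, because one of its true samples receives a lower class-1 score than two samples from other classes.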


In [None]:
def save_roc_auc_curve(y_true, y_pred_probs, class_names, arch):
    """
    Saves multi-class ROC-AUC curve plot as PNG.

    Parameters:
    - y_true: Ground truth class labels (integers)
    - y_pred_probs: Predicted probabilities from the model
    - class_names: List of class names
    - arch: Architecture name (used in filename)
    """
    n_classes = len(class_names)
    y_true_bin = label_binarize(y_true, classes=list(range(n_classes)))

    fpr = dict()
    tpr = dict()
    roc_auc = dict()

    for i in range(n_classes):
        fpr[i], tpr[i], _ = roc_curve(y_true_bin[:, i], y_pred_probs[:, i])
        roc_auc[i] = auc(fpr[i], tpr[i])

    plt.figure(figsize=(10, 8))

    for i in range(n_classes):
        plt.plot(fpr[i], tpr[i], lw=2, label=f'{class_names[i]} (AUC = {roc_auc[i]:.2f})')

    plt.plot([0, 1], [0, 1], 'k--', lw=2)
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('ROC Curve - Multi-class')
    plt.legend(loc='lower right')

    os.makedirs('Vehicle Plots', exist_ok=True)
    roc_curve_filename = f"{arch}_roc_curve_with_weights.png"
    plt.savefig(os.path.join('Vehicle Plots', roc_curve_filename))
    plt.close()

    print(f"ROC curve plot saved as {roc_curve_filename}")


### Model Training with Conv2D Architecture

The model uses a Convolutional Neural Network (CNN) architecture with the following layers:

- **Convolutional Layers (Conv2D)**: Four layers with increasing filters (32, 64, 128, 256), each followed by Batch Normalization and MaxPooling2D to downsample the spatial dimensions.
- **Flatten**: Reshapes the final feature maps into a one-dimensional vector before the dense layer.
- **Dense Layers**: A fully connected layer with 128 units, followed by a dropout layer (0.5) and a final output layer with 6 units (corresponding to 6 classes), using the softmax activation function.

The optimizer used is **Adam** with a learning rate of 0.01.

Training runs for up to 20 epochs (early stopping may end it sooner), utilizing class weights for class imbalance. The model's performance is monitored on the validation generator during training and evaluated on the test set afterwards.


In [None]:
model = Sequential([
    Conv2D(32, (3,3), padding='same', activation='relu', input_shape=(192, 192, 3)),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(64, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(128, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(256, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(6, activation='softmax')
])

In [None]:
adam_optimizer = Adam(learning_rate=0.01)
fit_model(train_generator, model, adam_optimizer, valid_generator, test_generator, 'Conv2D with 4 layers with weights', 128, 0.01, 20, class_weight_dict)

### Model Training with Conv2D Architecture (5 Layers)

The updated model uses a Convolutional Neural Network (CNN) architecture with the following layers:

- **Convolutional Layers (Conv2D)**: Five layers with filters (32, 64, 128, 128, 256), each followed by Batch Normalization and MaxPooling2D for spatial downsampling.
- **GlobalAveragePooling2D**: Reduces the spatial dimensions before passing the data to the dense layer.
- **Dense Layers**: A fully connected layer with 128 units followed by a dropout layer (0.5) and a final output layer with 6 units, using softmax activation for multi-class classification.

The optimizer used is **Adam** with a learning rate of 0.001.

Training is conducted for 20 epochs, with class weights applied for handling class imbalance. The model's performance is evaluated using the validation and test generators.
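The effect of replacing `Flatten` with `GlobalAveragePooling2D` can be checked with some quick arithmetic. The sketch below assumes that each `MaxPooling2D((2,2))` halves the height and width (stride equal to the pool size) and that `padding='same'` convolutions leave them unchanged; the helper names are hypothetical.

```python
def spatial_size_after_pooling(input_size, n_pool_layers, pool=2):
    """Side length of the feature map after repeated 2x2 max pooling."""
    size = input_size
    for _ in range(n_pool_layers):
        size //= pool
    return size

def dense_params(n_inputs, n_units):
    """Trainable parameters of a Dense layer: weights plus biases."""
    return n_inputs * n_units + n_units

# Four-block model with Flatten: 192 -> 96 -> 48 -> 24 -> 12
side_4 = spatial_size_after_pooling(192, 4)                  # 12
flatten_features = side_4 * side_4 * 256                     # 36864
flatten_dense_params = dense_params(flatten_features, 128)   # 4718720

# Five-block model with GlobalAveragePooling2D: one value per channel
side_5 = spatial_size_after_pooling(192, 5)                  # 6
gap_features = 256
gap_dense_params = dense_params(gap_features, 128)           # 32896

print(flatten_dense_params, gap_dense_params)
```

Under these assumptions the first dense layer shrinks from roughly 4.7 million parameters to about 33 thousand, which reduces the capacity available for overfitting in the classifier head.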


In [None]:
model2 = Sequential([
    Conv2D(32, (3,3), padding='same', activation='relu', input_shape=(192, 192, 3)),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(64, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(128, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),
    Conv2D(128, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),
    Conv2D(256, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    GlobalAveragePooling2D(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(6, activation='softmax')
])

In [None]:
adam_optimizer = Adam(learning_rate=0.001)
fit_model(train_generator, model2, adam_optimizer, valid_generator, test_generator, 'Conv2D with 5 layers with weights', 128, 0.001, 20, class_weight_dict)

### Model Training with Conv2D Architecture (4 Layers) - 40 Epochs

This model uses a Convolutional Neural Network (CNN) architecture with the following layers:

- **Convolutional Layers (Conv2D)**: Four layers with filters (32, 64, 128, 256), each followed by Batch Normalization and MaxPooling2D for spatial downsampling.
- **GlobalAveragePooling2D**: Reduces the spatial dimensions before passing the data to the dense layer.
- **Dense Layers**: A fully connected layer with 128 units followed by a dropout layer (0.5) and a final output layer with 6 units, using softmax activation for multi-class classification.

The optimizer used is **Adam** with a learning rate of 0.001.

Training is conducted for 40 epochs, with class weights applied to handle class imbalance. The model's performance is evaluated using the validation and test generators.


In [None]:
model3 = Sequential([
    Conv2D(32, (3,3), padding='same', activation='relu', input_shape=(192, 192, 3)),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(64, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(128, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(256, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    GlobalAveragePooling2D(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(6, activation='softmax')
])

In [None]:
fit_model(train_generator, model3, Adam(learning_rate=0.001), valid_generator, test_generator, 'Conv2D with 4 layers and 40 epochs with weights', 128, 0.001, 40, class_weight_dict)

### Model Training with Conv2D Architecture (256 Dense Units) - 30 Epochs

This model features a Convolutional Neural Network (CNN) architecture with the following layers:

- **Convolutional Layers (Conv2D)**: Four layers with filters (32, 64, 128, 256), each followed by Batch Normalization and MaxPooling2D to reduce spatial dimensions.
- **GlobalAveragePooling2D**: Reduces the spatial dimensions to a single vector before the fully connected layer.
- **Dense Layer**: A fully connected layer with 256 units and ReLU activation, followed by a dropout layer with a rate of 0.5 for regularization. The final output layer has 6 units, using softmax activation for multi-class classification.

The optimizer used is **Adam** with a learning rate of 0.001.

Training runs for 30 epochs with class weights applied to handle class imbalance, and the model is evaluated on the validation and test sets.


In [None]:
model4 = Sequential([
    Conv2D(32, (3,3), padding='same', activation='relu', input_shape=(192, 192, 3)),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(64, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(128, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(256, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    GlobalAveragePooling2D(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(6, activation='softmax')
])

In [None]:
adam_optimizer = Adam(learning_rate=0.001)
fit_model(train_generator, model4, adam_optimizer, valid_generator, test_generator, 'Conv2D with 256 dense layers with weights', 256, 0.001, 30, class_weight_dict)

### Model Training with Conv2D Architecture (5 Layers, 192x192 Input) with Manual Weights - 40 Epochs

This model utilizes a deep Convolutional Neural Network (CNN) architecture with the following layers:

- **Convolutional Layers (Conv2D)**: 
  - Five convolutional layers with 32, 64, 128, 128, and 256 filters respectively.
  - Each convolutional layer is followed by **BatchNormalization** and **MaxPooling2D** layers to reduce spatial dimensions.

- **GlobalAveragePooling2D**: 
  - After the convolutional layers, the spatial dimensions are reduced to a single vector using GlobalAveragePooling2D.

- **Fully Connected Layers**:
  - A **Dense layer** with 256 units and ReLU activation.
  - A **Dropout layer** with a rate of 0.5 for regularization to prevent overfitting.
  - The output layer has **6 units** with **softmax activation** for multi-class classification.

#### Optimizer:
- **Adam** optimizer with a learning rate of **0.0001**.

#### Class Weights:
- **Manual Weights**: To handle class imbalance, the following manual class weights are applied:
  - Class 0: 5.0
  - Class 1: 1.0
  - Class 2: 2.0
  - Class 3: 1.5
  - Class 4: 1.2
  - Class 5: 1.0

#### Model Training:
- The model is trained for **40 epochs** using the Adam optimizer and the manual class weights. 
- **Training Data**: The model is trained using the `train_generator`.
- **Validation and Test Data**: The model is evaluated on the `valid_generator` and `test_generator`.
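
As a minimal sketch of what the manual weights do, the example below scales the cross-entropy of a single sample by the weight of its true class. This is the general idea behind passing `class_weight` to Keras `fit`; the exact batching and reduction used inside `fit_model` are not shown in this section, so they are left out here. The function `weighted_sample_loss` and the probability vectors are hypothetical.

```python
import math

manual_weights = {0: 5.0, 1: 1.0, 2: 2.0, 3: 1.5, 4: 1.2, 5: 1.0}


def weighted_sample_loss(true_class, predicted_probs, class_weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    cross_entropy = -math.log(predicted_probs[true_class])
    return class_weights[true_class] * cross_entropy


# Hypothetical softmax outputs: in both cases the model assigns
# probability 0.5 to the true class.
probs_true_class0 = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]
probs_true_class1 = [0.1, 0.5, 0.1, 0.1, 0.1, 0.1]

loss_class0 = weighted_sample_loss(0, probs_true_class0, manual_weights)
loss_class1 = weighted_sample_loss(1, probs_true_class1, manual_weights)
print(loss_class0, loss_class1)
```

With equal confidence in the true class, the class 0 sample contributes five times as much loss as the class 1 sample, which pushes the optimizer to prioritise the heavily weighted class.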


In [None]:
model5 = Sequential([
    Conv2D(32, (3,3), padding='same', activation='relu', input_shape=(192, 192, 3)),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(64, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(128, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(128, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(256, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    GlobalAveragePooling2D(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(6, activation='softmax')
])
model5.summary()

In [None]:
manual_weights = {0: 5.0, 1: 1.0, 2: 2.0, 3: 1.5, 4: 1.2, 5: 1.0}
adam_optimizer = Adam(learning_rate=0.0001)
fit_model(train_generator, model5, adam_optimizer, valid_generator, test_generator, 'Conv2D with 5 layers and 192x192 with weights', 256, 0.0001, 40, manual_weights)

In [None]:
print(train_generator.class_indices)

### Model Training with Conv2D Architecture (6 Layers, 192x192 Input) - 40 Epochs

This model utilizes a deep Convolutional Neural Network (CNN) architecture with the following layers:

- **Convolutional Layers (Conv2D)**: Six layers with filters (32, 64, 128, 128, 256, 256), each followed by Batch Normalization and MaxPooling2D. Dropout (0.2) is applied after the second and fifth convolutional blocks to prevent overfitting.
- **GlobalAveragePooling2D**: Reduces the spatial dimensions to a single vector before the fully connected layer.
- **Dense Layer**: A fully connected layer with 256 units and ReLU activation, followed by a dropout layer with a rate of 0.5 for regularization. The output layer has 6 units with softmax activation for multi-class classification.

The optimizer used is **Adam** with a learning rate of 0.0001. The model is trained for 40 epochs with class weights to address class imbalance, and it is evaluated on both the validation and test sets.
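
To give a sense of the model's size, the sketch below estimates the parameter count of the architecture described above using the standard per-layer formulas: `(k * k * in_channels + 1) * out_channels` for a Conv2D layer with bias, four values per channel for BatchNormalization, and `(in_units + 1) * out_units` for a Dense layer. Pooling and Dropout layers have no parameters. The helper names are hypothetical; if the layers are as listed, the total should agree with what `model6.summary()` reports.

```python
def conv2d_params(in_channels, out_channels, kernel=3):
    # One kernel x kernel x in_channels filter plus one bias per output channel.
    return (kernel * kernel * in_channels + 1) * out_channels


def batchnorm_params(channels):
    # gamma and beta (trainable) plus moving mean and variance (non-trainable).
    return 4 * channels


def dense_params(in_units, out_units):
    # One weight per input-output pair plus one bias per output unit.
    return (in_units + 1) * out_units


def count_params(input_channels, conv_filters, dense_units, n_classes):
    total = 0
    in_channels = input_channels
    for filters in conv_filters:
        total += conv2d_params(in_channels, filters) + batchnorm_params(filters)
        in_channels = filters
    # GlobalAveragePooling2D has no parameters; it outputs one value per channel.
    total += dense_params(in_channels, dense_units)
    total += dense_params(dense_units, n_classes)
    return total


total_params = count_params(3, [32, 64, 128, 128, 256, 256], 256, 6)
print(total_params)
```

Most of the parameters sit in the last two convolutional layers, which is one reason Dropout and class weighting are used to limit overfitting on a small dataset.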


In [None]:
model6 = Sequential([
    Conv2D(32, (3,3), padding='same', activation='relu', input_shape=(192, 192, 3)),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(64, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),
    Dropout(0.2),

    Conv2D(128, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(128, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    Conv2D(256, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),
    Dropout(0.2),

    Conv2D(256, (3,3), padding='same', activation='relu'),
    BatchNormalization(),
    MaxPooling2D((2,2)),

    GlobalAveragePooling2D(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(6, activation='softmax')
])

In [None]:
adam_optimizer = Adam(learning_rate=0.0001)
fit_model(train_generator, model6, adam_optimizer, valid_generator, test_generator, 'Conv2D with 6 layers and 192x192 with weights', 256, 0.0001, 40, class_weight_dict)

### CNN Model Performance Comparison

#### Architecture Overview
We evaluated six CNN architectures that differ in the number of layers, the learning rate, and the number of epochs. Below, the models are compared on the key test metrics logged for each training run.

#### Models Overview:
1. **Model 1 (Conv2D with 5 layers)**
2. **Model 2 (Conv2D with 5 layers and 40 epochs)**
3. **Model 3 (Conv2D with 256 dense layers)**
4. **Model 4 (Conv2D with 5 layers and 192x192 input)**
5. **Model 5 (Conv2D with 6 layers and 192x192 input)**
6. **Model 6 (Conv2D with 6 layers and 192x192 input + Dropout)**




In [None]:
metrics = pd.read_csv('metrics.csv')

In [None]:
metrics

### Model Performance Analysis

The analysis below compares the convolutional neural network (CNN) architectures on the logged metrics: **test loss**, **test accuracy**, **test precision**, **test recall**, **test AUC**, and **F1 scores**.

#### Summary of Key Metrics:
1. **Test Loss**: Lower test loss indicates a better model fit, as the model's predictions are closer to the true values.
2. **Test Accuracy**: The percentage of correct predictions out of the total predictions made. Higher accuracy indicates better overall performance.
3. **Test Precision**: The proportion of true positive predictions out of all positive predictions. For a given class, it measures how often a prediction of that class is correct, which matters on imbalanced datasets where accuracy alone can be misleading.
4. **Test Recall**: The proportion of true positive predictions out of all actual positives. Recall is crucial when false negatives are the costlier error.
5. **Test AUC (Area Under the ROC Curve)**: A higher AUC indicates that the model is better at distinguishing between classes.
6. **F1 Score (Macro, Micro, Weighted)**: The F1 score is the harmonic mean of precision and recall. The **macro** F1 score is the unweighted mean of the per-class F1 scores, the **micro** F1 score is computed from the true positives, false positives, and false negatives pooled over all classes, and the **weighted** F1 score accounts for class imbalance by weighting each class's F1 score by its frequency.
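
The difference between the three F1 averages is easiest to see on a small example. The sketch below computes them from scratch on hypothetical, imbalanced toy labels (three classes, not the notebook's data), assuming the usual definitions. The function name `f1_scores` is illustrative; in the notebook itself these values would normally come from `sklearn.metrics.f1_score` with `average='macro'`, `'micro'`, or `'weighted'`.

```python
def f1_scores(y_true, y_pred, classes):
    """Return macro, micro and weighted F1 for single-label predictions."""
    per_class_f1, supports = [], []
    tp_total = fp_total = fn_total = 0
    for cls in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        per_class_f1.append(2 * tp / denom if denom else 0.0)
        supports.append(tp + fn)  # number of true samples of this class
        tp_total += tp
        fp_total += fp
        fn_total += fn
    macro = sum(per_class_f1) / len(classes)
    micro = 2 * tp_total / (2 * tp_total + fp_total + fn_total)
    weighted = sum(f * s for f, s in zip(per_class_f1, supports)) / sum(supports)
    return {"macro": macro, "micro": micro, "weighted": weighted}


# Hypothetical toy labels: class 0 is frequent, class 1 is never predicted.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 0, 2, 2, 2]
scores = f1_scores(y_true, y_pred, classes=[0, 1, 2])
print(scores)
```

Because class 1 is missed entirely, the macro score is pulled down the most, while the micro score (which equals accuracy for single-label multi-class predictions) is dominated by the frequent class.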

#### Detailed Observations:

1. **Conv2D with 4 layers**:
   - Test accuracy: **0.1685** (very low; roughly the 1/6 ≈ 0.167 expected from random guessing over six classes).
   - Test AUC: **0.6177** (weak, though above the 0.5 of a random classifier, suggesting some ability to differentiate between classes).
   - F1 scores are also low, indicating poor precision and recall.

2. **Conv2D with 5 layers**:
   - Test accuracy: **0.4250** (a significant improvement over the 4-layer model).
   - Test AUC: **0.8115** (better differentiation between classes).
   - F1 scores show improvements, particularly **f1_weighted_test** (**0.4565**), indicating a better balance.

3. **Conv2D with 4 layers and 40 epochs**:
   - Test accuracy: **0.4130** (similar to the 5-layer model despite having one fewer convolutional layer).
   - Test AUC: **0.8051** (still quite good).
   - The F1 scores are lower than the 5-layer model, but performance is relatively close.

4. **Conv2D with 256 dense layers**:
   - Test accuracy: **0.5009** (significant improvement).
   - Test AUC: **0.8508** (good ability to distinguish classes).
   - F1 scores are higher across the board; **f1_micro_test** reaches **0.5043**.

5. **Conv2D with 5 layers and 192x192 input size**:
   - Test accuracy: **0.4259** (similar to the 5-layer model, with a higher input resolution).
   - Test AUC: **0.7957** (slightly lower than the three preceding runs).
   - The F1 scores are competitive with the 5-layer model.

6. **Conv2D with 6 layers and 192x192 input size**:
   - Test accuracy: **0.4157** (slightly lower than the 5-layer, 192x192 model).
   - Test AUC: **0.8160** (improved AUC).
   - F1 scores are close to other models with 192x192 input size.

7. **Conv2D with 5 layers and 192x192 with weights**:
   - Test accuracy: **0.4380** (a small improvement).
   - Test AUC: **0.8187** (better differentiation ability).
   - F1 scores are decent but don't significantly outperform previous models.

8. **Conv2D with 5 layers and 192x192 with manual weights**:
   - Test accuracy: **0.6898** (a significant improvement).
   - Test AUC: **0.9459** (excellent performance in distinguishing between classes).
   - The **F1 scores** (especially **f1_weighted_test** of **0.6899**) are the highest of all runs and close to the test accuracy, which suggests that precision and recall are reasonably balanced across classes.
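
The observations above can be lined up side by side. The sketch below uses only the test accuracy and test AUC values quoted in this list (it does not read `metrics.csv`, whose column names are not shown here) and ranks the runs by accuracy. The variable names and the chance-level baseline of 1/6 for six classes are illustrative.

```python
# (run name, test accuracy, test AUC) as quoted in the observations above.
runs = [
    ("Conv2D with 4 layers", 0.1685, 0.6177),
    ("Conv2D with 5 layers", 0.4250, 0.8115),
    ("Conv2D with 4 layers and 40 epochs", 0.4130, 0.8051),
    ("Conv2D with 256 dense layers", 0.5009, 0.8508),
    ("Conv2D with 5 layers and 192x192 input size", 0.4259, 0.7957),
    ("Conv2D with 6 layers and 192x192 input size", 0.4157, 0.8160),
    ("Conv2D with 5 layers and 192x192 with weights", 0.4380, 0.8187),
    ("Conv2D with 5 layers and 192x192 with manual weights", 0.6898, 0.9459),
]

chance_accuracy = 1 / 6  # uniform random guessing over six classes

# Rank the runs from highest to lowest test accuracy.
ranked = sorted(runs, key=lambda run: run[1], reverse=True)
for name, accuracy, auc_value in ranked:
    gain = accuracy - chance_accuracy
    print(f"{name:55s} acc={accuracy:.4f} (+{gain:.4f} over chance) auc={auc_value:.4f}")

best_run = ranked[0]
worst_run = ranked[-1]
```

Ranking by AUC instead (`key=lambda run: run[2]`) leaves the best and worst runs unchanged, although the order of the runs in between shifts.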

#### Conclusion:

This study developed and evaluated CNNs for vehicle damage classification to support insurance claim verification. The best results came from a five-layer model with ReLU activation, Batch Normalization and Global Average Pooling, trained with manual class weights. It achieved a strong AUC of about 0.95 and a precision of 0.72, which demonstrates its effectiveness in differentiating between the six damage labels.

Regularization strategies such as Dropout and data augmentation, together with class weights to handle class imbalance, were key to improving performance and reducing overfitting. However, the training and validation accuracy curves showed that the model was still overfitting, which suggests that further improvements are required.

Overall, the results show that CNNs are a good tool for classifying images in this field. With further refinement, a more advanced architecture, and a more balanced dataset, we could enhance our model’s performance.