In [None]:
!pip install livelossplot

Importing Libraries: The code starts by importing necessary libraries and modules including ones for data handling, model building, evaluation metrics, and visualization.

In [None]:
import os
import numpy as np
import pandas as pd
import random
import glob
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib.image as mimg
%matplotlib inline
from PIL import Image
from scipy import misc

import tensorflow as tf
import keras
import keras.backend as K
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model

from keras.layers import Input, Conv2D, Conv2DTranspose, MaxPooling2D, Concatenate, BatchNormalization
from keras.layers import UpSampling2D, Dropout, Add, Multiply, Subtract, AveragePooling2D
from keras.layers import Activation, SpatialDropout2D
from keras.layers import Dense, Lambda
from keras.layers import GlobalAveragePooling2D, Reshape, Dense, Permute, Flatten

from keras.utils import plot_model

from keras.optimizers import * 
from keras.callbacks import *
from keras.activations import *

from sklearn.metrics import classification_report, confusion_matrix, roc_curve, auc
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score
from sklearn.preprocessing import label_binarize

from livelossplot import PlotLossesKeras
from mlxtend.plotting import plot_confusion_matrix

from tensorflow.keras.applications import DenseNet201, DenseNet121, DenseNet169, InceptionResNetV2, ResNet152V2
from scipy.ndimage import median_filter


Setting Seeds: Seeds for Python's `random` module, NumPy, and TensorFlow are set to make runs more reproducible; some GPU operations can still be nondeterministic.

In [None]:
np.random.seed(42)
random.seed(42)
tf.random.set_seed(42)

Defining Data Directories: Paths to directories containing training, validation, and test data are specified.

In [None]:
train_loc = '/kaggle/input/mpox-skin-lesion-dataset-version-20-msld-v20/Augmented Images/Augmented Images/FOLDS_AUG/fold5_AUG/Train'
val_loc = '/kaggle/input/mpox-skin-lesion-dataset-version-20-msld-v20/Original Images/Original Images/FOLDS/fold5/Valid'
test_loc = '/kaggle/input/mpox-skin-lesion-dataset-version-20-msld-v20/Original Images/Original Images/FOLDS/fold5/Test'

The batch size for the training, validation, and test data generators is set to 32.

In [None]:
# Optionally try different batch sizes for training and validation, then compare the results.
BATCH_SIZE = 32

Data Generators: Image data generators are initialized for training, validation, and test data. These generators will preprocess the images and yield batches of data during model training.
These lines of code are using the `ImageDataGenerator` class from Keras to create data generators for loading and augmenting images from directories. Here's a breakdown:

1. **ImageDataGenerator**:
   - `ImageDataGenerator` is a utility class in Keras that generates batches of tensor image data with real-time data augmentation.

2. **Data Loading**:
   - `train_data`, `val_data`, and `test_data` are data generators created for training, validation, and testing datasets respectively.
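   - Each data generator loads images from a specific directory (`train_loc`, `val_loc`, `test_loc`) and yields them in batches. Because `ImageDataGenerator()` is created without arguments here, no augmentation or rescaling is applied on the fly; the training directory already contains pre-augmented images.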

3. **Parameters**:
   - `directory`: Specifies the path to the directory containing the images.
   - `target_size`: Tuple specifying the height and width to which all images will be resized during loading.
   - `batch_size`: The number of samples in each batch of data generated.
   - `shuffle`: Boolean indicating whether to shuffle the order of the images. It is `False` for the test generator so that predictions stay aligned with `test_data.classes`.
   - `seed`: Random seed for shuffling and transformations.

4. **Flow from Directory**:
   - `flow_from_directory` method is called on each `ImageDataGenerator` object to generate batches of data from the specified directories.
   - It automatically infers the labels from the subdirectory structure and assigns them to the images.

5. **Training, Validation, and Testing Data**:
   - `train_data`: Data generator for training dataset.
   - `val_data`: Data generator for validation dataset.
   - `test_data`: Data generator for testing dataset.

These data generators are suitable for training convolutional neural networks (CNNs) on image classification tasks where the dataset is organized into separate directories for each class. The generators load images in batches, which is memory efficient; real-time augmentation can be enabled by passing transformation arguments to `ImageDataGenerator`.
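
As a rough illustration of how labels are inferred from the folder layout, the sketch below mimics the convention that `flow_from_directory` follows: subdirectory names are sorted alphanumerically and mapped to integer indices. The helper name `infer_class_indices` and the toy folder names are made up for this example; it is not the Keras implementation.

```python
import os
import tempfile

def infer_class_indices(directory):
    """Map each subdirectory name to an integer index, in sorted order."""
    class_names = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(class_names)}

# Build a throwaway directory tree with hypothetical class folders.
with tempfile.TemporaryDirectory() as root:
    for name in ["Mpox", "Chickenpox", "Healthy"]:
        os.makedirs(os.path.join(root, name))
    class_indices = infer_class_indices(root)

print(class_indices)  # {'Chickenpox': 0, 'Healthy': 1, 'Mpox': 2}
```

With the real generators, the same kind of mapping is available as `train_data.class_indices`.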

In [None]:
trdata = ImageDataGenerator()
train_data = trdata.flow_from_directory(directory=train_loc, target_size=(224,224), batch_size=BATCH_SIZE, shuffle=True, seed=42)

vdata = ImageDataGenerator()
val_data = vdata.flow_from_directory(directory=val_loc, target_size=(224,224), batch_size=BATCH_SIZE, shuffle=True, seed=42)

tsdata = ImageDataGenerator()
test_data = tsdata.flow_from_directory(directory=test_loc, target_size=(224,224), batch_size=BATCH_SIZE, shuffle=False, seed=42)

Model Creation Function: The `create_model` function builds a classification model using one of the specified pre-trained CNN architectures (DenseNet121, DenseNet169, ResNet152V2, InceptionResNetV2, DenseNet201). The function adds custom fully connected layers on top of the pre-trained base and compiles the model with the specified optimizer and loss function.

This `create_model` function constructs a convolutional neural network (CNN) based on the specified model architecture (`model_name`). Here's how it works:

- **Inputs**:
  - `model_name`: Name of the model architecture to be used (`DenseNet121`, `DenseNet169`, `InceptionResNetV2`, `ResNet152V2`, or `DenseNet201`).
  - `input_shape`: Shape of the input data (e.g., `(height, width, channels)`).
  - `n_classes`: Number of classes for classification.
  - `optimizer`: Optimizer used for training the model.
  - `fine_tune`: Flag intended to control fine-tuning. The function body shown here does not use it, so all layers of the pre-trained base remain trainable.

- **Model Construction**:
  - Based on the `model_name`, it initializes the corresponding pre-trained CNN model (e.g., DenseNet121, DenseNet169) with pre-trained weights from ImageNet.
  - The top layers of the pre-trained model are removed, leaving only the convolutional base.
  - Additional layers are added on top of the convolutional base for custom classification:
    - `GlobalAveragePooling2D`: Performs global average pooling operation over the spatial dimensions of the input. This reduces each feature map to a single number by taking the average.
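    - `Flatten`: Flattens the input to a one-dimensional array per sample. After global average pooling the tensor is already of shape `(batch, channels)`, so this layer leaves it unchanged.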
    - `Dense` layers with ReLU activation: These fully connected layers introduce non-linearity to the model.
    - `Dropout`: Regularization technique to prevent overfitting by randomly dropping a fraction of units (0.2 and 0.3) during training.
    - Final `Dense` layer with softmax activation: Produces output probabilities for each class.

- **Model Compilation**:
  - Compiles the model using the specified `optimizer` and loss function (`categorical_crossentropy` for multi-class classification).
  - Metrics are set to `'accuracy'` for monitoring training and validation accuracy.

- **Output**:
  - Returns the compiled model.

This function allows for creating various CNN architectures with customizable input shape, number of classes, and optimizer. It leverages transfer learning by initializing pre-trained models and fine-tuning them for the specific classification task.
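
To make the pooling step concrete, here is a small NumPy sketch of what global average pooling does to a batch of feature maps, and why flattening afterwards changes nothing. The array sizes are arbitrary toy values, not the real output shape of any of the base networks.

```python
import numpy as np

# Toy feature maps: batch of 2 images, 7x7 spatial grid, 4 channels.
feature_maps = np.arange(2 * 7 * 7 * 4, dtype=np.float32).reshape(2, 7, 7, 4)

# Global average pooling: average over the two spatial axes (height, width).
pooled = feature_maps.mean(axis=(1, 2))

# Flattening a (batch, channels) array leaves its shape unchanged.
flattened = pooled.reshape(pooled.shape[0], -1)

print(pooled.shape)     # (2, 4)
print(flattened.shape)  # (2, 4)
```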

In [None]:
def create_model(model_name, input_shape, n_classes, optimizer, fine_tune):
    if model_name == 'DenseNet121':
        conv_base = DenseNet121(include_top=False, weights='imagenet', input_shape=input_shape)
    elif model_name == 'DenseNet169':
        conv_base = DenseNet169(include_top=False, weights='imagenet', input_shape=input_shape)
    elif model_name == 'InceptionResNetV2':
        conv_base = InceptionResNetV2(include_top=False, weights='imagenet', input_shape=input_shape)
    elif model_name == 'ResNet152V2':
        conv_base = ResNet152V2(include_top=False, weights='imagenet', input_shape=input_shape)
    elif model_name == 'DenseNet201':
        conv_base = DenseNet201(include_top=False, weights='imagenet', input_shape=input_shape)
    
    else:
        raise ValueError("Invalid model name!")
    
    top_model = conv_base.output
    top_model = GlobalAveragePooling2D()(top_model)
    top_model = Flatten(name="flatten")(top_model)
    top_model = Dense(128, activation='relu')(top_model)
    top_model = Dropout(0.2)(top_model)    
    top_model = Dense(64, activation='relu')(top_model)
    top_model = Dropout(0.3)(top_model)
    output_layer = Dense(n_classes, activation='softmax')(top_model)
    
    model = Model(inputs=conv_base.input, outputs=output_layer)
    
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    
    return model

In situations where the hard voting predicts one class as the majority and the soft voting predicts a different class, but the actual class is neither of them, it indicates a disagreement among the ensemble models. This scenario can occur due to various reasons such as noise in the data, mislabeling, or inherent uncertainty in the classification task.
To handle such situations and improve the robustness of your classifier, you can consider the following steps:

1.	Threshold Adjustment: Adjust the threshold for accepting the hard-voting result. Instead of taking whichever class receives the most votes, you can require stronger agreement (e.g., more than half of the models, or a larger supermajority) before making a decision based on hard voting.

2.	Confidence Score: Calculate a confidence score for each prediction based on the agreement among the models. If there is high agreement among the models, you can trust the prediction more. Conversely, if there is disagreement, you can assign a lower confidence score to the prediction.

3.	Ensemble Weights: Assign different weights to the predictions of individual models based on their performance or reliability. Models with higher performance or higher confidence scores can be given more weight in the final decision.

4.	Post-Processing Techniques: Apply post-processing techniques such as smoothing or filtering to the predictions to remove noise or outliers.

5.	Human-in-the-Loop: Incorporate human feedback or domain knowledge to resolve conflicting predictions. Human experts can provide valuable insights in ambiguous cases.

6.	Revisiting Training Data: Reevaluate the training data and labels to ensure accuracy and consistency. Retraining the models with improved data quality may lead to better performance.

7.	Model Diversity: Ensure diversity among the ensemble models by using different architectures, hyperparameters, or training data. This can help reduce the likelihood of all models making the same errors.

8.	Error Analysis: Conduct thorough error analysis to identify patterns in misclassifications and potential causes. This can guide future improvements in data collection, preprocessing, or model architecture.
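
As one possible illustration of the ensemble-weights idea (item 3), the sketch below averages class probabilities with a weight per model. The function name `weighted_soft_vote`, the weights, and the probabilities are all invented for the example; how the weights are chosen (for instance from validation accuracy) is left open.

```python
import numpy as np

def weighted_soft_vote(probabilities, weights):
    """Combine per-model class probabilities with per-model weights.

    probabilities: array of shape (n_models, n_samples, n_classes)
    weights: one non-negative weight per model
    """
    probabilities = np.asarray(probabilities, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Weighted average over the model axis -> (n_samples, n_classes)
    averaged = np.tensordot(weights, probabilities, axes=1)
    return averaged.argmax(axis=1), averaged

# Three hypothetical models, two samples, three classes.
probs = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    [[0.1, 0.8, 0.1], [0.3, 0.4, 0.3]],
    [[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]],
]
labels, averaged = weighted_soft_vote(probs, weights=[0.5, 0.3, 0.2])
print(labels)  # [1 1]
```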


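To combine hard and soft voting, one option is to compute the hard voting predictions first and use the soft voting predictions only for the cases where there is no clear majority in the hard vote. The `combined_voting` function below uses a related rule: it keeps the hard voting prediction when the models' average top-class probability reaches a threshold, and falls back to the soft voting prediction otherwise.
By combining both hard and soft voting, you can leverage the benefits of both approaches and potentially improve the overall accuracy of your ensemble classifier.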
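
A minimal sketch of the "hard vote first, soft vote as fallback" idea is shown here. The helper `hard_vote_with_soft_fallback` is hypothetical and uses a strict majority (more than half of the models) as the definition of a clear majority, which is an assumption made for the example.

```python
import numpy as np

def hard_vote_with_soft_fallback(probabilities):
    """Use the majority class when more than half of the models agree,
    otherwise fall back to the class with the highest mean probability."""
    probabilities = np.asarray(probabilities, dtype=float)
    n_models, n_samples, n_classes = probabilities.shape
    votes = probabilities.argmax(axis=2)               # (n_models, n_samples)
    soft = probabilities.mean(axis=0).argmax(axis=1)   # (n_samples,)
    result = np.empty(n_samples, dtype=int)
    for i in range(n_samples):
        counts = np.bincount(votes[:, i], minlength=n_classes)
        if counts.max() > n_models / 2:
            result[i] = counts.argmax()   # clear majority: keep the hard vote
        else:
            result[i] = soft[i]           # no majority: use the soft vote
    return result

# Three hypothetical models, two samples, three classes.
probs = [
    [[0.9, 0.1, 0.0], [0.5, 0.4, 0.1]],
    [[0.6, 0.3, 0.1], [0.1, 0.8, 0.1]],
    [[0.2, 0.7, 0.1], [0.2, 0.1, 0.7]],
]
combined = hard_vote_with_soft_fallback(probs)
print(combined)  # [0 1]
```

In the first sample two of three models vote for class 0, so the hard vote is kept. In the second sample every model votes differently, so the mean probabilities decide.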

•	I've added a function combined_voting to perform ensemble voting with confidence score and threshold adjustment.

•	Inside train_evaluate_ensemble, after evaluating individual models and obtaining their predictions, I've calculated ensemble predictions using hard voting, soft voting, and the combined approach.

•	The combined_voting function takes the predictions from individual models, calculates the confidence scores, and then combines the predictions based on the threshold.

•	Finally, I've evaluated the combined predictions and printed the classification report and confusion matrix for the combined model.
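
The shapes involved in the confidence score are easy to get wrong, so here is a small sketch under the assumption that the stacked predictions have shape `(n_models, n_samples, n_classes)` and that the confidence of a sample is the mean, over models, of each model's top-class probability. The random values are placeholders, not real model outputs.

```python
import numpy as np

# Hypothetical stacked predictions: 3 models, 4 samples, 6 classes.
rng = np.random.default_rng(0)
raw = rng.random((3, 4, 6))
test_predictions = raw / raw.sum(axis=2, keepdims=True)  # each row sums to 1

# Per model and per sample: probability assigned to the predicted class.
top_probability = test_predictions.max(axis=2)    # shape (3, 4)

# Average over the model axis -> one confidence score per sample.
confidence_scores = top_probability.mean(axis=0)  # shape (4,)

threshold = 0.7
use_hard_vote = confidence_scores >= threshold    # boolean mask, one entry per sample
print(confidence_scores.shape, use_hard_vote)
```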


In [None]:
def combined_voting(test_predictions, hard_voting_predictions, soft_voting_predictions, threshold):
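    # test_predictions has shape (n_models, n_samples, n_classes): take each model's top-class
    # probability per sample (axis=2), then average over models (axis=0) -> one score per sample
    confidence_scores = np.mean(np.max(test_predictions, axis=2), axis=0)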

    combined_predictions = []

    min_len = min(len(hard_voting_predictions), len(soft_voting_predictions))
    
    for i in range(min_len):
        if confidence_scores[i] >= threshold:
            combined_predictions.append(hard_voting_predictions[i])
        else:
            combined_predictions.append(soft_voting_predictions[i])

    combined_predictions = np.array(combined_predictions)
    return combined_predictions


Post-processing techniques can be applied after obtaining the combined predictions to refine the final output. In the provided code, filtering or smoothing is applied to the predictions to remove noise or outliers and improve their quality.

Smoothing and filtering are both techniques used in signal and image processing to enhance or modify data. While they serve similar purposes, they are typically used in slightly different contexts and may employ different mathematical methods.

1.	Smoothing:

•	Smoothing is a process of reducing the noise or high-frequency variations in a signal or an image.

•	It aims to create a smoother version of the data by averaging or interpolating neighboring values.

•	Smoothing is often used to remove noise, blur sharp edges, or simplify complex patterns.

•	Common smoothing techniques include Gaussian smoothing, median filtering, and moving average filtering.

•	Smoothing is commonly applied to data that exhibit random fluctuations or high-frequency noise.

2.	Filtering:

•	Filtering refers to the process of modifying or extracting specific components from a signal or an image using a filter.

•	Filters can be designed to enhance certain features, suppress noise, or extract relevant information from the data.

•	Filtering can be categorized into various types such as low-pass, high-pass, band-pass, and notch filters, each designed to address specific frequency components.

•	Filtering can be linear or nonlinear, depending on the characteristics of the filter.

•	Filtering is a broader concept that encompasses various operations, including smoothing, sharpening, edge detection, and feature extraction.

In summary, smoothing is a specific type of filtering operation that focuses on reducing noise and creating a smoother version of the data, while filtering encompasses a broader range of operations aimed at modifying or extracting specific components from the data. Smoothing is often used as a preprocessing step before further analysis, while filtering can serve multiple purposes depending on the specific application.
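
The difference between a median filter and a moving average can be seen on a tiny one-dimensional example. The sketch below is plain Python with a hypothetical helper `sliding_window`; it pads the edges by repeating the first and last values, which is one of several possible boundary conventions.

```python
from statistics import mean, median

def sliding_window(values, size, reducer):
    """Apply `reducer` to a centered window, repeating edge values as padding."""
    half = size // 2
    padded = [values[0]] * half + list(values) + [values[-1]] * half
    return [reducer(padded[i:i + size]) for i in range(len(values))]

signal = [1, 1, 1, 9, 1, 1, 1]  # a flat signal with one outlier

median_smoothed = sliding_window(signal, 3, median)
mean_smoothed = sliding_window(signal, 3, mean)

print(median_smoothed)  # [1, 1, 1, 1, 1, 1, 1]
print(mean_smoothed)    # the outlier is spread over three positions
```

The median filter removes the isolated spike entirely, while the moving average only spreads it across its neighbors.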

•	I've defined a function apply_post_processing to apply median filtering to the combined predictions.

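•	The apply_post_processing function takes the combined predictions and applies the median filter with a specified filter size (in this case, 3, meaning a window of three consecutive predictions, since the predictions form a one-dimensional array of class labels).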

•	After obtaining the combined predictions, we apply the post-processing technique to filter the predictions.

•	Finally, we evaluate the filtered predictions and print the classification report and confusion matrix for the combined model after post-processing.

In [None]:
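# Apply post-processing technique - Median Filtering
def apply_post_processing(predictions, filter_size):
    # Filter the whole 1-D sequence of predicted labels: each label is replaced by the
    # median of its window of `filter_size` neighbors. Filtering labels one at a time
    # would have no neighbors to draw on.
    # Note: this is only meaningful when the order of the samples carries information.
    predictions = np.asarray(predictions)
    return median_filter(predictions, size=filter_size)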

This function `train_evaluate_ensemble` is designed to train, evaluate, and ensemble multiple models. Let's break down its functionality:

1. **Inputs**:
   - `models`: A list containing names of different models to be used.
   - `input_shape`: The shape of the input data.
   - `n_classes`: The number of classes in the classification task.
   - `optimizer`: The optimizer used for training the models.
   - `fine_tune`: A boolean indicating whether fine-tuning is to be performed.

2. **Initialization**:
   - `model_list`: An empty list to store the trained models.
   - `acc_individual_models`: A dictionary to store the accuracy of individual models.

3. **Training and Evaluation of Individual Models**:
   - Iterate through each model in the `models` list.
   - Create the model using `create_model` function.
   - Train the model using training data (`train_data`) and validate on validation data (`val_data`).
   - Evaluate the trained model on the test data (`test_data`).
   - Store the accuracy of each individual model in `acc_individual_models` dictionary.
   - Print accuracy, classification report, and confusion matrix for each individual model.

4. **Ensemble Predictions**:
   - Generate predictions for each individual model on the test data.
   - Perform ensemble using majority voting and soft voting techniques.
   - Calculate the accuracy of the ensemble models.

5. **Post-processing and Evaluation of Combined Predictions**:
   - Combine predictions from individual models.
   - Apply post-processing technique (median filtering) using `apply_post_processing` function.
   - Evaluate the accuracy of the combined predictions after post-processing.
   - Print accuracy, classification report, and confusion matrix for the combined model.

6. **Save Combined Filtered Model**:
   - Save a single Keras model to a specified path as a stand-in for the ensemble. The voting and filtering steps are not part of the saved model.

7. **Accuracy Comparison Visualization**:
   - Plot a bar graph showing the accuracy comparison between individual models, ensemble models, and the combined filtered model.

8. **ROC Analysis for Ensemble Model**:
   - Perform ROC analysis for the ensemble model with combined filtered predictions.
   - Plot ROC curves for each class.

9. **Outputs**:
   - Return the accuracy of the combined filtered model, the path where the model is saved, and the list of trained models.

This function provides a comprehensive analysis of the performance of individual models, ensemble models, and the combined filtered model, along with visualizations to aid in understanding the results.
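
ROC curves are more informative when computed from continuous scores than from hard class labels, because hard labels give only one operating point per class. As a sketch of that alternative, the code below computes a one-vs-rest AUC per class from probability scores, using the fact that AUC equals the probability that a random positive is scored above a random negative (ties counted as one half). The helper `one_vs_rest_auc` and the toy data are assumptions for illustration.

```python
import numpy as np

def one_vs_rest_auc(true_classes, scores, n_classes):
    """Return {class_id: AUC} using pairwise positive-vs-negative comparisons."""
    true_classes = np.asarray(true_classes)
    scores = np.asarray(scores, dtype=float)
    aucs = {}
    for c in range(n_classes):
        pos = scores[true_classes == c, c]  # scores for samples of class c
        neg = scores[true_classes != c, c]  # scores for all other samples
        if len(pos) == 0 or len(neg) == 0:
            continue  # AUC is undefined without both positives and negatives
        diff = pos[:, None] - neg[None, :]
        aucs[c] = float((diff > 0).mean() + 0.5 * (diff == 0).mean())
    return aucs

y_true = [0, 0, 1, 1, 2, 2]
y_prob = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.4, 0.3],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
]
aucs = one_vs_rest_auc(y_true, y_prob, n_classes=3)
print(aucs)  # {0: 1.0, 1: 0.875, 2: 1.0}
```

With the notebook's variables, the averaged probabilities `np.mean(test_predictions, axis=0)` would be a natural source of such scores.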

In [None]:
# Train, evaluate and ensemble models
def train_evaluate_ensemble(models, input_shape, n_classes, optimizer, fine_tune):
    model_list = []
    acc_individual_models = {}
    
    for model_index, model_name in enumerate(models):
        model = create_model(model_name, input_shape, n_classes, optimizer, fine_tune)
        # Use the checkpoint created for this model so each one saves to its own file
        history = model.fit(train_data, epochs=100, steps_per_epoch=train_data.n//train_data.batch_size,
                            class_weight=class_weights, validation_data=val_data, validation_steps=val_data.n//val_data.batch_size,
                            callbacks=[checkpoints[model_index], early_stop, PlotLossesKeras()], verbose=0)
        model_list.append(model)
        
        # Evaluate individual models
        # Predict on the full test set; the generator defines its own batch size and length
        model_preds = model.predict(test_data)
        model_pred_classes = np.argmax(model_preds , axis=1)
        true_classes = test_data.classes
        acc_individual_models[model_name] = accuracy_score(true_classes, model_pred_classes)
        
        print(f"Accuracy of {model_name}: {acc_individual_models[model_name] * 100:.2f}%")
        print('Classification Report:')
        print(classification_report(true_classes, model_pred_classes))
        print('Confusion Matrix:')
        print(confusion_matrix(true_classes, model_pred_classes))
    
    # Predictions
    test_predictions = []
    for model in model_list:
        test_predictions.append(model.predict(test_data))
    
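    # Ensemble using majority (hard) voting: each model votes for its top class,
    # and the most frequent vote wins (ties go to the lowest class index)
    model_votes = np.argmax(test_predictions, axis=2)  # shape: (n_models, n_samples)
    majority_voting_predictions = np.array([np.bincount(votes, minlength=n_classes).argmax() for votes in model_votes.T])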
    
    # Ensemble using soft voting
    soft_voting_predictions = np.argmax(np.mean(test_predictions, axis=0), axis=1)
    
    # Evaluate ensemble models
    acc_majority_voting = accuracy_score(true_classes, majority_voting_predictions)
    acc_soft_voting = accuracy_score(true_classes, soft_voting_predictions)
    
    
    print(f"Accuracy of Ensemble Model (Majority Voting): {acc_majority_voting * 100:.2f}%")
    print(f"Accuracy of Ensemble Model (Soft Voting): {acc_soft_voting * 100:.2f}%")
    
    print('Classification Report for Ensemble Model (Majority Voting):')
    print(classification_report(true_classes, majority_voting_predictions))
    print('Confusion Matrix for Ensemble Model (Majority Voting):')
    print(confusion_matrix(true_classes, majority_voting_predictions))
    
    print('Classification Report for Ensemble Model (Soft Voting):')
    print(classification_report(true_classes, soft_voting_predictions))
    print('Confusion Matrix for Ensemble Model (Soft Voting):')
    print(confusion_matrix(true_classes, soft_voting_predictions))

    # Combine predictions
    combined_predictions = combined_voting(test_predictions, majority_voting_predictions, soft_voting_predictions, threshold)
    
    # Apply post-processing technique
    filtered_predictions = apply_post_processing(combined_predictions, filter_size=3)
    
    # Evaluate combined predictions after post-processing
    acc_combined_filtered = accuracy_score(true_classes, filtered_predictions)

    print(f"Accuracy of Combined Model after Post-Processing: {acc_combined_filtered * 100:.2f}%")
    print('Classification Report for Combined Model after Post-Processing:')
    print(classification_report(true_classes, filtered_predictions))
    print('Confusion Matrix for Combined Model after Post-Processing:')
    print(confusion_matrix(true_classes, filtered_predictions))

    # Save a stand-in for the combined model. The voting and filtering steps are not part of
    # any single Keras model, so the trained first model in the list is saved here.
    combined_filtered_model_path = "../working/combined_filtered_model.h5"
    combined_filtered_model = model_list[0]
    combined_filtered_model.save(combined_filtered_model_path)
    print(f"Combined Filtered Model saved at: {combined_filtered_model_path}")

    # Plotting accuracy comparison graph (use a separate label list so `models` is not modified)
    labels = list(models) + ['Ensemble (Majority Voting)', 'Ensemble (Soft Voting)', 'Ensemble (combined_filtered)']
    accuracies = [acc_individual_models[model_name] for model_name in models]
    accuracies.extend([acc_majority_voting, acc_soft_voting, acc_combined_filtered])
    
    plt.figure(figsize=(10, 6))
    bar_colors = ['blue', 'orange', 'green', 'red', 'purple', 'brown', 'cyan', 'magenta', 'gray']
    plt.bar(labels, accuracies, color=bar_colors[:len(labels)])
    plt.title('Accuracy Comparison')
    plt.xlabel('Models')
    plt.ylabel('Accuracy')
    plt.xticks(rotation=45)
    plt.ylim(0, 1)
    plt.show()
    

    # Predictions for ROC Analysis (hard labels give a single operating point per class)
    class_ids = list(range(n_classes))
    filtered_predictions_binary = label_binarize(filtered_predictions, classes=class_ids)
    true_classes_binary = label_binarize(true_classes, classes=class_ids)
    
    # ROC Analysis for Ensemble Model with Combined Filtered Predictions
    fpr_combined_filtered = {}
    tpr_combined_filtered = {}
    roc_auc_combined_filtered = {}
    for i in range(n_classes):
        fpr_combined_filtered[i], tpr_combined_filtered[i], _ = roc_curve(true_classes_binary[:, i], filtered_predictions_binary[:, i])
        roc_auc_combined_filtered[i] = auc(fpr_combined_filtered[i], tpr_combined_filtered[i])
    
    plt.figure(figsize=(10, 8))
    colors = ['blue', 'red', 'green', 'orange', 'purple', 'brown']
    for i, color in zip(range(n_classes), colors):
        plt.plot(fpr_combined_filtered[i], tpr_combined_filtered[i], color=color, lw=2, label=f'ROC curve (class {i}) (area = {roc_auc_combined_filtered[i]:0.2f})')
    
    plt.plot([0, 1], [0, 1], 'k--', lw=2)
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('ROC Analysis for Ensemble Model with Combined Filtered Predictions')
    plt.legend(loc="lower right")
    plt.show()


    return acc_combined_filtered, combined_filtered_model_path, model_list


Global variables

In [None]:
input_shape = (224, 224, 3)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.00001)
n_classes = 6
ft = 0
# Set threshold for confidence score
threshold = 0.7

In [None]:
models = ['DenseNet121', 'DenseNet169', 'InceptionResNetV2', 'ResNet152V2', 'DenseNet201']

This code calculates class weights based on the distribution of classes in the training data. Let me explain each part:

1. `from collections import Counter`: This imports the `Counter` class from the `collections` module. The `Counter` class is used for counting hashable objects.

2. `counter = Counter(train_data.classes)`: This line creates a `Counter` object named `counter` from `train_data.classes`, the array of integer class indices (one per image) that `flow_from_directory` found in the training directory.

3. `max_val = float(max(counter.values()))`: This line calculates the maximum count of any class in the training data by taking the maximum value from the counts stored in the `counter` object. It's converted to a float to ensure that division later on yields a float result.

4. `class_weights = {class_id : max_val/num_images for class_id, num_images in counter.items()}`: This line creates a dictionary named `class_weights`. It iterates over the items (class ID and count) in the `counter` object. For each class ID, it assigns a weight calculated as the maximum count (`max_val`) divided by the number of images in that class (`num_images`). This effectively gives more weight to underrepresented classes and less weight to overrepresented classes during training.

5. `print(class_weights)`: Finally, this line prints out the calculated class weights.

Overall, this code snippet is useful for addressing class imbalance issues in a classification task by assigning appropriate weights to different classes during training.
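
A worked example with made-up counts may help. The toy label list below stands in for `train_data.classes`; the weighting rule is the same one described above.

```python
from collections import Counter

# Hypothetical label array: class 0 has 4 images, class 1 has 2, class 2 has 1.
toy_classes = [0, 0, 0, 0, 1, 1, 2]

counter = Counter(toy_classes)
max_val = float(max(counter.values()))
class_weights = {class_id: max_val / num_images for class_id, num_images in counter.items()}

print(class_weights)  # {0: 1.0, 1: 2.0, 2: 4.0}
```

The largest class gets weight 1.0, and a class with a quarter as many images gets weight 4.0, so each class contributes roughly equally to the loss.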

In [None]:
from collections import Counter
counter = Counter(train_data.classes)                       
max_val = float(max(counter.values()))   
class_weights = {class_id : max_val/num_images for class_id, num_images in counter.items()}
print(class_weights)

This code snippet is creating `ModelCheckpoint` instances for each model in the `models` list. Here's how it works:

1. `checkpoints = []`: Initializes an empty list to store the `ModelCheckpoint` instances.

2. Loop through each model name in the `models` list:

   ```python
   for model_name in models:
   ```

3. Define the filepath for saving the model:

   ```python
   filepath = f"/kaggle/working/{model_name}.h5"
   ```

   Here, each model is saved in HDF5 format in a file named after the model with a ".h5" extension, inside the "/kaggle/working" directory.

4. Create a `ModelCheckpoint` instance:

   ```python
   checkpoint = ModelCheckpoint(
       filepath=filepath, 
       monitor='val_accuracy', 
       verbose=1,
       save_best_only=True, 
       save_weights_only=False, 
       mode='auto'
   )
   ```

   - `filepath`: Specifies the file path to save the model.
   - `monitor`: Quantity to monitor (in this case, validation accuracy).
   - `verbose`: Verbosity mode, where 1 prints a message whenever the callback saves the model.
   - `save_best_only`: Indicates whether to save only the best model.
   - `save_weights_only`: Specifies whether to save the entire model (`False`) or just the model weights (`True`).
   - `mode`: Defines the direction of improvement to monitor (`auto` in this case, which automatically decides the direction based on the monitored quantity).

5. Append the created `ModelCheckpoint` instance to the `checkpoints` list:

   ```python
   checkpoints.append(checkpoint)
   ```

After this loop, the `checkpoints` list contains one `ModelCheckpoint` instance per model, each saving the best full model (by validation accuracy) to its own file. Pass the checkpoint that matches a model (for example, `checkpoints[i]` for `models[i]`) as a callback when training it, so that the best version of each model is saved automatically.

In [None]:
checkpoints = []

# Loop through the models to create ModelCheckpoint instances with different filepaths
for model_name in models:
    filepath = f"/kaggle/working/{model_name}.h5"
    checkpoint = ModelCheckpoint(
        filepath=filepath, 
        monitor='val_accuracy', 
        verbose=1,
        save_best_only=True, 
        save_weights_only=False, 
        mode='auto'
    )
    checkpoints.append(checkpoint)

The following code initializes an `EarlyStopping` callback. Here's a breakdown of its parameters:

- `monitor='val_accuracy'`: This parameter specifies the quantity to be monitored during training. In this case, it's validation accuracy.

- `min_delta=0`: This parameter defines the minimum change in the monitored quantity to qualify as an improvement. If the change is less than this value, it won't be considered an improvement.

- `patience=20`: This parameter indicates the number of epochs with no improvement after which training will be stopped. In this case, if there's no improvement in validation accuracy for 20 consecutive epochs, training will be stopped.

- `verbose=1`: This parameter controls the verbosity of the output. A value of 1 means that progress messages will be printed.

- `mode='auto'`: This parameter determines the direction of improvement to monitor. 'auto' mode automatically infers the direction based on the monitored quantity. Since the monitored quantity is validation accuracy, 'auto' will infer that an increase in validation accuracy is considered an improvement.

- `restore_best_weights=True`: This parameter specifies whether to restore the model weights from the epoch with the best value of the monitored quantity. If set to `True`, the model weights will be restored to the best state found during training when training stops.

Overall, this `EarlyStopping` callback will monitor the validation accuracy during training and stop training if there's no improvement for 20 consecutive epochs, while restoring the model weights to the best state found during training.
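
The patience rule can be traced by hand on a short accuracy history. The function below is a simplified simulation written for this walkthrough, not the Keras callback itself; `simulate_early_stopping` and the accuracy values are hypothetical.

```python
def simulate_early_stopping(val_accuracies, patience, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation accuracies.

    Epochs are 0-indexed. Training stops once `patience` consecutive epochs
    fail to beat the best value by more than `min_delta`.
    """
    best_value = float("-inf")
    best_epoch = None
    wait = 0
    for epoch, value in enumerate(val_accuracies):
        if value - min_delta > best_value:
            # Improvement: remember this epoch and reset the counter.
            best_value, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # The history ended before patience ran out.
    return len(val_accuracies) - 1, best_epoch

history = [0.60, 0.72, 0.75, 0.74, 0.75, 0.73, 0.71]
stop_epoch, best_epoch = simulate_early_stopping(history, patience=3)
print(stop_epoch, best_epoch)  # 5 2
```

Matching the earlier best of 0.75 at epoch 4 does not count as an improvement, so training stops at epoch 5 and the weights from epoch 2 would be the ones restored.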

In [None]:
# Callbacks
early_stop = EarlyStopping(monitor='val_accuracy', min_delta=0, patience=20, verbose=1, mode='auto', restore_best_weights=True)

Train, Evaluate and Ensemble models

In [None]:
train_evaluate_ensemble(models, input_shape, n_classes, optimizer, ft)