# **Deep Learning | <span style="color: pink;">Project Notebook</span>**

___

#### NOVA IMS / BSc in Data Science / Deep Learning 2024/2025
### <b><span style="color: pink">Image Classification - Breast Cancer Diagnosis</span></b>

#### <b>Group 8</b>:
- Carolina Almeida (20221855)
- Duarte Carvalho (20221900)
- Francisco Gomes (20221810)
- Maria Henriques (20221952)
- Marta Monteiro (20221954)

____

#### <font color='pink'>Table of Contents </font> <a class="anchor" id='toc'></a>
- [1. Initial Data Understanding](#P1)
- [2. Data Visualization](#P2)

- [3. Binary Classification Model](#P3)
    - [3.1 Specific Data Preparation](#P31)
    - [3.2 Modelling with the Original Images](#P32)
    - [3.3 Modelling with the Preprocessed Images](#P33)
    - [3.4 Final Model Evaluation](#P34)
- [4. Multi-Class Classification Model](#P4)
    - [4.1 Specific Data Preparation](#P41)
    - [4.2 Modelling with the Original Images](#P42)
    - [4.3 Modelling with the Preprocessed Images](#P43)
    - [4.4 Final Model Evaluation](#P44)

___

In this section, we import all the required libraries.

In [None]:
# --------------------------------------------------
# Standard Imports
# --------------------------------------------------
import os
import random
import shutil
import datetime
from copy import deepcopy
from collections import defaultdict

# --------------------------------------------------
# Data Manipulation
# --------------------------------------------------
import numpy as np
import pandas as pd

# --------------------------------------------------
# Image Processing
# --------------------------------------------------
from PIL import Image, ImageEnhance, ImageOps
import cv2
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# --------------------------------------------------
# Machine Learning & Deep Learning
# --------------------------------------------------
from sklearn.utils import shuffle
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    classification_report, confusion_matrix, precision_score, recall_score,
    f1_score, roc_auc_score, balanced_accuracy_score, precision_recall_curve, auc,
)
from sklearn.utils.class_weight import compute_class_weight

# Resampling
from imblearn.over_sampling import RandomOverSampler

# TensorFlow & Keras
import tensorflow as tf
# All Keras imports go through tensorflow.keras to avoid mixing two Keras packages
from tensorflow.keras import backend as K, layers, models
from tensorflow.keras.models import Model, Sequential, load_model
from tensorflow.keras.layers import (
    Input, Concatenate, Dropout, Conv2D, MaxPooling2D, Flatten, Dense,
    GlobalAveragePooling2D, BatchNormalization, LeakyReLU,
)
from tensorflow.keras.regularizers import l2
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.applications import VGG16, ResNet50

# --------------------------------------------------
# Keras Tuner for Hyperparameter Optimization
# --------------------------------------------------
import keras_tuner as kt

# --------------------------------------------------
# Visualization
# --------------------------------------------------
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
import plotly.subplots as sp
from plotly.subplots import make_subplots

# Evaluation Metrics for Visualization (auc and precision_recall_curve are already imported above)
from sklearn.metrics import roc_curve
from sklearn.calibration import calibration_curve

# -----------------------------------------------------------
import warnings
warnings.filterwarnings("ignore")

# -----------------------------------------------------------
from Functions_Group_8 import *

# -----------------------------------------------------------
%load_ext autoreload
%autoreload 2

___

<font color='pink' size=5>**1. Initial Data Understanding**</font> <a class="anchor" id="P1"></a>
  
[Back to TOC](#toc)

The cell below contains the respective paths for the group members. 
<br>
Please add any new paths as needed to run the code.

In [None]:
path = '/Users/franciscogomes/Desktop/Faculdade/3rd year/1st Semester/Deep Learning/Project'
# path = '/Users/carol/Desktop/NOVA IMS/Third Year - First Semester/Deep Learning/Group Project/Group 8'
# path = '/Users/marga/OneDrive/Documentos/universidade/3º ano/1º semestre/Deep Learning/Deep Leaning - Project'
# path = '/Users/dacar/OneDrive - NOVAIMS/Ambiente de Trabalho/Deep Learning Project'
# path = r'C:\Users\marta\Desktop\eu\faculdade\3\First Semester\Deep Learning\Projeto\DeepLearning24_25'

The CSV file is imported and assigned to the variable **metadata**.

In [None]:
metadata = pd.read_csv(path + '/BreaKHis_v1/histology_slides/breast/image_data.csv')
metadata.head()

The command below provides detailed information about our data, including the memory usage of each column. 
<br>
It helps to understand the memory consumption of the data, particularly for large datasets.

In [None]:
metadata.info(memory_usage = 'deep')

We can observe that the dataset contains **7909 entries** and **4 columns**. 
<br>
All columns are correctly assigned the **object data type**. 
<br>
Regarding missing values, we can see that, except for the 'path_to_image' column, the other columns have 3 or 4 missing values.

**Handling Missing Values**

The following code displays all rows in the data that contain any missing values. It helps to quickly identify which rows have incomplete data.

In [None]:
display(metadata[metadata.isnull().any(axis = 1)])

There are **4 rows with missing data**. Of these, 3 rows have missing values in 3 columns, while 1 row is missing data in 2 columns.

We decided to print the full 'path_to_image' for these rows, as reviewing the paths will allow us to manually input the missing values.

In [None]:
for index in [2871, 3093, 3228, 4536]:
    print(metadata.loc[index, 'path_to_image'])

Row | Benign or Malignant | Cancer Type | Magnification |
---------- |----------|----------|----------|
2871 | Malignant | Mucinous Carcinoma | 100X |
3093 | Malignant | Mucinous Carcinoma | 100X |
3228 | Malignant | Mucinous Carcinoma | 400X |
4536 | Malignant | Ductal Carcinoma | 40X |

Using the table above, we can fill in the missing values.

In [None]:
metadata.loc[2871, 'Benign or Malignant'] = 'Malignant'
metadata.loc[2871, 'Cancer Type'] = 'Mucinous Carcinoma'
metadata.loc[2871, 'Magnification'] = '100X'

metadata.loc[3093, 'Cancer Type'] = 'Mucinous Carcinoma'
metadata.loc[3093, 'Magnification'] = '100X'

metadata.loc[3228, 'Benign or Malignant'] = 'Malignant'
metadata.loc[3228, 'Cancer Type'] = 'Mucinous Carcinoma'
metadata.loc[3228, 'Magnification'] = '400X'

metadata.loc[4536, 'Benign or Malignant'] = 'Malignant'
metadata.loc[4536, 'Cancer Type'] = 'Ductal Carcinoma'
metadata.loc[4536, 'Magnification'] = '40X'

The following code is used to demonstrate that after imputing the missing values, we no longer have any missing data.

In [None]:
display(metadata[metadata.isnull().any(axis = 1)])

**Handling Duplicates**

To identify duplicate images in the dataset, we utilized **perceptual hashing**. The process involved two key functions:

- **compute_image_hash(image_path)**: This function computes a perceptual hash for each image. It converts the image to grayscale, resizes it to a smaller resolution (8x8 pixels), and calculates a hash based on the pixel values. This hash serves as a compact fingerprint of the image: images with the same hash are treated as duplicates, although visually different images can occasionally share a hash.

- **find_duplicate_images(metadata)**: This function scans through the dataset's 'path_to_image' column, computes the hash for each image, and groups images with the same hash into a dictionary. It then identifies and returns a dictionary of duplicate images, where the keys are hash values and the values are lists of image paths corresponding to duplicate images.

By using these functions, we were able to detect and list any repeated images in the dataset.
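
A minimal sketch of the idea, assuming an average-hash variant. The helpers `average_hash` and `group_duplicates` are hypothetical stand-ins, not the implementations in `Functions_Group_8`: they take an already-loaded grayscale array instead of an image path, and the downscaling is done by block averaging in NumPy rather than by an image library.

```python
from collections import defaultdict

import numpy as np


def average_hash(gray, hash_size=8):
    """Bit-string fingerprint of a 2-D grayscale array (hypothetical helper)."""
    gray = np.asarray(gray, dtype=np.float64)
    height, width = gray.shape
    # Crop so both sides are divisible by hash_size, then average each block
    # to obtain a hash_size x hash_size thumbnail.
    height -= height % hash_size
    width -= width % hash_size
    blocks = gray[:height, :width].reshape(
        hash_size, height // hash_size, hash_size, width // hash_size
    )
    thumbnail = blocks.mean(axis=(1, 3))
    # One bit per thumbnail pixel: brighter than the thumbnail mean or not.
    bits = (thumbnail > thumbnail.mean()).ravel()
    return "".join("1" if bit else "0" for bit in bits)


def group_duplicates(images):
    """Map hash -> names for every hash shared by two or more images."""
    groups = defaultdict(list)
    for name, gray in images.items():
        groups[average_hash(gray)].append(name)
    return {h: names for h, names in groups.items() if len(names) > 1}


# Toy data: a gradient image, an exact copy, and a vertically flipped version.
gradient = np.arange(64 * 64).reshape(64, 64)
images = {"a.png": gradient, "b.png": gradient.copy(), "c.png": gradient[::-1]}
duplicates = group_duplicates(images)
print(duplicates)
```

Only the two identical arrays end up in the same group. Because the hash is computed from a heavily downscaled image, displaying a pair from each group remains a useful sanity check.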

In [None]:
duplicate_images = find_duplicate_images(metadata)

if duplicate_images:
    print(f"Number of duplicate groups: {len(duplicate_images)}")
    print(f"Total number of duplicate images: {sum(len(paths) for paths in duplicate_images.values())}")

    random_duplicate = random.choice(list(duplicate_images.keys()))
    selected_paths = duplicate_images[random_duplicate]

    if len(selected_paths) >= 2:
        with Image.open(selected_paths[0]) as image_1, Image.open(selected_paths[1]) as image_2:
            fig, axes = plt.subplots(1, 2, figsize = (10, 5))
            axes[0].imshow(image_1)
            axes[0].set_title(f"Image 1: {selected_paths[0].split('/')[-1]}")
            axes[0].axis('off')

            axes[1].imshow(image_2)
            axes[1].set_title(f"Image 2: {selected_paths[1].split('/')[-1]}")
            axes[1].axis('off')

            plt.tight_layout()
            plt.show()
else:
    print("No duplicate images found.")

We concluded that there are **131 groups of duplicate images**, with a total of **268 images**. 
<br>
To verify the results, we decided to randomly display one pair of duplicates.

If duplicate images remain in the dataset, splitting the data during the modeling phase could place copies of the same image in both the training and test sets. 
<br>
This would cause data leakage: the model would be evaluated on images it has already seen, making the test performance look better than it really is. 
<br>
To address this, we decided to retain only one image from each group of duplicates.

In [None]:
metadata['path_to_image'] = metadata['path_to_image'].apply(keep_one_image)

metadata = metadata.dropna(subset = ['path_to_image'])
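
The effect of keeping a single image per duplicate group can be sketched on toy data. This assumes a dictionary shaped like the one returned by `find_duplicate_images` (hash mapped to a list of paths). The function `paths_to_drop` and the file names are hypothetical, and the actual `keep_one_image` helper may work differently.

```python
def paths_to_drop(duplicate_groups):
    """Return every path except the first one of each duplicate group."""
    drop = set()
    for paths in duplicate_groups.values():
        drop.update(paths[1:])  # keep paths[0], discard the rest
    return drop


# Toy duplicate groups (hypothetical file names).
groups = {"hash_1": ["a.png", "b.png"], "hash_2": ["c.png", "d.png", "e.png"]}
all_paths = ["a.png", "b.png", "c.png", "d.png", "e.png", "f.png"]

drop = paths_to_drop(groups)
kept = [path for path in all_paths if path not in drop]
print(kept)
```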

**Summary Statistics**

In [None]:
metadata.describe(include = 'object')

- **path_to_image**: This column contains 7909 unique entries, meaning each image path is unique.
- **Benign or Malignant**: This column has two unique values, with "Malignant" being the most frequent, appearing 5429 times.
- **Cancer Type**: There are 8 unique cancer types in the dataset, with "Ductal Carcinoma" being the most common, occurring 3451 times.
- **Magnification**: This column has 4 unique magnification levels, with "100X" being the most frequently recorded, appearing 2082 times.

<font color='pink' size=5>**2. Data Visualization**</font> <a class="anchor" id="P2"></a>
  
[Back to TOC](#toc)

In this section, we decided to create some visualizations to help us better understand the dataset.

In [None]:
bar_plot(data = metadata, column_name = 'Benign or Malignant', title = 'Number of observations over each diagnosis', x_label = 'Diagnosis', figsize = (8, 6))

As shown above, there are more malignant cases than benign ones. We will need to address this imbalance when performing binary classification.

In [None]:
bar_plot(data = metadata, column_name = 'Cancer Type', title = 'Number of observations over each Cancer Type', x_label = 'Cancer Type', figsize = (15, 6))

There is also an imbalance in the distribution of cancer types, which will need to be addressed during the multi-class classification phase.

In [None]:
bar_plot(data = metadata, column_name = 'Cancer Type', title = 'Number of observations over each Cancer Type', x_label = 'Cancer Type', figsize = (15, 6), hue = 'Benign or Malignant', show_legend = True)

We can observe that out of the 8 cancer types, 4 are associated with benign cases, while the remaining 4 are linked to malignant cases.

In [None]:
heatmap(metadata, 'Cancer Type', 'Magnification', 'size', 0, (12, 8), 'Heatmap of Cancer Type vs. Magnification')

In terms of magnification, the cells for "Ductal Carcinoma" stand out, but this is primarily because "Ductal Carcinoma" has the highest number of observations rather than because of any relationship with magnification. Overall, no anomalies are apparent.

<font color='pink' size=5>**3. Binary Classification Model**</font> <a class="anchor" id="P3"></a>
  
[Back to TOC](#toc)

In the next section, we will conduct several tests using different models and data splitting approaches to identify the best binary classification model. 
<br>
The goal is to develop a model that can accurately predict whether an image corresponds to a benign or malignant case.

<font color='pink' size=5>**3.1. Specific Data Preparation**</font> <a class="anchor" id="P31"></a>

In this step, we used the `LabelEncoder` to convert the <u>Benign or Malignant</u> column in the metadata into binary labels. 
<br>
This transformation assigns a numerical value to each category, with **Benign** mapped to **0** and **Malignant** mapped to **1**.

In [None]:
metadata['Binary Labels'] = LabelEncoder().fit_transform(metadata['Benign or Malignant'])
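
`LabelEncoder` sorts the class names before numbering them, which is why Benign receives 0 and Malignant receives 1. A quick check on toy labels (the list below is illustrative, not taken from the dataset):

```python
from sklearn.preprocessing import LabelEncoder

toy_labels = ["Malignant", "Benign", "Malignant", "Benign"]

encoder = LabelEncoder()
encoded = encoder.fit_transform(toy_labels)

# classes_ holds the sorted class names; the position of a name is its code.
mapping = {str(name): index for index, name in enumerate(encoder.classes_)}
print(mapping)
print(encoded.tolist())
```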

The separation below prepares the data for further processing and model training.

In [None]:
metadata_indices = metadata['path_to_image']
metadata_labels = metadata['Binary Labels']

**Hold - Out Method**

Here, **80% of the data is used for training the model**, while **20% is reserved for testing**. 
<br>
To ensure that the distribution of labels is preserved across both sets, we pass `metadata_labels` to the `stratify` parameter. 
<br>
This keeps the proportion of benign and malignant cases the same in the training and test sets. 
<br>
The random state is set to 42 to ensure reproducibility of the results.

In [None]:
indices_train, indices_test, labels_train, labels_test = train_test_split(metadata_indices, metadata_labels, test_size = 0.2, random_state = 42, stratify = metadata_labels)
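
What `stratify` does can be seen on a small synthetic example. The class counts below are made up for illustration and are not the dataset's real counts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy labels with a 30/70 class split (illustrative, not the real counts).
labels = np.array([0] * 300 + [1] * 700)
samples = np.arange(len(labels))

train_x, test_x, train_y, test_y = train_test_split(
    samples, labels, test_size=0.2, random_state=42, stratify=labels
)

# With stratify, each subset keeps the 30/70 proportion of the full data.
print(np.bincount(train_y), np.bincount(test_y))
```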

In [None]:
indices_train_array = load_images(indices_train)

indices_test_array = load_images(indices_test)

**Data Augmentation**

We set up data augmentation for the training set using `ImageDataGenerator`, applying transformations like rotation, zoom, shifting, and flipping to improve model generalization. 
<br>
The validation set is extracted from the training data using the 'validation_split' parameter. 
<br>
For the test set, only rescaling is applied to normalize pixel values. 
<br>
We convert the training and test labels into tensors for compatibility with the model.

In [None]:
train_datagen = ImageDataGenerator(
    featurewise_center = False,
    samplewise_center = False,
    featurewise_std_normalization = False,
    samplewise_std_normalization = False,
    zca_whitening = False,
    rotation_range = 10,
    zoom_range = 0.1,
    width_shift_range = 0.2,
    height_shift_range = 0.2,
    horizontal_flip = True,
    vertical_flip = True,
    validation_split = 0.2,
)
test_datagen = ImageDataGenerator(rescale = 1.0 / 255)

labels_train_tensor = tf.convert_to_tensor(labels_train, dtype = tf.float32)
labels_test_tensor = tf.convert_to_tensor(labels_test, dtype = tf.float32)

train_generator = train_datagen.flow(indices_train_array, labels_train_tensor, batch_size = 512, subset = 'training')
validation_generator = train_datagen.flow(indices_train_array, labels_train_tensor, batch_size = 512, subset = 'validation')
test_generator = test_datagen.flow(indices_test_array, labels_test_tensor, batch_size = 512)

**Data Oversampling**

To address class imbalance, we applied random oversampling to the training data using `RandomOverSampler`. 

In [None]:
indices_train_flattened = indices_train_array.reshape(indices_train_array.shape[0], -1)
# A fixed random_state keeps the resampling reproducible, like the split above
indices_train_oversampled, labels_train_oversampled = RandomOverSampler(random_state = 42).fit_resample(indices_train_flattened, labels_train)
indices_train_oversampled = indices_train_oversampled.reshape(-1, *indices_train_array.shape[1:])
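
The flatten, resample, reshape round trip can be illustrated with a hand-written NumPy stand-in for random oversampling. The function `random_oversample` is hypothetical and only mimics the idea behind `RandomOverSampler`. The toy batch is random data.

```python
import numpy as np


def random_oversample(features, labels, seed=0):
    """Duplicate random minority-class rows until all classes are equally frequent."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    indices = [np.arange(len(labels))]  # start with every original row
    for cls, count in zip(classes, counts):
        if count < target:
            pool = np.flatnonzero(labels == cls)
            indices.append(rng.choice(pool, size=target - count, replace=True))
    indices = np.concatenate(indices)
    return features[indices], labels[indices]


# Toy batch: 10 images of shape (50, 50, 3), 3 benign (0) and 7 malignant (1).
images = np.random.default_rng(1).random((10, 50, 50, 3))
labels = np.array([0] * 3 + [1] * 7)

flat = images.reshape(images.shape[0], -1)  # one row of 7500 values per image
flat_resampled, labels_resampled = random_oversample(flat, labels)
images_resampled = flat_resampled.reshape(-1, *images.shape[1:])

print(images_resampled.shape, np.bincount(labels_resampled))
```

Flattening is needed because the resampler works on 2-D feature matrices. The final reshape restores the image dimensions so the data can be fed to a convolutional model.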

**Callbacks Definition**

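To optimize the training process, we set up two callbacks: `EarlyStopping`, which stops training once the validation loss has not improved for 5 epochs and restores the best weights, and `ReduceLROnPlateau`, which halves the learning rate after 3 epochs without improvement in the validation loss.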

In [None]:
early_stopping = EarlyStopping(monitor = 'val_loss', patience = 5, restore_best_weights = True)
reduce_on_plateau = ReduceLROnPlateau(monitor = 'val_loss', factor = 0.5, patience = 3)
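
How the two patience counters interact can be sketched with a simplified pure-Python simulation. It ignores details such as `min_delta` and `cooldown`, so it approximates the Keras behaviour rather than reproducing it. The function `simulate_callbacks`, the initial learning rate, and the loss values are all hypothetical.

```python
def simulate_callbacks(val_losses, stop_patience=5, lr_patience=3,
                       factor=0.5, initial_lr=1e-3):
    """Return (stop_epoch, best_epoch, final_lr) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = None
    stop_wait = 0  # epochs since the last improvement (early stopping)
    lr_wait = 0    # epochs since the last improvement or learning-rate cut
    lr = initial_lr
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            stop_wait = lr_wait = 0
            continue
        stop_wait += 1
        lr_wait += 1
        if lr_wait >= lr_patience:
            lr *= factor  # halve the learning rate on a plateau
            lr_wait = 0
        if stop_wait >= stop_patience:
            return epoch, best_epoch, lr  # training stops here
    return len(val_losses) - 1, best_epoch, lr


# Made-up validation losses: the best value is reached at epoch 2 (0-based).
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.74, 0.73, 0.9]
stop_epoch, best_epoch, final_lr = simulate_callbacks(losses)
print(stop_epoch, best_epoch, final_lr)
```

In this toy run the learning rate is cut once, three epochs after the best epoch, and training stops two epochs later. With `restore_best_weights = True`, the weights from the best epoch would then be restored.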

<font color='pink' size=5>**3.2. Modelling with the Original Images**</font> <a class="anchor" id="P32"></a>

---
# **Model from Scratch**
---

We decided to begin hyperparameter tuning for our model using the `Hyperband Tuner`. The objective is to minimize the validation loss.

### <u>**Hyperparameter Tuning**</u>

In [None]:
Tuner_1 = kt.Hyperband(model_building_function_1, objective = 'val_loss', max_epochs = 20, factor = 3, directory = 'Hyperband Tuner', project_name = 'Model 1')

In [None]:
Tuner_1.search(indices_train_array, labels_train, epochs = 20, validation_split = 0.2, callbacks = [early_stopping, reduce_on_plateau])

The **Binary_Model** was constructed using the best parameters identified through the tuning process.

In [None]:
Best_Hyperparameters = Tuner_1.get_best_hyperparameters(num_trials = 1)[0]
Binary_Model = Tuner_1.hypermodel.build(Best_Hyperparameters)

### **Hold - Out Method**

Below, we trained the **Binary_Model** using the training data obtained from the hold-out method.

In [None]:
history_1 = Binary_Model.fit(indices_train_array, labels_train, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

Please note that we are evaluating each epoch based on Recall and Precision. 
<br>
Thus, we created a custom F1 score function that computes the F1 score per epoch. These are the values that will be plotted alongside the loss.
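
The per-epoch F1 score is the harmonic mean of precision and recall, F1 = 2PR / (P + R). A minimal sketch, assuming the Keras history stores one precision and one recall value per epoch. The helper `f1_per_epoch` is hypothetical and is not the `obtain_f1_score` function from `Functions_Group_8`.

```python
def f1_per_epoch(precisions, recalls):
    """Harmonic mean of precision and recall for each epoch (0.0 when both are 0)."""
    scores = []
    for precision, recall in zip(precisions, recalls):
        denominator = precision + recall
        scores.append(2 * precision * recall / denominator if denominator else 0.0)
    return scores


# Toy history in the shape Keras produces; the key names are an assumption
# and depend on how the metrics were named at compile time.
history = {"precision": [0.0, 0.5, 0.8], "recall": [0.0, 1.0, 0.8]}
f1_scores = f1_per_epoch(history["precision"], history["recall"])
print(f1_scores)
```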

In [None]:
histories_1 = history_1.history  
f1_scores_1 = obtain_f1_score(histories_1, 0)

Training_and_Validation_Metrics_Plot(history_1, f1_scores_1)

We saved the model so that its architecture, weights, and training configuration are preserved, allowing for easy restoration and future use.

In [None]:
Binary_Model.save('Model 1.h5')

### **Data Augmentation**

Below, we trained the **Binary_Model** using the training data obtained from data augmentation.

In [None]:
history_2 = Binary_Model.fit(train_generator, validation_data = validation_generator, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_2 = history_2.history  
f1_scores_2 = obtain_f1_score(histories_2, 0)

Training_and_Validation_Metrics_Plot(history_2, f1_scores_2)

### **Data Oversampling** 

Below, we trained the **Binary_Model** using the training data obtained from the data oversampling.

In [None]:
history_3 = Binary_Model.fit(indices_train_oversampled, labels_train_oversampled, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_3 = history_3.history  
f1_scores_3 = obtain_f1_score(histories_3, 0)

Training_and_Validation_Metrics_Plot(history_3, f1_scores_3)

---
# **VGG16 Model**
---

We used a custom function to build a transfer learning model by leveraging the pre-trained **VGG16 model** as the base. 
<br>
In this process, we froze the weights of all layers in the VGG16 model to prevent them from being updated during training, allowing us to retain the knowledge learned from large-scale image datasets like ImageNet. 
<br>
We then added custom dense layers with ReLU activation functions, dropout for regularization, and a sigmoid activation in the final layer to classify the images as either benign or malignant. 
<br>
The model was compiled with the Adam optimizer and binary cross-entropy loss, with recall and precision tracked as evaluation metrics.

The **VGG_16_Model** was constructed using this process.

In [None]:
VGG_16_Model = transfer_learning(VGG16(weights = 'imagenet', include_top = False, input_shape = (50, 50, 3)))
VGG_16_Model.summary()

### **Hold - Out Method**

Below, we trained the **VGG_16_Model** using the training data obtained from the hold-out method.

In [None]:
history_4 = VGG_16_Model.fit(indices_train_array, labels_train, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_4 = history_4.history  
f1_scores_4 = obtain_f1_score(histories_4, 0)

Training_and_Validation_Metrics_Plot(history_4, f1_scores_4)

### **Data Augmentation** 

Below, we trained the **VGG_16_Model** using the training data obtained from data augmentation.

In [None]:
history_5 = VGG_16_Model.fit(train_generator, validation_data = validation_generator, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_5 = history_5.history  
f1_scores_5 = obtain_f1_score(histories_5, 0)

Training_and_Validation_Metrics_Plot(history_5, f1_scores_5)

### **Oversampling** 

Below, we trained the **VGG_16_Model** using the training data obtained from data oversampling.

In [None]:
history_6 = VGG_16_Model.fit(indices_train_oversampled, labels_train_oversampled, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_6 = history_6.history  
f1_scores_6 = obtain_f1_score(histories_6, 0)

Training_and_Validation_Metrics_Plot(history_6, f1_scores_6)

---
# **ResNet 50 Model**
---

We used a custom function to build a transfer learning model by leveraging the pre-trained **ResNet 50 model** as the base. 
<br>
In this process, we froze the weights of all layers in the ResNet 50 model to prevent them from being updated during training, allowing us to retain the knowledge learned from large-scale image datasets like ImageNet. 
<br>
We then added custom dense layers with ReLU activation functions, dropout for regularization, and a sigmoid activation in the final layer to classify the images as either benign or malignant. 
<br>
The model was compiled with the Adam optimizer and binary cross-entropy loss, with recall and precision tracked as evaluation metrics.

The **ResNet_50_Model** was constructed using this process.

In [None]:
ResNet_50_Model = transfer_learning(ResNet50(weights = 'imagenet', include_top = False, input_shape = (50, 50, 3)))
ResNet_50_Model.summary()

### **Hold - Out Method**

Below, we trained the **ResNet_50_Model** using the training data obtained from the hold-out method.

In [None]:
history_7 = ResNet_50_Model.fit(indices_train_array, labels_train, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_7 = history_7.history  
f1_scores_7 = obtain_f1_score(histories_7, 0)

Training_and_Validation_Metrics_Plot(history_7, f1_scores_7)

### **Data Augmentation** 

Below, we trained the **ResNet_50_Model** using the training data obtained from data augmentation.

In [None]:
history_8 = ResNet_50_Model.fit(train_generator, validation_data = validation_generator, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_8 = history_8.history  
f1_scores_8 = obtain_f1_score(histories_8, 0)

Training_and_Validation_Metrics_Plot(history_8, f1_scores_8)

### **Oversampling** 

Below, we trained the **ResNet_50_Model** using the training data obtained from data oversampling.

In [None]:
history_9 = ResNet_50_Model.fit(indices_train_oversampled, labels_train_oversampled, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_9 = history_9.history  
f1_scores_9 = obtain_f1_score(histories_9, 0)

Training_and_Validation_Metrics_Plot(history_9, f1_scores_9)

<font color='pink' size=5>**3.3. Modelling with the Preprocessed Images**</font> <a class="anchor" id="P33"></a>

For this section, we applied several preprocessing techniques and trained the model returned by the tuner. 
<br>
However, for the grayscale images, we created a new model with the same structure, adapted to accept input images of shape (50, 50, 1).
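
The change of input shape from (50, 50, 3) to (50, 50, 1) can be illustrated with a small NumPy sketch. The function `to_grayscale` is hypothetical and uses the common luma weights. The project's `apply_preprocessing` helper may convert the images differently.

```python
import numpy as np


def to_grayscale(images):
    """Convert (N, H, W, 3) RGB arrays to (N, H, W, 1) using luma weights."""
    weights = np.array([0.299, 0.587, 0.114])  # ITU-R BT.601 luma coefficients
    gray = images @ weights                    # weighted sum over the channel axis
    return gray[..., np.newaxis]               # restore a single channel axis


batch = np.ones((4, 50, 50, 3))                # four white RGB images
gray_batch = to_grayscale(batch)
print(gray_batch.shape)
```

Keeping the trailing channel axis matters because `Conv2D` layers expect a 4-D input of shape (batch, height, width, channels).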

---
# **Model from Scratch**
---

### **Grayscale** 

Below, we present a random image alongside its grayscale version for visualization.

In [None]:
plot_image_with_preprocessing(metadata)

In [None]:
model_grayscale = models.Sequential()

model_grayscale.add(layers.Conv2D(32, (3, 3), activation = 'relu', input_shape = (50, 50, 1)))
model_grayscale.add(layers.MaxPooling2D((2, 2)))

model_grayscale.add(layers.Conv2D(64, (3, 3), activation = 'relu'))
model_grayscale.add(layers.Conv2D(128, (3, 3), activation = 'relu'))

model_grayscale.add(layers.Flatten())

model_grayscale.add(layers.Dense(128, activation = 'relu'))
model_grayscale.add(Dropout(0.6))

model_grayscale.add(layers.Dense(64, activation = 'relu'))
model_grayscale.add(layers.Dense(1, activation = 'sigmoid'))

model_grayscale.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['recall', 'precision'])

Below we trained our model using the training data obtained from the hold-out method, but converted to grayscale.

In [None]:
indices_train_grayscale = apply_preprocessing(indices_train_array, 'grayscale')

history_grayscale = model_grayscale.fit(indices_train_grayscale, labels_train, validation_split = 0.2, batch_size = 650, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_grayscale = history_grayscale.history  
f1_scores_grayscale = obtain_f1_score(histories_grayscale, 0)

Training_and_Validation_Metrics_Plot(history_grayscale, f1_scores_grayscale)

### **Contrast** 

Below, we present a random image alongside its contrast-enhanced version for visualization.

In [None]:
plot_image_with_preprocessing(metadata, 'contrast')

Below we trained our model using the training data obtained from the hold-out method, but with enhanced contrast applied to the images.

In [None]:
indices_train_contrast = apply_preprocessing(indices_train_array, 'contrast')

history_contrast = Binary_Model.fit(indices_train_contrast, labels_train, validation_split = 0.2, batch_size = 650, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_contrast = history_contrast.history  
f1_scores_contrast = obtain_f1_score(histories_contrast, 0)

Training_and_Validation_Metrics_Plot(history_contrast, f1_scores_contrast)

### **Brightness Contrast** 

Below, we present a random image alongside its brightness and contrast-adjusted version for visualization.

In [None]:
plot_image_with_preprocessing(metadata, 'brightness_contrast')

Below we trained our model using the training data obtained from the hold-out method, but with adjusted brightness and contrast applied to the images.

In [None]:
indices_train_brightness_contrast = apply_preprocessing(indices_train_array, 'brightness_contrast')

history_brightness_contrast = Binary_Model.fit(indices_train_brightness_contrast, labels_train, validation_split = 0.2, batch_size = 650, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_brightness_contrast = history_brightness_contrast.history  
f1_scores_brightness_contrast = obtain_f1_score(histories_brightness_contrast, 0)

Training_and_Validation_Metrics_Plot(history_brightness_contrast, f1_scores_brightness_contrast)

<font color='pink' size=5>**3.4. Final Model Evaluation**</font> <a class="anchor" id="P34"></a>

We concluded that the best model was the one returned by the tuner using the original images and the hold-out method. In the cell below, we load this model for further evaluation.

In [None]:
binary_classification_model = load_model('Model 1.h5')

In [None]:
binary_classification_model.summary()

We made predictions on the test data using the model and converted the probabilities to binary class labels by applying a threshold of 0.5.

In [None]:
test_predictions = binary_classification_model.predict(indices_test_array)
test_predictions = (test_predictions > 0.5).astype(int)

Below, we can visualize the confusion matrix and the classification report.

In [None]:
evaluate_model(labels_test, test_predictions)

**Classification Report**

The classification report highlights the performance of the model on the test set, with a focus on precision, recall, and F1-score for each class. 
<br>
For class 0 (Benign), the precision is 0.83, recall is 0.67, and the F1-score is 0.74, indicating that while the model performs reasonably well in identifying Benign cases, it misses some instances. 
<br>
For class 1 (Malignant), the model achieves better performance, with a precision of 0.86, recall of 0.94, and an F1-score of 0.90, reflecting high accuracy in identifying Malignant cases.

Given that the dataset is imbalanced, with more Malignant instances than Benign ones, the higher performance on class 1 is expected. 
<br>
The overall accuracy is 0.85, with a macro average F1-score of 0.82 and a weighted average F1-score of 0.85, demonstrating strong performance, particularly on the Malignant class.

**Confusion Matrix**

The confusion matrix provides a detailed breakdown of the model's predictions.
<br>
For class 0 (Benign), the model correctly classified 331 instances as class 0 but misclassified 165 instances as class 1.
<br>
For class 1 (Malignant), it correctly classified 1018 instances, but misclassified 68 instances as class 0.

The model performs better at predicting class 1 (Malignant), likely due to the higher number of observations for this class. This imbalance leads to more accurate predictions for class 1, while class 0 (Benign) experiences more misclassifications, resulting in a lower recall for this class.
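
The figures in the classification report can be recomputed directly from the four confusion-matrix counts quoted above. This is a plain-Python sketch, and the helper `precision_recall_f1` is introduced here only for illustration.

```python
# Counts taken from the confusion matrix above (rows: true class).
tn, fp = 331, 165    # true Benign: predicted Benign / predicted Malignant
fn, tp = 68, 1018    # true Malignant: predicted Benign / predicted Malignant


def precision_recall_f1(true_pos, false_pos, false_neg):
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


malignant = precision_recall_f1(tp, fp, fn)
# For Benign the roles are swapped: its correct predictions are the true negatives.
benign = precision_recall_f1(tn, fn, fp)
accuracy = (tp + tn) / (tp + tn + fp + fn)
macro_f1 = (benign[2] + malignant[2]) / 2

print([round(value, 2) for value in benign])
print([round(value, 2) for value in malignant])
print(round(accuracy, 2), round(macro_f1, 2))
```

Rounded to two decimals, these values match the precision, recall, F1-score, accuracy, and macro-average F1 reported above.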

For this model, we have visualized several important performance metrics: the **ROC curve**, the **precision-recall curve**, and the **calibration curve**, each providing different insights into the model's behavior.

In [None]:
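# The ROC, precision-recall, and calibration curves need predicted probabilities,
# not the thresholded 0/1 labels stored in test_predictions.
test_probabilities = binary_classification_model.predict(indices_test_array)
model_evaluation_plots(labels_test, test_probabilities)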

- **ROC Curve:** The ROC curve is closer to the top-left corner, which indicates that the model performs well in distinguishing between the Benign and Malignant classes. A curve near the top-left suggests a high true positive rate (recall) and a low false positive rate, meaning the model is effectively identifying Malignant cases with minimal misclassification of Benign cases.

- **Precision-Recall Curve:** The precision-recall curve shows high values for both precision and recall, which is a positive indicator. Since the curve is computed for the positive class (Malignant, label 1), this suggests that the model identifies Malignant cases with good precision, without generating too many false positives.

- **Calibration Curve:** The calibration curve is slightly off from the diagonal line, indicating that while the model's probability estimates are generally good, they are not perfectly calibrated. The slight deviation suggests that the model might be overestimating or underestimating the confidence of its predictions to some extent. While this does not severely impact the model's classification performance, improving the calibration could help achieve more reliable probability predictions for decision-making.

Overall, these curves demonstrate that the model is performing well, particularly in distinguishing Malignant cases, though there is room for improvement in the calibration of predicted probabilities.

<font color='pink' size=5>**4. Multi-Class Classification Model**</font> <a class="anchor" id="P4"></a>
  
[Back to TOC](#toc)

<font color='pink' size=5>**4.1. Specific Data Preparation**</font> <a class="anchor" id="P41"></a>

In this step, we used the `LabelEncoder` to convert the <u>Cancer Type</u> column in the metadata into numerical labels for multi-class classification.
<BR>
This transformation assigns a unique numerical value to each cancer type, with each category being mapped to a different integer label.

In [None]:
metadata['Multi Class Labels'] = LabelEncoder().fit_transform(metadata['Cancer Type'])

The separation below prepares the data for further processing and model training.
<BR>
Note that the indices were defined earlier and are the same as those used for the binary classification.

In [None]:
metadata_labels_ = metadata['Multi Class Labels']

**Hold - Out Method**

Here, **80% of the data is used for training the model**, while **20% is reserved for testing**. 
<br>
To ensure that the distribution of labels is preserved across both sets, we pass `metadata_labels_` to the `stratify` parameter. 
<br>
This keeps the proportion of each cancer type the same in the training and test sets. 
<br>
The random state is set to 42 to ensure reproducibility of the results.

In [None]:
indices_train_, indices_test_, labels_train_, labels_test_ = train_test_split(metadata_indices, metadata_labels_, test_size = 0.2, random_state = 42, stratify = metadata_labels_)
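
To make the effect of stratification concrete, here is a minimal hand-rolled sketch that splits each class separately and then checks the class shares. It is a simplified stand-in for what `stratify` achieves, not the implementation used by `train_test_split`, and the toy labels and variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical imbalanced labels: 80 samples of class 0 and 20 of class 1.
toy_labels = np.array([0] * 80 + [1] * 20)

# Split each class separately so that 20% of every class goes to the test set.
test_indices = []
for label in np.unique(toy_labels):
    class_indices = np.flatnonzero(toy_labels == label)
    rng.shuffle(class_indices)
    n_test = int(round(0.2 * len(class_indices)))
    test_indices.extend(class_indices[:n_test].tolist())

test_mask = np.zeros(len(toy_labels), dtype = bool)
test_mask[test_indices] = True

# Share of class 1 in each set; both match the 20% share in the full data.
train_share = toy_labels[~test_mask].mean()
test_share = toy_labels[test_mask].mean()
print(train_share, test_share)
```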

We converted the training and test labels into one-hot encoded format using the `to_categorical` function. 
<br>
For both the training and the test labels, we specified `num_classes = 8` so that each label is encoded as a vector of length 8, with one position per cancer type in the dataset.

In [None]:
labels_train_one_hot_encoded = to_categorical(labels_train_, num_classes = 8)
labels_test_one_hot_encoded = to_categorical(labels_test_, num_classes = 8)
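
For intuition, the following sketch builds the same kind of 0/1 matrix with plain NumPy by indexing the rows of an 8 x 8 identity matrix. It only illustrates the encoding and is not how `to_categorical` is implemented; the toy labels are hypothetical.

```python
import numpy as np

# Hypothetical integer labels for three samples.
toy_integer_labels = np.array([0, 3, 7])

# Row i of the identity matrix is the one-hot vector for class i.
toy_one_hot = np.eye(8, dtype = np.float32)[toy_integer_labels]

print(toy_one_hot.shape)           # (3, 8)
print(toy_one_hot[1].tolist())     # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(toy_one_hot.argmax(axis = 1).tolist())  # [0, 3, 7]
```

Taking the `argmax` of each row recovers the original integer label.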

In [None]:
indices_train_array_ = load_images(indices_train_)

indices_test_array_ = load_images(indices_test_)

**Data Augmentation**

We used the `ImageDataGenerator` configured earlier and applied it to our new training, validation, and test sets.

In [None]:
labels_train_tensor_one_hot_encoded = tf.convert_to_tensor(labels_train_one_hot_encoded, dtype = tf.float32)
labels_test_tensor_one_hot_encoded = tf.convert_to_tensor(labels_test_one_hot_encoded, dtype = tf.float32)

train_generator_multi_class = train_datagen.flow(indices_train_array_, labels_train_tensor_one_hot_encoded, batch_size = 512, subset = 'training')
validation_generator_multi_class = train_datagen.flow(indices_train_array_, labels_train_tensor_one_hot_encoded, batch_size = 512, subset = 'validation')
test_generator_multi_class = test_datagen.flow(indices_test_array_, labels_test_tensor_one_hot_encoded, batch_size = 512)

**Data Oversampling**

For oversampling, we applied the same process as in the first stage, but this time to our new datasets.

In [None]:
indices_train_flattened_ = indices_train_array_.reshape(indices_train_array_.shape[0], -1)
indices_train_oversampled_, labels_train_oversampled_ = RandomOverSampler().fit_resample(indices_train_flattened_, labels_train_one_hot_encoded)
indices_train_oversampled_ = indices_train_oversampled_.reshape(-1, *indices_train_array_.shape[1:])
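
The flatten, resample, and reshape steps can be sketched on toy data as follows. The resampling here is a simple NumPy version of random oversampling (duplicating minority-class rows with replacement), written as an assumption about the general idea rather than as the code of `RandomOverSampler`. All names and data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical toy data: 6 images of shape (50, 50, 3), 5 of class 0 and 1 of class 1.
toy_images = rng.random((6, 50, 50, 3))
toy_labels = np.array([0, 0, 0, 0, 0, 1])

# Flatten each image into one row, since resamplers work on 2-D arrays.
toy_flattened = toy_images.reshape(toy_images.shape[0], -1)

# Draw extra rows with replacement until every class matches the largest one.
counts = np.bincount(toy_labels)
resampled_rows = []
for label, count in enumerate(counts):
    class_rows = np.flatnonzero(toy_labels == label)
    extra_rows = rng.choice(class_rows, size = counts.max() - count, replace = True)
    resampled_rows.append(np.concatenate([class_rows, extra_rows]))
resampled_rows = np.concatenate(resampled_rows)

# Restore the original image shape after resampling.
toy_oversampled = toy_flattened[resampled_rows].reshape(-1, *toy_images.shape[1:])
toy_labels_oversampled = toy_labels[resampled_rows]

print(toy_oversampled.shape)                # (10, 50, 50, 3)
print(np.bincount(toy_labels_oversampled))  # [5 5]
```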

<font color='pink' size=5>**4.2. Modelling with the Original Images**</font> <a class="anchor" id="P42"></a>

---
# **Model from Scratch**
---

We decided to begin hyperparameter tuning for our model using the `Hyperband Tuner`. The objective is to minimize the validation loss.

### <u>**Hyperparameter Tuning**</u>

First, we computed the class weights using the `compute_class_weight` function with `class_weight = 'balanced'`, applied to the training labels. This accounts for the class imbalance in the dataset by assigning higher weights to the underrepresented classes.

In [None]:
class_weights = compute_class_weight(class_weight = 'balanced', classes = np.unique(labels_train_one_hot_encoded.argmax(axis = 1)), y = labels_train_one_hot_encoded.argmax(axis = 1))

class_weights_dictionary = dict(enumerate(class_weights))
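
As a worked example of the `'balanced'` heuristic, each class weight equals `n_samples / (n_classes * class_count)`. The sketch below applies that formula with NumPy to hypothetical toy labels; the variable names are illustrative only.

```python
import numpy as np

# Hypothetical toy labels with class counts 6, 2 and 2.
toy_labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2, 2])

counts = np.bincount(toy_labels)

# Balanced weight: n_samples / (n_classes * class_count).
toy_weights = len(toy_labels) / (len(counts) * counts)
toy_weights_dictionary = dict(enumerate(toy_weights))

for label, weight in toy_weights_dictionary.items():
    print(label, round(float(weight), 3))
```

The frequent class receives a weight below 1 and the rare classes receive weights above 1. A dictionary of this shape is what the `class_weight` argument of Keras's `fit` method accepts.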

In [None]:
Tuner_2 = kt.Hyperband(model_building_function_2, objective = 'val_loss', max_epochs = 20, factor = 3, directory = 'Hyperband Tuner', project_name = 'Model 2')

In [None]:
Tuner_2.search(indices_train_array_, labels_train_one_hot_encoded, epochs = 20, validation_split = 0.2, callbacks = [early_stopping, reduce_on_plateau])

The **Multi_Class_Model** was constructed using the best parameters identified through the tuning process.

In [None]:
Best_Hyperparameters_Multi_Class = Tuner_2.get_best_hyperparameters(num_trials = 1)[0]
Multi_Class_Model = Tuner_2.hypermodel.build(Best_Hyperparameters_Multi_Class)

### **Hold - Out Method**

Below, we trained the **Multi_Class_Model** using the training data obtained from the hold-out method.

In [None]:
history_10 = Multi_Class_Model.fit(indices_train_array_, labels_train_one_hot_encoded, epochs = 50, validation_split = 0.2, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_10 = history_10.history  
f1_scores_10 = obtain_f1_score(histories_10, 0)

Training_and_Validation_Metrics_Plot(history_10, f1_scores_10)

We saved the model so that its architecture, weights, and training configuration are preserved and can easily be restored for later use.

In [None]:
Multi_Class_Model.save('Model 2.h5')

### **Data Augmentation**

Below, we trained the **Multi_Class_Model** using the training data obtained from data augmentation.

In [None]:
history_11 = Multi_Class_Model.fit(train_generator_multi_class, validation_data = validation_generator_multi_class, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_11 = history_11.history  
f1_scores_11 = obtain_f1_score(histories_11, 0)

Training_and_Validation_Metrics_Plot(history_11, f1_scores_11)

### **Oversampling** 

Below, we trained the **Multi_Class_Model** using the training data obtained from the data oversampling.

In [None]:
history_12 = Multi_Class_Model.fit(indices_train_oversampled_, labels_train_oversampled_, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_12 = history_12.history  
f1_scores_12 = obtain_f1_score(histories_12, 0)

Training_and_Validation_Metrics_Plot(history_12, f1_scores_12)

---
# **VGG16 Model**
---

We used a custom function to build a transfer learning model with the pre-trained **VGG16 model** as the base. 
<br>
In this process, we froze the weights of all layers in the VGG16 model to prevent them from being updated during training, allowing us to retain the knowledge learned from large-scale image datasets like ImageNet. 
<br>
We then added custom dense layers with ReLU activation functions, dropout for regularization, and a softmax activation in the final layer to classify the images into one of the 8 cancer types. 
<br>
The model was compiled with the Adam optimizer and categorical cross-entropy loss, with recall and precision tracked as evaluation metrics.

The **VGG_16_Model_Multi_Class** was constructed using this process.

In [None]:
VGG_16_Model_Multi_Class = transfer_learning_multi_class(VGG16(weights = 'imagenet', include_top = False, input_shape = (50, 50, 3)))
VGG_16_Model_Multi_Class.summary()

Below, we trained the **VGG_16_Model_Multi_Class** using the training data obtained from the hold-out method.

In [None]:
history_13 = VGG_16_Model_Multi_Class.fit(indices_train_array_, labels_train_one_hot_encoded, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_13 = history_13.history  
f1_scores_13 = obtain_f1_score(histories_13, 0)

Training_and_Validation_Metrics_Plot(history_13, f1_scores_13)

### **Data Augmentation** 

Below, we trained the **VGG_16_Model_Multi_Class** using the training data obtained from data augmentation.

In [None]:
history_14 = VGG_16_Model_Multi_Class.fit(train_generator_multi_class, validation_data = validation_generator_multi_class, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_14 = history_14.history  
f1_scores_14 = obtain_f1_score(histories_14, 0)

Training_and_Validation_Metrics_Plot(history_14, f1_scores_14)

### **Oversampling** 

Below, we trained the **VGG_16_Model_Multi_Class** using the training data obtained from data oversampling.

In [None]:
history_15 = VGG_16_Model_Multi_Class.fit(indices_train_oversampled_, labels_train_oversampled_, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_15 = history_15.history  
f1_scores_15 = obtain_f1_score(histories_15, 0)

Training_and_Validation_Metrics_Plot(history_15, f1_scores_15)

---
# **ResNet 50 Model**
---

We used a custom function to build a transfer learning model with the pre-trained **ResNet 50 model** as the base. 
<br>
In this process, we froze the weights of all layers in the ResNet 50 model to prevent them from being updated during training, allowing us to retain the knowledge learned from large-scale image datasets like ImageNet. 
<br>
We then added custom dense layers with ReLU activation functions, dropout for regularization, and a softmax activation in the final layer to classify the images into one of the 8 cancer types. 
<br>
The model was compiled with the Adam optimizer and categorical cross-entropy loss, with recall and precision tracked as evaluation metrics.

The **ResNet_50_Model_Multi_Class** was constructed using this process.

In [None]:
ResNet_50_Model_Multi_Class = transfer_learning_multi_class(ResNet50(weights = 'imagenet', include_top = False, input_shape = (50, 50, 3)))
ResNet_50_Model_Multi_Class.summary()

Below, we trained the **ResNet_50_Model_Multi_Class** using the training data obtained from the hold-out method.

In [None]:
history_16 = ResNet_50_Model_Multi_Class.fit(indices_train_array_, labels_train_one_hot_encoded, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_16 = history_16.history  
f1_scores_16 = obtain_f1_score(histories_16, 0)

Training_and_Validation_Metrics_Plot(history_16, f1_scores_16)

### **Data Augmentation** 

Below, we trained the **ResNet_50_Model_Multi_Class** using the training data obtained from data augmentation.

In [None]:
history_17 = ResNet_50_Model_Multi_Class.fit(train_generator_multi_class, validation_data = validation_generator_multi_class, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_17 = history_17.history  
f1_scores_17 = obtain_f1_score(histories_17, 0)

Training_and_Validation_Metrics_Plot(history_17, f1_scores_17)

### **Oversampling** 

Below, we trained the **ResNet_50_Model_Multi_Class** using the training data obtained from data oversampling.

In [None]:
history_18 = ResNet_50_Model_Multi_Class.fit(indices_train_oversampled_, labels_train_oversampled_, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_18 = history_18.history  
f1_scores_18 = obtain_f1_score(histories_18, 0)

Training_and_Validation_Metrics_Plot(history_18, f1_scores_18)

---
# **Functional API**
---

In this model, we designed a multi-input architecture using the Keras Functional API. 

The model takes two inputs: one for the image data and another for a binary feature indicating whether the tumor is benign or malignant.

The image input undergoes a series of convolutional layers, followed by batch normalization, max pooling, and dropout for regularization. 
<br>
These layers are designed to extract relevant features from the images while preventing overfitting. 

The binary input, which indicates whether a tumor is benign or malignant, is processed through dense layers to capture important information that could aid classification. 

The outputs from both paths are then concatenated before being passed through additional dense layers to produce the final output.

The reason for including the benign or malignant status is that it is highly informative: each cancer type belongs to exactly one of the two groups. 
<br>
Providing this prior information narrows down the plausible cancer types, which should help the model make more accurate predictions. 

The model is compiled using the Adam optimizer, and we are particularly focused on recall and precision as our evaluation metrics. 

In [None]:
binary_train = metadata.loc[indices_train_.index, 'Binary Labels'].values
binary_test = metadata.loc[indices_test_.index, 'Binary Labels'].values

In [None]:
image_input = Input(shape = (50, 50, 3), name = 'image_input')

x = Conv2D(32, (3, 3), kernel_regularizer = l2(0.01))(image_input)
x = LeakyReLU(alpha = 0.1)(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2))(x)

x = Conv2D(64, (3, 3), kernel_regularizer = l2(0.01))(x)
x = BatchNormalization()(x)
x = MaxPooling2D((2, 2))(x)

x = Conv2D(128, (3, 3), kernel_regularizer = l2(0.01))(x)
x = BatchNormalization()(x)

x = Flatten()(x)

x = Dense(256, activation = 'relu', kernel_regularizer = l2(0.01))(x)
x = Dropout(0.6)(x)

x = Dense(128, activation = 'relu', kernel_regularizer = l2(0.01))(x)

x = Dense(63, activation = 'relu', kernel_regularizer = l2(0.01))(x)

x = Dense(32, activation = 'relu', kernel_regularizer = l2(0.01))(x)

binary_input = Input(shape = (1,), name = 'binary_input')
binary_path = Dense(32, activation = 'relu', kernel_regularizer = l2(0.01))(binary_input)
binary_path = BatchNormalization()(binary_path)  
binary_path = Dropout(0.2)(binary_path)        
binary_path = Dense(16, activation = 'relu', kernel_regularizer = l2(0.01))(binary_path)
binary_path = Dense(8, activation = 'relu')(binary_path)

combined = Concatenate()([x, binary_path])

final_output = Dense(8, activation = 'softmax', name = 'output')(combined)

functional_API = Model(inputs = [image_input, binary_input], outputs = final_output)

functional_API.compile(optimizer = tf.keras.optimizers.Adam(learning_rate = 0.001, decay = 1e-6), loss = 'categorical_crossentropy', metrics = ['recall', 'precision'])
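
To show what the concatenation step does to the tensor shapes, here is a small NumPy sketch with placeholder arrays standing in for the outputs of the two branches (32 features from the image path and 8 from the binary path). The batch size and array contents are hypothetical; only the shapes matter.

```python
import numpy as np

batch_size = 4  # hypothetical batch size, for illustration only

# Placeholders for the branch outputs: Dense(32) on the image path, Dense(8) on the binary path.
image_features = np.zeros((batch_size, 32))
binary_features = np.zeros((batch_size, 8))

# Concatenate joins the two feature vectors along the last axis.
combined_features = np.concatenate([image_features, binary_features], axis = -1)

print(combined_features.shape)  # (4, 40)
```

The final softmax layer therefore sees a 40-dimensional vector per sample and maps it to 8 class probabilities.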

Below, we trained the **functional_API** using the training data obtained from the hold-out method.

In [None]:
history_19 = functional_API.fit([indices_train_array_, binary_train], labels_train_one_hot_encoded, validation_split = 0.2, epochs = 50, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_19 = history_19.history  
f1_scores_19 = obtain_f1_score(histories_19, 0)

Training_and_Validation_Metrics_Plot(history_19, f1_scores_19)

**Functional API with Hyperparameter Tuning**

In [None]:
Tuner_3 = kt.Hyperband(model_building_function_3, objective = 'val_loss', max_epochs = 20, factor = 3, directory = 'Hyperband Tuner', project_name = 'Model 3')

In [None]:
Tuner_3.search([indices_train_array_, binary_train], labels_train_one_hot_encoded, epochs = 20, validation_split = 0.2, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
Best_Hyperparameters_Functional_API = Tuner_3.get_best_hyperparameters(num_trials = 1)[0]

Functional_API_Tuned = Tuner_3.hypermodel.build(Best_Hyperparameters_Functional_API)

In [None]:
history_20 = Functional_API_Tuned.fit([indices_train_array_, binary_train], labels_train_one_hot_encoded, epochs = 50, validation_split = 0.2, batch_size = 650, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_20 = history_20.history  
f1_scores_20 = obtain_f1_score(histories_20, 0)

Training_and_Validation_Metrics_Plot(history_20, f1_scores_20)

<font color='pink' size=5>**4.3. Modelling with the Preprocessed Images**</font> <a class="anchor" id="P43"></a>

For this section, we applied several preprocessing techniques and trained the model returned by the tuner. 
<br>
However, for the grayscale images, we created a new model with the same structure, adapted to accept input images of shape (50, 50, 1).

---
# **Model from Scratch**
---

### **Grayscale** 

In [None]:
model_grayscale_multi_class = models.Sequential()

model_grayscale_multi_class.add(layers.Conv2D(32, (3, 3), activation = 'relu', input_shape = (50, 50, 1)))
model_grayscale_multi_class.add(layers.MaxPooling2D((2, 2)))

model_grayscale_multi_class.add(layers.Conv2D(64, (3, 3), activation = 'relu'))
model_grayscale_multi_class.add(layers.Conv2D(128, (3, 3), activation = 'relu'))

model_grayscale_multi_class.add(layers.Flatten())

model_grayscale_multi_class.add(layers.Dense(128, activation = 'relu'))
model_grayscale_multi_class.add(Dropout(0.6))

model_grayscale_multi_class.add(layers.Dense(64, activation = 'relu'))
model_grayscale_multi_class.add(layers.Dense(8, activation = 'softmax'))

model_grayscale_multi_class.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['recall', 'precision'])

Below, we trained the grayscale model using the training data obtained from the hold-out method, converted to grayscale.

In [None]:
indices_train_grayscale_ = apply_preprocessing(indices_train_array_, 'grayscale')

history_grayscale_ = model_grayscale_multi_class.fit(indices_train_grayscale_, labels_train_one_hot_encoded, validation_split = 0.2, batch_size = 650, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_grayscale_ = history_grayscale_.history  
f1_scores_grayscale_ = obtain_f1_score(histories_grayscale_, 0)

Training_and_Validation_Metrics_Plot(history_grayscale_, f1_scores_grayscale_)

### **Contrast** 

Below we trained our model using the training data obtained from the hold-out method, but with enhanced contrast applied to the images.

In [None]:
indices_train_contrast_ = apply_preprocessing(indices_train_array_, 'contrast')

history_contrast_ = Multi_Class_Model.fit(indices_train_contrast_, labels_train_one_hot_encoded, validation_split = 0.2, batch_size = 650, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_contrast_ = history_contrast_.history  
f1_scores_contrast_ = obtain_f1_score(histories_contrast_, 0)

Training_and_Validation_Metrics_Plot(history_contrast_, f1_scores_contrast_)

### **Brightness Contrast** 

Below we trained our model using the training data obtained from the hold-out method, but with adjusted brightness and contrast applied to the images.

In [None]:
indices_train_brightness_contrast_ = apply_preprocessing(indices_train_array_, 'brightness_contrast')

history_brightness_contrast_ = Multi_Class_Model.fit(indices_train_brightness_contrast_, labels_train_one_hot_encoded, validation_split = 0.2, batch_size = 650, epochs = 50, callbacks = [early_stopping, reduce_on_plateau])

In [None]:
histories_brightness_contrast_ = history_brightness_contrast_.history  
f1_scores_brightness_contrast_ = obtain_f1_score(histories_brightness_contrast_, 0)

Training_and_Validation_Metrics_Plot(history_brightness_contrast_, f1_scores_brightness_contrast_)

<font color='pink' size=5>**4.4. Final Model Evaluation**</font> <a class="anchor" id="P44"></a>

We concluded that the best model was the one returned by the tuner using the original images and the hold-out method. In the cell below, we load this model for further evaluation.

In [None]:
multi_class_classification_model = load_model('Model 2.h5')

In [None]:
multi_class_classification_model.summary()

We made predictions on the test data using the multi-class model and converted the probabilities to class labels by applying `np.argmax` to select the class with the highest probability. 
<br>
This gives us the predicted class for each sample in the test set.

In [None]:
test_predictions_multi_class = multi_class_classification_model.predict(indices_test_array_)
test_predictions_multi_class = np.argmax(test_predictions_multi_class, axis = 1)

In [None]:
class_names = ['Adenosis', 'Ductal Carcinoma', 'Fibroadenoma', 'Lobular Carcinoma', 'Mucinous Carcinoma', 'Papillary Carcinoma', 'Phyllodes Tumor', 'Tubular Adenoma']
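
The conversion from probabilities to class names can be illustrated on two made-up probability rows. The probabilities below are hypothetical and are not outputs of the model; the class names are copied from the list above.

```python
import numpy as np

class_names = ['Adenosis', 'Ductal Carcinoma', 'Fibroadenoma', 'Lobular Carcinoma',
               'Mucinous Carcinoma', 'Papillary Carcinoma', 'Phyllodes Tumor', 'Tubular Adenoma']

# Hypothetical softmax outputs for two samples (each row sums to 1).
toy_probabilities = np.array([
    [0.05, 0.70, 0.05, 0.05, 0.05, 0.04, 0.03, 0.03],
    [0.40, 0.10, 0.30, 0.05, 0.05, 0.04, 0.03, 0.03],
])

# argmax along axis 1 picks the most probable class for each sample.
toy_predicted_labels = np.argmax(toy_probabilities, axis = 1)
toy_predicted_names = [class_names[label] for label in toy_predicted_labels]

print(toy_predicted_labels.tolist())  # [1, 0]
print(toy_predicted_names)            # ['Ductal Carcinoma', 'Adenosis']
```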

Below, we visualize the confusion matrix and the classification report for each class.

In [None]:
confusion_matrix_multi_class(multi_class_classification_model, indices_test_array_, labels_test_one_hot_encoded, class_names)

In [None]:
evaluate_each_class(multi_class_classification_model, indices_test_array_, labels_test_one_hot_encoded, class_names)

The classification report provides a comprehensive evaluation of the multi-class model's performance across various cancer types. 

For some classes, such as Ductal Carcinoma, the model performs relatively well, with a precision of 0.59, recall of 0.93, and an F1-score of 0.72. 
<br>
This performance can be attributed to Ductal Carcinoma being the most frequent class in the dataset, allowing the model to learn more effectively due to the larger number of observations. 
<br>
The high recall for this class indicates that the model is good at correctly identifying most instances, which is crucial for detecting this common cancer type.
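
As a quick check, the F1-score reported for Ductal Carcinoma follows from its precision and recall, since F1 is their harmonic mean:

```python
# Precision and recall reported above for Ductal Carcinoma.
precision = 0.59
recall = 0.93

# F1 is the harmonic mean of precision and recall.
f1_score = 2 * precision * recall / (precision + recall)

print(round(f1_score, 2))  # 0.72
```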

However, the model struggles with rarer classes like Phyllodes Tumor, where precision, recall, and F1-score are all 0.00. 
<br>
This reflects the model's inability to correctly identify instances of this cancer type, likely due to the limited number of samples available during training. 
<br>
Phyllodes Tumor ranks second-to-last in terms of observations, which suggests that the scarcity of data for this class makes it difficult for the model to learn the distinguishing features, leading to poor performance. 
<br>
Similarly, other rare classes such as Mucinous Carcinoma and Papillary Carcinoma show low recall values, indicating that the model misses many instances of these cancer types.

The overall accuracy of 0.54 suggests that the model has some predictive power, but there is significant room for improvement, particularly when it comes to accurately predicting rarer cancer types. 
<br>
The macro average F1-score of 0.27 and the weighted average F1-score of 0.45 further emphasize the challenges faced by the model, especially considering the class imbalance. 
<br>
While the model performs better for more frequent classes, it struggles with rarer ones, underscoring the difficulty of achieving a balanced performance across all cancer types in an imbalanced dataset. 
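
The gap between the macro and weighted averages can be reproduced on a small hypothetical example. The per-class F1-scores and supports below are invented for illustration and are not the notebook's results.

```python
import numpy as np

# Hypothetical per-class F1-scores and supports (number of test samples per class).
toy_f1_scores = np.array([0.80, 0.10, 0.00])
toy_supports = np.array([80, 15, 5])

# Macro average: every class counts equally, so rare classes pull it down.
macro_f1 = toy_f1_scores.mean()

# Weighted average: each class counts in proportion to its support.
weighted_f1 = np.average(toy_f1_scores, weights = toy_supports)

print(round(float(macro_f1), 3), round(float(weighted_f1), 3))  # 0.3 0.655
```

When the frequent class scores well and the rare classes score poorly, the weighted average stays well above the macro average, which is the pattern seen in the report.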