## Supervised models
This notebook introduces the supervised ML models that can be used for COVID detection.

So that this notebook can import the modules created for this project, we add the project root directory (the parent of the notebook's folder) to the Python module search path.

In [1]:
# Auto reload modules
%load_ext autoreload
%autoreload 2

In [2]:
import sys
sys.path.append("../")
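
A relative entry such as `"../"` depends on the directory the notebook kernel was started from. As a minimal sketch, assuming the project root is exactly one level above the working directory, the path can instead be resolved to an absolute location and added only once. The helper name `add_project_root` and its parameters are hypothetical and not part of this project.

```python
import sys
from pathlib import Path


def add_project_root(levels_up=1, start=None):
    """Append the directory `levels_up` levels above `start` to sys.path.

    `start` defaults to the current working directory. The resolved
    path is returned so the caller can check where imports will come from.
    """
    start_dir = Path(start) if start is not None else Path.cwd()
    root = start_dir.resolve().parents[levels_up - 1]
    root_str = str(root)
    # Avoid duplicate entries when the cell is re-run.
    if root_str not in sys.path:
        sys.path.append(root_str)
    return root
```

In the notebook this would be called once, for example `add_project_root()`, before any import from `src`.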

<img src="../images/Supervised_Models.png" width="800"/>

## Loading packages and dependencies

In [None]:
from src.preprocessing.image_augmentor import generate_augmented_images_multiclass
from src.models.build_model import train_advanced_supervised_model, evaluate_model


# Path to the raw image data
raw_data_dir = '../data/raw/dataset/images'
IMG_SIZE = 299  # Resize images to IMG_SIZExIMG_SIZE pixels
batch_size = 32

## Extracting features from images

In [None]:
train_data, val_data, class_weight_dict = generate_augmented_images_multiclass(raw_data_dir, (IMG_SIZE, IMG_SIZE), batch_size)

Found 16933 images belonging to 4 classes.
Found 4232 images belonging to 4 classes.
Computed Class Weights:{0: 1.4632734185966125, 1: 0.8800935550935551, 2: 0.5191623742948246, 3: 3.9342472118959106} labels: {'COVID': 0, 'Lung_Opacity': 1, 'Normal': 2, 'Viral Pneumonia': 3}
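
The printed weights are consistent with the common "balanced" heuristic, `weight = n_samples / (n_classes * class_count)`, which is also the formula scikit-learn documents for `class_weight="balanced"`. Whether `generate_augmented_images_multiclass` computes them exactly this way is an assumption. The per-class counts below are not printed by the notebook; they are inferred by inverting the formula on the weights above and sum to the 16933 training images. The function name `balanced_class_weights` is hypothetical.

```python
def balanced_class_weights(counts):
    """Return {class_index: n_samples / (n_classes * count)}."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Inferred training counts (assumption, see the text above):
# 0 = COVID, 1 = Lung_Opacity, 2 = Normal, 3 = Viral Pneumonia
train_counts = {0: 2893, 1: 4810, 2: 8154, 3: 1076}
weights = balanced_class_weights(train_counts)

for label, weight in weights.items():
    print(label, round(weight, 4))
```

Rare classes receive larger weights: Viral Pneumonia, the smallest class, is weighted about 3.93, while Normal, the largest, is weighted about 0.52.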


## Normalizing features

## Training and evaluating models

### Convolutional Neural Networks (CNN)

✅ Strengths:
* Highly accurate for image tasks.
* Learns complex patterns automatically.
* Works well with large image datasets.

❌ Weaknesses:
* Computationally expensive (training is usually practical only on a GPU).
* Requires large labeled datasets.
* Not easily interpretable.
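
To make "learns complex patterns" more concrete, the sketch below applies a single hand-written filter to a toy image in plain Python. It only illustrates the sliding-window operation a convolutional layer performs; in a real CNN the kernel values are learned during training rather than written by hand. The function name `conv2d_valid`, the toy image, and the kernel are illustrative assumptions, not code from this project.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1).

    Returns the map of weighted sums, one value per kernel position.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 4x4 "image": dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

response = conv2d_valid(image, vertical_edge)
print(response)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The response is non-zero only in the middle column, where the window straddles the boundary between the dark and bright halves.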

In [5]:
# Train the CNN: 50 is the epoch count (see "Epoch 1/50" below); 4 is presumably the number of classes
model, history = train_advanced_supervised_model(train_data, val_data, IMG_SIZE, 50, 4, class_weight_dict, model_type='CNN', classification_type='categorical')

2025-03-09 17:10:03.296779: I metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M3 Max
2025-03-09 17:10:03.296811: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 128.00 GB
2025-03-09 17:10:03.296815: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 48.00 GB
2025-03-09 17:10:03.296833: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2025-03-09 17:10:03.296842: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)


  self._warn_if_super_not_called()


Epoch 1/50

2025-03-09 17:10:04.272392: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.

530/530 ━━━━━━━━━━━━━━━━━━━━ 84s 148ms/step - accuracy: 0.4349 - loss: 22.6839 - val_accuracy: 0.3889 - val_loss: 42.3137 - learning_rate: 1.0000e-04
Epoch 2/50
530/530 ━━━━━━━━━━━━━━━━━━━━ 197s 371ms/step - accuracy: 0.4774 - loss: 48.6980 - val_accuracy: 0.6749 - val_loss: 24.4113 - learning_rate: 1.0000e-04
Epoch 3/50
530/530 ━━━━━━━━━━━━━━━━━━━━ 325s 613ms/step - accuracy: 0.5143 - loss: 66.8435 - val_accuracy: 0.6371 - val_loss: 51.1466 - learning_rate: 1.0000e-04
Epoch 4/50
530/530 ━━━━━━━━━━━━━━━━━━━━ 580s 1s/step - accuracy: 0.5420 - loss: 94.3771 - val_accuracy: 0.7181 - val_loss: 36.5573 - learning_rate: 1.0000e-04
Epoch 5/50
530/530 ━━━━━━━━━━━━━━━━━━━━ 0s 589ms/step - accuracy: 0.5653 - loss: 113.8473
Epoch 5: ReduceLROnPlateau reducing learning rate to 4.999999873689376e-05.
530/530 ━━━━━━━━━━

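The `ReduceLROnPlateau` message in the log above shows the learning rate being halved from 1e-04 to roughly 5e-05 once the monitored metric stopped improving. The sketch below imitates that kind of plateau rule in plain Python to make the mechanics concrete. It is not the Keras implementation: the function name `plateau_schedule`, the `patience=3` setting, and the validation-loss values are illustrative assumptions, and only the 0.5 factor is suggested by the log. The helper `as_float32` shows why the log prints 4.999999873689376e-05 instead of 5e-05: the value is consistent with 5e-05 rounded to 32-bit precision.

```python
import struct


def plateau_schedule(val_losses, initial_lr, factor=0.5, patience=3, min_lr=0.0):
    """Return the learning rate in effect after each epoch under a plateau rule."""
    lr = initial_lr
    best = float("inf")
    wait = 0  # epochs since the last improvement
    schedule = []
    for loss in val_losses:
        if loss < best:
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                # No improvement for `patience` epochs: shrink the rate.
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule


def as_float32(x):
    """Round a Python float to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Made-up validation losses: the best value occurs at epoch 2,
# then three epochs pass without improvement.
made_up_val_losses = [1.00, 0.80, 0.90, 0.85, 0.82]
schedule = plateau_schedule(made_up_val_losses, initial_lr=1e-4)
print(schedule)
print(as_float32(5e-5))
```

With these made-up losses the rate stays at 1e-04 for four epochs and drops to 5e-05 after the fifth.
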
In [8]:
# Metrics from the last training epoch
train_loss, train_acc = history.history['loss'][-1], history.history['accuracy'][-1]
print(f"Train Accuracy: {train_acc:.4f}, Train Loss: {train_loss:.4f}")

# Evaluation runs on val_data, so the "test" metrics below are validation metrics.
# `_` is IPython's last-output variable, presumably passed here as a placeholder.
test_loss, test_acc = evaluate_model("Multi-class classification [Normal, COVID, Viral Pneumonia, Lung_Opacity] for images without masks", model, val_data, _, model_type="CNN", classification_type="multiclass")
print(f"Test Accuracy: {test_acc:.4f}, Test Loss: {test_loss:.4f}")

Train Accuracy: 0.6107, Train Loss: 145.0412
133/133 ━━━━━━━━━━━━━━━━━━━━ 14s 105ms/step - accuracy: 0.6615 - loss: 25.9321
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 34ms/step


Successfully registered model 'tensorflow-CNN-multiclass'.
2025/03/09 17:48:50 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: tensorflow-CNN-multiclass, version 1


🏃 View run CNN-multiclass at: http://localhost:8080/#/experiments/803656475742168625/runs/d1e9a661a66f4c97b2c88d0546769448
🧪 View experiment at: http://localhost:8080/#/experiments/803656475742168625
Test Accuracy: 0.6661, Test Loss: 24.5907


Created version '1' of model 'tensorflow-CNN-multiclass'.


### Transfer learning

✅ Strengths:
* Reusing a pretrained network reduces training time while maintaining high accuracy.
* Fine-tuning improves performance when sufficient data is available.
* Combining deep features with statistical features can enhance results.

In [9]:
# Same arguments as the CNN run above; only model_type changes
model, history = train_advanced_supervised_model(train_data, val_data, IMG_SIZE, 50, 4, class_weight_dict, model_type="Transfer Learning", classification_type='categorical')

Epoch 1/50
530/530 ━━━━━━━━━━━━━━━━━━━━ 88s 158ms/step - accuracy: 0.5588 - loss: 0.9287 - val_accuracy: 0.7616 - val_loss: 0.6082 - learning_rate: 1.0000e-04
Epoch 2/50
530/530 ━━━━━━━━━━━━━━━━━━━━ 79s 149ms/step - accuracy: 0.7619 - loss: 0.5204 - val_accuracy: 0.7940 - val_loss: 0.5169 - learning_rate: 1.0000e-04
Epoch 3/50
530/530 ━━━━━━━━━━━━━━━━━━━━ 76s 144ms/step - accuracy: 0.7988 - loss: 0.4302 - val_accuracy: 0.8228 - val_loss: 0.4697 - learning_rate: 1.0000e-04
Epoch 4/50
530/530 ━━━━━━━━━━━━━━━━━━━━ 76s 143ms/step - accuracy: 0.8067 - loss: 0.4192 - val_accuracy: 0.8450 - val_loss: 0.4119 - learning_rate: 1.0000e-04
Epoch 5/50
530/530 ━━━━━━━━━━━━━━━━━━━━ 82s 154ms/step - accuracy: 0.8260 - loss: 0.3777 - val_accuracy: 0.8164 - val_loss: 0.4781 - learning_rate: 1.0000e-04
Epoch 6/50
530/530 ━━━

In [10]:
# Metrics from the last training epoch
train_loss, train_acc = history.history['loss'][-1], history.history['accuracy'][-1]
print(f"Train Accuracy: {train_acc:.4f}, Train Loss: {train_loss:.4f}")

# Evaluation runs on val_data, so the "test" metrics below are validation metrics.
test_loss, test_acc = evaluate_model("Multi-class classification [Normal, COVID, Viral Pneumonia, Lung_Opacity] for images without masks", model, val_data, _, model_type="Transfer Learning", classification_type="multiclass")
print(f"Test Accuracy: {test_acc:.4f}, Test Loss: {test_loss:.4f}")

Train Accuracy: 0.8599, Train Loss: 0.2946
133/133 ━━━━━━━━━━━━━━━━━━━━ 15s 111ms/step - accuracy: 0.8804 - loss: 0.3301
1/1 ━━━━━━━━━━━━━━━━━━━━ 2s 2s/step


Successfully registered model 'tensorflow-Transfer Learning-multiclass'.
2025/03/09 18:12:32 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: tensorflow-Transfer Learning-multiclass, version 1


🏃 View run Transfer Learning-multiclass at: http://localhost:8080/#/experiments/803656475742168625/runs/191d9828446c4d85820db8f565b27032
🧪 View experiment at: http://localhost:8080/#/experiments/803656475742168625
Test Accuracy: 0.8724, Test Loss: 0.3523


Created version '1' of model 'tensorflow-Transfer Learning-multiclass'.
