### Ensemble Evaluation Notebook
This notebook evaluates ensemble models built with three ensemble strategies:
- stacking ensemble  
- hard voting ensemble  
- soft voting ensemble  

The effectiveness of each ensemble strategy is evaluated, documented, and compared with the standard methods.  
The code in this notebook runs the entire scenario.
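
To make the difference between the voting strategies concrete, here is a minimal sketch in plain Python. It is not the implementation in `EnsembleModels.Ensemble`; the helper names `hard_vote` and `soft_vote` and the probability values are made up for illustration. Hard voting takes a majority vote over each model's predicted class, while soft voting averages the class probabilities first and then picks the most likely class.

```python
from collections import Counter
from typing import List


def hard_vote(probs_per_model: List[List[float]]) -> int:
    """Majority vote over each model's argmax (ties go to the lowest class index)."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in probs_per_model]
    counts = Counter(votes)
    best = max(counts.values())
    return min(cls for cls, n in counts.items() if n == best)


def soft_vote(probs_per_model: List[List[float]]) -> int:
    """Argmax of the class probabilities averaged across models."""
    n_models = len(probs_per_model)
    n_classes = len(probs_per_model[0])
    mean = [sum(p[c] for p in probs_per_model) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


# Hypothetical outputs of three models for one sample with three classes.
probs = [
    [0.55, 0.45, 0.00],  # weakly prefers class 0
    [0.55, 0.45, 0.00],  # weakly prefers class 0
    [0.05, 0.95, 0.00],  # strongly prefers class 1
]

hard = hard_vote(probs)  # two of three models vote for class 0
soft = soft_vote(probs)  # the averaged probability is highest for class 1
print("hard:", hard, "soft:", soft)
```

The two strategies can only disagree when the base models disagree. If every ensemble member produces the same probabilities, both strategies return that model's own prediction.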

In [1]:
# Imports
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F

import os

from EnsembleModels.Ensemble import StackingEnsemble, HardVotingEnsemble, SoftVotingEnsemble
from DataObjects import DataLoader
from torch import Tensor
from typing import Dict

from Architectures.SimpleCNN import SimpleCNN
from Architectures.OptimalCNN import OptimalCNN
from Architectures.StochasticDepthCNN import StochasticDepthCNN

from utils import load_model

In [2]:
# load datasets

num_classes: int = 10

train_loader_path = os.path.join("Data", "Data_converted", "train")
val_loader_path = os.path.join("Data", "Data_converted", "valid")
test_loader_path = os.path.join("Data", "Data_converted", "test")

device: torch.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
train_loader = DataLoader(train_loader_path, batch_size=64, shuffle=True, max_per_class=150)
val_loader = DataLoader(val_loader_path, batch_size=64, shuffle=False, max_per_class=150) 
test_loader = DataLoader(test_loader_path, batch_size=64, shuffle=False, max_per_class=150) 

In [3]:
# NOTE: all three entries load the same checkpoint file, so the ensemble members are identical copies.
model_Optimal_1 = load_model(os.path.join("Models_Pytorch_saved", "OptimalCNN_trained_saved.pth"))
model_Optimal_2 = load_model(os.path.join("Models_Pytorch_saved", "OptimalCNN_trained_saved.pth"))
model_Optimal_3 = load_model(os.path.join("Models_Pytorch_saved", "OptimalCNN_trained_saved.pth"))

Model loaded successfully from Models_Pytorch_saved\OptimalCNN_trained_saved.pth
Model loaded successfully from Models_Pytorch_saved\OptimalCNN_trained_saved.pth
Model loaded successfully from Models_Pytorch_saved\OptimalCNN_trained_saved.pth


In [4]:
base_models: Dict[str, nn.Module] = {"Model_Optimal_1": model_Optimal_1, "Model_Optimal_2": model_Optimal_2, "Model_Optimal_3": model_Optimal_3}
stacking_ensemble: StackingEnsemble = StackingEnsemble(base_models, num_classes)
hard_voting_ensemble: HardVotingEnsemble = HardVotingEnsemble(base_models)
soft_voting_ensemble: SoftVotingEnsemble = SoftVotingEnsemble(base_models)

stacking_ensemble.to(device)
optimizer = optim.Adam(stacking_ensemble.meta_model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
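
The optimizer above receives only `stacking_ensemble.meta_model.parameters()`, which suggests that the base models stay frozen and only a meta-model is trained on their outputs. The sketch below shows one common way such a stacking model can be structured. It is an assumption, not the actual `StackingEnsemble` class: `TinyStacker`, the linear meta-model, and the toy `nn.Linear` base classifiers are hypothetical stand-ins.

```python
import torch
import torch.nn as nn


class TinyStacker(nn.Module):
    """Hypothetical stand-in for a stacking ensemble (not the notebook's class)."""

    def __init__(self, base_models, num_classes: int):
        super().__init__()
        self.base_models = nn.ModuleList(base_models)
        # Freeze the base models so that only the meta-model is trained.
        for p in self.base_models.parameters():
            p.requires_grad_(False)
        self.base_models.eval()
        # The meta-model sees the concatenated class probabilities of all base models.
        self.meta_model = nn.Linear(len(base_models) * num_classes, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            feats = [torch.softmax(m(x), dim=1) for m in self.base_models]
        return self.meta_model(torch.cat(feats, dim=1))


torch.manual_seed(0)
bases = [nn.Linear(8, 10) for _ in range(3)]  # toy base classifiers
stacker = TinyStacker(bases, num_classes=10)
logits = stacker(torch.randn(4, 8))  # batch of 4 samples with 8 features each
print(logits.shape)
```

With three base models and ten classes, the meta-model input has 30 features, and only the meta-model's weights and biases receive gradients.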

In [None]:
stacking_ensemble.train_ensemble(train_loader, optimizer, criterion, device, epochs=20)
loss, acc = stacking_ensemble.test(test_loader, criterion, device)
print("Stacking Ensemble Test Loss:", loss)
print("Stacking Ensemble Test Accuracy:", acc)

hv_acc = hard_voting_ensemble.test(test_loader, device)
print("Hard Voting Ensemble Test Accuracy:", hv_acc)

sv_acc = soft_voting_ensemble.test(test_loader, device)
print("Soft Voting Ensemble Test Accuracy:", sv_acc)

100%|██████████| 20/20 [02:35<00:00,  7.76s/it]

Epoch 1/20 completed.
...
Epoch 20/20 completed.

Stacking Ensemble Test Loss: 0.7572781588236491
Stacking Ensemble Test Accuracy: 0.8233333333333334
Hard Voting Ensemble Test Accuracy: 0.82
Soft Voting Ensemble Test Accuracy: 0.82
