### **Libraries**

In [2]:
import time
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torchvision.datasets import ImageFolder
import tensorflow_datasets as tfds
from sklearn.metrics import f1_score
import matplotlib.pyplot as plt
import numpy as np
import torchvision.models as models


In [3]:
# Check GPU availability
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

Using device: cuda


### **Dataset**

The **Flowers-102** dataset consists of 8,189 color images of varying sizes, divided into 102 classes with 40 to 258 images per class.

The images are resized to match each model's expected input size: most of the models used here were pretrained on ImageNet at 224x224 pixels, while Inception v3 expects 299x299.

In [3]:
# Resize and normalization transform for the 224x224 models
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize to match models' expected input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # ImageNet channel means and standard deviations
])

# Resize and normalization transform for InceptionV3
transform_inception = transforms.Compose([
    transforms.Resize((299, 299)),  # Resize to 299x299 pixels for InceptionV3
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Load the Flowers-102 dataset
(ds_train, ds_test), ds_info = tfds.load(
    'oxford_flowers102',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)

# Convert a TFDS dataset into an in-memory list of (image_tensor, label) pairs for PyTorch
def tfds_to_imagefolder(ds, transform):
    images, labels = [], []
    for image, label in tfds.as_numpy(ds):
        image = transform(transforms.ToPILImage()(image))
        images.append(image)
        labels.append(label)
    dataset = list(zip(images, labels))
    return dataset

trainset = tfds_to_imagefolder(ds_train, transform)
testset = tfds_to_imagefolder(ds_test, transform)
trainset_inception = tfds_to_imagefolder(ds_train, transform_inception)
testset_inception = tfds_to_imagefolder(ds_test, transform_inception)

# Create DataLoaders
trainloader = DataLoader(trainset, batch_size=32, shuffle=True, num_workers=2)
testloader = DataLoader(testset, batch_size=32, shuffle=False, num_workers=2)
trainloader_inception = DataLoader(trainset_inception, batch_size=32, shuffle=True, num_workers=2)
testloader_inception = DataLoader(testset_inception, batch_size=32, shuffle=False, num_workers=2)
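
As a rough illustration of what the `Normalize` step in the transforms above does, the sketch below applies the same per-channel formula, `(x - mean) / std`, to single RGB pixels in plain Python. The helper name `normalize_pixel` is hypothetical and not part of the notebook; only the mean and standard deviation values come from the code above.

```python
# ImageNet channel statistics, copied from the transforms above.
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply (x - mean) / std per channel to one RGB pixel scaled to [0, 1].

    Hypothetical helper: it mirrors the arithmetic of the normalization
    step for a single pixel instead of a whole image tensor.
    """
    return [(x - m) / s for x, m, s in zip(rgb, MEAN, STD)]


black = normalize_pixel([0.0, 0.0, 0.0])   # darkest possible input
white = normalize_pixel([1.0, 1.0, 1.0])   # brightest possible input
mean_pixel = normalize_pixel(MEAN)         # a pixel equal to the channel means

print([round(v, 3) for v in black])
print([round(v, 3) for v in white])
print(mean_pixel)  # a pixel at the channel mean maps to zero in every channel
```

After normalization the inputs are roughly centered on zero, which matches the input range the ImageNet-pretrained weights were trained on.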


The CNN architectures that will be considered are the following:


- **AlexNet:** Original architecture
- **ResNet:** ResNet-50
- **VGGNet:** VGG-16
- **GoogLeNet:** Inception v3
- **EfficientNet:** EfficientNet-B1, EfficientNet-B3, EfficientNet-B5

In [6]:
# Define models
models = {
    'AlexNet': torchvision.models.alexnet(pretrained=True),
    'ResNet50': torchvision.models.resnet50(pretrained=True),
    'VGG16': torchvision.models.vgg16(pretrained=True),
    'InceptionV3': torchvision.models.inception_v3(pretrained=True, aux_logits=True),
    'EfficientNet-B1': torchvision.models.efficientnet_b1(pretrained=True),
    'EfficientNet-B3': torchvision.models.efficientnet_b3(pretrained=True),
    'EfficientNet-B5': torchvision.models.efficientnet_b5(pretrained=True)
}



The following code adapts each model to the Flowers-102 dataset by replacing its final layer with a 102-way linear layer. For AlexNet, VGG, and EfficientNet, the replaced layer is the last element of the classifier; for Inception and ResNet, it is the final fully connected layer (*fc*).

In [7]:
# Modify the final layer to match the 102 classes of Flowers-102
for name, model in models.items():
    if 'EfficientNet' in name:
        num_ftrs = model.classifier[-1].in_features
        model.classifier[-1] = nn.Linear(num_ftrs, 102)
    elif 'Inception' in name:
        num_ftrs = model.fc.in_features
        model.fc = nn.Linear(num_ftrs, 102)
    elif 'AlexNet' in name or 'VGG' in name:
        num_ftrs = model.classifier[-1].in_features
        model.classifier[-1] = nn.Linear(num_ftrs, 102)
    else:
        num_ftrs = model.fc.in_features
        model.fc = nn.Linear(num_ftrs, 102)
    models[name] = model.to(device)
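
To give a sense of how small the replaced layer is, the sketch below counts the weights and biases of a 102-way linear head for each model. The `in_features` values are assumptions based on the standard torchvision definitions of these architectures; the notebook itself reads them from the model at run time, so treat the numbers as illustrative rather than verified against this environment.

```python
NUM_CLASSES = 102  # number of Flowers-102 classes, as in the cell above

# ASSUMPTION: input width of each model's final linear layer.
# These values are not taken from the notebook output.
IN_FEATURES = {
    "AlexNet": 4096,
    "VGG16": 4096,
    "ResNet50": 2048,
    "InceptionV3": 2048,
    "EfficientNet-B1": 1280,
    "EfficientNet-B3": 1536,
    "EfficientNet-B5": 2048,
}


def linear_params(in_features, out_features):
    """Parameter count of a fully connected layer: weight matrix plus bias."""
    return in_features * out_features + out_features


head_sizes = {name: linear_params(n, NUM_CLASSES) for name, n in IN_FEATURES.items()}

for name, size in head_sizes.items():
    print(f"{name}: {size:,} parameters in the new head")
```

Only this head starts from random weights. All other layers keep their pretrained values, and because the optimizer receives `model.parameters()`, they are still updated during training.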

This part initializes the loss function used for training the models. Cross-entropy loss measures how well a model's predicted probability distribution matches the actual distribution of the labels.

For a dataset, the cross-entropy loss is defined as:

$$
L = -\frac{1}{N} \sum_{n=1}^N \sum_{i=1}^C y_{ni} \log(p_{ni})
$$

- \(N\): The number of samples.
- \(C\): The number of classes (102 for this dataset).
- \(y_{ni}\): The binary indicator that equals 1 if class \(i\) is the true class of sample \(n\), and 0 otherwise.
- \(p_{ni}\): The predicted probability for sample \(n\) and class \(i\).
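
The formula can be checked by hand with a small sketch. Because \(y_{ni}\) is one-hot, only the true-class term survives the inner sum, so the loss reduces to the mean of \(-\log\) of the probability assigned to the correct class. The function name `cross_entropy` and the toy probabilities are illustrative assumptions. Note also that `nn.CrossEntropyLoss` takes raw logits and applies log-softmax internally, whereas this sketch starts from probabilities.

```python
import math


def cross_entropy(probs, targets):
    """Mean cross-entropy for one-hot labels.

    probs:   list of per-sample probability lists (each sums to 1)
    targets: list of true class indices
    """
    return -sum(math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)


# Toy example with C = 3 classes and N = 2 samples.
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.8, 0.1]]
targets = [0, 1]
loss = cross_entropy(probs, targets)
print(f"toy loss: {loss:.4f}")

# Baseline: a model that spreads probability uniformly over 102 classes.
uniform_loss = cross_entropy([[1 / 102] * 102], [0])
print(f"uniform baseline for 102 classes: {uniform_loss:.4f}")
```

The uniform baseline equals \(\ln(102) \approx 4.625\). A training loss that stays near this value suggests the model's predictions are no better than a uniform guess over the classes.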

In [8]:
# Define loss function
criterion = nn.CrossEntropyLoss()

The next cell defines three functions:

The **train_model** function trains a given model on the training dataset for a specified number of epochs while tracking the loss and accuracy.

The **evaluate_model** function evaluates the trained model on the test dataset and returns the accuracy and the weighted F1 score.

The **train_model_inception** function trains the InceptionV3 model with accuracy tracking. In training mode, the model returns a tuple when *aux_logits* is *True*, so the code checks whether the output is a tuple and uses only the main output (ignoring the auxiliary output).
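
The tuple check can be illustrated without loading a model. The sketch below uses a hypothetical named tuple, `FakeInceptionOutputs`, as a stand-in for the structured output that an Inception-style model returns in training mode. It is an assumption for illustration, not the torchvision class itself, and plain lists stand in for tensors.

```python
from collections import namedtuple

# Hypothetical stand-in for a (main logits, auxiliary logits) training output.
FakeInceptionOutputs = namedtuple("FakeInceptionOutputs", ["logits", "aux_logits"])


def main_output(outputs):
    """Return the main logits whether or not auxiliary outputs are attached."""
    if isinstance(outputs, tuple):  # named tuples are tuples, so this matches
        return outputs[0]           # index 0 holds the main output
    return outputs


train_out = FakeInceptionOutputs(logits=[0.1, 0.9], aux_logits=[0.3, 0.7])
eval_out = [0.2, 0.8]  # in evaluation mode only the main output is returned

print(main_output(train_out))
print(main_output(eval_out))
```

Because the same check passes through a non-tuple output unchanged, evaluation code can call the model without special handling.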

In [9]:
# Function to train the model with accuracy tracking
def train_model(model, trainloader, criterion, optimizer, num_epochs=25):
    model.train()
    for epoch in range(num_epochs):
        running_loss = 0.0
        correct = 0
        total = 0
        start_time = time.time()
        for i, data in enumerate(trainloader, 0):
            inputs, labels = data[0].to(device), data[1].to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()

            _, preds = torch.max(outputs, 1)
            correct += torch.sum(preds == labels).item()
            total += labels.size(0)

        epoch_loss = running_loss / len(trainloader)
        epoch_accuracy = correct / total
        end_time = time.time()

        print(f"Epoch {epoch+1}, Loss: {epoch_loss:.4f}, Accuracy: {epoch_accuracy:.4f}, Time: {end_time - start_time:.2f} seconds")

# Function to evaluate the model
def evaluate_model(model, testloader):
    model.eval()
    all_preds = []
    all_labels = []
    correct = 0
    total = 0
    with torch.no_grad():
        for inputs, labels in testloader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)
            correct += torch.sum(preds == labels).item()
            total += labels.size(0)
            all_preds.extend(preds.cpu().numpy())
            all_labels.extend(labels.cpu().numpy())

    accuracy = correct / total
    f1 = f1_score(all_labels, all_preds, average='weighted')
    return accuracy, f1

# Define a modified train function for InceptionV3
def train_model_inception(model, trainloader, criterion, optimizer, num_epochs=25):
    model.train()
    for epoch in range(num_epochs):
        running_loss = 0.0
        start_time = time.time()
        correct = 0
        total = 0
        for i, data in enumerate(trainloader, 0):
            inputs, labels = data[0].to(device), data[1].to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            if isinstance(outputs, tuple):
                outputs = outputs[0]  # Use only the main output
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
            _, preds = torch.max(outputs, 1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
        end_time = time.time()
        epoch_accuracy = correct / total
        print(f"Epoch {epoch+1}, Loss: {running_loss / len(trainloader)}, Accuracy: {epoch_accuracy * 100:.2f}%, Time: {end_time - start_time} seconds")
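
The `average='weighted'` option used in `evaluate_model` computes one F1 score per class and averages them, weighting each class by its number of true samples. The sketch below reproduces that calculation in plain Python on a toy example. The function name `weighted_f1` and the labels are illustrative assumptions; this is a minimal re-derivation, not the scikit-learn implementation.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is positive because n >= 1.
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += f1 * n / total
    return score


y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
result = weighted_f1(y_true, y_pred)
print(f"weighted F1: {result:.4f}")
```

Here the per-class F1 scores are 2/3, 4/5, and 1 with supports 2, 2, and 1, which gives 59/75, or about 0.7867. Weighting by support matters for Flowers-102 because the number of images per class is not uniform.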

### **Model's training**

Each of the architectures listed above is trained in the same way: on the training data for 25 epochs, using the Adam optimizer with a learning rate of 0.001. The key terms in this section are:

- **Epoch** refers to one complete pass of the entire training dataset through the learning algorithm. 

- **Adam optimizer** (*optim.Adam*) is an adaptive learning rate optimization algorithm designed specifically for training deep neural networks. It combines the advantages of two other extensions of stochastic gradient descent: AdaGrad and RMSProp.

- **Learning rate** controls how much to change the model in response to the estimated error each time the model weights are updated during training. 
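
To make the roles of Adam and the learning rate more concrete, here is a minimal sketch of the Adam update rule for a single scalar parameter, minimizing the toy function \(f(x) = (x - 3)^2\). The function `adam_minimize` and the toy objective are assumptions for illustration. The hyperparameters (`beta1=0.9`, `beta2=0.999`, `eps=1e-8`) follow the commonly used defaults, and `lr=0.001` matches the notebook.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8, steps=100):
    """Run Adam on one scalar parameter and return every intermediate value."""
    x, m, v = x0, 0.0, 0.0
    history = [x]
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g       # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g   # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)          # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
        history.append(x)
    return history


# Gradient of f(x) = (x - 3)^2 is 2 * (x - 3); start at x = 0.
history = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
print(f"after 1 step:    {history[1]:.6f}")
print(f"after 100 steps: {history[-1]:.6f}")
```

The first step moves the parameter by almost exactly the learning rate, even though the raw gradient is -6. Adam rescales each update by the running gradient magnitude, so the learning rate sets the approximate step size per parameter.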

After training, the model is evaluated on the test data (testloader) to calculate the accuracy and F1 score. Finally, the results including the accuracy, F1 score, and training time are printed to the console.
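
To put the training setup described above into numbers, the sketch below works out how many optimizer steps 25 epochs correspond to. The training-set size of 1,020 images is an assumption: it is the size usually reported for the TFDS `train` split of this dataset, and it is consistent with the logged accuracies (for example, 0.0078 is about 8/1020). The batch size and epoch count come from the notebook.

```python
import math

TRAIN_IMAGES = 1020  # ASSUMPTION: size of the TFDS 'train' split
BATCH_SIZE = 32      # from the DataLoader definitions
NUM_EPOCHS = 25      # from the training calls

# One epoch is one full pass over the training set, split into batches.
batches_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
total_steps = batches_per_epoch * NUM_EPOCHS
last_batch = TRAIN_IMAGES - (batches_per_epoch - 1) * BATCH_SIZE

print(f"batches per epoch: {batches_per_epoch}")
print(f"optimizer steps over {NUM_EPOCHS} epochs: {total_steps}")
print(f"size of the final batch in each epoch: {last_batch}")
```

Under this assumption each epoch has 32 batches, the last of which holds 28 images. Since `train_model` divides the summed batch losses by the number of batches, that smaller final batch carries slightly more weight per image in the reported epoch loss.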

#### **AlexNet**

In [11]:
# Train and evaluate AlexNet
model_name = 'AlexNet'
print(f"\nTraining {model_name}...\n")
model = models[model_name]
optimizer = optim.Adam(model.parameters(), lr=0.001) # lr=0.001 is the default learning rate for Adam
start_time = time.time()
train_model(model, trainloader, criterion, optimizer, num_epochs=25) # num_epochs=25 due to hardware limitation
training_time = time.time() - start_time
accuracy, f1 = evaluate_model(model, testloader)

print(f"{model_name} - Top-1 Accuracy: {accuracy * 100:.2f}%")
print(f"{model_name} - F1 Score: {f1 * 100:.2f}%")
print(f"{model_name} - Training Time: {training_time:.2f} seconds")


Training AlexNet...

Epoch 1, Loss: 4.6260, Accuracy: 0.0078, Time: 7.30 seconds
Epoch 2, Loss: 4.6256, Accuracy: 0.0098, Time: 7.20 seconds
Epoch 3, Loss: 4.6260, Accuracy: 0.0039, Time: 7.44 seconds
Epoch 4, Loss: 4.6263, Accuracy: 0.0078, Time: 7.32 seconds
Epoch 5, Loss: 4.6268, Accuracy: 0.0049, Time: 7.20 seconds
Epoch 6, Loss: 4.6280, Accuracy: 0.0029, Time: 7.22 seconds
Epoch 7, Loss: 4.6286, Accuracy: 0.0059, Time: 7.33 seconds
Epoch 8, Loss: 4.6294, Accuracy: 0.0029, Time: 7.19 seconds
Epoch 9, Loss: 4.6294, Accuracy: 0.0039, Time: 7.46 seconds
Epoch 10, Loss: 4.6294, Accuracy: 0.0098, Time: 7.38 seconds
Epoch 11, Loss: 4.6274, Accuracy: 0.0029, Time: 7.05 seconds
Epoch 12, Loss: 4.6260, Accuracy: 0.0059, Time: 7.43 seconds
Epoch 13, Loss: 4.6256, Accuracy: 0.0069, Time: 7.24 seconds
Epoch 14, Loss: 4.6259, Accuracy: 0.0069, Time: 7.25 seconds
Epoch 15, Loss: 4.6255, Accuracy: 0.0039, Time: 7.25 seconds
Epoch 16, Loss: 4.6258, Accuracy: 0.0039, Time: 7.40 seconds
Epoch 17, L

#### **ResNet-50**

In [12]:
# Train and evaluate ResNet50
model_name = 'ResNet50'
print(f"\nTraining {model_name}...\n")
model = models[model_name]
optimizer = optim.Adam(model.parameters(), lr=0.001)
start_time = time.time()
train_model(model, trainloader, criterion, optimizer, num_epochs=25)
training_time = time.time() - start_time
accuracy, f1 = evaluate_model(model, testloader)

print(f"{model_name} - Top-1 Accuracy: {accuracy * 100:.2f}%")
print(f"{model_name} - F1 Score: {f1 * 100:.2f}%")
print(f"{model_name} - Training Time: {training_time:.2f} seconds")


Training ResNet50...

Epoch 1, Loss: 4.7556, Accuracy: 0.0402, Time: 101.96 seconds
Epoch 2, Loss: 4.3269, Accuracy: 0.0412, Time: 123.19 seconds
Epoch 3, Loss: 3.8430, Accuracy: 0.0686, Time: 107.46 seconds
Epoch 4, Loss: 3.3960, Accuracy: 0.1343, Time: 103.04 seconds
Epoch 5, Loss: 2.9675, Accuracy: 0.2206, Time: 100.98 seconds
Epoch 6, Loss: 2.4951, Accuracy: 0.3265, Time: 101.05 seconds
Epoch 7, Loss: 2.2786, Accuracy: 0.3775, Time: 100.56 seconds
Epoch 8, Loss: 1.7822, Accuracy: 0.5010, Time: 100.43 seconds
Epoch 9, Loss: 1.4218, Accuracy: 0.5892, Time: 99.40 seconds
Epoch 10, Loss: 1.2098, Accuracy: 0.6520, Time: 96.95 seconds
Epoch 11, Loss: 0.8811, Accuracy: 0.7451, Time: 98.46 seconds
Epoch 12, Loss: 0.7757, Accuracy: 0.7676, Time: 100.59 seconds
Epoch 13, Loss: 0.5511, Accuracy: 0.8431, Time: 100.34 seconds
Epoch 14, Loss: 0.3540, Accuracy: 0.9029, Time: 100.73 seconds
Epoch 15, Loss: 0.2226, Accuracy: 0.9471, Time: 100.80 seconds
Epoch 16, Loss: 0.1868, Accuracy: 0.9461, Ti

#### **VGG-16**

In [13]:
# Train and evaluate VGG16
model_name = 'VGG16'
print(f"\nTraining {model_name}...\n")
model = models[model_name]
optimizer = optim.Adam(model.parameters(), lr=0.001)
start_time = time.time()
train_model(model, trainloader, criterion, optimizer, num_epochs=25)
training_time = time.time() - start_time
accuracy, f1 = evaluate_model(model, testloader)

print(f"{model_name} - Top-1 Accuracy: {accuracy * 100:.2f}%")
print(f"{model_name} - F1 Score: {f1:.2f}")
print(f"{model_name} - Training Time: {training_time:.2f} seconds")


Training VGG16...

Epoch 1, Loss: 4.7589, Accuracy: 0.0078, Time: 475.20 seconds
Epoch 2, Loss: 4.6422, Accuracy: 0.0059, Time: 468.42 seconds
Epoch 3, Loss: 4.6482, Accuracy: 0.0059, Time: 472.10 seconds
Epoch 4, Loss: 4.6386, Accuracy: 0.0069, Time: 477.71 seconds
Epoch 5, Loss: 4.6544, Accuracy: 0.0039, Time: 455.08 seconds
Epoch 6, Loss: 4.6652, Accuracy: 0.0137, Time: 452.59 seconds
Epoch 7, Loss: 4.6341, Accuracy: 0.0069, Time: 456.47 seconds
Epoch 8, Loss: 4.6268, Accuracy: 0.0127, Time: 451.33 seconds
Epoch 9, Loss: 4.6080, Accuracy: 0.0108, Time: 450.89 seconds
Epoch 10, Loss: 4.5824, Accuracy: 0.0069, Time: 451.50 seconds
Epoch 11, Loss: 4.4838, Accuracy: 0.0186, Time: 451.21 seconds
Epoch 12, Loss: 4.4472, Accuracy: 0.0108, Time: 451.30 seconds
Epoch 13, Loss: 4.3860, Accuracy: 0.0147, Time: 450.57 seconds
Epoch 14, Loss: 4.3273, Accuracy: 0.0225, Time: 452.21 seconds
Epoch 15, Loss: 4.4369, Accuracy: 0.0265, Time: 451.05 seconds
Epoch 16, Loss: 4.2231, Accuracy: 0.0353, Ti

#### **GoogLeNet: Inception v3**

In [14]:
# Training and evaluating InceptionV3
print("\nTraining InceptionV3...\n")
model_inceptionv3 = models['InceptionV3']
optimizer_inceptionv3 = optim.Adam(model_inceptionv3.parameters(), lr=0.001)

start_time = time.time()
train_model_inception(model_inceptionv3, trainloader_inception, criterion, optimizer_inceptionv3, num_epochs=25)
end_time = time.time()

accuracy_inceptionv3, f1_inceptionv3 = evaluate_model(model_inceptionv3, testloader_inception)
print(f"InceptionV3 - Top-1 Accuracy: {accuracy_inceptionv3 * 100:.2f}%")
print(f"InceptionV3 - F1 Score: {f1_inceptionv3:.2f}")
print(f"Training time for InceptionV3: {end_time - start_time} seconds")  # end_time is taken before evaluation, so this covers training only


Training InceptionV3...

Epoch 1, Loss: 4.403911046683788, Accuracy: 8.04%, Time: 361.23323249816895 seconds
Epoch 2, Loss: 2.977291002869606, Accuracy: 28.53%, Time: 341.2345886230469 seconds
Epoch 3, Loss: 2.0481783524155617, Accuracy: 46.18%, Time: 336.2950701713562 seconds
Epoch 4, Loss: 1.4312483929097652, Accuracy: 60.39%, Time: 330.7246196269989 seconds
Epoch 5, Loss: 0.9784657210111618, Accuracy: 72.55%, Time: 330.96296668052673 seconds
Epoch 6, Loss: 0.6254676431417465, Accuracy: 82.75%, Time: 331.0347945690155 seconds
Epoch 7, Loss: 0.49142562272027135, Accuracy: 86.67%, Time: 330.55593848228455 seconds
Epoch 8, Loss: 0.41412029042840004, Accuracy: 88.92%, Time: 338.07885217666626 seconds
Epoch 9, Loss: 0.2777302369941026, Accuracy: 92.94%, Time: 342.7504951953888 seconds
Epoch 10, Loss: 0.21562380297109485, Accuracy: 94.80%, Time: 343.76017332077026 seconds
Epoch 11, Loss: 0.11300348734948784, Accuracy: 97.45%, Time: 333.3437535762787 seconds
Epoch 12, Loss: 0.1354126282385

#### **EfficientNet-B1**

In [15]:
# Training and evaluating EfficientNet-B1
print("\nTraining EfficientNet-B1...\n")
model_efficientnet_b1 = models['EfficientNet-B1'].to(device)  # Ensure the model is on GPU
optimizer_efficientnet_b1 = optim.Adam(model_efficientnet_b1.parameters(), lr=0.001)

start_time = time.time()
train_model(model_efficientnet_b1, trainloader, criterion, optimizer_efficientnet_b1, num_epochs=25)
training_time = time.time() - start_time

accuracy_efficientnet_b1, f1_efficientnet_b1 = evaluate_model(model_efficientnet_b1, testloader)
print(f"EfficientNet-B1 - Top-1 Accuracy: {accuracy_efficientnet_b1 * 100:.2f}%")
print(f"EfficientNet-B1 - F1 Score: {f1_efficientnet_b1:.2f}")
print(f"EfficientNet-B1 - Training Time: {training_time:.2f} seconds")


Training EfficientNet-B1...



Epoch 1, Loss: 3.8232, Accuracy: 0.2667, Time: 101.96 seconds
Epoch 2, Loss: 1.0012, Accuracy: 0.8461, Time: 105.30 seconds
Epoch 3, Loss: 0.1939, Accuracy: 0.9794, Time: 99.56 seconds
Epoch 4, Loss: 0.0697, Accuracy: 0.9941, Time: 99.25 seconds
Epoch 5, Loss: 0.0313, Accuracy: 0.9990, Time: 99.25 seconds
Epoch 6, Loss: 0.0167, Accuracy: 1.0000, Time: 99.56 seconds
Epoch 7, Loss: 0.0115, Accuracy: 1.0000, Time: 99.24 seconds
Epoch 8, Loss: 0.0070, Accuracy: 1.0000, Time: 99.53 seconds
Epoch 9, Loss: 0.0060, Accuracy: 1.0000, Time: 99.43 seconds
Epoch 10, Loss: 0.0043, Accuracy: 1.0000, Time: 99.32 seconds
Epoch 11, Loss: 0.0037, Accuracy: 1.0000, Time: 99.74 seconds
Epoch 12, Loss: 0.0027, Accuracy: 1.0000, Time: 99.58 seconds
Epoch 13, Loss: 0.0027, Accuracy: 1.0000, Time: 99.29 seconds
Epoch 14, Loss: 0.0021, Accuracy: 1.0000, Time: 99.29 seconds
Epoch 15, Loss: 0.0021, Accuracy: 1.0000, Time: 99.32 seconds
Epoch 16, Loss: 0.0022, Accuracy: 1.0000, Time: 98.78 seconds
Epoch 17, Loss:

#### **EfficientNet-B3**

In [16]:
# Training and evaluating EfficientNet-B3
print("\nTraining EfficientNet-B3...\n")
model_efficientnet_b3 = models['EfficientNet-B3'].to(device)  # Ensure the model is on GPU
optimizer_efficientnet_b3 = optim.Adam(model_efficientnet_b3.parameters(), lr=0.001)

start_time = time.time()
train_model(model_efficientnet_b3, trainloader, criterion, optimizer_efficientnet_b3, num_epochs=25)
training_time = time.time() - start_time

accuracy_efficientnet_b3, f1_efficientnet_b3 = evaluate_model(model_efficientnet_b3, testloader)
print(f"EfficientNet-B3 - Top-1 Accuracy: {accuracy_efficientnet_b3 * 100:.2f}%")
print(f"EfficientNet-B3 - F1 Score: {f1_efficientnet_b3:.2f}")
print(f"EfficientNet-B3 - Training Time: {training_time:.2f} seconds")


Training EfficientNet-B3...

Epoch 1, Loss: 3.8952, Accuracy: 0.2275, Time: 178.53 seconds
Epoch 2, Loss: 1.3263, Accuracy: 0.7559, Time: 177.42 seconds
Epoch 3, Loss: 0.3527, Accuracy: 0.9353, Time: 177.18 seconds
Epoch 4, Loss: 0.1565, Accuracy: 0.9735, Time: 177.41 seconds
Epoch 5, Loss: 0.1028, Accuracy: 0.9794, Time: 177.05 seconds
Epoch 6, Loss: 0.0750, Accuracy: 0.9804, Time: 177.03 seconds
Epoch 7, Loss: 0.0462, Accuracy: 0.9931, Time: 177.31 seconds
Epoch 8, Loss: 0.0384, Accuracy: 0.9931, Time: 177.29 seconds
Epoch 9, Loss: 0.0359, Accuracy: 0.9941, Time: 177.30 seconds
Epoch 10, Loss: 0.0615, Accuracy: 0.9853, Time: 177.04 seconds
Epoch 11, Loss: 0.0623, Accuracy: 0.9902, Time: 177.18 seconds
Epoch 12, Loss: 0.0219, Accuracy: 0.9971, Time: 178.35 seconds
Epoch 13, Loss: 0.0485, Accuracy: 0.9882, Time: 184.73 seconds
Epoch 14, Loss: 0.0532, Accuracy: 0.9833, Time: 178.82 seconds
Epoch 15, Loss: 0.0704, Accuracy: 0.9804, Time: 177.05 seconds
Epoch 16, Loss: 0.0784, Accuracy: 