Oday ziq

Step 1: Load and Preprocess CIFAR-10 Dataset


In [1]:
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False, num_workers=2)


Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz


100%|██████████| 170498071/170498071 [00:18<00:00, 9122852.30it/s]


Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified


The transform resizes each image to 224x224 (the input size the ImageNet-pretrained AlexNet and VGG16 expect), converts it to a tensor, and normalizes each channel with mean 0.5 and standard deviation 0.5.
The CIFAR-10 training and test sets are then downloaded and loaded with this transform applied. The training loader shuffles its batches; the test loader does not.
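
To make the normalization step concrete, the following minimal sketch reproduces the per-channel arithmetic by hand on a random tensor, so it needs no dataset download. The helper name `normalize_by_hand` and the variable names are illustrative and not part of the notebook. The ImageNet statistics are the values commonly used with torchvision's pretrained models and are shown only for comparison; the notebook itself uses 0.5 throughout.

```python
import torch

def normalize_by_hand(img, mean, std):
    """Apply (img - mean) / std per channel to a (C, H, W) tensor."""
    mean = torch.tensor(mean).view(-1, 1, 1)
    std = torch.tensor(std).view(-1, 1, 1)
    return (img - mean) / std

torch.manual_seed(0)
img = torch.rand(3, 224, 224)  # stand-in for a ToTensor() output, values in [0, 1)

# Statistics used in the notebook: maps [0, 1] onto [-1, 1]
out = normalize_by_hand(img, (0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
lo, hi = out.min().item(), out.max().item()

# Assumption for comparison only: the usual ImageNet channel statistics
imagenet_out = normalize_by_hand(
    img, (0.485, 0.456, 0.406), (0.229, 0.224, 0.225)
)
print(f"0.5/0.5 range: [{lo:.3f}, {hi:.3f}]")
print(f"ImageNet-stat range: [{imagenet_out.min().item():.3f}, {imagenet_out.max().item():.3f}]")
```

Whether the 0.5 statistics or the ImageNet statistics suit a pretrained network better is an empirical question that this notebook does not test.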

Step 2: Load Pretrained Models (AlexNet and VGG16)

In [2]:
import torch.nn as nn
import torchvision.models as models

alexnet = models.alexnet(pretrained=True)
vgg16 = models.vgg16(pretrained=True)

alexnet.classifier[6] = nn.Linear(alexnet.classifier[6].in_features, 10)
vgg16.classifier[6] = nn.Linear(vgg16.classifier[6].in_features, 10)


Downloading: "https://download.pytorch.org/models/alexnet-owt-7be5be79.pth" to /root/.cache/torch/hub/checkpoints/alexnet-owt-7be5be79.pth
100%|██████████| 233M/233M [00:01<00:00, 166MB/s]
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to /root/.cache/torch/hub/checkpoints/vgg16-397923af.pth
100%|██████████| 528M/528M [00:03<00:00, 173MB/s]


Both AlexNet and VGG16 are loaded with weights pretrained on the ImageNet dataset. In each model, the original final classification layer (1000 ImageNet classes) is replaced with a new linear layer that outputs 10 classes for CIFAR-10. The new layer keeps the same number of input features and starts from randomly initialized weights.
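
The layer swap can be checked without downloading any weights. The sketch below uses a hypothetical stand-in classifier with the same layer order as torchvision's AlexNet classifier (final `Linear` at index 6) but much smaller widths, which are an assumption chosen to keep the example light. It replaces the last layer the same way the cell above does and confirms the output shape.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in: same layer order as the AlexNet classifier,
# but with small widths so the example runs instantly.
head = nn.Sequential(
    nn.Dropout(),
    nn.Linear(256, 128),
    nn.ReLU(inplace=True),
    nn.Dropout(),
    nn.Linear(128, 128),
    nn.ReLU(inplace=True),
    nn.Linear(128, 1000),  # index 6: original 1000-class output layer
)

num_classes = 10
# Replace the output layer, keeping its input width
head[6] = nn.Linear(head[6].in_features, num_classes)

head.eval()  # disable dropout for the shape check
with torch.no_grad():
    logits = head(torch.randn(4, 256))

print(logits.shape)
```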

Step 3: Set Up Finetuning and ConvNet as Fixed Feature Extractor Approaches

Finetuning

In [3]:
def train_model(model, trainloader, criterion, optimizer, num_epochs=3):
    # Training loop for the model
    for epoch in range(num_epochs):
        model.train()  # Set the model to training mode
        running_loss = 0.0
        for inputs, labels in trainloader:
            inputs, labels = inputs.to(device), labels.to(device)  # Move data to the appropriate device
            optimizer.zero_grad()  # Zero the parameter gradients
            outputs = model(inputs)  # Forward pass
            loss = criterion(outputs, labels)  # Compute loss
            loss.backward()  # Backward pass
            optimizer.step()  # Update weights
            running_loss += loss.item()
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(trainloader):.4f}')  # Print loss for each epoch

def evaluate_model(model, testloader):
    # Evaluation loop for the model
    model.eval()  # Set the model to evaluation mode
    correct = 0
    total = 0
    with torch.no_grad():
        for inputs, labels in testloader:
            inputs, labels = inputs.to(device), labels.to(device)  # Move data to the appropriate device
            outputs = model(inputs)  # Forward pass
            _, predicted = torch.max(outputs.data, 1)  # Get the predicted class
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print(f'Accuracy: {100 * correct / total:.2f}%')  # Print the accuracy

# Set up device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")  # Use GPU if available
alexnet.to(device)  # Move AlexNet to the device
vgg16.to(device)  # Move VGG16 to the device

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()  # Cross-entropy loss for classification
optimizer_alexnet = torch.optim.SGD(alexnet.parameters(), lr=0.001, momentum=0.9)  # Optimizer for AlexNet
optimizer_vgg16 = torch.optim.SGD(vgg16.parameters(), lr=0.001, momentum=0.9)  # Optimizer for VGG16

# Train and evaluate AlexNet
train_model(alexnet, trainloader, criterion, optimizer_alexnet, num_epochs=3)  # Train AlexNet
evaluate_model(alexnet, testloader)  # Evaluate AlexNet

# Train and evaluate VGG16
train_model(vgg16, trainloader, criterion, optimizer_vgg16, num_epochs=3)  # Train VGG16
evaluate_model(vgg16, testloader)  # Evaluate VGG16


Epoch [1/3], Loss: 0.6196
Epoch [2/3], Loss: 0.3814
Epoch [3/3], Loss: 0.2969
Accuracy: 89.02%
Epoch [1/3], Loss: 0.4477
Epoch [2/3], Loss: 0.2198
Epoch [3/3], Loss: 0.1432
Accuracy: 92.76%
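
The accuracy printed by `evaluate_model` is the share of test images whose highest-scoring logit matches the label. As a small worked example of that computation, the sketch below applies the same `torch.max` and comparison steps to four made-up logit rows; the numbers are invented for illustration and are not model output.

```python
import torch

# Invented logits for 4 samples and 3 classes (illustration only)
logits = torch.tensor([
    [2.0, 0.5, -1.0],
    [0.1, 0.2, 3.0],
    [1.5, 1.0, 0.0],
    [0.0, 4.0, 1.0],
])
labels = torch.tensor([0, 2, 1, 1])

# Index of the largest logit per row is the predicted class
_, predicted = torch.max(logits, 1)

correct = (predicted == labels).sum().item()
total = labels.size(0)
accuracy = 100 * correct / total

print(f"Accuracy: {accuracy:.2f}%")
```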


ConvNet as Fixed Feature Extractor


In [7]:
def set_parameter_requires_grad(model, feature_extracting):
    # Freeze parameters for feature extraction
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False

# Freeze all layers of AlexNet and VGG16 (the final layer is unfrozen below).
# Note: these are the same model objects that were finetuned in the previous cell.
set_parameter_requires_grad(alexnet, feature_extracting=True)
set_parameter_requires_grad(vgg16, feature_extracting=True)

# Ensure the last layer requires gradients
# Remember to set requires_grad to True for all layers you want to train.
for param in alexnet.classifier[6].parameters():
    param.requires_grad = True
for param in vgg16.classifier[6].parameters():
    param.requires_grad = True

# Redefine the optimizer to only train the final layer
optimizer_alexnet = torch.optim.SGD(alexnet.classifier[6].parameters(), lr=0.001, momentum=0.9)
optimizer_vgg16 = torch.optim.SGD(vgg16.classifier[6].parameters(), lr=0.001, momentum=0.9)

# Train and evaluate AlexNet as feature extractor
train_model(alexnet, trainloader, criterion, optimizer_alexnet, num_epochs=3)  # Train AlexNet as a fixed feature extractor
evaluate_model(alexnet, testloader)  # Evaluate AlexNet

# Train and evaluate VGG16 as feature extractor
train_model(vgg16, trainloader, criterion, optimizer_vgg16, num_epochs=3)  # Train VGG16 as a fixed feature extractor
evaluate_model(vgg16, testloader)  # Evaluate VGG16

Epoch [1/3], Loss: 0.2115
Epoch [2/3], Loss: 0.1977
Epoch [3/3], Loss: 0.1936
Accuracy: 90.62%
Epoch [1/3], Loss: 0.0664
Epoch [2/3], Loss: 0.0601
Epoch [3/3], Loss: 0.0585
Accuracy: 93.47%
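
A quick way to confirm that freezing works as intended is to count the trainable parameters and check that a frozen layer is unchanged after an optimizer step. The sketch below does this on a small hypothetical stand-in network; the layer sizes and variable names are assumptions for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in: one "feature" layer followed by a classification head
model = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 10))

# Freeze everything, then unfreeze only the final layer
for param in model.parameters():
    param.requires_grad = False
for param in model[2].parameters():
    param.requires_grad = True

trainable = [p for p in model.parameters() if p.requires_grad]
n_trainable = sum(p.numel() for p in trainable)
n_total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {n_trainable} of {n_total}")

# Pass only the trainable parameters to the optimizer
optimizer = torch.optim.SGD(trainable, lr=0.001, momentum=0.9)
criterion = nn.CrossEntropyLoss()

frozen_before = model[0].weight.clone()
head_before = model[2].weight.clone()

# One training step on random data
inputs = torch.randn(8, 32)
labels = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = criterion(model(inputs), labels)
loss.backward()
optimizer.step()

frozen_unchanged = torch.equal(frozen_before, model[0].weight)
head_changed = not torch.equal(head_before, model[2].weight)
print(f"Frozen layer unchanged: {frozen_unchanged}, head updated: {head_changed}")
```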


In this case study, we applied transfer learning using two popular networks, AlexNet and VGG16, to classify images on the CIFAR-10 dataset. We tested two setups: finetuning and using the ConvNet as a fixed feature extractor. The finetuning approach involves updating all the pretrained weights, including those of the newly added classification layer, while the fixed feature extractor approach involves freezing the weights of all layers except the final fully connected layer, which is trained from scratch.

Based on the results from the training and evaluation of AlexNet and VGG16 using the two setups, several comparisons and observations can be made. For the finetuning setup, AlexNet's loss decreased from 0.6196 in the first epoch to 0.2969 in the third epoch, resulting in a final test accuracy of 89.02%. In contrast, VGG16's loss decreased from 0.4477 to 0.1432 over the same period, achieving a final test accuracy of 92.76%. When using the ConvNet as a fixed feature extractor, AlexNet's loss decreased from 0.2115 to 0.1936, with a final test accuracy of 90.62%. VGG16, on the other hand, showed a loss decrease from 0.0664 to 0.0585, attaining the highest test accuracy of 93.47%.

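Both models reach higher test accuracy in the fixed feature extractor runs than in the finetuning runs (90.62% vs. 89.02% for AlexNet, 93.47% vs. 92.76% for VGG16). This comparison should be read with care: the feature extractor cell reuses the alexnet and vgg16 objects that had already been finetuned for three epochs, so those runs train only the final layer for three more epochs on top of finetuned weights rather than starting from the original ImageNet weights. The low first-epoch losses (0.2115 and 0.0664) are consistent with this. The results still indicate that the pretrained features transfer well to CIFAR-10. VGG16 outperforms AlexNet in both setups, which suggests that the deeper network captures more complex features and performs better on tasks involving detailed image patterns.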
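
Because the feature extractor cell operates on the same model objects that were finetuned earlier, one way to compare the two setups from identical starting weights is to give each setup its own deep copy of the pretrained model. The following is a minimal sketch under stated assumptions: `TinyNet` is a hypothetical stand-in with a `classifier` attribute like the torchvision models, and `make_setups` is an illustrative helper, not code from the notebook.

```python
import copy
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in with a features/classifier split."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(32, 16), nn.ReLU())
        self.classifier = nn.Sequential(
            nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 1000)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

def make_setups(pretrained, num_classes):
    """Return independent (finetune, extractor) copies of `pretrained`."""
    finetune = copy.deepcopy(pretrained)
    extractor = copy.deepcopy(pretrained)
    for model in (finetune, extractor):
        last = len(model.classifier) - 1
        in_features = model.classifier[last].in_features
        model.classifier[last] = nn.Linear(in_features, num_classes)
    # Freeze everything in the extractor except its new head
    for param in extractor.parameters():
        param.requires_grad = False
    for param in extractor.classifier[-1].parameters():
        param.requires_grad = True
    return finetune, extractor

torch.manual_seed(0)
base = TinyNet()  # stands in for a model loaded with pretrained weights
finetune, extractor = make_setups(base, num_classes=10)

n_ft = sum(p.numel() for p in finetune.parameters() if p.requires_grad)
n_fx = sum(p.numel() for p in extractor.parameters() if p.requires_grad)
print(f"Trainable parameters: finetune={n_ft}, extractor={n_fx}")
```

With the notebook's models, the same idea would apply to `alexnet` and `vgg16` before any training, with each copy receiving its own optimizer.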

Regarding loss reduction, the finetuning runs show the larger decrease for both models (0.6196 to 0.2969 for AlexNet, 0.4477 to 0.1432 for VGG16), as expected when all layers are updated to fit CIFAR-10. The fixed feature extractor runs start at lower loss values and decrease only slightly (0.2115 to 0.1936 and 0.0664 to 0.0585). Only the final layer is updated in those runs, and the networks had already been fitted to CIFAR-10 by the finetuning step when they began, so little loss remained to remove. The accuracy gains are correspondingly small, 1.60 percentage points for AlexNet and 0.71 for VGG16.

In conclusion, the highest accuracy, 93.47%, was obtained with VGG16 in the fixed feature extractor run, and AlexNet also scored higher in its feature extractor run (90.62%) than after finetuning (89.02%). Since the feature extractor runs continued from the finetuned models, these gains cannot be attributed to freezing alone; separating the two effects would require starting each setup from freshly loaded pretrained weights. The results do demonstrate the effectiveness of transfer learning from large datasets like ImageNet: after only three epochs of finetuning, both networks reach about 89% to 93% test accuracy on CIFAR-10, and the deeper VGG16 outperforms AlexNet throughout. Training only the final layer also updates far fewer parameters per step, which reduces computational cost.