<h1> ECE4179 - Semi-Supervised Learning Project</h1>
<h2>Data</h2>

We will be using the STL-10 dataset, which can be obtained directly from the torchvision package. There are 10 classes, and we will train a CNN for image classification. The labelled data is divided into training, validation and test sets, and there is also a large unlabelled set.

We will simulate a low-data scenario by sampling only a small fraction (10%) of the labelled training data for training. The remaining 90% will be used as the validation set.
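The split sizes follow directly from the 10% figure. As a rough sketch, assuming the 5,000 labelled training images that STL-10 ships with and the example batch size of 512 used in this notebook, the arithmetic looks like this (`split_sizes` is a hypothetical helper name, not part of torchvision):

```python
import math


def split_sizes(num_labelled, train_fraction=0.1):
    """Return (num_train, num_val) for a simple fractional split."""
    num_train = int(num_labelled * train_fraction)
    return num_train, num_labelled - num_train


# Assumption: the labelled 'train' split of STL-10 holds 5,000 images.
num_train, num_val = split_sizes(5000)
print(num_train, num_val)

# With a batch size of 512 the whole training subset fits in one batch,
# so each epoch performs a single optimiser step.
batches_per_epoch = math.ceil(num_train / 512)
print(batches_per_epoch)
```

One consequence worth keeping in mind: ten epochs at this batch size amount to only ten parameter updates, so a smaller batch size may be worth trying.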

To get the labelled data, change the dataset_dir to something suitable for your machine, and execute the following (you will then probably want to wrap the dataset objects in a PyTorch DataLoader):

In [2]:
import torch
import torch.nn as nn
from torchvision.datasets import STL10 as STL10
import torchvision.transforms as transforms
from torch.utils.data import random_split
import torchvision
from torch.utils.data import DataLoader
from copy import deepcopy
from torch.optim import Adam
import torch.optim as optim
from torchvision import models
from sklearn.metrics import f1_score, classification_report
import time

####### CHANGE TO APPROPRIATE DIRECTORY TO STORE DATASET
dataset_dir = "../data"
#For MonARCH
# dataset_dir = "/mnt/lustre/projects/ds19/SHARED"

#All images are 3x96x96
image_size = 96                 
#Example batch size
batch_size = 512

<h3>Print the GPU type and select the device</h3>

In [4]:
if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    print(f"Using GPU: {gpu_name}")
else:
    print("No GPU available, using CPU")

# Select the device (GPU if available, otherwise CPU); models and batches are moved to it later
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

Using GPU: NVIDIA TITAN RTX


<h3>Print the number of logical CPU cores</h3>

In [6]:
import multiprocessing
print(multiprocessing.cpu_count())

16


<h3>Create the appropriate transforms</h3>

In [8]:
#Perform random crops and mirroring for data augmentation
transform_train = transforms.Compose(
    [transforms.RandomCrop(image_size, padding=4),
     transforms.RandomHorizontalFlip(p=0.5),
     transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

transform_unlabelled = transforms.Compose(
    [transforms.RandomHorizontalFlip(p=0.5),
     transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

#No random augmentation at test time
transform_test = transforms.Compose(
    [transforms.CenterCrop(image_size),
     transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
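To make the effect of `Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))` concrete, here is a minimal sketch of the per-channel rule `(value - mean) / std` applied to single pixel values. The function `normalize` is a hypothetical stand-in written for illustration, not the torchvision implementation.

```python
def normalize(value, mean=0.5, std=0.5):
    """Apply the per-channel rule used by Normalize: (value - mean) / std."""
    return (value - mean) / std


# ToTensor() scales pixel intensities to [0, 1]; these are the two extremes
# and the midpoint after normalisation.
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

With a mean and standard deviation of 0.5, inputs in [0, 1] end up in [-1, 1]. Models pretrained on ImageNet are often paired with ImageNet channel statistics instead, which may be worth comparing.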


<h3>Create training and validation split</h3>

In [10]:
#Load train and validation sets
trainval_set = STL10(dataset_dir, split='train', transform=transform_train, download=True)

#Use 10% of data for training - simulating low data scenario
num_train = int(len(trainval_set)*0.1)

#Split data into train/val sets
torch.manual_seed(0) #Set torch's random seed so that random split of data is reproducible
train_set, val_set = random_split(trainval_set, [num_train, len(trainval_set)-num_train])

#Load test set
test_set = STL10(dataset_dir, split='test', transform=transform_test, download=False)

Files already downloaded and verified
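Note that `random_split` returns views onto the same `trainval_set`, so `val_set` inherits `transform_train`, including its random crops and flips. If deterministic validation images are preferred, one option is to give each subset its own transform. The sketch below assumes the underlying dataset is created with `transform=None` so that it yields raw images; `TransformedSubset` is a hypothetical helper, written as a plain class with `__len__` and `__getitem__`, which is what a map-style dataset needs.

```python
class TransformedSubset:
    """Map-style dataset view that applies its own transform to a subset.

    Hypothetical helper: `subset` is assumed to yield (raw_image, label)
    pairs, e.g. a split of a dataset created with transform=None.
    """

    def __init__(self, subset, transform):
        self.subset = subset
        self.transform = transform

    def __len__(self):
        return len(self.subset)

    def __getitem__(self, index):
        image, label = self.subset[index]
        return self.transform(image), label


# Toy stand-in data: numbers play the role of images.
raw = [(1, 0), (2, 1), (3, 0)]
train_view = TransformedSubset(raw, transform=lambda x: x * 10)  # "augmented"
val_view = TransformedSubset(raw, transform=lambda x: x)         # left unchanged
print(train_view[1], val_view[1])
```

In this notebook the two views would wrap the outputs of `random_split`, with `transform_train` for the training subset and `transform_test` for the validation subset.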


<h3>Get the unlabelled data</h3>

In [12]:
unlabelled_set = STL10(dataset_dir, split='unlabeled', transform=transform_unlabelled, download=True)

Files already downloaded and verified


You may find later that you want to change how the unlabelled data is loaded. This might require sub-classing the STL10 class used above, or writing your own dataset class modelled on the torchvision one:
https://pytorch.org/docs/stable/_modules/torchvision/datasets/stl10.html#STL10
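As one possible starting point, the sketch below wraps a dataset so that it returns only the image. It assumes the unlabelled split yields `(image, label)` pairs whose label is a placeholder (`-1` in torchvision's STL10); `ImagesOnly` is a hypothetical name, and a full subclass of `STL10` could achieve the same by overriding `__getitem__`.

```python
class ImagesOnly:
    """Map-style wrapper that discards the label of each (image, label) pair.

    Hypothetical helper for unlabelled data, where the label carries no
    information.
    """

    def __init__(self, dataset):
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        image, _ = self.dataset[index]  # drop the placeholder label
        return image


# Toy stand-in: strings play the role of images, -1 is the placeholder label.
unlabelled = [("img_a", -1), ("img_b", -1)]
images = ImagesOnly(unlabelled)
print(len(images), images[0])
```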

<h3>Create the four dataloaders</h3>

In [15]:
train_loader = DataLoader(train_set, shuffle=True, batch_size=batch_size)
unlabelled_loader = DataLoader(unlabelled_set, shuffle=True, batch_size=batch_size)

valid_loader = DataLoader(val_set, batch_size=batch_size)
test_loader = DataLoader(test_set, batch_size=batch_size)

<h3>Accuracy</h3>

In [17]:
# Define the test function
def test_model(model, test_loader):
    model.eval()  # Set the model to evaluation mode
    correct = 0
    total = 0

    with torch.no_grad():  # Disable gradient calculation
        for inputs, labels in test_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    accuracy = 100 * correct / total
    print(f"Test Accuracy: {accuracy}%")

<h3>Macro F1 Score</h3>

In [19]:
# Define the test function to calculate F1 score
def test_model_with_f1(model, test_loader):
    model.eval()  # Set model to evaluation mode
    all_labels = []
    all_preds = []

    with torch.no_grad():  # Disable gradient calculation
        for inputs, labels in test_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            _, predicted = torch.max(outputs, 1)
            
            # Collect all predictions and labels for F1-score calculation
            all_labels.extend(labels.cpu().numpy())
            all_preds.extend(predicted.cpu().numpy())

    # Calculate the macro F1-score (unweighted mean of the per-class F1-scores)
    f1 = f1_score(all_labels, all_preds, average='macro')
    
    # Alternatively, you can get a detailed report for all classes
    report = classification_report(all_labels, all_preds, target_names=[f"Class {i}" for i in range(10)])
    
    print(f"Macro F1-score: {f1}")
    print("Classification Report:\n", report)
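To show what `average='macro'` computes, here is a small hand-rolled sketch: the F1-score is calculated per class and then averaged with equal weight. `macro_f1` is a hypothetical helper written for illustration, and the labels are toy values.

```python
def macro_f1(labels, preds, num_classes):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for y, p in zip(labels, preds) if y == c and p == c)
        fp = sum(1 for y, p in zip(labels, preds) if y != c and p == c)
        fn = sum(1 for y, p in zip(labels, preds) if y == c and p != c)
        denom = 2 * tp + fp + fn
        # Convention assumed here: a class with no true or predicted
        # samples contributes an F1 of 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy example with three classes; per-class F1 is 0.5, 0.8 and 2/3.
toy_labels = [0, 0, 1, 1, 2, 2]
toy_preds = [0, 1, 1, 1, 2, 0]
print(macro_f1(toy_labels, toy_preds, num_classes=3))
```

Because every class has equal weight, a single weak class pulls the macro score down even when overall accuracy is high.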

## Network

Let's use a ResNet18 architecture for our CNN...

## ResNet18

In [22]:
# We will keep this for later
model0 = torch.hub.load('pytorch/vision:v0.10.0', 'resnet18', pretrained=True)


for name, param in model0.named_parameters():
    print(f"Name: {name}, Shape: {param.shape}")

Name: conv1.weight, Shape: torch.Size([64, 3, 7, 7])
Name: bn1.weight, Shape: torch.Size([64])
Name: bn1.bias, Shape: torch.Size([64])
Name: layer1.0.conv1.weight, Shape: torch.Size([64, 64, 3, 3])
Name: layer1.0.bn1.weight, Shape: torch.Size([64])
Name: layer1.0.bn1.bias, Shape: torch.Size([64])
Name: layer1.0.conv2.weight, Shape: torch.Size([64, 64, 3, 3])
Name: layer1.0.bn2.weight, Shape: torch.Size([64])
Name: layer1.0.bn2.bias, Shape: torch.Size([64])
Name: layer1.1.conv1.weight, Shape: torch.Size([64, 64, 3, 3])
Name: layer1.1.bn1.weight, Shape: torch.Size([64])
Name: layer1.1.bn1.bias, Shape: torch.Size([64])
Name: layer1.1.conv2.weight, Shape: torch.Size([64, 64, 3, 3])
Name: layer1.1.bn2.weight, Shape: torch.Size([64])
Name: layer1.1.bn2.bias, Shape: torch.Size([64])
Name: layer2.0.conv1.weight, Shape: torch.Size([128, 64, 3, 3])
Name: layer2.0.bn1.weight, Shape: torch.Size([128])
Name: layer2.0.bn1.bias, Shape: torch.Size([128])
Name: layer2.0.conv2.weight, Shape: torch.Size(

Using cache found in C:\Users\惟神君/.cache\torch\hub\pytorch_vision_v0.10.0


In [23]:
# add your code below. Define the number of epochs to train for based on your laptop's performance
# on a CPU, my machine takes about 2 minutes per epoch
num_epochs=10

# Create a new model from the pretrained one
model_resnet18 = deepcopy(model0)  # <--- study the code here

# Modify the last fully connected layer to match the number of classes (e.g., 10 for STL10)
num_classes = 10
model_resnet18.fc = nn.Linear(model_resnet18.fc.in_features, num_classes)

model_resnet18 = model_resnet18.to(device)

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = Adam(model_resnet18.parameters(), lr=0.001)

# Training loop (num_epochs is set at the top of this cell)
for epoch in range(num_epochs):
    model_resnet18.train()  # Set the model to training mode
    running_loss = 0.0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        
        # Zero the parameter gradients
        optimizer.zero_grad()
        
        # Forward pass
        outputs = model_resnet18(inputs)
        loss = criterion(outputs, labels)
        
        # Backward pass and optimization
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
    
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(train_loader)}")

# Evaluation on test set
model_resnet18.eval()  # Set the model to evaluation mode
correct = 0
total = 0
with torch.no_grad():
    for inputs, labels in test_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model_resnet18(inputs)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

# Print test accuracy
accuracy = 100 * correct / total
print(f"Test Accuracy: {accuracy}%")

Epoch [1/10], Loss: 0.9302005350589753
Epoch [2/10], Loss: 0.3454402476549149
Epoch [3/10], Loss: 0.2133716717362404
Epoch [4/10], Loss: 0.13532407060265542
Epoch [5/10], Loss: 0.08492341302335263
Epoch [6/10], Loss: 0.06833139136433601
Epoch [7/10], Loss: 0.05621107928454876
Epoch [8/10], Loss: 0.04910865388810635
Epoch [9/10], Loss: 0.03994848933070898
Epoch [10/10], Loss: 0.04816233925521374
Test Accuracy: 81.9125%
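The training loss falls to roughly 0.04 while test accuracy is about 82%, and the loop above never consults `valid_loader`. One common way to use the validation set is to keep the weights from the epoch with the best validation accuracy. The following is a minimal sketch of that bookkeeping only; `BestCheckpoint` is a hypothetical helper, the accuracies are made-up numbers, and a plain dict stands in for `model.state_dict()`.

```python
from copy import deepcopy


class BestCheckpoint:
    """Remember the state with the highest validation accuracy seen so far."""

    def __init__(self):
        self.best_accuracy = float("-inf")
        self.best_epoch = None
        self.best_state = None

    def update(self, epoch, accuracy, state):
        """Store a copy of `state` if `accuracy` beats the current best."""
        if accuracy > self.best_accuracy:
            self.best_accuracy = accuracy
            self.best_epoch = epoch
            # Copy so that further training cannot mutate the saved state.
            self.best_state = deepcopy(state)
            return True
        return False


tracker = BestCheckpoint()
# Made-up accuracies purely for illustration.
for epoch, acc in enumerate([70.0, 78.5, 77.9], start=1):
    tracker.update(epoch, acc, {"epoch": epoch})
print(tracker.best_epoch, tracker.best_accuracy)
```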


In [24]:
# Call the test function
test_model(model_resnet18, test_loader)

Test Accuracy: 81.9125%


In [25]:
# Call the function to calculate and print F1-scores
test_model_with_f1(model_resnet18, test_loader)

Macro F1-score: 0.8198189355296931
Classification Report:
               precision    recall  f1-score   support

     Class 0       0.82      0.88      0.85       800
     Class 1       0.82      0.87      0.84       800
     Class 2       0.85      0.94      0.89       800
     Class 3       0.63      0.86      0.73       800
     Class 4       0.87      0.81      0.84       800
     Class 5       0.78      0.71      0.74       800
     Class 6       0.91      0.78      0.84       800
     Class 7       0.81      0.81      0.81       800
     Class 8       0.98      0.67      0.80       800
     Class 9       0.84      0.86      0.85       800

    accuracy                           0.82      8000
   macro avg       0.83      0.82      0.82      8000
weighted avg       0.83      0.82      0.82      8000



## EfficientNet

In [27]:
# Load pretrained EfficientNet-B0 model from torchvision hub
efficientnetb0 = torch.hub.load('pytorch/vision', 'efficientnet_b0', weights="EfficientNet_B0_Weights.IMAGENET1K_V1")

for name, param in efficientnetb0.named_parameters():
    print(f"Name: {name}, Shape: {param.shape}")

Name: features.0.0.weight, Shape: torch.Size([32, 3, 3, 3])
Name: features.0.1.weight, Shape: torch.Size([32])
Name: features.0.1.bias, Shape: torch.Size([32])
Name: features.1.0.block.0.0.weight, Shape: torch.Size([32, 1, 3, 3])
Name: features.1.0.block.0.1.weight, Shape: torch.Size([32])
Name: features.1.0.block.0.1.bias, Shape: torch.Size([32])
Name: features.1.0.block.1.fc1.weight, Shape: torch.Size([8, 32, 1, 1])
Name: features.1.0.block.1.fc1.bias, Shape: torch.Size([8])
Name: features.1.0.block.1.fc2.weight, Shape: torch.Size([32, 8, 1, 1])
Name: features.1.0.block.1.fc2.bias, Shape: torch.Size([32])
Name: features.1.0.block.2.0.weight, Shape: torch.Size([16, 32, 1, 1])
Name: features.1.0.block.2.1.weight, Shape: torch.Size([16])
Name: features.1.0.block.2.1.bias, Shape: torch.Size([16])
Name: features.2.0.block.0.0.weight, Shape: torch.Size([96, 16, 1, 1])
Name: features.2.0.block.0.1.weight, Shape: torch.Size([96])
Name: features.2.0.block.0.1.bias, Shape: torch.Size([96])
Nam

Using cache found in C:\Users\惟神君/.cache\torch\hub\pytorch_vision_main


In [28]:
# Modify the last fully connected layer to output 10 classes
num_classes = 10
efficientnetb0.classifier[1] = nn.Linear(efficientnetb0.classifier[1].in_features, num_classes)

# Move the model to the appropriate device (GPU/CPU)
efficientnetb0 = efficientnetb0.to(device)

# Define loss function (CrossEntropyLoss) and optimizer (Adam)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(efficientnetb0.parameters(), lr=0.001)

# Training and validation code
def train_model(model, train_loader, valid_loader, num_epochs=10):
    for epoch in range(num_epochs):
        model.train()  # Set model to training mode
        running_loss = 0.0
        correct = 0
        total = 0

        # Loop through batches in the training data
        for inputs, labels in train_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            
            # Zero the parameter gradients
            optimizer.zero_grad()
            
            # Forward pass
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            
            # Backward pass and optimization
            loss.backward()
            optimizer.step()
            
            # Calculate running loss
            running_loss += loss.item()

        # Validation after every epoch
        model.eval()  # Set model to evaluation mode
        correct = 0
        total = 0
        with torch.no_grad():
            for inputs, labels in valid_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                _, predicted = torch.max(outputs, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

        accuracy = 100 * correct / total
        print(f"Epoch {epoch+1}/{num_epochs}, Loss: {running_loss/len(train_loader)}, Accuracy: {accuracy}%")

train_model(efficientnetb0, train_loader, valid_loader, num_epochs=10)

ZeroDivisionError: division by zero
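The traceback is not shown, so the failing line cannot be identified from this output alone. `train_model` contains two divisions, `running_loss/len(train_loader)` and `100 * correct / total`, and either one raises `ZeroDivisionError` if the corresponding loader yields no batches. A small guard, sketched below with a hypothetical helper name `safe_ratio`, would report `nan` instead of crashing while the cause is investigated.

```python
import math


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    if denominator == 0:
        return math.nan
    return numerator / denominator


print(safe_ratio(3.0, 4))  # ordinary case
print(safe_ratio(0.0, 0))  # empty loader: NaN instead of an exception
```

Printing `len(train_loader)` and `len(valid_loader.dataset)` before calling `train_model` is another quick way to check whether a loader is empty.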

In [None]:
# Call the test function
test_model(efficientnetb0, test_loader)

In [None]:
# Call the function to calculate and print F1-scores
test_model_with_f1(efficientnetb0, test_loader)

## Vision Transformer (ViT)

In [29]:
# Set image size to 224x224 to match the input size of ViT
transform_train = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize images to 224x224
    transforms.RandomCrop(224, padding=4),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

transform_unlabelled = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize images to 224x224
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

transform_test = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize images to 224x224
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Load train and validation sets without redownloading data
trainval_set = STL10(dataset_dir, split='train', transform=transform_train, download=False)

# Use 10% of the data for training (simulating a low data scenario)
num_train = int(len(trainval_set) * 0.1)

# Split data into train/validation sets with a fixed random seed
torch.manual_seed(0)  # Ensure reproducibility
train_set, val_set = random_split(trainval_set, [num_train, len(trainval_set) - num_train])

# Load test set without redownloading data
test_set = STL10(dataset_dir, split='test', transform=transform_test, download=False)

# Create DataLoader for train, validation, and test sets
train_loader = DataLoader(train_set, shuffle=True, batch_size=batch_size, num_workers=8, pin_memory=True)
valid_loader = DataLoader(val_set, batch_size=batch_size, num_workers=8, pin_memory=True)
test_loader = DataLoader(test_set, batch_size=batch_size, num_workers=8, pin_memory=True)

In [30]:
# Load pretrained Vision Transformer (ViT) model from torchvision models
vit = models.vit_b_16(pretrained=True)

# Print parameter names and shapes to inspect the model structure
for name, param in vit.named_parameters():
    print(f"Name: {name}, Shape: {param.shape}")



Name: class_token, Shape: torch.Size([1, 1, 768])
Name: conv_proj.weight, Shape: torch.Size([768, 3, 16, 16])
Name: conv_proj.bias, Shape: torch.Size([768])
Name: encoder.pos_embedding, Shape: torch.Size([1, 197, 768])
Name: encoder.layers.encoder_layer_0.ln_1.weight, Shape: torch.Size([768])
Name: encoder.layers.encoder_layer_0.ln_1.bias, Shape: torch.Size([768])
Name: encoder.layers.encoder_layer_0.self_attention.in_proj_weight, Shape: torch.Size([2304, 768])
Name: encoder.layers.encoder_layer_0.self_attention.in_proj_bias, Shape: torch.Size([2304])
Name: encoder.layers.encoder_layer_0.self_attention.out_proj.weight, Shape: torch.Size([768, 768])
Name: encoder.layers.encoder_layer_0.self_attention.out_proj.bias, Shape: torch.Size([768])
Name: encoder.layers.encoder_layer_0.ln_2.weight, Shape: torch.Size([768])
Name: encoder.layers.encoder_layer_0.ln_2.bias, Shape: torch.Size([768])
Name: encoder.layers.encoder_layer_0.mlp.0.weight, Shape: torch.Size([3072, 768])
Name: encoder.layers.
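The `encoder.pos_embedding` shape of `[1, 197, 768]` printed above explains why the images are resized to 224x224. With 16x16 patches, a 224x224 image is cut into 14 x 14 = 196 patches, and one class token is prepended. The short sketch below reproduces that arithmetic; `vit_sequence_length` is a hypothetical helper, not a torchvision function.

```python
def vit_sequence_length(image_size, patch_size=16):
    """Number of tokens: one per patch plus the class token."""
    if image_size % patch_size != 0:
        raise ValueError("image size must be divisible by the patch size")
    patches_per_side = image_size // patch_size
    return patches_per_side ** 2 + 1


print(vit_sequence_length(224))  # matches the 197 positions in encoder.pos_embedding
print(vit_sequence_length(96))   # native STL10 resolution gives a different length
```

At the native 96x96 resolution the sequence would have 37 tokens, which does not match the pretrained positional embedding, so the pretrained model cannot be applied to unresized STL10 images as-is.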

In [31]:
# Modify the last fully connected layer to match the number of classes (e.g., 10 classes)
num_classes = 10
vit.heads.head = nn.Linear(vit.heads.head.in_features, num_classes)

# Freeze all layers except the last fully connected layer (heads.head)
for name, param in vit.named_parameters():
    if 'heads.head' in name:  # Only unfreeze the heads.head layer
        param.requires_grad = True
        print(f"Unfreezing layer: {name}")
    else:
        param.requires_grad = False

# Move the model to the appropriate device (GPU/CPU)
vit = vit.to(device)

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()  # CrossEntropy for classification
# optimizer = optim.Adam(vit.parameters(), lr=0.001)
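# Only the head is trainable, so hand the optimizer just those parameters
optimizer = optim.Adam([p for p in vit.parameters() if p.requires_grad], lr=0.016)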

# Training and validation code with timing
def train_model(model, train_loader, valid_loader, num_epochs=10):
    for epoch in range(num_epochs):
        # Start time for epoch
        start_time = time.time()
        
        model.train()  # Set model to training mode
        running_loss = 0.0
        correct = 0
        total = 0

        # Loop through batches in the training data
        for inputs, labels in train_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            
            # Zero the parameter gradients
            optimizer.zero_grad()
            
            # Forward pass
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            
            # Backward pass and optimization
            loss.backward()
            optimizer.step()
            
            # Calculate running loss
            running_loss += loss.item()

        # Validation after every epoch
        model.eval()  # Set model to evaluation mode
        correct = 0
        total = 0
        with torch.no_grad():
            for inputs, labels in valid_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                _, predicted = torch.max(outputs, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

        accuracy = 100 * correct / total

        # End time for epoch
        end_time = time.time()
        epoch_time = end_time - start_time

        print(f"Epoch {epoch+1}/{num_epochs}, Loss: {running_loss/len(train_loader)}, Accuracy: {accuracy}%, Time: {epoch_time:.2f} seconds")

# Use your train_loader and valid_loader
train_model(vit, train_loader, valid_loader, num_epochs=10)

Unfreezing layer: heads.head.weight
Unfreezing layer: heads.head.bias


  attn_output = scaled_dot_product_attention(q, k, v, attn_mask, dropout_p, is_causal)


ZeroDivisionError: division by zero
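For a sense of scale, only `heads.head.weight` and `heads.head.bias` are reported as unfrozen, so very few parameters are actually trained. The sketch below counts them, assuming the head maps the 768-dimensional hidden width seen in the printed shapes to the 10 classes; `linear_param_count` is a hypothetical helper.

```python
def linear_param_count(in_features, out_features, bias=True):
    """Parameters in a fully connected layer: weight matrix plus optional bias."""
    return in_features * out_features + (out_features if bias else 0)


# Assumption: the replaced head is nn.Linear(768, 10).
print(linear_param_count(768, 10))
```

That is 7,680 weights plus 10 biases, so this setup amounts to fitting a linear classifier on top of fixed ViT features.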

In [None]:
# Call the test function
test_model(vit, test_loader)

In [None]:
# Call the function to calculate and print F1-scores
test_model_with_f1(vit, test_loader)