<h1> ECE4179 - Semi-Supervised Learning Project</h1>
<h2>Data</h2>

We will be using a dataset that can be obtained directly from the torchvision package. There are 10 classes and we will be training a CNN for the image classification task. We have training, validation and test sets that are labelled with the class, and a large unlabeled set.

We will simulating a low training data scenario by only sampling a small percentage of the labelled data (10%) as training data. The remaining examples will be used as the validation set.

To get the labelled data, change the dataset_dir to something suitable for your machine, and execute the following (you will then probably want to wrap the dataset objects in a PyTorch DataLoader):

In [79]:
import torch
import torch.nn as nn
from torchvision.datasets import STL10 as STL10
import torchvision.transforms as transforms
from torch.utils.data import random_split
from torch.utils.data import DataLoader
import torchvision

####### CHANGE TO APPROPRIATE DIRECTORY TO STORE DATASET
dataset_dir = r"\\ad.monash.edu\home\User030\rbea0007\Documents\ECE6179\VS Code\Course Project"
#For MonARCH
# dataset_dir = "/mnt/lustre/projects/ds19/SHARED"

#All images are 3x96x96
image_size = 96
#Example batch size
batch_size = 32

# Define the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
print(torch.cuda.is_available())  # Should return True

Using device: cuda
True


<h3>Create the appropriate transforms</h3>

In [70]:
#Perform random crops and mirroring for data augmentation
transform_train = transforms.Compose(
    [transforms.RandomCrop(image_size, padding=4),
     transforms.RandomHorizontalFlip(p=0.5),
     transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

transform_unlabelled = transforms.Compose(
    [transforms.RandomHorizontalFlip(p=0.5),
     transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

#No random 
transform_test = transforms.Compose(
    [transforms.CenterCrop(image_size),
     transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])


<h3>Create training and validation split</h3>

In [71]:
#Load train and validation sets
trainval_set = STL10(dataset_dir, split='train', transform=transform_train, download=True)

#Use 10% of data for training - simulating low data scenario
num_train = int(len(trainval_set)*0.1)

#Split data into train/val sets
torch.manual_seed(0) #Set torch's random seed so that random split of data is reproducible
train_set, val_set = random_split(trainval_set, [num_train, len(trainval_set)-num_train])

#Load test set
test_set = STL10(dataset_dir, split='test', transform=transform_test, download=True)

Files already downloaded and verified
Files already downloaded and verified


<h3>Get the unlabelled data</h3>

In [72]:
unlabelled_set = STL10(dataset_dir, split='unlabeled', transform=transform_unlabelled, download=True)

Files already downloaded and verified


You may find later that you want to make changes to how the unlabelled data is loaded. This might require you sub-classing the STL10 class used above or to create your own dataloader similar to the Pytorch one.
https://pytorch.org/docs/stable/_modules/torchvision/datasets/stl10.html#STL10

<h3>Create the four dataloaders</h3>

In [73]:
train_loader = DataLoader(train_set, shuffle=True, batch_size=batch_size)
unlabelled_loader = DataLoader(unlabelled_set, shuffle=True, batch_size=batch_size)

valid_loader = DataLoader(val_set, batch_size=batch_size)
test_loader  = DataLoader(test_set, batch_size=batch_size)

## Network

Let's use a ResNet18 architecture for our CNN...

In [74]:
# baseline model to compare results to
# model = torchvision.models.resnet18()

Momentum Contrast Model

In [81]:
class MoCo(nn.Module):
    def __init__(self, base_encoder, dim=128, K=8192, m=0.999, T=0.07):
        super(MoCo, self).__init__()
        self.encoder_q = base_encoder
        self.encoder_k = base_encoder

        # Replace the final layer to output the desired dimension
        self.encoder_q.fc = nn.Linear(self.encoder_q.fc.in_features, dim)
        self.encoder_k.fc = nn.Linear(self.encoder_k.fc.in_features, dim)
        
        for param in self.encoder_k.parameters():
            param.requires_grad = False
    
        self.K = K # queue size
        self.m = m # momentum
        self.T = T # temperature

        self.register_buffer("queue", torch.randn(dim, K))  # Register queue as a buffer
        self.queue = nn.functional.normalize(self.queue, dim=0)
        self.register_buffer("queue_ptr", torch.zeros(1, dtype=torch.long))  # Pointer for queue


    def forward(self, x):
        q = self.encoder_q(x)
        q = nn.functional.normalize(q, dim=1)
        return q
    
    @torch.no_grad()   
    def update_key_encoder(self):
        for param_q, param_k in zip(self.encoder_q.parameters(), self.encoder_k.parameters()):
            param_k.data = param_k.data * self.m + param_q.data * (1. - self.m)
        
    @torch.no_grad()
    def enqueue_and_dequeue(self, keys):
        keys = nn.functional.normalize(keys, dim=1)
        batch_size = keys.shape[0]
        ptr = int(self.queue_ptr.item())

        self.queue[:, ptr:ptr + batch_size] = keys.T
        ptr = (ptr + batch_size) % self.K
        self.queue_ptr[0] = ptr

    def contrastive_loss(self, query):
        batch_size = query.shape[0]
        
        # Compute logits
        logits = torch.mm(query, self.queue.clone().detach()) / self.T
        labels = torch.arange(batch_size).cuda()
        
        loss = nn.CrossEntropyLoss()(logits, labels)
        
        return loss

In [82]:
# Initialise model using resnet18
model_resnet = torch.hub.load('pytorch/vision', 'resnet18', weights="ResNet18_Weights.IMAGENET1K_V1")
base_encoder = model_resnet
#base_encoder.fc = nn.Identity()
model_moco = MoCo(base_encoder, dim=128)

Using cache found in C:\Users\rbea0007/.cache\torch\hub\pytorch_vision_main


In [83]:
# Pretrain on unlabelled data
def pretrain_model(model, dataloader, num_epochs):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(f"Using device: {device}")
    model.to(device)  # Move the model to the appropriate device

    model.train()
    optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)

    for epoch in range(num_epochs):
        for images, _ in dataloader:
            images = images.cuda()
            optimiser.zero_grad()

            # Generate augmented views
            images_q = images  # Original images as query
            #images_k = images(transform_unlabelled) # You can apply different augmentations
            images_k = images
            
            # Forward pass
            query = model(images_q)
            query.requires_grad_()

            with torch.no_grad():
                model.update_key_encoder()  # Update the key encoder
                key = model.encoder_k(images_k)

            # Contrastive loss
            loss = model.contrastive_loss(query)

            # Backward pass
            loss.backward()
            optimiser.step()

            # Update key encoder
            model.enqueue_and_dequeue(key)
            #model.update_key_encoder()
        
        print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')

# Pretrain the model
#model_moco.cuda()  # Move model to GPU
pretrain_model(model_moco, unlabelled_loader, num_epochs=10)

Using device: cuda
Epoch [1/10], Loss: 9.7482
Epoch [2/10], Loss: 10.0628
Epoch [3/10], Loss: 10.1734
Epoch [4/10], Loss: 9.4327
Epoch [5/10], Loss: 10.4353
Epoch [6/10], Loss: 9.6153
Epoch [7/10], Loss: 9.5417
Epoch [8/10], Loss: 10.2128
Epoch [9/10], Loss: 9.7266
Epoch [10/10], Loss: 9.8079


In [84]:
def evaluate_model(model, test_loader): # accuracy only
    # Define the device inside the function
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)  # Move the model to the appropriate device

    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images, labels = images.cuda(), labels.cuda()
            outputs = model(images)
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print(f'Accuracy: {100 * correct / total:.2f}%')

evaluate_model(model_moco, test_loader)

Accuracy: 0.88%


In [95]:
# Fine tune the model
num_classes = 10  # Adjust this based on your dataset
model_moco.encoder_q.fc = nn.Linear(model_moco.encoder_q.fc.in_features, num_classes)

def finetune_model(model, train_loader, val_loader, num_epochs):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)  # Move the model to the appropriate device

    model.train()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()

    for epoch in range(num_epochs):
        for images, labels in train_loader:
            images, labels = images.cuda(), labels.cuda()
            optimizer.zero_grad()
            outputs = model(images)  # Forward pass
            loss = criterion(outputs, labels)  # Compute loss
            loss.backward()  # Backward pass
            optimizer.step()  # Update weights
        print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')

finetune_model(model_moco, train_loader, valid_loader, num_epochs=10)

Epoch [1/10], Loss: 2.1295
Epoch [2/10], Loss: 1.8836
Epoch [3/10], Loss: 1.8685
Epoch [4/10], Loss: 1.9443
Epoch [5/10], Loss: 1.8252
Epoch [6/10], Loss: 1.8322
Epoch [7/10], Loss: 1.8002
Epoch [8/10], Loss: 1.6823
Epoch [9/10], Loss: 1.7423
Epoch [10/10], Loss: 1.7654


In [96]:
# Evaluate finetuned model
model_moco.eval()
evaluate_model(model_moco, test_loader)

Accuracy: 66.99%
