Task 3 - Evaluation for Domain Generalization - PACS Dataset

Making All the required Imports

In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Dataset
from tqdm import tqdm
import deeplake
import os
from PIL import Image
import torchvision.models as models

Loading In the VGG Discriminative Model and Moving it to the GPU if Available

In [12]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = models.vgg19(pretrained=True)



Getting The Pretrained VGG Model Ready for Training.

This step freezes all parameters in the feature extractor and replaces the final classifier layer with a new head sized for the 7 classes in our dataset. We then define the loss function (cross-entropy) and the optimizer (Adam), which updates only the parameters of the new head.

In [13]:
for param in model.features.parameters():
    param.requires_grad = False  

num_ftrs = model.classifier[6].in_features
model.classifier[6] = nn.Linear(num_ftrs, 7)  

model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.classifier[6].parameters(), lr=0.001) 
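
One detail worth checking is which parameters actually get updated. Only model.features is frozen, so the remaining classifier layers still have requires_grad=True, but the optimizer is given only the new final layer. Below is a minimal sketch of that check. It uses a tiny hypothetical stand-in model (TinyVGGLike, not the real VGG19) so it runs without downloading weights; count_params is also a hypothetical helper.

```python
import torch.nn as nn
import torch.optim as optim


class TinyVGGLike(nn.Module):
    """Hypothetical stand-in with the same features/classifier layout as VGG."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 4, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.Linear(4, 8),
            nn.ReLU(),
            nn.Linear(8, 1000),  # original head, replaced below
        )

    def forward(self, x):
        x = self.features(x).mean(dim=(2, 3))  # global average pool to (batch, 4)
        return self.classifier(x)


def count_params(params):
    """Total number of scalar values across an iterable of parameters."""
    return sum(p.numel() for p in params)


model = TinyVGGLike()

# Same recipe as the cell above: freeze the feature extractor ...
for param in model.features.parameters():
    param.requires_grad = False

# ... replace the last classifier layer with a 7-class head ...
model.classifier[2] = nn.Linear(model.classifier[2].in_features, 7)

# ... and hand only the new head to the optimizer.
optimizer = optim.Adam(model.classifier[2].parameters(), lr=0.001)

frozen = count_params(p for p in model.parameters() if not p.requires_grad)
trainable = count_params(p for p in model.parameters() if p.requires_grad)
optimized = count_params(
    p for group in optimizer.param_groups for p in group["params"]
)

print(f"frozen={frozen}, requires_grad={trainable}, in optimizer={optimized}")
```

In this stand-in, requires_grad covers more parameters than the optimizer does, because the untouched classifier layer still computes gradients but is never stepped. Running the same three counts on the real model would show the same pattern for the earlier classifier layers.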

Loading In The PACS Dataset

Defining our Custom Dataset Wrapper Class to convert Deep Lake tensors into PyTorch tensors

In [14]:
class PACSDataset(Dataset):
    def __init__(self, deeplake_dataset, transform=None):
        self.ds = deeplake_dataset
        self.transform = transform

    def __len__(self):
        return len(self.ds)

    def __getitem__(self, idx):
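        image = self.ds['images'][idx].numpy()  # Deep Lake tensor -> NumPy array
        label = int(self.ds['labels'][idx].numpy().item())  # .item() extracts the scalar without a NumPy deprecation warning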

        image = Image.fromarray(image)
        if self.transform:
            image = self.transform(image)

        return image, label

Load the PACS dataset using Deep Lake and define the train and validation datasets

In [15]:
train_dataset = deeplake.load("hub://activeloop/pacs-train")
val_dataset = deeplake.load("hub://activeloop/pacs-val")

Opening dataset in read-only mode as you don't have write permissions.

This dataset can be visualized in Jupyter Notebook by ds.visualize() or at https://app.activeloop.ai/activeloop/pacs-train

hub://activeloop/pacs-train loaded successfully.

Opening dataset in read-only mode as you don't have write permissions.

This dataset can be visualized in Jupyter Notebook by ds.visualize() or at https://app.activeloop.ai/activeloop/pacs-val

hub://activeloop/pacs-val loaded successfully.

Define the image transformations. These are passed along with the train dataset to our PACSDataset wrapper, which converts each Deep Lake sample into a PIL image and then applies the transformations: resizing to 224x224, converting to a tensor, and normalizing with the ImageNet channel means and standard deviations that the pretrained VGG expects.

We also create the trainloader, which splits train_dataset into shuffled batches of 4 images. Batching lets the model process several images per step, which can make training faster and smoother.

In [16]:
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

train_dataset = PACSDataset(train_dataset, transform=transform)

trainloader = DataLoader(train_dataset, batch_size=4, shuffle=True, num_workers=4)
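
As a rough check on what batching does here, the number of batches per epoch follows from the dataset size and the batch size. The sketch below is plain Python. num_batches and sample_count_range are hypothetical helper names, not part of PyTorch, and the sketch assumes the DataLoader default of drop_last=False, which keeps a final partial batch.

```python
import math


def num_batches(num_samples, batch_size, drop_last=False):
    """Batches per epoch for a dataset of num_samples items."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


def sample_count_range(batches, batch_size):
    """Smallest and largest dataset sizes that give `batches` batches
    when the last partial batch is kept (drop_last=False)."""
    return (batches - 1) * batch_size + 1, batches * batch_size


# 10 samples in batches of 4: two full batches plus one partial batch.
print(num_batches(10, 4))

# Work backwards from a batch count to the possible dataset sizes.
print(sample_count_range(2245, 4))
```

With batch_size=4, the 2245 batches per epoch reported by the training progress bars in this notebook correspond to a training set of between 8977 and 8980 images.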

This portion fine-tunes the VGG model. For each batch we do a forward pass, calculate the loss, run the backward pass, and then update the weights. This is repeated over the complete dataset for 3 epochs. After the model is trained we save its weights so that we can easily load them later if required.


In [17]:
model.train()
num_epochs = 3

for epoch in range(num_epochs):
    running_loss = 0.0
    with tqdm(total=len(trainloader), desc=f'Epoch {epoch + 1}/{num_epochs}', unit='batch') as pbar:
        for i, (images, labels) in enumerate(trainloader):
            images, labels = images.to(device), labels.to(device)

            optimizer.zero_grad() 

            outputs = model(images)
            loss = criterion(outputs, labels)

            loss.backward()  
            optimizer.step()

            running_loss += loss.item()
            pbar.set_postfix(loss=running_loss / (i + 1))
            pbar.update(1) 

            if (i + 1) % 100 == 0:
                print(f'Epoch [{epoch + 1}], Step [{i + 1}], Loss: {running_loss / (i + 1):.4f}')


save_path = "models/vgg_19_PACS_TASK2.pth"
os.makedirs(os.path.dirname(save_path), exist_ok=True)
torch.save(model.state_dict(), save_path) 
print(f'Model saved to {save_path}')


Epoch 1/3:   4%|▍         | 100/2245 [06:36<2:26:30,  4.10s/batch, loss=1.39]

Epoch [1], Step [100], Loss: 1.3932


Epoch 1/3:   9%|▉         | 200/2245 [13:12<2:14:48,  3.96s/batch, loss=1.21]

Epoch [1], Step [200], Loss: 1.2132


Epoch 1/3:  13%|█▎        | 300/2245 [19:45<2:03:44,  3.82s/batch, loss=1.11]

Epoch [1], Step [300], Loss: 1.1062


Epoch 1/3:  18%|█▊        | 400/2245 [26:22<1:57:55,  3.84s/batch, loss=1.07]

Epoch [1], Step [400], Loss: 1.0674


Epoch 1/3:  22%|██▏       | 500/2245 [32:56<1:59:31,  4.11s/batch, loss=1.06]

Epoch [1], Step [500], Loss: 1.0552


Epoch 1/3:  27%|██▋       | 600/2245 [39:29<1:52:41,  4.11s/batch, loss=1.04]

Epoch [1], Step [600], Loss: 1.0387


Epoch 1/3:  31%|███       | 700/2245 [46:01<1:42:53,  4.00s/batch, loss=1]

Epoch [1], Step [700], Loss: 1.0035


Epoch 1/3:  36%|███▌      | 800/2245 [52:34<1:35:39,  3.97s/batch, loss=0.989]

Epoch [1], Step [800], Loss: 0.9892


Epoch 1/3:  40%|████      | 900/2245 [59:08<1:27:56,  3.92s/batch, loss=0.971]

Epoch [1], Step [900], Loss: 0.9707


Epoch 1/3:  45%|████▍     | 1000/2245 [1:05:41<1:21:02,  3.91s/batch, loss=0.965]

Epoch [1], Step [1000], Loss: 0.9653


Epoch 1/3:  49%|████▉     | 1100/2245 [1:12:14<1:14:35,  3.91s/batch, loss=0.948]

Epoch [1], Step [1100], Loss: 0.9482


Epoch 1/3:  53%|█████▎    | 1200/2245 [1:18:47<1:07:16,  3.86s/batch, loss=0.941]

Epoch [1], Step [1200], Loss: 0.9410


Epoch 1/3:  58%|█████▊    | 1300/2245 [1:25:19<1:00:04,  3.81s/batch, loss=0.938]

Epoch [1], Step [1300], Loss: 0.9384


Epoch 1/3:  62%|██████▏   | 1400/2245 [1:31:51<53:33,  3.80s/batch, loss=0.935]

Epoch [1], Step [1400], Loss: 0.9349


Epoch 1/3:  67%|██████▋   | 1500/2245 [1:38:24<47:22,  3.81s/batch, loss=0.938]

Epoch [1], Step [1500], Loss: 0.9383


Epoch 1/3:  71%|███████▏  | 1600/2245 [1:44:58<40:29,  3.77s/batch, loss=0.932]

Epoch [1], Step [1600], Loss: 0.9322


Epoch 1/3:  76%|███████▌  | 1700/2245 [1:51:33<34:07,  3.76s/batch, loss=0.926]

Epoch [1], Step [1700], Loss: 0.9260


Epoch 1/3:  80%|████████  | 1800/2245 [1:58:05<27:51,  3.76s/batch, loss=0.922]

Epoch [1], Step [1800], Loss: 0.9217


Epoch 1/3:  85%|████████▍ | 1900/2245 [2:04:40<21:11,  3.69s/batch, loss=0.922]

Epoch [1], Step [1900], Loss: 0.9220


Epoch 1/3:  89%|████████▉ | 2000/2245 [2:11:13<15:14,  3.73s/batch, loss=0.91]

Epoch [1], Step [2000], Loss: 0.9099


Epoch 1/3:  94%|█████████▎| 2100/2245 [2:17:47<09:24,  3.89s/batch, loss=0.914]

Epoch [1], Step [2100], Loss: 0.9143


Epoch 1/3:  98%|█████████▊| 2200/2245 [2:24:25<03:02,  4.07s/batch, loss=0.914]

Epoch [1], Step [2200], Loss: 0.9139


Epoch 1/3: 100%|██████████| 2245/2245 [2:27:18<00:00,  3.94s/batch, loss=0.917]
Epoch 2/3:   4%|▍         | 100/2245 [06:33<2:15:17,  3.78s/batch, loss=0.842]

Epoch [2], Step [100], Loss: 0.8422


Epoch 2/3:   9%|▉         | 200/2245 [13:04<2:09:29,  3.80s/batch, loss=0.758]

Epoch [2], Step [200], Loss: 0.7577


Epoch 2/3:  13%|█▎        | 300/2245 [19:35<2:04:24,  3.84s/batch, loss=0.797]

Epoch [2], Step [300], Loss: 0.7971


Epoch 2/3:  18%|█▊        | 400/2245 [26:06<2:00:13,  3.91s/batch, loss=0.843]

Epoch [2], Step [400], Loss: 0.8425


Epoch 2/3:  22%|██▏       | 500/2245 [32:36<1:55:23,  3.97s/batch, loss=0.836]

Epoch [2], Step [500], Loss: 0.8359


Epoch 2/3:  27%|██▋       | 600/2245 [39:07<1:55:36,  4.22s/batch, loss=0.843]

Epoch [2], Step [600], Loss: 0.8427


Epoch 2/3:  31%|███       | 700/2245 [45:37<1:43:01,  4.00s/batch, loss=0.838]

Epoch [2], Step [700], Loss: 0.8383


Epoch 2/3:  36%|███▌      | 800/2245 [52:10<1:39:10,  4.12s/batch, loss=0.853]

Epoch [2], Step [800], Loss: 0.8525


Epoch 2/3:  40%|████      | 900/2245 [58:41<1:34:26,  4.21s/batch, loss=0.864]

Epoch [2], Step [900], Loss: 0.8642


Epoch 2/3:  45%|████▍     | 1000/2245 [1:05:12<1:27:18,  4.21s/batch, loss=0.862]

Epoch [2], Step [1000], Loss: 0.8625


Epoch 2/3:  49%|████▉     | 1100/2245 [1:11:44<1:19:29,  4.17s/batch, loss=0.883]

Epoch [2], Step [1100], Loss: 0.8826


Epoch 2/3:  53%|█████▎    | 1200/2245 [1:18:15<1:12:10,  4.14s/batch, loss=0.876]

Epoch [2], Step [1200], Loss: 0.8758


Epoch 2/3:  58%|█████▊    | 1300/2245 [1:24:46<1:04:52,  4.12s/batch, loss=0.887]

Epoch [2], Step [1300], Loss: 0.8869


Epoch 2/3:  62%|██████▏   | 1400/2245 [1:31:17<57:36,  4.09s/batch, loss=0.884]

Epoch [2], Step [1400], Loss: 0.8844


Epoch 2/3:  67%|██████▋   | 1500/2245 [1:37:50<50:30,  4.07s/batch, loss=0.876]

Epoch [2], Step [1500], Loss: 0.8762


Epoch 2/3:  71%|███████▏  | 1600/2245 [1:44:24<42:39,  3.97s/batch, loss=0.882]

Epoch [2], Step [1600], Loss: 0.8825


Epoch 2/3:  76%|███████▌  | 1700/2245 [1:50:56<36:25,  4.01s/batch, loss=0.89]

Epoch [2], Step [1700], Loss: 0.8895


Epoch 2/3:  80%|████████  | 1800/2245 [1:57:27<30:29,  4.11s/batch, loss=0.887]

Epoch [2], Step [1800], Loss: 0.8867


Epoch 2/3:  85%|████████▍ | 1900/2245 [2:03:57<23:32,  4.10s/batch, loss=0.891]

Epoch [2], Step [1900], Loss: 0.8907


Epoch 2/3:  89%|████████▉ | 2000/2245 [2:10:27<15:38,  3.83s/batch, loss=0.894]

Epoch [2], Step [2000], Loss: 0.8942


Epoch 2/3:  94%|█████████▎| 2100/2245 [2:17:05<10:21,  4.29s/batch, loss=0.905]

Epoch [2], Step [2100], Loss: 0.9050


Epoch 2/3:  98%|█████████▊| 2200/2245 [2:23:48<02:55,  3.90s/batch, loss=0.903]

Epoch [2], Step [2200], Loss: 0.9035


Epoch 2/3: 100%|██████████| 2245/2245 [2:26:49<00:00,  3.92s/batch, loss=0.905]
Epoch 3/3:   4%|▍         | 100/2245 [06:52<2:20:29,  3.93s/batch, loss=0.845]

Epoch [3], Step [100], Loss: 0.8453


Epoch 3/3:   9%|▉         | 200/2245 [13:28<2:15:46,  3.98s/batch, loss=0.844]

Epoch [3], Step [200], Loss: 0.8435


Epoch 3/3:  13%|█▎        | 300/2245 [20:03<2:05:08,  3.86s/batch, loss=0.836]

Epoch [3], Step [300], Loss: 0.8355


Epoch 3/3:  18%|█▊        | 400/2245 [26:35<2:03:47,  4.03s/batch, loss=0.851]

Epoch [3], Step [400], Loss: 0.8506


Epoch 3/3:  22%|██▏       | 500/2245 [33:07<2:00:16,  4.14s/batch, loss=0.868]

Epoch [3], Step [500], Loss: 0.8679


Epoch 3/3:  27%|██▋       | 600/2245 [39:39<1:54:06,  4.16s/batch, loss=0.855]

Epoch [3], Step [600], Loss: 0.8547


Epoch 3/3:  31%|███       | 700/2245 [46:12<1:43:09,  4.01s/batch, loss=0.867]

Epoch [3], Step [700], Loss: 0.8672


Epoch 3/3:  36%|███▌      | 800/2245 [52:44<1:35:59,  3.99s/batch, loss=0.848]

Epoch [3], Step [800], Loss: 0.8484


Epoch 3/3:  40%|████      | 900/2245 [59:15<1:27:51,  3.92s/batch, loss=0.855]

Epoch [3], Step [900], Loss: 0.8554


Epoch 3/3:  45%|████▍     | 1000/2245 [1:05:47<1:20:26,  3.88s/batch, loss=0.864]

Epoch [3], Step [1000], Loss: 0.8637


Epoch 3/3:  49%|████▉     | 1100/2245 [1:12:19<1:15:07,  3.94s/batch, loss=0.86]

Epoch [3], Step [1100], Loss: 0.8598


Epoch 3/3:  53%|█████▎    | 1200/2245 [1:18:52<1:09:01,  3.96s/batch, loss=0.859]

Epoch [3], Step [1200], Loss: 0.8589


Epoch 3/3:  58%|█████▊    | 1300/2245 [1:25:23<1:03:41,  4.04s/batch, loss=0.868]

Epoch [3], Step [1300], Loss: 0.8679


Epoch 3/3:  62%|██████▏   | 1400/2245 [1:31:54<59:20,  4.21s/batch, loss=0.872]

Epoch [3], Step [1400], Loss: 0.8723


Epoch 3/3:  67%|██████▋   | 1500/2245 [1:38:23<50:19,  4.05s/batch, loss=0.877]

Epoch [3], Step [1500], Loss: 0.8767


Epoch 3/3:  71%|███████▏  | 1600/2245 [1:44:54<40:57,  3.81s/batch, loss=0.871]

Epoch [3], Step [1600], Loss: 0.8708


Epoch 3/3:  76%|███████▌  | 1700/2245 [1:51:23<33:43,  3.71s/batch, loss=0.87]

Epoch [3], Step [1700], Loss: 0.8704


Epoch 3/3:  80%|████████  | 1800/2245 [1:57:54<28:01,  3.78s/batch, loss=0.868]

Epoch [3], Step [1800], Loss: 0.8679


Epoch 3/3:  85%|████████▍ | 1900/2245 [2:04:24<22:03,  3.83s/batch, loss=0.87]

Epoch [3], Step [1900], Loss: 0.8699


Epoch 3/3:  89%|████████▉ | 2000/2245 [2:10:54<15:43,  3.85s/batch, loss=0.878]

Epoch [3], Step [2000], Loss: 0.8779


Epoch 3/3:  94%|█████████▎| 2100/2245 [2:17:26<09:09,  3.79s/batch, loss=0.879]

Epoch [3], Step [2100], Loss: 0.8790


Epoch 3/3:  98%|█████████▊| 2200/2245 [2:23:57<02:50,  3.80s/batch, loss=0.882]

Epoch [3], Step [2200], Loss: 0.8815


Epoch 3/3: 100%|██████████| 2245/2245 [2:26:53<00:00,  3.93s/batch, loss=0.88]


Model saved to models/vgg_19_PACS_TASK2.pth
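
A note on reading the log above: the value printed as the loss is running_loss / (i + 1), which is the mean of all batch losses seen so far in the epoch, not the loss of the latest batch. The sketch below illustrates the difference in plain Python. running_mean_losses is a hypothetical helper and the input numbers are made up for illustration, not taken from the run.

```python
def running_mean_losses(batch_losses):
    """Return the running mean after each batch, as the training loop logs it."""
    total = 0.0
    means = []
    for i, loss in enumerate(batch_losses):
        total += loss
        means.append(total / (i + 1))
    return means


# Made-up per-batch losses that fall steadily.
batch_losses = [2.0, 1.0, 0.0]
print(running_mean_losses(batch_losses))
```

Because early batches stay in the average, the logged value moves more slowly than the per-batch loss, and it resets at the start of each epoch. That is why the logged loss can drop sharply between the end of one epoch and the first steps of the next.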


Running the Evaluation Loop

Evaluate the fine-tuned VGG model on the PACS validation set to see how well it performs on an out-of-domain dataset like PACS. PACS represents a covariate shift relative to the ImageNet data the model was pretrained on, so a drop in accuracy is expected.

In [18]:
val_dataset = PACSDataset(val_dataset, transform=transform)
valloader = DataLoader(val_dataset, batch_size=4, shuffle=False, num_workers=4)

model.eval()
correct = 0
total = 0
with torch.no_grad():
    for images, labels in tqdm(valloader):
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

PACS_Accuracy = 100 * correct / total

print(f'Accuracy on PACS Validation Set: {PACS_Accuracy:.2f}%')

100%|██████████| 254/254 [14:20<00:00,  3.39s/it]

Accuracy on PACS Validation Set: 78.01%
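
The single number above averages over all validation images. Since PACS is made up of four domains (photo, art painting, cartoon, sketch), a per-domain breakdown can show where the model struggles. The sketch below is plain Python. accuracy_by_group is a hypothetical helper, the inputs are made-up toy values, and it assumes a domain label is available for each validation sample, which this notebook does not load.

```python
from collections import defaultdict


def accuracy_by_group(predictions, labels, groups):
    """Percentage of correct predictions within each group (e.g. each domain)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, label, group in zip(predictions, labels, groups):
        total[group] += 1
        correct[group] += int(pred == label)
    return {group: 100.0 * correct[group] / total[group] for group in total}


# Toy values only: two "photo" samples and two "sketch" samples.
predictions = [0, 1, 2, 2]
labels = [0, 1, 1, 2]
domains = ["photo", "photo", "sketch", "sketch"]

per_domain = accuracy_by_group(predictions, labels, domains)
for domain, acc in per_domain.items():
    print(f"{domain}: {acc:.2f}%")
```

In the evaluation loop, the predictions and labels could be collected into lists batch by batch and passed to a helper like this once the loop finishes.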



