<a href="https://colab.research.google.com/github/Seba485/Deep-Learning-project/blob/main/TransferLearning_grayscale_trial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using pre-trained models related to our dataset (grayscale, MRI dataset)

Our MRI dataset is grayscale, so each image has a single channel. Most pretrained models, however, are trained on RGB images such as ImageNet. We therefore first fine-tuned ImageNet (RGB) pretrained models on our dataset by stacking the single channel three times. However, the paper [Pre-training on Grayscale ImageNet Improves Medical Image Classification](https://openaccess.thecvf.com/content_ECCVW_2018/papers/11134/Xie_Pre-training_on_Grayscale_ImageNet_Improves_Medical_Image_Classification_ECCVW_2018_paper.pdf) reports that fine-tuning on grayscale X-ray images from a grayscale-pretrained model yields a higher AUC than fine-tuning from the original RGB-pretrained model, although the difference is small (AUC 0.7706 vs. 0.7498). We therefore also fine-tuned on our MRI dataset using open-source models that were trained on grayscale data.
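
As a minimal sketch of the channel-stacking step, assume a batch of single-channel tensors in `(N, 1, H, W)` layout; the shapes and the stand-in convolution below are illustrative and not taken from our pipeline. Replicating the channel three times lets an RGB-pretrained first layer accept grayscale input. For a replicated input, the same activations can be obtained by summing the first layer's weights over the RGB axis, which is one common way to derive a single-channel layer from RGB weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative batch of grayscale images: (N, 1, H, W)
gray = torch.rand(2, 1, 32, 32)

# Option 1: replicate the single channel three times -> (N, 3, H, W)
rgb_like = gray.repeat(1, 3, 1, 1)

# Stand-in for the first layer of an RGB-pretrained network (random weights here)
conv_rgb = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)

# Option 2: collapse the RGB weights into one input channel by summing them
conv_gray = nn.Conv2d(1, 8, kernel_size=3, padding=1, bias=False)
with torch.no_grad():
    conv_gray.weight.copy_(conv_rgb.weight.sum(dim=1, keepdim=True))

out_stacked = conv_rgb(rgb_like)
out_summed = conv_gray(gray)

# Both routes give the same activations up to floating-point error
same = torch.allclose(out_stacked, out_summed, atol=1e-5)
print(rgb_like.shape, same)
```
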



In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torch.backends.cudnn as cudnn
import numpy as np
import pandas as pd
import torchvision
from torchvision import datasets, models, transforms
import matplotlib.pyplot as plt
import time
import os
from PIL import Image
from tempfile import TemporaryDirectory
import tensorflow as tf
from torch.utils.data import TensorDataset, Dataset, DataLoader
from tqdm import tqdm
from sklearn.metrics import roc_auc_score
import copy

try:
    from google.colab import drive
    drive.mount('/content/drive/')
    print('running the notebook in colab')
except ImportError:
    # not running in Colab, so there is no Google Drive to mount
    pass

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).
running the notebook in colab


### Load & Prepare Dataset

To use the PyTorch-based models hosted on Hugging Face, we have to load our dataset in PyTorch format. To compare model performance fairly, we also have to keep the same train/validation split and the same augmented training set. The data therefore has to be converted from a TensorFlow dataset to a PyTorch `DataLoader`.

We first tried to load the already preprocessed TensorFlow dataset and convert it to PyTorch format. However, we ran into several errors; the main one was that we could not rescale the input images from [0, 255] to [0, 1].

We therefore built the PyTorch datasets from saved image files instead.

(The failed code is kept below, commented out.)
<!---

```
# transform tensorflow dataset to pytorch dataloader

############ Idea 1 ############ -> [x]

# load saved augmented dataset
dataset_path = '/content/drive/MyDrive/UNIPD-DLNR/Dataset'
train_ds_aug_tf = tf.data.Dataset.load(dataset_path+'/train_ds_aug.tfrecord')
val_ds_tf = tf.data.Dataset.load(dataset_path+'/val_ds.tfrecord')
val_ds_tf = tf.data.Dataset.load('/content/drive/MyDrive/UNIPD-DLNR/Dataset/val_ds.tfrecord')
train_ds_tf = tf.data.Dataset.load(dataset_path+'/train_ds.tfrecord')

# Step 1) extract images, labels from tf dataset
def images_labels_fromDataset(dataset):
    dataset_it = dataset.as_numpy_iterator() # Numpy iterator, the object in input is not iterable
    all_labels=[]; all_images=[]
    for batch_images, batch_labels in dataset_it:
        all_labels.extend(batch_labels)
        # batch_images = batch_images.astype(np.uint8)
        all_images.extend(batch_images)
    return all_images, all_labels

# tensors_train, labels_train = images_labels_fromDataset(train_ds_aug_tf)
tensors_val, labels_val = images_labels_fromDataset(val_ds_tf)
# tensors_train_, labels_train_ = images_labels_fromDataset(train_ds_tf)

# Step 2) list -> torch dataset
# use transforms.ToTensor to scale images from [0,255] to [0,1]

# print(type(val_ds_tf))
# AUTOTUNE = tf.data.experimental.AUTOTUNE
# val_ds_tf = val_ds_tf.cache().prefetch(buffer_size=AUTOTUNE)
# print(type(val_ds_tf))

basic_transform = transforms.Compose([
    transforms.ToPILImage(),
    # transforms.Resize((256, 256)),
    transforms.ToTensor(), # can be scaled only if the numpy.ndarray has dtype = np.uint8
                           # however, type changes to float after torch.Tensor(ndarray)
                           # therefore, manually scale it to [0,1]
    transforms.Normalize([73.4793],[83.3339]),
    transforms.Grayscale(num_output_channels=1),
])

''' Calculate mean, std of train dataset for normalization
def check_img_stats(loader):
    mean = 0.
    std = 0.
    for images, _ in loader:
        batch_samples = images.size(0) # batch size (the last batch can have smaller size!)
        # images = images.view(batch_samples, images.size(1), -1)
        images = images.view(batch_samples, 1, -1)
        mean += images.mean(2).sum(0)
        std += images.std(2).sum(0)
    mean /= len(loader.dataset)
    std /= len(loader.dataset)
    print(mean, std)

check_img_stats(train_ds_aug)
# > tensor([73.4793]) tensor([83.3339])

check_img_stats(train_ds)
# > tensor([70.7964]) tensor([82.4302])
'''

# train_ds_aug_dataset = TensorDataset(torch.Tensor(tensors_train), torch.Tensor(labels_train))
val_ds_dataset = TensorDataset(torch.Tensor(tensors_val), torch.Tensor(labels_val))
# train_ds_dataset = TensorDataset(torch.Tensor(tensors_train_), torch.Tensor(labels_train_))
# x = torch.Tensor(tensors_train_)
# y = torch.Tensor(labels_train_)
# train_ds_dataset = TensorDataset(x, y)

# train_ds_aug_dataset.transform = basic_transform
val_ds_dataset.transform = basic_transform
# train_ds_dataset.transform = basic_transform

# check if scaling worked -> [x]
images, _ = next(iter(val_ds_dataset))
pd.Series(images.reshape(-1)).hist() # should be [0,1] range

# Step 3) torch dataset -> torch dataloader
BATCH_SIZE = 40
# train_ds_aug = DataLoader(train_ds_aug_dataset, batch_size=BATCH_SIZE, shuffle=True)
val_ds = DataLoader(val_ds_dataset, batch_size=BATCH_SIZE, shuffle=False)
# train_ds = DataLoader(train_ds_dataset, batch_size=BATCH_SIZE, shuffle=True)


############ Idea 2 ############ -> [x]
class CustomPyTorchDataset(Dataset):
    def __init__(self, dataset, transform=None):
        self.dataset = dataset
        self.transform = transform

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        sample = self.dataset[idx]

        image = sample['image']
        label = sample['label']
        # print(image.shape)
        # image = np.squeeze(image, axis=0)

        if self.transform:
            image = self.transform(image)

        return {'image': image, 'label': label}

torch_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
    transforms.Grayscale(num_output_channels=1),
])
BATCH_SIZE = 40

# train set
train_ds_aug_np = [{'image': img.numpy(), 'label': l.numpy()} for img, l in train_ds_aug_tf]
train_ds_aug_dataset = CustomPyTorchDataset(dataset=train_ds_aug_np, transform=torch_transform)
train_ds_aug = DataLoader(train_ds_aug_dataset, batch_size=BATCH_SIZE, shuffle=True)

# val set
val_ds_np = [{'image': img.numpy(), 'label': l.numpy()} for img, l in val_ds_tf]
val_ds_dataset = CustomPyTorchDataset(dataset=val_ds_np, transform=torch_transform)
val_ds = DataLoader(val_ds_dataset, batch_size=BATCH_SIZE, shuffle=False)
```

-->
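
One likely reason the rescaling did not take effect in the first attempt is that `TensorDataset` returns the stored tensors as they are and ignores a `transform` attribute assigned afterwards. In addition, `transforms.ToTensor` only rescales `uint8` inputs. A minimal sketch of a wrapper that scales explicitly follows, assuming the images arrive as float arrays in [0, 255] with shape `(H, W, 1)`. The class name `ScaledArrayDataset` and the random data are hypothetical.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class ScaledArrayDataset(Dataset):
    """Hypothetical wrapper: holds (H, W, 1) arrays in [0, 255], returns (1, H, W) tensors in [0, 1]."""

    def __init__(self, images, labels):
        self.images = images
        self.labels = labels

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = torch.as_tensor(np.asarray(self.images[idx]), dtype=torch.float32)
        img = img.permute(2, 0, 1) / 255.0  # channels first, then rescale explicitly
        label = torch.as_tensor(int(self.labels[idx]), dtype=torch.long)
        return img, label

# Random stand-in data in place of the arrays extracted from the TensorFlow dataset
rng = np.random.default_rng(0)
images = [rng.uniform(0, 255, size=(16, 16, 1)).astype(np.float32) for _ in range(5)]
labels = [0, 1, 2, 3, 0]

loader = DataLoader(ScaledArrayDataset(images, labels), batch_size=2, shuffle=False)
batch_images, batch_labels = next(iter(loader))
print(batch_images.shape, batch_images.min().item(), batch_images.max().item())
```
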


In [None]:
# save validation arrays to image files
'''
val_ds_tf = tf.data.Dataset.load('/content/drive/MyDrive/UNIPD-DLNR/Dataset/val_ds.tfrecord')
class_name = ['Non_Demented', 'Mild_Demented', 'Very_Mild_Demented', 'Moderate_Demented']
folder_path = '/content/drive/MyDrive/UNIPD-DLNR/Dataset/validation'

from tqdm import tqdm
import imageio

for i in tqdm(range(len(tensors_val))):
    img = np.squeeze(tensors_val[i], axis=-1).astype(np.uint8)
    path = os.path.join(folder_path, class_name[labels_val[i]], f"{i}.jpg")
    imageio.imwrite(path, img)
'''

100%|██████████| 1023/1023 [00:04<00:00, 217.53it/s]


### 1. [Theem/fasterrcnn_resnet50_fpn_grayscale](https://huggingface.co/Theem/fasterrcnn_resnet50_fpn_grayscale)
: PyTorch Faster R-CNN with a ResNet50 backbone, fine-tuned on grayscale COCO
- COCO dataset: COCO (Common Objects in Context) is a large-scale dataset for object detection, segmentation, and captioning. It contains over 330,000 images annotated with 80 object categories, plus five captions per image describing the scene.

**Training failed**
: The model is designed for object detection, whereas our task is image classification. Even though we replaced the last layers with classifier layers, training did not succeed because the model still requires object-detection targets (bounding boxes and labels) in training mode.

In [2]:
########### LOAD DATASET ###########

dataset_path = '/content/drive/MyDrive/UNIPD-DLNR/Dataset'
BATCH_SIZE = 40

def check_img_stats(loader):
    mean = 0.
    std = 0.
    for images, _ in tqdm(loader):
        batch_samples = images.size(0) # batch size (the last batch can have smaller size!)
        images = images.view(batch_samples, images.size(1), -1)
        mean += images.mean(2).sum(0)
        std += images.std(2).sum(0)
    mean /= len(loader.dataset)
    std /= len(loader.dataset)
    print(mean, std)
    return mean, std

# load train dataset
train_path = dataset_path+'/Augmented_TrainDataset'
'''
basic_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Grayscale(num_output_channels=1),
])
train_dataset = datasets.ImageFolder(train_path, basic_transforms)
train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)

print('Train dataset statistics:')
mean, std = check_img_stats(train_dataloader) # 0.2887, 0.3272
'''
# normalize it with the updated stats
updated_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Grayscale(num_output_channels=1),
    transforms.Normalize([0.2887],[0.3272])
])
train_dataset = datasets.ImageFolder(train_path, updated_transforms)
train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
# print('Final train dataset statistics:')
# check_img_stats(train_dataloader) # almost 0, 1 -> correct

# load val dataset
val_path = dataset_path+'/validation'
val_dataset = datasets.ImageFolder(val_path, updated_transforms)
val_dataloader = torch.utils.data.DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False)
# check_img_stats(val_dataloader) # ? , 0.9xxx -> ok

# load test dataset
test_path = dataset_path+'/test'
test_dataset = datasets.ImageFolder(test_path, updated_transforms)
test_dataloader = torch.utils.data.DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)
# check_img_stats(test_dataloader) # -0.0331, 0.9888 -> ok
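
Note that `check_img_stats` averages per-image means and per-image standard deviations. When all images have the same size the mean is unaffected, but the averaged standard deviation is generally not the standard deviation over all pixels of the dataset. A sketch of pooled statistics, accumulating the sum and the sum of squares per channel, is given below. `pooled_img_stats` is a hypothetical helper, and the random batches stand in for a `DataLoader`.

```python
import torch

def pooled_img_stats(loader):
    """Per-channel mean/std over every pixel of every image yielded by `loader`."""
    n_pixels = 0
    channel_sum = 0.0
    channel_sq_sum = 0.0
    for images, _ in loader:
        # images: (N, C, H, W) -> accumulate over batch and spatial dimensions
        images = images.double()
        n_pixels += images.size(0) * images.size(2) * images.size(3)
        channel_sum = channel_sum + images.sum(dim=(0, 2, 3))
        channel_sq_sum = channel_sq_sum + (images ** 2).sum(dim=(0, 2, 3))
    mean = channel_sum / n_pixels
    # population variance: E[x^2] - E[x]^2, clamped against tiny negative round-off
    std = (channel_sq_sum / n_pixels - mean ** 2).clamp(min=0).sqrt()
    return mean, std

# Stand-in for a DataLoader: a list of (images, labels) batches
torch.manual_seed(0)
batches = [
    (torch.rand(4, 1, 8, 8), torch.zeros(4)),
    (torch.rand(3, 1, 8, 8), torch.zeros(3)),
]
mean, std = pooled_img_stats(batches)
print(mean, std)
```
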


In [27]:
########### LOAD MODEL ###########

# Load torchvision model structure & modify it for grayscale images
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None)
model.backbone.body.conv1 = torch.nn.Conv2d(1, 64,
                            kernel_size=(7, 7), stride=(2, 2),
                            padding=(3, 3), bias=False).requires_grad_(True)

# Load pretrained weights to the model
model_path = '/content/drive/MyDrive/UNIPD-DLNR/pretrained_model/fasterrcnn_resnet50_fpn_coco_grayscale.pth'
state_dict = torch.load(model_path)['model']
model.load_state_dict(state_dict)

# Remove object detection box predictor & add classifier layers
num_classes = 4
model.roi_heads.box_predictor = nn.Sequential(
    nn.Linear(1024,256),
    nn.ReLU(),
    nn.Linear(256, num_classes)
)

# Freeze model params except for the classifier layer
only_last = True
if only_last:
    for param in model.backbone.parameters():
        param.requires_grad = False
    for param in model.rpn.parameters():
        param.requires_grad = False
    for param in model.roi_heads.box_head.parameters():
        param.requires_grad = False
else:
    for param in model.parameters():
        param.requires_grad = True

In [None]:
########### FINETUNE ###########

# Set training hyperparameters
epochs = 20

criterion = nn.CrossEntropyLoss() # multi-class loss; expects integer class indices as targets, not one-hot vectors

params_to_update = []
for name,param in model.named_parameters():
    if param.requires_grad == True:
        print("\t",name)
        params_to_update.append(param)
optimizer = torch.optim.AdamW(params_to_update, lr=0.001, weight_decay=0.01)

# Load model to GPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
model.to(device)

dataloaders = {'train':train_dataloader, 'val':val_dataloader}

In [34]:
# Train

since = time.time()

val_acc_history = []

best_model_wts = copy.deepcopy(model.state_dict())
best_acc = 0.0
num_epochs = 20

for epoch in range(num_epochs):
    print('Epoch {}/{}'.format(epoch+1, num_epochs))
    print('-' * 10)

    # Each epoch has a training and validation phase
    for phase in ['train', 'val']:
        if phase == 'train':
            model.train()  # Set model to training mode
        else:
            model.eval()   # Set model to evaluate mode

        running_loss = 0.0
        running_corrects = 0

        # Iterate over data.
        for inputs, labels in dataloaders[phase]:
            inputs = inputs.to(device)
            labels = labels.to(device)

            # zero the parameter gradients
            optimizer.zero_grad()

            # forward
            # track history if only in train
            with torch.set_grad_enabled(phase == 'train'):
                # Get model outputs and calculate loss

                targets = {'labels':labels}

                outputs = model(inputs, targets)
                loss = criterion(outputs, labels)

                _, preds = torch.max(outputs, 1)

                # backward + optimize only if in training phase
                if phase == 'train':
                    loss.backward()
                    optimizer.step()

            # statistics
            running_loss += loss.item() * inputs.size(0)
            running_corrects += torch.sum(preds == labels.data)

        epoch_loss = running_loss / len(dataloaders[phase].dataset)
        epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

        print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

        # deep copy the model
        if phase == 'val' and epoch_acc > best_acc:
            best_acc = epoch_acc
            best_model_wts = copy.deepcopy(model.state_dict())
        if phase == 'val':
            val_acc_history.append(epoch_acc)

    print()

time_elapsed = time.time() - since
print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
print('Best val Acc: {:.4f}'.format(best_acc))

# load best model weights
model.load_state_dict(best_model_wts)

# save model
model_name = f"fasterrcnn_coco_greyscale_e{epochs}.pt"
torch.save(model.state_dict(), os.path.join('/content/drive/MyDrive/UNIPD-DLNR/model', model_name))

Epoch 1/20
----------


TypeError: string indices must be integers
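
The `TypeError` above is consistent with how the `targets` argument was passed. In training mode, torchvision's detection models expect `targets` to be a list with one dict per image, each containing `boxes` and `labels`. Iterating over a single dict yields its string keys instead, and indexing a string with `"boxes"` raises this error. The sketch below reproduces the mechanism in plain Python. `read_boxes` is a hypothetical stand-in for the library's target validation, not its actual code.

```python
def read_boxes(targets):
    """Hypothetical stand-in: read the 'boxes' entry of each per-image target."""
    return [target["boxes"] for target in targets]

# What the training loop passed: a single dict, so iteration yields the key 'labels'
wrong_targets = {"labels": [0, 2, 1]}
try:
    read_boxes(wrong_targets)
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)  # TypeError

# The expected layout: one dict per image with 'boxes' (x1, y1, x2, y2) and 'labels'
right_targets = [
    {"boxes": [[10.0, 20.0, 50.0, 80.0]], "labels": [1]},
    {"boxes": [[5.0, 5.0, 30.0, 40.0]], "labels": [3]},
]
boxes = read_boxes(right_targets)
print(len(boxes))  # 2
```

Our classification dataset has no bounding boxes, so there is no meaningful way to build such targets for it.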

### raedinkhaled/vit-base-mri
: A version of google/vit-base-patch16-224-in21k fine-tuned on the mriDataSet dataset: https://huggingface.co/raedinkhaled/vit-base-mri

Although we have to replicate the single grayscale channel three times to use this model, we thought it was worth trying because the model was fine-tuned on an MRI dataset.

However, several unexpected errors occurred during training, so we decided to abandon this approach.

In [62]:
########### LOAD DATASET ###########

dataset_path = '/content/drive/MyDrive/UNIPD-DLNR/Dataset'
BATCH_SIZE = 40

def check_img_stats(loader):
    mean = 0.
    std = 0.
    for images, _ in tqdm(loader):
        batch_samples = images.size(0) # batch size (the last batch can have smaller size!)
        images = images.view(batch_samples, images.size(1), -1)
        mean += images.mean(2).sum(0)
        std += images.std(2).sum(0)
    mean /= len(loader.dataset)
    std /= len(loader.dataset)
    print(mean, std)
    return mean, std

# load train dataset
train_path = dataset_path+'/Augmented_TrainDataset'
'''
basic_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Grayscale(num_output_channels=1),
])
train_dataset = datasets.ImageFolder(train_path, basic_transforms)
train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)

print('Train dataset statistics:')
mean, std = check_img_stats(train_dataloader) # 0.2887, 0.3272
'''
# normalize it with the updated stats
updated_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Grayscale(num_output_channels=3),
    # transforms.Normalize([0.2887, 0.2887, 0.2887],[0.3272, 0.3272, 0.3272])
])
train_dataset = datasets.ImageFolder(train_path, updated_transforms)
train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
# print('Final train dataset statistics:')
# check_img_stats(train_dataloader) # almost 0, 1 -> correct

# load val dataset
val_path = dataset_path+'/validation'
val_dataset = datasets.ImageFolder(val_path, updated_transforms)
val_dataloader = torch.utils.data.DataLoader(val_dataset, batch_size=BATCH_SIZE, shuffle=False)
# check_img_stats(val_dataloader) # ? , 0.9xxx -> ok

# load test dataset
test_path = dataset_path+'/test'
test_dataset = datasets.ImageFolder(test_path, updated_transforms)
test_dataloader = torch.utils.data.DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)
# check_img_stats(test_dataloader) # -0.0331, 0.9888 -> ok

dataloaders = {'train':train_dataloader, 'val':val_dataloader}

In [63]:
# fine-tuned version of google/vit-base-patch16-224-in21k on the mriDataSet dataset: https://huggingface.co/raedinkhaled/vit-base-mri
from transformers import AutoImageProcessor, AutoModelForImageClassification
processor = AutoImageProcessor.from_pretrained("raedinkhaled/vit-base-mri")
model = AutoModelForImageClassification.from_pretrained("raedinkhaled/vit-base-mri")

model.classifier = nn.Linear(in_features=768, out_features=4)
for param in model.vit.parameters():
    param.requires_grad = False

# Set training hyperparameters
criterion = nn.CrossEntropyLoss() # multi-class loss; expects integer class indices as targets, not one-hot vectors

params_to_update = []
for name,param in model.named_parameters():
    if param.requires_grad == True:
        print("\t",name)
        params_to_update.append(param)
optimizer = torch.optim.AdamW(params_to_update, lr=0.001, weight_decay=0.01)

# Load model to GPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
model.to(device)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
Could not find image processor class in the image processor config or the model config. Loading based on pattern matching with the model's feature extractor configuration.


	 classifier.weight
	 classifier.bias
cuda:0


ViTForImageClassification(
  (vit): ViTModel(
    (embeddings): ViTEmbeddings(
      (patch_embeddings): ViTPatchEmbeddings(
        (projection): Conv2d(3, 768, kernel_size=(16, 16), stride=(16, 16))
      )
      (dropout): Dropout(p=0.0, inplace=False)
    )
    (encoder): ViTEncoder(
      (layer): ModuleList(
        (0-11): 12 x ViTLayer(
          (attention): ViTAttention(
            (attention): ViTSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.0, inplace=False)
            )
            (output): ViTSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.0, inplace=False)
            )
          )
          (intermediate): ViTIntermediate(
            (dense): Linear(in_features=7

In [65]:
since = time.time()

val_acc_history = []

best_model_wts = copy.deepcopy(model.state_dict())
best_acc = 0.0
num_epochs = 20

for epoch in range(num_epochs):
    print('Epoch {}/{}'.format(epoch+1, num_epochs))
    print('-' * 10)

    # Each epoch has a training and validation phase
    for phase in ['train', 'val']:
        if phase == 'train':
            model.train()  # Set model to training mode
        else:
            model.eval()   # Set model to evaluate mode

        running_loss = 0.0
        running_corrects = 0

        # Iterate over data.
        for inputs, labels in dataloaders[phase]:
            inputs = processor(inputs, return_tensors="pt").to(device)

            inputs = inputs.to(device)
            labels = labels.to(device)

            # zero the parameter gradients
            optimizer.zero_grad()

            # forward
            # track history if only in train
            with torch.set_grad_enabled(phase == 'train'):
                # Get model outputs and calculate loss

                # targets = {'labels':labels}

                outputs = model(inputs, labels)
                loss = criterion(outputs, labels)

                _, preds = torch.max(outputs, 1)

                # backward + optimize only if in training phase
                if phase == 'train':
                    loss.backward()
                    optimizer.step()

            # statistics
            running_loss += loss.item() * inputs.size(0)
            running_corrects += torch.sum(preds == labels.data)

        epoch_loss = running_loss / len(dataloaders[phase].dataset)
        epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

        print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

        # deep copy the model
        if phase == 'val' and epoch_acc > best_acc:
            best_acc = epoch_acc
            best_model_wts = copy.deepcopy(model.state_dict())
        if phase == 'val':
            val_acc_history.append(epoch_acc)

    print()

time_elapsed = time.time() - since
print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
print('Best val Acc: {:.4f}'.format(best_acc))

# load best model weights
model.load_state_dict(best_model_wts)

Epoch 1/20
----------


AttributeError: 
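
The empty `AttributeError` is consistent with passing the processor's output straight into the model: the processor returns a dict-like `BatchFeature`, not a tensor, so tensor attributes such as `.shape` or `.size` are missing. The model also returns an output object rather than raw logits, and passing `labels` positionally does not necessarily bind it to the `labels` parameter. A sketch of one training step under those assumptions follows. `hf_classification_step` is a hypothetical helper, and the tiny stand-in model only mimics the `.logits` interface so that the snippet runs without downloading weights. Since the `DataLoader` already yields resized tensors, the sketch skips the processor and assumes any normalization is done in the dataset transforms.

```python
from types import SimpleNamespace

import torch
import torch.nn as nn

def hf_classification_step(model, pixel_values, labels, criterion, optimizer=None):
    """One step for a Hugging Face style classifier whose output exposes `.logits`."""
    outputs = model(pixel_values=pixel_values)  # pass the image tensor by keyword
    logits = outputs.logits                     # (batch, num_classes)
    loss = criterion(logits, labels)
    if optimizer is not None:                   # training phase only
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    preds = logits.argmax(dim=1)
    return loss.item(), (preds == labels).sum().item()

class TinyStandIn(nn.Module):
    """Stand-in with the same call convention; NOT the real ViT."""

    def __init__(self, num_classes=4):
        super().__init__()
        self.classifier = nn.Linear(3 * 8 * 8, num_classes)

    def forward(self, pixel_values):
        return SimpleNamespace(logits=self.classifier(pixel_values.flatten(1)))

torch.manual_seed(0)
model = TinyStandIn()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
x = torch.rand(5, 3, 8, 8)
y = torch.tensor([0, 1, 2, 3, 0])
loss, correct = hf_classification_step(model, x, y, nn.CrossEntropyLoss(), optimizer)
print(round(loss, 4), correct)
```

In the loop above, the running statistics would then use the image tensor's batch size (`x.size(0)` here) rather than calling `.size` on the processor output.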

### Grayscale ImageNet
- https://github.com/DaveRichmond-/grayscale-imagenet/tree/master
- (possibly additionally fine-tuned on an X-ray dataset?)

This is the model described in the paper cited above. We intended to use it, but we could not load the checkpoint provided in the GitHub repository, which comes with no usage instructions.

In [44]:
path1 = '/content/drive/MyDrive/UNIPD-DLNR/pretrained_model/imagenet-gray/model.ckpt-1495066.data-00000-of-00001'
path2 = '/content/drive/MyDrive/UNIPD-DLNR/pretrained_model/imagenet-gray/model.ckpt-1495066.index'
path3 = '/content/drive/MyDrive/UNIPD-DLNR/pretrained_model/imagenet-gray/model.ckpt-1495066.meta'
# model_1 = tf.keras.models.load_model(path3)
with tf.Session() as sess:
    # Restore the saved model
    saver = tf.train.import_meta_graph(path1)
    saver.restore(sess, path1)


AttributeError: module 'tensorflow' has no attribute 'Session'
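
Two things stand out in the cell above. `tf.Session` exists only in the TensorFlow 1.x API, which TensorFlow 2 exposes under `tf.compat.v1`. Also, `import_meta_graph` expects the `.meta` file and `restore` expects the checkpoint prefix (the path without the `.data-00000-of-00001`, `.index`, or `.meta` suffix), not the `.data` shard. A sketch under those assumptions follows. `checkpoint_prefix` and `restore_tf1_checkpoint` are hypothetical helpers, and whether the graph from this repository actually imports under TensorFlow 2 is untested.

```python
import re

def checkpoint_prefix(path):
    """Strip the '.meta', '.index' or '.data-XXXXX-of-XXXXX' suffix from a checkpoint file path."""
    return re.sub(r"\.(meta|index|data-\d{5}-of-\d{5})$", "", path)

def restore_tf1_checkpoint(any_checkpoint_file):
    """Restore a TF1-style checkpoint through the compat API; returns (session, graph)."""
    import tensorflow as tf  # imported lazily so the path helper works without TensorFlow

    prefix = checkpoint_prefix(any_checkpoint_file)
    graph = tf.Graph()
    with graph.as_default():
        # the graph definition lives in the .meta file
        saver = tf.compat.v1.train.import_meta_graph(prefix + ".meta")
        sess = tf.compat.v1.Session(graph=graph)
        # variable values are addressed by the shared prefix, not by a single file
        saver.restore(sess, prefix)
    return sess, graph

base = "/content/drive/MyDrive/UNIPD-DLNR/pretrained_model/imagenet-gray/model.ckpt-1495066"
prefixes = {
    checkpoint_prefix(base + suffix)
    for suffix in (".data-00000-of-00001", ".index", ".meta")
}
print(prefixes == {base})  # True
```

To inspect which variables the checkpoint holds without importing the graph, `tf.train.list_variables(prefix)` can be called on the same prefix.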