## Healthcare Image Classification

A healthcare company wants to develop a model to classify X-ray images into different categories of lung diseases. They have a limited dataset of labeled X-ray images for training. To improve the model's performance, they decide to use transfer learning with a pre-trained model like VGG16.

### Task 1: Transfer Learning with VGG16

- Implement transfer learning with VGG16 to train a model on the given dataset of X-ray images.
- Evaluate the model's performance on a separate test set.
- Calculate relevant metrics such as accuracy, precision, recall, and F1 score.
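
The metrics listed above can be computed with scikit-learn once the test-set predictions have been collected. The following is a minimal sketch, assuming single-label integer class indices. The `y_true` and `y_pred` arrays are hypothetical placeholders rather than results from this dataset, and macro averaging is only one reasonable choice for imbalanced classes.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical placeholder labels for a 3-class problem (not real results)
y_true = np.array([0, 1, 2, 2, 1, 0, 2, 1])
y_pred = np.array([0, 1, 2, 1, 1, 0, 2, 0])

# Fraction of samples whose predicted class matches the true class
accuracy = accuracy_score(y_true, y_pred)

# Macro averaging gives every class the same weight regardless of its frequency
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

In the notebook itself, `y_true` and `y_pred` would be built by appending the labels and predicted classes inside the evaluation loop.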

### Task 2: Quantum Hyperparameter Optimization

- Design a quantum circuit that can be used to optimize the hyperparameters of the neural network model for the X-ray image classification task.
- Compare the efficiency of the classical and quantum approaches in terms of hyperparameter optimization time and model performance improvement.
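
One way to picture the circuit in Task 2 is a small variational circuit whose measured bitstring selects a hyperparameter configuration. The sketch below is an assumption-laden illustration, not the notebook's method: it simulates two unentangled qubits classically with NumPy, each rotated by an `RY` gate, and tunes the rotation angles with the parameter-shift rule so that low-cost configurations become more probable. The `configs` grid and the `costs` values are hypothetical placeholders. In a real run each cost would come from training the classifier, and the circuit would be executed through a quantum framework or device.

```python
import numpy as np

# Hypothetical 2x2 hyperparameter grid; each 2-bit string selects one entry
configs = [
    {"lr": 1e-3, "batch_size": 8},   # bitstring 00
    {"lr": 1e-3, "batch_size": 16},  # bitstring 01
    {"lr": 1e-4, "batch_size": 8},   # bitstring 10
    {"lr": 1e-4, "batch_size": 16},  # bitstring 11
]
# Placeholder validation losses, NOT measured results
costs = np.array([0.62, 0.48, 0.55, 0.41])


def bitstring_probabilities(thetas):
    """Simulate RY(theta_i) on each qubit of |0...0> and return P(bitstring)."""
    state = np.array([1.0])
    for theta in thetas:
        qubit = np.array([np.cos(theta / 2), np.sin(theta / 2)])
        state = np.kron(state, qubit)  # first qubit is the most significant bit
    return state ** 2  # amplitudes are real, so squaring gives probabilities


def expected_cost(thetas):
    """Average cost of the configurations sampled from the circuit."""
    return float(bitstring_probabilities(thetas) @ costs)


def parameter_shift_gradient(thetas):
    """Gradient of the expected cost via the parameter-shift rule for RY gates."""
    grad = np.zeros_like(thetas)
    for i in range(len(thetas)):
        shift = np.zeros_like(thetas)
        shift[i] = np.pi / 2
        grad[i] = 0.5 * (expected_cost(thetas + shift) - expected_cost(thetas - shift))
    return grad


# Start in an equal superposition: every configuration is equally likely
thetas = np.full(2, np.pi / 2)
for _ in range(200):
    thetas -= 1.0 * parameter_shift_gradient(thetas)  # plain gradient descent

probs = bitstring_probabilities(thetas)
best = int(np.argmax(probs))
print(f"Most probable configuration: {configs[best]} (p = {probs[best]:.3f})")
```

For the comparison in the second bullet, the same cost table could be searched exhaustively or by random search, with both approaches timed by the same clock and judged on the quality of the configuration they return.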



In [1]:
import os
import shutil
import tempfile
import matplotlib.pyplot as plt
import PIL
import torch
import numpy as np
from sklearn.metrics import classification_report

from monai.apps import download_and_extract
from monai.config import print_config
from monai.data import decollate_batch, DataLoader
from monai.metrics import ROCAUCMetric
from monai.networks.nets import DenseNet121
from monai.transforms import (
    Activations,
    EnsureChannelFirst,
    AsDiscrete,
    Compose,
    LoadImage,
    RandFlip,
    RandRotate,
    RandZoom,
    ScaleIntensity,
)
from monai.utils import set_determinism

print_config()

MONAI version: 1.3.0
Numpy version: 1.23.5
Pytorch version: 2.0.1+cpu
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: 865972f7a791bf7b42efbcd87c8402bd865b329e
MONAI __file__: c:\Users\<username>\AppData\Local\Programs\Python\Python39\lib\site-packages\monai\__init__.py

Optional dependencies:
Pytorch Ignite version: NOT INSTALLED or UNKNOWN VERSION.
ITK version: NOT INSTALLED or UNKNOWN VERSION.
Nibabel version: NOT INSTALLED or UNKNOWN VERSION.
scikit-image version: 0.21.0
scipy version: 1.10.1
Pillow version: 9.4.0
Tensorboard version: 2.12.3
gdown version: NOT INSTALLED or UNKNOWN VERSION.
TorchVision version: NOT INSTALLED or UNKNOWN VERSION.
tqdm version: 4.65.0
lmdb version: NOT INSTALLED or UNKNOWN VERSION.
psutil version: 5.9.5
pandas version: 2.0.2
einops version: NOT INSTALLED or UNKNOWN VERSION.
transformers version: NOT INSTALLED or UNKNOWN VERSION.
mlflow version: NOT INSTALLED or UNKNOWN VERSION.
pynrrd version: NOT INSTALLED or UNK

In [2]:
import pandas as pd
import pydicom
import pickle
import cv2

from sklearn.model_selection import train_test_split

# Timing utility
from timeit import default_timer as timer
from tqdm import tqdm

import torch
from sklearn.metrics import (
    accuracy_score, auc, f1_score, multilabel_confusion_matrix,
    precision_score, recall_score, roc_auc_score, roc_curve,
)
import seaborn as sn

import warnings
warnings.filterwarnings('ignore', category=FutureWarning)
warnings.filterwarnings('ignore', category=UserWarning)


print("all imported")

set_determinism(seed=0)

all imported


In [3]:
diseases = ['Aortic enlargement', 'Atelectasis', 'Calcification', 'Cardiomegaly',
 'Consolidation', 'ILD', 'Infiltration', 'Lung Opacity', 'Nodule/Mass',
 'Other lesion', 'Pleural effusion', 'Pleural thickening', 'Pneumothorax',
 'Pulmonary fibrosis']

# The diseases above were chosen based on how frequently each occurs in the images.

# Columns to keep when filtering the label table: the disease labels plus the image ID
columns_to_keep = diseases + ['image_id']

print(diseases)
print(columns_to_keep)

['Aortic enlargement', 'Atelectasis', 'Calcification', 'Cardiomegaly', 'Consolidation', 'ILD', 'Infiltration', 'Lung Opacity', 'Nodule/Mass', 'Other lesion', 'Pleural effusion', 'Pleural thickening', 'Pneumothorax', 'Pulmonary fibrosis']
['Aortic enlargement', 'Atelectasis', 'Calcification', 'Cardiomegaly', 'Consolidation', 'ILD', 'Infiltration', 'Lung Opacity', 'Nodule/Mass', 'Other lesion', 'Pleural effusion', 'Pleural thickening', 'Pneumothorax', 'Pulmonary fibrosis', 'image_id']
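
The `columns_to_keep` list is meant for dropping every other column from the label table. The following is a minimal sketch of that filtering step, assuming the annotations live in a pandas DataFrame with one 0/1 column per finding. The `demo_df` table, its values, and the shortened `demo_diseases` list are hypothetical and exist only for illustration.

```python
import pandas as pd

# Shortened, hypothetical stand-ins for `diseases` and `columns_to_keep`
demo_diseases = ['Aortic enlargement', 'Atelectasis', 'Cardiomegaly']
demo_columns = demo_diseases + ['image_id']

# Hypothetical label table: one row per image, one 0/1 column per finding
demo_df = pd.DataFrame({
    'image_id': ['img_001', 'img_002', 'img_003'],
    'Aortic enlargement': [1, 0, 0],
    'Atelectasis': [0, 1, 0],
    'Cardiomegaly': [1, 0, 0],
    'No finding': [0, 0, 1],  # not in the list, so it is dropped
})

# Keep only the listed columns, in the listed order
filtered = demo_df[demo_columns]

# Multi-hot label matrix with shape (n_images, n_diseases)
label_matrix = filtered[demo_diseases].to_numpy()

print(filtered.columns.tolist())
print(label_matrix.shape)
```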


In [None]:
import torch
import torchvision
import torch.nn as nn
import torch.optim as optim
import monai
from monai.transforms import (
    Compose, EnsureChannelFirst, RandFlip, RandRotate,
    RepeatChannel, Resize, ScaleIntensity, ToTensor,
)

In [None]:

# Define transforms. ImageDataset loads each file itself, so LoadImage is not
# repeated inside the pipeline.
train_transforms = Compose([
    EnsureChannelFirst(),                # replaces AddChannel, which MONAI 1.3 no longer provides
    ScaleIntensity(),
    Resize(spatial_size=(224, 224)),     # common size so images can be batched
    RepeatChannel(repeats=3),            # grayscale -> the 3 channels VGG16 expects
    RandRotate(range_x=np.deg2rad(15), prob=0.5),  # range_x is in radians
    RandFlip(spatial_axis=0, prob=0.5),
    RandFlip(spatial_axis=1, prob=0.5),
    ToTensor()
])

test_transforms = Compose([
    EnsureChannelFirst(),
    ScaleIntensity(),
    Resize(spatial_size=(224, 224)),
    RepeatChannel(repeats=3),
    ToTensor()
])

# Load dataset. train_files / train_labels (and the test equivalents) are assumed
# to be lists of image paths and integer class indices prepared beforehand.
train_dataset = monai.data.ImageDataset(
    image_files=train_files,
    labels=train_labels,
    transform=train_transforms
)

In [None]:

test_dataset = monai.data.ImageDataset(
    image_files=test_files,
    labels=test_labels,
    transform=test_transforms
)

In [None]:
# Create dataloaders
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=8, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=1)

# Assumption: one class per entry in `diseases`, one class index per image
num_classes = len(diseases)

# Load pre-trained VGG16 model. MONAI does not provide a VGG16 class, so the
# ImageNet-pretrained network comes from torchvision.
vgg16 = torchvision.models.vgg16(weights=torchvision.models.VGG16_Weights.IMAGENET1K_V1)

In [None]:
# Freeze all pre-trained layers
for param in vgg16.parameters():
    param.requires_grad = False

# Define new classifier (created after freezing, so it stays trainable)
num_ftrs = vgg16.classifier[6].in_features
vgg16.classifier[6] = nn.Linear(num_ftrs, num_classes)

# Define loss function and optimizer; only the new layer is optimized
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(vgg16.classifier[6].parameters(), lr=0.001)

In [None]:

# Train the model
num_epochs = 10
for epoch in range(num_epochs):
    vgg16.train()
    running_loss = 0.0
    # ImageDataset yields (image, label) tuples, not dictionaries
    for inputs, labels in train_loader:
        optimizer.zero_grad()
        outputs = vgg16(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    print(f"Epoch {epoch+1}, Loss: {running_loss / len(train_loader)}")

# Evaluate the model
correct = 0
total = 0
vgg16.eval()
with torch.no_grad():
    for images, labels in test_loader:
        outputs = vgg16(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f"Accuracy: {100 * correct / total}%")


This Python code uses the PyTorch and MONAI libraries to fine-tune a convolutional neural network (CNN) for image classification. The network is VGG16, a widely used architecture with a simple, uniform design that is commonly distributed with weights pre-trained on the ImageNet dataset.

The code performs the following steps:

1. Import the necessary libraries and modules.
2. Define transformations for the training and testing datasets, including adding a channel dimension, scaling the intensity, and converting to a PyTorch tensor. The training pipeline additionally applies random rotation and random flips as data augmentation.
3. Load the training and testing datasets using MONAI's `ImageDataset` class and apply the defined transformations.
4. Create data loaders for the training and testing datasets.
5. Load the pre-trained VGG16 model.
6. Freeze all pre-trained layers, then replace the final classifier layer with a new linear layer sized to the number of classes, so that only this new layer is trained.
7. Define the loss function as cross-entropy and the optimizer as Adam with a learning rate of 0.001.
8. Train the model for 10 epochs, updating the trainable parameters to minimize the loss on the training data.
9. Print the average loss for each epoch.
10. Evaluate the trained model on the testing data and calculate the accuracy.

This code provides a framework for training and evaluating a CNN model using VGG16 for image classification tasks. It reports accuracy only; the precision, recall, and F1 score requested in Task 1 still have to be computed from the collected predictions.
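
The freeze-and-replace step (step 6) can be checked by counting trainable parameters. The sketch below is a minimal illustration, assuming a tiny stand-in network in place of VGG16 so that it runs without downloading weights. The `backbone` layers and the `n_outputs` value are hypothetical.

```python
import torch
import torch.nn as nn

n_outputs = 14  # assumption: one class per disease in the list above

# Small stand-in for a pre-trained network (hypothetical, for illustration only)
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 1000),  # original head with 1000 outputs
)

# Freeze every existing parameter
for param in backbone.parameters():
    param.requires_grad = False

# Replace the head; a freshly created layer is trainable by default
backbone[-1] = nn.Linear(backbone[-1].in_features, n_outputs)

# Hand only the trainable parameters to the optimizer
trainable = [p for p in backbone.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=0.001)

n_trainable = sum(p.numel() for p in trainable)
n_total = sum(p.numel() for p in backbone.parameters())
print(f"Trainable parameters: {n_trainable} of {n_total}")

# A forward pass on random input confirms the new output size
logits = backbone(torch.randn(2, 3, 32, 32))
print(tuple(logits.shape))
```

Passing only the trainable parameters to the optimizer is a design choice: it behaves the same as passing all parameters when the rest are frozen, but it makes the intent explicit.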
