<a href="https://colab.research.google.com/github/bptripp/ai-course/blob/main/pneumonia_detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Transfer Learning for Pneumonia Detection
In this activity you will use a convolutional network for detection of pneumonia in chest radiographs. This is a more difficult task than classifying digits, so it will require a larger network.

Training such a network from scratch would take several days and would require a large number of labelled examples. However, one can often get away with less training and less data by starting with a network that has already been trained for a related task.

For vision tasks, it is common to begin with a network that has been trained on ImageNet-1K, a dataset of over a million images of objects that are labelled with 1000 categories, including different kinds of animals, vehicles, etc.

Detecting pneumonia in chest radiographs is quite different, even in terms of image statistics, so it is not clear in advance how well this approach will work for this problem.

You should run the code for this example on a graphics processing unit (GPU). Before proceeding, select "Change runtime type" from the "Runtime" menu, and select a GPU option from among the "Hardware accelerator" choices. Default values are fine for any other choices.

*After selecting a GPU option, run the code below to import some required libraries and confirm that you are using a GPU. This code should print "cuda:0". If it prints "cpu" instead, check your hardware accelerator setting and try again.*

In [1]:
import numpy as np
import torch
from torchvision import datasets, models, transforms
import torch.nn as nn
from torch.nn import functional as F
import torch.optim as optim

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

cuda:0


PyTorch has a number of built-in convolutional networks that have been pre-trained on ImageNet, so it is easy to download one and use it. ResNet50 is an effective network for many visual classification tasks.

*Run the code below to download a convolutional network that has already been trained on ImageNet.*

In [3]:
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

If you are curious about the layers that make up ResNet50, you can print the network structure.

*Optionally run this code to print the network structure.*

In [None]:
print(model)

The next step is to make some changes to the network to prepare it for the new task. First, you will "freeze" the existing network parameters so that they do not change during subsequent training. The network will therefore continue to calculate the features it learned on ImageNet.

Second, you will replace the model's final "fully-connected" layer with three new layers. These new layers will learn to use the network's ImageNet-trained features for the new task of pneumonia detection.

*Run the code below to freeze the existing network parameters and replace the existing ImageNet-trained output layer with new randomly initialized layers.*

In [4]:
# freeze existing network parameters
for parameter in model.parameters():
    parameter.requires_grad = False

# replace final fully-connected layer with three new layers
model.fc = nn.Sequential(
               nn.Linear(2048, 64),
               nn.ReLU(inplace=True),
               nn.Linear(64, 64),
               nn.ReLU(inplace=True),
               nn.Linear(64, 2))

model.to(device);
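
As a rough check on how small the trainable part of the network now is, the number of parameters in the three new layers can be worked out by hand. The sketch below is plain Python arithmetic rather than PyTorch code. `linear_params` is a hypothetical helper written for this illustration, and it assumes each `nn.Linear` layer includes a bias term, which is the PyTorch default.

```python
def linear_params(in_features, out_features):
    """Parameters in a fully-connected layer: one weight per
    input-output pair, plus one bias per output (assumed present)."""
    return in_features * out_features + out_features

# layer sizes copied from the replacement layers defined above
head_layers = [(2048, 64), (64, 64), (64, 2)]

per_layer = [linear_params(n_in, n_out) for n_in, n_out in head_layers]
total = sum(per_layer)

print(per_layer)  # [131136, 4160, 130]
print(total)      # 135426
```

Only these roughly 135 thousand parameters are updated during training. The frozen part of ResNet50 holds more than twenty million parameters, which is why training the new layers alone is comparatively quick.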

The network is now ready to learn pneumonia detection, but it needs a labelled dataset for this purpose. You will use the data from this paper:

Kermany, D. S., Goldbaum, M., Cai, W., Valentim, C. C., Liang, H., Baxter, S. L., ... & Zhang, K. (2018). Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell, 172(5), 1122-1131.

The dataset is available from https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia. Downloading it, moving it to this server, and unzipping it would normally involve some manual steps. However, the dataset has been copied to a convenient location so that the code below can perform these steps automatically. This is possible because the paper's authors made the dataset available under a Creative Commons License. This copy of the dataset has not been altered.

*Run the code below to download the dataset to this server and unzip it.*

In [5]:
!wget bptripp.com/Kermany-Chest-XRay-Data.zip
!unzip -q Kermany-Chest-XRay-Data.zip

--2023-09-25 18:58:23--  http://bptripp.com/Kermany-Chest-XRay-Data.zip
Resolving bptripp.com (bptripp.com)... 64.90.50.171
Connecting to bptripp.com (bptripp.com)|64.90.50.171|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1237670572 (1.2G) [application/zip]
Saving to: ‘Kermany-Chest-XRay-Data.zip.1’


2023-09-25 18:58:41 (66.9 MB/s) - ‘Kermany-Chest-XRay-Data.zip.1’ saved [1237670572/1237670572]

replace __MACOSX/._chest_xray? [y]es, [n]o, [A]ll, [N]one, [r]ename: A


As with the digit recognition network, the next step is to create data loaders. The code is slightly different in this case, in part because this dataset is not built into PyTorch.

*Run the code below to create dataloaders that provide batches of training and validation examples.*

In [6]:
# Normalize images in the same way images are normalized for ImageNet
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# This "transform" object will perform a few simple operations to turn images
# into suitable network inputs. One operation is to tilt images a little bit at random
# each time they are presented to the network. This is a kind of dataset
# augmentation that helps the network get more out of limited training data.
transform = transforms.Compose([
        transforms.Resize((224,224)),
        transforms.RandomAffine(8, shear=8),
        transforms.ToTensor(),
        normalize
    ])

input_path = 'chest_xray/'
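train_set = datasets.ImageFolder(input_path + 'train', transform)
# the dataset's 'test' folder serves as the validation set here; note that the
# same transform, including the random tilt, is applied to both sets
validation_set = datasets.ImageFolder(input_path + 'test', transform)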

train_loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)
validation_loader = torch.utils.data.DataLoader(validation_set, batch_size=32, shuffle=False)
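
To make the normalization step concrete, the sketch below applies the same per-channel formula, (value - mean) / std, to a single pixel in plain Python. `normalize_pixel` is a hypothetical helper written for illustration and is not part of torchvision. It assumes pixel values have already been scaled to the range 0 to 1, which is what `transforms.ToTensor()` does for 8-bit images.

```python
# per-channel statistics copied from the normalize transform above
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Subtract the channel mean and divide by the channel standard deviation."""
    return [(value - mean) / std
            for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)]

# a mid-grey pixel has the same value in all three channels
mid_grey = [0.5, 0.5, 0.5]
normalized = normalize_pixel(mid_grey)

print([round(v, 3) for v in normalized])  # [0.066, 0.196, 0.418]
```

Because `ImageFolder` loads each image as RGB by default, a greyscale radiograph has three equal channels before normalization and slightly different values in each channel afterwards, as the mid-grey example shows.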

Now, as in the MNIST example, you will create a function to process all the batches in a given dataset. There are various ways to organize such code, but for simplicity, the code below is essentially the same as the corresponding code in the MNIST example.

In [7]:
def process_dataset(model, loader, optimizer=None):
  """
  model: the deep network
  loader: the dataloader of the dataset that is to be processed
  optimizer: an object that updates the network's parameters (None for validation)
  """

  # create lists of losses and accuracies for each batch in the dataset
  losses = []
  accuracies = []

  for batch_num, (inputs, targets) in enumerate(loader): # loop through all the batches
    if batch_num % 2 == 1:
      print('.', end='') # print a dot every other batch (as a progress bar)

    inputs = inputs.to(device) # move inputs to the GPU
    targets = targets.to(device) # move labels to the GPU
    outputs = model(inputs) # run the inputs through the network
    loss = nn.CrossEntropyLoss()(outputs, targets) # calculate the loss

    if optimizer is not None:
      optimizer.zero_grad() # delete any gradients from previous steps
      loss.backward() # perform backpropagation to calculate new gradients
      optimizer.step() # perform gradient descent to improve parameters

    # calculate the fraction of predictions in this batch that were correct
    predictions = outputs.argmax(dim=1) # the predicted category is the output neuron with the highest value
    fraction_correct = torch.sum(predictions == targets).item() / len(predictions)

    # add this batch's loss and fraction-correct to the lists
    losses.append(loss.item())
    accuracies.append(fraction_correct)

  # print the average loss and accuracy over the whole dataset
  print('{} loss: {:.4f} accuracy: {:.4f}'.format(
      'Validation' if optimizer is None else 'Training',
      np.mean(losses),
      np.mean(accuracies)))
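
To show what the loss and accuracy calculations in this function amount to, the sketch below reproduces them for a tiny batch in plain Python, without PyTorch. The network outputs and targets are made-up numbers for illustration only, and `cross_entropy` is a hypothetical helper. The sketch assumes the loss is averaged over the batch, which is the default behaviour of `nn.CrossEntropyLoss`.

```python
import math

def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    max_logit = max(logits)  # subtracting the maximum keeps exp() well behaved
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]

# hypothetical outputs for a batch of three images, two output neurons each
batch_outputs = [[2.0, 0.0], [0.5, 1.5], [1.0, 0.0]]
batch_targets = [0, 1, 1]

# average the per-example losses over the batch
losses = [cross_entropy(o, t) for o, t in zip(batch_outputs, batch_targets)]
batch_loss = sum(losses) / len(losses)

# the prediction is the index of the largest output, as with argmax(dim=1)
predictions = [max(range(len(o)), key=lambda k: o[k]) for o in batch_outputs]
fraction_correct = sum(p == t for p, t in zip(predictions, batch_targets)) / len(predictions)

print(predictions)                 # [0, 1, 0]
print(round(batch_loss, 4))        # 0.5845
print(round(fraction_correct, 4))  # 0.6667
```

The third example is misclassified, so it contributes the largest loss (about 1.31) and pulls the batch accuracy down to two out of three.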

Now you are ready to train the network. As in the MNIST example, the code below runs several epochs of training, and prints the loss and accuracy on the validation dataset after each epoch. This is a more difficult problem than digit recognition, so the accuracy will not be as high after a few minutes of training, and accuracy may not increase monotonically.

*Run the code below to partially train the network.*

In [8]:
# Adam is a variant of gradient descent that adapts the step size for each parameter;
# only the parameters of the new layers are passed to it
optimizer = optim.Adam(model.fc.parameters(), lr=0.005, weight_decay=1e-6)

num_epochs = 5
for epoch in range(num_epochs):
    print('Epoch {}/{}'.format(epoch+1, num_epochs))

    model.train() # put model in training mode
    process_dataset(model, train_loader, optimizer=optimizer)

    model.eval() # evaluation mode: layers such as dropout and batch normalization switch to their inference behaviour
    process_dataset(model, validation_loader)


Epoch 1/5
..................................................................................Training loss: 0.2616 accuracy: 0.8929
..........Validation loss: 0.7176 accuracy: 0.7484
Epoch 2/5
..................................................................................Training loss: 0.1575 accuracy: 0.9354
..........Validation loss: 0.5707 accuracy: 0.8063
Epoch 3/5
..................................................................................Training loss: 0.1548 accuracy: 0.9396
..........Validation loss: 0.3417 accuracy: 0.8578
Epoch 4/5
..................................................................................Training loss: 0.1624 accuracy: 0.9333
..........Validation loss: 0.6229 accuracy: 0.7906
Epoch 5/5
..................................................................................Training loss: 0.1550 accuracy: 0.9421
..........Validation loss: 0.4000 accuracy: 0.8625
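
The validation accuracy in the output above rises and falls from epoch to epoch. As a minimal sketch of how one might keep track of this, the code below copies the five printed validation accuracies into a list, finds the best epoch, and lists the epochs at which accuracy dropped. The variable names are hypothetical, and the numbers come from this particular run, so a re-run will give different values.

```python
# validation accuracies copied from the output above, one per epoch
validation_accuracies = [0.7484, 0.8063, 0.8578, 0.7906, 0.8625]

# epoch (counting from 1) with the highest validation accuracy
best_index = max(range(len(validation_accuracies)),
                 key=lambda i: validation_accuracies[i])
best_epoch = best_index + 1

# epochs at which validation accuracy was lower than in the previous epoch
drops = [i + 1 for i in range(1, len(validation_accuracies))
         if validation_accuracies[i] < validation_accuracies[i - 1]]

print(best_epoch)  # 5
print(drops)       # [4]
```

In a longer training run, the same bookkeeping could be used to decide which epoch's parameters to keep. That would require `process_dataset` to return its average accuracy rather than only printing it, which the version above does not do.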
