# Medical Image Analysis: Neural Networks

Follow these instructions (https://github.com/antoniosehk/keras-tensorflow-windows-installation) to install everything needed to run TensorFlow and Keras. Check whether your computer has an NVIDIA-compatible GPU. If it does not, or you have no GPU at all, there are two options:

1) Use the CPU. The project is not very computationally expensive.

2) Use Colab: follow the first lecture and see https://colab.research.google.com/notebooks/gpu.ipynb

# Exercise

## Database

There is a database of chest X-rays with lung segmentations. Use imread from matplotlib.pyplot to read the X-rays and lung masks as follows:

In [21]:
import numpy as np
import matplotlib.pyplot as plt
import glob
import os
import cv2

In [2]:
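# read the GIF masks as numpy arrays (imread comes from matplotlib.pyplot, imported as plt)
leftLung = plt.imread("scratch/fold1/masks/left lung/JPCLN001.gif")
rightLung = plt.imread("scratch/fold1/masks/right lung/JPCLN001.gif")
# the two lungs do not overlap, so their sum is the mask of both lungs
bothLungs = leftLung + rightLung
#im2 = plt.imread(".....bmp")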

Exercise 1.1. Read the assignment data and separate it into training and testing parts.

All raw X-ray images are stored at "Data\scratch\images".
The training and testing masks of the database are stored in two folders named "Data\scratch\fold1\masks\" and "Data\scratch\fold2\masks\", respectively. The organs of interest are "left lung" and "right lung".

Generate 4 numpy arrays mask_array_training, im_array_training, mask_array_testing, im_array_testing:

    0) n_training_cases = len(glob.glob(r'...Data\scratch\fold1\masks\left lung\*'))
       n_testing_cases = len(glob.glob(r'...Data\scratch\fold2\masks\left lung\*'))
    
    
    1) mask_array_training # should be of size (n_training_cases x 256 x 256) generated from images in folders "Data\scratch\fold1\masks\left lung" and "Data\scratch\fold1\masks\right lung".
    
    2) im_array_training # should be of size (n_training_cases x 256 x 256), generated from images in folder "Data\scratch\images". To be sure the images match the masks, take the case name from every file in "Data\scratch\fold1\masks\left lung" and find the matching file in "Data\scratch\images"
    
        a) files = glob.glob(r'...Data\scratch\fold1\masks\left lung\*')
        b) fileName = os.path.basename(files[i])
        c) fileNameWithoutExtension = os.path.splitext(fileName)[0]
    
    3) mask_array_testing # should be of size (n_testing_cases x 256 x 256) generated from images in folder "Data\scratch\fold2\masks\left lung" and "Data\scratch\fold2\masks\right lung".
    
    4) im_array_testing # should be of size (n_testing_cases x 256 x 256) generated from images in folder "Data\scratch\images" to match testing masks in "Data\scratch\fold2\masks\left lung".

In [29]:
n_training_cases = len(glob.glob('./scratch/fold1/masks/left lung/*'))
n_testing_cases = len(glob.glob('./scratch/fold2/masks/left lung/*'))
print(n_training_cases)
print(n_testing_cases)

# Training masks: sort both listings so left and right files pair up by case name
filesLeft = sorted(glob.glob('./scratch/fold1/masks/left lung/*'))
filesRight = sorted(glob.glob('./scratch/fold1/masks/right lung/*'))
mask_array_training = np.array([cv2.resize(plt.imread(L)+plt.imread(R),(256,256)) for L,R in zip(filesLeft,filesRight)])
print(mask_array_training.shape)

# Training images: look up each X-ray by the case name of its left-lung mask
files = filesLeft
im_array_training = np.array([cv2.cvtColor(cv2.imread('scratch/images/'+os.path.splitext(os.path.basename(file))[0]+'.bmp'),cv2.COLOR_BGR2GRAY) for file in files])
print(im_array_training.shape)

# Test masks
filesLeft = sorted(glob.glob('./scratch/fold2/masks/left lung/*'))
filesRight = sorted(glob.glob('./scratch/fold2/masks/right lung/*'))
mask_array_testing = np.array([cv2.resize(plt.imread(L)+plt.imread(R),(256,256)) for L,R in zip(filesLeft,filesRight)])
print(mask_array_testing.shape)

# Test images (stored in im_array_testing, not im_array_training)
files = filesLeft
im_array_testing = np.array([cv2.cvtColor(cv2.imread('scratch/images/'+os.path.splitext(os.path.basename(file))[0]+'.bmp'),cv2.COLOR_BGR2GRAY) for file in files])
print(im_array_testing.shape)

124
123
(124, 256, 256)
(124, 256, 256)
(123, 256, 256)
(123, 256, 256)


# Train Unet

Exercise 1.2. Adapt some existing implementation of Unet for the segmentation of lung fields. Train the Unet on fold1 of the database.

Check this link for Unet implementation https://github.com/zhixuhao/unet

Here is the implementation of the Unet: https://github.com/zhixuhao/unet/blob/master/model.py
Make the learning rate (lr) a parameter of unet() so that it can be set when the model is created.

Here is an example of how to visualize the results of network training. You will need such plots for your report: https://machinelearningmastery.com/display-deep-learning-model-training-history-in-keras/
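As a minimal sketch of such a plot, the helper below draws the training and validation loss per epoch. The function name `plot_loss` is hypothetical; it assumes the dictionary stored in the `history` attribute of the object returned by Keras `model.fit`, which holds a `loss` list and, when a validation split is used, a `val_loss` list.

```python
import matplotlib.pyplot as plt


def plot_loss(history_dict):
    """Plot per-epoch training loss and, if present, validation loss."""
    fig, ax = plt.subplots()
    ax.plot(history_dict["loss"], label="training loss")
    # val_loss only exists when validation data or a validation split was used
    if "val_loss" in history_dict:
        ax.plot(history_dict["val_loss"], label="validation loss")
    ax.set_xlabel("epoch")
    ax.set_ylabel("loss")
    ax.legend()
    return fig
```

Typical use would be `history = model.fit(...)` followed by `plot_loss(history.history)`.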

Some implementation hints for training Unets:

    0) par_batch_size = 10
       par_epochs = 100
       par_validation_split = 0.15
       par_learning_rate = 0.0001
       # play with these parameters to find best combination

    1) model = unet(input_size = (256, 256, 1), lr = par_learning_rate) # generate Unet
        
    2) validationSplit = 0.15
       
    3) im_array_training = np.expand_dims(np.asarray(im_array_training), axis = -1)
       mask_array_training = np.expand_dims(np.asarray(mask_array_training, dtype = np.float32), axis = -1)
       # this adds an explicit dimension of size one at the end of the training data. You can think of it as the color channel of the image: one channel per input example. (axis = 4 is out of range for a 3-D array, and np.float no longer exists in recent NumPy versions.)
        
    4) model.fit(im_array_training, mask_array_training, batch_size = par_batch_size, epochs = par_epochs, validation_split = par_validation_split)
    
    5) model.save('../resultUnet.hdf5') # save model somewhere on your disk
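The array preparation in step 3 can be sketched as a small helper. The name `add_channel_axis` is hypothetical, and the rescaling of the masks is an assumption not stated above: masks read from GIF files are often stored as 0/255, while a network with a sigmoid output is usually trained against 0/1 targets.

```python
import numpy as np


def add_channel_axis(images, masks):
    """Return float32 arrays of shape (n, H, W, 1); masks are scaled to [0, 1]."""
    # np.newaxis appends a trailing axis of size one (the single grey channel)
    x = np.asarray(images, dtype=np.float32)[..., np.newaxis]
    y = np.asarray(masks, dtype=np.float32)[..., np.newaxis]
    # Assumption: mask values above 1 mean a 0/255-style encoding, so rescale
    if y.max() > 1:
        y = y / y.max()
    return x, y
```

For arrays of shape (n_training_cases, 256, 256) this returns arrays of shape (n_training_cases, 256, 256, 1), which matches `input_size = (256, 256, 1)`.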

# Test Unet

Exercise 1.3. Test the Unet performance on fold2. Use Dice coefficient to evaluate the Unet performance.

Some implementation hints for testing Unet:
    
    0) model = load_model('../resultUnet.hdf5') # load model (load_model is provided by keras.models)
    
    1) results = model.predict(np.expand_dims(np.asarray(im_array_testing), axis = -1), batch_size = 5)
    
    2) results[i, :, :, 0] # this is the result of segmentation corresponding to im_array_testing[i]
    
    3) compute_dice(results[i, :, :, 0], mask_array_testing[i]) # compute the Dice coefficient for the i-th testing image (compute_dice is a function you write yourself)
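The Dice coefficient of two binary segmentations A and B is 2|A ∩ B| / (|A| + |B|). One possible `compute_dice` is sketched below. The 0.5 threshold on the network output and the rule that every non-zero mask pixel counts as lung are assumptions, as is returning 1.0 when both segmentations are empty.

```python
import numpy as np


def compute_dice(prediction, mask, threshold=0.5):
    """Dice overlap between a predicted probability map and a reference mask."""
    # Binarise the network output at the assumed threshold
    pred = np.asarray(prediction) > threshold
    # Treat every non-zero reference pixel as foreground (0/1 or 0/255 masks)
    truth = np.asarray(mask) > 0
    intersection = np.logical_and(pred, truth).sum()
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # both segmentations empty: defined here as perfect agreement
    return 2.0 * intersection / total
```

Averaging this value over all testing cases gives a single summary number for the report.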

# Tasks for the report

1) Train and test Unet for segmentation of lung fields.

2) Select appropriate values for par_batch_size, par_epochs, par_validation_split and par_learning_rate. Explain what these parameters mean and how you chose them
   (see https://machinelearningmastery.com/difference-between-a-batch-and-an-epoch/)

3) Plot loss functions for training and validation

4) Plot results for cases JPCLN016, JPCLN048, JPCLN058. Explain why you think the results look like this.
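For task 2, it can help to work out what the parameters imply in numbers. The sketch below uses a hypothetical helper, `split_sizes`, and assumes the Keras behaviour in which `validation_split` holds out the last fraction of the samples, with the training count rounded down.

```python
import math


def split_sizes(n_cases, validation_split, batch_size):
    """Return (training samples, validation samples, weight updates per epoch)."""
    # The training part is rounded down; the remainder is used for validation
    n_train = int(n_cases * (1.0 - validation_split))
    n_val = n_cases - n_train
    # One weight update per batch; the last batch of an epoch may be smaller
    steps_per_epoch = math.ceil(n_train / batch_size)
    return n_train, n_val, steps_per_epoch


# 124 training cases with the hint values par_validation_split = 0.15, par_batch_size = 10
print(split_sizes(124, 0.15, 10))  # prints (105, 19, 11)
```

With par_epochs = 100, these values would give 11 × 100 = 1100 weight updates in total.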