# Abstract

Pneumonia is an inflammation of the lung tissue caused by an infection that can lead to serious health problems and even death. To help doctors analyze X-ray images to diagnose pneumonia, Deep Learning has become increasingly important. However, the accuracy of neural networks is a crucial aspect. In our work, we took an existing project on pneumonia detection using Deep Learning and applied various methods to improve the performance of the model.

The [original project](https://www.kaggle.com/amyjang/tensorflow-pneumonia-classification-on-x-rays) used a database with a total of 5856 X-ray images, which were first downsized to [180 180] pixels and pixel values rescaled to [0 1]. 624 of these images were reserved for testing and the remaining images were split into training and validation sets in a ratio of 80%/20%. Since the dataset contains more images of pneumonia than healthy radiographs, the data must be balanced for training. The weights of the pneumonia images and the normal images were adjusted so that a normal image has a higher impact on the neural network. A new convolutional neural network was built from scratch, consisting mainly of convolutional blocks and dense layers. The training was performed in two steps: the first with 25 epochs with a fixed learning rate and then a further training with 100 epochs, with early stopping and exponential decay of the learning rate. The TPU accelerator was used to run the code.

In our project, we tried to solve some of these problems and improve the performance. First, overfitting was reduced through data augmentation. Starting from the given images, new images were created at every epoch using transformations that zoom, rotate and shift. The *ImageDataGenerator* class was used to apply the transformations, which unfortunately required a switch to the GPU for training and significantly increased the training time. With data augmentation alone, we were already able to increase the performance of the model by roughly 10%.
The next step was to implement transfer learning using three different models: VGG16, Xception and ResNet. All three are CNN models that have been trained on the ImageNet database and whose trained weights are available in Keras. To use these models in our project, we took only the base of each imported model and froze it at first. We built new blocks on top of the base and trained the model to fit the new weights. Finally, we unfroze the base and fine-tuned the entire model.
For all three models, we were again able to improve performance by a few percent, resulting in both accuracy and precision of about 0.93. 

In this project, we have shown the importance of having a large dataset for training to achieve more independent learning and avoid overfitting, and that data augmentation can be a useful source of unseen images when there is not enough original material. Neural networks trained for similar tasks can be adapted to a new task with great performance, compensating for the lack of data and significantly reducing training time. Overall, Deep Learning has proven to be a useful option to quickly analyse X-ray images and assist doctors in diagnosis.



# Introduction

Deep neural networks are now the state-of-the-art machine learning models across a variety of areas. In the medical domain, they have huge potential for medical imaging technology, medical data analysis, medical diagnostics and healthcare in general.

Implementing clinical decision support algorithms for medical imaging helps expedite the diagnosis of treatable conditions such as pneumonia, thereby facilitating earlier treatment and improving clinical outcomes.

This notebook shows how to use a GPU to build and train convolutional neural networks that predict whether an X-ray scan shows the presence of pneumonia, and how data augmentation and transfer learning can be used as additional tools to reach higher accuracy.


# 1. Setup

Here we import the necessary libraries.

In this notebook we use GPUs instead of TPUs so that we can perform data augmentation with *ImageDataGenerator*. As a result, the code takes much longer to run than the original notebook.


In [None]:
import re
import os
import numpy as np
import pandas as pd
import tensorflow as tf
import seaborn as sns
from tensorflow import keras
from kaggle_datasets import KaggleDatasets
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ReduceLROnPlateau
from collections import Counter
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.metrics import ConfusionMatrixDisplay

!pip install visualkeras #https://github.com/paulgavrikov/visualkeras
import visualkeras

try: #Part of original notebook to set TPU
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
    print('Device:', tpu.master())
    tf.config.experimental_connect_to_cluster(tpu)
    tf.tpu.experimental.initialize_tpu_system(tpu)
    strategy = tf.distribute.experimental.TPUStrategy(tpu)
except Exception: #no TPU available: fall back to the default strategy
    strategy = tf.distribute.get_strategy()
print('Number of replicas:', strategy.num_replicas_in_sync)
    
print(tf.__version__)

We set some hyperparameters such as the image size, the batch size and the number of epochs used to train the following models. 

Images are resized to the defined IMAGE_SIZE. The size is set to [100,100]: with it we get performance similar to models trained on larger (higher-resolution) images, but with a shorter computation time.

The number of EPOCHS is set to a high value; to limit the risk of overfitting, we add early stopping in the training part, which also saves computation time.

The BATCH_SIZE on GPU is set to 16 (*strategy.num_replicas_in_sync* = 1 for GPU). The batch size has to be increased or decreased according to the available computational resources and the model's performance. A size of 16 works well for our models: training is slower, but we run into no hardware problems and the models converge faster.

In [None]:
AUTOTUNE = tf.data.experimental.AUTOTUNE
BATCH_SIZE = 16 * strategy.num_replicas_in_sync #GPU:1 num_replicas
IMAGE_SIZE = [100, 100]
EPOCHS = 100 #early stops in model.fit-->need to save best weights

# 2. Loading data

The chest X-ray images that we use come from this [dataset](https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia). The dataset is organized into 3 folders (train, val, test), each containing two subfolders: one for "PNEUMONIA" and one for "NORMAL" cases. There are 5856 images in total (5216 + 16 + 624), saved in JPEG format, and two classes (Pneumonia/Normal), so our models perform a binary classification. The validation set consists of only 16 images, while the training set consists of 5216 images, so we create a new validation set to get a standard 80:20 division between the training set and the validation set. The test set contains 624 images (234 Normal and 390 Pneumonia); we keep it unchanged in order to compare the performance of our model and the original notebook on the same test set.


Get the directory paths for the training, validation and test sets

In [None]:
train_folder= '../input/chest-xray-pneumonia/chest_xray/train/'
val_folder = '../input/chest-xray-pneumonia/chest_xray/val/'
test_folder = '../input/chest-xray-pneumonia/chest_xray/test/'

Create two empty lists to store the image file paths and the labels (Pneumonia/Normal) for the images of the training and validation sets

In [None]:
filepath = [] #list of paths of images of train & validation
categories = [] #label

Load the image paths and labels of the training set

In [None]:
filenames = os.listdir(os.path.join(train_folder,'NORMAL'))
for filename in filenames:
        filepath.append(os.path.join(train_folder,'NORMAL',filename))
        categories.append("NORMAL") #0: Normal

filenames = os.listdir(os.path.join(train_folder,'PNEUMONIA'))
for filename in filenames:
        filepath.append(os.path.join(train_folder,'PNEUMONIA',filename))
        categories.append("PNEUMONIA") #1: Pneumonia

Load the image paths and labels of the validation set

In [None]:
filenames = os.listdir(os.path.join(val_folder,'NORMAL'))
for filename in filenames:
        filepath.append(os.path.join(val_folder,'NORMAL',filename))
        categories.append("NORMAL") #0: Normal
        
filenames = os.listdir(os.path.join(val_folder,'PNEUMONIA'))
for filename in filenames:
        filepath.append(os.path.join(val_folder,'PNEUMONIA',filename))
        categories.append("PNEUMONIA") #1: Pneumonia

Let's create a **dataframe containing all images of the training and validation sets**.
It consists of 3883 images labelled as "Pneumonia" and 1349 images labelled as "Normal"

In [None]:
df = pd.DataFrame({'filepath':filepath,'label':categories})
df.info() #call the method: a bare df.info displays nothing useful here
plot = sns.countplot(x ='label', data = df).set_title('Train+Validation')
print("Df Train-Vali: ",df.head())
print("")
print(df['label'].value_counts())

Now we recreate the training set and the validation set from the dataframe that we have created. Let's perform a **split: training set 80% - validation set 20%**. To do this we use the *train_test_split* function, taking care to set *stratify*: this is important so that the training set and validation set that we create have the same class distribution

In [None]:
train, vali = train_test_split(df, test_size=0.2,stratify=df['label'],random_state=6) #stratify to keep distribution

In [None]:
plot_train = sns.countplot(x ='label', data = train).set_title('Train 80%')
print('Train 80%: ')
print(train['label'].value_counts())

In [None]:
plot_vali = sns.countplot(x ='label', data = vali).set_title('Validation 20%')
print('Validation 20%: ')
print(vali['label'].value_counts())

As we can see, the new validation set consists of 1047 images and the new training set of 4185 images, and both have the same distribution between the two classes
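
To see why stratification gives both sets the same class ratio, the split can be approximated by taking 20% of each class separately. The sketch below is a simplified illustration in plain Python and not scikit-learn's actual allocation algorithm; the helper name `stratified_sizes` is hypothetical, and the class counts are the ones reported for the dataframe above.

```python
def stratified_sizes(class_counts, test_fraction):
    """Split each class separately so both subsets keep the class ratio.

    Simplified illustration: the per-class validation size is rounded,
    which is not exactly how scikit-learn allocates samples.
    """
    train, vali = {}, {}
    for label, count in class_counts.items():
        n_vali = round(count * test_fraction)
        vali[label] = n_vali
        train[label] = count - n_vali
    return train, vali


counts = {"NORMAL": 1349, "PNEUMONIA": 3883}  # counts reported above
train_sizes, vali_sizes = stratified_sizes(counts, 0.2)

for name, sizes in (("train", train_sizes), ("vali", vali_sizes)):
    total = sum(sizes.values())
    share = sizes["PNEUMONIA"] / total
    print(f"{name}: {sizes} total={total} pneumonia share={share:.3f}")
```

Under these assumptions the totals match the 4185/1047 division, and the share of pneumonia images is practically identical in both subsets.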

# 3. Data Augmentation

Let's perform [data augmentation](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator) on the training set, to reduce the overfitting that affected the previous notebook. To augment our images we use the *ImageDataGenerator* class: it applies random transformations to every image of the training set at each epoch (so during the training phase the model sees new variations of the images at each epoch)

First of all, load the images of the sets (use *ImageDataGenerator* without specifying any transformation)

In [None]:
load = ImageDataGenerator() #To load images

The images are imported with pixel values in [0 255] and then resized to the IMAGE_SIZE specified before. Be careful to set shuffle=False for the test set: this keeps the order of the predictions aligned with the order of the true labels, which is required to compute a correct confusion matrix. The images are imported in batches of size BATCH_SIZE.

In [None]:
#Load images [0 255] and resize images

train_set = load.flow_from_dataframe(train,
                                        x_col = 'filepath',
                                        y_col = 'label',
                                        target_size = IMAGE_SIZE,
                                        batch_size = BATCH_SIZE,
                                        class_mode = 'binary')

vali_set = load.flow_from_dataframe(vali,
                                        x_col = 'filepath',
                                        y_col = 'label',
                                        target_size = IMAGE_SIZE,
                                        batch_size = BATCH_SIZE,
                                        class_mode = 'binary')

test_set = load.flow_from_directory(test_folder,
                                        target_size = IMAGE_SIZE,
                                        batch_size = BATCH_SIZE,
                                        shuffle = False, #for confusionMatrix
                                        class_mode = 'binary')

#Check classes in output
print("train: ",train_set.class_indices)
print("vali: ",vali_set.class_indices)
print("test: ",test_set.class_indices)
#it returns a DataFrameIterator yielding tuples of (x, y) where x is a numpy array containing a batch of images with shape (batch_size, *target_size, channels) and y is a numpy array of corresponding labels.
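
The need for an unshuffled test iterator can be illustrated without any model. The following plain-Python sketch uses toy labels and a hypothetical perfect classifier (both are assumptions made for illustration only): when the images are visited in a different order than the stored labels, even perfect predictions appear wrong.

```python
def accuracy(predictions, labels):
    """Fraction of positions where prediction and label agree."""
    matches = sum(int(p == t) for p, t in zip(predictions, labels))
    return matches / len(labels)


# Toy labels in file order, standing in for what test_set.labels stores
labels = [0, 0, 0, 1, 1, 1, 1, 1]

# Hypothetical perfect model: it outputs the true label of each image it sees
ordered_predictions = [labels[i] for i in range(len(labels))]

# Same model, but the images arrive in a different order (here: reversed),
# standing in for what a shuffling iterator would do
visit_order = list(reversed(range(len(labels))))
reordered_predictions = [labels[i] for i in visit_order]

print("in order :", accuracy(ordered_predictions, labels))
print("reordered:", accuracy(reordered_predictions, labels))
```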

In [None]:
VAL_IMG_COUNT = vali_set.samples
TEST_IMG_COUNT = test_set.samples

In [None]:
print('Train: ',Counter(train_set.classes))
print('Validation: ',Counter(vali_set.classes))
print('Test: ',Counter(test_set.classes))

In [None]:
#Meaning of output of ImageDataGenerator

#print(len(train_set)) #number of batches in training set (indexed from 0)
#print(train_set[0][0][1]) #second image of first batch
#print((train_set[0][1])) #labels of first batch
#print(train_set[0][0][1].shape) #shape of one image
#train_set[0][0][1] #float32

Let's plot some images of the first batch of the training set

In [None]:
images, labels = train_set[0] #load the first batch once instead of at every access
plt.figure(figsize=(10,10))
plt.suptitle('Original training set')
for n in range(min(16, len(images))):
    ax = plt.subplot(4,4,n+1)
    plt.imshow(images[n].astype(np.uint8))  #uint8 to plot
    if labels[n]:
        plt.title("PNEUMONIA")
    else:
        plt.title("NORMAL")
    plt.axis("off")

Now let's perform the **data augmentation** only on the training set. To do this we need to specify the random transformations that we want to apply and reload all the images of the training set

We define only random geometric transformations that are relevant for our images

In [None]:
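augment_gen = ImageDataGenerator(zoom_range = 0.2, #zoom factor in [0.8, 1.2]
                                   rotation_range=0.2, #in degrees: 0.2 is a very small rotation
                                   width_shift_range=0.1, #fraction of the image width
                                   height_shift_range=0.1, #fraction of the image height
                                   ) #to perform augmentation on training set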
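
To make these settings concrete, the sketch below translates them into ranges for a 100x100 image. It is a plain-Python illustration based on how *ImageDataGenerator* documents its parameters (a float zoom range `z` means a factor in `[1 - z, 1 + z]`, rotation is given in degrees, shift values below 1 are fractions of the image size); the helper name `augmentation_ranges` is hypothetical.

```python
def augmentation_ranges(image_size, zoom_range, rotation_range,
                        width_shift_range, height_shift_range):
    """Translate ImageDataGenerator-style settings into concrete ranges."""
    height, width = image_size
    return {
        # A float zoom_range z samples a zoom factor in [1 - z, 1 + z]
        "zoom_factor": (1 - zoom_range, 1 + zoom_range),
        # rotation_range is expressed in degrees
        "rotation_degrees": (-rotation_range, rotation_range),
        # Shift ranges below 1 are fractions of the image width/height
        "width_shift_px": (-width_shift_range * width,
                           width_shift_range * width),
        "height_shift_px": (-height_shift_range * height,
                            height_shift_range * height),
    }


ranges = augmentation_ranges((100, 100), zoom_range=0.2, rotation_range=0.2,
                             width_shift_range=0.1, height_shift_range=0.1)
for name, (low, high) in ranges.items():
    print(f"{name}: from {low:g} to {high:g}")
```

Read this way, the shifts move the image by up to about 10 pixels, while the rotation stays within a fraction of a degree; a larger `rotation_range` would be needed for clearly visible rotations.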

In [None]:
aug_train_set =  augment_gen.flow_from_dataframe(train,
                                        x_col = 'filepath',
                                        y_col = 'label',
                                        target_size = IMAGE_SIZE,
                                        batch_size = BATCH_SIZE,
                                        class_mode = 'binary')
print("\n Aug_train: ",aug_train_set.class_indices)

Plot some images of the first batch of the training set after data augmentation

In [None]:
images, labels = aug_train_set[0] #load one augmented batch: images and labels must come from the same iterator
plt.figure(figsize=(10,10))
plt.suptitle('Augmented training set')
for n in range(min(16, len(images))):
    ax = plt.subplot(4,4,n+1)
    plt.imshow(images[n].astype(np.uint8))
    if labels[n]:
        plt.title("PNEUMONIA")
    else:
        plt.title("NORMAL")
    plt.axis("off")

But there's a problem in our dataset: class imbalance, which may harm the performance of our models. So let's **check the class imbalance in the training set**

In [None]:
counter = Counter(aug_train_set.classes)
#print(counter.items()) #0: Normal, 1: Pneumonia

counter['NORMAL'] = counter.pop(0)
counter['PNEUMONIA'] = counter.pop(1) #change name for visualization dictionary

print('Aug_train_set')
for i in counter:
    print(i,': ',counter[i])

plt.bar(counter.keys(),counter.values())
plt.title('Aug_Train_set')
plt.show()

Now we have to **correct the data imbalance** in the training set, because we have more images classified as pneumonia than normal.

In [None]:
COUNT_PNEUMONIA=counter['PNEUMONIA']
COUNT_NORMAL=counter['NORMAL']
TRAIN_IMG_COUNT=aug_train_set.samples

weight_for_0 = (1 / COUNT_NORMAL)*(TRAIN_IMG_COUNT)/2.0 
weight_for_1 = (1 / COUNT_PNEUMONIA)*(TRAIN_IMG_COUNT)/2.0

class_weight = {0: weight_for_0, 1: weight_for_1}

print('Weight for class 0: {:.2f}'.format(weight_for_0))
print('Weight for class 1: {:.2f}'.format(weight_for_1))

Since there are fewer images labelled as Normal than images labelled as Pneumonia, each Normal image is weighted more heavily, so that both classes contribute equally to the training loss and the CNN model is not biased towards the majority class.
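
The effect of this weighting can be checked with a small worked example. The sketch below uses the same formula as the cell above; the training counts (1079 Normal, 3106 Pneumonia) are assumed for illustration from an 80% stratified share of the dataframe, and the function name `balanced_class_weights` is hypothetical.

```python
import math


def balanced_class_weights(count_normal, count_pneumonia):
    """Weight each class inversely to its frequency (same formula as above)."""
    total = count_normal + count_pneumonia
    weight_for_0 = (1 / count_normal) * total / 2.0
    weight_for_1 = (1 / count_pneumonia) * total / 2.0
    return {0: weight_for_0, 1: weight_for_1}


# Assumed training counts, for illustration only
COUNT_NORMAL_EXAMPLE = 1079
COUNT_PNEUMONIA_EXAMPLE = 3106

weights = balanced_class_weights(COUNT_NORMAL_EXAMPLE, COUNT_PNEUMONIA_EXAMPLE)
print("Weight for class 0: {:.2f}".format(weights[0]))
print("Weight for class 1: {:.2f}".format(weights[1]))

# After weighting, both classes carry the same total weight
total_weight_normal = COUNT_NORMAL_EXAMPLE * weights[0]
total_weight_pneumonia = COUNT_PNEUMONIA_EXAMPLE * weights[1]
print(math.isclose(total_weight_normal, total_weight_pneumonia))
```

With these weights the summed weight of each class equals half the number of training images, so the total weight of the training set is unchanged.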

# 4. Build CNN of the [original notebook](https://www.kaggle.com/amyjang/tensorflow-pneumonia-classification-on-x-rays?scriptVersionId=39162263&cellId=33)

Let's use the CNN model defined in the original notebook and try to improve its performance. Our aim is to reduce the overfitting of the original model

In [None]:
keras.backend.clear_session() #clear any model

Create a convolution block and a dense layer block

In [None]:
def conv_block(filters):
    block = tf.keras.Sequential([
        tf.keras.layers.SeparableConv2D(filters, 3, activation='relu', padding='same'),
        tf.keras.layers.SeparableConv2D(filters, 3, activation='relu', padding='same'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.MaxPool2D()
    ]
    )
    
    return block

In [None]:
def dense_block(units, dropout_rate):
    block = tf.keras.Sequential([
        tf.keras.layers.Dense(units, activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dropout(dropout_rate)
    ])
    
    return block

Let's build the model sequentially, adding the layers one by one. We include a rescaling layer to scale pixel values to [0 1], because CNNs work better with small input values. The last layer is a Dense layer with a single output, which separates the Pneumonia and Normal cases (binary classification).

In [None]:
def build_model():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(IMAGE_SIZE[0], IMAGE_SIZE[1], 3)),
        
        tf.keras.layers.Rescaling(scale=1./255), #Scaling images between [0 1]
        
        tf.keras.layers.Conv2D(16, 3, activation='relu', padding='same'),
        tf.keras.layers.Conv2D(16, 3, activation='relu', padding='same'),
        tf.keras.layers.MaxPool2D(),
        
        conv_block(32),
        conv_block(64),
        
        conv_block(128),
        tf.keras.layers.Dropout(0.2), #Dropout layers to reduce overfitting
        
        conv_block(256),
        tf.keras.layers.Dropout(0.2),
        
        tf.keras.layers.Flatten(),
        dense_block(512, 0.7),
        dense_block(128, 0.5),
        dense_block(64, 0.3),
        
        tf.keras.layers.Dense(1, activation='sigmoid') #sigmoid activation function for the last layer, because it's a binary classification
    ])

    return model

# 5. Train & test CNN model (original notebook)

Train the model. Since there are only two possible labels for an image, we use the *binary_crossentropy* loss

In [None]:
with strategy.scope():
    model = build_model()

    METRICS = [
        'accuracy',
        tf.keras.metrics.Precision(name='precision'),
        tf.keras.metrics.Recall(name='recall')
    ]
    
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss='binary_crossentropy',
        metrics=METRICS
    )
    
model.summary()

In [None]:
visualkeras.layered_view(model, legend=True)

In [None]:
print("steps_per_epoch : ",TRAIN_IMG_COUNT // BATCH_SIZE)
print("validation_steps : ",VAL_IMG_COUNT // BATCH_SIZE)
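
Floor division drops the last partial batch. The arithmetic below shows the effect, assuming the split sizes reported earlier (4185 training and 1047 validation images) and a batch size of 16. The helper name `batch_steps` is hypothetical, and rounding up is shown only as an alternative design choice, not as what this notebook does.

```python
import math


def batch_steps(num_images, batch_size):
    """Return (full batches, batches needed to cover all images, leftover images)."""
    full_batches = num_images // batch_size
    covering_batches = math.ceil(num_images / batch_size)
    leftover = num_images - full_batches * batch_size
    return full_batches, covering_batches, leftover


train_steps = batch_steps(4185, 16)  # assumed training set size
vali_steps = batch_steps(1047, 16)   # assumed validation set size

print("train:", train_steps)
print("validation:", vali_steps)
```

With floor division, the leftover images are simply not visited in a given pass; using the rounded-up number of steps would cover every image at the cost of one smaller final batch.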

Train the model, reducing the learning rate when the validation loss stops improving and saving the best model

In [None]:
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint("CNN_xray_model.h5",
                                                    save_best_only=True,verbose=1)

early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=8,
                                                     restore_best_weights=True,verbose=1)#check val_loss

reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                              patience=5, min_lr=0.00001,verbose=1) #if val_loss not improved for patience_epochs-->reduce the learning rate
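
The sequence of learning rates that these settings can produce is easy to enumerate. The sketch below is a simplified illustration that models only the multiplicative reduction and the lower bound, not the patience or cooldown logic of the Keras callback; the function name `plateau_schedule` is hypothetical, and the starting value 1e-3 is the default learning rate of the Keras Adam optimizer.

```python
def plateau_schedule(initial_lr, factor, min_lr):
    """List the learning rates reached by successive plateau reductions."""
    rates = [initial_lr]
    while rates[-1] > min_lr:
        # Each reduction multiplies the rate by `factor`, never going below min_lr
        rates.append(max(rates[-1] * factor, min_lr))
    return rates


schedule = plateau_schedule(1e-3, factor=0.2, min_lr=1e-5)
for step, rate in enumerate(schedule):
    print(f"after {step} reductions: {rate:.1e}")
```

Under these assumptions the lower bound is reached after three reductions, each of which requires several consecutive epochs without improvement in `val_loss`.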

In [None]:
history = model.fit(
    aug_train_set,
    steps_per_epoch=TRAIN_IMG_COUNT // BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=vali_set,
    validation_steps=VAL_IMG_COUNT // BATCH_SIZE,
    class_weight=class_weight,
    callbacks=[checkpoint_cb,early_stopping_cb,reduce_lr]
)

In [None]:
fig, ax = plt.subplots(1, 4, figsize=(20, 3))
ax = ax.ravel()

for i, met in enumerate(['precision', 'recall', 'accuracy', 'loss']):
    ax[i].plot(history.history[met])
    ax[i].plot(history.history['val_' + met])
    ax[i].set_title('Model {}'.format(met))
    ax[i].set_xlabel('epochs')
    ax[i].set_ylabel(met)
    ax[i].legend(['train', 'val'])

In [None]:
#Load model for interactive session
#keras.backend.clear_session()
#model = keras.models.load_model('../input/models/CNN_xray_model.h5')

Evaluate the test set

In [None]:
loss, acc, prec, rec = model.evaluate(test_set)

**Confusion Matrix**

Create the confusion matrix and the report of the performance

In [None]:
#Predicted labels
predictions = model.predict(test_set)
predictions = predictions > 0.5

#True labels
orig = test_set.labels

In [None]:
cm = confusion_matrix(orig, predictions)
print('Confusion matrix:')
print(cm)
print('')

cr = classification_report(orig, predictions)
print('Classification report:')
print(cr)
print('')

In [None]:
disp = ConfusionMatrixDisplay(confusion_matrix=cm,display_labels=['NORMAL','PNEUMONIA'])
disp.plot()
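
The relation between the thresholded predictions, the cells of the confusion matrix and the reported metrics can be sketched in plain Python. The labels and probabilities below are made-up toy values, and `confusion_counts` is a hypothetical helper, not a scikit-learn function; the cell layout `[[TN, FP], [FN, TP]]` follows the convention of `confusion_matrix` for labels 0 and 1.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TN, FP, FN, TP after thresholding predicted probabilities."""
    tn = fp = fn = tp = 0
    for truth, prob in zip(y_true, y_prob):
        pred = int(prob > threshold)  # 1: Pneumonia, 0: Normal
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


# Toy data, for illustration only
y_true = [0, 0, 0, 1, 1, 1, 1, 1]
y_prob = [0.1, 0.7, 0.4, 0.9, 0.8, 0.3, 0.6, 0.95]

tn, fp, fn, tp = confusion_counts(y_true, y_prob)
precision = tp / (tp + fp)  # how many predicted Pneumonia cases are real
recall = tp / (tp + fn)     # how many real Pneumonia cases are found
accuracy = (tp + tn) / len(y_true)

print("confusion matrix:", [[tn, fp], [fn, tp]])
print("precision:", precision, "recall:", recall, "accuracy:", accuracy)
```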

**Plot some wrongly classified images**

In [None]:
err=[i for i, (x, y) in enumerate(zip(predictions, orig)) if x != y] #check where orig and predictions don't match (wrong classification)
#print(err)
print('Misclassified images: ',len(err))

plt.figure(figsize=(10,10))
plt.suptitle("WRONG PREDICTIONS")
for n in range(min(16, len(err))): #plot max 16 wrong predictions
    bn=err[n]//BATCH_SIZE #index of the batch containing the wrong prediction: test_set[bn]
    ib=err[n]-BATCH_SIZE*bn #index of the image inside that batch: test_set[bn][0][ib]
    #print('id_err:',err[n],' batch:',bn,' diff:',err[n]-BATCH_SIZE*bn)
    images, labels = test_set[bn] #load the batch once
    ax = plt.subplot(4,4,n+1)
    plt.imshow(images[ib].astype(np.uint8))  #uint8 to plot
    
    if labels[ib]: #real value
        plt.title("Real: PNEUMONIA")
    else:
        plt.title("Real: NORMAL")
    plt.axis("off")
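
The mapping from a flat sample index to a batch and a position inside that batch is a quotient and remainder, which Python provides as `divmod`. A minimal sketch, assuming an unshuffled iterator and a batch size of 16 (the helper name `locate_in_batches` is hypothetical):

```python
def locate_in_batches(sample_index, batch_size):
    """Return (batch index, position inside the batch) for a flat sample index."""
    return divmod(sample_index, batch_size)


# Sample 37 with batches of 16: batches 0 and 1 hold samples 0..31,
# so sample 37 sits in batch 2 at position 5
print(locate_in_batches(37, 16))

# Last image of a 624-image test set
print(locate_in_batches(623, 16))
```

This only holds when the iterator keeps the original file order, which is why the test set is loaded with `shuffle = False`.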

# 6. Transfer Learning
Let's try [transfer learning](https://keras.io/guides/transfer_learning/) to create a model suited to our task.
We will use the base of three different CNN models: VGG16, Xception and ResNet152V2

# 7. Transfer Learning VGG16 + Fine Tuning


In [None]:
keras.backend.clear_session() #delete the previous model (clear model variable)

Using **[VGG16](https://keras.io/api/applications/vgg/)**

In [None]:
base_model = keras.applications.VGG16(
    weights='imagenet',  # Load weights pre-trained on ImageNet
    input_shape=(IMAGE_SIZE[0], IMAGE_SIZE[1], 3),
    include_top=False)  # Do not include the ImageNet classifier at the top

**Freeze** the base of the model

In [None]:
base_model.trainable = False #freeze the base model

Create a new model on top and include the preprocessing step for the input images in the model

In [None]:
inputs = keras.layers.Input(shape=(IMAGE_SIZE[0], IMAGE_SIZE[1], 3))

#Preprocess for VGG16
x = tf.keras.applications.vgg16.preprocess_input(inputs)

# We make sure that the base_model is running in inference mode here, passing `training=False`
x = base_model(x, training=False) #inference mode; the weights stay fixed because base_model.trainable = False

# Convert features of shape `base_model.output_shape[1:]` to vectors
x = keras.layers.GlobalAveragePooling2D()(x)

x = keras.layers.Dense(128, activation='relu')(x)
x = keras.layers.Dropout(0.2)(x)  # Regularize with dropout

# A Dense classifier with a single unit (binary classification)
outputs = keras.layers.Dense(1, activation='sigmoid')(x)
model = keras.Model(inputs, outputs)

model.summary()

Define an early stop to avoid overfitting and use a decreasing learning rate

In [None]:
early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=6,
                                                     restore_best_weights=True,verbose=1) #check val_loss

reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                              patience=3, min_lr=1e-5 ,verbose=1) #min_lr as input for fine tuning next

In [None]:
model.compile(
    optimizer=keras.optimizers.Adam(),
    loss='binary_crossentropy',
    metrics=['accuracy']
)


history = model.fit(
    aug_train_set,
    steps_per_epoch=TRAIN_IMG_COUNT // BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=vali_set,
    validation_steps=VAL_IMG_COUNT // BATCH_SIZE,
    class_weight=class_weight,
    callbacks=[early_stopping_cb, reduce_lr]
)

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(20, 3))
ax = ax.ravel()

for i, met in enumerate(['accuracy', 'loss']):
    ax[i].plot(history.history[met])
    ax[i].plot(history.history['val_' + met])
    ax[i].set_title('Model {}'.format(met))
    ax[i].set_xlabel('epochs')
    ax[i].set_ylabel(met)
    ax[i].legend(['train', 'val'])

Perform **FINE TUNING : UNFREEZE ALL LAYERS**

Once your model has converged on the new data, unfreeze all or part of the base model and retrain the whole model end-to-end with a very low learning rate.

In [None]:
# Unfreeze the base model
base_model.trainable = True

model.summary()

model.compile(optimizer=keras.optimizers.Adam(1e-5), #Very low learning rate
              loss='binary_crossentropy',
              metrics=['accuracy'])

Save the best model

In [None]:
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint("VGG16_xray_model.h5",
                                                    save_best_only=True,verbose=1)

early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=5,
                                                     restore_best_weights=True,verbose=1) #check val_loss by default for the early stop (check if improve)

In [None]:
#Be careful to stop before overfitting (check the validation_loss)
history = model.fit(
    aug_train_set,
    steps_per_epoch=TRAIN_IMG_COUNT // BATCH_SIZE,
    epochs=100,
    validation_data=vali_set,
    validation_steps=VAL_IMG_COUNT // BATCH_SIZE,
    class_weight=class_weight,
    callbacks=[checkpoint_cb,early_stopping_cb]
)

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(20, 3))
ax = ax.ravel()

for i, met in enumerate(['accuracy', 'loss']):
    ax[i].plot(history.history[met])
    ax[i].plot(history.history['val_' + met])
    ax[i].set_title('Model {}'.format(met))
    ax[i].set_xlabel('epochs')
    ax[i].set_ylabel(met)
    ax[i].legend(['train', 'val'])

In [None]:
#Load model for interactive session
#keras.backend.clear_session()
#model = keras.models.load_model('../input/models/VGG16_xray_model.h5')

Evaluate the test set

In [None]:
loss, acc = model.evaluate(test_set)

**Confusion Matrix**

Create the confusion matrix and the report of the performance

In [None]:
#Predicted labels
predictions = model.predict(test_set)
predictions = predictions > 0.5

#True labels
orig = test_set.labels

In [None]:
cm = confusion_matrix(orig, predictions)
print('Confusion matrix:')
print(cm)
print('')

cr = classification_report(orig, predictions)
print('Classification report:')
print(cr)
print('')

In [None]:
disp = ConfusionMatrixDisplay(confusion_matrix=cm,display_labels=['NORMAL','PNEUMONIA'])
disp.plot()

**Plot some wrongly classified images**

In [None]:
err=[i for i, (x, y) in enumerate(zip(predictions, orig)) if x != y] #check where orig and predictions don't match (wrong classification)
#print(err)
print('Misclassified images: ',len(err))

plt.figure(figsize=(10,10))
plt.suptitle("WRONG PREDICTIONS")
for n in range(min(16, len(err))): #plot max 16 wrong predictions
    bn=err[n]//BATCH_SIZE #index of the batch containing the wrong prediction: test_set[bn]
    ib=err[n]-BATCH_SIZE*bn #index of the image inside that batch: test_set[bn][0][ib]
    #print('id_err:',err[n],' batch:',bn,' diff:',err[n]-BATCH_SIZE*bn)
    images, labels = test_set[bn] #load the batch once
    ax = plt.subplot(4,4,n+1)
    plt.imshow(images[ib].astype(np.uint8))  #uint8 to plot
    
    if labels[ib]: #real value
        plt.title("Real: PNEUMONIA")
    else:
        plt.title("Real: NORMAL")
    plt.axis("off")

# 8. Transfer Learning Xception + Fine Tuning

Using **[Xception](https://keras.io/api/applications/xception/)**

In [None]:
keras.backend.clear_session()

In [None]:
base_model = keras.applications.Xception(
    weights='imagenet',  # Load weights pre-trained on ImageNet
    input_shape=(IMAGE_SIZE[0], IMAGE_SIZE[1], 3),
    include_top=False)  # Do not include the ImageNet classifier at the top

**Freeze** the base of the model

In [None]:
base_model.trainable = False #freeze the base model

Create a new model on top and include the preprocessing step for the input images in the model

In [None]:
inputs = keras.layers.Input(shape=(IMAGE_SIZE[0], IMAGE_SIZE[1], 3))

#Preprocess for Xception
x = tf.keras.applications.xception.preprocess_input(inputs)

# We make sure that the base_model is running in inference mode here passing `training=False`
x = base_model(x, training=False)

# Convert features of shape `base_model.output_shape[1:]` to vectors
x = keras.layers.GlobalAveragePooling2D()(x)

x = keras.layers.Dense(128, activation='relu')(x)
x = keras.layers.Dropout(0.2)(x)  # Regularize with dropout

# A Dense classifier with a single unit (binary classification)
outputs = keras.layers.Dense(1, activation='sigmoid')(x)
model = keras.Model(inputs, outputs)

model.summary()

Define an early stop to avoid overfitting and use a decreasing learning rate

In [None]:
early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=6,
                                                     restore_best_weights=True,verbose=1)#check val_loss

reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                              patience=3, min_lr=1e-5 ,verbose=1) #min_lr as input for fine tuning next

In [None]:
model.compile(
    optimizer=keras.optimizers.Adam(),
    loss='binary_crossentropy',
    metrics=['accuracy']
)


history = model.fit(
    aug_train_set,
    steps_per_epoch=TRAIN_IMG_COUNT // BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=vali_set,
    validation_steps=VAL_IMG_COUNT // BATCH_SIZE,
    class_weight=class_weight,
    callbacks=[early_stopping_cb, reduce_lr]
)

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(20, 3))
ax = ax.ravel()

for i, met in enumerate(['accuracy', 'loss']):
    ax[i].plot(history.history[met])
    ax[i].plot(history.history['val_' + met])
    ax[i].set_title('Model {}'.format(met))
    ax[i].set_xlabel('epochs')
    ax[i].set_ylabel(met)
    ax[i].legend(['train', 'val'])

**Fine Tuning**

Unfreeze the base model

In [None]:
# Unfreeze the base model
base_model.trainable = True

model.summary()

model.compile(optimizer=keras.optimizers.Adam(1e-5), #Very low learning rate
              loss='binary_crossentropy',
              metrics=['accuracy'])

Save the best model

In [None]:
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint("Xception_xray_model.h5",
                                                    save_best_only=True,verbose=1)

early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=5,
                                                     restore_best_weights=True,verbose=1) #check val_loss

In [None]:
history = model.fit(
    aug_train_set,
    steps_per_epoch=TRAIN_IMG_COUNT // BATCH_SIZE,
    epochs=100,
    validation_data=vali_set,
    validation_steps=VAL_IMG_COUNT // BATCH_SIZE,
    class_weight=class_weight,
    callbacks=[checkpoint_cb,early_stopping_cb]
)

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(20, 3))
ax = ax.ravel()

for i, met in enumerate(['accuracy', 'loss']):
    ax[i].plot(history.history[met])
    ax[i].plot(history.history['val_' + met])
    ax[i].set_title('Model {}'.format(met))
    ax[i].set_xlabel('epochs')
    ax[i].set_ylabel(met)
    ax[i].legend(['train', 'val'])

In [None]:
#Load model for interactive session
#keras.backend.clear_session()
#model = keras.models.load_model('../input/models/Xception_xray_model.h5')

Evaluate the test set

In [None]:
loss, acc = model.evaluate(test_set)

**Confusion Matrix**

In [None]:
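# Predicted labels: threshold the sigmoid output at 0.5
predictions = model.predict(test_set)
predictions = predictions > 0.5

# True labels (they match the prediction order only if test_set is not shuffled)
orig = test_set.labels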
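
The thresholding step above can be sketched without TensorFlow. The snippet assumes the model output has shape `(N, 1)`, as a single sigmoid unit produces, and that class 1 corresponds to PNEUMONIA. The helper `to_labels` and the probabilities are hypothetical.

```python
def to_labels(probabilities, threshold=0.5):
    """Flatten an (N, 1) nested list of sigmoid outputs into N integer labels."""
    return [int(row[0] > threshold) for row in probabilities]


# Hypothetical sigmoid outputs, for illustration only
probs = [[0.93], [0.12], [0.50], [0.71]]

print(to_labels(probs))                 # → [1, 0, 0, 1]
print(to_labels(probs, threshold=0.8))  # → [1, 0, 0, 0]
```

Raising the threshold turns borderline PNEUMONIA predictions into NORMAL, so it trades recall on the pneumonia class for fewer false alarms.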

In [None]:
cm = confusion_matrix(orig, predictions)
print('Confusion matrix:')
print(cm)
print('')

cr = classification_report(orig, predictions)
print('Classification report:')
print(cr)
print('')
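
The figures in the classification report can be derived by hand from the confusion matrix. The sketch below assumes the scikit-learn layout for a binary problem, `[[TN, FP], [FN, TP]]`, with rows as true classes and columns as predicted classes. The helper `binary_metrics` and the counts are hypothetical and are not results from this notebook.

```python
def binary_metrics(cm):
    """Derive metrics from a 2x2 confusion matrix laid out as
    [[TN, FP], [FN, TP]] (rows: true class, columns: predicted class)."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),    # share of PNEUMONIA predictions that are correct
        "recall": tp / (tp + fn),       # share of PNEUMONIA cases that are detected
        "specificity": tn / (tn + fp),  # share of NORMAL cases that are recognised
    }


# Hypothetical counts, for illustration only
example_cm = [[80, 20], [5, 95]]
for name, value in binary_metrics(example_cm).items():
    print(f"{name}: {value:.3f}")
```

For a screening task, recall on the PNEUMONIA class is often the more important number, because a missed case is usually costlier than a false alarm.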

In [None]:
disp = ConfusionMatrixDisplay(confusion_matrix=cm,display_labels=['NORMAL','PNEUMONIA'])
disp.plot()

**Plot some misclassified images**

In [None]:
# Indices where the prediction and the true label differ (misclassified images)
err = [i for i, (x, y) in enumerate(zip(predictions, orig)) if x != y]
print('Misclassified images:', len(err))

plt.figure(figsize=(10, 10))
plt.suptitle("WRONG PREDICTIONS")
for n in range(min(16, len(err))):  # plot at most 16 wrong predictions
    bn = err[n] // BATCH_SIZE       # batch that contains the misclassified image
    ib = err[n] - BATCH_SIZE * bn   # position inside that batch: test_set[bn][0][ib]
    ax = plt.subplot(4, 4, n + 1)
    plt.imshow(test_set[bn][0][ib].astype(np.uint8))  # cast to uint8 for plotting

    if test_set[bn][1][ib]:  # true label
        plt.title("Real: PNEUMONIA")
    else:
        plt.title("Real: NORMAL")
    plt.axis("off")
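
The index arithmetic for `bn` and `ib` is an integer division with remainder, which can be checked in isolation. The sketch assumes the iterator is not shuffled, so flat index `i` always lands in batch `i // BATCH_SIZE`. The helper `locate_in_batches` and the batch size of 32 are hypothetical.

```python
def locate_in_batches(flat_index, batch_size):
    """Map a flat sample index to (batch number, position inside that batch)."""
    return divmod(flat_index, batch_size)


BATCH = 32  # hypothetical batch size, for illustration only
for idx in (0, 31, 32, 70):
    print(idx, locate_in_batches(idx, BATCH))
# → 0 (0, 0)
# → 31 (0, 31)
# → 32 (1, 0)
# → 70 (2, 6)
```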

# 9 Transfer Learning ResNet + Fine Tuning

Using **[ResNet152V2](https://keras.io/api/applications/resnet/#resnet152v2-function)**

In [None]:
keras.backend.clear_session()

In [None]:
base_model = keras.applications.ResNet152V2(
    weights='imagenet',  # Load weights pre-trained on ImageNet
    input_shape=(IMAGE_SIZE[0], IMAGE_SIZE[1], 3),
    include_top=False)  # Do not include the ImageNet classifier at the top

**Freeze** the base of the model

In [None]:
base_model.trainable = False #freeze the base model

Create a new model on top and include the preprocessing step for the input images in the model

In [None]:
inputs = keras.layers.Input(shape=(IMAGE_SIZE[0], IMAGE_SIZE[1], 3))

#Preprocess for ResNet152V2
x = tf.keras.applications.resnet_v2.preprocess_input(inputs)

# We make sure that the base_model is running in inference mode here, passing `training=False`
x = base_model(x, training=False)

# Convert features of shape `base_model.output_shape[1:]` to vectors
x = keras.layers.GlobalAveragePooling2D()(x)

x = keras.layers.Dense(128, activation='relu')(x)
x = keras.layers.Dropout(0.2)(x)  # Regularize with dropout

# A Dense classifier with a single unit (binary classification)
outputs = keras.layers.Dense(1, activation='sigmoid')(x)
model = keras.Model(inputs, outputs)

model.summary()
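
According to the Keras documentation, `resnet_v2.preprocess_input` scales pixel values to the range [-1, 1], so it expects raw images in [0, 255]. The sketch below reproduces that mapping for single values as an illustration only. The helper `scale_to_unit_range` is hypothetical and is not part of Keras.

```python
def scale_to_unit_range(pixel):
    """Map a pixel value from [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1.0


for value in (0, 127.5, 255):
    print(value, scale_to_unit_range(value))
# → 0 -1.0
# → 127.5 0.0
# → 255 1.0
```

Because the preprocessing is built into the model, the data generators should deliver images in [0, 255]. Images already rescaled to [0, 1] would end up squeezed into a narrow band just above -1.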

Define an early-stopping callback to avoid overfitting and a callback that reduces the learning rate when the validation loss plateaus

In [None]:
early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=6,
                                                     restore_best_weights=True, verbose=1)  # monitors val_loss by default

# min_lr equals the learning rate used for fine-tuning afterwards
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                              patience=3, min_lr=1e-5, verbose=1)
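
With `factor=0.2` and `min_lr=1e-5`, each plateau multiplies the learning rate by 0.2 until the floor is reached. The sketch below lists the resulting values, assuming training starts from 1e-3, the default learning rate of `Adam()`. The helper `lr_after_plateaus` is hypothetical, and the sketch ignores when the plateaus actually occur.

```python
def lr_after_plateaus(initial_lr, factor, min_lr, n_reductions):
    """Learning rate after each of `n_reductions` plateau-triggered cuts."""
    lrs = [initial_lr]
    for _ in range(n_reductions):
        # Each cut multiplies by `factor` but never drops below `min_lr`
        lrs.append(max(lrs[-1] * factor, min_lr))
    return lrs


schedule = lr_after_plateaus(initial_lr=1e-3, factor=0.2, min_lr=1e-5, n_reductions=4)
print([f"{lr:.1e}" for lr in schedule])
# → ['1.0e-03', '2.0e-04', '4.0e-05', '1.0e-05', '1.0e-05']
```

Under these assumptions the floor is reached after three reductions, so the frozen-base training can end at the same learning rate that the fine-tuning stage starts from.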

In [None]:
model.compile(
    optimizer=keras.optimizers.Adam(),
    loss='binary_crossentropy',
    metrics=['accuracy']
)


history = model.fit(
    aug_train_set,
    steps_per_epoch=TRAIN_IMG_COUNT // BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=vali_set,
    validation_steps=VAL_IMG_COUNT // BATCH_SIZE,
    class_weight=class_weight,
    callbacks=[early_stopping_cb, reduce_lr]
)

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(20, 3))
ax = ax.ravel()

for i, met in enumerate(['accuracy', 'loss']):
    ax[i].plot(history.history[met])
    ax[i].plot(history.history['val_' + met])
    ax[i].set_title('Model {}'.format(met))
    ax[i].set_xlabel('epochs')
    ax[i].set_ylabel(met)
    ax[i].legend(['train', 'val'])

**Fine Tuning**

Unfreeze the base model

In [None]:
# Unfreeze the base model
base_model.trainable = True

model.summary()

# Recompile so that the change to `trainable` takes effect
model.compile(optimizer=keras.optimizers.Adam(1e-5),  # Very low learning rate
              loss='binary_crossentropy',
              metrics=['accuracy'])

Save the best model

In [None]:
# Both callbacks monitor val_loss by default
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint("ResNet152V2_xray_model.h5",
                                                    save_best_only=True, verbose=1)

early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience=5,
                                                     restore_best_weights=True, verbose=1)

In [None]:
history = model.fit(
    aug_train_set,
    steps_per_epoch=TRAIN_IMG_COUNT // BATCH_SIZE,
    epochs=100,
    validation_data=vali_set,
    validation_steps=VAL_IMG_COUNT // BATCH_SIZE,
    class_weight=class_weight,
    callbacks=[checkpoint_cb,early_stopping_cb]
)

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(20, 3))
ax = ax.ravel()

for i, met in enumerate(['accuracy', 'loss']):
    ax[i].plot(history.history[met])
    ax[i].plot(history.history['val_' + met])
    ax[i].set_title('Model {}'.format(met))
    ax[i].set_xlabel('epochs')
    ax[i].set_ylabel(met)
    ax[i].legend(['train', 'val'])

In [None]:
#Load model for interactive session
#keras.backend.clear_session()
#model = keras.models.load_model('../input/models/ResNet152V2_xray_model.h5')

Evaluate the model on the test set

In [None]:
loss, acc = model.evaluate(test_set)

**Confusion Matrix**

Compute the confusion matrix and the classification report

In [None]:
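# Predicted labels: threshold the sigmoid output at 0.5
predictions = model.predict(test_set)
predictions = predictions > 0.5

# True labels (they match the prediction order only if test_set is not shuffled)
orig = test_set.labels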

In [None]:
cm = confusion_matrix(orig, predictions)
print('Confusion matrix:')
print(cm)
print('')

cr = classification_report(orig, predictions)
print('Classification report:')
print(cr)
print('')

In [None]:
disp = ConfusionMatrixDisplay(confusion_matrix=cm,display_labels=['NORMAL','PNEUMONIA'])
disp.plot()

**Plot some misclassified images**

In [None]:
# Indices where the prediction and the true label differ (misclassified images)
err = [i for i, (x, y) in enumerate(zip(predictions, orig)) if x != y]
print('Misclassified images:', len(err))

plt.figure(figsize=(10, 10))
plt.suptitle("WRONG PREDICTIONS")
for n in range(min(16, len(err))):  # plot at most 16 wrong predictions
    bn = err[n] // BATCH_SIZE       # batch that contains the misclassified image
    ib = err[n] - BATCH_SIZE * bn   # position inside that batch: test_set[bn][0][ib]
    ax = plt.subplot(4, 4, n + 1)
    plt.imshow(test_set[bn][0][ib].astype(np.uint8))  # cast to uint8 for plotting

    if test_set[bn][1][ib] == 1:  # true label
        plt.title("Real: PNEUMONIA")
    else:
        plt.title("Real: NORMAL")
    plt.axis("off")