# Pneumonia Detection

I am trying to build a CNN that can detect pneumonia from chest X-ray images. I am using this Kaggle notebook as a guide: https://www.kaggle.com/dpaluszk/pneumonia-transfer-learning-94-acc. The project is a way for me to learn how to build image classification neural networks.

## Setup

In [None]:
# Imports

# Generic Imports
import os
import numpy as np 
import pandas as pd 

# Visualisation
import matplotlib.pyplot as plt
import PIL.Image  # import the submodule explicitly so PIL.Image.open is always available

%matplotlib inline

# Creating the CNNs
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Conv2D, Dense, MaxPool2D, Flatten, LeakyReLU, BatchNormalization, Dropout, Input

In [None]:
# File Paths

base_path = '/kaggle/input/chest-xray-pneumonia/chest_xray/chest_xray/'
test_path = os.path.join(base_path,'test')
train_path = os.path.join(base_path,'train')
val_path = os.path.join(base_path, 'val')

## Visualisation

In [None]:
def visualise_images(dir_path):
    # Shows four randomly chosen JPEG images from dir_path in a 2x2 grid
    images = [img for img in os.listdir(dir_path) if img.endswith('jpeg')]
    fig = plt.figure(figsize=(24, 24))
    for i in range(2):
        for j in range(2):
            img = np.random.choice(images)
            img = PIL.Image.open(os.path.join(dir_path, img))
            axobj = fig.add_subplot(2, 2, i * 2 + j + 1)
            axobj.imshow(img)

In [None]:
#visualise_images(os.path.join(test_path, 'PNEUMONIA'))

In [None]:
#visualise_images(os.path.join(test_path, 'NORMAL'))

## Preprocessing the Images
The images are different sizes. We will have to make all the images the same size for our model to work.

In [None]:
# Global Variables
IMAGE_SIZE=(224,224)
BATCH_SIZE=64

# Rescales and augments the training data (resizing to IMAGE_SIZE happens in flow_from_directory)
train_datagen  = ImageDataGenerator(
    rescale=1./255, # Ensures the values are between 0 and 1
    zoom_range=0.2,
    horizontal_flip=True
)

train_generator = train_datagen.flow_from_directory(
    train_path,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='binary')

# Rescales the test data without augmentation (resizing to IMAGE_SIZE happens in flow_from_directory)
test_datagen  = ImageDataGenerator(
    rescale=1./255,
)

test_generator = test_datagen.flow_from_directory(
    test_path,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='binary')
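
With `class_mode='binary'`, `flow_from_directory` infers the two classes from the subdirectory names and assigns label indices in alphanumeric order, so `NORMAL` should map to 0 and `PNEUMONIA` to 1 (the generator's `class_indices` attribute reports the actual mapping). The following is a minimal standard-library sketch of that mapping, which also counts the images per class. The helper name `summarise_classes` and the list of file extensions are hypothetical and not part of the original notebook.

```python
import os

def summarise_classes(dir_path, extensions=('.jpeg', '.jpg', '.png')):
    """Return ({class_name: index}, {class_name: image_count}) for a class-per-folder directory."""
    # Each subdirectory is treated as one class, indexed in sorted order
    class_names = sorted(
        d for d in os.listdir(dir_path)
        if os.path.isdir(os.path.join(dir_path, d))
    )
    class_indices = {name: i for i, name in enumerate(class_names)}
    # Count the image files found directly inside each class folder
    counts = {
        name: sum(
            1 for f in os.listdir(os.path.join(dir_path, name))
            if f.lower().endswith(extensions)
        )
        for name in class_names
    }
    return class_indices, counts
```

Calling `summarise_classes(train_path)` would give a quick view of how balanced the two classes are before training.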

## Models
We are going to experiment with six kinds of model for image classification:
* A 'simple' CNN with 4 convolution layers and optional features (e.g. batch normalisation, dropout)
* Xception transfer learning
* ResNet50 transfer learning
* VGG16 transfer learning
* InceptionV3 transfer learning
* DenseNet121 transfer learning

### Simple CNN

Here we will create a customisable CNN. There are a lot of different features we can add to a CNN so it is worth experimenting with different features to see which one gives the best results.

In [None]:
def simple_model(activ, batchnorm=False, dropout=False):
    # (filters, kernel_size, strides, padding) for each convolution block
    conv_specs = [(64, 5, 2, 'valid'), (128, 3, 2, 'valid'),
                  (256, 3, 1, 'same'), (512, 3, 1, 'same')]
    layers = [Input(shape=(224, 224, 3))]
    for filters, kernel, strides, padding in conv_specs:
        layers += [Conv2D(filters, kernel, strides, padding=padding), LeakyReLU(), MaxPool2D(2)]
        # Optional regularisation after each block
        if batchnorm:
            layers.append(BatchNormalization())
        if dropout:
            layers.append(Dropout(0.2))
    # Classifier head; `activ` is the activation of the single output unit
    layers += [Flatten(), Dense(128), LeakyReLU(), Dense(64), LeakyReLU(), Dense(1, activation=activ)]
    return Sequential(layers)
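
To see how much the four convolution and pooling stages shrink the 224 x 224 input, the spatial sizes can be worked out by hand. The sketch below applies the usual output-size formulas for 'valid' and 'same' padding to the layer settings used in `simple_model`; the helper `conv_out` is a hypothetical name introduced only for this illustration.

```python
def conv_out(size, kernel, stride, padding='valid'):
    """Spatial output size of a convolution or pooling layer along one dimension."""
    if padding == 'same':
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1

size = 224
trace = []
# (kernel_size, strides, padding) of the four convolutions in simple_model
for kernel, stride, padding in [(5, 2, 'valid'), (3, 2, 'valid'), (3, 1, 'same'), (3, 1, 'same')]:
    size = conv_out(size, kernel, stride, padding)  # Conv2D
    size = conv_out(size, 2, 2)                     # MaxPool2D(2): stride defaults to the pool size
    trace.append(size)

flattened = size * size * 512  # 512 channels leave the last convolution
print(trace, flattened)  # [55, 13, 6, 3] 4608
```

So `Flatten()` should hand a vector of 4608 values to the first `Dense(128)` layer, which is small compared with the heads of the transfer-learning models.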

In [None]:
# Same architecture with three different activations on the output unit
simp = simple_model('sigmoid')
simp.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
simp2 = simple_model('relu')
simp2.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
simp3 = simple_model('softmax')
simp3.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
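
The three output activations behave very differently on a single output unit. Sigmoid squashes any value into the range (0, 1), so it can be read as a probability by `binary_crossentropy`. Softmax normalises across the units of a layer, so with only one unit it always returns 1 regardless of the input. ReLU is unbounded above and exactly 0 for negative inputs, so its output is not a valid probability. A small pure-Python sketch of this (the functions here are re-implementations for illustration, not the Keras ones):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    # Numerically stable softmax over a list of logits
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def relu(z):
    return max(0.0, z)

# Compare the three activations on the same single pre-activation value
for z in (-3.0, 0.0, 3.0):
    print(z, sigmoid(z), softmax([z])[0], relu(z))
```

The softmax column is 1.0 on every row, which suggests that a model ending in `Dense(1, activation='softmax')` can only ever predict one class.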

### The other models

To speed things up we will define some functions that do most of the work for us, since the code for these models is very similar.

In [None]:
# Gets the model to the point where it can be trained
def prepare_model(model, input_shape=(224, 224, 3), optimizer='adam'):
    # Load the pretrained convolutional base without its classification head
    pre_model = model(input_shape=input_shape,
                      include_top=False,
                      weights='imagenet')

    # Freeze the pretrained layers so only the new head is trained
    for layer in pre_model.layers:
        layer.trainable = False

    # New classification head ending in a single sigmoid unit for binary output
    last_out = pre_model.layers[-1].output
    x = Flatten()(last_out)
    x = Dense(512, activation='relu')(x)
    x = Dense(1, activation='sigmoid')(x)

    model = tf.keras.Model(pre_model.input, x)
    model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])

    return model
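
Because the pretrained layers are frozen, the only trainable weights are in the `Flatten -> Dense(512) -> Dense(1)` head, and its size depends on the shape of the feature map the base model outputs. The sketch below counts those parameters for a given feature-map shape. The helper name `dense_head_params` is hypothetical, and the 7 x 7 x 512 shape is an assumption based on VGG16 without its top at a 224 x 224 input; other base models produce different shapes.

```python
def dense_head_params(height, width, channels, units=512):
    """Trainable parameters in Flatten -> Dense(units) -> Dense(1)."""
    flat = height * width * channels
    hidden = flat * units + units  # weights + biases of Dense(units)
    output = units * 1 + 1         # weights + bias of Dense(1)
    return hidden + output

# Assumed feature-map shape: 7 x 7 x 512 (VGG16 without its top, 224 x 224 input)
print(dense_head_params(7, 7, 512))  # 12846081
```

A head this large is one possible reason a model could overfit quickly on a small dataset, although the notebook does not test that directly.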

In [None]:
# Trains the models
def train_model(model, name, train, test, epochs=5):
    callbacks = [
        # Note: with patience=5, early stopping cannot trigger in a run of 5 epochs or fewer
        tf.keras.callbacks.EarlyStopping(patience=5),
        # Keeps only the best version of the model (lowest val_loss by default)
        tf.keras.callbacks.ModelCheckpoint(os.path.join('/kaggle/working/models', name), save_best_only=True),
    ]

    history = model.fit(train, validation_data=test, epochs=epochs, callbacks=callbacks)

    return model, name, history
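
`EarlyStopping` stops training once the monitored value (`val_loss` by default) has failed to improve for `patience` epochs in a row. The sketch below is a simplified pure-Python model of that rule, ignoring options such as `min_delta`. The function name `stopped_epoch` and the loss values are made up for illustration.

```python
def stopped_epoch(val_losses, patience):
    """Return the 1-based epoch at which patience-based early stopping would halt, or None."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0  # improvement: reset the counter
        else:
            wait += 1             # no improvement this epoch
            if wait >= patience:
                return epoch
    return None

# Worst case for a 5-epoch run: validation loss never improves after epoch 1
losses = [0.50, 0.60, 0.70, 0.80, 0.90]
print(stopped_epoch(losses, patience=5))  # None
print(stopped_epoch(losses, patience=2))  # 3
```

Even in the worst case, a patience of 5 never fires within 5 epochs, so a smaller patience or a longer run would be needed for early stopping to have any effect here.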


See this link for a list of pretrained models: https://keras.io/api/applications/

In [None]:
# Creating a list of models
models_to_prepare = [
    (tf.keras.applications.inception_v3.InceptionV3, 'inception'),
    (tf.keras.applications.ResNet50, 'resnet'),
    (tf.keras.applications.vgg16.VGG16, 'vgg'),
    (tf.keras.applications.xception.Xception, 'xception'),
    (tf.keras.applications.DenseNet121, 'densenet')
]

models = [(prepare_model(model[0]), model[1]) for model in models_to_prepare]

models += [
    (simp, 'simp'),
    (simp2, 'simp2'),
    (simp3, 'simp3')
]

In [None]:
# Trains the models (VGG16 is slow to train, so it gets fewer epochs)
trained_models = []
histories = []
for model, name in models:
    print(name)
    epochs = 3 if name == 'vgg' else 5
    trained, name, history = train_model(model, name, train_generator, test_generator, epochs=epochs)
    trained_models.append(trained)
    histories.append((name, history))

## Evaluation Visualisation

In [None]:
def present_training_results(histories):
    # One row of plots per model, one column per logged metric
    # (assumes four metrics: loss, accuracy, val_loss, val_accuracy)
    length = len(histories)
    fig = plt.figure(figsize=(24, 4 * length))
    for i, (name, history) in enumerate(histories):
        history = history.history
        for k, key in enumerate(history.keys()):
            axobj = fig.add_subplot(length, 4, 4 * i + k + 1)
            axobj.plot(history[key], label=(key + '_' + name))
            axobj.legend()
            if 'acc' in key:
                axobj.set_ylim((0.5, 1))

In [None]:
present_training_results(histories)

In [None]:
for j in histories:
    name, history = j
    history = history.history
    print(name, max(history['val_accuracy']))
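
The maximum validation accuracy on its own does not say when it was reached. A small sketch that also reports the epoch, working on the plain dictionary stored in `history.history`; the helper name `best_epoch` is hypothetical, and the numbers in the example are made up for illustration and are not results from this notebook.

```python
def best_epoch(history_dict, key='val_accuracy'):
    """Return (1-based epoch, value) of the highest value recorded under `key`."""
    values = history_dict[key]
    idx = max(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]

# Made-up numbers purely for illustration, not results from this notebook
example = {'accuracy': [0.90, 0.94, 0.96], 'val_accuracy': [0.85, 0.88, 0.86]}
print(best_epoch(example))  # (2, 0.88)
```

A peak in an early epoch followed by a drop, as in the made-up example, is the pattern one would expect from a model that starts to overfit.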

In [None]:
# Preprocess the validation data
val_datagen = ImageDataGenerator(
    rescale=1./255,
)

# Use the generator defined just above (the original cell reused test_datagen by mistake)
val_generator = val_datagen.flow_from_directory(
    val_path,
    target_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE,
    class_mode='binary')

In [None]:
# Evaluates a given model with the validation data
def evaluate_model(name, data):
    model = tf.keras.models.load_model(os.path.join('/kaggle/working/models', name))
    model.evaluate(data)

In [None]:
for model in models:
    nom = model[1]
    print(nom)
    evaluate_model(nom, val_generator)

## Conclusion
**Inception** \
Inception was very quick to train (3-4s per step) but it started overfitting very quickly (after only 2 epochs). However, Inception did get a very high evaluation score: $0.9375$. \
**ResNet** \
ResNet was still quite fast to train (6-7s per step) and there wasn't any evidence of overfitting after 5 epochs. Despite this, ResNet got a very low evaluation score: $0.6875$. \
**VGG16** \
VGG16 took a very long time to train, so long in fact that I only had it do three epochs. Despite that, VGG16 performed very well and got an evaluation score of $0.9375$. \
**Xception** \
Xception got up to its peak accuracy after only two epochs, but its validation accuracy for each epoch stayed roughly the same. Like Inception and VGG16, Xception got an evaluation score of $0.9375$. \
**DenseNet121** \
DenseNet121 managed to correctly predict the entire validation dataset. It reached a validation accuracy of 91% after only three epochs and trained reasonably quickly (7s per step). I think it is clear that DenseNet121 is the best all-round model. \
**CNNs** \
I wanted to test which output activation was best for the Sequential model: sigmoid (Simp) or ReLU (Simp2). Both models trained very quickly. Simp kept improving throughout the 5 epochs and would likely have peaked alongside the other models with more training. Simp2, on the other hand, plateaued almost immediately and didn't improve at all across the 5 epochs. This showed in the evaluation, where Simp scored $0.8125$, which is noticeably worse than the best models, and Simp2 scored $0.5$, which means it was no better than random guessing. Sigmoid is clearly the better activation function.
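
All of the evaluation scores reported above are exact multiples of 1/16, which would be consistent with a validation folder of only 16 images. That size is an assumption inferred from the scores, not something checked in this notebook. The sketch below converts each score into the implied number of correct predictions under that assumption.

```python
from fractions import Fraction

# Evaluation scores reported in the conclusion
scores = {'inception': 0.9375, 'resnet': 0.6875, 'vgg': 0.9375, 'xception': 0.9375,
          'densenet': 1.0, 'simp': 0.8125, 'simp2': 0.5}

N_IMAGES = 16  # assumption: size of the validation folder, inferred from the scores
for name, score in scores.items():
    correct = score * N_IMAGES
    print(name, Fraction(score), f'{correct:g} of {N_IMAGES} correct')
```

If the assumption holds, a single image changes the score by $0.0625$, so the gap between $0.9375$ and $1.0$ is one prediction and the ranking of the top models should be read with caution.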