# CNN Pneumonia Diagnoser

Let's build a machine learning model that uses a convolutional neural network (CNN) to diagnose pneumonia from a patient's chest X-ray. Thanks go to [Faizunnabi](https://www.kaggle.com/faizunnabi) for his excellent kernel [Diagnose Pneumonia](https://www.kaggle.com/faizunnabi/diagnose-pneumonia), which inspired me to try building a CNN for this task.

## Data Exploration

Let's begin by loading all necessary packages and taking a look at where the data files are located.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2
import seaborn as sns
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import Adam
from keras.callbacks import ReduceLROnPlateau
import os
print(os.listdir("../input/chest_xray/chest_xray/"))
print(os.listdir("../input/chest_xray/chest_xray/train/"))

Before building a machine learning model, let's view some of the X-ray images in our training set.

In [None]:
file_loc = "../input/chest_xray/chest_xray/"
train_n = os.listdir(file_loc + "train/NORMAL/")
train_p = os.listdir(file_loc + "train/PNEUMONIA/")
fig, axarr = plt.subplots(3, 2, figsize=(16, 16))
axarr[0][0].set_title("Normal Sample Cases")
axarr[0][1].set_title("Pneumonia Sample Cases")
for i in range(3):
    axarr[i][0].imshow(cv2.imread(file_loc + "train/NORMAL/" + train_n[i]))
    axarr[i][0].axis("off")
    axarr[i][1].imshow(cv2.imread(file_loc + "train/PNEUMONIA/" + train_p[i]))
    axarr[i][1].axis("off")

As we can see, the images in this dataset are centered on the patient's chest and come in different sizes. It is also not obvious, at least to me, what visual features distinguish a case of pneumonia from a "normal" case. Let's now see how our training data is distributed between normal and pneumonia cases.

In [None]:
sns.barplot(x=["Normal", "Pneumonia"], y=[len(train_n), len(train_p)])

As we can see, there are roughly three times as many pneumonia cases as there are normal cases in our training set. Let's also take a peek at the number of images we actually have in each of our train, validation, and test sets.

In [None]:
train_images = train_n + train_p
train_images = [img for img in train_images if img != ".DS_Store"]
val_images = os.listdir(file_loc + "val/NORMAL/") + os.listdir(file_loc + "val/PNEUMONIA/")
val_images = [img for img in val_images if img != ".DS_Store"]
test_images = os.listdir(file_loc + "test/NORMAL/") + os.listdir(file_loc + "test/PNEUMONIA/")
test_images = [img for img in test_images if img != ".DS_Store"]

sns.barplot(x=["Train", "Validation", "Test"], y=[len(train_images), len(val_images), len(test_images)])
print("There are {} images in the training set.".format(len(train_images)))
print("There are {} images in the validation set.".format(len(val_images)))
print("There are {} images in the test set.".format(len(test_images)))
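
The class balance noted earlier matters when reading accuracy figures, because a model that always predicts the majority class already scores well on imbalanced data. The sketch below computes class fractions and that majority-class baseline. The function name `class_balance` and the counts are placeholders chosen to match the "roughly three to one" ratio, not the dataset's real numbers; in the notebook, `len(train_n)` and `len(train_p)` could be passed instead.

```python
def class_balance(counts):
    """Return per-class fractions, the majority label, and the accuracy
    of a trivial model that always predicts that majority label."""
    total = sum(counts.values())
    fractions = {label: n / total for label, n in counts.items()}
    majority_label = max(counts, key=counts.get)
    return fractions, majority_label, fractions[majority_label]


# Placeholder counts (assumption): roughly three pneumonia cases per normal case.
placeholder_counts = {"NORMAL": 1000, "PNEUMONIA": 3000}

fractions, majority_label, baseline_accuracy = class_balance(placeholder_counts)
print("Class fractions:", fractions)
print("Always predicting {} gives accuracy {:.2f}".format(majority_label, baseline_accuracy))
```

Any accuracy reported on similarly imbalanced data is easier to judge against this kind of baseline.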

As expected, there are many more training samples than validation and test samples. We will augment the data in our training set to give the convolutional neural network more images to train on.

## Training a Convolutional Neural Network

Let's now build data generators for this dataset. We will resize all images to 256x256 pixels and rescale their pixel values to lie between 0 and 1. We will also augment the training data by randomly shifting images horizontally and vertically by up to 10% of their width and height respectively, and by applying a random zoom with a zoom range of 0.1.

In [None]:
image_height, image_width = 256, 256
batch_size = 32

data_generator_train = ImageDataGenerator(rescale=1/255, 
                                          width_shift_range=0.1, 
                                          height_shift_range=0.1, 
                                          zoom_range=0.1)
train = data_generator_train.flow_from_directory(directory=file_loc + "train/", 
                                                 target_size=(image_height, 
                                                              image_width), 
                                                 class_mode="binary", 
                                                 batch_size=batch_size)

data_generator_val = ImageDataGenerator(rescale=1/255)
val = data_generator_val.flow_from_directory(directory=file_loc + "val/", 
                                             target_size=(image_height, 
                                                          image_width), 
                                             class_mode="binary", 
                                             batch_size=batch_size)

data_generator_test = ImageDataGenerator(rescale=1/255)
test = data_generator_test.flow_from_directory(directory=file_loc + "test/", 
                                               target_size=(image_height, 
                                                            image_width), 
                                               class_mode="binary", 
                                               batch_size=batch_size)
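
To make the rescaling and shifting concrete, here is a minimal NumPy sketch of what those two operations do to a single image. It is not how `ImageDataGenerator` is implemented: the helper names `rescale` and `shift_horizontal` are hypothetical, the input is random noise standing in for an X-ray, and vacated pixels are simply filled with zeros, whereas Keras fills them according to its `fill_mode` setting.

```python
import numpy as np


def rescale(image):
    """Map 8-bit pixel values from [0, 255] to floats in [0, 1]."""
    return image.astype(np.float32) / 255.0


def shift_horizontal(image, fraction):
    """Shift an image right (fraction > 0) or left (fraction < 0) by a
    fraction of its width, filling vacated columns with zeros."""
    width = image.shape[1]
    offset = int(round(fraction * width))
    shifted = np.zeros_like(image)
    if offset > 0:
        shifted[:, offset:] = image[:, :width - offset]
    elif offset < 0:
        shifted[:, :width + offset] = image[:, -offset:]
    else:
        shifted = image.copy()
    return shifted


rng = np.random.default_rng(0)
# Random stand-in for a 256x256 RGB image (assumption, not real data).
image = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)

scaled = rescale(image)
# Draw a random shift of up to 10% of the width in either direction.
fraction = rng.uniform(-0.1, 0.1)
augmented = shift_horizontal(scaled, fraction)
print(scaled.min(), scaled.max(), augmented.shape)
```

Because a fresh random shift is drawn every time an image is requested, the network rarely sees exactly the same training image twice.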

Let's now design the structure of our CNN. It consists of a single convolutional layer followed by max pooling and dropout layers, and then a dense output layer with a sigmoid activation function that outputs the probability of a given image belonging to the positive class. We will use the Adam optimizer with a learning rate of 1e-5 and the binary cross-entropy loss function. The only performance metric we will track in this study is accuracy. Let's now train the model for 10 epochs, reducing the learning rate by a factor of 0.1 if the validation loss has not improved for 2 epochs.
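
The learning-rate rule just described can be illustrated with a small pure-Python simulation. This is a simplified sketch of the idea and not the Keras `ReduceLROnPlateau` source: the function name `simulate_reduce_on_plateau` is hypothetical, the `min_delta` threshold is an assumption, and the validation losses are made-up numbers chosen to trigger one reduction.

```python
import math


def simulate_reduce_on_plateau(val_losses, initial_lr, factor=0.1,
                               patience=2, min_delta=1e-4):
    """Return the learning rate in effect after each epoch, multiplying it
    by `factor` whenever the loss fails to improve for `patience` epochs."""
    lr = initial_lr
    best = math.inf
    wait = 0
    learning_rates = []
    for loss in val_losses:
        if loss < best - min_delta:
            # The loss improved: remember it and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor
                wait = 0
        learning_rates.append(lr)
    return learning_rates


# Made-up validation losses (assumption): a stall after the second epoch.
hypothetical_val_losses = [0.50, 0.40, 0.41, 0.42, 0.43, 0.30]
schedule = simulate_reduce_on_plateau(hypothetical_val_losses, initial_lr=1e-5)
for epoch, lr in enumerate(schedule, start=1):
    print("epoch {}: learning rate after this epoch = {:.1e}".format(epoch, lr))
```

In this made-up run the loss stalls at epochs 3 and 4, so the rate drops from 1e-5 to 1e-6 at the end of epoch 4 and stays there.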

In [None]:
model = Sequential()
model.add(Conv2D(64, 
                 (3, 3), 
                 input_shape=(image_height, image_width, 3), 
                 activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer=Adam(lr=1e-5), loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
num_epochs = 10
history = model.fit_generator(train, 
                              steps_per_epoch=5216//batch_size, 
                              epochs=num_epochs, 
                              validation_data=val, 
                              validation_steps=16, 
                              callbacks=[ReduceLROnPlateau(patience=2, verbose=1)])
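
The shapes and parameter counts that `model.summary()` reports for this architecture can be checked by hand. The sketch below does that arithmetic, assuming the Keras defaults that the code above relies on: no padding and a stride of 1 for `Conv2D`, and non-overlapping 2x2 windows for `MaxPooling2D`. The helper function names are hypothetical.

```python
def conv2d_output_shape(height, width, kernel_size, filters):
    """Output shape of a stride-1 convolution without padding."""
    return height - kernel_size + 1, width - kernel_size + 1, filters


def conv2d_param_count(kernel_size, in_channels, filters):
    """One weight per kernel element per input channel, plus one bias, per filter."""
    return kernel_size * kernel_size * in_channels * filters + filters


def max_pool_output_shape(height, width, channels, pool_size):
    """Output shape of non-overlapping pooling (incomplete windows are dropped)."""
    return height // pool_size, width // pool_size, channels


conv_shape = conv2d_output_shape(256, 256, kernel_size=3, filters=64)
conv_params = conv2d_param_count(kernel_size=3, in_channels=3, filters=64)
pool_shape = max_pool_output_shape(*conv_shape, pool_size=2)

# Dropout has no parameters and keeps the shape; Flatten just unrolls it.
flattened_size = pool_shape[0] * pool_shape[1] * pool_shape[2]
dense_params = flattened_size * 1 + 1  # one weight per input, plus one bias

print("Conv2D output:", conv_shape, "parameters:", conv_params)
print("MaxPooling2D output:", pool_shape)
print("Flatten output size:", flattened_size)
print("Dense parameters:", dense_params)
print("Total parameters:", conv_params + dense_params)
```

Almost all of the roughly one million parameters sit in the single dense layer, because the flattened feature map is so large.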

## Analyzing our Model

With our model trained for 10 epochs, let's see how the model's loss and accuracy on both the training and validation set evolved throughout the training procedure.

In [None]:
fig, axarr = plt.subplots(1, 2, figsize=(24, 8))
axarr[0].set_xlabel("Number of Epochs")
axarr[0].set_ylabel("Loss")
sns.lineplot(x=range(1, num_epochs+1), y=history.history["loss"], label="Train", ax=axarr[0])
sns.lineplot(x=range(1, num_epochs+1), y=history.history["val_loss"], label="Validation", ax=axarr[0])
axarr[1].set_xlabel("Number of Epochs")
axarr[1].set_ylabel("Accuracy")
axarr[1].set_ylim(0, 1)
sns.lineplot(x=range(1, num_epochs+1), y=history.history["acc"], label="Train", ax=axarr[1])
sns.lineplot(x=range(1, num_epochs+1), y=history.history["val_acc"], label="Validation", ax=axarr[1])

As we can see, our losses decreased and our accuracy scores increased throughout the training procedure. The validation curves are a lot less smooth than their training counterparts, but this is most likely due to there being only 16 images in the validation set. Let's calculate our model's accuracy on the test set.

In [None]:
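# Use len(test), the number of batches, so that all 624 test images are evaluated;
# 624 // batch_size would skip the final partial batch of 16 images.
test_results = model.evaluate_generator(test, steps=len(test))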
print("The model has a test accuracy of {}.".format(test_results[1]))
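
The effect of evaluation set size on how noisy an accuracy figure is can be made concrete with a little arithmetic. The sketch below compares a 16-image set with a 624-image set, using the binomial standard error as a rough measure of noise. The accuracy of 0.8 is an illustrative assumption, not a result from this notebook, and the function names are hypothetical.

```python
import math


def accuracy_step(n_images):
    """Smallest possible change in accuracy: one image flipping its outcome."""
    return 1.0 / n_images


def accuracy_standard_error(accuracy, n_images):
    """Binomial standard error of an accuracy estimate on n_images samples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_images)


assumed_accuracy = 0.8  # illustrative value only (assumption)
for name, n_images in [("validation", 16), ("test", 624)]:
    step = accuracy_step(n_images)
    error = accuracy_standard_error(assumed_accuracy, n_images)
    print("{}: n = {}, accuracy step = {:.4f}, standard error = {:.3f}".format(
        name, n_images, step, error))
```

With only 16 images, a single misclassified image moves the accuracy by 6.25 percentage points, which is consistent with the jagged validation curves above.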

## Final Remarks

By training a simple convolutional neural network, we were able to build a pneumonia diagnoser with a test accuracy of around 88%. With a more complex model trained for more epochs, and with more data available, we could most likely improve on these results. Regardless, not only was this a very interesting and rewarding task to work on, but it also showed me how effective convolutional neural networks can be for image recognition.