# Matthew's CNN Notebook

## Overview
In this notebook I build a baseline CNN and then iterate on that model.

## Preparing the Data


In [1]:
# Import statements
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import keras

# Instantiating a generator object and normalizing the RGB values
traingen = keras.preprocessing.image.ImageDataGenerator(rescale=1/255)
testgen = keras.preprocessing.image.ImageDataGenerator(rescale=1/255)
valgen = keras.preprocessing.image.ImageDataGenerator(rescale=1/255)

# Creating the generator for the training data
train_data = traingen.flow_from_directory(
    # Specifying location of training data
    directory='../input/chest-xray-pneumonia/chest_xray/train',
    # Re-sizing images to 150x150
    target_size=(150, 150),
    # Class mode set to binary so the two directories "NORMAL" and "PNEUMONIA" are recognized as the labels
    class_mode='binary',
    batch_size=20,
    seed=42
)
# Creating the generator for the testing data
test_data = testgen.flow_from_directory(
    # Specifying location of testing data
    directory='../input/chest-xray-pneumonia/chest_xray/test',
    # Re-sizing images to 150x150
    target_size=(150, 150),
    # Class mode set to binary so the two directories "NORMAL" and "PNEUMONIA" are recognized as the labels
    class_mode='binary',
    batch_size=20,
    seed=42
)

# Setting aside a validation set
val_data = valgen.flow_from_directory(
    # Specifying location of validation data
    directory='../input/chest-xray-pneumonia/chest_xray/val',
    # Re-sizing images to 150x150
    target_size=(150, 150),
    # Class mode set to binary so the two directories "NORMAL" and "PNEUMONIA" are recognized as the labels
    class_mode='binary',
    batch_size=20,
    seed=42
)
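
As a quick illustration of what `rescale=1/255` does to each image, the sketch below applies the same scaling to a tiny made-up array. The pixel values are hypothetical and only stand in for a loaded image; the generators above apply this multiplication internally.

```python
import numpy as np

# Hypothetical 2x2 RGB "image" with 8-bit channel values (not real X-ray data)
pixels = np.array([[[0, 128, 255], [64, 64, 64]],
                   [[255, 255, 255], [10, 20, 30]]], dtype=np.uint8)

# Same scaling the generators apply: cast to float, then multiply by 1/255
scaled = pixels.astype(np.float32) * (1 / 255)

print(scaled.min(), scaled.max())
```

After scaling, every channel value lies between 0 and 1, which keeps the inputs in a range that is easier for gradient-based training to handle.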

## Baseline CNN

In [None]:
# Create model
base_cnn = keras.Sequential()

# Add single Conv2D and MaxPool layer
base_cnn.add(keras.layers.Conv2D(32, (2, 2), activation='relu', input_shape=(150, 150, 3)))
base_cnn.add(keras.layers.MaxPool2D(2, 2))

base_cnn.add(keras.layers.Flatten())
base_cnn.add(keras.layers.Dense(1, activation='sigmoid'))


# Compile model
base_cnn.compile(
    loss='binary_crossentropy',
    optimizer='sgd',
    metrics=['acc']
)

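# Fit model to the training data
# Note: the test generator doubles as the validation set here; val_data is not used in this call
base_cnn_results = base_cnn.fit_generator(train_data,
                              steps_per_epoch=100,
                              epochs=10,
                              validation_data=test_data
)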

### Conclusion
This is a good result for a first baseline model, but two issues are obvious just from looking at the training output:

- The model is overfitting.
- Validation accuracy is bouncing all over the place instead of consistently improving.

There are several things that could be done from here, so let's move on to something a little more robust.
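
One way to put numbers on the two issues above is to summarize the `history` dict that Keras returns from training. The sketch below is a minimal example, assuming the dict has `'acc'` and `'val_acc'` keys as in this notebook; `overfitting_summary` is a hypothetical helper and the values in `dummy_history` are placeholders, not this model's results.

```python
import numpy as np

def overfitting_summary(history):
    """Summarize the train/validation gap and validation stability from a history dict."""
    acc = np.asarray(history['acc'], dtype=float)
    val_acc = np.asarray(history['val_acc'], dtype=float)
    return {
        # Positive gap: training accuracy ends above validation accuracy
        'final_gap': float(acc[-1] - val_acc[-1]),
        # Larger spread: validation accuracy bounces around more between epochs
        'val_acc_std': float(val_acc.std()),
        # 1-indexed epoch with the best validation accuracy
        'best_val_epoch': int(val_acc.argmax()) + 1,
    }

# Placeholder values for illustration only
dummy_history = {'acc': [0.80, 0.90, 0.95], 'val_acc': [0.70, 0.78, 0.74]}
summary = overfitting_summary(dummy_history)
print(summary)
```

With a real run, the same function could be called as `overfitting_summary(base_cnn_results.history)`.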

## Deeper CNN

To start, I'm just going to add more layers to the network.
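
Before stacking layers, it can help to trace how the feature map shrinks through each Conv2D and MaxPool pair. The sketch below is a rough calculation, assuming the Keras defaults used in this notebook (`padding='valid'`, convolution stride 1, pooling stride equal to the pool size); `conv_out` and `trace_shapes` are hypothetical helpers, not Keras functions.

```python
def conv_out(size, kernel, stride=1):
    """Output length along one axis for a 'valid' convolution or pooling window."""
    return (size - kernel) // stride + 1

def trace_shapes(input_size, filters_per_block, conv_kernel=2, pool_size=2):
    """Return (layer, height, width, channels) after each conv and pool layer."""
    shapes = []
    size = input_size
    for filters in filters_per_block:
        size = conv_out(size, conv_kernel)                   # Conv2D, stride 1
        shapes.append(("conv", size, size, filters))
        size = conv_out(size, pool_size, stride=pool_size)   # MaxPool2D
        shapes.append(("pool", size, size, filters))
    return shapes

# Three blocks with 32, 64, and 96 filters on a 150x150 input
shapes = trace_shapes(150, [32, 64, 96])
for layer, h, w, c in shapes:
    print(f"{layer}: {h}x{w}x{c}")

# Number of values handed to the first Dense layer after Flatten
_, h, w, c = shapes[-1]
flat_features = h * w * c
print("flattened features:", flat_features)
```

Under these assumptions the last pooling layer outputs a 17x17x96 volume, so the first Dense layer receives 27,744 inputs. Calling `model.summary()` on the built model is the authoritative check.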

In [None]:
# Create model
deep_cnn = keras.Sequential()

# Adding first Conv2D and MaxPool layer, starting small and then growing larger.
deep_cnn.add(keras.layers.Conv2D(32, (2, 2), activation='relu', input_shape=(150, 150, 3)))
deep_cnn.add(keras.layers.MaxPool2D(2, 2))

# Second layer with 64 filters
deep_cnn.add(keras.layers.Conv2D(64, (2, 2), activation='relu'))
deep_cnn.add(keras.layers.MaxPool2D(2, 2))

# Third layer with 96 filters
deep_cnn.add(keras.layers.Conv2D(96, (2, 2), activation='relu'))
deep_cnn.add(keras.layers.MaxPool2D(2, 2))
# Flatten the feature maps and add densely connected layers for prediction
deep_cnn.add(keras.layers.Flatten())

# Dense layer with 32 nodes
deep_cnn.add(keras.layers.Dense(32, activation='relu'))

# Dense layer with 64 nodes
deep_cnn.add(keras.layers.Dense(64, activation='relu'))

# Dense layer with 96 nodes
deep_cnn.add(keras.layers.Dense(96, activation='relu'))

# Sigmoid output layer
deep_cnn.add(keras.layers.Dense(1, activation='sigmoid'))


# Compile model
deep_cnn.compile(
    loss='binary_crossentropy',
    optimizer='sgd',
    # Adding additional metrics for better monitoring of training.
    metrics=['acc', 'Recall', 'Precision']
)

# Fit Model to Training
deep_cnn_results = deep_cnn.fit_generator(train_data,
                              steps_per_epoch=100,
                              epochs=10,
                              validation_data=test_data)

### Conclusion
I added additional metrics to this model for more insight into the results of the training process. As far as performance goes, it's definitely an improvement over the last model in terms of validation accuracy.

Some other notes about the model:
- The model is still overfitting.
- The validation accuracy is not consistently improving.
- Validation recall is very high: ~97% of the actual positive cases were identified correctly. This is good, since we decided that, in the context of our business problem, false negatives are more costly than false positives.
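
To make the recall and precision trade-off concrete, here is a small sketch of how both metrics fall out of confusion-matrix counts. The counts are made up for illustration and are not this model's results; `recall_precision` is a hypothetical helper.

```python
def recall_precision(tp, fp, fn):
    """Return (recall, precision) from true-positive, false-positive, and false-negative counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return recall, precision

# Illustrative counts only: 100 positive cases, 97 caught, 3 missed, 25 false alarms
recall, precision = recall_precision(tp=97, fp=25, fn=3)
print(f"recall={recall:.3f}, precision={precision:.3f}")
```

A missed positive case is counted in `fn` and lowers recall, while a false alarm is counted in `fp` and lowers precision, which is why a model can score high on one and noticeably lower on the other.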

Let's do some tuning to address the overfitting.

## Deeper CNN with Dropout Layers
I'm going to add dropout layers after each dense layer in order to combat the rampant overfitting in my model.
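
As a rough illustration of what a `Dropout(0.3)` layer does during training, the sketch below implements inverted dropout in NumPy: each activation is zeroed with probability `rate`, and the survivors are scaled by `1 / (1 - rate)` so the expected value stays the same. This is a simplified stand-in, not Keras's actual implementation, and `dropout_forward` is a hypothetical helper.

```python
import numpy as np

def dropout_forward(x, rate, training, rng):
    """Inverted dropout: randomly zero activations in training, pass through otherwise."""
    if not training or rate == 0.0:
        return x
    # True where the unit is kept, which happens with probability 1 - rate
    keep_mask = rng.random(x.shape) >= rate
    # Rescale the kept units so the expected activation is unchanged
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)
activations = np.ones((1000, 32))  # stand-in for the output of a Dense(32) layer

dropped = dropout_forward(activations, rate=0.3, training=True, rng=rng)
passthrough = dropout_forward(activations, rate=0.3, training=False, rng=rng)

print("fraction zeroed:", float((dropped == 0).mean()))
print("mean activation:", float(dropped.mean()))
```

Because a different random subset of units is silenced on every batch, the network cannot lean on any single unit, which is the property that tends to reduce overfitting. At inference time the layer passes its input through unchanged.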

In [2]:
# Create model
r_cnn = keras.Sequential()

# Adding first Conv2D and MaxPool layer, starting small and then growing larger.
r_cnn.add(keras.layers.Conv2D(32, (2, 2), activation='relu', input_shape=(150, 150, 3)))
r_cnn.add(keras.layers.MaxPool2D(2, 2))

# Second layer with 64 filters
r_cnn.add(keras.layers.Conv2D(64, (2, 2), activation='relu'))
r_cnn.add(keras.layers.MaxPool2D(2, 2))

# Third layer with 96 filters
r_cnn.add(keras.layers.Conv2D(96, (2, 2), activation='relu'))
r_cnn.add(keras.layers.MaxPool2D(2, 2))
# Flatten the feature maps and add densely connected layers for prediction
r_cnn.add(keras.layers.Flatten())

# Dense layer with 32 nodes with dropout layer
r_cnn.add(keras.layers.Dense(32, activation='relu'))
r_cnn.add(keras.layers.Dropout(0.3))

# Dense layer with 64 nodes with dropout layer
r_cnn.add(keras.layers.Dense(64, activation='relu'))
r_cnn.add(keras.layers.Dropout(0.3))

# Dense layer with 96 nodes with dropout layer
r_cnn.add(keras.layers.Dense(96, activation='relu'))
r_cnn.add(keras.layers.Dropout(0.3))
# Sigmoid output layer
r_cnn.add(keras.layers.Dense(1, activation='sigmoid'))


# Compile model
r_cnn.compile(
    loss='binary_crossentropy',
    optimizer='sgd',
    # Adding additional metrics for better monitoring of training.
    metrics=['acc', 'Recall', 'Precision']
)

# Fit Model to Training
r_cnn_results = r_cnn.fit_generator(train_data,
                              steps_per_epoch=100,
                              epochs=10,
                              validation_data=test_data)

In [25]:
# Plot training vs. validation loss and accuracy for the dropout model
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))
history = r_cnn_results.history

# Left panel: loss per epoch
ax1.plot(history['loss'])
ax1.plot(history['val_loss'])
ax1.set_xlabel('Epochs')
ax1.set_ylabel('Loss')
ax1.legend(['loss', 'val_loss'])

# Right panel: accuracy per epoch
ax2.plot(history['acc'])
ax2.plot(history['val_acc'])
ax2.set_xlabel('Epochs')
ax2.set_ylabel('Accuracy')
ax2.legend(['Accuracy', 'Val_acc'])

fig.suptitle('Loss and Accuracy of Model');