<a href="https://colab.research.google.com/github/jammy-bot/dsc-mod-4-project-v2-1-online-ds-ft-120919/blob/master/p4index3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
%%html
<marquee style='width: 50%; color: Green;'>It's a Kind of Magic!</marquee>

Heading
# Title
## Subtitle

>For this project, we will be working with the __Chest X-Ray Images (Pneumonia)__ dataset, from Kaggle [https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia](https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia). The objective of the project is to build a deep neural network model that can classify whether a given patient has pneumonia, given a chest x-ray image.

        Acknowledgements
        Data: https://data.mendeley.com/datasets/rscbjbr9sj/2

        License: CC BY 4.0

        Citation: http://www.cell.com/cell/fulltext/S0092-8674(18)30154-5


# Notebook Preparation

Import required libraries.

In [0]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.preprocessing import StandardScaler, LabelBinarizer
from keras.preprocessing.image import load_img, img_to_array, array_to_img
from keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras import layers, initializers, optimizers
from keras.optimizers import SGD
from keras.layers import Activation, Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.callbacks import EarlyStopping, ModelCheckpoint
from sklearn.metrics import mean_squared_error
from keras import backend as K

import random
from random import shuffle
from tqdm import tqdm  

import warnings
warnings.filterwarnings('ignore')

Verify the current working directory



In [0]:
import os
!pwd

Let's get a sense of our directory structure.

In [0]:
# using Ipython magic to move to the `root` folder
%cd /root/

# viewing root subdirectories as a list
# subdirs = [x[0] for x in os.walk('.')]
# print(subdirs)

# viewing root subdirectories in a vertical orientation
#for x in os.walk('.'):
#    print(x[0])

Create a `data` subdirectory and a hidden `.kaggle` directory for data download credentials.

In [0]:
# creating a directory for sample data
!mkdir ./data/

# creating a directory for data download credentials
!mkdir ~/.kaggle

# list items in the `content` directory
!ls

Verify the hidden directory, and move into it.

In [0]:
%cd ~/.kaggle/

Upload your **Kaggle API key** to the current, hidden directory.

In [0]:
from google.colab import files
files.upload()

Verify the current directory and its contents.

In [0]:
!pwd
%ls

__Update permissions__ on the json file to readable and writeabile by the owner and not readable, writeable, or executable by anyone else.

>  Clear output from the `files.upload()` cell, above, before saving publicly the notebook.

In [0]:
!chmod 600 ./kaggle.json

Move to the `content/data` directory.

In [0]:
%cd ..
%cd data/

# viewing the current, `data` directory
%ls

# Obtain Data

>The dataset is provided in "3 folders (train, test, val) and contains subfolders for each image category (Pneumonia/Normal). There are 5,863 X-Ray images (JPEG) and 2 categories (Pneumonia/Normal)."

Download the dataset to the current `data`directory.

In [0]:
# downloading the dataset into `content/data'
!kaggle datasets download -d paultimothymooney/chest-xray-pneumonia

Verify download.

In [0]:
%ls

Inflate the compressed files.

In [0]:
# extracting the zipped files and directories
!unzip chest-xray-pneumonia.zip

>We now have a `chest_xray` folder with subdirectories for *training*, *validation*, and *test* data.


In [0]:
# verify directory contents
!ls


>Data added to Google Colab projects get cleared at the close of each session, so we do not need to worry about removing the `chest-xray-pneumonia.zip` compressed file.

In [0]:
# viewing 1 level deep
print(os.listdir("/root/data/chest_xray/chest_xray"))

In [0]:
# viewing next level
print(os.listdir("/root/data/chest_xray/chest_xray/val/"))

# Scrub

In [0]:
# instantiate variables for dataset directory paths
train_dir = "/root/data/chest_xray/chest_xray/train/"
train_normal = train_dir + 'NORMAL'
train_pneumonia = train_dir + 'PNEUMONIA'

val_dir = "/root/data/chest_xray/chest_xray/val/"
val_normal = val_dir + 'NORMAL'
val_pneumonia = val_dir + 'PNEUMONIA'

test_dir = "/root/data/chest_xray/chest_xray/test/"
test_normal = test_dir + 'NORMAL'
test_pneumonia = test_dir + 'PNEUMONIA'

In [0]:
# instantiate variables for set lengths
tn_normal_length = len([f for f in os.listdir(
    train_normal) if f.endswith('.jpeg')])
tn_pneumonial_length = len([f for f in os.listdir(
    train_pneumonia) if f.endswith('.jpeg')])
val_normal_length = len([f for f in os.listdir(
    val_normal) if f.endswith('.jpeg')])
val_pneumonia_length = len([f for f in os.listdir(
    val_pneumonia) if f.endswith('.jpeg')])
tt_normal_length = len([f for f in os.listdir(
    test_normal) if f.endswith('.jpeg')])
tt_pneumonia_length = len([f for f in os.listdir(
    test_pneumonia) if f.endswith('.jpeg')])

# printing how many images we have in each directory
print('There are', tn_normal_length, 
'normal images in the training set\n')

print('There are', tn_pneumonial_length, 
'pneumonia images in the training set\n')

print('There are', val_normal_length, 
'normal images in the validation set\n')

print('There are', val_pneumonia_length, 
'pneumonia images in the validation set\n')

print('There are', tt_normal_length, 
'normal images in the test set\n')

print('There are', tt_pneumonia_length, 
'pneumonia images in the test set\n')

>* Pneumonia images surpass the number of normal images nearly 3-fold, in the training data set.
* Altogether, there are more than 5000 training images, 16 validation images, and 624 test images.

# Explore and Model

In [0]:
# importing necessary libraries.
import time
import matplotlib.pyplot as plt
import scipy
import numpy as np
from PIL import Image
from scipy import ndimage
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

np.random.seed(123)

__Data Counts and Shapes__

__Standardize__ image data by dividing each matrix by 255 and resizing images to 150 x 150.

In [0]:
# instantiating rescaling generators tor train and val data
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

__Reshape__ image data and labels

In [0]:
# generating training and validation data
train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')
validation_generator = test_datagen.flow_from_directory(val_dir,
                                                        target_size=(150, 150),
                                                        batch_size=20,
                                                        class_mode='binary')

__Visualize Training Data Counts__

In [0]:
def bar_plot(dir):
    x_dir = os.listdir(dir+'/NORMAL')
    y_dir = os.listdir(dir+'/PNEUMONIA')
    x=len(x_dir)
    y=len(y_dir)
    if dir != test_dir: 
        # account for non - image files in `train` and `val` directories
        x = x - 1
        y = y - 1
    print(f'{dir} Images:\n')
    print('NORMAL:', x)
    print('PNEUMONIA:', y)
    print('Total images:', x + y)
    print('-'*50)
    # instantiate directory name for plot title
    subdir_title = str.split(dir, '/')[-2] + ' directory'
    category = ['NORMAL', 'PNEUMONIA']
    count = [x, y]
    plot = plt.bar(category,count) 
    plot[0].set_color('orange')
    plt.title(
        f"Number of values per category in {subdir_title.title()}\n")
    plt.show()

In [0]:
# plotting training data counts
bar_plot(train_dir)

In [0]:
# plotting validation data counts
bar_plot(val_dir)

In [0]:
# plotting test data counts
bar_plot(test_dir)

### Preview Images

In [0]:
# instantiate variables for a training image from each class
normal_example = os.listdir(f'{train_dir}/NORMAL')[33]
pneumonia_example = os.listdir(f'{train_dir}/PNEUMONIA')[33]

In [0]:
# plot a normal class training image
normal_img = plt.imread(f'{train_dir}/NORMAL/{normal_example}')
plt.title("normal")
plt.imshow(normal_img)
plt.show()

In [0]:
# plot a pneumonia class training image
pneumonia_img = plt.imread(f'{train_dir}/PNEUMONIA/{pneumonia_example}')
plt.imshow(pneumonia_img)
plt.title('pneumonia')
plt.show()

## Design Model 1
>Using Keras:
* Alternate convolutional and pooling layers
* Include later layers with a larger number of parameters in order to detect more abstract patterns
* Add final dense layers to add a classifier to the convolutional base
* Compile this model

In [0]:
# importing necessary libraries
from keras import layers
from keras import models
from keras import optimizers

In [0]:
# building the CNN model
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

In [0]:
# compiling the model
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-4),
              metrics=['acc'])

>Import keras modules: 
* to stop training when a monitored quantity has stopped improving
* to save the model after every epoch
* to reduce learning rate when a metric has stopped improving


In [0]:
# importing keras modules
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from keras.callbacks import ReduceLROnPlateau

In [0]:
# creating a directory for saving models
!mkdir ./models/

In [0]:
# creating callback checkpoints
f_path = './models/'
my_callbacks = [
    EarlyStopping(patience=4, verbose=1),
    ReduceLROnPlateau(factor=0.1, patience=3, min_lr=0.00001, verbose=1),
    ModelCheckpoint(filepath = f_path + 'p3_model.h5', 
    verbose=1, save_best_only=True, save_weights_only=False) 
    ]

In [0]:
# training and evaluating the model
# using the callback in the `model.fit`
history = model.fit_generator(train_generator, 
                              steps_per_epoch=100, 
                              epochs=30, 
                              validation_data=validation_generator, 
                              validation_steps=50, 
                              callbacks = my_callbacks)

In [0]:
model.summary()

__Visualize Training Results__

In [0]:
import matplotlib.pyplot as plt
%matplotlib inline 
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy along epochs')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss along epochs')
plt.legend()
plt.show()

In [0]:
# download the model
files.download('./models/p3_model.h5')

In [0]:
val_loss, val_acc = model.evaluate_generator(validation_generator, steps=50)

print('val loss:', val_loss)
print('val acc:', val_acc)

### Data Augmentation

In [0]:
# generating transformed data
train_datagen = ImageDataGenerator(rotation_range=40, 
                                   width_shift_range=0.2, 
                                   height_shift_range=0.2, 
                                   shear_range=0.2, 
                                   zoom_range=0.2, 
                                   horizontal_flip=True, 
                                   fill_mode='nearest')

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=32,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

# transforming only image size, for validation data
validation_generator = test_datagen.flow_from_directory(val_dir, 
                                                        target_size=(150, 150), 
                                                        batch_size=32, 
                                                        class_mode='binary')

# fitting a model to training set including transformations
history = model.fit_generator(train_generator, 
                              steps_per_epoch=100, 
                              epochs=100, 
                              validation_data=validation_generator, 
                              validation_steps=50, 
                              callbacks = my_callbacks)

In [0]:
val_loss, val_acc = model.evaluate_generator(validation_generator, steps=50)

print('val loss:', val_loss)
print('val acc:', val_acc)

> No significant change

In [0]:
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy along epochs')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss along epochs')
plt.legend()
plt.show()

In [0]:
# download the model
files.download('./models/p3_model.h5')

# summarize the model
model.summary()

__Plot Confusion Matrix__

In [0]:
# from sklearn.metrics import confusion_matrix
y_true=validation_generator.classes
print(y_true)

y_pred = model.predict_generator(validation_generator)
y_pred = np.argmax(y_pred,axis = 1)
print(y_pred)

In [0]:
from mlxtend.plotting import plot_confusion_matrix

In [0]:
CM = confusion_matrix(y_true, y_pred)

fig, ax = plot_confusion_matrix(conf_mat=CM , 
                                figsize=(7, 7), 
                                hide_ticks=True, 
                                cmap=plt.cm.Blues)
# plt.xticks(range(len(classes)), classes, fontsize=12)
# plt.yticks(range(len(classes)), classes, fontsize=12)
plt.title("Confusion Matrix for ...: \n") #+model_title, fontsize=11
# fig.savefig(image_file_name_CM, dpi=100)
plt.show()

### Evaluate Model 1 on Test Data

In [0]:
test_generator = test_datagen.flow_from_directory(test_dir, 
                                                  target_size=(150, 150), 
                                                  batch_size=20, 
                                                  class_mode='binary')

test_loss, test_acc = model.evaluate_generator(test_generator, steps=50)

print('test loss:', test_loss)
print('test acc:', test_acc)

> So the first model, with augmentation, has achieved __78.56\%__ accuracy predicting the class ('normal' or 'pneumonia' of our test set.

## Model 2

* add padding to input layer
* add dropout layers after each pooling layer

In [0]:
# building the a 2nd CNN model
model_2 = models.Sequential()
model_2.add(layers.Conv2D(32, (3, 3), activation='relu', padding="same",
                        input_shape=(150, 150, 3)))
model_2.add(layers.MaxPooling2D((2, 2)))
model_2.add(Dropout(0.25))
model_2.add(layers.Conv2D(64, (3, 3), activation='relu'))
model_2.add(layers.MaxPooling2D((2, 2)))
model.add(Dropout(0.25))
model_2.add(layers.Conv2D(128, (3, 3), activation='relu'))
model_2.add(layers.MaxPooling2D((2, 2)))
model_2.add(Dropout(0.25))
model_2.add(layers.Conv2D(128, (3, 3), activation='relu'))
model_2.add(layers.MaxPooling2D((2, 2)))
model_2.add(Dropout(0.25))
model_2.add(layers.Flatten() )# converts 3D feature maps to 1D vectors
model_2.add(Dense(64))
model_2.add(Activation('relu'))
model_2.add(layers.Dense(512, activation='relu'))
model_2.add(layers.Dense(1, activation='sigmoid'))

* Change the optimizer to `Adam`
* use 'accuracy' metrics

In [0]:
# compiling the model
model_2.compile(loss='binary_crossentropy',
              optimizer=optimizers.Adam(lr=1e-4),
              metrics=['acc'])

>Import keras modules: 
* to stop training when a monitored quantity has stopped improving
* to save the model after every epoch
* to reduce learning rate when a metric has stopped improving


In [0]:
# importing keras modules
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from keras.callbacks import ReduceLROnPlateau

In [0]:
# creating callback checkpoints
f_path = './models/'
my_callbacks = [
    EarlyStopping(patience=4, verbose=1),
    ReduceLROnPlateau(factor=0.1, patience=3, min_lr=0.00001, verbose=1),
    ModelCheckpoint(filepath = f_path + 'p3_model_2.h5', 
    verbose=1, save_best_only=True, save_weights_only=False) 
    ]

In [0]:
# training and evaluating the model
# using the callback in the `model.fit`
history_2 = model_2.fit_generator(train_generator, 
                              steps_per_epoch=100, 
                              epochs=30, 
                              validation_data=validation_generator, 
                              validation_steps=50, 
                              callbacks = my_callbacks)

In [0]:
model_2.summary()

__Visualize Training Results__

In [0]:
import matplotlib.pyplot as plt
%matplotlib inline 
acc = history_2.history['acc']
val_acc = history_2.history['val_acc']
loss = history_2.history['loss']
val_loss = history_2.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy along epochs')
plt.legend()

plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss along epochs')
plt.legend()
plt.show()

In [0]:
# download the model
files.download('./models/p3_model_2.h5')

In [0]:
val_loss, val_acc = model_2.evaluate_generator(validation_generator, 
                                               steps=50)

print('val loss:', val_loss)
print('val acc:', val_acc)

> Early stopping may have been a bit premature, for this model, as measures do not appear to have settled into a definitive trend.

__Plot Confusion Matrix__

In [0]:
# from sklearn.metrics import confusion_matrix
y_true=validation_generator.classes
print(y_true)

y_pred = model_2.predict_generator(validation_generator)
y_pred = np.argmax(y_pred,axis = 1)
print(y_pred)

In [0]:
CM = confusion_matrix(y_true, y_pred)

fig, ax = plot_confusion_matrix(conf_mat=CM , 
                                figsize=(7, 7), 
                                hide_ticks=True, 
                                cmap=plt.cm.Blues)
# plt.xticks(range(len(classes)), classes, fontsize=12)
# plt.yticks(range(len(classes)), classes, fontsize=12)
plt.title("Confusion Matrix for ...: \n") #+model_title, fontsize=11
# fig.savefig(image_file_name_CM, dpi=100)
plt.show()

### Evaluate Model 2 on Test Data

In [0]:
test_generator = test_datagen.flow_from_directory(test_dir, 
                                                  target_size=(150, 150), 
                                                  batch_size=20, 
                                                  class_mode='binary')

test_loss, test_acc = model_2.evaluate_generator(test_generator, steps=50)

print('test loss:', test_loss)
print('test acc:', test_acc)

>The second model performed poorly, with a test set prediction accuracy score ~__16\%__ lower than the first model.

## Model 3

* trying fewer layers and fewer dropouts
* add dropout layers after each pooling layer

In [0]:
# building the a 2nd CNN model
model_3 = models.Sequential()
model_3.add(layers.Conv2D(32, (3, 3), activation='relu', padding="same",
                        input_shape=(150, 150, 3)))
model_3.add(layers.MaxPooling2D((2, 2)))
model_3.add(Dropout(0.25))
model_3.add(layers.Conv2D(64, (3, 3), activation='relu'))
model_3.add(layers.MaxPooling2D((2, 2)))
model.add(Dropout(0.25))
model_3.add(layers.Conv2D(128, (3, 3), activation='relu'))
model_3.add(layers.MaxPooling2D((2, 2)))
model_3.add(layers.Flatten() )# this converts 3D feature maps to 1D vectors
model_3.add(layers.Dense(512, activation='relu'))
model_3.add(layers.Dense(1, activation='sigmoid'))

* Change the optimizer to `Adam`
* use 'accuracy' metrics

In [0]:
# compiling the model
model_3.compile(loss='binary_crossentropy',
              optimizer=optimizers.Adam(lr=1e-4),
              metrics=['acc'])

>Import keras modules: 
* to stop training when a monitored quantity has stopped improving
* to save the model after every epoch
* to reduce learning rate when a metric has stopped improving


In [0]:
# importing keras modules
from keras.callbacks import EarlyStopping
from keras.callbacks import ModelCheckpoint
from keras.callbacks import ReduceLROnPlateau

In [0]:
# increase early stopping patience

In [0]:
# updating callback checkpoints
f_path = './models/'
my_callbacks = [
    EarlyStopping(patience=7, verbose=1),
    ReduceLROnPlateau(factor=0.1, patience=3, min_lr=0.00001, verbose=1),
    ModelCheckpoint(filepath = f_path + 'p3_model_3.h5', 
    verbose=1, save_best_only=True, save_weights_only=False) 
    ]

In [0]:
# training and evaluating the model
# using the callback in the `model.fit`
history_3 = model_3.fit_generator(train_generator, 
                              steps_per_epoch=100, 
                              epochs=30, 
                              validation_data=validation_generator, 
                              validation_steps=50, 
                              callbacks = my_callbacks)

In [0]:
model_3.summary()

__Visualize Training Results__

In [0]:
import matplotlib.pyplot as plt
%matplotlib inline 
acc = history_3.history['acc']
val_acc = history_3.history['val_acc']
loss = history_3.history['loss']
val_loss = history_3.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy along epochs')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss along epochs')
plt.legend()
plt.show()

In [0]:
val_loss, val_acc = model_3.evaluate_generator(validation_generator, 
                                               steps=50)

print('val loss:', val_loss)
print('val acc:', val_acc)

> Acurracy droped from the previous model.

__Plot Confusion Matrix__

In [0]:
# from sklearn.metrics import confusion_matrix
y_true=validation_generator.classes
print(y_true)

y_pred = model_3.predict_generator(validation_generator)
y_pred = np.argmax(y_pred,axis = 1)
print(y_pred)

In [0]:
CM = confusion_matrix(y_true, y_pred)

fig, ax = plot_confusion_matrix(conf_mat=CM , 
                                figsize=(7, 7), 
                                hide_ticks=True, 
                                cmap=plt.cm.Blues)
# plt.xticks(range(len(classes)), classes, fontsize=12)
# plt.yticks(range(len(classes)), classes, fontsize=12)
plt.title("Confusion Matrix for ...: \n") #+model_title, fontsize=11
# fig.savefig(image_file_name_CM, dpi=100)
plt.show()

In [0]:
# download the model
files.download('./models/p3_model_3.h5')

### Evaluate Model 3 on Test Data

In [0]:
test_generator = test_datagen.flow_from_directory(test_dir, 
                                                  target_size=(150, 150), 
                                                  batch_size=20, 
                                                  class_mode='binary')

test_loss, test_acc = model_3.evaluate_generator(test_generator, steps=50)

print('test loss:', test_loss)
print('test acc:', test_acc)

>The accuracy score for test data is even (slightly) lower than that of the second model.

# Select a Final Model

> It turns out that our best performing model was our first augmented model, with RMSprop optimization.

In [0]:
# evaluating the best model on test data
# setting steps to 39 and batch_size to 16 (for 624 images)
test_generator = test_datagen.flow_from_directory(test_dir, 
                                                  target_size=(150, 150), 
                                                  batch_size=16, 
                                                  class_mode='binary')

test_loss, test_acc = model.evaluate_generator(test_generator, steps=39)

print('test loss:', test_loss)
print('test acc:', test_acc)

* We can use this model as a basis for further tuning, or distribute the model as - is.
* A potential approach would be to perform a gridsearch on the optimizer, to determine best parameters.


## Distribution

* Since we previously saved our `my_model.h5` model, we can distribute it and use it on another system or for another project, as follows.
* The string summary can be saved and shared, as well.

In [0]:
# loading the saved model
from keras.models import load_model

filepath = '/root/data/models/p3_model.h5' # current path to saved file

dist_model = load_model(filepath, 
         custom_objects={'loss':'binary_crossentropy'})

# printing string summary
dist_model.summary()

The model's architecture may be shared as an image.

In [0]:
from keras.utils import plot_model

# plot model architecture, only
plot_model(dist_model)

In [0]:
# evaluating the distributed model on test data
# setting steps to 39 and batch_size to 16 (for 624 images)
test_generator = test_datagen.flow_from_directory(test_dir, 
                                                  target_size=(150, 150), 
                                                  batch_size=16, 
                                                  class_mode='binary')

test_loss, test_acc = dist_model.evaluate_generator(test_generator, 
                                                    steps=39)

print('Restored model, loss:', test_loss)
print('Restored model, acc:', test_acc)

> Somehow, the restored model reports an even better performance than we realized when it was first run (> __86.06\%__ vs. __83.97\%__). 
* One apparent difference is that we used 39 steps to evaluate the reloaded model, where we originally had used 50 steps.
* I also included a 'ModelCheckpoint' callback that saved the model only after epochs where loss improved.
* Also worth of exploration is the effect of the lack of dropout layers in testing. 
>
>For now, let's re - check the distributed model against the test data.

# Summary