# Module 4 Practice 2 - CNN modification

## **NOTE: You need to use the TensorFlow CPU container for this notebook**

In this practice exercise you will modify the CNN that was built in the lab. You will add and alter layers in the CNN to examine the effect on the trained model.

In [1]:
import sys
!{sys.executable} -m pip install keras==2.3.1
!{sys.executable} -m pip install --upgrade "numpy>=1.2"


Requirement already up-to-date: numpy>=1.2 in /opt/conda/lib/python3.7/site-packages (1.21.5)


In [2]:
import os
import numpy as np
import math
import glob
from PIL import Image
import matplotlib.pyplot as plt

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten, BatchNormalization, Conv2D, MaxPooling2D
from keras.optimizers import Adadelta, SGD
from keras.utils import np_utils
from keras.callbacks import ModelCheckpoint, LearningRateScheduler

from sklearn.metrics import accuracy_score, roc_auc_score, roc_curve, auc, classification_report
from sklearn.model_selection import train_test_split

import tensorflow as tf

# we will set a random seed to exert some control over the random processes that occur in the neural network training.
tf.set_random_seed(42) 
np.random.seed(42)

Using TensorFlow backend.


## Create functions from the lab
These functions are from the lab and we'll define them in their original form first.

In [3]:
def shuffle(X, y):
    # shuffle the images in a random order so similar images and labeled classes are not grouped together
    rng = np.random.default_rng(seed=42)
    perm = rng.permutation(len(y))
    X = X[perm]
    y = y[perm]
    print (np.shape(X))
    return X, y


def read_image_data(filename):
    # read grayscale image data and return it as a flattened 1-D numpy array
    image = Image.open(filename)
    image = image.getdata()
    image = np.array(image)
    return image.reshape(-1)


def image_dir_to_array(dir):
    data = [read_image_data(image) for image in glob.glob(os.path.join(dir, '*.jpg'))]
    return np.array(data)


def load_data(negative_images_path, positive_images_path):
    negatives = image_dir_to_array(negative_images_path)
    positives = image_dir_to_array(positive_images_path)
    
    X=np.vstack((negatives, positives))
    X=X.astype(float) / 255 # normalize the grayscale values to the range 0..1 (np.float is deprecated since NumPy 1.20)
    y=np.concatenate((np.zeros(len(negatives)), np.ones(len(positives))))
    
    print ('shape of X', np.shape(X)) 
    print ('scale of X', np.min(X), np.max(X))
    print ('shape of y', np.shape(y)) 
    
    X, y = shuffle(X, y)
    return X, y


def reshape_X(X, img_channels, img_rows, img_cols):
    # reshape the data to the 4 dimensional format required by the CNN
    # the resulting shape will be (num_samples, img_channels (1 for grayscale images), img_rows, img_cols)
    return X.reshape(-1, img_channels, img_rows, img_cols)


def step_decay(epoch):
    initial_lrate = 0.01
    drop = 0.3
    epochs_drop = 30.0
    lrate = initial_lrate * math.pow(drop, math.floor((1+epoch)/epochs_drop))
    print ('learning rate', lrate)
    return lrate
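
The schedule in `step_decay` can be checked by hand. The sketch below repeats the same formula without the `print` side effect; the name `step_decay_value` and the keyword defaults are introduced here for illustration only and are not part of the lab code.

```python
import math


def step_decay_value(epoch, initial_lrate=0.01, drop=0.3, epochs_drop=30.0):
    """Same formula as step_decay above, returned without printing."""
    return initial_lrate * math.pow(drop, math.floor((1 + epoch) / epochs_drop))


# Epochs are 0-indexed, which is how Keras passes them to the scheduler.
for epoch in (0, 9, 28, 29, 59):
    print(epoch, step_decay_value(epoch))
```

With `epochs_drop = 30.0` the first drop only happens at epoch index 29, so a 10-epoch run stays at 0.01 throughout. That is consistent with the `learning rate 0.01` lines in the training logs further down.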

In [4]:
def create_model(img_channels, img_rows, img_cols):
    model = Sequential()
    
    model.add(Conv2D(16, kernel_size = 3, padding='same', input_shape=(img_channels, img_rows, img_cols)))
    model.add(Activation('relu'))

    model.add(Conv2D(16, kernel_size = 5, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    
    model.add(MaxPooling2D(pool_size = 2, data_format='channels_first'))

    model.add(Conv2D(16, kernel_size = 3, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(Conv2D(64, kernel_size = 5, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(Conv2D(64, kernel_size = 3, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    
    model.add(MaxPooling2D(pool_size = 2, data_format='channels_first'))

    model.add(Conv2D(128, kernel_size = 3, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(MaxPooling2D(pool_size = 2, data_format='channels_first'))
    
    model.add(Flatten())
    model.add(Dense(128, kernel_initializer="he_normal"))
    model.add(Activation('relu'))

    model.add(Dropout(0.5)) 
    model.add(Dense(32, kernel_initializer="he_normal"))
    model.add(Activation('relu'))

    model.add(Dropout(0.5)) 
    model.add(Dense(1))
    model.add(Activation('sigmoid'))
    
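    # Adadelta optimizer; lr=0 is only a placeholder, because the LearningRateScheduler
    # callback passed to fit() sets the learning rate at the start of every epoch
    optimizer = Adadelta(lr=0)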
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])

    return model

## Load the data and create a train test split
Use a 33% test size

In [5]:
# your code here

img_rows=32
img_cols=32
img_channels=1

negative_images_path = '../resources/cnn-images/negative_images/'
positive_images_path = '../resources/cnn-images/positive_images/'

X, y = load_data(negative_images_path, positive_images_path)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

X_train = reshape_X(X_train, img_channels, img_rows, img_cols)
X_test = reshape_X(X_test, img_channels, img_rows, img_cols)

shape of X (8710, 1024)
scale of X 0.0 1.0
shape of y (8710,)
(8710, 1024)
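
The split sizes can be worked out by hand from the 8710 images reported above. This is a minimal sketch using only arithmetic; the two rounding rules are assumptions about how `train_test_split` and the `validation_split` argument of `fit` round, not something taken from the lab.

```python
import math

n_samples = 8710          # from "shape of X (8710, 1024)"
test_size = 0.33          # test fraction passed to train_test_split
validation_split = 0.1    # validation fraction passed to model.fit

# Assumption: the test count is rounded up and the remainder is the training set.
n_test = math.ceil(n_samples * test_size)
n_train_total = n_samples - n_test

# Assumption: the validation split point is truncated toward zero.
n_fit = int(n_train_total * (1 - validation_split))
n_val = n_train_total - n_fit

print(n_test, n_train_total, n_fit, n_val)
```

Under these assumptions the last two numbers are 5251 and 584, which agrees with the `Train on 5251 samples, validate on 584 samples` line in the training output.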




## Recreate the original model
Use the code from the lab to recreate the original model

In [6]:
# your code here

model = create_model(img_channels, img_rows, img_cols)



## Train the original model
Use 10 epochs to train the model.  This should provide slightly more stable models, but will take twice as long to train.


In [7]:
# your code here

batch_size=64
nb_epoch=10

# save the model to model.h5 whenever val_loss improves (save_best_only=True)
model_checkpoint = ModelCheckpoint('model.h5', verbose=1, monitor='val_loss', save_best_only=True)

# create a learning rate callback
learning_rate = LearningRateScheduler(step_decay)

model.fit(X_train, y_train, batch_size=batch_size, epochs=nb_epoch, validation_split=0.1,
          callbacks=[model_checkpoint,learning_rate], shuffle=True)



Train on 5251 samples, validate on 584 samples
Epoch 1/10
learning rate 0.01

Epoch 00001: val_loss improved from inf to 0.73058, saving model to model.h5
Epoch 2/10
learning rate 0.01

Epoch 00002: val_loss improved from 0.73058 to 0.68832, saving model to model.h5
Epoch 3/10
learning rate 0.01

Epoch 00003: val_loss improved from 0.68832 to 0.65240, saving model to model.h5
Epoch 4/10
learning rate 0.01

Epoch 00004: val_loss improved from 0.65240 to 0.64091, saving model to model.h5
Epoch 5/10
learning rate 0.01

Epoch 00005: val_loss did not improve from 0.64091
Epoch 6/10
learning rate 0.01

Epoch 00006: val_loss improved from 0.64091 to 0.62509, saving model to model.h5
Epoch 7/10
learning rate 0.01

Epoch 00007: val_loss improved from 0.62509 to 0.57217, saving model to model.h5
Epoch 8/10
learning rate 0.01

Epoch 00008: val_loss improved from 0.57217 to 0.52834, saving model to model.h5
Epoch 9/10
learning rate 0.01

Epoch 00009: val_loss improved from 0.52834 to 0.50368, savi

<keras.callbacks.callbacks.History at 0x7fe9f03b8d68>

## Test the model
Output the ROC AUC value (the area under the ROC curve). We will use it to evaluate the models.

In [8]:
# your code here

Y_pred = model.predict(X_test, batch_size = 32)
roc = roc_auc_score(y_test, Y_pred)
print("ROC:", round(roc, 3))

ROC: 0.812
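
The number printed as `ROC` is the area under the ROC curve. One way to read it: it is the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The sketch below computes that fraction directly on a small made-up example; `pairwise_auc` and the toy data are illustrative and not part of the lab.

```python
def pairwise_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels and scores: 3 of the 4 positive/negative pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(pairwise_auc(toy_labels, toy_scores))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so 0.812 means roughly four out of five positive/negative pairs are ordered correctly. The double loop is quadratic in the number of samples and is only meant to make the definition concrete.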


## Modify the model to remove the dropouts

Create the model without the dropouts. Train and evaluate this new model.

In [9]:
# your code here

def create_model(img_channels, img_rows, img_cols):
    model = Sequential()
    
    model.add(Conv2D(16, kernel_size = 3, padding='same', input_shape=(img_channels, img_rows, img_cols)))
    model.add(Activation('relu'))

    model.add(Conv2D(16, kernel_size = 5, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    
    model.add(MaxPooling2D(pool_size = 2, data_format='channels_first'))

    model.add(Conv2D(16, kernel_size = 3, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(Conv2D(64, kernel_size = 5, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(Conv2D(64, kernel_size = 3, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    
    model.add(MaxPooling2D(pool_size = 2, data_format='channels_first'))

    model.add(Conv2D(128, kernel_size = 3, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

    model.add(MaxPooling2D(pool_size = 2, data_format='channels_first'))
    
    model.add(Flatten())
    model.add(Dense(128, kernel_initializer="he_normal"))
    model.add(Activation('relu'))

#     model.add(Dropout(0.5)) # we could also set the rate to zero but it incurs additional processing for no reason
    model.add(Dense(32, kernel_initializer="he_normal"))
    model.add(Activation('relu'))

#     model.add(Dropout(0.5)) 
    model.add(Dense(1))
    model.add(Activation('sigmoid'))
    
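    # Adadelta optimizer; lr=0 is only a placeholder, because the LearningRateScheduler
    # callback passed to fit() sets the learning rate at the start of every epoch
    optimizer = Adadelta(lr=0)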
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])

    return model
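
To make concrete what the two commented-out `Dropout(0.5)` layers were doing, here is a minimal NumPy sketch of inverted dropout as applied during training: each activation is zeroed with probability `rate`, and the survivors are scaled by `1 / (1 - rate)` so the expected value is unchanged. The function name and the toy array are assumptions for illustration; this is not the Keras implementation.

```python
import numpy as np


def inverted_dropout(activations, rate, rng):
    """Zero each activation with probability `rate`, then rescale the survivors."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


rng = np.random.default_rng(42)
activations = np.ones((4, 8))            # stand-in for a batch of dense-layer outputs
dropped = inverted_dropout(activations, rate=0.5, rng=rng)

print(dropped)         # every entry is either 0.0 or 2.0
print(dropped.mean())  # close to 1.0 on average, but varies with the random mask
```

At prediction time no units are dropped. Removing the layers therefore only changes training: the dense layers see every activation on every batch, which usually makes it easier to fit the training set and easier to overfit it.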

## Compare
Did the ROC increase?  

Your answer here

## Modify the model
Add back in the dropouts, and also add in a new convolution layer.

After this code:
```
    model.add(Conv2D(128, kernel_size = 3, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))

```

add the following code:
```
    model.add(Conv2D(128, kernel_size = 5, padding="same"))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
```

Recreate, retrain, and evaluate.
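
Before retraining, it can help to estimate how much the new block adds to the model. The sketch below counts the weights of a `Conv2D` layer from its kernel size, input channels and filter count. It assumes the new layer receives the 128 channels produced by the preceding `Conv2D(128, ...)` block; the helper name `conv2d_params` is made up for this example.

```python
def conv2d_params(kernel_size, in_channels, filters, use_bias=True):
    """Weights in a square-kernel Conv2D layer: k * k * in * out, plus one bias per filter."""
    weights = kernel_size * kernel_size * in_channels * filters
    return weights + (filters if use_bias else 0)


# The added layer: 5x5 kernels, 128 input channels (assumed), 128 filters.
added_5x5 = conv2d_params(kernel_size=5, in_channels=128, filters=128)

# For comparison, the same layer with 3x3 kernels.
same_with_3x3 = conv2d_params(kernel_size=3, in_channels=128, filters=128)

print(added_5x5, same_with_3x3)  # 409728 147584
```

The new convolution alone contributes about 410,000 trainable weights, plus a small number from its `BatchNormalization` layer. Calling `model.summary()` after rebuilding the model shows the exact per-layer counts.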

In [10]:
# your code here

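# create_model must first be redefined with the dropouts restored and the new Conv2D block added;
# otherwise this call rebuilds the no-dropout model from the previous cell
model = create_model(img_channels, img_rows, img_cols)

model.fit(X_train, y_train, batch_size=batch_size, epochs=nb_epoch, validation_split=0.1,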
          callbacks=[model_checkpoint,learning_rate], shuffle=True)

Y_pred = model.predict(X_test, batch_size = 32)
roc = roc_auc_score(y_test, Y_pred)
print("ROC:", round(roc, 3))

  


Train on 5251 samples, validate on 584 samples
Epoch 1/10
learning rate 0.01

Epoch 00001: val_loss did not improve from 0.49007
Epoch 2/10
learning rate 0.01

Epoch 00002: val_loss did not improve from 0.49007
Epoch 3/10
learning rate 0.01

Epoch 00003: val_loss did not improve from 0.49007
Epoch 4/10
learning rate 0.01

Epoch 00004: val_loss did not improve from 0.49007
Epoch 5/10
learning rate 0.01

Epoch 00005: val_loss did not improve from 0.49007
Epoch 6/10
learning rate 0.01

Epoch 00006: val_loss did not improve from 0.49007
Epoch 7/10
learning rate 0.01

Epoch 00007: val_loss did not improve from 0.49007
Epoch 8/10
learning rate 0.01

Epoch 00008: val_loss improved from 0.49007 to 0.42647, saving model to model.h5
Epoch 9/10
learning rate 0.01

Epoch 00009: val_loss improved from 0.42647 to 0.39224, saving model to model.h5
Epoch 10/10
learning rate 0.01

Epoch 00010: val_loss improved from 0.39224 to 0.37521, saving model to model.h5
ROC: 0.886
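
The log above starts with `val_loss did not improve from 0.49007`, although this is a newly created model. A plausible explanation is that the same `model_checkpoint` object from the first training run was passed to `fit` again and still remembered its best `val_loss`. The sketch below imitates that behaviour with a small stand-in class. `BestTracker` and all loss values are made up for illustration, and the assumption that the callback keeps its best value between `fit` calls should be checked against the installed Keras version.

```python
class BestTracker:
    """Hypothetical stand-in for a save_best_only checkpoint callback."""

    def __init__(self):
        self.best = float("inf")

    def update(self, val_loss):
        """Return True (meaning "save") only when val_loss beats the stored best."""
        if val_loss < self.best:
            self.best = val_loss
            return True
        return False


first_run = [0.73, 0.65, 0.49]    # made-up val_loss values for the first model
second_run = [0.80, 0.60, 0.43]   # made-up val_loss values for a second model

shared = BestTracker()
first_saves = [shared.update(v) for v in first_run]
second_saves_shared = [shared.update(v) for v in second_run]

fresh = BestTracker()
second_saves_fresh = [fresh.update(v) for v in second_run]

print(second_saves_shared)  # [False, False, True]
print(second_saves_fresh)   # [True, True, True]
```

If this is what happened, creating a new `ModelCheckpoint` (ideally with a different file name) for each model keeps the saved files and the "improved" messages specific to that model.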


## Conclusion

Did this new layer improve the model over the original (first) model?

Your answer here

The last model is likely to have improved the ROC AUC over the original model, and may have rivaled the overfit model without the dropouts. With so few epochs, results fluctuate from run to run, so this may not have been the case when you ran the notebook. Running the final model several times should, on average, produce a better ROC AUC than the first model.