---
<a id='step3'></a>
## Test image augmentation effect on a typical CNN for classifying Dog Breeds 

- 1) First, we create a CNN _from scratch_ and test its accuracy on 50 different dog breeds.
- 2) Second, we augment the data manually by writing augmented images to disk, then train on this data set to identify any changes in accuracy and overfitting.
- 3) Third, we use real-time data augmentation and see how it compares with point 2.


Note:
Random chance sets a low bar. Setting aside the fact that the classes are slightly imbalanced, a random guess gives the correct answer 1 time in 50, that is, 2% of the time.
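
As a rough illustration of the baseline in the note above, the sketch below computes the accuracy of a uniform random guess and of always predicting the most frequent class. The function names and the label list are hypothetical and are not part of this notebook's data.

```python
from collections import Counter


def chance_accuracy(n_classes):
    """Expected accuracy of a uniform random guess over n_classes."""
    return 1.0 / n_classes


def majority_class_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Hypothetical, slightly imbalanced label list (illustration only)
example_labels = [0] * 5 + [1] * 3 + [2] * 2

print(chance_accuracy(50))                      # 0.02
print(majority_class_accuracy(example_labels))  # 0.5
```

With imbalanced classes the majority-class baseline is higher than the uniform one, so it is arguably the fairer bar to compare a model against.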


---
### Create realtime augmented images on the fly

In the code block below we load the existing images and transform each one into additional images in memory. Each image may be scaled, rotated, mirrored and/or translated in order to obtain a representation that is more invariant to these changes.

What to expect: augmentation effectively gives the model more varied data to train on, so we expect it to generalise better and overfit less.
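
To make the transformations concrete, here is a minimal NumPy sketch of two of them, mirroring and horizontal translation. The helpers `horizontal_flip` and `shift_right` are hypothetical names written for this illustration; they are not how `ImageDataGenerator` is implemented, and the edge-repeat fill only loosely resembles `fill_mode='nearest'`.

```python
import numpy as np


def horizontal_flip(img):
    """Mirror an image of shape (rows, columns, channels) left to right."""
    return img[:, ::-1, :]


def shift_right(img, pixels):
    """Shift an image right by `pixels` columns.

    The columns uncovered on the left are filled by repeating the
    original left edge column.
    """
    if pixels == 0:
        return img.copy()
    shifted = np.empty_like(img)
    shifted[:, pixels:, :] = img[:, :-pixels, :]
    shifted[:, :pixels, :] = img[:, :1, :]  # broadcast the edge column
    return shifted


# Tiny 2x3 single-channel "image" so the effect is easy to read
img = np.arange(6).reshape(2, 3, 1)

print(img[:, :, 0].tolist())                   # [[0, 1, 2], [3, 4, 5]]
print(horizontal_flip(img)[:, :, 0].tolist())  # [[2, 1, 0], [5, 4, 3]]
print(shift_right(img, 1)[:, :, 0].tolist())   # [[0, 0, 1], [3, 3, 4]]
```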

In [244]:
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
from sklearn.datasets import load_files 
from keras.utils import np_utils
from glob import glob
import numpy as np
import os

noOfBreeds = 10
batchSize = 1

def load_dataset(path):
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    dog_targets = np_utils.to_categorical(np.array(data['target']), noOfBreeds)
    return dog_files, dog_targets, np.array(data['target'])

train_files, train_targets, train_targets_index = load_dataset('dogImages10/train')
valid_files, valid_targets, valid_targets_index = load_dataset('dogImages10/valid')
test_files, test_targets, test_targets_index  = load_dataset('dogImages10/test')
# load list of dog names; 17 is the length of the string 'dogImages10/train'
dog_names = [item[17:-1] for item in sorted(glob("dogImages10/train/*/"))]

train_data_generator = ImageDataGenerator(
    rescale=1.0/255,
    width_shift_range=0.15,  # randomly shift images horizontally (15% of total width)
    height_shift_range=0.15,  # randomly shift images vertically (15% of total height)
    rotation_range=45,  # degree range for random rotations
    zoom_range=0.2, # range for random zoom [1-zoom_range, 1+zoom_range]
    horizontal_flip=True, # randomly flip images horizontally
    fill_mode='nearest') 

valid_data_generator = ImageDataGenerator(rescale=1.0/255)

### Pre-process the Data

When using TensorFlow as backend, Keras CNNs require a 4D array (which we'll also refer to as a 4D tensor) as input, with shape

$$
(\text{nb_samples}, \text{rows}, \text{columns}, \text{channels}),
$$

where `nb_samples` corresponds to the total number of images (or samples), and `rows`, `columns`, and `channels` correspond to the number of rows, columns, and channels for each image, respectively.  

The `path_to_tensor` function below takes a string-valued file path to a color image as input and returns a 4D tensor suitable for supplying to a Keras CNN.  The function first loads the image and resizes it to a square image that is $224 \times 224$ pixels.  Next, the image is converted to an array, which is then resized to a 4D tensor.  In this case, since we are working with color images, each image has three channels.  Likewise, since we are processing a single image (or sample), the returned tensor will always have shape

$$
(1, 224, 224, 3).
$$

The `paths_to_tensor` function takes a numpy array of string-valued image paths as input and returns a 4D tensor with shape 

$$
(\text{nb_samples}, 224, 224, 3).
$$

Here, `nb_samples` is the number of samples, or number of images, in the supplied array of image paths.  It is best to think of `nb_samples` as the number of 3D tensors (where each 3D tensor corresponds to a different image) in your dataset!
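
The shape bookkeeping described above can be checked without reading any files. The sketch below uses random arrays as stand-ins for decoded images (an assumption made purely for illustration) and applies the same `np.expand_dims` and `np.vstack` steps.

```python
import numpy as np

# Stand-ins for three decoded 224x224 RGB images (random values, no files read)
rng = np.random.default_rng(0)
images = [
    rng.integers(0, 256, size=(224, 224, 3)).astype("float32")
    for _ in range(3)
]

# 3D tensor -> 4D tensor: add a leading "sample" axis
single = np.expand_dims(images[0], axis=0)
print(single.shape)  # (1, 224, 224, 3)

# Stack the single-sample tensors along the sample axis
batch = np.vstack([np.expand_dims(img, axis=0) for img in images])
print(batch.shape)   # (3, 224, 224, 3)
```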

Finally, we rescale the images by dividing every pixel value by 255. For the training and validation sets this is done by the generators (`rescale=1.0/255`); the test tensors are divided by 255 directly. Why scale in the first place?

1. Treat all images in the same manner. Some images span a high pixel range and some a low one, yet all of them share the same model, weights and learning rate. A high-range image tends to produce a larger loss and a low-range image a smaller one, and the sum of these losses drives the backpropagation update. For visual understanding, the contours matter more than the strength of the contrast, as long as the contours are preserved. Scaling every image to the same range [0, 1] lets images contribute more evenly to the total loss. In other words, a high-range cat image gets one vote, a low-range cat image gets one vote, a high-range dog image gets one vote and a low-range dog image gets one vote, which is what we expect when training a dog/cat image classifier. Without scaling, high-range images get a disproportionate number of votes in determining how the weights are updated. For example, a black-and-white cat image may span a wider pixel range than a pure black cat image, but that does not make it more important for training.
2. Use typical learning rates. When we take a learning rate from someone else's work, we can reuse it directly if both works apply the same scaling to their image data. Otherwise, images with a higher pixel range produce a larger loss and call for a smaller learning rate, while images with a lower pixel range need a larger one.
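
A small sketch of the rescaling step, using two hypothetical 2x2 grayscale patches. Note that dividing by a constant bounds every image to [0, 1] but keeps the relative contrast between images: the low-contrast patch still spans a narrow range afterwards.

```python
import numpy as np


def rescale(pixels):
    """Map 8-bit pixel values from [0, 255] to [0, 1] as float32."""
    return pixels.astype("float32") / 255.0


# Hypothetical patches, for illustration only
high_contrast = np.array([[0, 255], [255, 0]], dtype="uint8")
low_contrast = np.array([[100, 110], [105, 100]], dtype="uint8")

for name, patch in [("high", high_contrast), ("low", low_contrast)]:
    scaled = rescale(patch)
    # Both patches now lie inside [0, 1]; only their spread differs
    print(name, float(scaled.min()), float(scaled.max()))
```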

In [245]:
from keras.preprocessing import image                  
from tqdm import tqdm
from PIL import ImageFile                            
ImageFile.LOAD_TRUNCATED_IMAGES = True                 

def path_to_tensor(img_path):
    img = image.load_img(img_path, target_size=(224, 224))
    # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    # convert 3D tensor to 4D tensor with shape (1, 224, 224, 3) and return 4D tensor
    return np.expand_dims(x, axis=0)

def paths_to_tensor(img_paths):
    list_of_tensors = [path_to_tensor(img_path) for img_path in tqdm(img_paths)]
    return np.vstack(list_of_tensors)


# pre-process the data for Keras
train_tensors = paths_to_tensor(train_files).astype('float32')
valid_tensors = paths_to_tensor(valid_files).astype('float32')
test_tensors = paths_to_tensor(test_files).astype('float32')/255

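# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied);
# neither generator enables featurewise_center, featurewise_std_normalization
# or zca_whitening, so these fit() calls are optional here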
train_data_generator.fit(train_tensors)
valid_data_generator.fit(valid_tensors)


100%|██████████| 200/200 [00:01<00:00, 99.17it/s]
100%|██████████| 50/50 [00:00<00:00, 65.14it/s]
100%|██████████| 50/50 [00:00<00:00, 139.65it/s]


### Define the Model

In [258]:
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential

model = Sequential()
model.add(Conv2D(filters=16, kernel_size=2, padding='same', activation='relu', input_shape=(224, 224, 3)))
model.add(MaxPooling2D(pool_size=4, strides=2))
model.add(Conv2D(filters=32, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=3, strides=2))
model.add(Conv2D(filters=64, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2, strides=2))    
model.add(Conv2D(filters=128, kernel_size=2, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2, strides=2))

#model.add(GlobalAveragePooling2D())   
model.add(Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

model.summary()


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_85 (Conv2D)           (None, 224, 224, 16)      208       
_________________________________________________________________
max_pooling2d_85 (MaxPooling (None, 111, 111, 16)      0         
_________________________________________________________________
conv2d_86 (Conv2D)           (None, 111, 111, 32)      2080      
_________________________________________________________________
max_pooling2d_86 (MaxPooling (None, 55, 55, 32)        0         
_________________________________________________________________
conv2d_87 (Conv2D)           (None, 55, 55, 64)        8256      
_________________________________________________________________
max_pooling2d_87 (MaxPooling (None, 27, 27, 64)        0         
_________________________________________________________________
conv2d_88 (Conv2D)           (None, 27, 27, 128)       32896     
__________
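
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes what the model definition states: every convolution uses a 2x2 kernel with `padding='same'` (so it keeps the spatial size), and every pooling layer uses stride 2 with Keras's default `'valid'` padding. The helper names are hypothetical.

```python
def pool_output_size(size, pool_size, stride):
    """Output length of a 'valid' pooling window along one dimension."""
    return (size - pool_size) // stride + 1


def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return kernel_size * kernel_size * in_channels * filters + filters


size, channels = 224, 3
# (filters, pool_size) for each Conv2D + MaxPooling2D pair in the model
layers = [(16, 4), (32, 3), (64, 2), (128, 2)]

for filters, pool_size in layers:
    params = conv2d_params(2, channels, filters)
    pooled = pool_output_size(size, pool_size, stride=2)
    print(f"conv ({size}, {size}, {filters}) params={params}"
          f" -> pool ({pooled}, {pooled}, {filters})")
    size, channels = pooled, filters
```

The first three lines printed match the rows shown in the summary (208, 2080 and 8256 parameters; 111, 55 and 27 pixels after pooling), and the fourth convolution's 32896 parameters match as well.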

### Compile the Model

In [252]:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])

### (IMPLEMENTATION) Train the Model

Train your model in the code cell below.  Use model checkpointing to save the model that attains the best validation loss.  

You are welcome to [augment the training data](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html), but this is not a requirement. 
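
The idea behind checkpointing on the best validation loss can be sketched in plain Python: keep track of the lowest loss seen so far and "save" only when a new epoch beats it. This is a simplified illustration of the rule, not the Keras `ModelCheckpoint` implementation, and the loss values are made up.

```python
def best_epoch_tracker(val_losses):
    """Return the 1-based epochs at which a save-best-only rule would save,
    together with the best (lowest) validation loss seen."""
    best = float("inf")
    saved_at = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strictly better than anything seen so far
            best = loss
            saved_at.append(epoch)
    return saved_at, best


# Hypothetical validation losses, not results from this notebook
losses = [2.31, 2.28, 2.30, 2.25, 2.27]
saved_at, best = best_epoch_tracker(losses)
print(saved_at, best)  # [1, 2, 4] 2.25
```

Loading the saved weights afterwards therefore restores the model from the best epoch rather than the last one.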

In [253]:
from keras.callbacks import ModelCheckpoint  

epochs = 10
batchSize = 2

checkpointer = ModelCheckpoint(filepath='saved_models/weights.best.from_scratch.real.aug.v9.hdf5', 
                               verbose=1, save_best_only=True)

# fits the model on batches with real-time data augmentation:
model.fit_generator(train_data_generator.flow(train_tensors, train_targets, batch_size=batchSize),
                    steps_per_epoch=len(train_tensors) // batchSize,  # integer number of batches per epoch
                    epochs=epochs, callbacks=[checkpointer],
                    validation_data=valid_data_generator.flow(valid_tensors, valid_targets, batch_size=batchSize),
                    #validation_data=(valid_tensors, valid_targets),
                    validation_steps=len(valid_tensors) // batchSize,
                    verbose=1, workers=1)


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x29b48a576d8>

### (IMPLEMENTATION) Load the Model with the Best Validation Loss

In [254]:
model.load_weights('saved_models/weights.best.from_scratch.real.aug.v9.hdf5')

### (IMPLEMENTATION) Test the Model

Try out your model on the test dataset of dog images. Ensure that your test accuracy is greater than 60%.

In [255]:

# get index of predicted dog breed for each image in test set
dog_breed_predictions = [np.argmax(model.predict(np.expand_dims(tensor, axis=0))) for tensor in test_tensors]

# report test accuracy
test_accuracy = 100*np.sum(np.array(dog_breed_predictions)==np.argmax(test_targets, axis=1))/len(dog_breed_predictions)
print('Test accuracy: %.4f%%' % test_accuracy)


Test accuracy: 6.0000%
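
The accuracy calculation above compares the arg-max of each prediction with the arg-max of the one-hot target. A minimal, self-contained sketch of the same computation on a whole batch at once is shown below; the probabilities and targets are made up for illustration and `accuracy_percent` is a hypothetical helper.

```python
import numpy as np


def accuracy_percent(probabilities, one_hot_targets):
    """Percentage of rows whose arg-max prediction matches the target class."""
    predicted = np.argmax(probabilities, axis=1)
    actual = np.argmax(one_hot_targets, axis=1)
    return 100.0 * np.mean(predicted == actual)


# Hypothetical softmax outputs for 4 images and 3 classes
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
targets = np.eye(3)[[0, 1, 1, 0]]  # one-hot rows for classes 0, 1, 1, 0

print(accuracy_percent(probs, targets))  # 75.0
```

When reading the result, it helps to compare it with the chance level for the number of classes being predicted.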
