# Assignment 04

Pedro Stramantinoli P. Cagliume Gomes 175955

Ruy Castilho Barrichelo 177012

In [0]:
  %matplotlib inline

tensorflow version: 1.12.0-rc0
scikit-learn version: 0.17
keras version: 2.2.4
tensorboard version: 1.10

# Transfer Learning
In this assignment, we will use the weights of a network pre-trained on one problem as the starting point for training our CNN on a different problem. Since training a network from scratch is time-consuming and demands a lot of data, this is a frequent strategy, especially when both datasets (the one used for pre-training and the target) share similar structures, elements, or concepts.

This is especially true when working with images. Most filters learned in the initial convolutional layers detect low-level elements, such as borders, corners, and color blobs, which are common to most problems in the image domain.

In this notebook, we will load the SqueezeNet architecture pre-trained on the ImageNet dataset and fine-tune it on CIFAR-10.
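
To make the idea of a low-level filter concrete, here is a minimal sketch in plain Python. The 2x2 `vertical_edge` kernel and the toy image are hypothetical and hand-written for illustration; they are not taken from SqueezeNet's learned weights. The filter responds only where a dark region meets a bright one, which is the kind of border detection the early layers tend to learn.

```python
def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 4x4 image: dark left half (0), bright right half (1).
image = [[0, 0, 1, 1] for _ in range(4)]

# Hypothetical hand-written vertical-edge filter (assumption, for illustration).
vertical_edge = [[-1, 1],
                 [-1, 1]]

response = correlate2d_valid(image, vertical_edge)
for row in response:
    print(row)  # non-zero only in the column where the border sits
```

Because such a filter depends only on local contrast, it is equally useful on ImageNet and on CIFAR-10 images, which is the motivation for reusing the early layers.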

## Imports

In [0]:
import os
import numpy as np
from random import sample, seed
seed(42)
np.random.seed(42)

import matplotlib.pyplot as plt
# plt.rcParams['figure.figsize'] = (15,15) # Make the figures a bit bigger

from google.colab import files

# Keras imports
from keras.layers import Input, Convolution2D, MaxPooling2D, Activation, concatenate, Dropout, GlobalAveragePooling2D
from keras.models import Model
from keras import regularizers
from keras.optimizers import Adam
from keras.utils import np_utils
from keras.preprocessing.image import load_img, img_to_array
from keras.datasets import cifar10
from keras.callbacks import TensorBoard
from sklearn.model_selection import StratifiedShuffleSplit

#Utility to plot
def plotImages(imgList):
    for i in range(len(imgList)):
        plotImage(imgList[i])
        
        
def plotImage(img):
    fig = plt.figure(figsize=(3,3))
    ax = fig.add_subplot(111)

    ax.imshow(np.uint8(img), interpolation='nearest')
    plt.show()

## SqueezeNet definition
These methods define our architecture and load the weights obtained using ImageNet data.

In [0]:
# Fire Module Definition
sq1x1 = "squeeze1x1"
exp1x1 = "expand1x1"
exp3x3 = "expand3x3"
relu = "relu_"

def fire_module(x, fire_id, squeeze=16, expand=64):
    s_id = 'fire' + str(fire_id) + '/'
  
    channel_axis = 3
    
    x = Convolution2D(squeeze, (1, 1), padding='valid', name=s_id + sq1x1)(x)
    x = Activation('relu', name=s_id + relu + sq1x1)(x)

    left = Convolution2D(expand, (1, 1), padding='valid', name=s_id + exp1x1)(x)
    left = Activation('relu', name=s_id + relu + exp1x1)(left)

    right = Convolution2D(expand, (3, 3), padding='same', name=s_id + exp3x3)(x)
    right = Activation('relu', name=s_id + relu + exp3x3)(right)

    x = concatenate([left, right], axis=channel_axis, name=s_id + 'concat')
    return x

#SqueezeNet model definition
def SqueezeNet(input_shape):
    img_input = Input(shape=input_shape) #placeholder
    
    x = Convolution2D(64, (3, 3), strides=(2, 2), padding='valid', name='conv1')(img_input)
    x = Activation('relu', name='relu_conv1')(x)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), name='pool1')(x)

    x = fire_module(x, fire_id=2, squeeze=16, expand=64)
    x = fire_module(x, fire_id=3, squeeze=16, expand=64)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), name='pool3')(x)

    x = fire_module(x, fire_id=4, squeeze=32, expand=128)
    x = fire_module(x, fire_id=5, squeeze=32, expand=128)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), name='pool5')(x)

    x = fire_module(x, fire_id=6, squeeze=48, expand=192)
    x = fire_module(x, fire_id=7, squeeze=48, expand=192)
    x = fire_module(x, fire_id=8, squeeze=64, expand=256)
    x = fire_module(x, fire_id=9, squeeze=64, expand=256)
    
    x = Dropout(0.5, name='drop9')(x)

    x = Convolution2D(1000, (1, 1), padding='valid', name='conv10')(x)
    x = Activation('relu', name='relu_conv10')(x)
    x = GlobalAveragePooling2D()(x)
    x = Activation('softmax', name='loss')(x)

    model = Model(img_input, x, name='squeezenet')

    # Load the ImageNet weights (the .h5 file must already exist at this path)
    model.load_weights('./squeezenet_weights_tf_dim_ordering_tf_kernels.h5')
    
    return model    
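
As a rough check of what this architecture does to small inputs, the sketch below traces one spatial dimension through the four downsampling layers defined above (`conv1`, `pool1`, `pool3`, `pool5`). It assumes, as in the code above, that all four use a 3x3 window, stride 2 and no padding, and that the fire modules and dropout leave the spatial size unchanged. The 227-pixel input is only a hypothetical comparison point.

```python
def valid_output_size(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution or pooling window."""
    return (size - kernel) // stride + 1


def trace_spatial_sizes(input_size):
    """Follow one spatial dimension through conv1, pool1, pool3 and pool5."""
    sizes = {}
    size = input_size
    for name in ("conv1", "pool1", "pool3", "pool5"):
        # Each of these layers uses a 3x3 window with stride 2 and no padding.
        size = valid_output_size(size, kernel=3, stride=2)
        sizes[name] = size
    return sizes


sizes_32 = trace_spatial_sizes(32)    # CIFAR-10 sized input
sizes_227 = trace_spatial_sizes(227)  # hypothetical larger input, for comparison
print(sizes_32)
print(sizes_227)
```

Under these assumptions a 32x32 image is already reduced to a single spatial position after `pool5`, so fire modules 6 to 9 operate on a 1x1 feature map.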

## CIFAR-10

The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. The classes are **airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck**.
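
The labels are integers from 0 to 9. A small lookup such as the one below can make them readable; it assumes the standard CIFAR-10 label order, which matches the order of the class list above. The helper name `label_to_name` is hypothetical.

```python
# Class names in label order (index 0 is 'airplane', index 9 is 'truck').
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def label_to_name(label):
    """Map an integer CIFAR-10 label to its class name."""
    return CIFAR10_CLASSES[int(label)]


# Sanity check on the dataset description: 10 classes x 6000 images each.
total_images = len(CIFAR10_CLASSES) * 6000
print(label_to_name(3), total_images)
```

Since the label arrays have shape `(N, 1)`, a call would look like `label_to_name(trainVal_label[0][0])` before the one-hot encoding step.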

In [4]:
# Load data
(trainVal_data, trainVal_label), (X_test, y_test) = cifar10.load_data()
print("Train/Val data. X: ", trainVal_data.shape, ", Y: ", trainVal_label.shape)
print("Test data. X: ", X_test.shape, ", Y: ", y_test.shape)

Train/Val data. X:  (50000, 32, 32, 3) , Y:  (50000, 1)
Test data. X:  (10000, 32, 32, 3) , Y:  (10000, 1)


In [0]:
# Prepare data

# Scaling and Normalization
trainVal_data, X_test = trainVal_data/255, X_test/255

trainVal_data_mean = np.mean(trainVal_data, axis=0)

X_test_mean = np.mean(X_test, axis=0)

trainVal_data = trainVal_data - trainVal_data_mean
X_test = X_test - X_test_mean

# Encoding
trainVal_label = np_utils.to_categorical(trainVal_label)
y_test = np_utils.to_categorical(y_test)

# Train and Validation

X_train, y_train = trainVal_data[:40000], trainVal_label[0:40000]
X_val, y_val = trainVal_data[40000:], trainVal_label[40000:]
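
The preparation cell above centers the test images with the test set's own mean. A common alternative, sketched here on small synthetic arrays (hypothetical data, not CIFAR-10), estimates the per-pixel mean on the training images only and reuses it for every other split, so that all splits go through the same transformation.

```python
import numpy as np

rng = np.random.RandomState(42)

# Synthetic stand-ins for image batches, scaled to [0, 1] like the cell above.
train = rng.randint(0, 256, size=(8, 4, 4, 3)).astype("float32") / 255
test = rng.randint(0, 256, size=(4, 4, 4, 3)).astype("float32") / 255

# Per-pixel mean estimated on the training split only.
train_mean = train.mean(axis=0)

train_centered = train - train_mean
test_centered = test - train_mean  # reuse the training statistic

print(train_mean.shape, test_centered.shape)
```

This is only a sketch of the alternative; the results reported in this notebook were obtained with the preparation cell as written.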

-----------------
## SqueezeNet with frozen layers
Our initial attempt will be to remove SqueezeNet's top layers (responsible for the classification into ImageNet classes) and train a new set of layers for our CIFAR-10 classes. We will also freeze the layers before `drop9`. The resulting architecture looks like this:

<img src="frozenSqueezeNet.png" width=70% height=70%>

In [0]:
from google.colab import files
squeezeNetModel = SqueezeNet((32,32,3))

# Freeze layers
for layer in squeezeNetModel.layers:
    layer.trainable = False

# Popping last 4 layers
for i in range(0, 4):
  squeezeNetModel.layers.pop()

# Add new classification layers
  
x = squeezeNetModel.layers[-1].output
x = Convolution2D(10, (1, 1), padding='valid', name='conv10')(x)
x = Activation('relu', name='relu_conv10')(x)
x = GlobalAveragePooling2D()(x)
x = Activation('softmax', name='loss')(x)

# New Model
model = Model(squeezeNetModel.inputs, x, name='squeezenet_new')
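
To get a feel for how little of this network is actually trained, the sketch below counts the parameters of the new `conv10` layer by hand. It assumes the standard convolution parameter count (kernel weights plus one bias per filter) and that `conv10` receives the 512 channels produced by concatenating the two 256-filter expand branches of fire9. The helper name `conv2d_params` is hypothetical.

```python
def conv2d_params(in_channels, filters, kernel_size):
    """Weights plus biases of a standard 2D convolution layer."""
    return kernel_size * kernel_size * in_channels * filters + filters


# fire9 concatenates two expand branches of 256 filters each.
fire9_channels = 2 * 256

# New 10-class head versus the original 1000-class ImageNet head.
head_params = conv2d_params(fire9_channels, filters=10, kernel_size=1)
original_head_params = conv2d_params(fire9_channels, filters=1000, kernel_size=1)

print(head_params, original_head_params)
```

With every other layer frozen, these few thousand parameters are the only ones the optimizer can adjust.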

Now, we compile our model and train it:

In [7]:
import copy

# Compile model and train it.

# Compilation

model.compile(Adam(), loss='mean_squared_error', metrics=['accuracy'])

# Training
initial_results = []

# 1

batch_size=500
epochs=50

current_model = copy.deepcopy(model)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 2

batch_size=500
epochs=150

current_model = copy.deepcopy(model)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})


# 3

batch_size=500
epochs=300

current_model = copy.deepcopy(model)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})


# 4

batch_size=250
epochs=50

current_model = copy.deepcopy(model)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 5

batch_size=250
epochs=150

current_model = copy.deepcopy(model)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 6

batch_size=250
epochs=300

current_model = copy.deepcopy(model)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})


Epoch 1/50
...
Epoch 50/50
Epoch 1/150
...
Epoch 32/150
[output truncated]

Finally, let's evaluate on our validation set:

In [9]:
# Evaluate on validation:

print('Evaluation on validation sets\n')
for i in range(0,len(initial_results)):
  score = initial_results[i]
  print('#' + str(i+1), 'Validation loss:', score['result'][0])
  print('#' + str(i+1), 'Validation accuracy (NORMALIZED):', score['result'][1], '\n')

Evaluation on validation sets

#1 Validation loss: 0.07471009876728057
#1 Validation accuracy (NORMALIZED): 0.4067 

#2 Validation loss: 0.07427106039524078
#2 Validation accuracy (NORMALIZED): 0.4078 

#3 Validation loss: 0.07417579771280289
#3 Validation accuracy (NORMALIZED): 0.4101 

#4 Validation loss: 0.07454592949151993
#4 Validation accuracy (NORMALIZED): 0.4086 

#5 Validation loss: 0.07418544821739197
#5 Validation accuracy (NORMALIZED): 0.4087 

#6 Validation loss: 0.07421631144285203
#6 Validation accuracy (NORMALIZED): 0.4063 
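
The six runs reported above follow a fixed grid of two batch sizes and three epoch counts. As a sketch of how the repeated training blocks could be expressed as one loop, the code below iterates over that grid in the same order. The function `run_grid` and the stand-in `fake_train_and_evaluate` are hypothetical; in the notebook, the callback would copy the model, call `fit` and return the output of `evaluate`.

```python
from itertools import product

BATCH_SIZES = [500, 250]
EPOCH_COUNTS = [50, 150, 300]


def run_grid(train_and_evaluate):
    """Call `train_and_evaluate(batch_size, epochs)` for every grid point."""
    results = []
    for batch_size, epochs in product(BATCH_SIZES, EPOCH_COUNTS):
        result = train_and_evaluate(batch_size, epochs)
        results.append({'result': result,
                        'batch_size': batch_size,
                        'epochs': epochs})
    return results


def fake_train_and_evaluate(batch_size, epochs):
    """Stand-in so the sketch runs without Keras; returns placeholder values."""
    return [0.0, 0.0]  # placeholder [loss, accuracy], not real results


grid_results = run_grid(fake_train_and_evaluate)
for run in grid_results:
    print(run['batch_size'], run['epochs'])
```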



-----------------
-----------------

# Training last 2 Fire Modules + classification layers
As we could see, the frozen network performed very poorly, reaching only about 41% validation accuracy in every run. By freezing most layers, we do not allow SqueezeNet to adapt its weights to features present in CIFAR-10.

Let's try to unfreeze the last two fire modules and train once more. The architecture will be:
<img src="partFrozenSqueezeNet.png" width=70% height=70%>

In [10]:
squeezeNetModel = SqueezeNet((32,32,3))

print([layer.name for layer in squeezeNetModel.layers])

# Freeze layers
# The printed layer names show that the first layer of the 8th fire module
# sits 19 positions from the end of the list, so everything before it is frozen.
for layer in squeezeNetModel.layers[:-19]:
    layer.trainable = False


# Popping last 4 layers
for i in range(0, 4):
  squeezeNetModel.layers.pop()

# Add new classification layers
x = squeezeNetModel.layers[-1].output
x = Convolution2D(10, (1, 1), padding='valid', name='conv10')(x)
x = Activation('relu', name='relu_conv10')(x)
x = GlobalAveragePooling2D()(x)
x = Activation('softmax', name='loss')(x)

#new Model
model2 = Model(squeezeNetModel.inputs, x, name='squeezenet_new')
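
A hand count of the trainable parameters in this configuration can be sketched as follows. It assumes the standard convolution parameter count (kernel weights plus one bias per filter) and the channel sizes from the SqueezeNet definition above: fire8 receives the 2 x 192 channels of fire7, and fire9 and the new `conv10` each receive 2 x 256 channels. The helper names are hypothetical.

```python
def conv2d_params(in_channels, filters, kernel_size):
    """Weights plus biases of a standard 2D convolution layer."""
    return kernel_size * kernel_size * in_channels * filters + filters


def fire_params(in_channels, squeeze, expand):
    """Parameters of one fire module: squeeze 1x1, expand 1x1 and expand 3x3."""
    squeeze_p = conv2d_params(in_channels, squeeze, 1)
    expand1x1_p = conv2d_params(squeeze, expand, 1)
    expand3x3_p = conv2d_params(squeeze, expand, 3)
    return squeeze_p + expand1x1_p + expand3x3_p


fire8 = fire_params(in_channels=2 * 192, squeeze=64, expand=256)  # input: fire7 concat
fire9 = fire_params(in_channels=2 * 256, squeeze=64, expand=256)  # input: fire8 concat
head = conv2d_params(2 * 256, filters=10, kernel_size=1)          # new conv10

trainable = fire8 + fire9 + head
print(fire8, fire9, head, trainable)
```

Compared with training only the new `conv10` layer, unfreezing the last two fire modules gives the optimizer far more weights to adapt to CIFAR-10.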

['input_2', 'conv1', 'relu_conv1', 'pool1', 'fire2/squeeze1x1', 'fire2/relu_squeeze1x1', 'fire2/expand1x1', 'fire2/expand3x3', 'fire2/relu_expand1x1', 'fire2/relu_expand3x3', 'fire2/concat', 'fire3/squeeze1x1', 'fire3/relu_squeeze1x1', 'fire3/expand1x1', 'fire3/expand3x3', 'fire3/relu_expand1x1', 'fire3/relu_expand3x3', 'fire3/concat', 'pool3', 'fire4/squeeze1x1', 'fire4/relu_squeeze1x1', 'fire4/expand1x1', 'fire4/expand3x3', 'fire4/relu_expand1x1', 'fire4/relu_expand3x3', 'fire4/concat', 'fire5/squeeze1x1', 'fire5/relu_squeeze1x1', 'fire5/expand1x1', 'fire5/expand3x3', 'fire5/relu_expand1x1', 'fire5/relu_expand3x3', 'fire5/concat', 'pool5', 'fire6/squeeze1x1', 'fire6/relu_squeeze1x1', 'fire6/expand1x1', 'fire6/expand3x3', 'fire6/relu_expand1x1', 'fire6/relu_expand3x3', 'fire6/concat', 'fire7/squeeze1x1', 'fire7/relu_squeeze1x1', 'fire7/expand1x1', 'fire7/expand3x3', 'fire7/relu_expand1x1', 'fire7/relu_expand3x3', 'fire7/concat', 'fire8/squeeze1x1', 'fire8/relu_squeeze1x1', 'fire8/expa

Now, we compile our model and train it:

In [11]:
import copy

# Compile model and train it.

# Compilation

model2.compile(Adam(), loss='mean_squared_error', metrics=['accuracy'])

# Training
initial_results2 = []

# 1

batch_size=500
epochs=50

current_model = copy.deepcopy(model2)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results2.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 2

batch_size=500
epochs=150

current_model = copy.deepcopy(model2)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results2.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 3

batch_size=500
epochs=300

current_model = copy.deepcopy(model2)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results2.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 4

batch_size=250
epochs=50

current_model = copy.deepcopy(model2)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results2.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 5

batch_size=250
epochs=150

current_model = copy.deepcopy(model2)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results2.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 6

batch_size=250
epochs=300

current_model = copy.deepcopy(model2)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results2.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})


Epoch 1/50
...
Epoch 50/50
Epoch 1/150
...
Epoch 32/150
[output truncated]

Finally, let's evaluate on our validation set:

In [12]:
# Evaluate on validation:

print('Evaluation on validation sets\n')
for i in range(0,len(initial_results2)):
  score = initial_results2[i]
  print('#' + str(i+1), 'Validation loss:', score['result'][0])
  print('#' + str(i+1), 'Validation accuracy (NORMALIZED):', score['result'][1], '\n')

Evaluation on validation sets

#1 Validation loss: 0.06655586757659912
#1 Validation accuracy (NORMALIZED): 0.513 

#2 Validation loss: 0.08337904014587402
#2 Validation accuracy (NORMALIZED): 0.4851 

#3 Validation loss: 0.08842464905977249
#3 Validation accuracy (NORMALIZED): 0.4762 

#4 Validation loss: 0.07076282346248627
#4 Validation accuracy (NORMALIZED): 0.5018 

#5 Validation loss: 0.0876219997882843
#5 Validation accuracy (NORMALIZED): 0.4728 

#6 Validation loss: 0.09158251746892929
#6 Validation accuracy (NORMALIZED): 0.4654 



-----------
-----------
-----------
# Tensorboard

Tensorboard is a visualization tool for Tensorflow. Among other things, it allows us to monitor the progress of our training, plot metrics per epoch, and visualize the architecture's schematics.

Just like for Early Stopping, we will use the [Tensorboard callback](https://keras.io/callbacks/#tensorboard) to log information about our training. An example of usage would be:

## Just an example, DON'T RUN! 
### You will need to change <<LOG_DIR>>
import keras.callbacks as callbacks
tbCallBack = callbacks.TensorBoard(log_dir = "./<<LOG_DIR>>")
model.fit(..., callbacks=[tbCallBack])

As your training progresses, Keras will log the metrics (e.g., loss, accuracy) to `<<LOG_DIR>>` (**make sure `<<LOG_DIR>>` is a valid directory**). On your terminal, you will need to run Tensorboard, assign it a port, and access it via browser (just like Jupyter).
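
One way to make sure the log directory is valid is to create it before training starts. The sketch below uses only the standard library; the helper name `make_log_dir`, the run name and the temporary base directory are hypothetical choices for illustration. Using a fresh sub-directory per run keeps the curves of different runs apart in Tensorboard.

```python
import os
import tempfile
import time


def make_log_dir(base_dir, run_name=None):
    """Create (if needed) and return a per-run log directory under `base_dir`."""
    # Fall back to a timestamp so repeated runs do not share a directory.
    run_name = run_name or str(int(time.time()))
    log_dir = os.path.join(base_dir, run_name)
    os.makedirs(log_dir, exist_ok=True)
    return log_dir


# Hypothetical base directory; in practice this would be your <<LOG_DIR>>.
base = tempfile.mkdtemp()
log_dir = make_log_dir(base, run_name="example_run")
print(os.path.isdir(log_dir))
```

The returned path can then be passed as `log_dir` to the Tensorboard callback.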

#### ----> MAKE SURE YOU USE A DIFFERENT PORT FOR JUPYTER AND TENSORBOARD <----

### Docker
For those using docker, open a new terminal and create a new container (using the same image) running Tensorboard:

$ docker run -it -p <<port_host>>:<<port_container>> \
            --volume=<<LOG_DIR>>:<<LOG_DIR>> \
            --name=<<container_name>> <<docker_image>> \
            tensorboard --logdir=<<LOG_DIR>> --port=<<port_container>>

For example:

$ docker run -it -p 8887:8887 \
            --volume=/your/path/ml2018/:/ml2018 \
            --name=mdc_container_tensorboard mdc-keras:cpu \
            tensorboard --logdir=/ml2018/logs --port=8887

After starting Tensorboard, access it via browser on `http://localhost:<<port_container>>`.

### Anaconda
$ tensorboard --logdir=<<LOG_DIR>> --port=<<port>>

After starting Tensorboard, access it via browser on `http://localhost:<<port>>`.

-----------
-----------
-----------

# Fine-tuning all layers

What if we fine-tune all layers of SqueezeNet?
<img src="unfrozenSqueezeNet.png" width=70% height=70%>

In [0]:
squeezeNetModel = SqueezeNet((32,32,3))

for layer in squeezeNetModel.layers:
    layer.trainable = True       #by default they are all trainable, but just for clarification

# Popping last 4 layers
for i in range(0, 4):
  squeezeNetModel.layers.pop()

# Add new classification layers
x = squeezeNetModel.layers[-1].output
x = Convolution2D(10, (1, 1), padding='valid', name='conv10')(x)
x = Activation('relu', name='relu_conv10')(x)
x = GlobalAveragePooling2D()(x)
x = Activation('softmax', name='loss')(x)

#new Model
model3 = Model(squeezeNetModel.inputs, x, name='squeezenet_new')

Now, we compile our model and train it:

In [14]:
from time import time
import copy

#Tensorboard callback
#tbCallBack = TensorBoard(log_dir="./logs/rafa", write_graph=True)
tbCallBack = TensorBoard(log_dir="./logs/{}".format(time()), write_graph=True)

# Compile model and train it.

# Compilation

model3.compile(Adam(), loss='mean_squared_error', metrics=['accuracy'])

# Training
initial_results3 = []

# 1

batch_size=500
epochs=50

current_model = copy.deepcopy(model3)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results3.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 2

batch_size=500
epochs=150

current_model = copy.deepcopy(model3)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results3.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 3

batch_size=500
epochs=300

current_model = copy.deepcopy(model3)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results3.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 4

batch_size=250
epochs=50

current_model = copy.deepcopy(model3)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results3.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})

# 5

batch_size=250
epochs=150

current_model = copy.deepcopy(model3)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results3.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})


# 6

batch_size=250
epochs=300

current_model = copy.deepcopy(model3)

current_model.fit(x=X_train, y=y_train, batch_size=batch_size, epochs=epochs, verbose=1)

evaluated = current_model.evaluate(x=X_val, y=y_val, verbose=1)

initial_results3.append({'model': current_model,  'result': evaluated, 'batch_size':batch_size, 'epochs': epochs})


Epoch 1/50
...
Epoch 50/50
Epoch 1/150
...
Epoch 32/150
[output truncated]

Finally, let's evaluate on our validation set:

In [15]:
# Evaluate on validation:

print('Evaluation on validation sets\n')
for i in range(0,len(initial_results3)):
  score = initial_results3[i]
  print('#' + str(i+1), 'Validation loss:', score['result'][0])
  print('#' + str(i+1), 'Validation accuracy (NORMALIZED):', score['result'][1], '\n')

Evaluation on validation sets

#1 Validation loss: 0.040314953559637066
#1 Validation accuracy (NORMALIZED): 0.7566 

#2 Validation loss: 0.041372676312923434
#2 Validation accuracy (NORMALIZED): 0.7634 

#3 Validation loss: 0.04131418175697327
#3 Validation accuracy (NORMALIZED): 0.77 

#4 Validation loss: 0.0391673731058836
#4 Validation accuracy (NORMALIZED): 0.7641 

#5 Validation loss: 0.04209419682919979
#5 Validation accuracy (NORMALIZED): 0.7548 

#6 Validation loss: 0.04793142475783825
#6 Validation accuracy (NORMALIZED): 0.7218 
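
Rather than reading the best run off the printout, the selection can be made programmatically. The sketch below copies the six `[loss, accuracy]` pairs from the output above into a plain list (in the notebook they would come from `initial_results3[i]['result']`) and picks the run with the highest validation accuracy. The variable names are hypothetical.

```python
# Validation [loss, accuracy] pairs copied from the output above.
val_results = [
    [0.040314953559637066, 0.7566],
    [0.041372676312923434, 0.7634],
    [0.04131418175697327, 0.77],
    [0.0391673731058836, 0.7641],
    [0.04209419682919979, 0.7548],
    [0.04793142475783825, 0.7218],
]

# Index of the run with the highest validation accuracy.
best_by_accuracy = max(range(len(val_results)), key=lambda i: val_results[i][1])

# Index of the run with the lowest validation loss, for comparison.
best_by_loss = min(range(len(val_results)), key=lambda i: val_results[i][0])

print(best_by_accuracy, best_by_loss)
```

Note that the two criteria do not agree here: the highest accuracy belongs to run #3 (index 2), while the lowest loss belongs to run #4 (index 3), so the choice of criterion should be stated explicitly.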



In [16]:
# Evaluate your best model on test

best_model = initial_results3[2]['model']

test_evaluation = best_model.evaluate(x=X_test, y=y_test, verbose=1)

print('Test Loss:', test_evaluation[0])
print('Test Accuracy:', test_evaluation[1])

Test Loss: 0.04257381748557091
Test Accuracy: 0.7612


## Saving the model
Now that we are working on more complex tasks and our training runs are taking longer, it is usually a good idea to save the trained model from time to time. [Keras has a lot of ways of saving and loading the model](https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model), but in this exercise we will use the simplest of them all: `model.save()`. It saves the architecture, the weights, the choice of loss function, optimizer and metrics, and even the state of the optimizer, so you can resume your training later.

In [0]:
for i in range(0, len(initial_results)):
  model_1 = initial_results[i]['model']
  name = 'model_1_' + str(i) + '.h5' 
  model_1.save(name)  # creates an HDF5 file, e.g. 'model_1_0.h5'
#   files.download(name)

In [0]:
for i in range(0, len(initial_results2)):
  model_2 = initial_results2[i]['model']
  name = 'model_2_' + str(i) + '.h5' 
  model_2.save(name)  # creates an HDF5 file, e.g. 'model_2_0.h5'
#   files.download(name)

In [0]:
for i in range(0, len(initial_results3)):
  model_3 = initial_results3[i]['model']
  name = 'model_3_' + str(i) + '.h5' 
  model_3.save(name)  # creates an HDF5 file, e.g. 'model_3_0.h5'
  files.download(name)

## Loading a model
Once we have our model trained, we can load it using:

In [30]:
from keras.models import load_model

# returns a compiled model identical to the previous one
loaded_model = load_model('model_3_2.h5')

score = loaded_model.evaluate(x=X_test, y=y_test, verbose=1)

# evaluate test set again... should give us the same result
print('Test loss:', score[0])
print('Test accuracy (NORMALIZED):', score[1])

Test loss: 0.04257381748557091
Test accuracy (NORMALIZED): 0.7612
