# Saving the model after every training epoch

Models commonly overfit, and it is not possible to tell in advance how many epochs to train for.

One solution is to save the model coefficients (weights) after every epoch and then compare the saved checkpoints to see which one performs best. That is what this notebook demonstrates.

For this demonstration, I have to:
1. Load the data and reshape it to fit the chosen model
2. Define the model. I'm going to use the simplest fully connected NN
3. Save the model architecture
4. Initialize a callback that saves the weights after every epoch
5. Train the model
6. Load the model architecture
7. Load the model weights

In [84]:
import keras

from keras.models import Sequential
from keras.models import model_from_json
from keras.layers import Dense, Dropout

# Getting MNIST data set for the demonstration
from keras.datasets import mnist

# Loading data

In [2]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [3]:
x_train.shape, y_train.shape

((60000, 28, 28), (60000,))

In [4]:
y_train

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

# Reshaping for a fully connected network

I am going to train the simplest fully connected (interconnected) network, so each 28x28 image has to be flattened into a vector of 784 features.

In [5]:
training_data_size = x_train.shape[0]
test_data_size = x_test.shape[0]

training_num_features = x_train.shape[1] * x_train.shape[2]
test_num_features = x_test.shape[1] * x_test.shape[2]

In [6]:
training_num_features

784

In [7]:
x_tr_interconn = x_train.reshape((training_data_size, training_num_features))
x_te_interconn = x_test.reshape((test_data_size, test_num_features))

In [8]:
x_tr_interconn.shape, x_te_interconn.shape

((60000, 784), (10000, 784))
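
To make the reshape step concrete, here is a minimal pure-Python sketch of what flattening does to a single image. The tiny 2x3 image is made-up illustration data standing in for a 28x28 digit, and the row-major ordering shown is what NumPy's `reshape` uses by default.

```python
# A tiny 2x3 "image" stands in for a 28x28 MNIST digit (hypothetical data).
image = [[1, 2, 3],
         [4, 5, 6]]

rows, cols = len(image), len(image[0])

# Row-major flattening: pixel (r, c) ends up at index r * cols + c.
flat = [pixel for row in image for pixel in row]

print(flat)
print(len(flat) == rows * cols)

# The same arithmetic gives the feature count used in this notebook.
mnist_features = 28 * 28
print(mnist_features)
```

Applied to the whole training set, this turns the (60000, 28, 28) array into the (60000, 784) array shown above.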

# Converting the label vector into a binary class matrix

In [9]:
y_train

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

In [12]:
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

In [13]:
y_train.shape

(60000, 10)
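
The conversion above turns each digit label into a row of ten values with a single 1 at the label's position. A minimal pure-Python sketch of the same idea follows; `one_hot` is a hypothetical helper written for illustration, not the Keras implementation.

```python
def one_hot(labels, num_classes):
    """Return one row per label, with 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


# The first three training labels shown earlier in the notebook.
encoded = one_hot([5, 0, 4], 10)
for row in encoded:
    print(row)
```

Three labels and ten classes give three rows of ten values, which is the same pattern as the (60000, 10) shape above.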

# Building model

In [61]:
model = Sequential()

model.add(Dense(30, input_shape=(training_num_features,), activation='sigmoid'))
model.add(Dropout(0.2))

model.add(Dense(10, activation='sigmoid'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(),
              metrics=['accuracy'])

In [62]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_36 (Dense)             (None, 30)                23550     
_________________________________________________________________
dropout_26 (Dropout)         (None, 30)                0         
_________________________________________________________________
dense_37 (Dense)             (None, 10)                310       
Total params: 23,860
Trainable params: 23,860
Non-trainable params: 0
_________________________________________________________________
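
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The helper name `dense_params` below is introduced only for this check.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


hidden = dense_params(784, 30)   # first Dense layer: 784 inputs, 30 units
output = dense_params(30, 10)    # second Dense layer: 30 inputs, 10 units
total = hidden + output          # Dropout adds no parameters

print(hidden, output, total)
```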


# Saving model architecture

In [70]:
weights_dir = 'Keras_Model_Checkpoint__Saving_Weights_Every_Epoch/'

In [71]:
!mkdir -p $weights_dir


In [79]:
json_path = weights_dir + 'model.json'
model_json = model.to_json() 
with open(json_path, 'w') as json_file:
    json_file.write(model_json)

# Initializing callback for saving model weights

In [72]:
filepath = weights_dir + 'weights.{epoch:02d}-{val_loss:.2f}.hdf5'
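
The placeholders in this path are ordinary Python format fields. As far as I understand, the callback fills them in with the epoch number and the logged metrics. A small sketch using plain `str.format`, with values picked by hand from the training log further down; the four-decimal template is a hypothetical variant, not something this notebook uses.

```python
template = 'weights.{epoch:02d}-{val_loss:.2f}.hdf5'

# {epoch:02d} zero-pads to two digits, {val_loss:.2f} rounds to two decimals.
name = template.format(epoch=6, val_loss=0.3629)
print(name)

# Two decimals can hide small differences between epochs.
a = template.format(epoch=7, val_loss=0.3526)
b = template.format(epoch=8, val_loss=0.3529)
print(a, b)

# Hypothetical variant with four decimals keeps them apart.
precise = 'weights.{epoch:02d}-{val_loss:.4f}.hdf5'
print(precise.format(epoch=7, val_loss=0.3526))
print(precise.format(epoch=8, val_loss=0.3529))
```

The directory listing later in this notebook shows indices 00 to 09 for ten epochs, which suggests that this Keras version passes a zero-based epoch number to the template.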

In [73]:
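m_check = keras.callbacks.ModelCheckpoint(filepath,
                                monitor='val_loss',
                                verbose=0,
                                save_best_only=False,     # write a file after every epoch, not only on improvement
                                save_weights_only=False,  # False stores the full model, not just the weights
                                mode='auto',
                                period=1)                 # checkpoint interval, in epochs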

# Training model

In [74]:
model.fit(x_tr_interconn, y_train, 
          batch_size=128,
          epochs=10,
          verbose=2,
          validation_data=(x_te_interconn, y_test),
          callbacks=[m_check])

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
1s - loss: 0.5506 - acc: 0.8643 - val_loss: 0.4277 - val_acc: 0.8977
Epoch 2/10
1s - loss: 0.5319 - acc: 0.8644 - val_loss: 0.4130 - val_acc: 0.9002
Epoch 3/10
1s - loss: 0.5182 - acc: 0.8687 - val_loss: 0.4082 - val_acc: 0.9005
Epoch 4/10
1s - loss: 0.5011 - acc: 0.8721 - val_loss: 0.3871 - val_acc: 0.9004
Epoch 5/10
1s - loss: 0.4931 - acc: 0.8709 - val_loss: 0.3844 - val_acc: 0.9022
Epoch 6/10
1s - loss: 0.4772 - acc: 0.8729 - val_loss: 0.3701 - val_acc: 0.9073
Epoch 7/10
1s - loss: 0.4715 - acc: 0.8752 - val_loss: 0.3629 - val_acc: 0.9053
Epoch 8/10
1s - loss: 0.4603 - acc: 0.8778 - val_loss: 0.3526 - val_acc: 0.9063
Epoch 9/10
1s - loss: 0.4557 - acc: 0.8797 - val_loss: 0.3529 - val_acc: 0.9053
Epoch 10/10
1s - loss: 0.4493 - acc: 0.8794 - val_loss: 0.3577 - val_acc: 0.9059


<keras.callbacks.History at 0x7ff5f44f6d10>

In [75]:
score = model.evaluate(x_te_interconn, y_test, verbose=0)

In [76]:
score

[0.35771372845172883, 0.90590000000000004]
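
The final score matches the val_loss of Epoch 10/10 (0.3577), which is not the lowest value in the training log. The sketch below copies the val_loss column from the log by hand. It finds the best epoch and lists the epochs that improved on all earlier ones, which is roughly the decision a save-best-only policy would make. This is an illustration of the idea, not Keras's code.

```python
# val_loss per epoch, copied from the training log above (Epoch 1/10 .. 10/10).
val_loss = [0.4277, 0.4130, 0.4082, 0.3871, 0.3844,
            0.3701, 0.3629, 0.3526, 0.3529, 0.3577]

# Epoch with the lowest validation loss, numbered from 1 as in the log.
best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
best_epoch = best_index + 1
print(best_epoch, val_loss[best_index])

# Epochs whose val_loss beat every earlier epoch.
improved = []
best_so_far = float('inf')
for epoch, loss in enumerate(val_loss, start=1):
    if loss < best_so_far:
        best_so_far = loss
        improved.append(epoch)
print(improved)
```

In this run the validation loss stops improving after the eighth epoch, so the last two checkpoints are slightly worse than the best one.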

# Loading saved model

In [80]:
!ls $weights_dir

model.json	      weights.03-0.39.hdf5  weights.07-0.35.hdf5
weights.00-0.43.hdf5  weights.04-0.38.hdf5  weights.08-0.35.hdf5
weights.01-0.41.hdf5  weights.05-0.37.hdf5  weights.09-0.36.hdf5
weights.02-0.41.hdf5  weights.06-0.36.hdf5
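
Because the epoch index and validation loss are encoded in the file names, the listing itself can be used to shortlist checkpoints. A minimal sketch, assuming the file names follow the pattern shown above; the list is typed in by hand here, and in practice it could come from `os.listdir(weights_dir)`.

```python
import re

# File names copied from the listing above.
listing = [
    'model.json',
    'weights.00-0.43.hdf5', 'weights.01-0.41.hdf5', 'weights.02-0.41.hdf5',
    'weights.03-0.39.hdf5', 'weights.04-0.38.hdf5', 'weights.05-0.37.hdf5',
    'weights.06-0.36.hdf5', 'weights.07-0.35.hdf5', 'weights.08-0.35.hdf5',
    'weights.09-0.36.hdf5',
]

pattern = re.compile(r'^weights\.(\d+)-(\d+\.\d+)\.hdf5$')

# Collect (epoch index, val_loss, file name) for every checkpoint file.
checkpoints = []
for name in listing:
    match = pattern.match(name)
    if match:
        checkpoints.append((int(match.group(1)), float(match.group(2)), name))
checkpoints.sort()

# All files that share the lowest (rounded) validation loss.
lowest = min(loss for _, loss, _ in checkpoints)
candidates = [name for _, loss, name in checkpoints if loss == lowest]
print(lowest, candidates)
```

Two files tie at 0.35 because the name only keeps two decimals, so the file names narrow the choice but do not settle it.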


## Loading model architecture

In [81]:
with open(weights_dir + 'model.json', 'r') as json_file:
    loaded_model_json = json_file.read()

In [85]:
loaded_model = model_from_json(loaded_model_json)

## Loading model weights

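For this example, I'll load the checkpoint file with index 06. The file indices start at 00, so this file holds the model as it was after the seventh epoch (Epoch 7/10 in the training log, val_loss 0.3629).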

In [87]:
loaded_model.load_weights(weights_dir+'weights.06-0.36.hdf5')

In [89]:
loaded_model.compile(loss=keras.losses.categorical_crossentropy,
                     optimizer=keras.optimizers.SGD(),
                     metrics=['accuracy'])

In [90]:
score = loaded_model.evaluate(x_te_interconn, y_test, verbose=0)

In [91]:
score

[0.36290293645858762, 0.90529999999999999]
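
To compare all saved checkpoints rather than a single one, the load-and-evaluate steps above can be wrapped in a loop. The sketch below keeps the ranking logic separate from Keras. The names `rank_checkpoints` and `score_fn` are hypothetical, and the stand-in scorer only looks up val_loss values copied from the training log, so the block runs without a trained model.

```python
def rank_checkpoints(names, score_fn):
    """Score every checkpoint and return (loss, name) pairs, lowest loss first."""
    scored = [(score_fn(name), name) for name in names]
    return sorted(scored)


# Stand-in scorer: val_loss values copied from the training log.
# With the objects from this notebook, score_fn could instead be:
#     def score_fn(name):
#         loaded_model.load_weights(weights_dir + name)
#         return loaded_model.evaluate(x_te_interconn, y_test, verbose=0)[0]
logged = {
    'weights.06-0.36.hdf5': 0.3629,
    'weights.07-0.35.hdf5': 0.3526,
    'weights.08-0.35.hdf5': 0.3529,
}

ranking = rank_checkpoints(sorted(logged), logged.get)
for loss, name in ranking:
    print(name, loss)
```

Scoring the files directly avoids relying on the rounded loss in the file name.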