<a href="https://colab.research.google.com/github/AjeetSingh02/Notebooks/blob/master/save_and_load_from_checkpoint.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install pyyaml h5py  # Required to save models in HDF5 format



In [2]:
import os

import tensorflow as tf
from tensorflow import keras

print(tf.version.VERSION)

2.4.1


### Get an example dataset

To demonstrate how to save and load weights, you'll use the [MNIST dataset](http://yann.lecun.com/exdb/mnist/). To speed up these runs, use the first 1000 examples:

In [3]:
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

train_labels = train_labels[:1000]
test_labels = test_labels[:1000]

train_images = train_images[:1000].reshape(-1, 28 * 28) / 255.0
test_images = test_images[:1000].reshape(-1, 28 * 28) / 255.0

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
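
The reshape-and-scale step above can be checked without downloading anything. This is a minimal sketch that assumes NumPy is available and uses a synthetic array (`fake_images`, a made-up stand-in for the MNIST images) to show what `reshape(-1, 28 * 28) / 255.0` produces:

```python
import numpy as np

# Synthetic stand-in for MNIST: 1000 grayscale 28x28 images with uint8 pixels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(1000, 28, 28), dtype=np.uint8)

# Same transformation as the cell above: flatten each image, then scale to [0, 1].
flat_images = fake_images.reshape(-1, 28 * 28) / 255.0

print(flat_images.shape)  # (1000, 784)
print(flat_images.dtype)  # float64
```

Each row now holds one flattened image, which matches the `input_shape=(784,)` expected by the first `Dense` layer.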


### Define a model

Start by building a simple sequential model:

In [4]:
# Define a simple sequential model
def create_model():
  model = tf.keras.models.Sequential([
    keras.layers.Dense(512, activation='relu', input_shape=(784,)),
    keras.layers.Dropout(0.2),
    keras.layers.Dense(10)
  ])

  model.compile(optimizer='adam',
                loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=[tf.metrics.SparseCategoricalAccuracy()])

  return model

# Create a basic model instance
model = create_model()

# Display the model's architecture
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 512)               401920    
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                5130      
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
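
The parameter counts in the summary can be verified by hand: a `Dense` layer has one weight per input-output pair plus one bias per output unit, and `Dropout` has no parameters. A quick sketch of that arithmetic in plain Python (the helper name `dense_params` is made up for illustration):

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit.
    return n_inputs * n_units + n_units


hidden = dense_params(784, 512)  # first Dense layer
output = dense_params(512, 10)   # final Dense layer

print(hidden)           # 401920
print(output)           # 5130
print(hidden + output)  # 407050
```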


## Save checkpoints during training

You can use a trained model without having to retrain it, or pick up training where you left off if the training process was interrupted. The `tf.keras.callbacks.ModelCheckpoint` callback lets you save the model continually, both *during* and at *the end* of training.

### Checkpoint callback usage

Create a `tf.keras.callbacks.ModelCheckpoint` callback that saves only the model's weights during training:

In [5]:
checkpoint_path = "training_1/cp.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)

# Create a callback that saves the model's weights
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1)
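
Because `checkpoint_path` is a fixed string, every epoch writes to the same location and overwrites the previous checkpoint. The `filepath` argument of `ModelCheckpoint` can also contain named formatting fields such as `{epoch}`, which are filled in at save time. The sketch below uses only the standard library to show how the paths resolve; the `training_2` pattern is a hypothetical example, not something this notebook uses:

```python
import os

checkpoint_path = "training_1/cp.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
print(checkpoint_dir)  # training_1

# Hypothetical per-epoch pattern: the epoch number becomes part of the file name,
# so each epoch would get its own checkpoint instead of overwriting the last one.
per_epoch_path = "training_2/cp-{epoch:04d}.ckpt"
formatted = per_epoch_path.format(epoch=5)
print(formatted)  # training_2/cp-0005.ckpt
```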

In [6]:
# Train the model with the new callback
model.fit(train_images, 
          train_labels,  
          epochs=10,
          validation_data=(test_images, test_labels),
          callbacks=[cp_callback])  # Pass callback to training

Epoch 1/10

Epoch 00001: saving model to training_1/cp.ckpt
Epoch 2/10

Epoch 00002: saving model to training_1/cp.ckpt
Epoch 3/10

Epoch 00003: saving model to training_1/cp.ckpt
Epoch 4/10

Epoch 00004: saving model to training_1/cp.ckpt
Epoch 5/10

Epoch 00005: saving model to training_1/cp.ckpt
Epoch 6/10

Epoch 00006: saving model to training_1/cp.ckpt
Epoch 7/10

Epoch 00007: saving model to training_1/cp.ckpt
Epoch 8/10

Epoch 00008: saving model to training_1/cp.ckpt
Epoch 9/10

Epoch 00009: saving model to training_1/cp.ckpt
Epoch 10/10

Epoch 00010: saving model to training_1/cp.ckpt


<tensorflow.python.keras.callbacks.History at 0x7f007032db10>
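
After training, the checkpoint directory should contain the saved weights. With the TensorFlow checkpoint format used here, that is typically a `checkpoint` bookkeeping file plus an index file and one or more data shards that share the `cp.ckpt` prefix. A small sketch for inspecting the directory, assuming the run above wrote to `training_1` (the helper `list_checkpoint_files` is hypothetical, not part of TensorFlow):

```python
import os


def list_checkpoint_files(directory):
    # Return the sorted file names in `directory`, or an empty list if it does not exist.
    if not os.path.isdir(directory):
        return []
    return sorted(os.listdir(directory))


# "training_1" is the checkpoint directory used in the cells above.
for name in list_checkpoint_files("training_1"):
    print(name)
```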

In [7]:
# Create a fresh, untrained model with the same architecture
model = create_model()

In [8]:
# Load the weights saved in the checkpoint into the fresh model
model.load_weights(checkpoint_path)

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f0070039210>
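
A weights-only checkpoint stores no architecture, so it can only be restored into a model built the same way as the one that was saved. The sketch below illustrates the round trip on a tiny stand-in model with random inputs. It assumes TensorFlow, NumPy, and `h5py` are installed; `build_tiny_model` and the file name are made up for illustration. The `.weights.h5` suffix is an assumption chosen because recent Keras releases expect it for weight files, while older releases treat it as an ordinary HDF5 path:

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_tiny_model():
    # Hypothetical small stand-in for create_model(): same idea, fewer units.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(3),
    ])


x = np.random.rand(16, 4).astype("float32")

source_model = build_tiny_model()
fresh_model = build_tiny_model()  # same architecture, different random weights

# Assumed file name, written to a temporary directory to avoid clutter.
weights_path = os.path.join(tempfile.mkdtemp(), "demo.weights.h5")
source_model.save_weights(weights_path)
fresh_model.load_weights(weights_path)

# After loading, both models should produce the same outputs for the same inputs.
same = np.allclose(source_model.predict(x, verbose=0),
                   fresh_model.predict(x, verbose=0))
print(same)
```

In the notebook itself, the equivalent check would be to call `model.evaluate(test_images, test_labels)` after `load_weights` and compare the result with the validation metrics from the first training run.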

In [9]:
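# Continue training from the restored weights and watch the loss. If the loss in the
# first epoch is close to or below the final loss of the previous run, training
# picked up from the checkpoint rather than starting from scratch.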
model.fit(train_images, 
          train_labels,  
          epochs=10,
          validation_data=(test_images, test_labels),
          callbacks=[cp_callback])  # Pass callback to training

Epoch 1/10

Epoch 00001: saving model to training_1/cp.ckpt
Epoch 2/10

Epoch 00002: saving model to training_1/cp.ckpt
Epoch 3/10

Epoch 00003: saving model to training_1/cp.ckpt
Epoch 4/10

Epoch 00004: saving model to training_1/cp.ckpt
Epoch 5/10

Epoch 00005: saving model to training_1/cp.ckpt
Epoch 6/10

Epoch 00006: saving model to training_1/cp.ckpt
Epoch 7/10

Epoch 00007: saving model to training_1/cp.ckpt
Epoch 8/10

Epoch 00008: saving model to training_1/cp.ckpt
Epoch 9/10

Epoch 00009: saving model to training_1/cp.ckpt
Epoch 10/10

Epoch 00010: saving model to training_1/cp.ckpt


<tensorflow.python.keras.callbacks.History at 0x7f00381502d0>

In [None]:
# import numpy as np
# from numpy.testing import assert_allclose
# from tensorflow.keras.models import Sequential, load_model
# from tensorflow.keras.layers import LSTM, Dropout, Dense
# from tensorflow.keras.callbacks import ModelCheckpoint

# vec_size = 100
# n_units = 10

# x_train = np.random.rand(500, 10, vec_size)
# y_train = np.random.rand(500, vec_size)

# model = Sequential()
# model.add(LSTM(n_units, input_shape=(None, vec_size), return_sequences=True))
# model.add(Dropout(0.2))
# model.add(LSTM(n_units, return_sequences=True))
# model.add(Dropout(0.2))
# model.add(LSTM(n_units))
# model.add(Dropout(0.2))
# model.add(Dense(vec_size, activation='linear'))
# model.compile(loss='mean_squared_error', optimizer='adam')

# # define the checkpoint
# filepath = "model.h5"
# checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
# callbacks_list = [checkpoint]

# # fit the model
# model.fit(x_train, y_train, epochs=5, batch_size=50, callbacks=callbacks_list)

# # load the model
# new_model = load_model("model.h5")
# assert_allclose(model.predict(x_train),
#                 new_model.predict(x_train),
#                 1e-5)

# # fit the model
# checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
# callbacks_list = [checkpoint]
# new_model.fit(x_train, y_train, epochs=5, batch_size=50, callbacks=callbacks_list)