# CS470 Introduction to Artificial Intelligence
## Deep Learning Practice 
#### TA. Minho Sim


---

## 2-4. Save and restore models

#### Topics for this chapter
 * Saving weights during training
 * Restoring the saved model
 * Saving manually
 * Saving/loading the entire model
 ---

Model progress can be saved during and after training. This means that training can be interrupted and later resumed from the point where it stopped. Saving also means you can share your model so that others can recreate or reproduce your work. When publishing research models and ML techniques, most machine learning practitioners share:
- code to create the model
- the trained weights, or parameters, for the model

Sharing these data helps others understand how the model works and try it themselves with new or original data.

#### Setup

Install and import TensorFlow and dependencies:

In [2]:
try:
  # %tensorflow_version only exists in Colab.
  %tensorflow_version 2.x
except Exception:
    pass

!pip install -q pyyaml h5py 

In [3]:
from __future__ import absolute_import, division, print_function, unicode_literals

import os

import tensorflow as tf
from tensorflow import keras

print(tf.version.VERSION)

2.6.0


#### Get an example dataset

To demonstrate how to save and load weights, we will use the MNIST dataset. To speed up the runs, we use only the first 1000 examples:

In [4]:
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

train_labels = train_labels[:1000]
test_labels = test_labels[:1000]

train_images = train_images[:1000].reshape(-1, 28 * 28) / 255.0
test_images = test_images[:1000].reshape(-1, 28 * 28) / 255.0
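
The preprocessing above can be checked without downloading MNIST. The following is a minimal sketch that assumes a synthetic random array (`fake_images`, a hypothetical stand-in) in place of the real images; it only shows what the slice, reshape, and scaling do to the array's shape and value range.

```python
import numpy as np

# Synthetic stand-in for MNIST: 1200 grayscale images of 28x28 uint8 pixels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(1200, 28, 28), dtype=np.uint8)

# Keep the first 1000 images, flatten each one to 784 values,
# and scale the pixel values from [0, 255] to [0, 1].
subset = fake_images[:1000].reshape(-1, 28 * 28) / 255.0

print(subset.shape)  # (1000, 784)
print(subset.dtype)  # float64
```

The flattened width of 784 is what the `input_shape=(784,)` of the first `Dense` layer expects.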

#### Define model 
Start by building a simple sequential model:

In [5]:
# Define a simple sequential model
def create_model():
    model = tf.keras.models.Sequential([
        keras.layers.Dense(512, activation='relu', input_shape=(784,)),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10, activation='softmax')
      ])
    
    model.compile(optimizer='adam',
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])
    
    return model

# Create a basic model instance
model = create_model()

# Display the model's architecture
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 512)               401920    
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                5130      
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
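
The parameter counts in the summary can be reproduced by hand. The sketch below uses a hypothetical helper, `dense_params`, that counts one weight per input-unit pair plus one bias per unit; the `Dropout` layer contributes no parameters.

```python
def dense_params(inputs, units):
    # One weight for every (input, unit) pair, plus one bias per unit.
    return inputs * units + units

hidden = dense_params(784, 512)  # first Dense layer
output = dense_params(512, 10)   # second Dense layer
total = hidden + output

print(hidden, output, total)  # 401920 5130 407050
```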


#### Save checkpoints during training (Save weights only)

When you have a _trained model_, you don't have to retrain it from scratch. If the training process was interrupted, you can pick up training where you left off. The [`tf.keras.callbacks.ModelCheckpoint`](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/callbacks/ModelCheckpoint) callback lets us save the model continually, both during and at the end of training.

#### Checkpoint callback usage
Create a tf.keras.callbacks.ModelCheckpoint callback that **_saves weights only_** during training:

In [6]:
import os

ckpt_path_for_t1 = "checkpoints/training_1/cp.ckpt"
ckpt_dir_for_t1 = os.path.dirname(ckpt_path_for_t1)

# Create a callback that saves the model's weights
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=ckpt_path_for_t1,
                                                 save_weights_only=True,
                                                 save_best_only=True)
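
With `save_best_only=True`, the callback writes a checkpoint only when the monitored quantity (validation loss by default) improves. The following is a rough pure-Python sketch of that decision rule, not Keras's actual implementation; the function name `epochs_that_save` and the loss values are made up for illustration.

```python
import math

def epochs_that_save(val_losses):
    """Return the 1-based epochs at which a 'best only' checkpoint would be written."""
    best = math.inf
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strictly better than every earlier epoch
            best = loss
            saved.append(epoch)
    return saved

# Hypothetical validation losses for five epochs.
history = [0.90, 0.70, 0.80, 0.60, 0.60]
print(epochs_that_save(history))  # [1, 2, 4]
```

Epochs 3 and 5 do not overwrite the checkpoint because their loss is not lower than the best value seen so far.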

In [7]:
# Train the model with the new callback
model.fit(train_images, 
          train_labels,  
          epochs=10,
          validation_data=(test_images,test_labels),
          callbacks=[cp_callback])  # Pass callback to training

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x24b9a0abfc8>

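This creates a single collection of TensorFlow checkpoint files in the checkpoint directory. Because the callback was created with `save_best_only=True`, the files are overwritten at the end of an epoch only when the monitored validation loss has improved: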

In [8]:
print(ckpt_dir_for_t1)

checkpoints/training_1


#### Restore weights into an untrained model

Now let's build a fresh, untrained model by calling `create_model()`. We can evaluate it on the test set to confirm that it is not trained. An untrained model performs no better than chance, which is about 10% accuracy for ten classes.

In [9]:
# Create a basic model instance
new_model = create_model()

# Evaluate the model
loss, acc = new_model.evaluate(test_images, test_labels, verbose=2)
print("Untrained model, accuracy: {:5.2f}%".format(100*acc))

32/32 - 0s - loss: 2.4716 - accuracy: 0.0390
Untrained model, accuracy:  3.90%


This time, let's load the weights from the checkpoint and re-evaluate the model on the test dataset. We can load the weights using `load_weights()`. Because both models have the same architecture, they can share weights even though they are different instances.

In [10]:
# Loads the weights
new_model.load_weights(ckpt_path_for_t1)

# Re-evaluate the model
loss,acc = new_model.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

32/32 - 0s - loss: 0.4052 - accuracy: 0.8710
Restored model, accuracy: 87.10%


<br/>

#### Requirement for loading weights

When restoring a model from weights only, we must have a model with the **same architecture as the original model**. Restoring weights into a model with a different architecture raises an error.

In [11]:
# Define a different sequential model
def different_model():
    model = tf.keras.models.Sequential([
        keras.layers.Dense(100, activation='relu', input_shape=(784,)),
        keras.layers.Dense(10, activation='softmax')
      ])
    
    model.compile(optimizer='adam',
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])
    
    return model

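# Use a separate name for the instance so the function itself is not shadowed
other_model = different_model()

# This fails: the checkpoint holds weights for the 512-unit architecture
other_model.load_weights(ckpt_path_for_t1)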

ValueError: Shapes (100,) and (512,) are incompatible
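
The error above comes from weight tensors whose shapes do not line up. As a rough illustration, the sketch below lists the kernel and bias shapes of the two architectures and reports where they differ. The helpers `dense_stack_shapes` and `shape_mismatches` are hypothetical and do not reproduce TensorFlow's own check or its error message.

```python
def dense_stack_shapes(input_dim, layer_units):
    """Kernel and bias shapes for a stack of Dense layers (Dropout has no weights)."""
    shapes = []
    fan_in = input_dim
    for units in layer_units:
        shapes.append((fan_in, units))  # kernel
        shapes.append((units,))         # bias
        fan_in = units
    return shapes

def shape_mismatches(saved, target):
    """Describe every position where the saved and target shapes disagree."""
    problems = []
    if len(saved) != len(target):
        problems.append("tensor count differs: {} vs {}".format(len(saved), len(target)))
    for index, (s, t) in enumerate(zip(saved, target)):
        if s != t:
            problems.append("tensor {}: saved {} vs model {}".format(index, s, t))
    return problems

saved_shapes = dense_stack_shapes(784, [512, 10])  # architecture of create_model()
other_shapes = dense_stack_shapes(784, [100, 10])  # architecture of different_model()

for problem in shape_mismatches(saved_shapes, other_shapes):
    print(problem)
```

Comparing `saved_shapes` with itself returns an empty list, which corresponds to the case where loading succeeds.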

<br/><br/>
#### Manually save weights

You saw how to load weights into a model. You can also save the weights manually with `Model.save_weights()`. In the TensorFlow version used here, `save_weights` writes the TensorFlow checkpoint format by default, which conventionally uses a `.ckpt` extension.

In [12]:
# Save the weights
model.save_weights('./checkpoints/manual_checkpoint/cur_weights')

In [20]:
# Create a new model instance
model = create_model()

# Restore the weights
model.load_weights('./checkpoints/manual_checkpoint/cur_weights')

# Evaluate the model
loss,acc = model.evaluate(test_images, test_labels, verbose=0)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

Restored model, accuracy: 86.40%
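
One way to confirm that a manual save and restore really transfers the weights is to compare predictions before and after. The sketch below is a self-contained example under several assumptions: it uses a tiny hypothetical model (`create_small_model`) and random inputs rather than the MNIST model, and it saves to a path ending in `.weights.h5` (HDF5 rather than the checkpoint format used above) because that suffix is accepted by both older and newer Keras releases.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

def create_small_model():
    # Tiny stand-in for create_model(); the layer sizes are arbitrary.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(8, activation='relu'),
        tf.keras.layers.Dense(3, activation='softmax'),
    ])

# Random inputs, used only to compare the two models' outputs.
x = np.random.rand(5, 4).astype('float32')

source = create_small_model()
expected = source.predict(x, verbose=0)

with tempfile.TemporaryDirectory() as tmp_dir:
    weights_path = os.path.join(tmp_dir, 'demo.weights.h5')
    source.save_weights(weights_path)

    # A new instance starts with different random weights...
    restored = create_small_model()
    # ...until the saved weights are loaded into it.
    restored.load_weights(weights_path)

outputs_match = np.allclose(expected, restored.predict(x, verbose=0))
print(outputs_match)
```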


<br/><br/>

#### Save the entire model 

In addition to the weights, the entire model, including the optimizer and other settings, can be saved to a single file. This allows us to export a model and use it without access to the original Python code that created it. Since the optimizer state is recovered, we can resume training from exactly where we left off.

We will save the entire model in the HDF5 format, using the `.h5` file extension.

#### Save model as HDF5 file

Keras provides a basic save format using the [HDF5](https://en.wikipedia.org/wiki/Hierarchical_Data_Format) standard.

In [21]:
# Create a new model instance
model = create_model()

# Train the model
model.fit(train_images, train_labels, epochs=10)

# Save the entire model to an HDF5 file
save_dir = 'saved_models'
# Create the output directory if it does not exist yet
os.makedirs(save_dir, exist_ok=True)
    
model.save(os.path.join(save_dir, 'my_model.h5'))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [22]:
!ls saved_models

'ls' is not recognized as an internal or external command,
operable program or batch file.
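
The `ls` shell command is not available on every platform, as the failed output above shows for Windows. A portable alternative is `os.listdir`. The sketch below assumes a temporary directory and an empty placeholder file in place of the real `saved_models` directory, so it can run anywhere without a trained model.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    # Empty placeholder standing in for the saved model file.
    with open(os.path.join(demo_dir, 'my_model.h5'), 'wb'):
        pass

    # List the directory contents without relying on a shell command.
    listing = sorted(os.listdir(demo_dir))

print(listing)  # ['my_model.h5']
```

In the notebook itself, `print(os.listdir(save_dir))` would list the real directory.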


<br/><br/>

#### Restore the entire model

Now we load the entire saved model with `keras.models.load_model()`, which takes the path to the saved model as its argument.

In [23]:
# Recreate the exact same model, including its weights and the optimizer
new_model = keras.models.load_model(os.path.join(save_dir, 'my_model.h5'))

# Show the model architecture
new_model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_8 (Dense)              (None, 512)               401920    
_________________________________________________________________
dropout_3 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_9 (Dense)              (None, 10)                5130      
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________


<br/><br/>

Checking the model's accuracy confirms that it was loaded correctly.

In [24]:
loss, acc = new_model.evaluate(test_images, test_labels, verbose=0)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

Restored model, accuracy: 86.00%


<br/>

Saving the **entire model** stores the following attributes:

- The weight values
- The model's configuration (architecture)
- The optimizer configuration

In [25]:
print(new_model.optimizer)
print(new_model.loss)

<keras.optimizer_v2.adam.Adam object at 0x0000024DC4582E88>
<function sparse_categorical_crossentropy at 0x0000024B99197048>
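
To illustrate that a model restored from an HDF5 file is ready to use without recompiling, the following is a minimal sketch that assumes a tiny hypothetical model and random data rather than the MNIST setup above. It saves the compiled model, loads it back, checks that the optimizer came with it, and then continues training.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Random stand-in data: 32 samples, 4 features, 3 classes.
x = np.random.rand(32, 4).astype('float32')
y = np.random.randint(0, 3, size=(32,))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x, y, epochs=1, verbose=0)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, 'demo_model.h5')
    model.save(model_path)                           # architecture + weights + compile settings
    loaded = tf.keras.models.load_model(model_path)  # no model-building code needed

# The loaded model carries its optimizer and gives the same predictions.
has_optimizer = loaded.optimizer is not None
same_outputs = np.allclose(model.predict(x, verbose=0),
                           loaded.predict(x, verbose=0))
print(has_optimizer, same_outputs)

# Training can continue directly on the loaded model.
loaded.fit(x, y, epochs=1, verbose=0)
```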
