# Save and load models

#### Table of contents
1. Options
2. Setup
   - Installs and imports
   - Get an example dataset
   - Define a model
3. Save checkpoints during training
   - Checkpoint callback usage
   - Checkpoint callback options
4. What are these files?
5. Manually save weights
6. Save the entire model
   - SavedModel format
   - HDF5 format
   - Saving custom objects

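Model progress can be saved during and after training. This means a model can resume where it left off and avoid long training times. Saving also means you can share your model and others can recreate your work. When publishing research models and techniques, most machine learning practitioners share:

   - code to create the model, and
   - the trained weights, or parameters, for the model

Sharing this data helps others understand how the model works and try it themselves with new data.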

## Options

There are different ways to save TensorFlow models depending on the API you're using. This guide uses tf.keras, a high-level API to build and train models in TensorFlow.

## Setup

### Installs and imports

Install and import TensorFlow and dependencies:

In [1]:
pip install -q pyyaml h5py



In [3]:
import os
import tensorflow as tf
from tensorflow import keras

print(tf.version.VERSION)

2.4.1


### Get an example dataset

To demonstrate how to save and load weights, you'll use the MNIST dataset. To speed up these runs, use the first 1000 examples:

In [6]:
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

train_labels = train_labels[:1000]
test_labels = test_labels[:1000]

train_images = train_images[:1000].reshape(-1, 28*28) / 255.0
test_images = test_images[:1000].reshape(-1, 28*28) /255.0

### Define a model

Start by building a simple sequential model:

In [13]:
# Define a simple sequential model
def create_model():
    model = tf.keras.models.Sequential([
        keras.layers.Dense(512, activation='relu', input_shape=(784,)),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10)
    ])

    model.compile(optimizer='adam',
                  loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=[tf.metrics.SparseCategoricalAccuracy()])

    return model

In [14]:
model = create_model()

In [15]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_2 (Dense)              (None, 512)               401920    
_________________________________________________________________
dropout_1 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                5130      
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
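
The parameter counts in the summary can be checked by hand. The sketch below assumes the usual fully connected layer layout (one weight per input-unit pair plus one bias per unit); the helper name `dense_params` is made up for this illustration and is not part of Keras.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


n_features = 28 * 28                     # flattened MNIST image -> 784 inputs
hidden = dense_params(n_features, 512)   # first Dense layer
output = dense_params(512, 10)           # final Dense layer (logits)
total = hidden + output                  # the Dropout layer adds no parameters

print(hidden, output, total)             # 401920 5130 407050
```

These values match the Param # column and the "Total params" line reported by model.summary().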


## Save checkpoints during training

You can use a trained model without having to retrain it, or pick up training where you left off if the training process was interrupted.

The tf.keras.callbacks.ModelCheckpoint callback allows you to continually save the model both during and at the end of training.

### Checkpoint callback usage

Create a tf.keras.callbacks.ModelCheckpoint callback that saves weights only during training:

In [17]:
checkpoint_path = 'training_1/cp.ckpt'
checkpoint_dir = os.path.dirname(checkpoint_path)

In [20]:
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1)

In [21]:
model.fit(train_images, train_labels, epochs = 10,
         validation_data = (test_images, test_labels),
         callbacks=[cp_callback])

Epoch 1/10

Epoch 00001: saving model to training_1/cp.ckpt
Epoch 2/10

Epoch 00002: saving model to training_1/cp.ckpt
Epoch 3/10

Epoch 00003: saving model to training_1/cp.ckpt
Epoch 4/10

Epoch 00004: saving model to training_1/cp.ckpt
Epoch 5/10

Epoch 00005: saving model to training_1/cp.ckpt
Epoch 6/10

Epoch 00006: saving model to training_1/cp.ckpt
Epoch 7/10

Epoch 00007: saving model to training_1/cp.ckpt
Epoch 8/10

Epoch 00008: saving model to training_1/cp.ckpt
Epoch 9/10

Epoch 00009: saving model to training_1/cp.ckpt
Epoch 10/10

Epoch 00010: saving model to training_1/cp.ckpt


<tensorflow.python.keras.callbacks.History at 0x7fece18a3150>

This creates a single collection of TensorFlow checkpoint files that are updated at the end of each epoch:

In [22]:
os.listdir(checkpoint_dir)

['cp.ckpt.index', 'cp.ckpt.data-00000-of-00001', 'checkpoint']

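As long as two models share the same architecture, you can share weights between them. So, when restoring a model from weights only, create a model with the same architecture as the original model and then set its weights.

Now rebuild a fresh, untrained model and evaluate it on the test set. An untrained model will perform at chance levels (~10% accuracy):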

In [23]:
model = create_model()

In [27]:
loss, acc = model.evaluate(test_images, test_labels, verbose=2)
print("Untrained model, accuracy: {:5.2f}%".format(100 * acc))

32/32 - 0s - loss: 2.3385 - sparse_categorical_accuracy: 0.0970
Untrained model, accuracy:  9.70%
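
As a rough sanity check on those numbers, the sketch below computes what pure guessing would give for a 10-class problem: the expected accuracy, and the cross-entropy of a prediction that assigns every class the same probability. It uses only the standard library, and the variable names are chosen for this illustration.

```python
import math

num_classes = 10

# Guessing uniformly at random is right one time in num_classes.
chance_accuracy = 1 / num_classes

# Cross-entropy when every class is assigned probability 1/num_classes.
uniform_loss = -math.log(1 / num_classes)

print(f"{chance_accuracy:.2%}, {uniform_loss:.4f}")   # 10.00%, 2.3026
```

The untrained model's 9.70% accuracy and loss of 2.3385 sit close to these chance-level values, which is what randomly initialized weights should produce.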


Then load the weights from the checkpoint and re-evaluate:

In [28]:
model.load_weights(checkpoint_path)

loss, acc = model.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

32/32 - 0s - loss: 0.4015 - sparse_categorical_accuracy: 0.8750
Restored model, accuracy: 87.50%


### Checkpoint callback options

The callback provides several options to give checkpoints unique names and to adjust the checkpointing frequency.

Train a new model, and save uniquely named checkpoints once every five epochs:

In [29]:
checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
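
The `{epoch:04d}` placeholder in the path is ordinary Python string formatting, which is why this notebook can also call `checkpoint_path.format(epoch=0)` directly. A minimal standard-library sketch of how the template expands:

```python
import os

checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)

# 04d pads the epoch number with zeros to four digits.
for epoch in (0, 5, 50):
    print(checkpoint_path.format(epoch=epoch))
# training_2/cp-0000.ckpt
# training_2/cp-0005.ckpt
# training_2/cp-0050.ckpt

print(checkpoint_dir)   # training_2
```

Zero padding keeps the file names in epoch order when they are sorted alphabetically.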

In [31]:
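import math

batch_size = 32

# An integer save_freq is counted in batches, so convert "every 5 epochs"
# into a number of batches (steps) first.
steps_per_epoch = math.ceil(len(train_images) / batch_size)

cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    verbose=1,
    save_weights_only=True,
    save_freq=5 * steps_per_epoch)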
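
An integer `save_freq` is counted in batches (training steps), not in samples or epochs. The sketch below shows the arithmetic for turning "every N epochs" into a batch count; the helper name `save_freq_for` is hypothetical and not part of TensorFlow.

```python
import math


def save_freq_for(num_examples: int, batch_size: int, every_n_epochs: int) -> int:
    """Number of batches that make up `every_n_epochs` epochs."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return every_n_epochs * steps_per_epoch


# 1000 examples at batch size 32 -> 32 batches per epoch -> 160 batches per 5 epochs.
print(save_freq_for(1000, 32, 5))   # 160

# At batch size 64 there are 16 batches per epoch, so 5 epochs is 80 batches.
print(save_freq_for(1000, 64, 5))   # 80
```

Multiplying the epoch count by the batch size gives the same value only when the number of batches per epoch happens to equal the batch size, as it does for 1000 examples and a batch size of 32.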

In [32]:
model = create_model()

In [35]:
model.save_weights(checkpoint_path.format(epoch=0))

In [36]:
model.fit(train_images, train_labels, epochs=50, batch_size=batch_size,
         callbacks=[cp_callback],
         validation_data=(test_images, test_labels),
         verbose=0)


Epoch 00005: saving model to training_2/cp-0005.ckpt

Epoch 00010: saving model to training_2/cp-0010.ckpt

Epoch 00015: saving model to training_2/cp-0015.ckpt

Epoch 00020: saving model to training_2/cp-0020.ckpt

Epoch 00025: saving model to training_2/cp-0025.ckpt

Epoch 00030: saving model to training_2/cp-0030.ckpt

Epoch 00035: saving model to training_2/cp-0035.ckpt

Epoch 00040: saving model to training_2/cp-0040.ckpt

Epoch 00045: saving model to training_2/cp-0045.ckpt

Epoch 00050: saving model to training_2/cp-0050.ckpt


<tensorflow.python.keras.callbacks.History at 0x7fecd026ccd0>

In [37]:
os.listdir(checkpoint_dir)

['cp-0025.ckpt.data-00000-of-00001',
 'cp-0025.ckpt.index',
 'cp-0050.ckpt.index',
 'cp-0015.ckpt.data-00000-of-00001',
 'cp-0045.ckpt.data-00000-of-00001',
 'cp-0040.ckpt.index',
 'cp-0000.ckpt.index',
 'cp-0030.ckpt.index',
 'cp-0010.ckpt.data-00000-of-00001',
 'cp-0005.ckpt.data-00000-of-00001',
 'cp-0020.ckpt.index',
 'cp-0020.ckpt.data-00000-of-00001',
 'cp-0040.ckpt.data-00000-of-00001',
 'cp-0030.ckpt.data-00000-of-00001',
 'cp-0010.ckpt.index',
 'cp-0035.ckpt.index',
 'cp-0000.ckpt.data-00000-of-00001',
 'cp-0005.ckpt.index',
 'cp-0035.ckpt.data-00000-of-00001',
 'cp-0050.ckpt.data-00000-of-00001',
 'cp-0015.ckpt.index',
 'cp-0045.ckpt.index',
 'checkpoint']

In [38]:
latest = tf.train.latest_checkpoint(checkpoint_dir)
latest

'training_2/cp-0050.ckpt'
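
tf.train.latest_checkpoint looks up the most recent prefix recorded in the `checkpoint` state file in that directory. As a plain-Python approximation (not what TensorFlow does internally), the sketch below picks the prefix with the highest epoch number from a directory listing; the function name and regular expression are assumptions tied to this notebook's `cp-{epoch:04d}.ckpt` naming.

```python
import re


def latest_prefix_by_epoch(filenames):
    """Return the checkpoint prefix with the highest epoch number, or None."""
    pattern = re.compile(r"^(cp-(\d+)\.ckpt)\.index$")
    found = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            # Store (epoch, prefix) so max() compares epochs numerically.
            found.append((int(match.group(2)), match.group(1)))
    return max(found)[1] if found else None


listing = ["cp-0025.ckpt.index", "cp-0050.ckpt.index", "cp-0000.ckpt.index",
           "cp-0050.ckpt.data-00000-of-00001", "checkpoint"]
print(latest_prefix_by_epoch(listing))   # cp-0050.ckpt
```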

테스트하려면 모델을 재설정하고 최신 체크포인트를 로드하세요.

In [39]:
model = create_model()

model.load_weights(latest)

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7fecd82c9590>

In [41]:
loss, acc = model.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

32/32 - 0s - loss: 0.4888 - sparse_categorical_accuracy: 0.8710
Restored model, accuracy: 87.10%


## What are these files?

The above code stores the weights to a collection of checkpoint-formatted files that contain only the trained weights in a binary format. Checkpoints contain:

   - One or more shards that contain your model's weights.
   - An index file that indicates which weights are stored in which shard.

If you are training a model on a single machine, you'll have one shard with the suffix: .data-00000-of-00001
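
To make the file layout concrete, the sketch below groups a directory listing by checkpoint prefix and reports whether each prefix has its index file and how many of its shards are present. It relies only on the file-name suffixes shown in this notebook (`.index` and `.data-XXXXX-of-YYYYY`); the function name and return shape are made up for this illustration.

```python
import re
from collections import defaultdict

SHARD_RE = re.compile(r"^(?P<prefix>.+)\.data-(?P<shard>\d{5})-of-(?P<total>\d{5})$")


def summarize_checkpoints(filenames):
    """Map each prefix to (has_index, shards_found, shards_expected)."""
    has_index = defaultdict(bool)
    shards = defaultdict(set)
    expected = {}
    for name in filenames:
        if name.endswith(".index"):
            has_index[name[:-len(".index")]] = True
            continue
        match = SHARD_RE.match(name)
        if match:
            shards[match["prefix"]].add(int(match["shard"]))
            expected[match["prefix"]] = int(match["total"])
    prefixes = set(has_index) | set(shards)
    return {p: (has_index[p], len(shards[p]), expected.get(p, 0))
            for p in sorted(prefixes)}


listing = ["cp.ckpt.index", "cp.ckpt.data-00000-of-00001", "checkpoint"]
print(summarize_checkpoints(listing))   # {'cp.ckpt': (True, 1, 1)}
```

The prefix (`cp.ckpt` here) is what gets passed to load_weights, not the name of any individual file.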

## Manually save weights

Manually save weights with the Model.save_weights method. By default, tf.keras, and save_weights in particular, uses the TensorFlow checkpoint format with a .ckpt extension (saving in HDF5 with a .h5 extension is covered in the Save and serialize models guide):

In [42]:
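# Save the weights (relative path, so no write access to / is needed)
model.save_weights('./checkpoints/my_checkpoint')

In [43]:
# Create a new model instance
model = create_model()

In [46]:
# Restore the weights
model.load_weights('./checkpoints/my_checkpoint')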

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7fecd013ef50>

In [47]:
loss, acc = model.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

32/32 - 0s - loss: 0.4888 - sparse_categorical_accuracy: 0.8710
Restored model, accuracy: 87.10%


## Save the entire model

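Call model.save to save a model's architecture, weights, and training configuration in a single file/folder. This allows you to export a model so it can be used without access to the original Python code*. Since the optimizer state is recovered, you can resume training from exactly where you left off.

An entire model can be saved in two different file formats (SavedModel and HDF5). The TensorFlow SavedModel format is the default file format in TF2.x. However, models can also be saved in HDF5 format. More details on saving entire models in the two file formats are described below.

Saving a fully functional model is very useful: you can load it in TensorFlow.js (SavedModel, HDF5) and then train and run it in web browsers, or convert it to run on mobile devices using TensorFlow Lite (SavedModel, HDF5).

*Custom objects (e.g. subclassed models or layers) require special attention when saving and loading. See the Saving custom objects section below.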

### SavedModel format

The SavedModel format is another way to serialize models. Models saved in this format can be restored using tf.keras.models.load_model and are compatible with TensorFlow Serving. The SavedModel guide covers how to save and serve a SavedModel. The section below illustrates the steps to save and restore the model.

In [49]:
model = create_model()
model.fit(train_images, train_labels, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7fed51333a90>

The SavedModel format is a directory containing a protobuf binary and a TensorFlow checkpoint. Inspect the saved model directory:

In [51]:
# Save the entire model as a SavedModel.
!mkdir -p saved_model
model.save('saved_model/my_model')

INFO:tensorflow:Assets written to: saved_model/my_model/assets


In [55]:
# my_model directory

!ls saved_model

# Contains an assets folder, saved_model.pb, and variables folder.
! ls saved_model/my_model

my_model
assets	saved_model.pb	variables


Reload a fresh Keras model from the saved model:

In [56]:
new_model = tf.keras.models.load_model('saved_model/my_model')

new_model.summary()

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_14 (Dense)             (None, 512)               401920    
_________________________________________________________________
dropout_7 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_15 (Dense)             (None, 10)                5130      
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________


The restored model is compiled with the same arguments as the original model. Try running evaluate and predict with the loaded model:

In [57]:
loss, acc = new_model.evaluate(test_images, test_labels, verbose=2)
print('Restored model, accuracy: {:5.2f}%'.format(100*acc))

print(new_model.predict(test_images).shape)

32/32 - 0s - loss: 0.4083 - sparse_categorical_accuracy: 0.8630
Restored model, accuracy: 86.30%
(1000, 10)


### HDF5 format

Keras provides a basic save format using the HDF5 standard.

In [58]:
model = create_model()
model.fit(train_images, train_labels, epochs=5)

model.save('my_model.h5')

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Now, recreate the model from that file:

In [60]:
new_model = tf.keras.models.load_model('my_model.h5')

new_model.summary()

Model: "sequential_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_16 (Dense)             (None, 512)               401920    
_________________________________________________________________
dropout_8 (Dropout)          (None, 512)               0         
_________________________________________________________________
dense_17 (Dense)             (None, 10)                5130      
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________


Check its accuracy:

In [61]:
loss, acc = new_model.evaluate(test_images, test_labels, verbose=2)

print('Restored model, accuracy: {:5.2f}%'.format(100*acc))

32/32 - 0s - loss: 0.4285 - sparse_categorical_accuracy: 0.8600
Restored model, accuracy: 86.00%


Keras saves models by inspecting their architectures. This technique saves everything:

   - The weight values
   - The model's architecture
   - The model's training configuration (what you pass to the .compile() method)
   - The optimizer and its state, if any (this enables you to restart training where you left off)

Keras is not able to save the v1.x optimizers (from tf.compat.v1.train) since they aren't compatible with checkpoints. For v1.x optimizers, you need to re-compile the model after loading—losing the state of the optimizer.

### Saving custom objects

If you are using the SavedModel format, you can skip this section. The key difference between HDF5 and SavedModel is that HDF5 uses object configs to save the model architecture, while SavedModel saves the execution graph. Thus, SavedModels are able to save custom objects like subclassed models and custom layers without requiring the original code.

To save custom objects to HDF5, you must do the following:

1. Define a get_config method in your object, and optionally a from_config classmethod.

   - get_config(self) returns a JSON-serializable dictionary of parameters needed to recreate the object.
   - from_config(cls, config) uses the returned config from get_config to create a new object. By default, this function will use the config as initialization kwargs (return cls(**config)).

2. Pass the object to the custom_objects argument when loading the model. The argument must be a dictionary mapping the string class name to the Python class. E.g. tf.keras.models.load_model(path, custom_objects={'CustomLayer': CustomLayer})

See the Writing layers and models from scratch tutorial for examples of custom objects and get_config.
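
The get_config / from_config contract and the custom_objects lookup can be illustrated without Keras. The sketch below is a plain-Python stand-in, not Keras's actual serialization code: the class `ScaleBy`, the JSON record layout, and the variable names are all hypothetical, and a real custom layer would subclass tf.keras.layers.Layer.

```python
import json


class ScaleBy:
    """Hypothetical stand-in for a custom object (not a Keras layer)."""

    def __init__(self, factor=1.0, name="scale_by"):
        self.factor = factor
        self.name = name

    def get_config(self):
        # JSON-serializable dictionary of the constructor arguments.
        return {"factor": self.factor, "name": self.name}

    @classmethod
    def from_config(cls, config):
        # Default behaviour: use the config as initialization kwargs.
        return cls(**config)


# "Save": store the class name as a string alongside the object's config.
saved = json.dumps({"class_name": "ScaleBy",
                    "config": ScaleBy(factor=2.5).get_config()})

# "Load": the string class name is resolved through a custom_objects-style
# mapping, because the saved record holds a name rather than the class itself.
custom_objects = {"ScaleBy": ScaleBy}
record = json.loads(saved)
restored = custom_objects[record["class_name"]].from_config(record["config"])

print(restored.factor, restored.name)   # 2.5 scale_by
```

The mapping is needed because a config-based format records only the class name, so the loader has to be told which Python class that name refers to.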