Using the MNIST dataset

In [1]:
from __future__ import absolute_import,division,print_function
import os

import tensorflow as tf
from tensorflow import keras
tf.__version__

'1.13.1'

To speed up the demonstration, only the first 1,000 samples are used.

In [2]:
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

train_labels = train_labels[:1000]
test_labels = test_labels[:1000]

train_images = train_images[:1000].reshape(-1, 28 * 28) / 255.0
test_images = test_images[:1000].reshape(-1, 28 * 28) / 255.0
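
To make the preprocessing step above concrete, here is a minimal sketch on synthetic data. The random array `fake_images` is an assumption standing in for MNIST, so nothing needs to be downloaded. It shows how `reshape(-1, 28 * 28)` flattens each 28x28 image into a 784-element vector and how dividing by 255.0 scales uint8 pixel values into the range [0, 1].

```python
import numpy as np

# Hypothetical stand-in for MNIST: 5 random 28x28 uint8 "images".
rng = np.random.RandomState(0)
fake_images = rng.randint(0, 256, size=(5, 28, 28)).astype(np.uint8)

# Flatten each image to a 784-vector and scale pixels into [0, 1].
flat = fake_images.reshape(-1, 28 * 28) / 255.0

print(flat.shape)  # (5, 784)
print(flat.dtype)  # float64
```

The flattened width of 784 matches the `input_shape=(784,)` expected by the first Dense layer of the model.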

## Define the model

In [3]:
def create_model():
    model = tf.keras.models.Sequential([
        keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(784,)),
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10, activation=tf.nn.softmax)
    ])

    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss=tf.keras.losses.sparse_categorical_crossentropy,
                  metrics=['accuracy'])

    return model

# Create a basic model instance
model = create_model()
model.summary()

Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 512)               401920    
_________________________________________________________________
dropout (Dropout)            (None, 512)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                5130      
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
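
The Param # column in the summary can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit, and Dropout has no parameters. A small sketch of that arithmetic follows; the helper `dense_params` is a hypothetical name introduced here for illustration only.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

hidden = dense_params(784, 512)  # first Dense layer
output = dense_params(512, 10)   # final Dense layer
total = hidden + output          # Dropout adds no parameters

print(hidden, output, total)  # 401920 5130 407050
```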


## Saving checkpoints during training

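Checkpoints can be saved automatically during training or when training ends. This lets you reuse a previously trained model without retraining it, or resume training from where it stopped if the training process is interrupted.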

tf.keras.callbacks.ModelCheckpoint is the callback that saves checkpoints.

In [4]:
checkpoint_path = 'training_1/cp.ckpt'  # .ckpt stands for checkpoint
checkpoint_dir=os.path.dirname(checkpoint_path)
checkpoint_dir

'training_1'

In [5]:
train_labels[:10]  # a 10-class classification problem: the digits 0-9

array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4], dtype=uint8)

In [7]:
cp_callback=tf.keras.callbacks.ModelCheckpoint(checkpoint_path, save_weights_only=True,verbose=1)
model = tf.keras.models.Sequential()
model.add(keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(784,)))
model.add(keras.layers.Dropout(0.2))
model.add(keras.layers.Dense(10, activation=tf.nn.softmax))

model.compile(optimizer=tf.keras.optimizers.Adam(),loss=tf.keras.losses.sparse_categorical_crossentropy, metrics=['accuracy'])
model.fit(train_images,train_labels,epochs=10,validation_data=(test_images,test_labels),callbacks=[cp_callback])

Train on 1000 samples, validate on 1000 samples
Epoch 1/10
Epoch 00001: saving model to training_1/cp.ckpt

Consider using a TensorFlow optimizer from `tf.train`.
Epoch 2/10
Epoch 00002: saving model to training_1/cp.ckpt

Consider using a TensorFlow optimizer from `tf.train`.
Epoch 3/10
Epoch 00003: saving model to training_1/cp.ckpt

Consider using a TensorFlow optimizer from `tf.train`.
Epoch 4/10
Epoch 00004: saving model to training_1/cp.ckpt

Consider using a TensorFlow optimizer from `tf.train`.
Epoch 5/10
Epoch 00005: saving model to training_1/cp.ckpt

Consider using a TensorFlow optimizer from `tf.train`.
Epoch 6/10
Epoch 00006: saving model to training_1/cp.ckpt

Consider using a TensorFlow optimizer from `tf.train`.
Epoch 7/10
Epoch 00007: saving model to training_1/cp.ckpt

Consider using a TensorFlow optimizer from `tf.train`.
Epoch 8/10
Epoch 00008: saving model to training_1/cp.ckpt

Consider using a TensorFlow optimizer from `tf.train`.
Epoch 9/10
Epoch 00009: saving m

<tensorflow.python.keras.callbacks.History at 0x2225ec14b70>

In [13]:
os.listdir(checkpoint_dir)

['checkpoint', 'cp.ckpt.data-00000-of-00001', 'cp.ckpt.index']

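Create a brand-new, untrained model. Because save_weights_only=True was set above, only the weights were saved. To restore a model from weights alone, you need a model with the same architecture as the original: weights can be shared only between models with identical architectures, even though they are different model instances.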

In [14]:
new_model = tf.keras.models.Sequential()
new_model.add(keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(784,)))
new_model.add(keras.layers.Dropout(0.2))
new_model.add(keras.layers.Dense(10, activation=tf.nn.softmax))

new_model.compile(optimizer=tf.keras.optimizers.Adam(),loss=tf.keras.losses.sparse_categorical_crossentropy, metrics=['accuracy'])
# This model has not been trained

In [15]:
loss,acc=new_model.evaluate(test_images,test_labels)



In [16]:
loss, acc  # the untrained model's accuracy is 10.3%

(2.378454959869385, 0.103)
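
These numbers are roughly what chance-level guessing gives on a 10-class problem. As a rough sanity check (a sketch, assuming a hypothetical model that assigns probability 1/10 to every class and that the classes are balanced), such a model is right about 1 time in 10 and its sparse categorical cross-entropy is -ln(1/10):

```python
import math

num_classes = 10

# Assumption: the model predicts a uniform distribution over the classes.
p_true_class = 1.0 / num_classes

chance_accuracy = 1.0 / num_classes
chance_loss = -math.log(p_true_class)  # cross-entropy for a single sample

print(chance_accuracy)        # 0.1
print(round(chance_loss, 4))  # 2.3026
```

The measured values (loss about 2.378, accuracy 0.103) are close to these, which is consistent with randomly initialized weights.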

In [17]:
checkpoint_path

'training_1/cp.ckpt'

In [18]:
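# The exact form of the saved weights file is not important here; there is no
# need to worry whether the suffix is .h5, .hdf5, or .ckpt as in this example.
new_model.load_weights(checkpoint_path)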
loss,acc=new_model.evaluate(test_images,test_labels)
loss,acc



(0.4296477489471436, 0.864)

## Checkpoint callback options

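The ModelCheckpoint callback provides options for giving the generated checkpoints distinguishable names and for adjusting how often checkpoints are created.

Example: train a new model, saving a uniquely named checkpoint every 5 epochs.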

In [19]:
checkpoint_path='traing_2/cp-{epoch:04d}.ckpt'
checkpoint_dir=os.path.dirname(checkpoint_path)

cp_callback=tf.keras.callbacks.ModelCheckpoint(checkpoint_path,verbose=1,save_weights_only=True,period=5)


new_model = tf.keras.models.Sequential()
new_model.add(keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(784,)))
new_model.add(keras.layers.Dropout(0.2))
new_model.add(keras.layers.Dense(10, activation=tf.nn.softmax))

new_model.compile(optimizer=tf.keras.optimizers.Adam(),loss=tf.keras.losses.sparse_categorical_crossentropy, metrics=['accuracy'])
new_model.fit(train_images,train_labels,epochs=50,callbacks=[cp_callback],validation_data=(test_images,test_labels),verbose=0)



Epoch 00005: saving model to traing_2/cp-0005.ckpt

Consider using a TensorFlow optimizer from `tf.train`.

Epoch 00010: saving model to traing_2/cp-0010.ckpt

Consider using a TensorFlow optimizer from `tf.train`.

Epoch 00015: saving model to traing_2/cp-0015.ckpt

Consider using a TensorFlow optimizer from `tf.train`.

Epoch 00020: saving model to traing_2/cp-0020.ckpt

Consider using a TensorFlow optimizer from `tf.train`.

Epoch 00025: saving model to traing_2/cp-0025.ckpt

Consider using a TensorFlow optimizer from `tf.train`.

Epoch 00030: saving model to traing_2/cp-0030.ckpt

Consider using a TensorFlow optimizer from `tf.train`.

Epoch 00035: saving model to traing_2/cp-0035.ckpt

Consider using a TensorFlow optimizer from `tf.train`.

Epoch 00040: saving model to traing_2/cp-0040.ckpt

Consider using a TensorFlow optimizer from `tf.train`.

Epoch 00045: saving model to traing_2/cp-0045.ckpt

Consider using a TensorFlow optimizer from `tf.train`.

Epoch 00050: saving model t

<tensorflow.python.keras.callbacks.History at 0x22261644d68>

In [21]:
os.listdir(checkpoint_dir)

['checkpoint',
 'cp-0005.ckpt.data-00000-of-00001',
 'cp-0005.ckpt.index',
 'cp-0010.ckpt.data-00000-of-00001',
 'cp-0010.ckpt.index',
 'cp-0015.ckpt.data-00000-of-00001',
 'cp-0015.ckpt.index',
 'cp-0020.ckpt.data-00000-of-00001',
 'cp-0020.ckpt.index',
 'cp-0025.ckpt.data-00000-of-00001',
 'cp-0025.ckpt.index',
 'cp-0030.ckpt.data-00000-of-00001',
 'cp-0030.ckpt.index',
 'cp-0035.ckpt.data-00000-of-00001',
 'cp-0035.ckpt.index',
 'cp-0040.ckpt.data-00000-of-00001',
 'cp-0040.ckpt.index',
 'cp-0045.ckpt.data-00000-of-00001',
 'cp-0045.ckpt.index',
 'cp-0050.ckpt.data-00000-of-00001',
 'cp-0050.ckpt.index']
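
The file names in this listing follow from the filename template and period=5. A minimal sketch using plain string formatting, assuming the callback fills the {epoch} placeholder with the 1-based epoch number and saves whenever that number is a multiple of the period (which is consistent with the log output above):

```python
checkpoint_path = 'traing_2/cp-{epoch:04d}.ckpt'  # template from the cell above
epochs = 50
period = 5

# Assumption: epochs are numbered from 1 and a checkpoint is written
# whenever the epoch number is a multiple of `period`.
saved = [checkpoint_path.format(epoch=e)
         for e in range(1, epochs + 1)
         if e % period == 0]

print(len(saved))  # 10
print(saved[0])    # traing_2/cp-0005.ckpt
print(saved[-1])   # traing_2/cp-0050.ckpt
```

The `:04d` format spec zero-pads the epoch to four digits, so the names sort in training order.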

In [23]:
latest=tf.train.latest_checkpoint(checkpoint_dir)
latest

'traing_2\\cp-0050.ckpt'
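
tf.train.latest_checkpoint returns the prefix of the most recent checkpoint recorded for the directory. As an illustration only, and not how TensorFlow implements it, the same answer can be recovered for this particular naming scheme by parsing epoch numbers out of the file names. The helper `latest_by_name` and the shortened listing are assumptions made for this sketch.

```python
import re

def latest_by_name(filenames):
    """Return the checkpoint prefix with the highest epoch number, or None."""
    pattern = re.compile(r'^(cp-(\d+)\.ckpt)\.index$')
    best = None
    for name in filenames:
        match = pattern.match(name)
        if match is None:
            continue  # skip 'checkpoint' and the .data-* shards
        prefix, epoch = match.group(1), int(match.group(2))
        if best is None or epoch > best[1]:
            best = (prefix, epoch)
    return best[0] if best else None

# Shortened, hypothetical copy of the directory listing.
listing = ['checkpoint',
           'cp-0005.ckpt.data-00000-of-00001', 'cp-0005.ckpt.index',
           'cp-0045.ckpt.data-00000-of-00001', 'cp-0045.ckpt.index',
           'cp-0050.ckpt.data-00000-of-00001', 'cp-0050.ckpt.index']

print(latest_by_name(listing))  # cp-0050.ckpt
```

Note that the value to pass to load_weights is the prefix (ending in .ckpt), not the name of an individual .index or .data file.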

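**Note: the default TensorFlow format is said to keep only the five most recent checkpoints, but in this run all ten checkpoints appear in the directory listing above.**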

In [24]:
# Test: reset the model and load the latest checkpoint
new_model = tf.keras.models.Sequential()
new_model.add(keras.layers.Dense(512, activation=tf.nn.relu, input_shape=(784,)))
new_model.add(keras.layers.Dropout(0.2))
new_model.add(keras.layers.Dense(10, activation=tf.nn.softmax))

new_model.compile(optimizer=tf.keras.optimizers.Adam(),loss=tf.keras.losses.sparse_categorical_crossentropy, metrics=['accuracy'])

new_model.load_weights(latest)
loss,acc=new_model.evaluate(test_images,test_labels)
loss,acc



(0.49132867592573165, 0.875)

In [27]:
# Save the entire model
new_model.save('./my_checkpoint')

# Load the model back
model = keras.models.load_model('./my_checkpoint')

loss,acc=model.evaluate(test_images,test_labels)



In [29]:
loss,acc

(0.49132867592573165, 0.875)