### 4.1 Using Keras callbacks to avoid overfitting

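#### I. ModelCheckpoint and EarlyStopping  
1. The `EarlyStopping` callback stops training once the monitored validation metric stops improving. It is usually paired with `ModelCheckpoint`, so that the best weights are still on disk after training stops.  
2. `ModelCheckpoint` saves the model's weights at the end of each epoch. With `save_best_only=True`, the file is overwritten only when the monitored metric improves, so it always holds the parameters of the best model seen so far.  
3. Usage: 
  1. Declare a list of callbacks.
  2. Pass that list to `model.fit` through the `callbacks` argument.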
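
The interplay between the two callbacks described above can be sketched in plain Python. This is a simplified model, not the Keras implementation: the function name `simulate_early_stopping` and the loss values are made up for illustration, and it assumes training stops as soon as `patience` consecutive epochs fail to improve on the best `val_loss` seen so far.

```python
def simulate_early_stopping(val_losses, patience=1):
    """Return (stop_epoch, best_epoch), both 0-based indices into val_losses.

    Simplified assumption: training stops as soon as `patience` consecutive
    epochs fail to improve on the best val_loss seen so far.
    """
    best_loss = float('inf')
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: this is where the checkpoint file would be overwritten.
            best_loss = loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Every epoch ran without triggering the stop condition.
    return len(val_losses) - 1, best_epoch


# Made-up validation losses, for illustration only.
losses = [0.30, 0.22, 0.20, 0.21, 0.19]
stop_epoch, best_epoch = simulate_early_stopping(losses, patience=1)
print(stop_epoch, best_epoch)
```

In this sketch the epoch at which training stops is later than the epoch whose weights were best, which is why a checkpoint of the best weights is useful alongside early stopping.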
  

In [2]:
import keras

callbacks_list = [
    keras.callbacks.EarlyStopping(
        monitor='val_loss',  # decide whether to stop based on the validation loss
        patience=1),         # epochs without improvement to tolerate before stopping
    keras.callbacks.ModelCheckpoint(
        filepath='model_param.h5',
        monitor='val_loss',   # together with save_best_only, overwrite the saved
        save_best_only=True   # file only when val_loss improves
    )
]

Using TensorFlow backend.


In [3]:
from keras.datasets import mnist
from keras import layers, models, utils

# 1. Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
print('train shape: ', train_images.shape)  # ndarray
# 2. Preprocess: flatten each image and scale pixel values to [0, 1]
train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype('float32')/255

test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype('float32')/255
# One-hot encode the labels
print('labels: ', test_labels)
train_labels = utils.to_categorical(train_labels)
test_labels = utils.to_categorical(test_labels)
# 3. Build the network
model = models.Sequential()
# input_shape: shape of the input tensor; (28*28,) is a 1-D vector
model.add(layers.Dense(512, activation='relu', input_shape=(28*28,)))
model.add(layers.Dense(10, activation='softmax'))

train shape:  (60000, 28, 28)
labels:  [7 2 1 ... 4 5 6]


In [4]:
# 4. Compile
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['acc'])

# 5. Train the model
model.fit(train_images, train_labels,
          epochs=20,
          batch_size=32,
          callbacks=callbacks_list,
          validation_split=0.2)

Train on 48000 samples, validate on 12000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20


<keras.callbacks.History at 0x7fbb9664d890>
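
The `History` object returned by `model.fit` keeps the per-epoch metrics in its `history` attribute, a dict that maps metric names to lists. As a minimal sketch of locating the best epoch from it, the snippet below uses a hand-written stand-in dict; the numbers are invented for illustration and are not the results of the run above, and `best_epoch` is a hypothetical helper.

```python
# Stand-in for `history.history`; the values are made up for illustration.
history_dict = {
    'val_loss': [0.12, 0.09, 0.08, 0.10],
}


def best_epoch(history_dict, metric='val_loss'):
    """Return the 1-based epoch with the lowest value of `metric`."""
    values = history_dict[metric]
    return min(range(len(values)), key=values.__getitem__) + 1


print(best_epoch(history_dict))
```

With a real run, the same helper could be applied to `history.history` after assigning the return value of `model.fit` to a variable such as `history`.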

#### II. ReduceLROnPlateau callback
1. `ReduceLROnPlateau` lowers the learning rate when the monitored validation metric (here `val_loss`) stops improving, which allows smaller, finer steps toward a local optimum.
2. It is used as follows:

In [6]:
callback_list = [
    keras.callbacks.ReduceLROnPlateau(
        monitor='val_loss',  # metric to monitor
        factor=0.1,          # new_lr = lr * factor
        patience=10          # epochs without improvement on the validation set before the rate is lowered
    )
]
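
To make the `new_lr = lr * factor` rule concrete, here is a plain-Python sketch of how the learning rate might evolve over a run. It is a simplified model under stated assumptions, not the Keras implementation: `simulate_reduce_lr` is a hypothetical helper, the losses are made up, and options such as cooldown or a minimum improvement threshold are ignored.

```python
def simulate_reduce_lr(val_losses, initial_lr, factor=0.1, patience=10, min_lr=0.0):
    """Return the learning rate in effect after each epoch.

    Simplified assumption: the rate is multiplied by `factor` once
    `patience` consecutive epochs show no improvement in val_loss,
    after which the wait counter restarts.
    """
    lr = initial_lr
    best = float('inf')
    wait = 0  # epochs since the last improvement
    schedule = []
    for loss in val_losses:
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)  # new_lr = lr * factor
                wait = 0
        schedule.append(lr)
    return schedule


# Made-up losses: improvement stalls after the second epoch.
losses = [0.50, 0.40, 0.41, 0.42, 0.43, 0.39]
print(simulate_reduce_lr(losses, initial_lr=0.01, factor=0.5, patience=2))
```

In an actual training run the list defined in the cell above would be passed as `callbacks=callback_list` to `model.fit`, just like the list in the previous section.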

### 4.2 TensorBoard
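1. TensorBoard visualizes the metrics monitored during training. Adding a `keras.callbacks.TensorBoard` object to the callbacks makes Keras write those metrics to log files during training. 
2. Then read the log files with the **CLI**: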
```shell
tensorboard --logdir=my_log_dir
```
3. Open the **URL**  
```http://localhost:6006```

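The following example uses MNIST. Its callback writes to `./logs`, so that is the directory to pass to `--logdir`.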

In [12]:
from os import makedirs
from os.path import exists, join

import keras
from keras.callbacks import TensorBoard
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

import numpy as np

batch_size = 128
num_classes = 10
epochs = 12
log_dir = './logs'

if not exists(log_dir):
    makedirs(log_dir)

# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# save class labels to disk to color data points in TensorBoard accordingly
with open(join(log_dir, 'metadata.tsv'), 'w') as f:
    np.savetxt(f, y_test)

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

tensorboard = TensorBoard(log_dir=log_dir,  # explicit; same as the default './logs'
                          batch_size=batch_size,
                          embeddings_freq=1,
                          embeddings_layer_names=['features'],
                          embeddings_metadata='metadata.tsv',
                          embeddings_data=x_test)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu', name='features'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          callbacks=[tensorboard],
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

('x_train shape:', (60000, 28, 28, 1))
(60000, 'train samples')
(10000, 'test samples')
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
Epoch 2/12
Epoch 3/12
Epoch 4/12
Epoch 5/12
Epoch 6/12
Epoch 7/12
Epoch 8/12
Epoch 9/12
Epoch 10/12
Epoch 11/12
Epoch 12/12
('Test loss:', 0.028899025906383942)
('Test accuracy:', 0.9911)
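
One detail of the example above: `np.savetxt` with its default format writes each label in scientific notation (for instance `7.000000000000000000e+00`). If plain integer labels are preferred in `metadata.tsv`, passing `fmt='%d'` to `np.savetxt` is one option. The sketch below shows the same idea with only the standard library; `write_label_metadata` is a hypothetical helper and the three labels are illustrative.

```python
import os
import tempfile


def write_label_metadata(labels, path):
    """Write one integer class label per line (hypothetical helper)."""
    with open(path, 'w') as f:
        for label in labels:
            f.write('%d\n' % int(label))


# Illustrative labels; in the example above they would come from y_test
# before it is one-hot encoded.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'metadata.tsv')
write_label_metadata([7, 2, 1], demo_path)

with open(demo_path) as f:
    print(f.read().splitlines())
```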
