
In [1]:
import os
import keras
import itertools
import matplotlib.pyplot as plt
from keras.layers import BatchNormalization,Dropout
from keras import regularizers
from keras.callbacks import EarlyStopping


Using TensorFlow backend.


In [2]:
train, test = keras.datasets.cifar10.load_data()

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz


In [0]:
## Data preprocessing
def preproc_x(x, flatten=True):
    x = x / 255.
    if flatten:
        x = x.reshape((len(x), -1))
    return x

def preproc_y(y, num_classes=10):
    if y.shape[-1] == 1:
        y = keras.utils.to_categorical(y, num_classes)
    return y    
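
To make the effect of the two helpers concrete, here is a minimal sketch on a hypothetical mini-batch shaped like CIFAR-10. The `one_hot` function is a NumPy stand-in for `keras.utils.to_categorical` (an assumption made so the example runs without Keras), and the random images and labels are made up for illustration.

```python
import numpy as np

def preproc_x(x, flatten=True):
    # Scale pixel values from [0, 255] to [0, 1]
    x = x / 255.
    if flatten:
        # Collapse each image into a single feature vector
        x = x.reshape((len(x), -1))
    return x

def one_hot(y, num_classes=10):
    # NumPy stand-in for keras.utils.to_categorical on an (N, 1) label array
    return np.eye(num_classes, dtype="float32")[y.reshape(-1)]

# Hypothetical mini-batch: 4 RGB images of 32x32 pixels with integer labels
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 32, 32, 3), dtype=np.uint8)
labels = np.array([[3], [0], [9], [3]])

x = preproc_x(images)
y = one_hot(labels)
print(x.shape, y.shape)  # (4, 3072) (4, 10)
```

Flattening is what lets the images feed a fully connected network: each 32x32x3 image becomes one 3072-dimensional row.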

In [0]:
x_train, y_train = train
x_test, y_test = test

# Preprocess the inputs
x_train = preproc_x(x_train)
x_test = preproc_x(x_test)

# Preprocess the outputs
y_train = preproc_y(y_train)
y_test = preproc_y(y_test)

In [0]:
## Hyperparameter settings
LEARNING_RATE = 1e-3
EPOCHS = 50
BATCH_SIZE = 1024
MOMENTUM = 0.95

In [0]:
from keras.layers import BatchNormalization

"""
Build the neural network and add BN layers
"""
def build_mlp(input_shape, output_units=10, num_neurons=(512, 256, 128)):
    input_layer = keras.layers.Input(input_shape)

    # Stack one Dense -> BatchNormalization block per entry in num_neurons
    x = input_layer
    for i, n_units in enumerate(num_neurons):
        x = keras.layers.Dense(units=n_units,
                               activation="relu",
                               name="hidden_layer"+str(i+1))(x)
        x = BatchNormalization()(x)

    out = keras.layers.Dense(units=output_units, activation="softmax", name="output")(x)

    model = keras.models.Model(inputs=[input_layer], outputs=[out])
    return model

In [7]:
model = build_mlp(input_shape=x_train.shape[1:])
model.summary()
optimizer = keras.optimizers.SGD(lr=LEARNING_RATE, nesterov=True, momentum=MOMENTUM)
model.compile(loss="categorical_crossentropy", metrics=["accuracy"], optimizer=optimizer)





Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 3072)              0         
_________________________________________________________________
hidden_layer1 (Dense)        (None, 512)               1573376   
_________________________________________________________________
batch_normalization_1 (Batch (None, 512)               2048      
_________________________________________________________________
hidden_layer2 (Dense)        (None, 256)               131328    
_________________________________________________________________
batch_normalization_2 (Batch (None, 256)               1024      
_________________________________________________________________
hidden_layer3 (Dense)        (None, 128)               32896     
_________________________________________________________________
batch_normalization_3 (Batch (None, 128)               
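
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the usual accounting: a Dense layer has one weight per input-output pair plus one bias per unit, and a BatchNormalization layer holds four vectors per feature (gamma, beta, moving mean, moving variance). The helper names are introduced here for illustration.

```python
def dense_params(n_in, n_out):
    # One weight per input-output pair plus one bias per output unit
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    # gamma, beta, moving mean and moving variance: four vectors of length n_features
    return 4 * n_features

n_in = 32 * 32 * 3  # flattened CIFAR-10 image -> 3072 features
counts = []
for i, n_units in enumerate([512, 256, 128], start=1):
    counts.append((f"hidden_layer{i}", dense_params(n_in, n_units)))
    counts.append((f"batch_normalization_{i}", batchnorm_params(n_units)))
    n_in = n_units  # the next layer takes this layer's output as input

for name, n in counts:
    print(f"{name:24s}{n:>10d}")
```

The first Dense layer dominates the model size because it maps all 3072 input features to 512 units.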

## Baseline: change nothing

In [0]:
earlystop_1 = EarlyStopping(monitor="val_loss", 
                          patience=5, 
                          verbose=1
                          )
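
What `patience` means can be sketched in plain Python. This is a simplified model of the idea, not the Keras implementation: it ignores options such as `min_delta`, and the loss values are hypothetical. Training stops once the monitored loss has failed to improve for `patience` consecutive epochs.

```python
def early_stop_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    stop_epoch is None if the patience budget is never exhausted.
    """
    best = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Hypothetical validation losses for nine epochs
losses = [1.9, 1.7, 1.6, 1.62, 1.61, 1.65, 1.7, 1.8, 1.5]
print(early_stop_epoch(losses, patience=5))   # (8, 3)
print(early_stop_epoch(losses, patience=10))  # (None, 9)
```

In the first call the best loss is at epoch 3 and training stops five epochs later. In the second call the larger patience lets training reach the late improvement at epoch 9.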

In [9]:
model.fit(x_train, y_train, 
          epochs=EPOCHS, 
          batch_size=BATCH_SIZE, 
          validation_data=(x_test, y_test), 
          shuffle=True,
          callbacks=[earlystop_1]
         )

# Collect results
train_loss1 = model.history.history["loss"]
valid_loss1 = model.history.history["val_loss"]
train_acc1 = model.history.history["acc"]
valid_acc1 = model.history.history["val_acc"]




Train on 50000 samples, validate on 10000 samples
Epoch 1/50





Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 00027: early stopping


## 1. Change monitor to "Validation Accuracy" and compare the results


In [0]:
earlystop_2 = EarlyStopping(monitor="val_acc", 
                          patience=5, 
                          verbose=1
                          )
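
Switching the monitor also switches the direction of "improvement": loss should go down, accuracy should go up. The sketch below assumes a name-based rule (monitor names containing "acc" are maximized), modeled on how older Keras releases appear to resolve the default `mode="auto"`. Treat that rule, the helper names, and the metric values as assumptions for illustration.

```python
def infer_mode(monitor):
    # Assumed rule: names containing "acc" are maximized, everything else minimized
    return "max" if "acc" in monitor else "min"

def is_improvement(current, best, mode):
    return current > best if mode == "max" else current < best

def best_epoch(values, mode):
    # 1-indexed epoch with the best value under the given direction
    best_idx = 0
    for i, v in enumerate(values):
        if is_improvement(v, values[best_idx], mode):
            best_idx = i
    return best_idx + 1

# Hypothetical curves: loss bottoms out early while accuracy keeps creeping up
val_loss = [1.50, 1.40, 1.45, 1.50]
val_acc = [0.45, 0.48, 0.49, 0.50]

print(best_epoch(val_loss, infer_mode("val_loss")))  # 2
print(best_epoch(val_acc, infer_mode("val_acc")))    # 4
```

The two monitors can disagree about which epoch is best, which is one reason the stopping epoch changes when the monitor changes.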

In [11]:

model.fit(x_train, y_train, 
          epochs=EPOCHS, 
          batch_size=BATCH_SIZE, 
          validation_data=(x_test, y_test), 
          shuffle=True,
          callbacks=[earlystop_2]
         )

# Collect results
train_loss2 = model.history.history["loss"]
valid_loss2 = model.history.history["val_loss"]
train_acc2 = model.history.history["acc"]
valid_acc2 = model.history.history["val_acc"]

Train on 50000 samples, validate on 10000 samples
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 00008: early stopping
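
The cells above read the history under the keys `"acc"` and `"val_acc"`, which matches the Keras version used here. Newer Keras releases record the same metric as `"accuracy"` and `"val_accuracy"`, and the `monitor` name would need to change to match. A small helper (hypothetical, not part of Keras) can make the result collection tolerant of both spellings.

```python
def pick_metric(history, *candidates):
    # Return the first metric series found under any of the candidate keys
    for key in candidates:
        if key in history:
            return history[key]
    raise KeyError(f"none of {candidates} found in {sorted(history)}")

# Hypothetical history dicts in the old and the new naming style
old_style = {"loss": [1.8, 1.6], "acc": [0.35, 0.42], "val_acc": [0.33, 0.40]}
new_style = {"loss": [1.8, 1.6], "accuracy": [0.35, 0.42], "val_accuracy": [0.33, 0.40]}

print(pick_metric(old_style, "acc", "accuracy"))          # [0.35, 0.42]
print(pick_metric(new_style, "val_acc", "val_accuracy"))  # [0.33, 0.4]
```

In the notebook this would be called as `pick_metric(model.history.history, "acc", "accuracy")`.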


## 2. Set the early-stopping patience to 10 and 25 and compare the results

In [0]:
earlystop_3 = EarlyStopping(monitor="val_loss", 
                          patience=10, 
                          verbose=1
                          )

In [13]:

model.fit(x_train, y_train, 
          epochs=EPOCHS, 
          batch_size=BATCH_SIZE, 
          validation_data=(x_test, y_test), 
          shuffle=True,
          callbacks=[earlystop_3]
         )

# Collect results
train_loss3 = model.history.history["loss"]
valid_loss3 = model.history.history["val_loss"]
train_acc3 = model.history.history["acc"]
valid_acc3 = model.history.history["val_acc"]

Train on 50000 samples, validate on 10000 samples
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 00012: early stopping


In [0]:
earlystop_4 = EarlyStopping(monitor="val_loss", 
                          patience=25, 
                          verbose=1
                          )

In [15]:

model.fit(x_train, y_train, 
          epochs=EPOCHS, 
          batch_size=BATCH_SIZE, 
          validation_data=(x_test, y_test), 
          shuffle=True,
          callbacks=[earlystop_4]
         )

# Collect results
train_loss4 = model.history.history["loss"]
valid_loss4 = model.history.history["val_loss"]
train_acc4 = model.history.history["acc"]
valid_acc4 = model.history.history["val_acc"]

Train on 50000 samples, validate on 10000 samples
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 00027: early stopping


## Conclusion
1. Monitoring val_loss with patience=5: stopped at epoch 27, train_acc = 0.6938, val_loss = 1.4
2. Monitoring val_acc with patience=5: stopped at epoch 8, train_acc = 0.7551, val_loss = 1.5
3. Monitoring val_loss with patience=10: stopped at epoch 12, train_acc = 0.8411, val_loss = 1.6
4. Monitoring val_loss with patience=25: stopped at epoch 27, train_acc = 0.9695, val_loss = 2.1

## Overall, patience should not be set too high, and monitoring val_acc works best
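
One caveat when reading these numbers: all four `fit` calls above use the same `model` object, so runs 2 to 4 continue from the weights left by the previous run rather than starting from scratch. That would explain why train_acc and val_loss both rise from one run to the next. A minimal sketch of a comparison that rebuilds the model for every configuration follows. `compare_configs`, `ToyModel`, and `toy_train` are hypothetical stand-ins so the example runs without Keras.

```python
def compare_configs(build_model, train, configs):
    """Train a freshly built model for each config and collect the results."""
    results = {}
    for name, cfg in configs.items():
        model = build_model()  # new model, new weights for every run
        results[name] = train(model, cfg)
    return results

class ToyModel:
    # Stand-in for a Keras model: only tracks how many epochs it has seen
    def __init__(self):
        self.epochs_trained = 0

def toy_train(model, cfg):
    # Stand-in for model.fit: pretend training runs for `patience` epochs
    model.epochs_trained += cfg["patience"]
    return model.epochs_trained

configs = {
    "val_loss_p5": {"monitor": "val_loss", "patience": 5},
    "val_acc_p5": {"monitor": "val_acc", "patience": 5},
    "val_loss_p10": {"monitor": "val_loss", "patience": 10},
    "val_loss_p25": {"monitor": "val_loss", "patience": 25},
}

# Fresh model per run: each result reflects only its own configuration
fresh = compare_configs(ToyModel, toy_train, configs)

# Shared model (as in the cells above): training carries over between runs
shared_model = ToyModel()
shared = compare_configs(lambda: shared_model, toy_train, configs)

print(fresh)   # epochs seen per run: 5, 5, 10, 25
print(shared)  # epochs seen per run: 5, 10, 20, 45
```

In the notebook, `build_model` could wrap `build_mlp` plus `model.compile`, and `train` could call `model.fit` with `callbacks=[EarlyStopping(**cfg)]`, so that every configuration starts from newly initialized weights.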