<a href="https://colab.research.google.com/github/glopez21/Deep-Learning-Intro/blob/main/4_Early_Stop.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Early Stopping in Keras to Prevent Overfitting

It can be difficult to determine how many epochs to use when training a neural network. If you train for too many epochs, the network overfits: it begins to memorize the training data rather than generalize, so it performs poorly on new data despite attaining good accuracy on the training set. Figure 3.OVER demonstrates this effect.

**Figure 3.OVER: Training vs. Validation Error for Overfitting**
![Training vs. Validation Error for Overfitting](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/class_3_training_val.png "Training vs. Validation Error for Overfitting")

It is important to segment the original dataset into several datasets:

* **Training Set**
* **Validation Set**
* **Holdout Set**

You can construct these sets in several different ways. The following programs demonstrate some of these.

The first method uses a training set and a validation set. We train the neural network on the training data until the error on the validation set no longer improves, which attempts to stop at a near-optimal training point. This method gives accurate "out of sample" predictions only for the validation set, which is usually 20% of the data. The predictions for the training data will be overly optimistic, because these are the data that we used to train the neural network. Figure 3.VAL demonstrates how we divide the dataset.

**Figure 3.VAL: Training with a Validation Set**
![Training with a Validation Set](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/class_1_train_val.png "Training with a Validation Set")
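
The examples in this section create only a training set and a validation set. If a holdout set is also wanted, one possible approach is to call `train_test_split` twice. The sketch below is an illustration rather than part of the original notebook: the 60/20/20 proportions, the synthetic arrays, and the variable names are all assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 100 rows with 4 features each.
x = np.arange(400, dtype=float).reshape(100, 4)
y = np.arange(100, dtype=float)

# First split: set aside 20% of the rows as the holdout set.
x_rest, x_holdout, y_rest, y_holdout = train_test_split(
    x, y, test_size=0.20, random_state=42)

# Second split: 25% of the remaining 80% equals 20% of the original data.
x_train, x_val, y_train, y_val = train_test_split(
    x_rest, y_rest, test_size=0.25, random_state=42)

print(len(x_train), len(x_val), len(x_holdout))
```

In this arrangement the validation set would drive early stopping, while the holdout set stays untouched until a final evaluation.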

## Early Stopping with Classification

We will now see an example of classification training with early stopping. We will train the neural network until the error no longer improves on the validation set.

In [None]:
import pandas as pd
import io
import requests
import numpy as np
from sklearn import metrics
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.metrics import accuracy_score

In [None]:
df = pd.read_csv('https://raw.githubusercontent.com/glopez21/Deep-Learning-Intro/main/data/iris.csv', na_values=['NA', '?'])

In [None]:
# Convert to numpy - Classification
x = df[['sepal_l', 'sepal_w', 'petal_l', 'petal_w']].values
dummies = pd.get_dummies(df['species']) # Classification
species = dummies.columns
y = dummies.values

In [None]:
# Split into validation and training sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=42)

In [None]:
# Build neural network
model = Sequential()
model.add(Dense(50, input_dim=x.shape[1], activation='relu')) # Hidden 1
model.add(Dense(25, activation='relu')) # Hidden 2
model.add(Dense(y.shape[1],activation='softmax')) # Output
model.compile(loss='categorical_crossentropy', optimizer='adam')

In [None]:
# Defining the early stopping parameters
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, verbose=1, mode='auto', restore_best_weights=True)

In [None]:
# Training the model
model.fit(x_train,y_train,validation_data=(x_test,y_test), callbacks=[monitor],verbose=2,epochs=1000)

Epoch 1/1000
4/4 - 1s - loss: 1.2650 - val_loss: 1.3043 - 998ms/epoch - 250ms/step
Epoch 2/1000
4/4 - 0s - loss: 1.1698 - val_loss: 1.2006 - 35ms/epoch - 9ms/step
Epoch 3/1000
4/4 - 0s - loss: 1.0941 - val_loss: 1.1213 - 48ms/epoch - 12ms/step
Epoch 4/1000
4/4 - 0s - loss: 1.0395 - val_loss: 1.0546 - 39ms/epoch - 10ms/step
Epoch 5/1000
4/4 - 0s - loss: 0.9945 - val_loss: 0.9953 - 30ms/epoch - 8ms/step
Epoch 6/1000
4/4 - 0s - loss: 0.9480 - val_loss: 0.9442 - 35ms/epoch - 9ms/step
Epoch 7/1000
4/4 - 0s - loss: 0.9087 - val_loss: 0.8988 - 38ms/epoch - 10ms/step
Epoch 8/1000
4/4 - 0s - loss: 0.8720 - val_loss: 0.8590 - 32ms/epoch - 8ms/step
Epoch 9/1000
4/4 - 0s - loss: 0.8400 - val_loss: 0.8240 - 34ms/epoch - 9ms/step
Epoch 10/1000
4/4 - 0s - loss: 0.8124 - val_loss: 0.7990 - 36ms/epoch - 9ms/step
Epoch 11/1000
4/4 - 0s - loss: 0.7894 - val_loss: 0.7738 - 33ms/epoch - 8ms/step
Epoch 12/1000
4/4 - 0s - loss: 0.7688 - val_loss: 0.7489 - 30ms/epoch - 8ms/step
Epoch 13/1000
4/4 - 0s - loss: 

<keras.callbacks.History at 0x7f2faa035b10>

Several parameters are passed to the **EarlyStopping** object:

* **monitor** The quantity to watch. Here it is **val_loss**, the loss on the validation data.
* **min_delta** The minimum change in the monitored value that counts as an improvement. Keep this value small; setting it even smaller is unlikely to have much impact.
* **patience** The number of epochs that training will wait for the validation error to improve before stopping.
* **verbose** How much progress information to print.
* **mode** Whether the monitored value should be minimized or maximized. Consider accuracy, where higher numbers are desired, versus log-loss/RMSE, where lower numbers are desired. In general, leave this set to "auto", which infers the direction from the name of the monitored quantity.
* **restore_best_weights** This should always be set to True. It restores the weights to the values they had at the epoch with the best validation score (here, the lowest **val_loss**). Unless you are manually tracking the weights yourself (we do not use this technique in this course), you should have Keras perform this step for you.
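
To make the interaction between **patience** and **min_delta** concrete, here is a simplified pure-Python sketch of the stopping rule for a value that should be minimized. It is not Keras's actual implementation; the function name `simulate_early_stopping` and the list of losses are hypothetical and serve only as an illustration.

```python
def simulate_early_stopping(val_losses, patience=5, min_delta=1e-3):
    """Return (stopped_epoch, best_epoch, best_loss) for a list of validation losses.

    Epochs are zero-based. stopped_epoch is None if no early stop is triggered.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # number of epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        # An epoch counts as an improvement only if it beats the best
        # loss so far by more than min_delta.
        if loss < best_loss - min_delta:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch, best_loss
    return None, best_epoch, best_loss


# Hypothetical validation losses, not taken from the runs in this section.
losses = [1.0, 0.8, 0.7, 0.71, 0.705, 0.70, 0.72, 0.73]
stopped, best, best_loss = simulate_early_stopping(losses, patience=3)
print(f"stopped at epoch {stopped}, best epoch {best}, best loss {best_loss}")
```

In this made-up run the loss stops improving after epoch 2, so with a patience of 3 the loop ends at epoch 5. Restoring the best weights would correspond to keeping the model from epoch 2 rather than from epoch 5.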

As you can see from the output above, training did not use all of the requested epochs. It stopped once the error on the validation set no longer improved.
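
The `History` object returned by `model.fit` has a `history` attribute, a dictionary that maps each metric name to a list with one value per epoch, so the number of epochs actually run can be read from it. The helper below is a hypothetical convenience function (not part of Keras), and the dictionary holds made-up numbers standing in for `history.history`.

```python
def summarize_history(history_dict, monitor="val_loss"):
    """Return (epochs_run, best_epoch, best_value); best_epoch is one-based."""
    values = history_dict[monitor]
    # Index of the smallest monitored value (assumes lower is better).
    best_index = min(range(len(values)), key=values.__getitem__)
    return len(values), best_index + 1, values[best_index]


# Made-up values in the same shape as `history.history`.
fake_history = {
    "loss": [1.2, 0.9, 0.7, 0.6, 0.55],
    "val_loss": [1.3, 1.0, 0.8, 0.85, 0.9],
}
epochs_run, best_epoch, best_value = summarize_history(fake_history)
print(f"epochs run: {epochs_run}, best epoch: {best_epoch}, best val_loss: {best_value}")
```

With a real model, one would assign the result of `model.fit(...)` to a variable such as `history` and pass `history.history` to the helper.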

In [None]:
pred = model.predict(x_test)
predict_classes = np.argmax(pred,axis=1)
expected_classes = np.argmax(y_test,axis=1)
correct = accuracy_score(expected_classes,predict_classes)
print(f"Accuracy: {correct}")

Accuracy: 1.0
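
The accuracy calculation above relies on `np.argmax` to turn each row of softmax probabilities, and each one-hot target row, into a single class index. The small example below shows that conversion on hand-made arrays; the probabilities, targets, and class names are placeholders and do not come from the iris model.

```python
import numpy as np

# Hypothetical softmax outputs for three samples and three classes.
pred = np.array([[0.90, 0.07, 0.03],
                 [0.10, 0.20, 0.70],
                 [0.25, 0.60, 0.15]])

# Hypothetical one-hot targets for the same three samples.
y_true = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [1, 0, 0]])

# Placeholder class names (assumption), playing the role of `species`.
class_names = np.array(["class_a", "class_b", "class_c"])

predicted = np.argmax(pred, axis=1)    # most probable class per row
expected = np.argmax(y_true, axis=1)   # position of the 1 in each one-hot row
accuracy = np.mean(predicted == expected)

print(class_names[predicted], accuracy)
```

Indexing the array of class names with the predicted indices maps the numeric predictions back to readable labels.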


## Early Stopping with Regression

The following code demonstrates how we can apply early stopping to a regression problem.  The technique is similar to the early stopping for classification code that we just saw.

In [None]:
df = pd.read_csv('https://raw.githubusercontent.com/glopez21/Deep-Learning-Intro/main/data/auto-mpg.csv', na_values=['NA', '?'])

In [None]:
cars = df['name']

In [None]:
# Handle missing value
df['horsepower'] = df['horsepower'].fillna(df['horsepower'].median())

In [None]:
# Pandas to Numpy
x = df[['cylinders', 'displacement', 'horsepower', 'weight','acceleration', 'year', 'origin']].values
y = df['mpg'].values

In [None]:
# Split into validation and training sets
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42)

In [None]:
# Build the neural network
model = Sequential()
model.add(Dense(25, input_dim=x.shape[1], activation='relu')) # Hidden 1
model.add(Dense(10, activation='relu')) # Hidden 2
model.add(Dense(1)) # Output
model.compile(loss='mean_squared_error', optimizer='adam')

In [None]:
# Defining the early stopping parameters
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, patience=5, verbose=1, mode='auto', restore_best_weights=True)

In [None]:
# Training the model
model.fit(x_train,y_train,validation_data=(x_test,y_test), callbacks=[monitor], verbose=2,epochs=1000)

Epoch 1/1000
10/10 - 1s - loss: 613.8735 - val_loss: 326.4026 - 569ms/epoch - 57ms/step
Epoch 2/1000
10/10 - 0s - loss: 152.3117 - val_loss: 117.1055 - 60ms/epoch - 6ms/step
Epoch 3/1000
10/10 - 0s - loss: 111.9459 - val_loss: 100.3307 - 45ms/epoch - 4ms/step
Epoch 4/1000
10/10 - 0s - loss: 88.7659 - val_loss: 56.1233 - 65ms/epoch - 7ms/step
Epoch 5/1000
10/10 - 0s - loss: 71.3055 - val_loss: 60.7528 - 42ms/epoch - 4ms/step
Epoch 6/1000
10/10 - 0s - loss: 70.2268 - val_loss: 60.1952 - 42ms/epoch - 4ms/step
Epoch 7/1000
10/10 - 0s - loss: 68.7826 - val_loss: 56.0937 - 45ms/epoch - 4ms/step
Epoch 8/1000
10/10 - 0s - loss: 66.9420 - val_loss: 53.9640 - 40ms/epoch - 4ms/step
Epoch 9/1000
10/10 - 0s - loss: 69.7120 - val_loss: 61.3949 - 62ms/epoch - 6ms/step
Epoch 10/1000
10/10 - 0s - loss: 67.5225 - val_loss: 59.2326 - 65ms/epoch - 6ms/step
Epoch 11/1000
10/10 - 0s - loss: 66.9951 - val_loss: 51.9686 - 42ms/epoch - 4ms/step
Epoch 12/1000
10/10 - 0s - loss: 64.3653 - val_loss: 52.6477 - 40m

<keras.callbacks.History at 0x7f2fa3dba950>

Finally, we evaluate the error.

In [None]:
# Measure RMSE, a common error metric for regression.
pred = model.predict(x_test)
# scikit-learn expects the true values first, then the predictions.
score = np.sqrt(metrics.mean_squared_error(y_test, pred))
print(f"Final score (RMSE): {score}")

Final score (RMSE): 6.549242720482738
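
RMSE can also be computed directly with NumPy, which makes the formula explicit. One detail to watch: a Keras model with a single output unit returns predictions of shape `(n, 1)`, while the targets here have shape `(n,)`, and subtracting those two shapes broadcasts to an `(n, n)` matrix. The sketch below uses tiny made-up arrays (not the auto-mpg results) and flattens the predictions first.

```python
import numpy as np

# Made-up predictions shaped like Keras output for one output unit: (2, 1).
pred = np.array([[2.0], [4.0]])
# Made-up targets shaped like y_test: (2,).
y_true = np.array([1.0, 5.0])

# Flatten the predictions so the subtraction is element-wise.
rmse = np.sqrt(np.mean((pred.flatten() - y_true) ** 2))

# Without flattening, (2, 1) - (2,) broadcasts to a (2, 2) matrix.
broadcast_shape = (pred - y_true).shape

print(rmse, broadcast_shape)
```

Both errors in this toy case are 1 in magnitude, so the RMSE is 1.0. The unflattened subtraction would instead average four squared differences and give a different, incorrect value.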
