# Early stops in NN
A problem with complex networks, or networks that train for too long, is that they may overfit to the training data and start to perform poorly on the validation set.

Often, for NNs, we define three data sets:
- training
- validation
- holdout

The holdout set is not seen by the NN until a final performance measure is needed, so it is meant to give an unbiased estimate of the network's performance.

For training and validation, normally an 80/20 split is used in favour of training data. We can use the validation set to trigger an early stop when the NN starts to overfit.

We can separate out our data sets with an `sklearn` function:

In [None]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(
    x, y,
    test_size=0.20,
    random_state=414141  # seed value for prng
)
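
The call above produces only two sets. One way to obtain all three sets listed earlier is to call `train_test_split` twice: first to set aside the holdout set, then to split the remainder 80/20 into training and validation. The following is a minimal sketch under assumptions: the toy data, the names `x_rest`, `x_val` and `x_holdout`, and the 20% holdout fraction are illustrative and do not come from the original notebook.

```python
from sklearn.model_selection import train_test_split

# Toy data (an assumption for illustration): 100 samples with binary labels.
x = [[i] for i in range(100)]
y = [i % 2 for i in range(100)]

# First split: set aside 20% as the holdout set, untouched until the end.
x_rest, x_holdout, y_rest, y_holdout = train_test_split(
    x, y,
    test_size=0.20,
    random_state=414141,  # seed value for prng
)

# Second split: divide the remainder 80/20 into training and validation.
x_train, x_val, y_train, y_val = train_test_split(
    x_rest, y_rest,
    test_size=0.20,
    random_state=414141,
)

print(len(x_train), len(x_val), len(x_holdout))
```

With this arrangement the validation set can drive the early stop, while the holdout set stays unseen until the final evaluation.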

## Classification
Using the network defined in the Iris notebook, we add the following between the `.compile()` and `.fit()` function calls:

In [None]:
import tensorflow as tf

monitor = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    min_delta=1e-3,
    patience=5,
    verbose=1,
    mode='min',
    restore_best_weights=True
)

We instruct the [`EarlyStopping`](https://keras.io/api/callbacks/early_stopping/) class to monitor the validation loss (`val_loss`) with the following settings:
- `min_delta`: minimum change in the monitored value that counts as an improvement. If `val_loss` improves by less than `min_delta`, the epoch counts as showing no improvement.
- `patience`: the number of epochs without improvement that we tolerate before training is stopped.
- `mode`: can be either `min`, `max`, or the default `auto`. It sets whether the monitored value should be minimized (as for a loss) or maximized (as for an accuracy); `auto` infers the direction from the name of the monitored value.
- `restore_best_weights`: after the stop is triggered, restores the weights from the epoch with the best monitored value, here the lowest `val_loss`.
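
To make the interplay of `min_delta` and `patience` concrete, here is a minimal pure-Python sketch of the stopping rule applied to a made-up list of validation losses. It is an illustration only, not the Keras implementation; the function name `early_stop_epoch` and the loss values are hypothetical.

```python
def early_stop_epoch(val_losses, min_delta=1e-3, patience=5):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    stop_epoch is None if training would run to the end.
    Epochs are counted from 0.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # consecutive epochs without a sufficient improvement

    for epoch, loss in enumerate(val_losses):
        # An epoch only counts as an improvement if the loss drops
        # by more than min_delta below the best value seen so far.
        if best_loss - loss > min_delta:
            best_loss = loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Hypothetical validation losses: improving at first, then overfitting.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.56, 0.60]
stop, best = early_stop_epoch(losses, min_delta=1e-3, patience=3)
print(stop, best)
```

In this sketch `best` marks the epoch whose weights would be kept when restoring the best weights, while `stop` marks the epoch at which training ends.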

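To use this `monitor` instance, we pass a few new parameters to `.fit()`: `validation_data` supplies the set on which `val_loss` is computed, `callbacks` registers the monitor, and `epochs` sets an upper bound on training that the early stop will normally cut short: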

In [None]:
model.fit(
    x_train, y_train, 
    validation_data=(x_test, y_test),
    callbacks=[monitor],
    epochs=1000
)

We can then evaluate our model exactly as before.

## Regression
We can use exactly the same prescription for regression models, with no change to the function calls or parameters, since `val_loss` is reported whatever loss function the model was compiled with. Such is the power of abstraction.