# Introduction #

In this lesson, you'll learn how to diagnose **overfitting** and **underfitting** and pick up some strategies to correct them.

There's a trade-off between how flexible a machine learning model is and how prone it is to overfitting.

# Interpreting the Learning Curves #

The gap between the error on the training set and the error on the test set is the **generalization error**. It measures what the model has learned about the training data that isn't true in general. For instance, it might happen by chance that the cars in the training set which are blue and owned by someone whose name starts with 'N' all have especially low gas mileage. A neural network might learn this as a rule even though there's no real connection.

The smaller a dataset is, the more likely chance correspondences like this are.
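To make the effect of dataset size concrete, here is a small simulation sketch. It assumes a made-up setting: a target drawn from pure noise and an irrelevant yes/no flag (standing in for something like "blue car, owner's name starts with 'N'"). The function names and sample sizes are hypothetical, chosen only for illustration. Any gap between the two groups is spurious by construction, and it tends to be much larger when there are few rows.

```python
import random
import statistics


def apparent_effect(n_samples, rng):
    """Absolute gap in mean target between two groups split by an irrelevant flag."""
    # The flag alternates True/False, so the groups are balanced and unrelated to the target.
    flags = [i % 2 == 0 for i in range(n_samples)]
    # The target is pure noise: by construction the flag carries no information about it.
    targets = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    group_a = [t for f, t in zip(flags, targets) if f]
    group_b = [t for f, t in zip(flags, targets) if not f]
    return abs(statistics.mean(group_a) - statistics.mean(group_b))


def average_effect(n_samples, n_trials=200, seed=0):
    """Average the spurious gap over many simulated datasets of the same size."""
    rng = random.Random(seed)
    return statistics.mean(apparent_effect(n_samples, rng) for _ in range(n_trials))


small = average_effect(n_samples=20)
large = average_effect(n_samples=2000)
print(f"average spurious gap with 20 rows:   {small:.3f}")
print(f"average spurious gap with 2000 rows: {large:.3f}")
```

A flexible model trained on the small dataset has more of these accidental patterns available to memorize, which is one reason overfitting is more of a risk with less data.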

# Early Stopping #

If you've taken the Introduction to Machine Learning Course, you might remember learning about overfitting in Lesson 4, where you saw how to choose the parameters of a decision tree that gave you the best validation loss. You saw a curve like this:

which looks pretty much the same as the learning curves we've produced while training our neural nets.

With the decision tree, you chose the number of nodes that would minimize validation loss. With SGD, you can choose the number of *epochs* that minimizes the validation loss. Stopping training at the epoch where the validation loss is lowest is called **early stopping**. It's a simple and effective technique that you should almost always use.
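As a rough illustration of picking the epoch with the lowest validation loss, here is a short sketch. The loss values are invented for the example and do not come from any real training run; the variable names are hypothetical as well.

```python
# Hypothetical per-epoch losses, made up for illustration only.
train_loss = [0.90, 0.60, 0.45, 0.35, 0.28, 0.22, 0.18]
val_loss = [0.95, 0.70, 0.58, 0.52, 0.55, 0.60, 0.66]

# Index of the epoch with the lowest validation loss (0-indexed).
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Index of the epoch with the lowest training loss, for comparison.
lowest_train_epoch = min(range(len(train_loss)), key=train_loss.__getitem__)

# Gap between validation and training loss at the chosen epoch.
gap_at_best = val_loss[best_epoch] - train_loss[best_epoch]

print(f"epoch with lowest validation loss: {best_epoch}")
print(f"epoch with lowest training loss:   {lowest_train_epoch}")
print(f"validation minus training loss at the chosen epoch: {gap_at_best:.2f}")
```

In this made-up run the training loss keeps falling until the last epoch, while the validation loss bottoms out earlier and then rises. The validation minimum is the point early stopping aims for.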

## Example - Early Stopping in Keras ##

To do early stopping in Keras, we use a **callback**. A callback in Keras is just a function you want to run every so often during training. Keras has [a variety of useful callbacks](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks) pre-defined, but you can [define your own](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/LambdaCallback), too.

Here's how to define the `EarlyStopping` callback.

```python
from tensorflow.keras.callbacks import EarlyStopping

early_stopping = EarlyStopping(
    min_delta=0.01,  # minimum amount of change to count as an improvement
    patience=5,  # how many epochs to wait before stopping
    restore_best_weights=True,  # roll back to the weights from the best epoch
)
```

These parameters say: "If there hasn't been at least an improvement of 0.01 in the validation loss over 5 epochs, then stop the training and keep the best model we found."

We'll use it later when we train a model.
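The stopping rule described above can be sketched in plain Python. This is a simplified illustration of the `min_delta` and `patience` logic, not Keras's actual implementation. The function name `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, min_delta=0.01, patience=5):
    """Simulate a simplified early-stopping rule on a list of validation losses.

    Returns (stop_epoch, best_epoch), both 0-indexed. stop_epoch is None
    if the rule never triggers before the losses run out.
    """
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        # Only a drop of more than min_delta counts as an improvement.
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            # Stop once `patience` epochs in a row have failed to improve.
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    return None, best_epoch


# Hypothetical validation losses, made up for illustration only.
val_losses = [0.80, 0.60, 0.50, 0.495, 0.51, 0.50, 0.52, 0.53, 0.54, 0.55]
stop_epoch, best_epoch = early_stopping_epoch(val_losses)
print(f"training stops after epoch {stop_epoch}; best epoch was {best_epoch}")
```

Note that the loss at epoch 3 (0.495) is slightly lower than at epoch 2 (0.50), but the drop is smaller than `min_delta`, so it does not reset the patience counter. With `restore_best_weights=True`, the model kept at the end would be the one from the best epoch, not the one from the epoch where training stopped.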

# Dropout #

You might be familiar with the random forest model. (You might have learned about it in Lesson **TODO: LESSON** of [Introduction to Machine Learning]()**TODO: LINK**.) Recall that a random forest creates an *ensemble* of decision trees by combining the predictions of many individual trees.

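During training, dropout randomly switches off a fraction of a layer's units at each step, so each step trains a slightly different sub-network. In effect, dropout creates an ensemble of sub-networks during training.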
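Here is a minimal sketch of the idea. It assumes the common "inverted dropout" formulation, in which each unit is zeroed with probability `rate` during training and the surviving units are scaled by `1 / (1 - rate)`. The helper `dropout` below is hypothetical and works on plain Python lists for illustration; a real layer operates on tensors.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Apply inverted dropout to a list of activations (illustrative helper)."""
    # At inference time (or with rate 0) the activations pass through unchanged.
    if not training or rate == 0.0:
        return list(activations)
    # Scale survivors so the expected value of each unit stays the same.
    scale = 1.0 / (1.0 - rate)
    return [a * scale if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)
activations = [1.0] * 8

# Each training pass draws a fresh mask, so a different sub-network is active.
first_pass = dropout(activations, rate=0.5, rng=rng)
second_pass = dropout(activations, rate=0.5, rng=rng)

# With training=False nothing is dropped.
inference = dropout(activations, rate=0.5, rng=rng, training=False)

print("training pass 1:", first_pass)
print("training pass 2:", second_pass)
print("inference:      ", inference)
```

Because the mask is redrawn on every pass, no single unit can rely on any other specific unit always being present, which is where the ensemble-like behavior comes from.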

# Example - Train a Model with Regularization #

# Conclusion #

# Your Turn #