# <center> Keras </center>
## <center>1.8 Overfitting</center>

# Overfitting

Overfitting describes a model that fits its training data too closely.

Overfitting happens when a model learns the detail and noise in the training data to the extent that its performance on new data suffers. The noise or random fluctuations in the training data are picked up and learned as concepts by the model. The problem is that these concepts do not apply to new data, so they reduce the model's ability to generalize.


<img src="https://i.stack.imgur.com/13vdb.png" width = "70%" /><br>
There have been only 56 presidential elections and 43 presidents. That is not a lot of data to learn from. When the predictor space expands to include things like having false teeth and the Scrabble point value of names, it is easy for the model to go from fitting the generalizable features of the data (the signal) to matching the noise. When this happens, the fit on the historical data may improve, but the model will fail miserably when used to make inferences about future presidential elections.
<br> <br>

Here's a graph which illustrates overfitting: 
<img src="img/Overfitting.png" /><br>

The left graph does not capture the required behaviour (it underfits), whereas the rightmost graph overfits the training data. Such an overfitted network will be very accurate on the training data but noticeably worse on test data. The middle figure represents a good compromise.
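
The same three regimes can be reproduced numerically. The following is a minimal sketch that uses polynomial regression instead of a neural network; the sine target, the noise level, the sample sizes, and the degrees 1, 3, and 9 are all assumptions chosen for illustration and do not come from the figure above.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    # Assumed ground-truth signal for this illustration
    return np.sin(np.pi * x)

# Small noisy training set and a larger noisy held-out set
x_train = np.linspace(-1, 1, 12)
y_train = true_fn(x_train) + rng.normal(scale=0.2, size=x_train.shape)
x_test = np.linspace(-1, 1, 200)
y_test = true_fn(x_test) + rng.normal(scale=0.2, size=x_test.shape)

def mse(coeffs, x, y):
    # Mean squared error of the polynomial given by coeffs
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

results = {}
for degree in (1, 3, 9):  # low, moderate, and high model capacity
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = {
        "train": mse(coeffs, x_train, y_train),
        "test": mse(coeffs, x_test, y_test),
    }
    print(degree, results[degree])
```

The training error can only shrink as the degree grows, because each higher-degree polynomial contains the lower-degree ones as special cases. The held-out error does not follow the same trend, and the gap between the two columns for the highest degree is the signature of overfitting. The exact numbers depend on the random seed.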

<br>
### Generalization


Generalization refers to how well the concepts learned by a machine learning model apply to examples the model did not see during training.

## Best Practice

It makes sense to overfit your network deliberately at first, for two reasons:
- To find out where the boundary between underfitting and overfitting lies
- To gain confidence that the problem can be solved at all
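
One common way to read this advice is as a sanity check: confirm that a model can memorize a small training set before worrying about generalization. The sketch below is a stand-in written with NumPy, not the Keras network from this notebook. It trains a plain logistic regression on 20 samples with 50 features and purely random labels; the sizes, learning rate, and step count are assumptions made for illustration.

```python
import numpy as np

def train_logistic(X, y, lr=0.3, steps=5000):
    """Plain gradient descent on the logistic loss; returns the weights."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))  # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)    # gradient step
    return w

def accuracy(w, X, y):
    # Fraction of samples whose predicted class matches the label
    return float(np.mean((X @ w > 0) == (y == 1)))

rng = np.random.default_rng(0)

# Tiny training set: more features than samples, labels are pure noise
X_small = rng.normal(size=(20, 50))
y_small = rng.integers(0, 2, size=20)

# Fresh samples drawn the same way, never seen during training
X_new = rng.normal(size=(200, 50))
y_new = rng.integers(0, 2, size=200)

w = train_logistic(X_small, y_small)
train_acc = accuracy(w, X_small, y_small)
new_acc = accuracy(w, X_new, y_new)
print("train accuracy:", train_acc)
print("accuracy on new samples:", new_acc)
```

Because the labels are random, there is nothing to generalize, so perfect training accuracy here only shows that the model has enough capacity to memorize. On a real problem, a model that cannot reach a low training error even on a small subset usually points to a bug or to too little capacity.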

# Code

In [None]:
# Importing the MNIST dataset
from keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Processing the input data
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255

# Processing the output data
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Build a network
from keras import models
from keras import layers

network = models.Sequential()
network.add(layers.Dense(units=512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(units=10, activation='softmax'))

# Compile the network
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

# Train the network
history = network.fit(train_images, train_labels, epochs=5, batch_size=128, 
                      verbose=1, validation_data=(test_images, test_labels))

In [None]:
import matplotlib.pyplot as plt

def plot_training_history(history):
    """Plot training vs. validation accuracy and loss per epoch."""
    hist = history.history
    # Older Keras releases store accuracy under 'acc', newer ones under 'accuracy'
    acc_key = 'accuracy' if 'accuracy' in hist else 'acc'

    # Accuracy
    plt.plot(hist[acc_key])
    plt.plot(hist['val_' + acc_key])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

    # Loss
    plt.plot(hist['loss'])
    plt.plot(hist['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

In [None]:
# Plot the training results
plot_training_history(history)
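
The curves can also be read numerically. The following is a minimal sketch, assuming a dictionary shaped like `history.history` with `loss` and `val_loss` lists; the helper name `summarize_history` and the numbers in `example_hist` are hypothetical and are not results from the run above.

```python
def summarize_history(hist):
    """Summarize a dict shaped like Keras' `history.history`."""
    loss = hist["loss"]
    val_loss = hist["val_loss"]
    # Index of the epoch with the lowest validation loss
    best = min(range(len(val_loss)), key=val_loss.__getitem__)
    return {
        # Epochs are reported 1-based to match the training log
        "best_epoch": best + 1,
        # Gap between validation and training loss after the last epoch
        "final_loss_gap": val_loss[-1] - loss[-1],
        # True if validation loss ended above its minimum
        "val_loss_rising": val_loss[-1] > val_loss[best],
    }

# Made-up numbers for illustration only
example_hist = {
    "loss": [0.60, 0.40, 0.30, 0.22, 0.15],
    "val_loss": [0.62, 0.45, 0.41, 0.44, 0.50],
}

summary = summarize_history(example_hist)
print(summary)
```

With a real run, the call would be `summarize_history(history.history)`. A training loss that keeps falling while the validation loss has already passed its minimum is the pattern to look for in the plots.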

# Task

Questions: 
- How can you tell whether a network is overfitting?
- What can cause overfitting?
- What countermeasures can be taken against overfitting?
- Should overfitting be avoided in every case?

# Feedback
<a href = "http://goto/ml101_doc/Keras13">Feedback: Overfitting</a> <br>

# Navigation

<div>
<span> <h3 style="display:inline">&lt;&lt; Prev: <a href = "Keras12.ipynb">Batch size and Epochs</a></h3> </span>
<span style="float: right"><h3 style="display:inline">Next: <a href = "Keras14.ipynb">Dropout</a> &gt;&gt; </h3></span>
</div>