<a href="https://colab.research.google.com/github/PaulToronto/DataCamp---Introduction-to-Deep-Learning-with-Keras/blob/main/Part_3_of_4_Improving_Your_Model_Performance.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Part 3 of 4 - Improving Your Model Performance

## Imports

In [1]:
import numpy as np

from sklearn.model_selection import train_test_split

## Learning curves

- Learning curves provide a lot of information about your model
- So far, we've seen two types:
    1. Loss curve
        - <img src='https://raw.githubusercontent.com/PaulToronto/DataCamp---Introduction-to-Deep-Learning-with-Keras/main/images/loss_curve.png'/>
        - tends to decrease as epochs go by
        - this is expected since our model is learning to minimize loss
        - after a certain number of epochs, the value converges and we've arrived at a minimum
    2. Accuracy curve
        - <img src='https://raw.githubusercontent.com/PaulToronto/DataCamp---Introduction-to-Deep-Learning-with-Keras/main/images/accuracy_curve.png'/>
        - accuracy tends to increase as epochs go by
        - this shows that the model makes fewer mistakes as it learns
- Overfitting
    - <img src='https://raw.githubusercontent.com/PaulToronto/DataCamp---Introduction-to-Deep-Learning-with-Keras/main/images/overfitting.png'/>
    - Overfitting is when our model starts learning particularities of the training data that don't generalize to unseen data
        - The early stopping callback is useful for stopping training before the model starts overfitting
    - Plotting the training and validation curves together lets us identify overfitting
    - When the model overfits, the training and validation curves start to diverge
- Unstable curves
    - <img src='https://raw.githubusercontent.com/PaulToronto/DataCamp---Introduction-to-Deep-Learning-with-Keras/main/images/unstable_curve.png'/>
    - Not all curves are smooth and pretty
    - There are many reasons that can lead to unstable learning curves:
        - the chosen optimizer
        - learning rate
        - batch size
        - network architecture
        - weight initialization
        - ...
    - All these parameters can be tuned
- Can we benefit from more data?
    - Neural networks are well known for surpassing traditional machine learning techniques as the amount of data increases
    - We can check whether collecting more data would improve a model's generalization and accuracy
- We aim at producing a graph like this one, where we have fitted our model with increasing amounts of training data
    - <img src='https://raw.githubusercontent.com/PaulToronto/DataCamp---Introduction-to-Deep-Learning-with-Keras/main/images/aim.png'/>
    - If, after using all our data, the test accuracy still tends to improve (its curve is still rising rather than running parallel to the training curve), then it's worth gathering more data, if possible, so the model can keep learning
        - <img src='https://raw.githubusercontent.com/PaulToronto/DataCamp---Introduction-to-Deep-Learning-with-Keras/main/images/still_increasing.png'/>
- Code for creating a graph like the previous one:

```python
# Assumes tensorflow.keras; use `keras.callbacks` for standalone Keras
from tensorflow.keras.callbacks import EarlyStopping

# Store the initial model weights so every run starts from the same point
initial_weights = model.get_weights()

# Lists for storing accuracies
train_accs = []
test_accs = []

# Loop over a predetermined list of train sizes; for each one,
# take the corresponding fraction of the training data
for train_size in train_sizes:
    # Split off a fraction according to train_size
    X_train_frac, _, y_train_frac, _ = train_test_split(X_train,
                                                        y_train,
                                                        train_size=train_size)
    # Reset the model so it starts with the same set of weights
    model.set_weights(initial_weights)
    model.fit(X_train_frac,
              y_train_frac,
              epochs=100,
              verbose=0,
              callbacks=[EarlyStopping(monitor='loss', patience=1)])
    # Get the accuracy for this training set fraction
    # (index 1 assumes accuracy is the first compiled metric)
    train_acc = model.evaluate(X_train_frac, y_train_frac, verbose=0)[1]
    train_accs.append(train_acc)
    # Get the accuracy on the whole test set
    test_acc = model.evaluate(X_test, y_test, verbose=0)[1]
    test_accs.append(test_acc)
    print('Done with size: ', train_size)
```
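The loop above relies on a `train_sizes` list that is not defined in these notes. A minimal sketch of one way to build it, assuming evenly spaced fractions of the training set are acceptable; the helper name `make_train_sizes` and the 0.1 to 1.0 range are hypothetical choices, not taken from the course:

```python
import numpy as np


def make_train_sizes(n_samples, n_points=4):
    """Return increasing integer train sizes, each smaller than n_samples.

    Hypothetical helper: the fractions 0.1 .. 1.0 are an arbitrary choice.
    """
    fractions = np.linspace(0.1, 1.0, n_points)
    sizes = np.floor(fractions * n_samples).astype(int)
    # train_test_split needs at least one sample left for the discarded
    # split, so cap every size at n_samples - 1
    return np.clip(sizes, 1, n_samples - 1)


# In practice n_samples would be len(X_train); 1257 is only an example value
train_sizes = make_train_sizes(1257)
print(train_sizes)
```

Integer sizes are used here because `train_test_split` interprets an integer `train_size` as an absolute number of samples. Once the loop has run, `train_accs` and `test_accs` can be plotted against `train_sizes` to get the learning curve.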


### Learning the digits

#### The Data

The files `digits_pixels.npy` and `digits_target.npy` can be downloaded via these two links:

https://github.com/PaulToronto/DataCamp---Introduction-to-Deep-Learning-with-Keras/raw/main/data/Digits/digits_pixels.npy

https://github.com/PaulToronto/DataCamp---Introduction-to-Deep-Learning-with-Keras/raw/main/data/Digits/digits_target.npy

I stored them in my Google Drive and I am reading them directly from there. I don't think it is possible to read `*.npy` files directly from GitHub with `np.load()`, because these links trigger an automatic file download instead of giving `np.load()` a local file to open.
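A possible alternative to Google Drive, sketched here without having been run against these particular links: `np.load()` accepts a file-like object, so the bytes can be fetched with the standard library and wrapped in `io.BytesIO`. The function name `load_npy_from_url` is hypothetical.

```python
import io
import urllib.request

import numpy as np


def load_npy_from_url(url, timeout=30):
    """Fetch a .npy file from a URL and load it from memory (sketch)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = response.read()
    # np.load accepts a file-like object, so wrap the raw bytes in BytesIO
    return np.load(io.BytesIO(payload))


# Example call (needs network access, so it is left commented out):
# X = load_npy_from_url('https://github.com/PaulToronto/'
#                       'DataCamp---Introduction-to-Deep-Learning-with-Keras/'
#                       'raw/main/data/Digits/digits_pixels.npy')
```

This keeps the notebook independent of a mounted Drive, at the cost of downloading the files again on every run.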

In [2]:
# Load the Drive helper and mount
from google.colab import drive

# This will prompt for authorization.
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [3]:
file_path = '/content/drive/My Drive/Colab Notebooks/Data Science/Mars-Seluna/'
file_path += 'DataCamp - Introduction to Deep Learning with Keras/'
file_path += 'digits_pixels.npy'

X = np.load(file_path)

file_path = '/content/drive/My Drive/Colab Notebooks/Data Science/Mars-Seluna/'
file_path += 'DataCamp - Introduction to Deep Learning with Keras/'
file_path += 'digits_target.npy'

y = np.load(file_path)

In [4]:
X.shape, X

((1797, 64),
 array([[ 0.,  0.,  5., ...,  0.,  0.,  0.],
        [ 0.,  0.,  0., ..., 10.,  0.,  0.],
        [ 0.,  0.,  0., ..., 16.,  9.,  0.],
        ...,
        [ 0.,  0.,  1., ...,  6.,  0.,  0.],
        [ 0.,  0.,  2., ..., 12.,  0.,  0.],
        [ 0.,  0., 10., ..., 12.,  1.,  0.]]))

In [5]:
y.shape, y

((1797,), array([0, 1, 2, ..., 8, 9, 8]))

In [6]:
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7)

X_train.shape, X_test.shape, y_train.shape, y_test.shape

((1257, 64), (540, 64), (1257,), (540,))
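The labels in `y` are integers from 0 to 9. If the model is trained with a categorical cross-entropy loss, they need to be one-hot encoded first. A minimal NumPy sketch of that step, using a small demo array rather than the real labels; the helper name `one_hot` is hypothetical:

```python
import numpy as np


def one_hot(labels, n_classes):
    """Map integer class labels to one-hot rows (sketch)."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(n_classes, dtype=np.float32)[labels]


y_demo = np.array([0, 1, 2, 8, 9, 8])
y_demo_onehot = one_hot(y_demo, n_classes=10)
print(y_demo_onehot.shape)  # (6, 10)
```

Keras also provides `to_categorical` in its `utils` module for the same purpose. Alternatively, the integer labels can be kept as they are if the model is compiled with `sparse_categorical_crossentropy`.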

#### Build and compile the model

In [7]:
# first, in the data section above, I need to one-hot encode the labels