# Framework for Better Deep Learning
https://machinelearningmastery.com/framework-for-better-deep-learning/

## Problems of poor performance
Three types of problems are straightforward to diagnose when a deep learning neural
network model performs poorly:
- *Problems with Learning*. Problems with learning manifest in a model that cannot
effectively learn a training dataset or shows slow progress or bad performance when learning the training dataset.
- *Problems with Generalization*. Problems with generalization manifest in a model
that overfits the training dataset and makes poor predictions on a holdout dataset.
- *Problems with Predictions*. Problems with predictions manifest in the stochastic
training algorithm having a strong influence on the final model, causing high variance in
behavior and performance.

## How to Use the Framework

**Step 1: Diagnose the Performance Problem**
A robust diagnostic tool is to calculate a learning curve of loss and a
problem-specific metric (like RMSE for regression or accuracy for classification) on a train and
validation dataset over a given number of training epochs.

- If the loss on the training dataset is poor, stuck, or fails to improve, perhaps you have a
learning problem.
- If the loss or problem-specific metric on the training dataset continues to improve and
gets worse on the validation dataset, perhaps you have a generalization problem.
- If the loss or problem-specific metric on the validation dataset shows a high variance
towards the end of the run, perhaps you have a prediction problem.
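
The three checks above can be sketched as a rough heuristic over recorded per-epoch losses. This is only an illustration: the function name `diagnose` and all thresholds (`tail`, `tol`, `gap_tol`, `noise_tol`) are hypothetical and not part of the source framework, and a visual inspection of the curves remains the more reliable diagnostic.

```python
from statistics import pstdev


def diagnose(train_loss, val_loss, tail=5, tol=0.01, gap_tol=0.1, noise_tol=0.05):
    """Return a rough label for a pair of per-epoch loss curves.

    All thresholds are hypothetical defaults; tune them for your problem.
    """
    # Learning problem: training loss barely improved over the whole run.
    if train_loss[0] - train_loss[-1] < tol:
        return "learning"

    # Generalization problem: validation loss ends well above its best value
    # while training loss kept improving after that best epoch.
    best_val = min(val_loss)
    best_epoch = val_loss.index(best_val)
    val_got_worse = val_loss[-1] - best_val > gap_tol
    train_kept_improving = train_loss[-1] < train_loss[best_epoch]
    if val_got_worse and train_kept_improving:
        return "generalization"

    # Prediction problem: validation loss is noisy over the last `tail` epochs.
    if pstdev(val_loss[-tail:]) > noise_tol:
        return "prediction"

    return "no obvious problem"


# Made-up curves: training keeps improving, validation turns upward.
train = [1.0, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
val = [1.0, 0.8, 0.7, 0.65, 0.7, 0.8, 0.9, 1.0]
label = diagnose(train, val)
```

The checks run in the same order as the list above, so a curve that is both rising and noisy is reported as a generalization problem first.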

**Step 2: Select and Evaluate a Technique**
Review the techniques that are designed to address your problem. Select a technique that
appears to be a good fit for your model and problem.
- *Learning Problem*: Tuning the hyperparameters of the learning algorithm; specifically,
the learning rate offers the biggest leverage.
- *Generalization Problem*: Using weight regularization and early stopping works well
on most models with most problems, or try dropout with early stopping.
- *Prediction Problem*: Average the predictions from models collected over multiple runs
or multiple epochs on one run. Averaging adds a small amount of bias in exchange for lower variance in the final prediction.
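
As a minimal sketch of the prediction-averaging idea, assume each run produces one predicted value per example. The helper name `average_predictions` and the numbers are made up for illustration.

```python
from statistics import mean


def average_predictions(runs):
    """Average per-example predictions across several model runs.

    `runs` is a list of lists: one inner list per run, holding one
    predicted value per example, in the same example order.
    """
    return [mean(values) for values in zip(*runs)]


# Three hypothetical runs of the same model on the same three examples.
runs = [
    [0.9, 0.2, 0.6],
    [0.7, 0.4, 0.5],
    [0.8, 0.3, 0.7],
]
ensemble = average_predictions(runs)
```

Each averaged value sits between the extremes of the individual runs, which is why the combined prediction varies less from run to run than any single model.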

**Step 3: Go To Step 1**

# Diagnostic Learning Curves
https://machinelearningmastery.com/learning-curves-for-diagnosing-machine-learning-model-performance/

*Learning Curve: Line plot of learning (y-axis) over experience (x-axis).*

## Good Fit Learning Curves

A plot of learning curves shows a good fit if:

- The plot of training loss decreases to a point of stability.
- The plot of validation loss decreases to a point of stability and has a small gap with the training loss.

<img height="400" width="500" src="https://machinelearningmastery.com/wp-content/uploads/2018/12/Example-of-Train-and-Validation-Learning-Curves-Showing-A-Good-Fit.png">

## Underfit Learning Curves
*Underfitting refers to a model that cannot learn the training dataset.*

A plot of learning curves shows underfitting if:

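- The training loss remains flat regardless of training, which suggests the model lacks the capacity to learn the problem.
- The training loss continues to decrease until the end of training, which suggests the model could still improve and training was halted too early.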

<img height="300" width="400" src="https://machinelearningmastery.com/wp-content/uploads/2019/02/Example-of-Training-Learning-Curve-Showing-An-Underfit-Model-That-Does-Not-Have-Sufficient-Capacity.png">

<img height="300" width="400" src="https://machinelearningmastery.com/wp-content/uploads/2018/12/Example-of-Training-Learning-Curve-Showing-An-Underfit-Model-That-Requires-Further-Training.png">

## Overfit Learning Curves

A plot of learning curves shows overfitting if:

- The plot of training loss continues to decrease with experience.
- The plot of validation loss decreases to a point and begins increasing again.

<img height="500" width="600" src="https://machinelearningmastery.com/wp-content/uploads/2018/12/Example-of-Train-and-Validation-Learning-Curves-Showing-An-Overfit-Model.png">

## Diagnosing Unrepresentative Datasets

An **unrepresentative dataset** means a dataset that may not capture the statistical characteristics relative to another dataset drawn from the same domain, such as between a train and a validation dataset. This can commonly occur if the number of samples in a dataset is too small, relative to another dataset.

There are two common cases that could be observed; they are:
- **Unrepresentative Train Dataset**:
An unrepresentative training dataset means that the training dataset does not provide sufficient information to learn the problem, relative to the validation dataset used to evaluate it.

This may occur if the training dataset has too few examples as compared to the validation dataset.

This situation can be identified by a learning curve for training loss that shows improvement and similarly a learning curve for validation loss that shows improvement, but a large gap remains between both curves.

<img height="500" width="700" src="https://machinelearningmastery.com/wp-content/uploads/2018/12/Example-of-Train-and-Validation-Learning-Curves-Showing-a-Training-Dataset-the-May-be-too-Small-Relative-to-the-Validation-Dataset.png">


- **Unrepresentative Validation Dataset**:
An unrepresentative validation dataset means that the validation dataset does not provide sufficient information to evaluate the ability of the model to generalize.

This may occur if the validation dataset has too few examples as compared to the training dataset.

This case can be identified by a learning curve for training loss that looks like a good fit (or other fits) and a learning curve for validation loss that shows noisy movements around the training loss.

It may also be identified by a validation loss that is lower than the training loss. In this case, it indicates that the validation dataset may be easier for the model to predict than the training dataset.

<img height="350" width="450" src="https://machinelearningmastery.com/wp-content/uploads/2018/12/Example-of-Train-and-Validation-Learning-Curves-Showing-a-Validation-Dataset-the-May-be-too-Small-Relative-to-the-Training-Dataset.png">
<img height="350" width="450" src="https://machinelearningmastery.com/wp-content/uploads/2018/12/Example-of-Train-and-Validation-Learning-Curves-Showing-a-Validation-Dataset-that-is-Easier-to-Predict-than-the-Training-Dataset.png">

# How To Improve Deep Learning Performance
https://machinelearningmastery.com/improve-deep-learning-performance/

## 1. Improve Performance With Data

Here’s a short list of what we’ll cover:

- Get More Data: Can you get more training data?
- Invent More Data: You can use a generative model or simple tricks such as data augmentation.
- Rescale Your Data: Rescale your data to the bounds of your activation functions.
- Transform Your Data: Guesstimate the univariate distribution of each column.
- Feature Selection: Can you remove some attributes from your data using feature importance?
- Reframe Your Problem: Are the observations that you’ve collected the only way to frame your problem?
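
To illustrate the rescaling item, here is a minimal min-max sketch that maps a column to the output range of an activation function, for example [0, 1] for sigmoid or [-1, 1] for tanh. The function name `rescale` and its parameters are hypothetical.

```python
def rescale(column, low=0.0, high=1.0):
    """Min-max rescale a list of numbers into the range [low, high]."""
    col_min, col_max = min(column), max(column)
    if col_max == col_min:
        # Constant column: no spread to scale, so map everything to `low`.
        return [low for _ in column]
    scale = (high - low) / (col_max - col_min)
    return [low + (x - col_min) * scale for x in column]


sigmoid_range = rescale([10, 20, 30])           # targets [0, 1]
tanh_range = rescale([10, 20, 30], -1.0, 1.0)   # targets [-1, 1]
```

In practice the minimum and maximum would be computed on the training data only and then reused for validation and test data, so that no information leaks from the holdout sets.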

## 2. Improve Performance With Algorithms

Here’s the short list:

- Spot-Check Algorithms: Spot-check a suite of top methods and see which fare well and which do not.
- Steal From Literature: A great shortcut to picking a good method is to steal ideas from the literature.
- Resampling Methods: Deep learning methods are slow to train. Perhaps you can perform model selection and tuning using a smaller sample of the dataset, then scale the final technique up to the full dataset at the end.
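
The resampling idea can be sketched as drawing a fixed-size random sample of the rows for tuning. The helper name `subsample` and the 10% fraction are assumptions for illustration. A plain random sample ignores class balance, so a stratified sample may be preferable for classification.

```python
import random


def subsample(rows, fraction, seed=0):
    """Return a reproducible random sample holding `fraction` of the rows."""
    rng = random.Random(seed)  # local generator, leaves global state untouched
    k = max(1, int(len(rows) * fraction))
    return rng.sample(rows, k)


full_dataset = list(range(1000))          # stand-in for real training rows
tuning_set = subsample(full_dataset, 0.1)  # tune on this sample first
```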

## 3. Improve Performance With Algorithm Tuning
Here are some ideas on tuning your neural network algorithms in order to get more out of them.

- Diagnostics: Is your model overfitting or underfitting?
- Weight Initialization: Initialize using small random numbers.
- Learning Rate: Larger networks need more training, and the reverse. If you add more neurons or more layers, retune your learning rate (the source article suggests increasing it).
- Activation Functions: Rectifier (ReLU) activations are the usual default for hidden layers. Before that it was sigmoid and tanh, with a softmax, linear, or sigmoid on the output layer. I don’t recommend trying more than that unless you know what you’re doing.
- Network Topology: How many layers and how many neurons do you need? No one knows. No one. Don’t ask.
- Batches and Epochs: The batch size determines how many examples are used to estimate the gradient, and therefore how often the weights are updated. An epoch is one pass of the entire training dataset through the network, batch by batch.
- Regularization: Regularization is a great approach to curb overfitting the training data.
- Optimization and Loss: Have you experimented with different optimization procedures?
- Early Stopping: You can stop learning once performance starts to degrade.
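
The early stopping item can be sketched as a patience rule over recorded validation losses. This is a simplified illustration, not a library implementation. The names `patience` and `min_delta` mirror the arguments of the Keras `EarlyStopping` callback, while the function itself is hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    # Never triggered: training ran to the final epoch.
    return len(val_losses) - 1, best_epoch


val_losses = [1.0, 0.8, 0.7, 0.65, 0.7, 0.8, 0.9, 1.0]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
```

With these made-up losses the best epoch is index 3 and training stops at index 6. In practice you would restore the weights saved at the best epoch instead of keeping the final ones.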

## 4. Improve Performance With Ensembles
We’ll take a look at three general areas of ensembles you may want to consider:

- Combine Models: Don’t select a model, combine them.
- Combine Views: As above, but train each network on a different view or framing of your problem.
- Stacking: You can also learn how to best combine the predictions from multiple models.
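
As a simplified stand-in for stacking, the sketch below learns a single blend weight for two models by grid search on held-out predictions. Real stacking usually trains a second-level model on the base models' outputs. The function names and the data here are made up for illustration.

```python
def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def fit_blend_weight(pred_a, pred_b, target, steps=100):
    """Grid-search w in [0, 1] minimizing the MSE of w*a + (1 - w)*b."""
    best_w, best_err = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        blended = [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]
        err = mse(blended, target)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


# Held-out targets and predictions from two hypothetical models:
# model A overshoots by 0.2 and model B undershoots by 0.2.
target = [1.0, 2.0, 3.0]
pred_a = [1.2, 2.2, 3.2]
pred_b = [0.8, 1.8, 2.8]
weight, error = fit_blend_weight(pred_a, pred_b, target)
```

Because the two models err in opposite directions, the learned blend beats both of them. The weight should be fit on data that neither base model was trained on, otherwise it tends to favor the model that overfits most.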

# Other Articles

## Why Training a Neural Network Is Hard
https://machinelearningmastery.com/why-training-a-neural-network-is-hard/

 
## Recommendations for Deep Learning Neural Network Practitioners
https://machinelearningmastery.com/recommendations-for-deep-learning-neural-network-practitioners/

## 8 Tricks for Configuring Backpropagation to Train Better Neural Networks
https://machinelearningmastery.com/best-advice-for-configuring-backpropagation-for-deep-learning-neural-networks/

## How to Demonstrate Your Basic Skills with Deep Learning
https://machinelearningmastery.com/how-to-demonstrate-basic-deep-learning-competence/ 
