# Advice for Applying Machine Learning

What should you do if your model (e.g. regularised linear regression) is making unacceptably large errors in its predictions?

Options include:

+ Collect more training examples.
+ Try smaller sets of features.
+ Try getting additional features.
+ Try adding polynomial features.
+ Try decreasing $\lambda$.
+ Try increasing $\lambda$.

How do you know which to try first?

Machine learning diagnostic:  
A diagnostic is a test that you can run to gain insight into what is or isn't working with a learning algorithm, and gain guidance as to how best to improve its performance.

Diagnostics can take time to implement, but are worth it in the end.

## Evaluating a Hypothesis

A very low training error might indicate overfitting rather than a genuinely accurate model. For this reason you should always divide your (randomly ordered) data into a training set and a test set - a 70:30 split is typical - and check the error on the test set after the model has been trained on the training set. A low training error combined with a high test error suggests overfitting.

Procedure:  

+ Learn parameter $\theta$ from training data (minimising training error $J(\theta)$).
+ Compute test error (e.g. for linear regression): $$J_{test}(\theta) = \frac{1}{2m_{test}} \sum^{m_{test}}_{i = 1} (h_\theta(x^{(i)}_{test}) - y^{(i)}_{test})^2$$
+ Compare test error with training error.
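The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: it assumes a single feature, synthetic data, and an unregularised closed-form least-squares fit in place of whatever training method you actually use. The helper names (`train_test_split`, `fit_line`, `squared_error_cost`) are hypothetical and chosen only for this example.

```python
import random


def train_test_split(examples, train_fraction=0.7, seed=0):
    """Shuffle (x, y) pairs and split them into training and test lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def fit_line(examples):
    """Closed-form least-squares fit of h(x) = theta0 + theta1 * x (no regularisation)."""
    m = len(examples)
    mean_x = sum(x for x, _ in examples) / m
    mean_y = sum(y for _, y in examples) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in examples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    theta1 = sxy / sxx
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


def squared_error_cost(theta, examples):
    """J(theta) = 1/(2m) * sum of squared residuals over the given examples."""
    theta0, theta1 = theta
    m = len(examples)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in examples) / (2 * m)


# Synthetic data (an assumption for illustration): y = 3 + 2x plus small noise.
rng = random.Random(42)
data = [(x, 3.0 + 2.0 * x + rng.gauss(0, 0.1)) for x in range(20)]

train, test = train_test_split(data)        # 70:30 split of shuffled data
theta = fit_line(train)                     # learn theta from training data only
j_train = squared_error_cost(theta, train)  # training error
j_test = squared_error_cost(theta, test)    # test error, normalised by m_test
print(f"J_train = {j_train:.4f}, J_test = {j_test:.4f}")
```

If `j_test` turns out much larger than `j_train`, that gap is the overfitting signal described above.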

For logistic regression, you can compute the test error as follows:

$$J_{test}(\theta) = -\frac{1}{m_{test}} \sum^{m_{test}}_{i = 1} \left[ y^{(i)}_{test} \log h_\theta(x^{(i)}_{test}) + (1 - y^{(i)}_{test})\log \left(1 - h_\theta(x^{(i)}_{test})\right) \right]$$
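As a rough sketch, the logistic test error can be computed like this. It uses the standard cross-entropy form, in which the second term is $\log(1 - h_\theta(x))$. The function names are hypothetical, each row of `X_test` is assumed to already include the bias feature $x_0 = 1$, and clipping by `eps` is an added safeguard against $\log 0$ that is not part of the formula itself.

```python
import math


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_test_cost(theta, X_test, y_test, eps=1e-12):
    """Average cross-entropy of h_theta(x) = sigmoid(theta . x) over the test set."""
    total = 0.0
    for x, y in zip(X_test, y_test):
        h = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
        h = min(max(h, eps), 1.0 - eps)  # clip to avoid log(0); an assumption of this sketch
        total += y * math.log(h) + (1 - y) * math.log(1.0 - h)
    return -total / len(y_test)


# With theta = 0 every prediction is 0.5, so the cost is log(2) for any labels.
cost = logistic_test_cost([0.0, 0.0], [[1, 2], [1, -1]], [1, 0])
print(cost)
```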

Alternatively, you can use a different error metric, the misclassification error:

$$ err(h_\theta(x), y) =
\begin{cases}
 1 & \text{if } (h_\theta(x) \geq 0.5 \text{ and } y = 0) \text{ or } (h_\theta(x) < 0.5 \text{ and } y = 1) \\
 0 & \text{otherwise}
\end{cases} $$

That is, 1 if the sample was misclassified, 0 otherwise. Averaging these values over the test set gives the fraction of test samples that your hypothesis misclassifies:

$$ \text{test error } = \frac{1}{m_{test}} \sum^{m_{test}}_{i = 1} err(h_\theta(x^{(i)}_{test}), y^{(i)}_{test}) $$
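A minimal sketch of this metric, assuming you already have the predicted probabilities $h_\theta(x)$ for each test sample. The function name `misclassification_error` and the example values are made up for illustration.

```python
def misclassification_error(predicted_probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded prediction differs from the label."""
    errors = 0
    for h, y in zip(predicted_probs, labels):
        predicted = 1 if h >= threshold else 0  # predict 1 when h >= 0.5
        errors += int(predicted != y)           # err = 1 if misclassified, else 0
    return errors / len(labels)


# Hypothetical test-set predictions and true labels.
probs = [0.9, 0.4, 0.6, 0.2]
labels = [1, 1, 0, 0]
test_error = misclassification_error(probs, labels)
print(test_error)  # 0.5: the second and third samples are misclassified
```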