# Performance Metrics

Performance metrics, or loss functions, measure the overall performance of a model. They indicate how much error the system typically makes in its predictions.

## Terminology

### Error vs Accuracy

Error and accuracy describe the same thing in different ways. Error is the ratio of incorrect evaluations to all evaluations, and accuracy is the ratio of correct evaluations to all evaluations, so $\text{Error} = 1-\text{Accuracy}$ and $\text{Accuracy} = 1-\text{Error}$.
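The relationship above can be checked with a minimal sketch. The function names `accuracy` and `error_rate` and the labels are made up for illustration; they are not taken from any particular library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def error_rate(y_true, y_pred):
    """Fraction of predictions that do not match the true labels."""
    return 1 - accuracy(y_true, y_pred)


# Hypothetical labels, made up for illustration.
y_true = ["A", "B", "A", "A", "B"]
y_pred = ["A", "A", "A", "B", "B"]

acc = accuracy(y_true, y_pred)    # 3 of 5 predictions are correct
err = error_rate(y_true, y_pred)  # the remaining 2 of 5 are wrong
print(f"accuracy={acc:.2f}, error={err:.2f}")
```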

### Baseline Accuracies

For your model to have any significance, it has to do better than a baseline accuracy. With $k$ possible classifications, uniform random guessing gives an expected accuracy of $\frac{1}{k}$. The lowest goal, by definition, is just to do better than random guessing.

Another baseline accuracy for classification comes from the proportion of each class in the data. For example, if 90% of the examples are "A", then a classifier that always returns "A" achieves an accuracy of 90% by default. For this reason, accuracy is not a preferred metric for skewed datasets.
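As a rough illustration of the two baselines, the sketch below computes them for a hypothetical 90/10 dataset. The helper names are assumptions chosen for this example.

```python
from collections import Counter


def uniform_guess_accuracy(labels):
    """Expected accuracy of guessing uniformly at random among k classes."""
    k = len(set(labels))
    return 1 / k


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most common class."""
    counts = Counter(labels)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(labels)


# Hypothetical skewed dataset: 90% "A", 10% "B".
labels = ["A"] * 90 + ["B"] * 10

uniform = uniform_guess_accuracy(labels)       # 1/k with k = 2
majority = majority_baseline_accuracy(labels)  # share of the majority class
print(f"uniform guessing: {uniform:.2f}, always 'A': {majority:.2f}")
```

On data like this, a model reporting 90% accuracy has not necessarily learned anything beyond the class proportions.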



## Regression

### MSE - Mean Squared Error

$$\text{MSE}(X, h) = \frac{1}{m} \sum_{i=1}^m(h(x^{(i)}) - y^{(i)})^2$$


Mean squared error, or squared loss, is the average of the squared differences between a model's predictions and the true values. It measures how far off the predictions are from the true value, in squared units. This loss function forms the basis for many other loss functions.

**Why it's Good**

Predictions that are far from the true value are "punished" more than predictions that are close to it, due to the squared term. In this way, the model learns to avoid making really bad predictions, at the cost of making many smaller errors. The final value is also averaged over all examples, so it does not grow with the size of the dataset. Note that it is expressed in squared units of the target, so to state an error in the original units, e.g. "a typical error of $45,000", take the square root (RMSE).
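To make the effect of the squared term concrete, here is a small sketch that compares two sets of predictions with the same total absolute error. The function name `mse` and the numbers are made up for illustration.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between predictions and true values."""
    m = len(y_true)
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / m


# Hypothetical targets and two sets of predictions.
y_true = [10.0, 10.0, 10.0, 10.0]
many_small = [11.0, 11.0, 11.0, 11.0]  # four errors of size 1
one_large = [10.0, 10.0, 10.0, 14.0]   # one error of size 4

mse_small = mse(y_true, many_small)  # (1 + 1 + 1 + 1) / 4
mse_large = mse(y_true, one_large)   # (0 + 0 + 0 + 16) / 4
print(mse_small, mse_large)
```

Both prediction sets are off by a total of 4, but the single large miss yields an MSE four times higher.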


### RMSE - Root Mean Square Error

$$\text{RMSE}(X, h) = \sqrt{\frac{1}{m} \sum_{i=1}^m(h(x^{(i)}) - y^{(i)})^2}$$

Same as above, but it takes the square root of the MSE. Because $\sqrt{x^2}=x$ for $x \ge 0$, this brings the result back to the original scale of the data.
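A minimal sketch of the formula, using hypothetical house prices in dollars; the function name `rmse` and the values are assumptions for illustration only.

```python
import math


def rmse(y_true, y_pred):
    """Square root of the mean squared error, in the units of the target."""
    m = len(y_true)
    mean_squared = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / m
    return math.sqrt(mean_squared)


# Hypothetical house prices in dollars.
y_true = [200_000.0, 300_000.0]
y_pred = [230_000.0, 260_000.0]  # off by 30,000 and 40,000

value = rmse(y_true, y_pred)
print(f"RMSE: {value:,.0f} dollars")
```

The result is in dollars rather than squared dollars, which is what makes it easier to read than the MSE.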


### MAE - Mean Absolute Error

$$\text{MAE}(X, h) = \frac{1}{m} \sum_{i=1}^m|h(x^{(i)}) - y^{(i)}|$$

Mean absolute error has a few advantages.

1. It provides a clear measure of how close the predictions are to the actual values, in the same units as the target.
2. Points far from the true value are punished on the same linear scale as points close to it. This can be useful in some contexts. For example, outliers do not make the error grow out of control, as they would with MSE, where each error is squared.
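The difference in outlier sensitivity can be sketched as follows. The data and the helper names `mae` and `mse` are hypothetical, chosen only to show the effect.

```python
def mae(y_true, y_pred):
    """Mean of the absolute differences between predictions and true values."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean of the squared differences, shown here for comparison."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets; the second prediction set ends with one outlier.
y_true = [10.0, 10.0, 10.0, 10.0, 10.0]
clean = [11.0, 9.0, 11.0, 9.0, 11.0]         # every error has size 1
with_outlier = [11.0, 9.0, 11.0, 9.0, 30.0]  # last error has size 20

mae_clean, mae_outlier = mae(y_true, clean), mae(y_true, with_outlier)
mse_clean, mse_outlier = mse(y_true, clean), mse(y_true, with_outlier)
print(f"MAE: {mae_clean:.1f} -> {mae_outlier:.1f}")
print(f"MSE: {mse_clean:.1f} -> {mse_outlier:.1f}")
```

With these numbers, the single outlier raises the MAE by a factor of about 5 and the MSE by a factor of about 80.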


### $R^2$ - R-squared

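R-squared indicates the goodness of fit of a model. A value of 1 indicates a perfect fit, and 0 indicates a fit no better than always predicting the mean of the true values. Values usually fall between 0 and 1, although a model that fits worse than the mean can score below 0.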
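The section gives no formula, so the sketch below assumes the common definition of the coefficient of determination, $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$, where $SS_{res}$ is the sum of squared prediction errors and $SS_{tot}$ is the sum of squared deviations of the true values from their mean. The function name and data are made up for illustration. Under this definition the value can drop below 0 when a model fits worse than simply predicting the mean.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical targets.
y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r_squared(y_true, [1.0, 2.0, 3.0, 4.0])    # predictions match exactly
mean_only = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
reversed_fit = r_squared(y_true, [4.0, 3.0, 2.0, 1.0])  # worse than the mean
print(perfect, mean_only, reversed_fit)
```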