# Evaluating model performance

To tell with confidence that your model is working, you need concrete metrics that prove it.

## Confusion matrix
The confusion matrix shows how your model is performing by counting, for each actual level of the target feature, how often each level was predicted. This makes it possible to see where the model performs poorly during classification.

**Matrix visualization**
<table>
    <tr>
        <td colspan="2" rowspan="2"></td>
        <td colspan="2">Prediction</td>
    </tr>
    <tr>
        <td>Positive</td>
        <td>Negative</td>
    </tr>
    <tr>
        <td rowspan="2">Target</td>
        <td>Positive</td>
        <td>TP</td>
        <td>FN</td>
    </tr>
    <tr>
        <td>Negative</td>
        <td>FP</td>
        <td>TN</td>
    </tr>
</table>

**Example for non-binary targets**
<table>
    <tr>
        <td colspan="2" rowspan="2"></td>
        <td colspan="3">Prediction</td>
    </tr>
    <tr>
        <td>High</td>
        <td>Medium</td>
        <td>Low</td>
    </tr>
    <tr>
        <td rowspan="3">Target</td>
        <td>High</td>
        <td>0</td>
        <td>5</td>
        <td>0</td>
    </tr>
    <tr>
        <td>Medium</td>
        <td>2</td>
        <td>1</td>
        <td>3</td>
    </tr>
    <tr>
        <td>Low</td>
        <td>7</td>
        <td>3</td>
        <td>1</td>
    </tr>
</table>
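
A confusion matrix like the ones above can be built by counting (target, prediction) pairs. The following is a minimal sketch in plain Python; the function name `confusion_matrix` and the label lists are hypothetical and chosen only for illustration.

```python
from collections import Counter

def confusion_matrix(targets, predictions, levels):
    """Return a nested dict so that matrix[target][prediction] is a count."""
    counts = Counter(zip(targets, predictions))
    return {t: {p: counts[(t, p)] for p in levels} for t in levels}

# Hypothetical labels, not taken from a real model.
levels = ["High", "Medium", "Low"]
targets = ["High", "High", "Medium", "Low", "Low", "Medium"]
predictions = ["High", "Medium", "Medium", "Low", "High", "Low"]

matrix = confusion_matrix(targets, predictions, levels)

# Print one row per target level, columns in the order of `levels`.
for t in levels:
    print(t, [matrix[t][p] for p in levels])
```

Rows correspond to the target and columns to the prediction, matching the layout of the tables above. Correct predictions end up on the diagonal.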

## Performance metrics

**Misclassification Rate** <br />
The number of incorrect predictions divided by the total number of predictions. The misclassification rate is the fraction of predictions the model got wrong (range [0,1]).

$$ rate = { (FP + FN) \over (TP + TN + FP + FN)} $$

**Classification Accuracy** <br />
The accuracy is the fraction of predictions the model got right. It is the complement of the misclassification rate: accuracy = 1 - rate.

$$ accuracy = { (TP + TN) \over (TP + TN + FP + FN)} $$
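
As a quick check of the two formulas, the sketch below computes both values from the four cells of a binary confusion matrix. The counts are hypothetical, and the function names are illustrative only.

```python
def misclassification_rate(tp, tn, fp, fn):
    """Fraction of predictions that were wrong."""
    return (fp + fn) / (tp + tn + fp + fn)

def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that were right."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts for 100 predictions.
tp, tn, fp, fn = 40, 45, 5, 10

rate = misclassification_rate(tp, tn, fp, fn)
acc = accuracy(tp, tn, fp, fn)
print(f"rate={rate:.2f} accuracy={acc:.2f}")
```

With these counts 15 of 100 predictions are wrong, so the rate is 0.15 and the accuracy is 0.85. The two values always sum to 1.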


## Categorical target performance measures
**Rates:**<br /> <br />
**TPR** = ${ TP \over TP + FN} $ &nbsp;&nbsp;
**FNR** = ${ FN \over TP + FN} $ &nbsp;&nbsp;
**TNR** = ${ TN \over TN + FP} $ &nbsp;&nbsp;
**FPR** = ${ FP \over TN + FP} $ &nbsp;&nbsp;
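
The four rates can be computed directly from the confusion matrix cells. This is a small sketch with hypothetical counts; the helper name `rates` is an assumption made for illustration.

```python
def rates(tp, tn, fp, fn):
    """Return the four rates for a binary confusion matrix."""
    return {
        "TPR": tp / (tp + fn),  # actual positives predicted as positive
        "FNR": fn / (tp + fn),  # actual positives predicted as negative
        "TNR": tn / (tn + fp),  # actual negatives predicted as negative
        "FPR": fp / (tn + fp),  # actual negatives predicted as positive
    }

# Hypothetical counts.
r = rates(tp=40, tn=45, fp=5, fn=10)
for name, value in r.items():
    print(f"{name}: {value:.2f}")
```

TPR and FNR share a denominator (all actual positives), so they sum to 1. The same holds for TNR and FPR over the actual negatives.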


**Precision:** ${ TP \over TP + FP} $ shows how often the model is correct when predicting a positive. <br /><br />
**Recall (TPR):** ${ TP \over TP + FN} $ shows what fraction of the actual positives the model finds. <br />

**F1 Measure** <br />
The F1 measure is the harmonic mean of precision and recall. It suits binary targets and emphasizes performance on the most important level (the positive level): true negatives do not enter the calculation.
 $$\text{F1 Measure} = 2 \times { Precision \times Recall \over Precision + Recall} $$
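
To make the relationship between precision, recall and F1 concrete, here is a minimal sketch using hypothetical counts. The function names are illustrative, and the code assumes that at least one positive was predicted and at least one actual positive exists, since the ratios are undefined otherwise.

```python
def precision(tp, fp):
    """How often a predicted positive is an actual positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of actual positives that were predicted as positive."""
    return tp / (tp + fn)

def f1_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * (p * r) / (p + r)

# Hypothetical counts; true negatives are not needed for any of these.
tp, fp, fn = 40, 5, 10

p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_measure(p, r)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because it is a harmonic mean, F1 lies between precision and recall and sits closer to the smaller of the two.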

## Average class accuracy

*Good for imbalanced datasets, e.g. an 80/20 distribution of positive/negative target labels.*

The arithmetic mean is susceptible to the influence of large outliers and gives a more optimistic view. The harmonic mean emphasizes smaller values and gives a rather pessimistic view.

*Arithmetic mean*
$$ averageclassaccuracy_{am} = { 1 \over |levels(t)|} \sum_{i \in levels(t)} recall(i)$$

*Harmonic mean*
$$ averageclassaccuracy_{hm} = { 1 \over {1 \over |levels(t)|} \sum_{i \in levels(t)} {1 \over recall(i)}}$$
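
The sketch below applies both formulas to a hypothetical 20/80 dataset. The matrix, the function names, and the handling of a zero recall are assumptions of this example, not part of the definitions above.

```python
def recall_per_level(matrix):
    """matrix[target][prediction] is a count; return the recall of each level."""
    return {
        level: row[level] / sum(row.values())
        for level, row in matrix.items()
    }

def average_class_accuracy_am(recalls):
    """Arithmetic mean of the per-level recalls."""
    return sum(recalls) / len(recalls)

def average_class_accuracy_hm(recalls):
    """Harmonic mean of the per-level recalls."""
    # Assumption of this sketch: a recall of 0 makes the formula undefined,
    # so 0.0 is returned instead of dividing by zero.
    if any(r == 0 for r in recalls):
        return 0.0
    return len(recalls) / sum(1 / r for r in recalls)

# Hypothetical imbalanced data: 20 positives, 80 negatives.
matrix = {
    "pos": {"pos": 10, "neg": 10},
    "neg": {"pos": 8, "neg": 72},
}

recalls = list(recall_per_level(matrix).values())
am = average_class_accuracy_am(recalls)
hm = average_class_accuracy_hm(recalls)
print(f"arithmetic={am:.3f} harmonic={hm:.3f}")
```

With these counts the plain accuracy is 0.82, while the arithmetic mean of the recalls is 0.7 and the harmonic mean is about 0.643. The weak recall on the minority level (0.5) pulls the harmonic mean down the most.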