# Evaluation

This section surveys the different methods of evaluating models and where each might be applicable.

## Classification


### Metrics

\begin{equation*}
Recall = Sensitivity =  TPR = \frac{TP}{TP + FN}
\end{equation*}

\begin{equation*}
FPR = \frac{FP}{FP + TN}
\end{equation*}

\begin{equation*}
Specificity = 1 - FPR = \frac{TN}{TN + FP}
\end{equation*}

\begin{equation*}
Precision = \frac{TP}{TP + FP}
\end{equation*}
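
As a quick check of these definitions, here is a minimal Python sketch that computes each rate from confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical, chosen only for illustration, and the sketch assumes every denominator is non-zero.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the rates from confusion-matrix counts (assumes non-zero denominators)."""
    return {
        "recall": tp / (tp + fn),       # also called sensitivity or TPR
        "fpr": fp / (fp + tn),          # false positive rate
        "specificity": tn / (tn + fp),  # equals 1 - FPR
        "precision": tp / (tp + fp),
    }


# Hypothetical counts, for illustration only
metrics = classification_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```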

#### F1 Score
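
The F1 score is the harmonic mean of precision and recall, so it is only high when both are high. A minimal sketch, assuming non-zero denominators (the function name `f1_score` and the counts are illustrative, not taken from any particular library):

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall, from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: precision = 0.75, recall = 0.5
example_f1 = f1_score(tp=30, fp=10, fn=30)
print(f"F1 = {example_f1:.2f}")
```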

#### Metric Table

<img src="imgs/metric_table.png" style="width:800px"/>

### Graphs
#### ROC

A Receiver Operating Characteristic (ROC) curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR) as the classification threshold is varied.

<img src="imgs/roc-curve.png" style="width:400px"/>
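
To make the construction concrete, the sketch below traces the points of an ROC curve by sweeping a threshold over the predicted scores. It assumes an example is predicted positive when its score is greater than or equal to the threshold; the function name `roc_points` and the toy data are hypothetical.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per candidate threshold, highest threshold first."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Every distinct score is a candidate threshold; infinity gives the (0, 0) corner
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for theta in thresholds:
        tp = sum(1 for y, s in zip(labels, scores) if s >= theta and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= theta and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Toy data, for illustration only
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(labels, scores)
for fpr, tpr in points:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

The curve always starts at (0, 0), where nothing is predicted positive, and ends at (1, 1), where everything is.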

The ROC curve is built from the classifier's predicted probabilities. For a typical classifier, the distributions of predicted probability for the two classes may look like:

<img src="imgs/prob_dist.png" style="width:500px"/>

##### Perfect Classifier


In a perfect classifier, where TPR = 1 at FPR = 0, the two distributions would have no overlap. For most problems the distributions overlap partially, as in the figure above.


<img src="imgs/perfect_classifier.png" style="width:800px"/>

##### Worst Classifier

In the worst possible classifier, the two distributions would be identical, with complete overlap. This indicates that the model has no ability to discriminate between the two classes.

<img src="imgs/worst_classifier.png" style="width:800px"/>


##### Perfectly Inverted Classifier

A perfectly inverted classifier assigns every example to the wrong class with 100% certainty. You may expect that this would instead be the worst classifier, and in raw performance terms this is true. However, for a binary classification problem it is in fact equivalent to the perfect solution, as the output can simply be inverted to be 100% correct.

<img src="imgs/perfectly_inverted_classifier.png" style="width:800px"/>
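
A small sketch of the inversion argument, using made-up labels and predictions: a binary classifier that is wrong on every example becomes perfect once its outputs are flipped. The helper `accuracy` is defined here for illustration only.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: every prediction is the opposite of the true label
labels = [1, 0, 1, 1, 0]
inverted_predictions = [0, 1, 0, 0, 1]

# Flipping each binary output (0 <-> 1) recovers the true labels
flipped_predictions = [1 - p for p in inverted_predictions]

print(accuracy(labels, inverted_predictions))
print(accuracy(labels, flipped_predictions))
```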



#### Threshold

The threshold can be adjusted and tuned to deliver the most 'useful' model, depending on which metric is the most important and which practical trade-offs are acceptable for your real-world problem.

The metrics are tightly coupled through the threshold: moving it improves one at the expense of the other.

Sensitivity INCREASES, Specificity DECREASES

Sensitivity DECREASES, Specificity INCREASES


\begin{equation}
\theta = Threshold
\end{equation}

As the threshold increases: 
\begin{equation}
\lim_{\theta \to 1} Specificity \to 1 \space \space and \space \space
Sensitivity \to 0 
\end{equation}

As the threshold decreases: 

\begin{equation}
\lim_{\theta \to 0} Specificity \to 0 \space \space and \space \space
Sensitivity \to 1 
\end{equation}

As the threshold increases, fewer examples are predicted positive, so both rates fall:
\begin{equation}
\lim_{\theta \to 1} TPR \to 0 \space \space and \space \space
FPR \to 0
\end{equation}

As the threshold decreases, more examples are predicted positive, so both rates rise:

\begin{equation}
\lim_{\theta \to 0} TPR \to 1 \space \space and \space \space
FPR \to 1
\end{equation}
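
The trade-off between sensitivity and specificity can be checked numerically. The sketch below assumes an example is predicted positive when its score is greater than or equal to the threshold, and uses made-up scores strictly between 0 and 1; the function name `sensitivity_specificity` is hypothetical.

```python
def sensitivity_specificity(labels, scores, theta):
    """Return (sensitivity, specificity) when predicting positive for score >= theta."""
    predictions = [1 if s >= theta else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, for illustration only
labels = [0, 0, 0, 1, 1, 1]
scores = [0.1, 0.3, 0.6, 0.4, 0.7, 0.9]

for theta in (0.0, 0.5, 1.0):
    sens, spec = sensitivity_specificity(labels, scores, theta)
    print(f"theta = {theta:.1f}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

At a threshold of 0 everything is predicted positive (sensitivity 1, specificity 0); at a threshold of 1 nothing is (sensitivity 0, specificity 1).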


#### Multi-class ROC Curves

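For a multi-class model, you can use ROC curves with a One vs All methodology: each class in turn is treated as the positive class and all remaining classes are grouped together as the negative class. For example, given 3 classes X, Y, Z you would have one ROC curve for X vs (Y, Z), one for Y vs (X, Z), and one for Z vs (X, Y).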
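
A minimal sketch of the relabelling step behind One vs All, assuming the three classes X, Y and Z from the example. The function name `one_vs_all_labels` is hypothetical; each resulting binary target would then be paired with the model's score for that class to draw one ROC curve.

```python
def one_vs_all_labels(labels, positive_class):
    """Relabel a multi-class target as binary: 1 for positive_class, 0 for all others."""
    return [1 if label == positive_class else 0 for label in labels]


# Toy multi-class labels, for illustration only
labels = ["X", "Y", "Z", "X", "Z"]

# One binary target per class, i.e. one ROC curve per class
binary_targets = {c: one_vs_all_labels(labels, c) for c in sorted(set(labels))}
for class_name, target in binary_targets.items():
    print(class_name, target)
```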

#### PR Curves (Precision vs Recall)

Precision vs recall curves can be more informative in situations where ROC falls short. For example, in a problem with an extreme class imbalance, the large number of true negatives keeps the FPR low, so the ROC curve may fail to show the poor performance of the model on the positive class. A PR curve does not use TN at all, so it would make this obvious.

<img src="imgs/roc_vs_prrc.png" style="width:600px"/>
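
The effect of class imbalance can be illustrated with a single operating point. The counts below are hypothetical (100 positives against 100,000 negatives) and the helper `rates` is defined only for this sketch: the TPR and FPR both look healthy, yet precision is very low because false positives vastly outnumber true positives.

```python
def rates(tp, fp, tn, fn):
    """Return (TPR, FPR, precision) from confusion-matrix counts."""
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    precision = tp / (tp + fp)
    return tpr, fpr, precision


# Hypothetical counts: 100 positives, 100,000 negatives
tpr, fpr, precision = rates(tp=90, fp=5000, tn=95000, fn=10)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}, precision = {precision:.4f}")
```

On an ROC plot this point sits near the top-left corner, while on a PR plot the same point shows that fewer than 2% of positive predictions are correct.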
