## Metrics for evaluating Classification Models

Statistical models for classifying data into two groups predict the probability a data point belongs to one group or the other.
Because the target is binary, we evaluate a classification model with different methods than we would use for a model that predicts a continuous target.

### Goal
We aim to understand metrics for evaluating classification models. 
We will learn how to compute and interpret: Accuracy, True Positive, False Positive, True Negative, False Negative, Positive Predictive Value, Negative Predictive Value, and Brier Score.

## How logistic regression decides a data point should be a "1" or "0"

Our logistic regression model, given a set of covariates $x$, returns the log-odds that a data point belongs to the group we labeled "1" versus the group we labeled "0".
The logistic regression model returns a continuous value, but we need to **decide** whether our data point should be classified as a 0 or 1.
The most common method is to classify points with a log-odds greater than or equal to zero as a "1" and points with a negative log-odds as a "0".

We can see why by looking at the log-odds.

$$
    \log \left( \frac{p}{1-p} \right)
$$

Here $p$ is the probability that the data point belongs to group "1". When the log-odds equals $0$ we can solve for $p$:

\begin{align}
    \log \left( \frac{p}{1-p} \right) &= 0\\
    \frac{p}{1-p} &= 1\\
    p &= 1-p\\
    p &= \frac{1}{2}
\end{align}

The log-odds equals zero when the predicted probability equals $\frac{1}{2}$.
A positive log-odds assigns a data point a greater than 50\% probability of belonging to group "1".
A negative log-odds assigns a data point a less than 50\% probability of belonging to group "1".

For a binary classification problem, a decision rule assigns data points to one group or the other.
We typically use the following decision rule with logistic regression: the data point is predicted to belong to group $1$ if the log-odds is greater than or equal to $0$, and predicted to belong to group $0$ if the log-odds is negative.
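
The decision rule can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `log_odds_to_probability` and `classify` and the example log-odds values are hypothetical and chosen only for this example.

```python
import math

def log_odds_to_probability(log_odds):
    """Invert the log-odds: p = 1 / (1 + exp(-log_odds))."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def classify(log_odds):
    """Decision rule: predict 1 if the log-odds is >= 0, otherwise predict 0."""
    return 1 if log_odds >= 0 else 0

# Hypothetical log-odds values, chosen only for illustration.
example_log_odds = [-2.0, -0.5, 0.0, 0.5, 2.0]

for value in example_log_odds:
    p = log_odds_to_probability(value)
    print(f"log-odds = {value:+.1f}  p = {p:.3f}  predicted group = {classify(value)}")
```

A log-odds of exactly $0$ corresponds to $p = \frac{1}{2}$ and, under this rule, is classified as a "1".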

## Metrics

The observed group for data point $i$ will be denoted $o_{i}$.
If a data point belongs to group "1" then $o_{i}=1$.
The variable $o_{i}=0$ if the data point belongs to group "0".

Our prediction for data point $i$ will be denoted $f_{i}$---think "forecast".
We can make one of two predictions for each data point, a 1 or 0.

Some of the metrics below use the indicator function $I$.
The indicator function equals $1$ when its argument is true and $0$ when its argument is false. For example,

$$
I(\text{Topics in Applied Regression} = \text{Awesome})
$$

is **true** and so would equal the value $1$. 
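
As a small sketch, the indicator function can be written in Python as follows. The function name `indicator` and the values of `o_i` and `f_i` are hypothetical and used only for illustration. The product of two indicators equals $1$ only when both conditions are true.

```python
def indicator(condition):
    """Return 1 when the condition is true and 0 when it is false."""
    return 1 if condition else 0

# Hypothetical observed group and prediction for a single data point.
o_i = 1
f_i = 0

print(indicator(o_i == 1))                        # 1
print(indicator(f_i == 1))                        # 0
print(indicator(o_i == 1) * indicator(f_i == 1))  # 0
```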

Given $N$ data points, we define the following metrics.

### True Positive

The number of True Positives (**TP**) equals

$$
    TP = \sum_{i=1}^{N} I(o_{i}=1) \times I(f_{i}=1) 
$$

and counts the data points that our classification model correctly predicts to belong to group 1.

### False Positive

The number of False Positives (**FP**) equals

$$
    FP = \sum_{i=1}^{N} I(o_{i}=0) \times I(f_{i}=1) 
$$

and counts the data points that our classification model incorrectly predicts to belong to group 1 when they actually belong to group 0.

### True Negative

The number of True Negatives (**TN**) equals

$$
    TN = \sum_{i=1}^{N} I(o_{i}=0) \times I(f_{i}=0) 
$$

and counts the data points that our classification model correctly predicts to belong to group 0.

### False Negative

The number of False Negatives (**FN**) equals

$$
    FN = \sum_{i=1}^{N} I(o_{i}=1) \times I(f_{i}=0) 
$$

and counts the data points that our classification model incorrectly predicts to belong to group 0 when they actually belong to group 1.
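
The four quantities TP, FP, TN, and FN can be computed directly from their definitions. The sketch below assumes the observed groups and predictions are stored in plain Python lists; the names `indicator` and `confusion_counts` and the example data are hypothetical.

```python
def indicator(condition):
    """Return 1 when the condition is true and 0 when it is false."""
    return 1 if condition else 0

def confusion_counts(observed, predicted):
    """Sum products of indicators over all data points, as in the formulas above."""
    pairs = list(zip(observed, predicted))
    tp = sum(indicator(o == 1) * indicator(f == 1) for o, f in pairs)
    fp = sum(indicator(o == 0) * indicator(f == 1) for o, f in pairs)
    tn = sum(indicator(o == 0) * indicator(f == 0) for o, f in pairs)
    fn = sum(indicator(o == 1) * indicator(f == 0) for o, f in pairs)
    return tp, fp, tn, fn

# Hypothetical observed groups (o_i) and predictions (f_i) for N = 10 data points.
observed  = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]

tp, fp, tn, fn = confusion_counts(observed, predicted)
print(f"TP = {tp}, FP = {fp}, TN = {tn}, FN = {fn}")  # TP = 3, FP = 1, TN = 5, FN = 1
```

Every data point falls into exactly one of the four categories, so the four values add up to $N$.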

### Positive Predictive Value

The Positive Predictive Value (**PPV**) equals

$$
    PPV = \frac{TP}{TP + FP}
$$
 
The PPV is the proportion of data points predicted to belong to group 1 that actually belong to group 1: the number of correct group 1 predictions divided by the total number of group 1 predictions.

### Negative Predictive Value

The Negative Predictive Value (**NPV**) equals

$$
    NPV = \frac{TN}{TN + FN}
$$
 
The NPV is the proportion of data points predicted to belong to group 0 that actually belong to group 0: the number of correct group 0 predictions divided by the total number of group 0 predictions.
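
The PPV and NPV can be sketched in Python as below. The function name `predictive_values` and the example data are hypothetical. Returning `nan` when the model never predicts a given group is an assumption made for this sketch, since the ratio is undefined in that case.

```python
import math

def predictive_values(observed, predicted):
    """Return (PPV, NPV) computed from observed groups and predictions."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, f in pairs if o == 1 and f == 1)
    fp = sum(1 for o, f in pairs if o == 0 and f == 1)
    tn = sum(1 for o, f in pairs if o == 0 and f == 0)
    fn = sum(1 for o, f in pairs if o == 1 and f == 0)
    # Assumption for this sketch: an undefined ratio (zero denominator) is reported as NaN.
    ppv = tp / (tp + fp) if (tp + fp) > 0 else math.nan
    npv = tn / (tn + fn) if (tn + fn) > 0 else math.nan
    return ppv, npv

# Hypothetical observed groups (o_i) and predictions (f_i) for N = 10 data points.
observed  = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]

ppv, npv = predictive_values(observed, predicted)
print(f"PPV = {ppv:.3f}")  # 3 / (3 + 1) = 0.750
print(f"NPV = {npv:.3f}")  # 5 / (5 + 1) = 0.833
```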

### Brier Score

The Brier score (**BS**) equals

$$
    BS = \frac{1}{N} \sum_{i=1}^{N} \left( o_{i} - p_{i} \right)^2
$$
where $p_{i}$ is the predicted probability that data point $i$ belongs to group 1.
If our model assigns probability $1$ to the correct group for every data point, then the BS equals $0$, the best possible score.
On the other hand, if our model assigns probability $1$ to the wrong group for every data point, the BS equals $1$, the worst score.
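
A minimal Python sketch of the Brier score follows. The function name `brier_score` and all example probabilities are hypothetical and chosen only to illustrate the best case, the worst case, and two cases in between.

```python
def brier_score(observed, probabilities):
    """Mean squared difference between observed groups (0 or 1) and predicted probabilities."""
    n = len(observed)
    return sum((o - p) ** 2 for o, p in zip(observed, probabilities)) / n

# Hypothetical observed groups for N = 4 data points.
observed = [1, 0, 1, 0]

perfect       = [1.0, 0.0, 1.0, 0.0]  # probability 1 on the correct group every time
worst         = [0.0, 1.0, 0.0, 1.0]  # probability 1 on the wrong group every time
uninformative = [0.5, 0.5, 0.5, 0.5]  # always predicts a 50% probability
hypothetical  = [0.9, 0.2, 0.6, 0.4]  # made-up model output

print(brier_score(observed, perfect))        # 0.0
print(brier_score(observed, worst))          # 1.0
print(brier_score(observed, uninformative))  # 0.25
print(round(brier_score(observed, hypothetical), 4))
```

Unlike the count-based metrics, the Brier score uses the predicted probabilities $p_{i}$ themselves rather than the 0 or 1 predictions $f_{i}$, so it rewards confident correct predictions and penalizes confident wrong ones.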