analysis/bin_class.Rmd

---
title: "Performance of binary classifiers"
date: "`r Sys.Date()`"
output:
  workflowr::wflow_html:
    toc: false
editor_options:
  chunk_output_type: console
---

```{r setup, include=FALSE}
library(tidyverse)
knitr::opts_chunk$set(echo = TRUE)
```

Calculate performance metrics under different scenarios.

## Setup

Set theme.

```{r theme_set}
theme_set(theme_bw())
```

Source code for calculating performance measures.

```{r table_metrics}
source("https://raw.githubusercontent.com/davetang/learning_r/main/code/table_metrics.R")
```

Set number of cases to use.

```{r num_case}
num_case <- 10000
```

Function to generate labels.

```{r gen_labels}
gen_labels <- function(n, prob, pos = 'yes', neg = 'no'){
  factor(ifelse(rbinom(n, 1, prob) == 1, pos, neg), levels = c(pos, neg))
}
```

## Testing

Generate data where positive and negative cases are balanced.

```{r pca_data}
truth = gen_labels(num_case, 0.5)
table(truth)
```

Classifier that predicts `yes` for every case.

```{r yes_all}
yes_all <- factor(rep('yes', num_case), levels = c('yes', 'no'))
table_metrics(table(truth, yes_all), 'yes', 'no', 'row')
```

Classifier that predicts `no` for every case.

```{r no_all}
no_all <- factor(rep('no', num_case), levels = c('yes', 'no'))
table_metrics(table(truth, no_all), 'yes', 'no', 'row')
```

Classifier that predicts `yes` 95% of the time.

```{r yes_95}
yes_95 <- gen_labels(num_case, 0.95)
table_metrics(table(truth, yes_95), 'yes', 'no', 'row')
```

## Label ratio

Function to calculate and plot metrics.

```{r test_label_ratio}
test_label_ratio <- function(pred, title = NULL){
  probs <- seq(0.05, 0.95, 0.05)
  perf <- map(probs, function(x){
    truth_ <- gen_labels(num_case, x)
    table_metrics(table(truth_, pred), 'yes', 'no', 'row')
  })
  
  df <- map_df(perf, function(x) x)
  df$label_ratio <- factor(probs)
  df <- pivot_longer(df, -label_ratio, names_to = 'metric')
  
  ggplot(
    df,
    aes(
      label_ratio,
      value,
      group = metric
    )
  ) +
    geom_line() +
    facet_wrap(~metric) +
    theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
    scale_y_continuous(breaks = seq(0, 1, 0.25)) +
    ggtitle(title)
}
```

Performance of a 50/50 classifier with different ratio of real labels. Recall that precision is calculated by `TP` divided by `positive predictions` and therefore concerns positive predictions, i.e. when a positive prediction is made, how often is it correct? When most of the labels are positive, most of the positive predictions will be correct resulting in high precision.

```{r pred_0.50}
test_label_ratio(gen_labels(num_case, 0.50), '50/50 classifier')
```

Performance of a classifier that predicts positive 95% of the times. The true positive rate/sensitivity/recall is calculated by `TP` divided by the `total number of positives`. Therefore, this will be high regardless of the data if a classifier predicts positive 95% of the time. The precision tells a different picture because it takes into account the number of predictions made. Therefore, if there are few positive cases (leading to few `TP`s) and a large number of positive predictions, the precision is low.

```{r pred_0.95}
test_label_ratio(gen_labels(num_case, 0.95), '95/5 classifier')
```

Performance of a classifier that predicts negative 95% of the times. The number of true positives will be low with a classifier that does not predict many positive cases. This results in a low true positive rate/sensitivity/recall. Precision can increase with few positive predictions when the data is mostly positive cases. The true negative rate/specificity is calculated by `TN` divided by the `total number of negatives`. Therefore if a classifier mostly outputs negative predictions, the true negative number will be close to the total number of negatives, resulting in a high specificity.

```{r pred_0.05}
test_label_ratio(gen_labels(num_case, 0.05), '5/95 classifier')
```

## Summary

Metrics focus on different aspects and therefore should be reported together to paint a full picture of the performance of a binary classifier.