Compute "chance agreement" baseline #49

## Background
The following idea was inspired by the term `p_e` ("probability of chance agreement") in Cohen's Kappa [definition](https://en.wikipedia.org/wiki/Cohen%27s_kappa#Definition).

## Formula for "chance agreement"
For a (multi-class) classification problem, consider the vector of ground-truth labels `y_true`. If we assume that the dataset accurately represents the proportion of each label, then the probability that any given sample has label `k` (for `k` in `1...K`) can be estimated as:
<img src="https://latex.codecogs.com/svg.image?\hat{p}_k&space;=&space;\frac{n_k}{N}" title="https://latex.codecogs.com/svg.image?\hat{p}_k = \frac{n_k}{N}" />
where `n_k` is the number of samples in `y_true` with label equal to `k`, and `N` is the total number of samples in `y_true`.

Based on this observation, consider a model that predicts `y_pred` by assigning each sample, independently of the other samples, a random label drawn according to the observed label frequencies. The predicted label of the `i`-th sample, `y_pred[i]`, is then distributed as
<img src="https://latex.codecogs.com/svg.image?y_{pred}[i]&space;\sim&space;\text{Categorical}(\hat{p}_1,&space;...,&space;\hat{p}_K)&space;=&space;\text{Categorical}\left(\frac{n_1}{N},&space;...,&space;\frac{n_K}{N}\right)" title="https://latex.codecogs.com/svg.image?y_{pred}[i] \sim \text{Categorical}(\hat{p}_1, ..., \hat{p}_K) = \text{Categorical}\left(\frac{n_1}{N}, ..., \frac{n_K}{N}\right)" />
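
To make the random baseline concrete, here is a minimal sketch of such a model. The function name `sample_chance_predictions` and its `seed` parameter are hypothetical and not part of the existing code base. It relies on the fact that drawing uniformly with replacement from `y_true` yields label `k` with probability `n_k / N`.

```python
import random


def sample_chance_predictions(y_true, seed=None):
    """Draw one random label per sample from the empirical label distribution.

    Hypothetical helper: sampling uniformly with replacement from y_true
    gives label k with probability n_k / N, independently for each sample.
    """
    rng = random.Random(seed)
    return rng.choices(y_true, k=len(y_true))


# Example: 3 samples of class "a" and 1 sample of class "b".
y_true = ["a", "a", "a", "b"]
y_pred = sample_chance_predictions(y_true, seed=0)
print(y_pred)
```

Passing a fixed `seed` only makes the draw reproducible; it does not change the distribution.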

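Because `y_pred[i]` is drawn independently of `y_true[i]` and both follow the same label distribution, the probability of the event `y_true[i] == y_pred[i]` follows from the [Law of Total Probability](https://en.wikipedia.org/wiki/Law_of_total_probability): conditioning on the true label `k`, each label contributes the probability that the true label is `k` times the probability that the predicted label is `k`, which gives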
<img src="https://latex.codecogs.com/svg.image?p(y_{pred}[i]&space;=&space;y_{true}[i])=&space;\sum_{k=1}^K\left(\frac{n_k}{N}\right)^2" title="https://latex.codecogs.com/svg.image?p(y_{pred}[i] = y_{true}[i])= \sum_{k=1}^K\left(\frac{n_k}{N}\right)^2" />
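
The formula above could be computed along these lines. This is only a sketch under the assumption that `y_true` is a plain sequence of hashable labels; the name `chance_agreement` is hypothetical and not an existing function in the project.

```python
from collections import Counter


def chance_agreement(y_true):
    """Return sum_k (n_k / N)^2, the chance-agreement probability.

    Hypothetical helper: n_k is the count of label k in y_true and
    N is the total number of samples.
    """
    n_total = len(y_true)
    if n_total == 0:
        raise ValueError("y_true must contain at least one label")
    counts = Counter(y_true)
    return sum((n_k / n_total) ** 2 for n_k in counts.values())


# Example: 3 samples of class "a" and 1 sample of class "b".
# (3/4)^2 + (1/4)^2 = 0.5625 + 0.0625 = 0.625
print(chance_agreement(["a", "a", "a", "b"]))
```

For a balanced problem with `K` classes the value reduces to `1/K`, and it grows toward 1 as the label distribution becomes more imbalanced.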

## Actions
- [ ] Implement "chance accuracy" as a metric in `make_performance_table()`
