Background
The following idea was inspired by the term p_e ("probability of chance agreement") in the definition of Cohen's kappa.
Formula for "chance agreement"
For a (multi-class) classification problem, consider the vector of ground-truth labels y_true. If we assume that the dataset accurately represents the proportion of each label, then the probability that any given sample has label k (for k in 1...K) can be estimated as:

$$\hat{p}_k = \frac{n_k}{N}$$

where n_k is the number of samples in y_true with label equal to k, and N is the total number of samples in y_true.
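As a minimal sketch of this estimate, assuming NumPy is available, the proportions n_k / N can be read off the label counts. The helper name `class_proportions` and the toy labels are illustrative only and are not part of the proposal.

```python
import numpy as np


def class_proportions(y_true):
    """Return the distinct labels and their empirical proportions n_k / N.

    Hypothetical helper, written only to illustrate the formula above.
    """
    labels, counts = np.unique(np.asarray(y_true), return_counts=True)
    return labels, counts / counts.sum()


# Toy ground truth: N = 6 with n_a = 3, n_b = 2, n_c = 1.
y_true = ["a", "a", "a", "b", "b", "c"]
labels, p_hat = class_proportions(y_true)
print(dict(zip(labels.tolist(), p_hat.tolist())))
```

The proportions always sum to 1, so they can be passed directly as the parameters of a categorical distribution.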
Based on this observation, consider a model that produces y_pred by assigning to each sample, independently of the other samples, a random label drawn according to the observed label proportions. The predicted label of the i-th sample, y_pred[i], is then distributed as:
![https://latex.codecogs.com/svg.image?y_{pred}[i] \sim \text{Categorical}(\hat{p}_1, ..., \hat{p_k}) = \text{Categorical}\left(\frac{n_1}{N}, ..., \frac{n_K}{N}\right)](https://camo.githubusercontent.com/eb89f94ab432f8786fab82a5011066c3df35dd3df88d799830e3ccfffff0e578/68747470733a2f2f6c617465782e636f6465636f67732e636f6d2f7376672e696d6167653f795f7b707265647d5b695d2673706163653b5c73696d2673706163653b5c746578747b43617465676f726963616c7d285c6861747b707d5f312c2673706163653b2e2e2e2c2673706163653b5c6861747b705f6b7d292673706163653b3d2673706163653b5c746578747b43617465676f726963616c7d5c6c656674285c667261637b6e5f317d7b4e7d2c2673706163653b2e2e2e2c2673706163653b5c667261637b6e5f4b7d7b4e7d5c726967687429)
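Then, the probability of the event y_true[i] == y_pred[i] follows from the Law of Total Probability. Conditioning on the true label being k, which happens with probability n_k / N, the independently drawn prediction also equals k with probability n_k / N. Summing over all K labels gives: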
![https://latex.codecogs.com/svg.image?p(y_{pred}[i] = y_{true}[i])= \sum_{k=1}^K\left(\frac{n_k}{N}\right)^2](https://camo.githubusercontent.com/79f0f2ed007b54b7e062cd640f2e3ca83d6f7763d2bf45aa8f133110aca776da/68747470733a2f2f6c617465782e636f6465636f67732e636f6d2f7376672e696d6167653f7028795f7b707265647d5b695d2673706163653b3d2673706163653b795f7b747275657d5b695d293d2673706163653b5c73756d5f7b6b3d317d5e4b5c6c656674285c667261637b6e5f6b7d7b4e7d5c7269676874295e32)
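The formula can be checked numerically. The sketch below, assuming NumPy, computes the closed-form chance agreement and compares it with the accuracy of a simulated random model that draws labels from the observed proportions. The function names `chance_agreement` and `simulated_agreement`, the number of trials, and the seed are all assumptions made for illustration.

```python
import numpy as np


def chance_agreement(y_true):
    """Closed form: sum over k of (n_k / N) ** 2."""
    _, counts = np.unique(np.asarray(y_true), return_counts=True)
    p_hat = counts / counts.sum()
    return float(np.sum(p_hat ** 2))


def simulated_agreement(y_true, n_trials=2000, seed=0):
    """Average accuracy of a model that samples each label from Categorical(p_hat)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    labels, counts = np.unique(y_true, return_counts=True)
    p_hat = counts / counts.sum()
    # One row per trial, one column per sample; every entry is drawn independently.
    y_pred = rng.choice(labels, size=(n_trials, y_true.size), p=p_hat)
    return float(np.mean(y_pred == y_true))


y_true = ["a", "a", "a", "b", "b", "c"]
print(chance_agreement(y_true))     # (3/6)^2 + (2/6)^2 + (1/6)^2 = 14/36
print(simulated_agreement(y_true))  # should be close to the value above
```

For a fixed y_true the expected accuracy of the random model equals the closed-form value exactly, so the two numbers should differ only by sampling noise.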
Actions
make_performance_table()