# Probability Estimation via Scoring


There are various methods in machine learning for inducing probabilistic predictors.
These are hypotheses $h$ that do not merely output point predictions $h(\vec{x}) \in \mathcal{Y}$, 
i.e., elements of the output space $\mathcal{Y}$, 
but probability estimates $p_h(\cdot \vert \vec{x}) =  p(\cdot \vert \vec{x}, h)$, 
i.e., complete probability distributions on $\mathcal{Y}$. 
In the case of classification, 
this means predicting a single (conditional) probability $p_h(y \vert \vec{x}) = p(y \vert \vec{x} , h)$ for each class $y \in \mathcal{Y}$, 
whereas in regression, $p( \cdot \vert \vec{x}, h)$ is a density function on $\mathbb{R}$. 
Such predictors can be learned in a discriminative way, 
i.e., in the form of a mapping $\vec{x} \mapsto p( \cdot \vert \vec{x})$, 
or in a generative way, which essentially means learning a joint distribution on $\mathcal{X} \times \mathcal{Y}$. 
Moreover, the approaches can be parametric (assuming specific parametric families of probability distributions) or non-parametric. 
Well-known examples include classical statistical methods such as logistic and linear regression, 
Bayesian approaches such as Bayesian networks and Gaussian processes,
as well as various techniques in the realm of (deep) neural networks.
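
To make the difference between a point prediction and a probability estimate concrete, here is a minimal sketch of a discriminative, parametric predictor in the style of logistic regression. The weights `w`, the bias `b`, and the helper name `predict_proba` are hypothetical choices for illustration; the parameters are fixed by hand rather than learned from data.

```python
import numpy as np

# Hypothetical parameters of a binary logistic model (illustrative, not fitted)
w = np.array([1.5, -0.5])
b = -0.25

def predict_proba(x):
    """Return the distribution (p(y=0 | x), p(y=1 | x)) under the logistic model."""
    p1 = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return np.array([1.0 - p1, p1])

x = np.array([1.0, 2.0])
p = predict_proba(x)        # probability estimate: a full distribution on Y = {0, 1}
y_hat = int(np.argmax(p))   # point prediction: a single element of Y

print(f"p(. | x) = {p}, point prediction = {y_hat}")
```

The point prediction discards how confident the model is, whereas the probability estimate retains it, which is what the scoring rules below evaluate.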

## 1. Logarithmic Scoring Rule

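```python
import numpy as np

# Observed binary labels
y_true = np.array([1, 1, 0])

# Predicted probability of the positive class (y = 1)
p_pred = np.array([0.8, 0.3, 0.6])

# Probability the model assigned to the class that actually occurred
p_observed = np.where(y_true == 1, p_pred, 1 - p_pred)

# Logarithmic score: mean negative log-probability of the observed class
log_score = -np.mean(np.log(p_observed))
print(f"Log Score: {log_score:.4f}")
```

```
Log Score: 0.7811
```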
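
The logarithmic score of a prediction is the negative log-probability assigned to the class that was actually observed, so every instance contributes, including those with label 0. The following is a sketch for the general multi-class case, assuming predictions are given as an $n \times K$ matrix whose rows are distributions over the $K$ classes. The function name `log_score` and the array names `P` and `y` are illustrative.

```python
import numpy as np

def log_score(P, y):
    """Mean negative log-probability assigned to the observed classes.

    P: array of shape (n, K), each row a predicted distribution over K classes.
    y: integer array of shape (n,), the observed class indices.
    """
    P = np.asarray(P, dtype=float)
    y = np.asarray(y, dtype=int)
    return -np.mean(np.log(P[np.arange(len(y)), y]))

# A binary example written as two-column distributions (p(y=0), p(y=1))
P = np.array([[0.2, 0.8],
              [0.7, 0.3],
              [0.4, 0.6]])
y = np.array([1, 1, 0])

score = log_score(P, y)
print(f"Log Score: {score:.4f}")
```

In this example the third instance has label 0, so the probability it contributes is $1 - 0.6 = 0.4$.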


## 2. Brier Score

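```python
import numpy as np
from sklearn.metrics import brier_score_loss

# Observed binary labels
y_true = np.array([1, 1, 0])

# Predicted probability of the positive class (y = 1)
p_pred = np.array([0.8, 0.3, 0.6])

# Brier score: mean squared difference between probability and outcome
brier_score = brier_score_loss(y_true, p_pred)
print(f"Brier Score: {brier_score:.4f}")
```

```
Brier Score: 0.2967
```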
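
For binary outcomes, the Brier score is the mean squared difference between the predicted probability of the positive class and the 0/1 outcome. The following minimal NumPy sketch recomputes the value by hand. The variable names are illustrative.

```python
import numpy as np

y_true = np.array([1, 1, 0])
p_pred = np.array([0.8, 0.3, 0.6])  # predicted probability of class 1

# Squared error per instance: (0.8 - 1)^2, (0.3 - 1)^2, (0.6 - 0)^2
squared_errors = (p_pred - y_true) ** 2
brier_manual = squared_errors.mean()

print(f"Brier Score (manual): {brier_manual:.4f}")  # 0.2967
```

The three squared errors are 0.04, 0.49, and 0.36, whose mean is $0.89 / 3 \approx 0.2967$.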


## 3. Continuous Ranked Probability Score

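```python
import numpy as np
from properscoring import crps_ensemble

# Observed real-valued outcome
y_true = np.array([3.5])

# Ensemble forecast: five sampled predictions for the single observation
predicted_ensemble = np.array([[3.0, 3.2, 3.4, 3.6, 3.8]])

# CRPS per observation, averaged over all observations
crps_score = crps_ensemble(y_true, predicted_ensemble)
print(f"Continuous Ranked Probability Score: {crps_score.mean():.4f}")
```

```
Continuous Ranked Probability Score: 0.1000
```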
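
One way to see where this value comes from is the identity $\mathrm{CRPS}(F, y) = \mathbb{E}|X - y| - \tfrac{1}{2}\,\mathbb{E}|X - X'|$, where $X$ and $X'$ are independent draws from the forecast distribution $F$. The sketch below evaluates it with NumPy, assuming the ensemble members are treated as an equally weighted empirical distribution. The variable names are illustrative.

```python
import numpy as np

y_obs = 3.5
ensemble = np.array([3.0, 3.2, 3.4, 3.6, 3.8])

# E|X - y|: mean absolute error of the ensemble members
term_accuracy = np.mean(np.abs(ensemble - y_obs))

# E|X - X'|: mean absolute difference over all ordered pairs of members
term_spread = np.mean(np.abs(ensemble[:, None] - ensemble[None, :]))

crps_manual = term_accuracy - 0.5 * term_spread
print(f"CRPS (manual): {crps_manual:.4f}")  # 0.1000
```

Here the first term is $1.3 / 5 = 0.26$ and the second is $0.5 \cdot 8.0 / 25 = 0.16$, giving $0.10$.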
