
### Bias-Variance Trade-off and Decomposition

Assume we have some data whose true labels $y$ are generated as $y = f(x) + \epsilon$, where $\epsilon$ is a random variable with a mean of $0$ and a variance of $\sigma^2$.

We estimate $f(x)$ by $\hat f(x)$, which is fit to a random training sample and is therefore itself a random variable.

The bias-variance decomposition says that for any unseen $x$,

$$E\left[(y - \hat f (x))^2\right] =
  \left(Bias[\hat f (x)] \right)^2 +
  Var[\hat f (x)] + 
  \sigma^2$$
  
where

$$Bias[\hat f(x)] = E[\hat f(x)] - f(x)$$

and

$$Var[\hat f(x)] = E[\hat f(x)^2] - E[\hat f(x)]^2 $$
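
These two definitions translate directly into sample estimates once we have many values of $\hat f(x)$, each fit on a different training set. The following is a minimal sketch; the helper name `bias_variance` and the toy predictions are made up for illustration and are not part of the original notebook.

```python
import numpy as np

def bias_variance(fhats, f_x):
    """Estimate squared bias and variance of an estimator at a single point.

    fhats: predictions fhat(x) obtained from many independent training sets.
    f_x:   the true noiseless value f(x) at the same point.
    """
    fhats = np.asarray(fhats, dtype=float)
    bias2 = (fhats.mean() - f_x) ** 2                    # (E[fhat] - f)^2
    variance = (fhats ** 2).mean() - fhats.mean() ** 2   # E[fhat^2] - E[fhat]^2
    return bias2, variance

# Hypothetical predictions from four training sets, with true value f(x) = 1.
bias2, variance = bias_variance([0.0, 2.0, 0.0, 2.0], f_x=1.0)
print(bias2, variance)
```

In this toy case the predictions average to the true value, so the squared bias is zero and all of the spread shows up as variance.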



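Let's try the derivation. Add and subtract $E[\hat f(x)]$ inside the square, then group the terms:

$$
\begin{align}
E\left[(y - \hat f(x))^2\right] & = E\left[\left(f(x) + \epsilon -
     \hat f(x) +
     E[\hat f(x)] - E[\hat f(x)]
     \right)^2\right] \\
     & = E\left[\left( (f(x) - E[\hat f(x)]) + \epsilon + (E[\hat f(x)] - \hat f(x)) \right)^2\right] \\
     & = \left(f(x) - E[\hat f(x)]\right)^2 + E[\epsilon^2] + E\left[(E[\hat f(x)] - \hat f(x))^2\right] \\
     & = \left(Bias[\hat f(x)]\right)^2 + \sigma^2 + Var[\hat f(x)]
\end{align}$$

The cross terms vanish in the third line because $E[\epsilon] = 0$, $E\left[E[\hat f(x)] - \hat f(x)\right] = 0$, and the noise $\epsilon$ on the unseen point is independent of $\hat f(x)$.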

Let's check the decomposition numerically.

In the simplest case, $f(x) = 0$ and $\epsilon$ is normally distributed. We'll estimate $f$ with a constant function: the mean of a training sample.

In [2]:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
%matplotlib inline

In [16]:
sample_size = 10
sigma = 5
epsilon = stats.norm(0, sigma)

def f(size=1):
    return np.zeros((size,))

In [24]:
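n_trials = 100000
fhats = []  # one fitted constant per trial
ys = []     # one fresh noisy label per trial
for t in range(n_trials):
    # Draw a training sample and fit the constant estimator (the sample mean).
    sample = f(size=sample_size) + epsilon.rvs(sample_size)
    fhat = sample.mean()
    fhats.append(fhat)
    # Draw an independent unseen label to measure the prediction error.
    ys.append((f(1) + epsilon.rvs(1))[0])

fhats = np.array(fhats)
ys = np.array(ys)

bias2 = (fhats.mean() - 0)**2  # the true f(x) is 0 everywhere
variance = (fhats**2).mean() - fhats.mean()**2
error = ((ys - fhats)**2).mean()
print(f"error = {error} bias^2 = {bias2} variance = {variance} sigma^2 = {sigma**2} sum={bias2+variance+sigma**2}")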

error = 27.57562107712117 bias^2 = 6.306279634043667e-07 variance = 2.5001724587591063 sigma^2 = 25 sum=27.50017308938707
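
As a sanity check on these numbers, the three terms can also be worked out by hand for this constant case. Here $n$ stands for `sample_size` (a symbol introduced only for this check), and the estimator is the mean of $n$ independent draws with variance $\sigma^2$:

$$
\begin{align}
Bias[\hat f(x)] &= E[\hat f(x)] - f(x) = 0 - 0 = 0 \\
Var[\hat f(x)] &= \frac{\sigma^2}{n} = \frac{25}{10} = 2.5 \\
E\left[(y - \hat f(x))^2\right] &= 0 + 2.5 + 25 = 27.5
\end{align}
$$

The simulated variance (about 2.50) and error (about 27.58) sit close to these values, and the remaining gap in the error is consistent with Monte Carlo noise at this number of trials.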


Let's try something slightly less trivial: a step function.

In [27]:
sample_size = 10
sigma = 5
epsilon = stats.norm(0, sigma)
x_dist = stats.uniform(-1, 2)  # loc=-1, scale=2, i.e. uniform on [-1, 1]

def f(x):
    return np.where(x > 0, 1, 0)
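
Before simulating, it may help to work out what the decomposition should give here. The sketch below rests on assumptions that the notebook does not state explicitly: the training inputs $X$ are drawn from `x_dist`, the estimator is still the mean of the $n$ noisy training labels (with $n$ standing for `sample_size`), and $x_0$ is some fixed unseen point. Since $f(X)$ is then $1$ with probability $\tfrac{1}{2}$ and $0$ otherwise, and is independent of the noise:

$$
\begin{align}
E[\hat f(x_0)] &= E[f(X)] = P(X > 0) = \tfrac{1}{2} \\
\left(Bias[\hat f(x_0)]\right)^2 &= \left(\tfrac{1}{2} - f(x_0)\right)^2 = \tfrac{1}{4} \\
Var[\hat f(x_0)] &= \frac{Var[f(X)] + \sigma^2}{n} = \frac{0.25 + 25}{10} = 2.525 \\
E\left[(y - \hat f(x_0))^2\right] &= 0.25 + 2.525 + 25 = 27.775
\end{align}
$$

Under these assumptions the squared bias is the same on both sides of the step, because $f(x_0)$ is either $0$ or $1$ and the constant estimator averages to $\tfrac{1}{2}$.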


In [25]:
n_trials = 10000
x0 = 0.5  # assumed unseen test point (not in the original cell); f(x0) = 1
fhats = []
ys = []
for t in range(n_trials):
    # Draw training inputs, then noisy labels from the step function.
    x = x_dist.rvs(sample_size)
    sample = f(x) + epsilon.rvs(sample_size)
    fhat = sample.mean()  # constant estimator
    fhats.append(fhat)
    # Independent noisy label at the unseen point x0.
    ys.append(f(x0) + epsilon.rvs(1)[0])

fhats = np.array(fhats)
ys = np.array(ys)

bias2 = (fhats.mean() - f(x0))**2
variance = (fhats**2).mean() - fhats.mean()**2
error = ((ys - fhats)**2).mean()
print(f"error = {error} bias^2 = {bias2} variance = {variance} sigma^2 = {sigma**2} sum={bias2+variance+sigma**2}")
