# Mean Square Error and the Bias-Variance Tradeoff

This notebook walks through a numerical proof that Mean Square Error (MSE) = Variance + Bias-squared, followed by how the result can be applied to model selection.


By the standard definition, the MSE of an estimator (some function of the data) is the expected value of the squared difference between the estimator and the true value.
MSE(θ^) = E((θ^ - θ)**2)

The Bias of an estimator is the difference between the expected value of the estimator and the true value of the quantity being estimated.
Bias(θ^) = E(θ^) - θ

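The Variance of an estimator is the expected value of the squared difference between the estimator and its own expected value.
Var(θ^) = E((θ^ - E(θ^))**2)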

Bias-Variance Decomposition: MSE(θ^) = Variance + Bias-squared
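
The decomposition follows in a few lines from the definitions above. What follows is a sketch of the standard argument, assuming θ is a fixed (non-random) value and writing Var(θ^) = E((θ^ - E(θ^))**2). It mirrors the steps the cells below check numerically: subtract and add E(θ^), expand the square, then look at the cross term.

```latex
\begin{aligned}
\mathrm{MSE}(\hat\theta)
  &= E\big[(\hat\theta - \theta)^2\big] \\
  &= E\big[(\hat\theta - E[\hat\theta] + E[\hat\theta] - \theta)^2\big] \\
  &= E\big[(\hat\theta - E[\hat\theta])^2\big]
     + 2\,(E[\hat\theta] - \theta)\,E\big[\hat\theta - E[\hat\theta]\big]
     + (E[\hat\theta] - \theta)^2 \\
  &= \mathrm{Var}(\hat\theta) + \mathrm{Bias}(\hat\theta)^2
\end{aligned}
```

The cross term drops out because E(θ^) - θ is a constant under this assumption and E(θ^ - E(θ^)) = 0.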

## Start by creating an estimator



In [52]:
# calculate mean
def mean(numList):
    return round((float(sum(numList)) / len(numList) if len(numList) > 0 else 0),8)

# calculate standard deviation
def sd(numList):
    meanList = mean(numList)
    return round((float(((sum([(i - meanList)**2 for i in numList])) 
                 / (len(numList) - 1))**0.5)),8)



# make up some data
X = [2,3,4,3,2,9,6,7,2,9]
y = [12,25,39,25,12,124,70,87,12,124]

# define the sample size (X and y must have the same length)
if len(X) != len(y):
    raise ValueError("Sample sizes are not equal.")
n = len(X)
   

# correlation coefficient
r = ((n * sum([i*y[z] for z, i in enumerate(X)]) - (sum(X) * sum(y))) 
     / ((n * sum([i**2 for i in X]) - sum(X)**2) * (n * sum([i**2 for i in y]) - sum(y)**2))**0.5)

# slope
slope = r * (sd(y) / sd(X))

# intercept
intercept = mean(y) - slope * mean(X)

# print the fitted function
print("θ^ = {} + {}x".format(intercept,slope))

θ^ = -21.834951414177198 + 15.922330088122807x
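
The slope above is computed as r * (sd(y) / sd(X)), which is algebraically the same as the textbook least-squares ratio S_xy / S_xx. As a rough cross-check, the sketch below recomputes the line that way on the same data. The names `s_xy`, `s_xx`, `slope_check` and `intercept_check` are introduced here for illustration only and are not part of the notebook. Because `mean` and `sd` round to 8 decimal places, the two routes may disagree slightly in the trailing digits.

```python
# Cross-check of the fitted line using the least-squares formulas directly.
X = [2, 3, 4, 3, 2, 9, 6, 7, 2, 9]
y = [12, 25, 39, 25, 12, 124, 70, 87, 12, 124]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(y) / n

# sums of cross-deviations and squared deviations (illustrative names)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, y))
s_xx = sum((xi - x_bar) ** 2 for xi in X)

slope_check = s_xy / s_xx
intercept_check = y_bar - slope_check * x_bar

print("θ^ = {} + {}x".format(intercept_check, slope_check))
```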


## Use estimator to calculate expected values

In [39]:
# for each x, what is the expected value of y?
thetaHat = list(map(lambda x: round((intercept + slope*x),8), X))

thetaHat

[10.00970874,
 25.93203883,
 41.85436893,
 25.93203883,
 10.00970874,
 121.46601942,
 73.69902913,
 89.62135922,
 10.00970874,
 121.46601942]

## The real values of y


In [3]:
# real values of y
theta = y

theta

[12, 25, 39, 25, 12, 124, 70, 87, 12, 124]

## Calculate MSE of estimator



<img src="Images/MSE1.png" alt="MSE">

In [45]:
# find the expected value of square difference between thetaHat and theta
MSE_theta_1 = mean([(thetaHat[num] - theta[num])**2 for num, est in enumerate(thetaHat)])

MSE_theta_1

5.51650485

## Subtract and add the expected value of thetaHat

<img src="Images/MSE2.png" alt="MSE2">

In [53]:
# expected value of thetaHat
theta_exp = mean(thetaHat)

# recalculate the MSE with theta_exp subtracted and added back inside the square
MSE_theta_2 = mean([(thetaHat[num] - theta_exp + theta_exp - theta[num])**2 for num, est in enumerate(thetaHat)])

if float(MSE_theta_2) == MSE_theta_1:
    print ("MATCH: {} = {}".format(MSE_theta_1, MSE_theta_2))
else:
    print("No Match: {} , {}".format(MSE_theta_1, MSE_theta_2))

MATCH: 5.51650485 = 5.51650485
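
The comparison above uses `==`, which works here because `mean` rounds its result to 8 decimal places. Without that rounding, algebraically equal expressions can differ in the last few bits of a float. One alternative is to compare with a tolerance, as in the sketch below. `report_match` is a hypothetical helper introduced for illustration, and the tolerance of 1e-9 is an arbitrary choice, not something the notebook prescribes.

```python
import math

def report_match(a, b, rel_tol=1e-9):
    """Return a MATCH / No Match message, comparing floats with a relative tolerance."""
    if math.isclose(a, b, rel_tol=rel_tol):
        return "MATCH: {} = {}".format(a, b)
    return "No Match: {} , {}".format(a, b)

# 0.1 + 0.2 is not exactly 0.3 in floating point, but it is within tolerance
print(report_match(0.1 + 0.2, 0.3))
print(report_match(1.0, 1.1))
```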


## Expand the squared binomial

<img src="Images/MSE3.png" alt="MSE">

In [49]:
MSE_theta_3 = mean([(((thetaHat[num] - theta_exp)**2) 
               + (2*((thetaHat[num] - theta_exp) * (theta_exp - theta[num]))) 
               + ((theta_exp - theta[num])**2)) for num, est in enumerate(thetaHat)])


if MSE_theta_3 == MSE_theta_1:
    print ("MATCH: {} = {}".format(MSE_theta_1, MSE_theta_3))
else:
    print("No Match: {} , {}".format(MSE_theta_1, MSE_theta_3))

MATCH: 5.51650485 = 5.51650485


## Expand the cross term 2()


In [7]:
# multiply out the cross term: (a - e)(e - t) = a*e - a*t - e*e + e*t
MSE_theta_4 = mean([((thetaHat[num] - theta_exp)**2 
               + 2*((thetaHat[num] * theta_exp) - (thetaHat[num] * theta[num]) 
                    - (theta_exp * theta_exp) + (theta_exp * theta[num]))
               + (theta_exp - theta[num])**2) for num, est in enumerate(thetaHat)])

if MSE_theta_4 == MSE_theta_1:
    print ("MATCH: {} = {}".format(MSE_theta_1, MSE_theta_4))
else:
    print("No Match: {} , {}".format(MSE_theta_1, MSE_theta_4))


## Final



In [None]:
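# variance term: average squared deviation of each estimate from the expected value of thetaHat
Var_Theta = mean([(est - theta_exp)**2 for est in thetaHat])
# bias-squared term: average squared difference between the expected value of thetaHat and theta
Bias_Theta = mean([(theta_exp - t)**2 for t in theta])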
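
To see the two terms add up to the MSE, it can help to simulate the textbook setting: one fixed true value and an estimator recomputed on many independent samples. The sketch below does that under assumptions that are not from the notebook: a normal population with mean 5.0 and standard deviation 2.0, samples of size 10, and a deliberately biased estimator (0.9 times the sample mean, named `shrunk_mean` here). In this setting the cross term is zero up to floating-point error, so the empirical MSE equals variance plus bias-squared.

```python
import random

random.seed(0)

true_theta = 5.0      # assumed fixed true value (population mean)
sample_size = 10      # assumed size of each sample
n_trials = 5000       # assumed number of repeated samples

def shrunk_mean(sample):
    # deliberately biased estimator: shrink the sample mean toward zero
    return 0.9 * sum(sample) / len(sample)

# one estimate per simulated sample
estimates = [
    shrunk_mean([random.gauss(true_theta, 2.0) for _ in range(sample_size)])
    for _ in range(n_trials)
]

exp_est = sum(estimates) / n_trials                               # E(θ^)
mse = sum((e - true_theta) ** 2 for e in estimates) / n_trials    # E((θ^ - θ)**2)
variance = sum((e - exp_est) ** 2 for e in estimates) / n_trials  # E((θ^ - E(θ^))**2)
bias_sq = (exp_est - true_theta) ** 2                             # (E(θ^) - θ)**2
cross = 2 * (exp_est - true_theta) * sum(e - exp_est for e in estimates) / n_trials

print("MSE             :", mse)
print("Variance + Bias2:", variance + bias_sq)
print("Cross term      :", cross)
```

For these assumed settings the theoretical values are Variance = 0.81 * 4 / 10 = 0.324 and Bias-squared = (0.9 * 5 - 5)**2 = 0.25, so the simulated numbers should land near those.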