# Homework 4

## Problem 7

We repeat the procedures from problems 4 to 6 for each hypothesis set. This gives us the bias and variance of each model.

The expected value of $E_{out}$ over all data sets $D$ is then calculated by

$\mathbb{E}_D[ E_{out}(g^{(D)})]$ 

$= \mathbb{E}_D[\mathbb{E}_x [(g^{(D)}(x) - f(x))^2]]$

$= ...$

$= \text{bias} + \text{variance}$

See slides 5-8 of lecture 8.
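For reference, the usual intermediate steps can be sketched as follows. This is the standard textbook argument, not a transcription of the slides, and it assumes the notation $\bar{g}(x) = \mathbb{E}_D[g^{(D)}(x)]$ for the average hypothesis (called `g_bar` in the code further down).

$$
\begin{aligned}
\mathbb{E}_D\big[\mathbb{E}_x[(g^{(D)}(x) - f(x))^2]\big]
&= \mathbb{E}_x\big[\mathbb{E}_D[(g^{(D)}(x) - \bar{g}(x) + \bar{g}(x) - f(x))^2]\big] \\
&= \mathbb{E}_x\big[\mathbb{E}_D[(g^{(D)}(x) - \bar{g}(x))^2] + (\bar{g}(x) - f(x))^2\big] \\
&= \underbrace{\mathbb{E}_x\big[\mathbb{E}_D[(g^{(D)}(x) - \bar{g}(x))^2]\big]}_{\text{variance}}
 + \underbrace{\mathbb{E}_x\big[(\bar{g}(x) - f(x))^2\big]}_{\text{bias}}
\end{aligned}
$$

The cross term drops out in the second line because $\mathbb{E}_D[g^{(D)}(x) - \bar{g}(x)] = 0$ by the definition of $\bar{g}$.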

**Note**:  

- In the lecture we first compared a particular $g^{(D)}$ to $f$ using the squared error $(g^{(D)}(x) - f(x))^2$. Averaged over $x$, this is the out-of-sample error of the 'trained' function $g^{(D)}$, so $E_{out}(g^{(D)}) = \mathbb{E}_x [(g^{(D)}(x) - f(x))^2]$, see slide 12 of lecture 8.

- We then compared all possible $g^{(D)}$ to $f$ by averaging this error over all data sets $D$ (not over the points within a single training set). So we have $\mathbb{E}_D[ E_{out}(g^{(D)})] = \mathbb{E}_D[\mathbb{E}_x [(g^{(D)}(x) - f(x))^2]]$, see slides 13 and 14 of lecture 8.
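The two nested expectations can be estimated directly by sampling. The sketch below does this for the constant model $h(x) = b$ and checks that the sampled $\mathbb{E}_D[E_{out}(g^{(D)})]$ equals bias + variance. The sample sizes `n_datasets` and `n_test`, the seed, and the vectorized layout are assumptions made for this illustration and are not taken from the cells below.

```python
import numpy as np

rng = np.random.default_rng(0)


def f(x):
    """Target function used in this homework."""
    return np.sin(np.pi * x)


n_datasets = 5000   # number of data sets D (assumed value)
n_test = 1000       # number of test points x (assumed value)

# Each row is one data set D with two points. For h(x) = b the
# least-squares fit is the mean of the two target values.
x_train = rng.uniform(-1, 1, size=(n_datasets, 2))
g = f(x_train).mean(axis=1)      # one constant g^(D) per data set
g_bar = g.mean()                 # average hypothesis

x_test = rng.uniform(-1, 1, size=n_test)
f_test = f(x_test)

# E_out(g^(D)) = E_x[(g^(D)(x) - f(x))^2], one value per data set
e_out_per_dataset = ((g[:, None] - f_test[None, :]) ** 2).mean(axis=1)

expected_e_out = e_out_per_dataset.mean()    # E_D[E_out(g^(D))]
bias = ((g_bar - f_test) ** 2).mean()        # E_x[(g_bar(x) - f(x))^2]
variance = ((g - g_bar) ** 2).mean()         # E_D[(g^(D) - g_bar)^2]

print("E_D[E_out] =", expected_e_out)
print("bias + var =", bias + variance)
```

Because `g_bar` is computed from the same sampled data sets, the cross term cancels exactly and the two printed numbers agree up to floating-point rounding.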



## Results

Bias and variance for different hypotheses:

- a) $h(x) = b$, bias = 0.50, var = 0.25  $\Rightarrow$ $E_{out}$ = bias + var = 0.75
- b) $h(x) = ax$, bias = 0.28, var = 0.23  $\Rightarrow$ $E_{out}$ = bias + var = 0.51
- c) $h(x) = ax + b$, bias = 0.21, var = 1.69 $\Rightarrow$ $E_{out}$ = bias + var = 1.90
- d) $h(x) = ax^2$, bias = 0.49, var = 17.89 $\Rightarrow$ $E_{out}$ = bias + var = 18.38
- e) $h(x) = ax^2+b$, bias = 0.59, var = 71000 $\Rightarrow$ $E_{out}$ = bias + var $\approx$ 71000

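The model $h(x) = ax$ has the smallest expected value of $E_{out}$, so the correct answer is **7[b]**.

Note: The variance estimate for the last model $h(x) = ax^2 + b$ is extremely large and does not settle on a stable value from run to run. A likely reason is that this model has two parameters and is fitted to two points, so the fitted curve passes through both of them and its coefficient is $a = (y_1 - y_2)/(x_1^2 - x_2^2)$, which becomes arbitrarily large whenever $x_1^2 \approx x_2^2$.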
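One way to see where the huge variance for $h(x) = ax^2 + b$ can come from is to fit the model to two points whose $x$ values are almost mirror images of each other. This is a minimal sketch: the helper `fit_two_points` and the chosen points are hypothetical and only serve the illustration.

```python
import numpy as np


def fit_two_points(x1, x2):
    """Fit h(x) = a*x^2 + b through (x1, sin(pi*x1)) and (x2, sin(pi*x2))."""
    X = np.array([[1.0, x1 ** 2],
                  [1.0, x2 ** 2]])
    y = np.sin(np.pi * np.array([x1, x2]))
    b, a = np.linalg.solve(X, y)   # two equations, two unknowns
    return a, b


x1 = 0.5
gaps = [1e-1, 1e-2, 1e-3]          # how far x2 is from the mirror point -x1
slopes = []
for eps in gaps:
    a, b = fit_two_points(x1, -x1 + eps)
    slopes.append(a)
    print(f"eps = {eps:g}: a = {a:.1f}, b = {b:.1f}")
```

As `eps` shrinks, $x_1^2 - x_2^2$ goes to zero while $y_1 - y_2$ stays close to 2, so $|a|$ grows roughly like $1/\text{eps}$. A few such data sets are enough to dominate a Monte Carlo average.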

## Hypothesis: $h(x) = b$

In [1]:
# HYPOTHESIS: h(x) = b

import numpy as np


def problem4():
    
    RUNS = 1000
    b_total = 0
    N = 2          # size of data set
    
    for _ in range(RUNS):
        # two random points
        x_rnd = np.random.uniform(-1, 1, N)
        y_rnd = np.sin(np.pi * x_rnd)

        # linear regression for model y = b*1
        X = np.array([np.ones(N)]).T
        w = np.dot(np.dot(np.linalg.inv(np.dot(X.T, X)), X.T), y_rnd)
        b = w[0]

        b_total += b
        
    b_avg = b_total / RUNS
    return b_avg

print("h(x) = b (constant)")
print("solution problem 7: b_avg = ", problem4())


#-------------------------------------------------------------------------


def problem5():
    N_test = 1000
    x_test = np.random.uniform(-1,1,N_test)

    y_f = np.sin(np.pi * x_test)
    b_avg = problem4()
    y_g_bar = b_avg

    bias = sum((y_f - y_g_bar)**2) / N_test
    return bias
    

print("\nSolution to problem 7: bias = ", problem5())

#--------------------------------------------------------------------------

def problem6():
    b_avg = problem4()
    expectation_over_X = 0
    
    RUNS_D = 100
    RUNS_X = 1000
    # variance: Compare each g to g_bar
    
    for i in range(RUNS_X):
        N = 2
        x_test = np.random.uniform(-1,1)
        expectation_over_D = 0
        
        for _ in range(RUNS_D):
            # two random points as data set D
            x_rnd = np.random.uniform(-1, 1, N)
            y_rnd = np.sin(np.pi * x_rnd)

            # linear regression for model y = b (constant)
            # get a particular g^(D)
            X = np.array([np.ones(N)]).T
            w = np.dot(np.dot(np.linalg.inv(np.dot(X.T, X)), X.T), y_rnd)
            b  = w[0]
            
            # calculate difference
            y_g = b
            y_g_bar = b_avg
            expectation_over_D += (y_g - y_g_bar)**2 / RUNS_D

        expectation_over_X += expectation_over_D / RUNS_X
    
    variance = expectation_over_X
    return variance


print("\nSolution to problem 7, variance = ", problem6())


h(x) = b (constant)
solution problem 7: b_avg =  -0.00962505562673

Solution to problem 7: bias =  0.514264178422

Solution to problem 7, variance =  0.250087080433
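The cell above solves the least-squares problem with the explicit formula $w = (X^T X)^{-1} X^T y$. As an alternative sketch (a swapped-in technique, not what the cell uses), the same fit can be obtained with `np.linalg.lstsq`, which avoids forming the inverse. The helper names `fit_normal_equations` and `fit_lstsq` are made up for this example.

```python
import numpy as np


def fit_normal_equations(X, y):
    """Least squares via w = (X^T X)^{-1} X^T y, as in the cell above."""
    return np.linalg.inv(X.T @ X) @ X.T @ y


def fit_lstsq(X, y):
    """The same least-squares problem solved by NumPy's lstsq routine."""
    w, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
    return w


rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 2)
y = np.sin(np.pi * x)
X = np.ones((2, 1))                # design matrix for h(x) = b

w_inv = fit_normal_equations(X, y)
w_lstsq = fit_lstsq(X, y)
print(w_inv, w_lstsq, y.mean())
```

For the constant model both routes return the mean of the two target values, which is why `b_avg` ends up close to 0.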


## Hypothesis: $h(x) = a x^2$

In [2]:
# HYPOTHESIS: h(x) = ax^2

import numpy as np


def problem4():
    
    RUNS = 1000
    a_total = 0
    N = 2          # size of data set
    
    for _ in range(RUNS):
        # two random points
        x_rnd = np.random.uniform(-1, 1, N)
        y_rnd = np.sin(np.pi * x_rnd)

        # linear regression for model y = ax^2
        X = np.array([x_rnd * x_rnd]).T
        w = np.dot(np.dot(np.linalg.inv(np.dot(X.T, X)), X.T), y_rnd)
        a = w[0]

        a_total += a
        
    a_avg = a_total / RUNS
    return a_avg

print("h(x) = ax^2")
print("solution problem 7: a_avg = ", problem4())


#-------------------------------------------------------------------------


def problem5():
    N_test = 1000
    x_test = np.random.uniform(-1,1,N_test)

    y_f = np.sin(np.pi * x_test)
    a_avg = problem4()
    y_g_bar = a_avg * (x_test * x_test)

    bias = sum((y_f - y_g_bar)**2) / N_test
    return bias
    

print("\nSolution to problem 7: bias = ", problem5())

#--------------------------------------------------------------------------

def problem6():
    a_avg = problem4()
    expectation_over_X = 0
    
    RUNS_D = 100
    RUNS_X = 1000
    # variance: Compare each g to g_bar
    
    for i in range(RUNS_X):
        N = 2
        x_test = np.random.uniform(-1,1)
        expectation_over_D = 0
        
        for _ in range(RUNS_D):
            # two random points as data set D
            x_rnd = np.random.uniform(-1, 1, N)
            y_rnd = np.sin(np.pi * x_rnd)

            # linear regression for model y = ax^2
            # get a particular g^(D)
            X = np.array([x_rnd * x_rnd]).T
            w = np.dot(np.dot(np.linalg.inv(np.dot(X.T, X)), X.T), y_rnd)
            a  = w[0]
            
            # calculate difference on test point
            y_g = a * x_test**2
            y_g_bar = a_avg * x_test**2
            expectation_over_D += (y_g - y_g_bar)**2 / RUNS_D

        expectation_over_X += expectation_over_D / RUNS_X
    
    variance = expectation_over_X
    return variance


print("\nSolution to problem 7, variance = ", problem6())


h(x) = ax^2
solution problem 7: a_avg =  -0.10296756787

Solution to problem 7: bias =  0.511649469789

Solution to problem 7, variance =  27.578995093
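For a one-parameter model such as $h(x) = ax^2$ the least-squares solution has the closed form $a = \sum_i z_i y_i / \sum_i z_i^2$ with $z_i = x_i^2$, so many data sets can be fitted at once without a Python loop. The following is a sketch under assumed settings (seed and `n_datasets` are not from the cell above); it also checks the closed form against the matrix formula for one data set.

```python
import numpy as np

rng = np.random.default_rng(2)
n_datasets = 10000                 # assumed number of data sets

x = rng.uniform(-1, 1, size=(n_datasets, 2))
y = np.sin(np.pi * x)
z = x ** 2                         # feature z = x^2

# closed-form least-squares coefficient for each data set (row)
a = (z * y).sum(axis=1) / (z ** 2).sum(axis=1)

# check the first data set against the matrix formula used in the cell above
X0 = z[0].reshape(-1, 1)
a0 = (np.linalg.inv(X0.T @ X0) @ X0.T @ y[0])[0]

print("a_avg estimate:", a.mean())
print("largest |a|:", np.abs(a).max())
```

Printing the largest $|a|$ gives a rough idea of how much a single data set can contribute to the variance estimate, which may help explain why the variance for this model differs noticeably between runs (17.89 in the results list, 27.58 in the output above).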


## Hypothesis: $h(x)=a x^2 + b$

In [3]:
# HYPOTHESIS: h(x) = ax^2 + b

import numpy as np


def problem4():
    
    RUNS = 1000000
    a_total = 0
    b_total = 0
    N = 2          # size of data set
    
    for _ in range(RUNS):
        # two random points
        x_rnd = np.random.uniform(-1, 1, N)
        y_rnd = np.sin(np.pi * x_rnd)

        # linear regression for model y = b + ax^2
        X = np.array([np.ones(N), x_rnd * x_rnd]).T
        w = np.dot(np.dot(np.linalg.inv(np.dot(X.T, X)), X.T), y_rnd)
        b, a = w

        a_total += a
        b_total += b
        
    a_avg = a_total / RUNS
    b_avg = b_total / RUNS
    return b_avg, a_avg

print("h(x) = b + ax^2")
print("solution problem 7: (b_avg, a_avg) = ", problem4())


#-------------------------------------------------------------------------


def problem5():
    N_test = 1000
    x_test = np.random.uniform(-1,1,N_test)

    y_f = np.sin(np.pi * x_test)
    b_avg, a_avg = problem4()
    y_g_bar = b_avg + a_avg * (x_test * x_test)

    bias = sum((y_f - y_g_bar)**2) / N_test
    return bias
    

print("\nSolution to problem 7: bias = ", problem5())

#--------------------------------------------------------------------------

def problem6():
    b_avg, a_avg = problem4()
    expectation_over_X = 0
    
    RUNS_D = 100
    RUNS_X = 1000
    # variance: Compare each g to g_bar
    
    for i in range(RUNS_X):
        N = 2
        x_test = np.random.uniform(-1,1)
        expectation_over_D = 0
        
        for _ in range(RUNS_D):
            # two random points as data set D
            x_rnd = np.random.uniform(-1, 1, N)
            y_rnd = np.sin(np.pi * x_rnd)

            # linear regression for model y = b + ax^2
            # get a particular g^(D)
            X = np.array([np.ones(N), x_rnd * x_rnd]).T
            w = np.dot(np.dot(np.linalg.inv(np.dot(X.T, X)), X.T), y_rnd)
            b, a  = w
            
            # calculate difference on test point
            y_g = b + a * x_test**2
            y_g_bar = b_avg + a_avg * x_test**2
            expectation_over_D += (y_g - y_g_bar)**2 / RUNS_D

        expectation_over_X += expectation_over_D / RUNS_X
    
    variance = expectation_over_X
    return variance


print("\nSolution to problem 7, variance = ", problem6())


h(x) = b + ax^2
solution problem 7: (b_avg, a_avg) =  (-0.84064290397935792, 10.963322156901588)

Solution to problem 7: bias =  0.502058834627

Solution to problem 7, variance =  333458.121839
