# Homework 4

## Problem 4

For the model $h(x) = ax$ we have to compute the average hypothesis $\overline{g}(x) = \hat{a} x$.

We estimate $\overline{g}(x)$ by averaging the hypotheses learned from $K$ independent data sets $D_1, \dots, D_K$ (see slide 6 of lecture 8):

$\overline{g}(x) \approx \frac{1}{K} \sum_{i=1}^{K} g^{(D_i)}(x)$

Since our model is $h(x) = ax$, each learned hypothesis has the form $g^{(D_i)}(x) = a^{(D_i)} x$, so this simplifies to:

$\overline{g}(x) \approx \frac{1}{K} \sum_{i=1}^{K} a^{(D_i)} x$

$= x \frac{1}{K} \sum_{i=1}^{K} a^{(D_i)}$

$= \hat{a} x$

with $\hat{a} = \frac{1}{K} \sum_{i=1}^{K} a^{(D_i)}$. This is why we can express $\overline{g}(x)$ in terms of an average coefficient $\hat{a}$.
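The averaging above can be sketched in a few vectorized lines. This is only an illustration under stated assumptions: the target is $f(x) = \sin(\pi x)$, each data set holds two points drawn uniformly from $[-1, 1]$, and the function name `estimate_a_hat`, the number of data sets and the seed are hypothetical choices. For a line through the origin, the least-squares slope of one data set has the closed form $a = \sum_j x_j y_j / \sum_j x_j^2$.

```python
import numpy as np


def estimate_a_hat(num_datasets=100_000, seed=0):
    """Monte Carlo estimate of the average slope a_hat for h(x) = a*x."""
    rng = np.random.default_rng(seed)
    # Each row is one data set D_i: two inputs drawn uniformly from [-1, 1].
    x = rng.uniform(-1.0, 1.0, size=(num_datasets, 2))
    y = np.sin(np.pi * x)  # assumed target function f(x) = sin(pi*x)
    # Least-squares slope through the origin, one value per data set.
    a = np.sum(x * y, axis=1) / np.sum(x * x, axis=1)
    # a_hat is the plain average of the per-data-set slopes.
    return a.mean()


a_hat = estimate_a_hat()
print(f"a_hat = {a_hat:.3f}")
```

With this many data sets the estimate should fluctuate far less from run to run than an estimate based on 1000 data sets.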

_____

## Problem 5

We have to compute the bias (see slide 9 of lecture 8):

$bias = E_\mathbf{x}[(\overline{g}(\mathbf{x}) - f(\mathbf{x}))^2]$

The bias is the expected squared distance between $\overline{g}(\mathbf{x})$ and $f(\mathbf{x})$.

**Method 1:**

We can do this by choosing $N_{test} = 1000$ test points and calculating the average squared difference $(\overline{g}(\mathbf{x}) - f(\mathbf{x}))^2$:

$bias \approx \frac{1}{N_{test}} \sum_{i=1}^{N_{test}} (\overline{g}(\mathbf{x}_i) - f(\mathbf{x}_i))^2$

**Method 2:**

We can use an integral to compute $E_\mathbf{x}[(\overline{g}(\mathbf{x}) - f(\mathbf{x}))^2]$.

Because $x$ is uniformly distributed, this is equivalent to calculating the mean of a function $r(x)$ over the interval $[a,b]$ (here $a$ and $b$ are the interval endpoints, not the slope of the model) via

$\bar {r} = \tfrac 1 {b-a} \int_{a}^b r(x) dx$

see also [here](https://www.math.vt.edu/people/qlfang/class_home/Lesson8.pdf) and [here](https://en.wikipedia.org/wiki/Mean_of_a_function).

With $r(x) = (\bar g(x) - f(x))^2$ we get:

$bias = \bar {r} = \tfrac 1 {b-a} \int_{a}^b (\bar g(x) - f(x))^2 dx$
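
The mean of a function can also be approximated numerically. The following is a minimal sketch using the midpoint rule. The helper name `mean_of_function`, the number of cells and the value $\hat{a} = 1.43$ are assumptions made for illustration.

```python
import numpy as np


def mean_of_function(r, lo, hi, num_cells=100_000):
    """Approximate (1/(hi-lo)) * integral of r over [lo, hi] (midpoint rule)."""
    edges = np.linspace(lo, hi, num_cells + 1)
    midpoints = 0.5 * (edges[:-1] + edges[1:])
    # All cells have the same width, so the width cancels against 1/(hi-lo)
    # and the mean of r is simply the mean of its values at the midpoints.
    return np.mean(r(midpoints))


a_hat = 1.43  # assumed value of the average slope


def squared_error(x):
    # r(x) = (g_bar(x) - f(x))^2 with g_bar(x) = a_hat*x and f(x) = sin(pi*x)
    return (a_hat * x - np.sin(np.pi * x)) ** 2


bias = mean_of_function(squared_error, -1.0, 1.0)
print(f"bias = {bias:.3f}")
```

Unlike the Monte Carlo estimate of Method 1, this result is deterministic and does not depend on random test points.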
____

## Problem 6

We have to compute the variance (see slide 9 of lecture 8):

$var = E_\mathbf{x}[E_D[(g^{(D)}(\mathbf{x}) - \bar{g}(\mathbf{x}))^2]]$
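
For $h(x) = ax$ the squared deviation inside the variance, $(g^{(D)}(x) - \bar{g}(x))^2$, equals $x^2 (a^{(D)} - \hat{a})^2$. Since the test point $x$ is independent of the data set $D$, the variance factorizes into $E_x[x^2] \cdot Var_D(a)$, and $E_x[x^2] = 1/3$ for $x$ uniform on $[-1, 1]$. The sketch below uses this shortcut as a cross-check. The function name `estimate_variance`, the number of data sets and the seed are hypothetical choices, and the target is assumed to be $f(x) = \sin(\pi x)$.

```python
import numpy as np


def estimate_variance(num_datasets=100_000, seed=0):
    """Estimate the variance of h(x) = a*x via var = E_x[x^2] * Var_D(a)."""
    rng = np.random.default_rng(seed)
    # Each row is one data set D: two inputs drawn uniformly from [-1, 1].
    x = rng.uniform(-1.0, 1.0, size=(num_datasets, 2))
    y = np.sin(np.pi * x)  # assumed target function f(x) = sin(pi*x)
    # Least-squares slope through the origin, one value per data set.
    a = np.sum(x * y, axis=1) / np.sum(x * x, axis=1)
    # E_x[x^2] = 1/3 for x uniform on [-1, 1]; a.var() estimates Var_D(a).
    return a.var() / 3.0


variance = estimate_variance()
print(f"variance = {variance:.3f}")
```

This avoids the nested loop over test points and data sets, at the cost of being specific to models that are linear in $x$ with no intercept.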

In [1]:
import numpy as np


def problem4():
    
    RUNS = 1000
    a_total = 0
    N = 2          # size of data set
    
    for _ in range(RUNS):
        # two random points
        x_rnd = np.random.uniform(-1, 1, N)      # this is a vector of size 2
        y_rnd = np.sin(np.pi * x_rnd)            # this is a vector of size 2

        # linear regression for model y = ax via the normal equations:
        # w = (X^T X)^(-1) X^T y, where X is the N x 1 column of inputs
        X_a = np.array([x_rnd]).T
        w_a = np.linalg.inv(X_a.T @ X_a) @ X_a.T @ y_rnd
        a = w_a[0]

        a_total += a
        
    a_avg = a_total / RUNS
    return a_avg


print("solution problem 4: a_avg = ", problem4())
print("Answer 4[e] is therefore correct.")


#-------------------------------------------------------------------------

def problem5():
    N_test = 1000
    x_test = np.random.uniform(-1,1,N_test)

    y_f = np.sin(np.pi * x_test)
    a_avg = problem4()
    y_g_bar = a_avg * x_test

    bias = np.mean((y_f - y_g_bar)**2)   # average squared difference over the test points
    return bias
    
    
print("\nSolution to problem 5: bias = ", problem5())
print("Answer 5[b] is therefore correct.")

#--------------------------------------------------------------------------

def problem6():
    a_avg = problem4()
    expectation_over_X = 0
    
    RUNS_D = 100
    RUNS_X = 1000
    # variance: Compare each g to g_bar
    
    for i in range(RUNS_X):
        N = 2
        x_test = np.random.uniform(-1,1)
        expectation_over_D = 0
        
        for _ in range(RUNS_D):
            # two random points as data set D
            x_rnd = np.random.uniform(-1, 1, N)
            y_rnd = np.sin(np.pi * x_rnd)

            # linear regression for model y = ax via the normal equations
            # w = (X^T X)^(-1) X^T y; this gives a particular g^(D)
            X_a = np.array([x_rnd]).T
            w_a = np.linalg.inv(X_a.T @ X_a) @ X_a.T @ y_rnd
            a  = w_a[0]
            
            # calculate difference
            y_g = a * x_test
            y_g_bar = a_avg * x_test
            expectation_over_D += (y_g - y_g_bar)**2 / RUNS_D

        expectation_over_X += expectation_over_D / RUNS_X
    
    variance = expectation_over_X
    return variance


print("\nSolution to problem 6, variance = ", problem6())
print("Answer 6[a] is therefore correct.")


#--------------------------------------------------------------------------


solution problem 4: a_avg =  1.40845852709
Answer 4[e] is therefore correct.

Solution to problem 5: bias =  0.279449588863
Answer 5[b] is therefore correct.

Solution to problem 6, variance =  0.239356400027
Answer 6[a] is therefore correct.


We can also compute the **bias via the integral** (Method 2):


$bias = \tfrac 1 {1-(-1)} \int_{-1}^{1} (\hat{a}x - \sin(\pi x))^2 dx
= \tfrac 1 {2} \int_{-1}^{1} (1.43 x - \sin(\pi x))^2 dx$

which WolframAlpha evaluates to approximately 0.27:

https://www.wolframalpha.com/input/?i=integral+from+-1+to+1+(1.43+x+-+sin(pi+*+x))%5E2+%2F+2

This agrees, up to sampling error, with the Monte Carlo estimate of 0.279 obtained above with $N_{test} = 1000$ points.
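
The integral can also be evaluated by hand, which gives the bias as a function of $\hat{a}$. The following is a sketch of that derivation, using the standard results $\int_{-1}^{1} x^2 dx = \tfrac{2}{3}$, $\int_{-1}^{1} x \sin(\pi x) dx = \tfrac{2}{\pi}$ and $\int_{-1}^{1} \sin^2(\pi x) dx = 1$:

$bias = \tfrac 1 {2} \int_{-1}^{1} (\hat{a}x - \sin(\pi x))^2 dx$

$= \tfrac 1 {2} \left[ \hat{a}^2 \int_{-1}^{1} x^2 dx - 2 \hat{a} \int_{-1}^{1} x \sin(\pi x) dx + \int_{-1}^{1} \sin^2(\pi x) dx \right]$

$= \tfrac 1 {2} \left[ \tfrac{2}{3} \hat{a}^2 - \tfrac{4}{\pi} \hat{a} + 1 \right]$

$= \tfrac{1}{3} \hat{a}^2 - \tfrac{2}{\pi} \hat{a} + \tfrac{1}{2}$

For $\hat{a} = 1.43$ this gives $0.6816 - 0.9104 + 0.5 \approx 0.271$.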

## Testing the program by confirming values from the lecture

Below we test our program by reproducing the values from slide 15 of lecture 8 for the linear model $y = mx + b$:

- $bias = 0.21$
- $var = 1.69$

In [2]:
import matplotlib.pyplot as plt
import numpy as np



def get_g_bar_lecture():

    RUNS = 1000
    m_total = 0
    b_total = 0

    x = np.arange(-1, 1.1, .1)
    y = np.sin(np.pi * x)

    N = 2

    for _ in range(RUNS):

        # two random points
        x_rnd = np.random.uniform(-1, 1, N)
        y_rnd = np.sin(np.pi * x_rnd)

        # linear regression for model y = mx + b via the normal equations:
        # w = (X^T X)^(-1) X^T y, where X has a column of ones and a column of inputs
        X = np.array([np.ones(N), x_rnd]).T
        w = np.linalg.inv(X.T @ X) @ X.T @ y_rnd

        b, m = w
        x_line = np.array([-1, 1])
        y_line = m * x_line + b

        m_total += m
        b_total += b

    # https://matplotlib.org/examples/pylab_examples/axes_props.html

    m_avg = m_total/RUNS
    b_avg = b_total/RUNS

    '''
    plt.plot(x, y)
    plt.plot(x_rnd, y_rnd, 'go')
    plt.plot(x_line, y_line, 'r-')
    plt.ylim(-1,1)
    plt.xlim(-1,1)
    plt.grid(True)

    plt.show()
    '''


    # Calculate bias and variance

    # bias: compare g_bar(x) to f(x)
    # generate 1000 random points
    N_test = 1000
    x_test = np.random.uniform(-1,1,N_test)

    y_f = np.sin(np.pi * x_test)
    y_g_bar_lecture = m_avg * x_test + b_avg

    bias_lecture = np.mean((y_f - y_g_bar_lecture)**2)   # average squared difference over the test points
    print("\n\nbias lecture: ", bias_lecture)

    return (m_avg, b_avg)
    


#---------------------------------------------------------------------------------------

def confirm_variance_from_lecture():
    m_avg, b_avg = get_g_bar_lecture()
    expectation_over_X = 0
    
    RUNS_D = 100
    RUNS_X = 1000
    # variance: Compare each g to g_bar
    
    for i in range(RUNS_X):
        N = 2
        x_test = np.random.uniform(-1,1)
        expectation_over_D = 0
        for _ in range(RUNS_D):
            # two random points as data set D
            x_rnd = np.random.uniform(-1, 1, N)
            y_rnd = np.sin(np.pi * x_rnd)

            # linear regression for model y = mx + b via the normal equations
            # w = (X^T X)^(-1) X^T y; this gives a particular g^(D)
            X = np.array([np.ones(N), x_rnd]).T
            w = np.linalg.inv(X.T @ X) @ X.T @ y_rnd
            b, m = w

            # calculate difference
            y_g = m * x_test + b
            y_g_bar = m_avg * x_test + b_avg
            expectation_over_D += (y_g - y_g_bar)**2 / RUNS_D
        #print("expectation_over_D = ", expectation_over_D)
        expectation_over_X += expectation_over_D / RUNS_X
    
    print("variance = ", expectation_over_X)

        
confirm_variance_from_lecture()   



bias lecture:  0.210732294044
variance =  1.68353665926
