#### Locally Weighted Linear Regression

Most of the code is the same as last time. Only the loss and gradient functions are different.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.animation as animation
%matplotlib notebook

We begin by loading the data and visualizing it as a scatter plot using pandas.

In [2]:
data = pd.concat([pd.read_table('q3x.dat', names = ['x']),pd.read_table('q3y.dat', names = ['y'])], axis=1)
data.plot(kind='scatter', x='x', y='y')

Hmm.

Normalize the inputs and store them in NumPy arrays. Finally, prepend a column of ones to the inputs as a simple way to introduce the bias term.

In [3]:
inputs = data.values[:,0:1]
targets = data.values[:,1:2]

inputs_mean = np.mean(inputs, axis=0)
inputs_std = np.std(inputs,axis=0)
inputs = (inputs - inputs_mean)/(inputs_std + np.finfo(float).eps)
inputs = np.concatenate((np.ones((inputs.shape[0],1)),inputs),axis=1)

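We define the loss functions (`l2loss`, `weighted_l2loss`) and the gradient functions (`l2error`, `weighted_l2error`). We also define the gradient-descent loop for linear regression.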

In [4]:
def weighted_l2loss(inputs,targets,weights,W):
    # Weighted squared error: (Xw - y)^T W (Xw - y) / (2m)
    p = np.dot(inputs,weights)-targets
    return (np.dot(p.T,np.dot(W,p))/(2*inputs.shape[0]))[0][0]

def l2loss(inputs,targets,weights,W):
    # Fall back to the plain (unweighted) squared error when no W is given
    if W is None:
        p = np.dot(inputs,weights)-targets
        return (np.dot(p.T,p)/(2*inputs.shape[0]))[0][0]
    else:
        return weighted_l2loss(inputs,targets,weights,W)

def weighted_l2error(inputs, targets, weights,W):
    # Gradient direction X^T W (Xw - y). The product already has shape (n, 1),
    # so the mean over axis 1 is a no-op: the 1/m factor of the loss is not applied.
    return np.dot(np.dot(inputs.T,W), ((np.dot(inputs,weights)-targets))).mean(axis=1, keepdims=True)

def l2error(inputs, targets, weights,W):
    # Unweighted gradient direction X^T (Xw - y); same note about the mean applies
    if W is None:
        return np.dot(inputs.T, ((np.dot(inputs,weights)-targets))).mean(axis=1, keepdims=True)
    else:
        return weighted_l2error(inputs,targets,weights,W)

def lin_reg_loop(learning_rate, stopping_df_dx,W=None):
    # Gradient descent on the global `inputs` and `targets`, starting from zero weights
    animation_losses = []
    weights = np.zeros((inputs.shape[1],1))
    while True:
        loss = l2loss(inputs, targets, weights,W)
        df_dx = l2error(inputs, targets, weights,W)
        animation_losses.append([weights[0][0], weights[1][0], loss])
        weights = weights - learning_rate*df_dx
        # Stop once the gradient norm falls below the threshold
        if (np.linalg.norm(df_dx) < stopping_df_dx):
            # Record the final weights; `loss` here is from before the last update
            animation_losses.append([weights[0][0], weights[1][0], loss])
            break

    print('Line: y = ' , weights[1][0], ' x_norm + ', weights[0][0])
    print('Final loss:', loss)
    print('Learning Rate: ', learning_rate)
    print('Stopping Criteria: df/dx <', stopping_df_dx)
    print('Num iterations: ', len(animation_losses))

    # Returns the weights and a 3 x T array of (bias, slope, loss) per step
    return weights, np.array(animation_losses).T
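
For reference, the quantities computed by these functions can be written out explicitly. The notation is introduced here and is not taken from the code: $X$ is the $m \times 2$ input matrix (bias column included), $\theta$ is the parameter vector (`weights` in the code), $y$ is the target vector, and $W$ is the diagonal weight matrix. The weighted loss is

$$J(\theta) = \frac{1}{2m}\,(X\theta - y)^T\, W\, (X\theta - y)$$

and, since $W$ is symmetric, its gradient is

$$\nabla_\theta J(\theta) = \frac{1}{m}\, X^T W\, (X\theta - y).$$

The unweighted case corresponds to $W = I$. As written, `l2error` and `weighted_l2error` return $X^T W (X\theta - y)$ without the $1/m$ factor, so the update behaves as if the learning rate were multiplied by $m$.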

First, let's try to find a linear fit using simple unweighted regression.

In [5]:
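# Use a new name for the loss history so the `data` DataFrame is not overwritten
weights, loss_history = lin_reg_loop(learning_rate=0.013, stopping_df_dx=1e-10)

fig = plt.figure()
# Plot the fitted line against the normalized feature (column 1), not the bias column
plt.plot(inputs[:,1], inputs[:,1]*weights[1] + weights[0])
plt.scatter(inputs[:,1:2],targets,c='g')
plt.show()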

Line: y =  0.835188895313  x_norm +  1.03127983079
Final loss: 0.333364453435
Learning Rate:  0.013
Stopping Criteria: df/dx < 1e-10
Num iterations:  26


Of course, a single straight line doesn't fit this data well.

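Now we define a function that returns the diagonal weight matrix W for a given query point x and bandwidth Tau. Each training point x_i gets the weight exp(-(x_i - x)^2 / (2 Tau^2)), so points near the query count more. We choose the test points uniformly over the same range as the (normalized) training data, and run a separate weighted regression for each one.

We plot the training data as green points and the predictions at the test points as blue points.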
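
To make the weighting concrete, here is a minimal standalone sketch of the same Gaussian kernel applied to a handful of made-up inputs. The function name `gaussian_weights` and the sample values are assumptions for illustration only; they are not part of the notebook.

```python
import numpy as np

def gaussian_weights(x_train, x_query, tau):
    """Gaussian kernel weight of each training point relative to x_query."""
    return np.exp(-((x_train - x_query) ** 2) / (2 * tau ** 2))

# Made-up normalized inputs, symmetric around the query point 0.0
x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# A narrow bandwidth concentrates almost all weight on the nearest points
w_narrow = gaussian_weights(x_train, x_query=0.0, tau=0.3)
# A wide bandwidth weights all points almost equally
w_wide = gaussian_weights(x_train, x_query=0.0, tau=2.0)

print(np.round(w_narrow, 4))
print(np.round(w_wide, 4))
```

With a small bandwidth the fit at each query point is driven by its immediate neighbours; with a large one it approaches ordinary unweighted regression.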

In [6]:
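def get_W(x,Tau):
    # Diagonal matrix of Gaussian weights of every training point relative to query x
    return np.diag(np.exp(-((inputs[:,1] - x)**2)/(2*Tau*Tau)))

# Query points in normalized input units
test_inputs = np.linspace(-2,2,num=100)
test_results = np.empty_like(test_inputs)

# Fit a separate weighted regression for each query point and evaluate it there
for t,x in enumerate(test_inputs):
    weights, _ = lin_reg_loop(learning_rate=0.001, stopping_df_dx=0.01,W=get_W(x,0.8))
    test_results[t] = weights[0][0] + weights[1][0]*x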


fig = plt.figure()
plt.scatter(inputs[:,1:2],targets,c='g')
plt.scatter(test_inputs,test_results,c='b')

plt.show()

Line: y =  2.22308677343  x_norm +  2.23143687366
Final loss: 0.00935048869073
Learning Rate:  0.001
Stopping Criteria: df/dx < 0.01
Num iterations:  3260
Line: y =  2.22447745895  x_norm +  2.23320610577
Final loss: 0.0101775782446
Learning Rate:  0.001
Stopping Criteria: df/dx < 0.01
Num iterations:  3033
Line: y =  2.22481894875  x_norm +  2.23368164456
Final loss: 0.0110891145817
Learning Rate:  0.001
Stopping Criteria: df/dx < 0.01
Num iterations:  2825
Line: y =  2.22405489303  x_norm +  2.23283071524
Final loss: 0.012095514747
Learning Rate:  0.001
Stopping Criteria: df/dx < 0.01
Num iterations:  2634
Line: y =  2.22212594399  x_norm +  2.23062094883
Final loss: 0.0132082314698
Learning Rate:  0.001
Stopping Criteria: df/dx < 0.01
Num iterations:  2458
Line: y =  2.21898222303  x_norm +  2.22703685115
Final loss: 0.0144397421287
Learning Rate:  0.001
Stopping Criteria: df/dx < 0.01
Num iterations:  2297
Line: y =  2.214553355  x_norm +  2.22204087795
Final loss: 0.0158035161295


Now we can change the value of the bandwidth (the `Tau` argument of `get_W`) to see its effect on the output.
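
One way to run such a bandwidth sweep quickly is sketched below. Two things here are assumptions rather than the notebook's method. First, it solves the weighted normal equations directly instead of running the gradient-descent loop. Second, it uses synthetic data, because `q3x.dat` and `q3y.dat` are not reproduced here. The helper name `lwr_predict` is hypothetical.

```python
import numpy as np

def lwr_predict(x_train, y_train, x_query, tau):
    """Predict y at x_query by solving the weighted normal equations."""
    X = np.column_stack((np.ones_like(x_train), x_train))  # bias column + input
    w = np.exp(-((x_train - x_query) ** 2) / (2 * tau ** 2))  # diagonal of W
    XtW = X.T * w  # equals X.T @ np.diag(w) without building the m x m matrix
    theta = np.linalg.solve(XtW @ X, XtW @ y_train)
    return theta[0] + theta[1] * x_query

# Synthetic stand-in for the notebook's data (an assumption, not the real files)
rng = np.random.default_rng(0)
x_train = np.linspace(-2, 2, 100)
y_train = np.sin(2 * x_train) + 0.1 * rng.standard_normal(x_train.shape)

x_test = np.linspace(-2, 2, 50)
errors = {}
for tau in (0.1, 0.3, 0.8, 2.0, 10.0):
    preds = np.array([lwr_predict(x_train, y_train, x, tau) for x in x_test])
    # Mean squared error against the noise-free curve
    errors[tau] = np.mean((preds - np.sin(2 * x_test)) ** 2)
    print(f"tau = {tau:5.1f}  mse vs. true curve = {errors[tau]:.4f}")
```

On this synthetic curve a very large bandwidth behaves like a single global line and misses the shape, while a very small one follows the noise more closely. The same loop can be pointed at the notebook's `inputs[:,1]` and `targets[:,0]` to compare against the gradient-descent results.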