In [1]:
import numpy as np
import matplotlib.pyplot as plt


## Dataset

Run the cell given below to generate the data-matrix $X$ and target vector $y$. $X$ is of shape $(n, d)$, where $n$ denotes the number of samples and $d$ denotes the number of features. $y$ is of shape $(n,)$. You will be using this dataset for the rest of the session. 



In [6]:
from sklearn.datasets import load_diabetes
X, y = load_diabetes(return_X_y = True)

# set the random seed value to 0
np.random.seed(0)
X.shape

(442, 10)


Write a function `shuffle_data(X, y)` that returns the shuffled $X$ and $y$. 

Note that it should shuffle the data pairs $(x_i, y_i)$. 
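A minimal sketch of why a single shared permutation keeps the pairs together. The toy arrays `X_demo` and `y_demo` and the seed are assumptions made up for this illustration, not part of the exercise data.

```python
import numpy as np

# Toy data (hypothetical): row i of X_demo is [i, i] and y_demo[i] = 10 * i
X_demo = np.array([[0, 0], [1, 1], [2, 2], [3, 3]])
y_demo = 10 * np.arange(4)

rng = np.random.default_rng(0)            # local generator; the seed is arbitrary
perm = rng.permutation(X_demo.shape[0])   # one permutation of the row indices

# Indexing both arrays with the same permutation moves x_i and y_i together
X_shuf, y_shuf = X_demo[perm], y_demo[perm]
pairs_intact = np.array_equal(y_shuf, 10 * X_shuf[:, 0])
print(pairs_intact)
```

Shuffling `X` and `y` with two independent permutations would break this correspondence.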



In [5]:
## Shuffle data
def shuffle_data(X, y):
    indices = np.arange(X.shape[0])     # consecutive integers 0, 1, ..., n-1
    np.random.shuffle(indices)          # shuffle the indices in place
    return X[indices], y[indices]       # apply the same permutation to X and y
X, y = shuffle_data(X, y)


Write a function `train_test_split(X, y, test_size)` that divides the data $(X, y)$ into $X_{train}$, $X_{test}$, $y_{train}$ and $y_{test}$ according to `test_size`, which should be a value between 0 and 1.

That is, if `test_size` = $t$, then `int(t*n)` data points should go to the test set and the remaining data points should go to the training set.

$X_{train}$, $X_{test}$, $y_{train}$, $y_{test}$ should be returned by the function.
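The size rule can be checked with a little arithmetic. The helper `split_sizes` below is a hypothetical name used only for this sketch. It also shows that rounding the boundary as `int(n - t*n)` is not the same rule: when `t*n` is fractional, that variant puts one extra sample in the test set.

```python
def split_sizes(n, test_size):
    """Hypothetical helper: return (n_train, n_test) with n_test = int(test_size * n)."""
    n_test = int(test_size * n)
    return n - n_test, n_test

n, t = 442, 0.25
print(split_sizes(n, t))   # t*n = 110.5, so n_test = int(110.5) = 110 -> (332, 110)
print(int(n - t * n))      # training size under the other rounding: int(331.5) = 331
```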








In [37]:
## Train Test split
def train_test_split(X, y, test_size):
    np.random.seed(0)                   # reseed so that the split is reproducible
    X, y = shuffle_data(X, y)
    n = X.shape[0]
    # Index of the first test sample. int(n - test_size*n) can be one less than
    # n - int(test_size*n) when test_size*n is fractional (n = 442, test_size = 0.25
    # gives 331 rather than 332).
    split = int(n - n * test_size)
    train_data, test_data = X[:split], X[split:]
    y_train, y_test = y[:split], y[split:]
    return train_data, test_data, y_train, y_test
train_data, test_data, y_train, y_test = train_test_split(X, y, 0.25)
y_test.sum()
test_data.shape,train_data.shape

((111, 10), (331, 10))


Add a dummy feature, i.e., a column containing all 1's (as the first column) in $X_{train}$ and $X_{test}$.

Take the transpose of both $X_{train}$ and $X_{test}$.
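As an illustration of the two steps in the order stated, here is a sketch on a small made-up matrix (`X_demo` is an assumption, not the exercise data). Prepending a column of ones and then transposing gives a matrix of shape $(d + 1, n)$ whose first row is all ones.

```python
import numpy as np

# Hypothetical data matrix with n = 4 samples and d = 2 features
X_demo = np.array([[2.0, 3.0],
                   [4.0, 5.0],
                   [6.0, 7.0],
                   [8.0, 9.0]])

ones = np.ones((X_demo.shape[0], 1))   # column of ones, shape (n, 1)
X_aug = np.hstack((ones, X_demo))      # dummy feature as the first column, shape (n, d + 1)
X_aug_T = X_aug.T                      # transpose, shape (d + 1, n)
print(X_aug_T.shape)
```

Transposing first and then stacking a row of ones on top produces the same matrix.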



In [41]:
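### Add dummy feature
train_data = train_data.T               # shape (d, n_train)
test_data = test_data.T                 # shape (d, n_test)
# Stack a row of ones on top; this is equivalent to adding a first column of ones
# before transposing. np.vstack replaces np.row_stack, which is deprecated in NumPy 2.0.
X_tr = np.vstack((np.ones(train_data.shape[1]), train_data))
X_te = np.vstack((np.ones(test_data.shape[1]), test_data))
X_tr.shape,X_te.shape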

((11, 331), (11, 111))



Write a function `compute_weights(X, y)` that uses the closed form formula of linear regression and returns a weight vector.
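With the data matrix stored as features by samples (shape $(d, n)$), the closed form reads $w = (XX^T)^{+} X y$, where $^{+}$ denotes the pseudo-inverse. The sketch below checks this on synthetic, noiseless data; `X_demo`, `w_true` and the seed are assumptions made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # arbitrary seed
n, d = 50, 3
X_demo = rng.normal(size=(d, n))          # features x samples, as in the transposed layout
w_true = np.array([1.0, -2.0, 0.5])       # hypothetical ground-truth weights
y_demo = X_demo.T @ w_true                # noiseless targets

# Closed-form least-squares solution
w_closed = np.linalg.pinv(X_demo @ X_demo.T) @ X_demo @ y_demo

# Reference solution from NumPy's least-squares solver (expects samples x features)
w_lstsq = np.linalg.lstsq(X_demo.T, y_demo, rcond=None)[0]

print(np.allclose(w_closed, w_true), np.allclose(w_closed, w_lstsq))
```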



In [46]:
## Weight vector
def compute_weights(X, y):
    # Closed form: w = (X X^T)^+ X y, with X of shape (d + 1, n)
    return np.linalg.pinv(X @ X.T) @ X @ y
w = compute_weights(X_tr, y_train)



Write a function `MSE(X, y, w)` that returns the mean squared error of the predictions made with the weights `w` on the given `X` and `y`.
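A small worked example, assuming the features-by-samples layout so that the predictions are $X^T w$. The helper name `mse_demo` and the toy numbers are made up for this sketch.

```python
import numpy as np

def mse_demo(X, y, w):
    """Hypothetical helper: mean squared error with X of shape (d, n)."""
    return np.mean((y - X.T @ w) ** 2)

X_demo = np.array([[1.0, 1.0, 1.0],    # dummy feature (row of ones)
                   [0.0, 1.0, 2.0]])   # one real feature
w_demo = np.array([1.0, 2.0])          # predictions: 1, 3, 5
y_demo = np.array([2.0, 2.0, 7.0])     # residuals:   1, -1, 2

print(mse_demo(X_demo, y_demo, w_demo))   # (1 + 1 + 4) / 3 = 2.0
```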



In [50]:
def MSE(X, y, w):
    y_pred = (X.T)@w
    return np.mean((y-y_pred)**2)
print('training error is: ',MSE(X_tr,y_train, w))

training error is:  2930.6006971803567


In [51]:
print('test error is: ',MSE(X_te,y_test, w))

test error is:  2719.657271320531



Write a function `compute_weights_ridge(X, y, alpha)` that uses the closed form formula of Ridge regression with regularization strength `alpha` and returns a weight vector.
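For a data matrix of shape $(d, n)$ the Ridge closed form is $w = (XX^T + \alpha I)^{-1} X y$: the penalty term is added to $XX^T$ and the resulting matrix is inverted. The sketch below is one possible way to write it on synthetic data; `ridge_weights_demo`, the toy arrays and the value of `alpha` are assumptions for illustration only.

```python
import numpy as np

def ridge_weights_demo(X, y, alpha):
    """Hypothetical helper. X has shape (d, n): one column per sample."""
    d = X.shape[0]
    A = X @ X.T + alpha * np.eye(d)     # regularized Gram matrix
    return np.linalg.solve(A, X @ y)    # solves A w = X y without forming the inverse

rng = np.random.default_rng(0)          # arbitrary seed
X_demo = rng.normal(size=(3, 40))
y_demo = X_demo.T @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=40)

w_ols = ridge_weights_demo(X_demo, y_demo, 0.0)     # alpha = 0 reduces to least squares
w_ridge = ridge_weights_demo(X_demo, y_demo, 5.0)   # alpha > 0 shrinks the weights

print(np.linalg.norm(w_ridge) < np.linalg.norm(w_ols))
```

Note that when `X` contains a dummy row of ones, this form also penalizes the intercept weight.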



In [60]:
#### Ridge weight vector
def compute_weights_ridge(X, y, alpha):
    # Closed form: w = (X X^T + alpha * I)^(-1) X y, with X of shape (d + 1, n)
    A = X @ X.T + alpha * np.eye(X.shape[0])
    return np.linalg.solve(A, X @ y)
w_r = compute_weights_ridge(X_tr, y_train, 0.3)
w_r.round(1)



Compute the train error and test error.

In [56]:
### Test and train error
print('training error using ridge weight is: ',MSE(X_tr,y_train, w_r))
print('test error using ridge weight is: ',MSE(X_te,y_test, w_r))
