# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of alphas, loop over that list, and then compare the results.

In [2]:
import numpy as np

In [None]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alpha_array = [0.1, 1, 10]  # change here

# add 1-3 lines of code here
results = np.empty(len(alpha_array), dtype=object)
for i, alpha in enumerate(alpha_array):
    # closed-form ridge weights for this alpha
    w = np.linalg.inv(x.T*x + alpha * I)*x.T*y
    w = w.ravel()
    results[i] = w

print(results)


[matrix([[-101.72397081,    1.16978757]])
 matrix([[-20.59044706,   0.71048616]])
 matrix([[-2.29106262,  0.60688107]])]
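
The loop above fits one weight vector per $\alpha$ but holds out no data, so the fits cannot yet be ranked. A minimal sketch of leave-one-out cross-validation on the same 15 samples could look like the following. The helper names `ridge_fit` and `loo_mse` are hypothetical, `np.linalg.solve` is swapped in for the explicit inverse, and leave-one-out is only one possible splitting scheme, chosen here because the dataset is small.

```python
import numpy as np

# Same data as in the cell above
heights = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
targets = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)
X = np.c_[np.ones(len(heights)), heights]  # bias column + feature


def ridge_fit(X, y, alpha):
    """Closed-form ridge weights: (X^T X + alpha I)^-1 X^T y (hypothetical helper)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.identity(d), X.T @ y)


def loo_mse(X, y, alpha):
    """Leave-one-out CV: mean squared error over the held-out samples (hypothetical helper)."""
    errors = []
    for i in range(len(y)):
        train = np.arange(len(y)) != i  # boolean mask that drops sample i
        w = ridge_fit(X[train], y[train], alpha)
        errors.append((y[i] - X[i] @ w) ** 2)
    return float(np.mean(errors))


# One cross-validated score per alpha; the smallest error wins
cv_scores = {alpha: loo_mse(X, targets, alpha) for alpha in [0.1, 1, 10]}
best_alpha = min(cv_scores, key=cv_scores.get)
print(cv_scores, best_alpha)
```

Comparing the entries of `cv_scores` then gives a held-out error for each $\alpha$ rather than only a set of weights.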


## 2. Implement the Lasso regression based on the Ridge regression example

Implement the SGD method and compare the results with those of the sklearn Lasso regression.

In [11]:
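def sgd(x, y, it_depth, r, alpha):
    """Train Lasso weights with SGD: it_depth epochs, learning rate r, L1 weight alpha."""
    x_num, d = x.shape  # number of samples, number of features
    w = np.zeros(d)  # initialize weights to zero
    for _ in range(it_depth):
        for i in range(x_num):
            # subgradient step on (y_i - x_i.w)^2 + alpha * |w|_1
            w -= r * (-2 * x[i] * (y[i] - x[i].dot(w)) + alpha * np.sign(w))
    return w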

In [None]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.c_[np.ones((15, 1)), x]
y = y.ravel()

alpha = 0.1
print(sgd(x, y, 100, 0.000001, alpha))  # 100 epochs, learning rate 1e-6

# The closed-form Ridge formula does not carry over to Lasso: the L1 penalty
# is not differentiable at zero, so the weights are fit iteratively (here with SGD).


[-0.00140505  0.59178569]
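
The comparison with sklearn asked for above could be sketched as follows. This is an assumption-laden sketch, not a definitive setup: scikit-learn's `Lasso` minimizes $\frac{1}{2n}\lVert y - Xw\rVert_2^2 + \alpha\lVert w\rVert_1$, while `sgd` steps on the per-sample loss $(y_i - x_i w)^2 + \alpha\lVert w\rVert_1$, so under that reading the matching scikit-learn penalty would be about $\alpha/2$. `fit_intercept=False` is used because the design matrix already carries a column of ones, which also means the bias weight is penalized as in `sgd`. The variable names `heights`, `targets`, and `lasso` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Same data as in the cell above
heights = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
targets = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)
X = np.c_[np.ones(len(heights)), heights]  # bias column + feature

alpha = 0.1

# Assumption: alpha / 2 rescales the penalty to scikit-learn's 1/(2n) convention
lasso = Lasso(alpha=alpha / 2, fit_intercept=False, max_iter=100000)
lasso.fit(X, targets)

print(lasso.coef_)  # [bias weight, slope], to be set against the sgd() output
```

The two weight vectors need not agree closely: with a learning rate of `0.000001` the SGD run may not have converged after 100 epochs, in particular for the bias weight.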


## 3. Extend Fisher's classifier

Please extend the targets of the ``iris_data`` variable and use them as $y$.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data,columns=iris_data.feature_names)
iris_df.head()

x1 = iris_df['sepal width (cm)'].values  # change here
x2 = iris_df['sepal length (cm)'].values  # change here
y = iris_data.target

# There are two ways to compute the weights.
# First: matrix multiplication with the weight formula w = (x^T x)^-1 x^T y
x = np.asmatrix(np.c_[np.ones((len(x1), 1)), x1, x2])
w = np.linalg.inv(x.T @ x) @ x.T @ y
print(w)  # order: intercept, weight of x1, weight of x2


# Second: closed-form solution from the means and sums of squares
dataset_size = np.size(x1)

mean_x1, mean_x2, mean_y = np.mean(x1), np.mean(x2), np.mean(y)

SS_x1y = np.sum(y * x1) - dataset_size * mean_y * mean_x1
SS_x2y = np.sum(y * x2) - dataset_size * mean_y * mean_x2

SS_x1x1 = np.sum(x1 * x1) - dataset_size * mean_x1 * mean_x1
SS_x2x2 = np.sum(x2 * x2) - dataset_size * mean_x2 * mean_x2
SS_x1x2 = np.sum(x1 * x2) - dataset_size * mean_x1 * mean_x2

# Solve the 2x2 system for a1 and a2 (Cramer's rule), then recover the intercept b
denominator = SS_x1x1 * SS_x2x2 - SS_x1x2 ** 2
a1 = (SS_x2x2 * SS_x1y - SS_x1x2 * SS_x2y) / denominator
a2 = (SS_x1x1 * SS_x2y - SS_x1x2 * SS_x1y) / denominator
b = mean_y - a1 * mean_x1 - a2 * mean_x2

# Predicted values
y_pred = a1 * x1 + a2 * x2 + b
print(f"Coefficients: a1={a1}, a2={a2}, Intercept: b={b}")

[[-1.34333979 -0.63781099  0.73474169]]
Coefficients: a1=-0.6378109916553719, a2=0.7347416880973524, Intercept: b=-1.343339792294505
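
The closed-form coefficients in the second approach can be read as the solution of a $2 \times 2$ linear system. As a sketch, write $SS_{uv} = \sum_i u_i v_i - n\,\bar{u}\,\bar{v}$, where $n$ is the dataset size (`dataset_size` in the code) and $\bar{u}$, $\bar{v}$ are the means. The centered normal equations are then:

$$
\begin{pmatrix} SS_{x_1 x_1} & SS_{x_1 x_2} \\ SS_{x_1 x_2} & SS_{x_2 x_2} \end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}
=
\begin{pmatrix} SS_{x_1 y} \\ SS_{x_2 y} \end{pmatrix}
$$

Solving with Cramer's rule gives the expressions used in the code:

$$
a_1 = \frac{SS_{x_2 x_2}\, SS_{x_1 y} - SS_{x_1 x_2}\, SS_{x_2 y}}{SS_{x_1 x_1}\, SS_{x_2 x_2} - SS_{x_1 x_2}^2},
\qquad
a_2 = \frac{SS_{x_1 x_1}\, SS_{x_2 y} - SS_{x_1 x_2}\, SS_{x_1 y}}{SS_{x_1 x_1}\, SS_{x_2 x_2} - SS_{x_1 x_2}^2},
\qquad
b = \bar{y} - a_1 \bar{x}_1 - a_2 \bar{x}_2
$$

This is why the printed `a1`, `a2`, and `b` match the second, third, and first entries of `w` from the matrix approach.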
