# Exercises

There are three exercises in this notebook:

1. Use the cross-validation method to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

In [39]:
import numpy as np

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of alphas, loop over that list, and compare the resulting weights.

In [41]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alpha_values = [0.1, 1, 10]

for alpha in alpha_values:
    w = np.linalg.inv(x.T*x + alpha * I)*x.T*y
    w = w.ravel()
    print(f"Alpha: {alpha}, Weights: {w}")

Alpha: 0.1, Weights: [[-101.72397081    1.16978757]]
Alpha: 1, Weights: [[-20.59044706   0.71048616]]
Alpha: 10, Weights: [[-2.29106262  0.60688107]]
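
The loop above only prints the weights for each $\alpha$; it does not yet score them on held-out data. A minimal sketch of k-fold cross-validation for the same closed-form solution could look like the following. The helper names ``ridge_fit`` and ``cv_mse``, the choice of 5 contiguous (unshuffled) folds, and mean squared error as the score are assumptions made for illustration, not part of the original exercise.

```python
import numpy as np

x_raw = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y_raw = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)
X = np.c_[np.ones(len(x_raw)), x_raw]  # column of ones for the intercept


def ridge_fit(X, y, alpha):
    """Closed-form regularized weights; the penalty also covers the intercept, as in the loop above."""
    return np.linalg.solve(X.T @ X + alpha * np.identity(X.shape[1]), X.T @ y)


def cv_mse(X, y, alpha, k=5):
    """Mean squared validation error, averaged over k contiguous folds (hypothetical helper)."""
    folds = np.array_split(np.arange(len(y)), k)
    fold_errors = []
    for val_idx in folds:
        # Train on everything except the current validation fold
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[val_idx] = False
        w = ridge_fit(X[train_mask], y[train_mask], alpha)
        fold_errors.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(fold_errors))


alpha_values = [0.1, 1, 10]
cv_scores = {alpha: cv_mse(X, y_raw, alpha) for alpha in alpha_values}
for alpha, score in cv_scores.items():
    print(f"Alpha: {alpha}, CV MSE: {score:.2f}")

# The alpha with the lowest validation error is the preferred one under this score
best_alpha = min(cv_scores, key=cv_scores.get)
print("Best alpha:", best_alpha)
```

With only 15 samples the fold errors are noisy, so the ranking of alphas may change if the data are shuffled or a different ``k`` is used.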


## 2. Implement the Lasso regression based on the Ridge regression example

Please implement the SGD method and compare the results with the sklearn Lasso regression results. 

In [43]:
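def sgd(x, y, alpha=0.1, lr=1e-7, epochs=10):
    """Per-sample subgradient descent for a Lasso-style objective."""
    # One weight per column of x (the first column is the intercept)
    w = np.zeros(x.shape[1])

    for epoch in range(epochs):
        for i in range(x.shape[0]):
            # Convert xi from matrix row to flat array
            xi = np.array(x[i]).flatten()
            yi = float(y[i, 0])
            y_pred = np.dot(xi, w)
            error = y_pred - yi
            # Squared-error gradient plus the L1 subgradient;
            # note that the intercept weight is penalized as well
            grad = 2 * error * xi + alpha * np.sign(w)
            w -= lr * grad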

    return w
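
The update inside the loop can be read as one subgradient step on a per-sample Lasso objective. Written out, with $x_i$ the $i$-th row of ``x`` (including the leading 1), $\eta$ the learning rate ``lr``, and $\operatorname{sign}(0) = 0$ as in ``np.sign``:

$$
\ell_i(w) = (x_i^\top w - y_i)^2 + \alpha \lVert w \rVert_1, \qquad
g_i(w) = 2\,(x_i^\top w - y_i)\,x_i + \alpha\,\operatorname{sign}(w), \qquad
w \leftarrow w - \eta\, g_i(w)
$$

This is not quite the objective that sklearn's ``Lasso`` minimizes, which is $\frac{1}{2n}\lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1$ with an unpenalized intercept, so the two sets of weights should only be expected to agree approximately even after convergence.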

In [44]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

# Prepend a column of ones for the intercept term
x = np.asmatrix(np.c_[np.ones((15,1)),x])

alpha = 0.1

w = sgd(x, y, alpha=alpha, lr=1e-5)

In [45]:
from sklearn.linear_model import Lasso

# Sklearn's Lasso for comparison
model = Lasso(alpha=0.1, fit_intercept=True, max_iter=10000)
model.fit(np.array(x)[:,1:], y)

print("SGD-based Lasso:")
print("  Intercept:", w[0])
print("  Coef:     ", w[1])

print("\nsklearn Lasso:")
print("  Intercept:", model.intercept_)
print("  Coef:     ", model.coef_[0])

SGD-based Lasso:
  Intercept: -0.0017081631059024404
  Coef:      0.6284784839734829

sklearn Lasso:
  Intercept: [-180.8579086]
  Coef:      1.6177649901016675
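
The two results above are far apart, most visibly in the intercept. A likely reason is that 10 epochs with a very small learning rate on an unscaled feature (values around 140 to 200) barely move the intercept away from zero. The sketch below is one possible way to narrow the gap, not the reference solution: it standardizes the feature, leaves the intercept unpenalized, uses the $\frac{1}{2}(\text{error})^2$ scaling of sklearn's objective, and runs many more epochs than the exercise asks for. The function name ``lasso_sgd_standardized`` and the values ``lr=1e-3`` and ``epochs=2000`` are assumptions chosen for illustration.

```python
import numpy as np

x_raw = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y_raw = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)


def lasso_sgd_standardized(x, y, alpha=0.1, lr=1e-3, epochs=2000):
    """Subgradient SGD for single-feature Lasso on a standardized feature (hypothetical helper)."""
    mu, sigma = x.mean(), x.std()
    z = (x - mu) / sigma
    # Rescale the penalty so that it corresponds to alpha on the raw-scale coefficient
    alpha_z = alpha / sigma
    b, w = 0.0, 0.0
    for _ in range(epochs):
        for zi, yi in zip(z, y):
            error = b + w * zi - yi
            b -= lr * error                                 # intercept: no penalty
            w -= lr * (error * zi + alpha_z * np.sign(w))   # slope: L1 subgradient
    # Map the weights back to the original feature scale
    coef = w / sigma
    intercept = b - coef * mu
    return intercept, coef


intercept, coef = lasso_sgd_standardized(x_raw, y_raw, alpha=0.1)
print("Standardized SGD Lasso:")
print("  Intercept:", intercept)
print("  Coef:     ", coef)
```

Because the step size is constant, the weights settle near the minimizer rather than exactly on it, so a small remaining difference from the sklearn values is expected.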


## 3. Extend Fisher's classifier

Take the targets from the ``iris_data`` variable and use them as $y$, with two features as the input.

In [53]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data,columns=iris_data.feature_names)
iris_df.head()

x = iris_df[['sepal width (cm)', 'petal width (cm)']].values
y = iris_data.target

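# Means of the two features and of the numeric class label
mean_x = np.mean(x, axis=0)
mean_y = np.mean(y)
dataset_size = x.shape[0]

# Centered cross-products: SS_xy has shape (2,), SS_xx has shape (2, 2)
SS_xy = x.T @ y - dataset_size * mean_x * mean_y
SS_xx = x.T @ x - dataset_size * np.outer(mean_x, mean_x)

# Least-squares slopes a (one per feature) and intercept b
a = np.linalg.inv(SS_xx) @ SS_xy
b = mean_y - a @ mean_x

# Continuous score per sample, not yet a class label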
y_pred = x @ a + b
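
The fit above yields a continuous score rather than a class. One simple way to turn it into a prediction, sketched here under the assumption that the labels 0, 1, 2 are treated as an ordered numeric target (as the code above already does), is to round the score to the nearest label and measure accuracy. The names ``y_score``, ``y_class`` and ``accuracy`` are introduced only for this example, and ``np.linalg.lstsq`` is used as an equivalent way of obtaining the same slopes and intercept.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_data = load_iris()
x = iris_data.data[:, [1, 3]]  # sepal width (cm), petal width (cm)
y = iris_data.target

# Least-squares fit with an explicit intercept column
X = np.c_[np.ones(len(x)), x]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Continuous score, then the nearest valid class label
y_score = X @ coef
y_class = np.clip(np.rint(y_score), 0, 2).astype(int)

# Training accuracy only; no data is held out in this sketch
accuracy = np.mean(y_class == y)
print(f"Training accuracy: {accuracy:.3f}")
```

Rounding is only one possible decision rule; thresholds placed between the class means of the score would be another.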
