# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

In [2]:
import numpy as np

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of alphas, loop over that list, and finally compare the results.

In [4]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alphas = [0.1, 0.2, 0.3] # change here

# add 1-3 lines of code here
for alpha in alphas:
    # closed-form ridge solution: (X^T X + alpha * I)^-1 X^T y
    w = np.linalg.inv(x.T*x + alpha * I)*x.T*y
    w = w.ravel()

    # add 1-3 lines to compare the results
    y_pred = np.array(x * w.reshape(-1,1))
    mse = np.mean((y - y_pred) ** 2)  # MSE on the data used for fitting
    print(f"Alpha: {alpha:.2f}, MSE: {mse:.2f}, Weights: {w}")


Alpha: 0.10, MSE: 426.05, Weights: [[-101.72397081    1.16978757]]
Alpha: 0.20, MSE: 476.27, Weights: [[-70.75142154   0.99445055]]
Alpha: 0.30, MSE: 509.77, Weights: [[-54.23704349   0.90096184]]
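
The loop above scores every $\alpha$ on the same 15 points it was fitted on. A minimal sketch of how held-out folds could be used instead is given below. It assumes 5 shuffled folds and a fixed seed, uses plain ndarrays instead of ``np.matrix``, and the helper names ``ridge_fit`` and ``cross_val_mse`` are illustrative, not part of the exercise.

```python
import numpy as np

# Same data as in the exercise cell.
heights = np.array([188, 181, 197, 168, 167, 187, 178, 194,
                    140, 176, 168, 192, 173, 142, 176], dtype=float)
targets = np.array([141, 106, 149, 59, 79, 136, 65, 136,
                    52, 87, 115, 140, 82, 69, 121], dtype=float)

# Design matrix with a bias column, as a plain ndarray.
X = np.c_[np.ones(len(heights)), heights]


def ridge_fit(X, y, alpha):
    """Closed-form ridge weights: solve (X^T X + alpha * I) w = X^T y."""
    identity = np.identity(X.shape[1])
    return np.linalg.solve(X.T @ X + alpha * identity, X.T @ y)


def cross_val_mse(X, y, alpha, n_folds=5, seed=0):
    """Mean held-out MSE over n_folds shuffled folds (illustrative helper)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    fold_mse = []
    for test_idx in folds:
        # Train on everything except the current fold.
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        w = ridge_fit(X[train_mask], y[train_mask], alpha)
        residuals = y[test_idx] - X[test_idx] @ w
        fold_mse.append(np.mean(residuals ** 2))
    return float(np.mean(fold_mse))


cv_scores = {alpha: cross_val_mse(X, targets, alpha) for alpha in [0.1, 0.2, 0.3]}
for alpha, score in cv_scores.items():
    print(f"Alpha: {alpha:.2f}, cross-validated MSE: {score:.2f}")
```

Because every fold is scored on points the model has not seen, these values are a fairer basis for choosing $\alpha$ than the training MSE, although with only 15 samples they will vary with the seed.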


## 2. Implement the Lasso regression based on the Ridge regression example

Please implement the SGD method and compare its results with those of the sklearn Lasso regression.

In [31]:
import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

def sgd(x, y, alpha=0.1, learning_rate=0.01, epochs=10):
    """Train Lasso weights with per-sample (stochastic) subgradient steps."""
    x = np.asarray(x)
    y = np.asarray(y)
    n_features = x.shape[1]
    w = np.zeros(n_features)
    n_samples = x.shape[0]

    for epoch in range(epochs):
        # visit the samples in a new random order every epoch
        indices = np.random.permutation(n_samples)
        x_shuffled = x[indices]
        y_shuffled = y[indices]

        for i in range(n_samples):
            y_pred = np.dot(x_shuffled[i], w)
            # gradient of the squared error for a single sample
            mse_grad = -2 * x_shuffled[i] * (y_shuffled[i] - y_pred)

            # subgradient of the L1 penalty: sign(w[j]), random sign where w[j] == 0
            l1_grad = np.zeros_like(w)
            for j in range(n_features):
                if w[j] > 0:
                    l1_grad[j] = 1
                elif w[j] < 0:
                    l1_grad[j] = -1
                else:
                    l1_grad[j] = np.random.choice([-1, 1])

            grad = mse_grad + alpha * l1_grad
            w = w - learning_rate * grad

    return w
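
The per-feature loop over the signs can also be written with ``np.sign``. The sketch below is a variant under stated assumptions, not the reference solution: it returns 0 as the subgradient where a weight is exactly zero (instead of a random sign), flattens ``y`` to one dimension, and takes a ``seed`` argument for reproducibility. The name ``sgd_lasso_sign`` and the toy data are hypothetical.

```python
import numpy as np


def sgd_lasso_sign(x, y, alpha=0.1, learning_rate=0.01, epochs=10, seed=0):
    """Per-sample subgradient descent on squared error + alpha * ||w||_1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float).ravel()  # flatten (n, 1) targets to (n,)
    rng = np.random.default_rng(seed)
    w = np.zeros(x.shape[1])

    for _ in range(epochs):
        # New random sample order in every epoch.
        for i in rng.permutation(x.shape[0]):
            error = y[i] - x[i] @ w
            mse_grad = -2.0 * x[i] * error
            # np.sign gives 0 where w == 0, a valid subgradient of |w| there.
            w = w - learning_rate * (mse_grad + alpha * np.sign(w))
    return w


# Hypothetical toy data: one roughly standardized feature, noise-free y = 3 * x.
toy_x = np.linspace(-1.7, 1.7, 50).reshape(-1, 1)
toy_y = 3.0 * toy_x
toy_w = sgd_lasso_sign(toy_x, toy_y)
print(toy_w)
```

On this toy problem the learned slope should settle slightly below 3, which is the shrinkage the L1 penalty is meant to produce.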

In [34]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alpha = 0.1 

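x = np.asarray(x)
# standardize each column; the constant column of ones has zero std (replaced by 1)
# and therefore becomes all zeros after centering
x_mean = np.mean(x, axis=0)
x_std = np.std(x, axis=0)
x_std[x_std == 0] = 1
x_scaled = (x - x_mean) / x_std

w = sgd(x_scaled, y, alpha=alpha)

# reference fit on the same scaled data, without a separate intercept
lasso_sk = Lasso(alpha=alpha, fit_intercept=False)
lasso_sk.fit(x_scaled, y)

y_pred_sgd = np.dot(x_scaled, w)
y_pred_sk = lasso_sk.predict(x_scaled)

# note: y has shape (15, 1) while both predictions have shape (15,),
# so each difference below broadcasts to a (15, 15) array
mse_sgd = np.mean((y - y_pred_sgd) ** 2)
mse_sk = np.mean((y - y_pred_sk) ** 2)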

print("\nMSE Comparison:")
print(f"SGD Lasso MSE: {mse_sgd:.2f}")
print(f"Sklearn Lasso MSE: {mse_sk:.2f}")

MSE Comparison:
SGD Lasso MSE: 12184.22
Sklearn Lasso MSE: 12253.79
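
Both MSE values are far larger than in the first exercise. A short check with the same arrays suggests two contributing details: standardizing the column of ones turns it into zeros, so no intercept is left, and subtracting a ``(15,)`` prediction from a ``(15, 1)`` target broadcasts to all pairwise differences. The sketch below only demonstrates these two effects; ``X_fixed`` and ``placeholder_pred`` are illustrative names, and the stand-in predictions are not model output.

```python
import numpy as np

heights = np.array([188, 181, 197, 168, 167, 187, 178, 194,
                    140, 176, 168, 192, 173, 142, 176], dtype=float)
targets = np.array([141, 106, 149, 59, 79, 136, 65, 136,
                    52, 87, 115, 140, 82, 69, 121], dtype=float).reshape(-1, 1)

# 1) Standardizing the bias column: mean 1, std 0 (replaced by 1) gives zeros.
X = np.c_[np.ones(len(heights)), heights]
X_std = X.std(axis=0)
X_std[X_std == 0] = 1
X_scaled = (X - X.mean(axis=0)) / X_std
print("bias column after scaling:", np.unique(X_scaled[:, 0]))

# One possible alternative: scale only the feature and keep the column of ones.
X_fixed = np.c_[np.ones(len(heights)), X_scaled[:, 1]]

# 2) Shape mismatch: (15, 1) minus (15,) broadcasts to (15, 15).
placeholder_pred = np.full(len(heights), targets.mean())  # stand-in, shape (15,)
pairwise = targets - placeholder_pred
elementwise = targets.ravel() - placeholder_pred
print("pairwise shape:", pairwise.shape)
print("element-wise shape:", elementwise.shape)
```

Flattening the target with ``ravel()`` before subtracting, and keeping an unscaled bias column (or letting the model fit its own intercept), are two ways to avoid these effects.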


## 3. Extend the Fisher's classifier

Please extend the classifier to two features and use the targets of the ``iris_data`` variable as $y$.

In [36]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data, columns=iris_data.feature_names)
iris_df.head()

x = iris_df[['sepal width (cm)', 'petal width (cm)']].values # change here
y = iris_data.target # change here

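dataset_size = np.size(x, axis=0)
mean_x = np.mean(x, axis=0)  # one mean per feature
mean_y = np.mean(y)

# per-feature sums of cross-deviations and of squared deviations
SS_xy = np.sum((y - mean_y).reshape(-1, 1) * (x - mean_x), axis=0)
SS_xx = np.sum((x - mean_x) * (x - mean_x), axis=0)

# one slope per feature (each fitted separately) and a shared intercept
a = SS_xy / SS_xx
b = mean_y - np.sum(a * mean_x)

# continuous score: weighted sum of both features plus the intercept
y_pred = np.sum(a * x, axis=1) + b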
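
The cell above fits one slope per feature independently and stops at a continuous ``y_pred``. As a possible next step, the sketch below fits both features jointly with ``np.linalg.lstsq`` (a different technique from the per-feature slopes) and maps the scores to the nearest class index. The small arrays are hypothetical stand-ins for two iris features and three classes, not values taken from the dataset.

```python
import numpy as np

# Hypothetical toy data standing in for two features and three classes.
x = np.array([[3.5, 0.2], [3.0, 0.2], [3.2, 0.3],
              [2.8, 1.3], [2.9, 1.4], [2.7, 1.5],
              [3.0, 2.1], [3.1, 2.3], [2.9, 2.0]])
y = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

# Fit y ~ a1 * x1 + a2 * x2 + b jointly; the last column of ones carries b.
design = np.c_[x, np.ones(len(x))]
coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
a, b = coeffs[:2], coeffs[2]

# Continuous scores, then the nearest valid class index.
y_score = x @ a + b
y_class = np.clip(np.rint(y_score), y.min(), y.max()).astype(int)
accuracy = np.mean(y_class == y)
print(f"a = {a}, b = {b:.3f}, accuracy = {accuracy:.2f}")
```

Rounding a regression score only makes sense because the class indices here are ordered along the features; for unordered classes a proper classifier would be the safer choice.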