# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

## 1. Cross-validation linear regression

Change the variable ``alpha`` to a list of alphas, loop over the list, and compare the results for each value.
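
For reference, the closed-form ridge solution that the cell below evaluates for each $\alpha$ can be written as follows, where $X$ is the design matrix with a leading column of ones and $I$ is the $2 \times 2$ identity. Because the penalty goes through the full identity matrix, the bias weight is shrunk together with the slope.

$$
w = (X^{T} X + \alpha I)^{-1} X^{T} y
$$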

In [56]:
import numpy as np
from sklearn.linear_model import Ridge

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)
# Prepend a column of ones so the first weight acts as the bias term.
x = np.c_[np.ones((15, 1)), x]
I = np.identity(2)
alphas = [1.0, 0.1, 0.01, 0.001]

for alpha in alphas:
    # Closed-form ridge solution: w = (X^T X + alpha * I)^-1 X^T y
    w = np.linalg.inv(x.T @ x + alpha * I) @ x.T @ y
    w = np.ravel(w)
    # Mean squared error of the analytical fit on the training data
    y_pred = x @ w.reshape(-1, 1)
    mse = ((y_pred - y) ** 2).mean()
    # Same model in sklearn; fit_intercept=False because x already holds the bias column
    ridge_model = Ridge(alpha=alpha, fit_intercept=False)
    ridge_model.fit(x, y)
    ridge_w = ridge_model.coef_.ravel()
    print(f"alpha = {alpha:.3f} | Weights (analytical): {w} | Weights (sklearn): {ridge_w} | MSE: {mse:.3f}")

alpha = 1.000 | Weights (analytical): [-20.59044706   0.71048616] | Weights (sklearn): [-20.59044706   0.71048616] | MSE: 592.464
alpha = 0.100 | Weights (analytical): [-101.72397081    1.16978757] | Weights (sklearn): [-101.72397081    1.16978757] | MSE: 426.045
alpha = 0.010 | Weights (analytical): [-167.85534019    1.54416013] | Weights (sklearn): [-167.85534019    1.54416013] | MSE: 373.794
alpha = 0.001 | Weights (analytical): [-179.52628555    1.61022985] | Weights (sklearn): [-179.52628555    1.61022985] | MSE: 372.348
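
The MSE values above are measured on the same 15 points that were used for fitting. A held-out comparison of the same alphas could be sketched with k-fold cross-validation as below. The names `heights`, `targets`, `design` and `cross_validated_mse`, the choice of 5 shuffled folds, and the fixed seed are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

heights = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176])
targets = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121])
# Same design matrix as in the cell above: a bias column of ones plus the feature.
design = np.c_[np.ones(heights.size), heights]


def cross_validated_mse(alpha, n_splits=5, seed=0):
    """Average held-out MSE of Ridge(alpha) over n_splits shuffled folds."""
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_mse = []
    for train_idx, test_idx in folds.split(design):
        # Fit on the training folds only, then score on the held-out fold.
        model = Ridge(alpha=alpha, fit_intercept=False)
        model.fit(design[train_idx], targets[train_idx])
        residuals = model.predict(design[test_idx]) - targets[test_idx]
        fold_mse.append(np.mean(residuals ** 2))
    return float(np.mean(fold_mse))


cv_results = {alpha: cross_validated_mse(alpha) for alpha in [1.0, 0.1, 0.01, 0.001]}
for alpha, mse in cv_results.items():
    print(f"alpha = {alpha:.3f} | cross-validated MSE: {mse:.3f}")
```

With only 15 samples each fold holds three points, so the averaged error can shift noticeably with the seed; repeating the split with several seeds gives a steadier comparison.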


## 2. Implement the Lasso regression based on the Ridge regression example

Implement the SGD method and compare its results with those of the sklearn Lasso regression.

In [57]:
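def sgd(x, y, alpha, epochs):
    # Full-batch gradient descent on the squared error; alpha is the learning rate.
    weights = np.ones(x.shape[1])
    bias = 1

    for _ in range(epochs):
        # Residuals of the current fit, shape (n_samples, 1)
        delta = y - (x * weights + bias)
        # Weight gradient, scaled by the squared norm of x rather than the sample count
        gradient_weights = -2 * sum(x * delta) / (np.linalg.norm(x) ** 2)
        gradient_bias = -(2 / y.size) * sum(delta)
        weights -= (alpha * gradient_weights)
        bias -= (alpha * gradient_bias)

    return bias, weights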

In [58]:
from sklearn.linear_model import Lasso

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

# Reference fit: sklearn Lasso with an L1 penalty of 0.1
lasso_regression = Lasso(alpha=0.1)
lasso_regression.fit(X=x, y=y)
lasso_bias, lasso_weights = lasso_regression.intercept_[0], lasso_regression.coef_[0]
print(f"Lasso: [{lasso_bias}] [{lasso_weights}]")

# Here 0.1 is the learning rate passed to sgd and 10000 is the number of epochs
sgd_bias, sgd_weights = sgd(x, y, 0.1, 10000)
print(f"SGD: {sgd_bias} {sgd_weights}")

Lasso: [-180.85790859980537] [1.6177649901016675]
SGD: [-180.89686576] [1.6179881]
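
The `sgd` function above uses the whole data set in every update and contains no L1 term, so its fixed point is the ordinary least-squares fit rather than a Lasso solution. A per-sample variant with an L1 subgradient, run for the 10 epochs the exercise asks for, might look like the sketch below. The names `lasso_sgd`, `penalty`, `features` and `mse`, the standardization of the feature, and the learning rate of 0.05 are all assumptions for illustration.

```python
import numpy as np

heights = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
targets = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)
# Standardize the feature so that one constant learning rate stays stable.
features = ((heights - heights.mean()) / heights.std()).reshape(-1, 1)


def lasso_sgd(x, y, penalty, learning_rate=0.05, epochs=10, seed=0):
    """Per-sample SGD on 0.5 * error**2 + penalty * |w| using an L1 subgradient."""
    rng = np.random.default_rng(seed)
    weights = np.zeros(x.shape[1])
    bias = 0.0
    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch.
        for i in rng.permutation(x.shape[0]):
            error = x[i] @ weights + bias - y[i]
            # np.sign(0) is 0, which is a valid subgradient of |w| at w = 0.
            weights -= learning_rate * (error * x[i] + penalty * np.sign(weights))
            # The bias is not penalized.
            bias -= learning_rate * error
    return bias, weights


def mse(x, y, bias, weights):
    """Mean squared error of the linear fit x @ weights + bias."""
    return float(np.mean((x @ weights + bias - y) ** 2))


sgd_bias, sgd_weights = lasso_sgd(features, targets, penalty=0.1)
initial_mse = mse(features, targets, 0.0, np.zeros(1))
final_mse = mse(features, targets, sgd_bias, sgd_weights)
print(f"bias = {sgd_bias:.3f}, weights = {sgd_weights}, MSE: {initial_mse:.1f} -> {final_mse:.1f}")
```

Because the feature is standardized here, the resulting weight is in units of standard deviations and is not directly comparable to the coefficient printed above. A plain subgradient step also rarely lands on exactly zero, so a soft-thresholding (proximal) update is a common alternative when sparse weights matter.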


## 3. Extend Fisher's classifier

Extend the targets of the ``iris_data`` variable and use them as $y$.

In [59]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

# Load the Iris data into a DataFrame and attach the class label as the target
iris = load_iris()
df_iris = pd.DataFrame(iris.data, columns=iris.feature_names)
df_iris['target'] = iris.target
independent_vars = ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
dependent_var = 'target'
X_vals = df_iris[independent_vars].values
Y_vals = df_iris[dependent_var].values.reshape(-1, 1)

# Least-squares slope and intercept computed over all feature values pooled together;
# Y_vals broadcasts across the four feature columns
total_elements = X_vals.size
mean_X = np.mean(X_vals)
mean_Y = np.mean(Y_vals)
sum_xy = np.sum(X_vals * Y_vals)
sum_xx = np.sum(X_vals * X_vals)
SS_xy_alt = sum_xy - total_elements * mean_X * mean_Y
SS_xx_alt = sum_xx - total_elements * mean_X * mean_X
slope_alt = SS_xy_alt / SS_xx_alt
intercept_alt = mean_Y - slope_alt * mean_X

# Predict the class value from each feature column with the shared slope and intercept
Y_pred_alt = slope_alt * X_vals + intercept_alt
df_predictions = pd.DataFrame(Y_pred_alt)
df_predictions

|     | 0        | 1        | 2        | 3        |
|-----|----------|----------|----------|----------|
| 0   | 1.244804 | 1.005314 | 0.690983 | 0.511365 |
| 1   | 1.214867 | 0.930473 | 0.690983 | 0.511365 |
| 2   | 1.184931 | 0.960409 | 0.676015 | 0.511365 |
| 3   | 1.169963 | 0.945441 | 0.705951 | 0.511365 |
| 4   | 1.229836 | 1.020282 | 0.690983 | 0.511365 |
| ... | ...      | ...      | ...      | ...      |
| 145 | 1.484294 | 0.930473 | 1.259772 | 0.825696 |
| 146 | 1.424421 | 0.855632 | 1.229836 | 0.765824 |
| 147 | 1.454357 | 0.930473 | 1.259772 | 0.780792 |
| 148 | 1.409453 | 0.990346 | 1.289708 | 0.825696 |
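
The cell above fits a single least-squares line to the pooled feature values and does not compute a discriminant direction. For comparison, a two-feature Fisher's linear discriminant on two of the Iris classes could be sketched as follows. The choice of petal length and petal width, the restriction to classes 0 and 1, and the midpoint threshold are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
# Assumption: keep classes 0 and 1 and the two petal features (columns 2 and 3).
mask = iris.target < 2
features = iris.data[mask][:, [2, 3]]
labels = iris.target[mask]

class0 = features[labels == 0]
class1 = features[labels == 1]
mean0, mean1 = class0.mean(axis=0), class1.mean(axis=0)

# Within-class scatter matrix S_W: sum of the two per-class scatter matrices.
scatter_within = (class0 - mean0).T @ (class0 - mean0) + (class1 - mean1).T @ (class1 - mean1)

# Fisher direction w = S_W^-1 (m1 - m0), obtained by solving a linear system.
direction = np.linalg.solve(scatter_within, mean1 - mean0)

# Assumed decision rule: threshold halfway between the projected class means.
threshold = 0.5 * (mean0 @ direction + mean1 @ direction)
predicted = (features @ direction > threshold).astype(int)
accuracy = float(np.mean(predicted == labels))
print(f"direction = {direction}, threshold = {threshold:.3f}, accuracy = {accuracy:.3f}")
```

Projecting onto `direction` reduces each two-feature sample to one number, so the class decision becomes a single comparison against `threshold`. Handling all three Iris classes would need either a multi-class discriminant or one such binary classifier per pair of classes.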
