# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of alphas, loop over that list, and compare the results obtained for each value.
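
For reference, the quantities computed for each $\alpha$ in the cell below can be written as follows. This is only a sketch of the notation, assuming $X$ is the $15 \times 2$ design matrix (a column of ones plus the feature), $I$ is the $2 \times 2$ identity, and $n = 15$. The symbol $\hat{y}$ is introduced here for the fitted values (``my_y`` in the code). Written this way, the penalty also applies to the intercept.

$$
w_\alpha = (X^\top X + \alpha I)^{-1} X^\top y, \qquad \hat{y} = X w_\alpha
$$

$$
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \lvert y_i - \hat{y}_i \rvert, \qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}
$$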

In [106]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from collections import defaultdict
from sklearn.datasets import load_iris
from sklearn.linear_model import Lasso

In [4]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

# Design matrix: a column of ones (intercept) followed by the feature.
x = np.c_[np.ones((x.shape[0], 1)), x]

I = np.identity(2)
alphas = [0.0001, 0.001, 0.01, 0.02, 0.025, 0.1, 0.2, 0.5]
results = {}

for alpha in alphas:
    # Regularized closed form: w = (X^T X + alpha * I)^-1 X^T y
    w = np.linalg.solve(x.T @ x + alpha * I, x.T @ y)

    my_y = x @ w

    # Note: these errors are measured on the same 15 points used for fitting.
    mse = np.mean((y - my_y) ** 2)
    mae = np.mean(np.abs(y - my_y))
    rmse = np.sqrt(mse)

    results[alpha] = {
        "MSE": mse,
        "MAE": mae,
        "RMSE": rmse
    }

df_results = pd.DataFrame.from_dict(results, orient="index")
df_results = df_results.reset_index().rename(columns={"index": "Alpha"})

df_results



|   | Alpha | MSE | MAE | RMSE |
|---|---|---|---|---|
| 0 | 0.0001 | 372.331462 | 16.557719 | 19.295892 |
| 1 | 0.001 | 372.348022 | 16.562967 | 19.296321 |
| 2 | 0.01 | 373.7938 | 16.611693 | 19.333748 |
| 3 | 0.02 | 377.419689 | 16.658903 | 19.427292 |
| 4 | 0.025 | 379.772207 | 16.6802 | 19.487745 |
| 5 | 0.1 | 426.045077 | 18.227294 | 20.640859 |
| 6 | 0.2 | 476.271132 | 19.408063 | 21.823637 |
| 7 | 0.5 | 549.77106 | 21.366623 | 23.447197 |
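
The loop above measures the errors on the same 15 points that were used for fitting. A minimal sketch of k-fold cross-validation for the same closed-form solution could look like the following. The helpers ``ridge_fit`` and ``kfold_mse``, the choice of 5 folds, and the fixed seed are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np

# Same data as in the exercise.
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)

X = np.c_[np.ones_like(x), x]  # design matrix with an intercept column


def ridge_fit(X, y, alpha):
    """Closed form w = (X^T X + alpha * I)^-1 X^T y (hypothetical helper)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.identity(n_features), X.T @ y)


def kfold_mse(X, y, alpha, k=5, seed=0):
    """Mean validation MSE over k folds (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for val_idx in folds:
        # Train on everything except the current validation fold.
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[val_idx] = False
        w = ridge_fit(X[train_mask], y[train_mask], alpha)
        residuals = y[val_idx] - X[val_idx] @ w
        scores.append(np.mean(residuals ** 2))
    return float(np.mean(scores))


alphas = [0.0001, 0.001, 0.01, 0.02, 0.025, 0.1, 0.2, 0.5]
cv_scores = {alpha: kfold_mse(X, y, alpha) for alpha in alphas}
best_alpha = min(cv_scores, key=cv_scores.get)

for alpha, score in cv_scores.items():
    print(f"alpha={alpha}: CV MSE={score:.3f}")
print("best alpha:", best_alpha)
```

Because the seed is the same for every ``alpha``, all values are scored on identical folds, which keeps the comparison fair on such a small data set.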


## 2. Implement the Lasso regression, based on the Ridge regression example

Please implement the SGD method and compare its results with those of the sklearn Lasso regression.
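
One way to write the per-sample objective behind such an SGD update is sketched below, assuming a squared-error loss with an $L_1$ penalty on the weights only. The symbols $\eta$ (learning rate) and $\hat{y}_i$ (prediction for sample $i$) are introduced here for illustration.

$$
L_i(w, b) = (y_i - \hat{y}_i)^2 + \alpha \lVert w \rVert_1, \qquad \hat{y}_i = x_i^\top w + b
$$

$$
\frac{\partial L_i}{\partial w} = -2\, x_i\, (y_i - \hat{y}_i) + \alpha\, \mathrm{sign}(w), \qquad
\frac{\partial L_i}{\partial b} = -2\, (y_i - \hat{y}_i)
$$

$$
w \leftarrow w - \eta\, \frac{\partial L_i}{\partial w}, \qquad
b \leftarrow b - \eta\, \frac{\partial L_i}{\partial b}
$$

Here $\mathrm{sign}(w)$ is a subgradient of $\lVert w \rVert_1$, taken as 0 where a weight is exactly 0. sklearn's ``Lasso`` minimizes $\frac{1}{2n}\lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1$, so an $\alpha$ in the per-sample form above matches an sklearn ``alpha`` of $\alpha / 2$.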

In [285]:
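def sgd(X, y, learning_rate, alpha, epochs):
    """Train a Lasso (L1-regularized) linear model with per-sample SGD.

    X has shape (n_samples, n_features); y has shape (n_samples, 1).
    Returns the weights and the bias (a length-1 array for this y shape).
    """
    # Random initial weights, zero bias.
    weights, bias = np.random.randn(X.shape[1]), 0
    n_samples = X.shape[0]

    for _ in range(epochs):
        # Visit the samples in a new random order every epoch.
        indices = np.random.permutation(n_samples)
        X = X[indices]
        y = y[indices]
        for i in range(n_samples):
            x_i = X[i]
            y_i = y[i]

            y_pred = x_i @ weights + bias

            # Subgradient of (y_i - y_pred)^2 + alpha * ||weights||_1.
            dw = -2 * x_i * (y_i - y_pred) + alpha * np.sign(weights)
            db = -2 * (y_i - y_pred)

            weights -= learning_rate * dw
            bias -= learning_rate * db

    return weights, bias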

In [291]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

alpha = 0.1
learning_rate = 0.000015
epochs = 10

sgd_result = sgd(x, y, learning_rate, alpha, epochs)

lasso_regression = Lasso(alpha=alpha)
lasso_regression.fit(x, y)

# First (and only) coefficient and intercept of each model.
comparison_df = pd.DataFrame({
    "Model": ["sklearn_lasso", "sgd_lasso"],
    "Coef": [lasso_regression.coef_[0], sgd_result[0][0]],
    "Intercept": [lasso_regression.intercept_[0], sgd_result[1][0]]
})

comparison_df

|   | Model | Coef | Intercept |
|---|---|---|---|
| 0 | sklearn_lasso | 1.617765 | -180.857909 |
| 1 | sgd_lasso | 0.371838 | 0.005203 |
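
The gap between the two rows is plausibly a matter of scale. The feature values lie between 140 and 197, the learning rate is very small, and 10 epochs give only 150 updates, so the bias can move only a tiny distance from 0. The sketch below shows one possible remedy, under stated assumptions: standardize the feature, run more epochs with a slowly decaying step size, and map the result back to the original units. The helper ``sgd_lasso_1d`` and its default settings are hypothetical choices for illustration. The reference value is the closed-form minimizer of the same per-sample objective for a single standardized feature, not the sklearn result.

```python
import numpy as np

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)

# Standardize the feature so one learning rate suits both weight and bias.
x_mean, x_std = x.mean(), x.std()
x_scaled = (x - x_mean) / x_std


def sgd_lasso_1d(x, y, alpha, learning_rate=0.01, decay=0.1, epochs=200, seed=0):
    """SGD on (y_i - w * x_i - b)^2 + alpha * |w| (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    w, b = 0.0, 0.0
    for epoch in range(epochs):
        step = learning_rate / (1.0 + decay * epoch)  # shrink the step over time
        for i in rng.permutation(len(y)):
            error = y[i] - (w * x[i] + b)
            w -= step * (-2.0 * x[i] * error + alpha * np.sign(w))
            b -= step * (-2.0 * error)
    return w, b


alpha = 0.1
w_scaled, b_scaled = sgd_lasso_1d(x_scaled, y, alpha)

# Closed-form minimizer for one standardized feature: soft-threshold the
# covariance between x_scaled and y by alpha / 2.
cov_xy = np.mean(x_scaled * (y - y.mean()))
w_exact = np.sign(cov_xy) * max(abs(cov_xy) - alpha / 2.0, 0.0)

# Map the SGD solution back to the original units of x.
coef = w_scaled / x_std
intercept = b_scaled - w_scaled * x_mean / x_std

print(f"SGD:   coef={coef:.3f}, intercept={intercept:.3f}")
print(f"exact: coef={w_exact / x_std:.3f}, intercept={y.mean() - w_exact * x_mean / x_std:.3f}")
```

With a constant step size the iterates keep fluctuating around the minimizer, which is why the sketch lets the step shrink from epoch to epoch.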


## 3. Extend Fisher's classifier

Please extend the classifier to two features and use the targets of the ``iris_data`` variable as $y$.

In [321]:
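iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data, columns=iris_data.feature_names)
iris_df["target"] = iris_data.target

# Two features as x, the class label (0, 1, 2) as y.
x = iris_df[['sepal width (cm)', 'sepal length (cm)']].values
y = iris_df["target"].values.reshape(-1, 1)

dataset_size = np.size(y)

mean_x = np.mean(x, axis=0)  # per-feature means, shape (2,)
mean_y = np.mean(y)

# Per-feature sums; y of shape (150, 1) broadcasts against x of shape (150, 2).
SS_xy = np.sum((y - mean_y) * (x - mean_x), axis=0)
SS_xx = np.sum((x - mean_x) ** 2, axis=0)

# One slope per feature. Each slope is fitted separately, so any
# correlation between the two features is ignored.
a = SS_xy / SS_xx

b = mean_y - np.dot(a, mean_x)

# Continuous scores, shape (150,).
y_pred = np.dot(x, a) + b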
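
The cell above fits one slope per feature. As a possible alternative, the sketch below fits both slopes and the intercept jointly by least squares with ``np.linalg.lstsq`` and rounds the scores to class indices. The helper names ``fit_least_squares`` and ``predict_class`` and the synthetic data are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np


def fit_least_squares(X, y):
    """Jointly fit y ~ X @ a + b by least squares (hypothetical helper)."""
    design = np.c_[X, np.ones(len(X))]  # append an intercept column
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs[:-1], coeffs[-1]  # slopes a, intercept b


def predict_class(X, a, b, n_classes):
    """Round the regression score to the nearest valid class index."""
    scores = X @ a + b
    return np.clip(np.rint(scores), 0, n_classes - 1).astype(int)


# Made-up, correlated two-feature data with three classes (not the iris set).
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 50)
f1 = labels + rng.normal(scale=0.3, size=labels.size)
f2 = 0.5 * f1 + rng.normal(scale=0.3, size=labels.size)
X_demo = np.c_[f1, f2]

# Joint fit of both slopes and the intercept.
a, b = fit_least_squares(X_demo, labels)
joint_mse = np.mean((labels - (X_demo @ a + b)) ** 2)

# Per-feature slopes, computed as in the notebook cell, for comparison.
x_centered = X_demo - X_demo.mean(axis=0)
y_centered = (labels - labels.mean())[:, None]
a_marginal = np.sum(y_centered * x_centered, axis=0) / np.sum(x_centered ** 2, axis=0)
b_marginal = labels.mean() - a_marginal @ X_demo.mean(axis=0)
marginal_mse = np.mean((labels - (X_demo @ a_marginal + b_marginal)) ** 2)

accuracy = np.mean(predict_class(X_demo, a, b, n_classes=3) == labels)
print(f"joint MSE={joint_mse:.4f}, per-feature MSE={marginal_mse:.4f}, accuracy={accuracy:.3f}")
```

With the notebook's variables, the same helpers could be called as ``fit_least_squares(x, y.ravel())``. The joint fit can never have a larger training MSE than the per-feature slopes, because least squares minimizes that error over all slope and intercept combinations.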