# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of alphas, loop over the list to fit one model per value, and then compare the results.

In [7]:
import numpy as np

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alphas = [0.01, 0.1, 0.5] # change here

# add 1-3 lines of code here
results = []
for alpha in alphas:
    # closed-form ridge solution: w = (X^T X + alpha * I)^-1 X^T y
    w = np.linalg.inv(x.T*x + alpha * I)*x.T*y
    w = w.ravel()
    results.append(w)

# add 1-3 lines to compare the results
for i, alpha in enumerate(alphas):
    print("Alpha =", alpha)
    print("Weights:", results[i])
    print()


Alpha = 0.01
Weights: [[-167.85534019    1.54416013]]

Alpha = 0.1
Weights: [[-101.72397081    1.16978757]]

Alpha = 0.5
Weights: [[-36.97522016   0.80324169]]
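
The loop above fits one model per alpha on the full data set. To compare the alphas on held-out data, a k-fold split can be wrapped around the same closed-form solution. The following is a minimal sketch, not a reference solution: the helper names `ridge_fit` and `cross_validate_ridge`, the choice of `k=5`, and the fixed `seed` are assumptions made for illustration. As in the cell above, the penalty `alpha * I` is applied to the bias term as well.

```python
import numpy as np

# Same data as in the exercise cell, kept as 1-D float arrays.
x_raw = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y_raw = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)

def ridge_fit(X, y, alpha):
    """Closed-form ridge solution w = (X^T X + alpha * I)^-1 X^T y (hypothetical helper)."""
    I = np.identity(X.shape[1])
    return np.linalg.solve(X.T @ X + alpha * I, X.T @ y)

def cross_validate_ridge(x, y, alphas, k=5, seed=0):
    """Return the mean validation MSE per alpha over k folds (k and seed are assumptions)."""
    X = np.c_[np.ones(len(x)), x]  # prepend the bias column
    indices = np.random.default_rng(seed).permutation(len(x))
    folds = np.array_split(indices, k)
    scores = {}
    for alpha in alphas:
        fold_errors = []
        for i in range(k):
            val_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            w = ridge_fit(X[train_idx], y[train_idx], alpha)
            residuals = y[val_idx] - X[val_idx] @ w
            fold_errors.append(np.mean(residuals ** 2))
        scores[alpha] = float(np.mean(fold_errors))
    return scores

cv_scores = cross_validate_ridge(x_raw, y_raw, alphas=[0.01, 0.1, 0.5])
for alpha, mse in cv_scores.items():
    print(f"alpha={alpha}: mean validation MSE={mse:.2f}")

# The alpha with the lowest mean validation error is the preferred one.
best_alpha = min(cv_scores, key=cv_scores.get)
print("best alpha:", best_alpha)
```

With only 15 samples each validation fold holds three points, so the scores are noisy and may change with a different `seed`.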



## 2. Implement the Lasso regression based on the Ridge regression example

Please implement the SGD method and compare the results with the sklearn Lasso regression results. 

In [53]:
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error

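def sgd(x, y, alpha, epochs):
    # alpha is used as the learning rate in this function
    bias = 1.0
    weights = np.ones(x.shape[1])

    for i in range(epochs):
        # residuals of the current linear model on the full data set
        delta = y - (x * weights + bias)

        # squared-error gradient terms (the weights term is scaled by ||x||^2);
        # no L1 penalty term is included in this update
        weights_gradient = -2 * sum(x * delta) / (np.linalg.norm(x) ** 2)
        bias_gradient    = -(2 / y.size) * sum(delta)

        weights -= alpha * weights_gradient
        bias    -= alpha * bias_gradient

    return weights, bias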

In [54]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

alpha = 0.1 
epochs = 10

In [55]:
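# sgd is called with 0, 1, ..., epochs-1 iterations; the last entry,
# printed below, was therefore trained for epochs-1 iterations
w_sgd = []
for epoch in range(epochs):
    w_sgd.append(sgd(x, y, alpha, epoch))

print("SGD epochs=" + str(epochs) + " | coef: [" + str(w_sgd[epochs-1][0][0]) + "] intercept: [" + str(w_sgd[epochs-1][1][0]) + "]")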

lasso_regression = Lasso(alpha=alpha)
lasso_regression.fit(X=x, y=y)
w_sklearn = lasso_regression.coef_, lasso_regression.intercept_

print("Sklearn | coef: [" + str(w_sklearn[0][0]) + "] intercept: [" + str(w_sklearn[1][0]) + "]")

SGD epochs=10 | coef: [0.8003962894878118] intercept: [-36.34501185508552]
Sklearn | coef: [1.6177649901016677] intercept: [-180.85790859980537]
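
One possible way to bring an L1 penalty into a per-sample update is sketched below. It is a minimal sketch under several assumptions that are not part of the exercise: the function name `lasso_sgd`, the learning rate `lr`, the fixed `seed`, and the standardization of the feature are all hypothetical choices. The penalty uses the subgradient `sign(w)`, and the objective follows the form scikit-learn documents for `Lasso`, $\frac{1}{2n}\lVert y - Xw\rVert_2^2 + \alpha\lVert w\rVert_1$. The feature is standardized because raw values around 170 would make a constant step size of this magnitude diverge.

```python
import numpy as np

def lasso_sgd(x, y, alpha=0.1, lr=0.05, epochs=10, seed=0):
    """Stochastic subgradient descent for a Lasso-style objective (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = x.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(epochs):
        # visit the samples one at a time, in a fresh random order each epoch
        for i in rng.permutation(n_samples):
            error = y[i] - (x[i] @ weights + bias)
            # squared-error gradient plus the L1 subgradient alpha * sign(w)
            weights -= lr * (-error * x[i] + alpha * np.sign(weights))
            bias -= lr * (-error)
    return weights, bias

x_raw = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y_raw = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)

# Standardize the single feature (an assumption of this sketch).
x_mean, x_std = x_raw.mean(), x_raw.std()
x_scaled = ((x_raw - x_mean) / x_std).reshape(-1, 1)

w_scaled, b_scaled = lasso_sgd(x_scaled, y_raw, alpha=0.1, lr=0.05, epochs=10)

# Map the parameters back to the original feature scale for comparison.
coef = w_scaled[0] / x_std
intercept = b_scaled - w_scaled[0] * x_mean / x_std
print("coef:", coef, "intercept:", intercept)
```

With a constant step size the final iterate keeps fluctuating around the minimizer, so the numbers depend on `seed` and `lr`. Note also that the penalty here acts on the standardized coefficient, so the result is only roughly comparable to a `Lasso` fit on the raw feature.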


## 3. Extend Fisher's classifier

Please extend the classifier to use two features, and use the targets from the ``iris_data`` variable as $y$.

In [62]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data,columns=iris_data.feature_names)
iris_df['target'] = iris_data.target

x = iris_df[['sepal width (cm)','sepal length (cm)']].values # change here
y = iris_df['target'].values.reshape(-1, 1) # change here

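# np.size counts every element of x (150 rows x 2 features)
dataset_size = np.size(x)

# means are taken over all elements, pooling both feature columns
mean_x, mean_y = np.mean(x), np.mean(y)

SS_xy = np.sum(y * x) - dataset_size * mean_y * mean_x
SS_xx = np.sum(x * x) - dataset_size * mean_x * mean_x

# least-squares slope and intercept shared by both feature columns
a = SS_xy / SS_xx
b = mean_y - a * mean_x


y_pred = a * x + b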

df = pd.DataFrame(y_pred, columns=["", ""])
print(df)

                       
0    0.924785  1.051418
1    0.885212  1.035589
2    0.901042  1.019760
3    0.893127  1.011845
4    0.932700  1.043504
..        ...       ...
145  0.885212  1.178051
146  0.845640  1.146393
147  0.885212  1.162222
148  0.916871  1.138479
149  0.885212  1.114735

[150 rows x 2 columns]
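
For reference, a two-feature version of Fisher's linear discriminant for two classes can be sketched as follows. This is a minimal sketch rather than the expected solution: the helper names `fisher_direction` and `fisher_predict` are hypothetical, the data is synthetic (generated with NumPy, not taken from iris), and placing the threshold at the midpoint of the projected class means is an assumed, common choice. The direction is $w = S_W^{-1}(m_1 - m_0)$, where $S_W$ is the within-class scatter matrix and $m_0$, $m_1$ are the class means.

```python
import numpy as np

def fisher_direction(x, y):
    """Fisher's discriminant direction and threshold for classes labelled 0 and 1."""
    x0, x1 = x[y == 0], x[y == 1]
    m0, m1 = x0.mean(axis=0), x1.mean(axis=0)
    # within-class scatter matrix S_W
    s_w = (x0 - m0).T @ (x0 - m0) + (x1 - m1).T @ (x1 - m1)
    # w = S_W^-1 (m1 - m0)
    w = np.linalg.solve(s_w, m1 - m0)
    # assumed decision threshold: midpoint of the projected class means
    threshold = w @ (m0 + m1) / 2
    return w, threshold

def fisher_predict(x, w, threshold):
    """Assign class 1 when the projection onto w exceeds the threshold."""
    return (x @ w > threshold).astype(int)

# Synthetic two-feature, two-class data (hypothetical values, not the iris data).
rng = np.random.default_rng(0)
class0 = rng.normal(loc=[3.4, 5.0], scale=0.3, size=(50, 2))
class1 = rng.normal(loc=[2.8, 5.9], scale=0.3, size=(50, 2))
x_demo = np.vstack([class0, class1])
y_demo = np.array([0] * 50 + [1] * 50)

w, threshold = fisher_direction(x_demo, y_demo)
y_hat = fisher_predict(x_demo, w, threshold)
accuracy = np.mean(y_hat == y_demo)
print("direction:", w, "threshold:", threshold, "accuracy:", accuracy)
```

To apply the same functions to the notebook's data, `x` would be the two selected iris columns and `y` the target restricted to two of the three classes, since this formulation separates exactly two classes.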
