# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend the Fisher's classifier to work with two features. Use the class as $y$.

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of alphas, loop over it to fit one model per value, and then compare the results.

In [226]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

# Prepend a column of ones so that w = [intercept, slope]
x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
# Regularization strengths to compare: 0.0, 0.1, 0.2
alphas = [round(i * 0.1, 1) for i in range(3)]

elements = []
for alpha in alphas:
    # Closed-form regularized solution for this alpha
    w = np.linalg.inv(x.T*x + alpha * I)*x.T*y
    elements.append(w.ravel())

# One row per alpha: [intercept, slope]
results = np.asarray(elements).reshape(len(elements), 2)
print(results)


[[-180.92401772    1.61814247]
 [-101.72397081    1.16978757]
 [ -70.75142154    0.99445055]]
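
The cell above fits each $\alpha$ on the full dataset, so it compares coefficients rather than held-out error. A minimal sketch of how the cross-validation step could look follows, assuming a 5-fold split and mean squared error as the score. The helper names `ridge_fit` and `cross_validate_ridge`, the fold count, and the seed are illustrative choices and not part of the original notebook.

```python
import numpy as np

def ridge_fit(X, y, alpha):
    """Closed-form solution w = (X^T X + alpha I)^{-1} X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def cross_validate_ridge(X, y, alphas, k=5, seed=0):
    """Return the mean validation MSE over k folds for each alpha."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = {}
    for alpha in alphas:
        fold_mse = []
        for i in range(k):
            val_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            # Fit on the training folds, score on the held-out fold
            w = ridge_fit(X[train_idx], y[train_idx], alpha)
            residuals = y[val_idx] - X[val_idx] @ w
            fold_mse.append(float(np.mean(residuals ** 2)))
        scores[alpha] = float(np.mean(fold_mse))
    return scores

x_raw = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176,
                  168, 192, 173, 142, 176], dtype=float)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87,
              115, 140, 82, 69, 121], dtype=float)
X = np.c_[np.ones(len(x_raw)), x_raw]  # bias column + feature

cv_scores = cross_validate_ridge(X, y, alphas=[0.0, 0.1, 0.2], k=5)
best_alpha = min(cv_scores, key=cv_scores.get)
for alpha, mse in cv_scores.items():
    print(f"alpha={alpha}: mean validation MSE = {mse:.2f}")
print("best alpha:", best_alpha)
```

With only 15 samples, the fold scores depend noticeably on the split, so repeating the procedure with several seeds is one way to check that the ranking of alphas is stable.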


## 2. Implement the Lasso regression based on the Ridge regression example

Please implement the SGD method and compare the results with the sklearn Lasso regression results. 

In [227]:
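def sgd(x, y, alpha, w):
    """Fit y ≈ w[0] * x + w[1] with an L1 penalty on the slope w[0].

    Each step uses the full dataset, and ``alpha`` serves as both the
    penalty weight and the step size.
    """
    # Column norms of the design matrix, used to scale the gradients
    normalizeData = np.linalg.norm(x, axis=0)
    x_data = x[:, 1].reshape(-1, 1)  # feature column (column 0 is the bias)
    for _ in range(700):
        y_pred = x_data * w[0] + w[1]
        # Subgradient of the L1 term: +alpha for a positive slope, -alpha otherwise
        if w[0] > 0:
            deltaW = (-x_data.T.dot(y - y_pred) * 2 + alpha) / (normalizeData[1] ** 2)
        else:
            deltaW = (-x_data.T.dot(y - y_pred) * 2 - alpha) / (normalizeData[1] ** 2)
        # Gradient with respect to the intercept, which is not penalized
        delta = (-2) * np.sum(y - y_pred) / (normalizeData[0] ** 2)
        w[0] = w[0] - alpha * deltaW
        w[1] = w[1] - alpha * delta

    return w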

In [228]:

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

# Prepend a column of ones for the intercept
x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alpha = 0.1

# Closed-form solution with the squared (ridge) penalty, printed as the reference value
w = np.linalg.inv(x.T*x + alpha * I)*x.T*y
w = w.ravel()
print("Lasso", w[0,1], w[0,0])

# Start from all-ones weights; sgd returns [slope, intercept]
wSgd = np.ones((2, 1))
w = sgd(x, y, alpha, wSgd)
w = w.ravel()
print("SGD", w[0], w[1])


Lasso 1.169787574869769 -101.72397080681458
SGD 1.1651313951744873 -101.24480072786287
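
The function above takes 700 full-batch steps. For reference, here is a minimal sketch of a per-sample variant that stops after 10 epochs, as the exercise statement asks. It assumes the objective $\frac{1}{2n}\sum_i (y_i - w x_i - b)^2 + \alpha |w|$, which is the form scikit-learn's `Lasso` minimizes, and it standardizes the feature so that a fixed step size is stable. The names `lasso_sgd` and `lasso_objective`, the step size `lr`, and the seed are assumptions made for illustration.

```python
import numpy as np

def lasso_objective(x, y, w, b, alpha):
    """Half the mean squared error plus the L1 penalty on the slope."""
    residuals = y - (w * x + b)
    return 0.5 * np.mean(residuals ** 2) + alpha * abs(w)

def lasso_sgd(x, y, alpha=0.1, lr=0.05, epochs=10, seed=0):
    """Per-sample subgradient descent on the Lasso objective."""
    rng = np.random.default_rng(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):  # one shuffled pass = one epoch
            error = (w * x[i] + b) - y[i]
            # np.sign(0) is 0, which is a valid subgradient of |w| at w = 0
            w -= lr * (error * x[i] + alpha * np.sign(w))
            b -= lr * error                # the intercept is not penalized
    return w, b

x_raw = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176,
                  168, 192, 173, 142, 176], dtype=float)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87,
              115, 140, 82, 69, 121], dtype=float)

# Standardize the feature (assumption: zero mean, unit variance)
x_std = (x_raw - x_raw.mean()) / x_raw.std()

w, b = lasso_sgd(x_std, y, alpha=0.1, epochs=10)
print("objective at start:", lasso_objective(x_std, y, 0.0, 0.0, 0.1))
print("objective after 10 epochs:", lasso_objective(x_std, y, w, b, 0.1))

# Map the coefficients back to the original units of x
slope = w / x_raw.std()
intercept = b - slope * x_raw.mean()
print("slope:", slope, "intercept:", intercept)
```

Because the penalty here acts on the standardized slope and the updates are noisy, the coefficients will not match the closed-form values printed above exactly. The comparison requested in the exercise would be against `sklearn.linear_model.Lasso` fitted on the same standardized feature.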


## 3. Extend the Fisher's classifier

Please extend the classifier to two features, and use the targets (class labels) of the ``iris_data`` variable as $y$.
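
As a point of reference for this task, a two-class Fisher discriminant on two features can be sketched as follows. The example assumes binary labels 0/1 and a decision threshold at the midpoint of the projected class means. It runs on synthetic data so that it is self-contained. The helpers `fisher_fit` and `fisher_predict` are hypothetical names, and two columns of ``iris_df`` together with ``iris_data.target`` (restricted to two classes) could be passed in instead.

```python
import numpy as np

def fisher_fit(X, y):
    """Fit a two-class Fisher discriminant; return direction w and threshold c."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter matrix: sum of the per-class scatter matrices
    S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Direction maximizing between-class over within-class variance
    w = np.linalg.solve(S_w, m1 - m0)
    # Assumption: threshold halfway between the projected class means
    c = 0.5 * (w @ m0 + w @ m1)
    return w, c

def fisher_predict(X, w, c):
    """Assign class 1 to samples whose projection exceeds the threshold."""
    return (X @ w > c).astype(int)

# Synthetic two-feature, two-class data (illustrative only, not the iris values)
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[3.4, 0.25], scale=[0.4, 0.1], size=(50, 2))
X1 = rng.normal(loc=[2.8, 1.3], scale=[0.3, 0.2], size=(50, 2))
X = np.vstack([X0, X1])
y = np.r_[np.zeros(50, dtype=int), np.ones(50, dtype=int)]

w, c = fisher_fit(X, y)
accuracy = np.mean(fisher_predict(X, w, c) == y)
print("direction:", w, "threshold:", c)
print("training accuracy:", accuracy)
```

For the three iris classes, one option is to fit one such discriminant per pair of classes. The sketch leaves that extension open.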

In [229]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data,columns=iris_data.feature_names)


x = iris_df[['sepal width (cm)', 'petal width (cm)']].values # two input features
y = iris_df[['sepal length (cm)', 'petal length (cm)']].values # two length columns used as targets

# Number of entries across both columns
dataset_size = np.size(y)

# Means taken over all entries of both columns
mean_x, mean_y = np.mean(x), np.mean(y)

SS_xy = np.sum(y * x) - dataset_size * mean_y * mean_x
SS_xx = np.sum(x * x) - dataset_size * mean_x * mean_x

# A single slope a and intercept b are shared by both features
a = SS_xy / SS_xx
b = mean_y - a * mean_x


y_pred = a * x + b
print(y_pred)

[[6.55223204 2.33825946]
 [5.91375134 2.33825946]
 [6.16914362 2.33825946]
 [6.04144748 2.33825946]
 [6.67992818 2.33825946]
 [7.06301659 2.59365173]
 [6.4245359  2.46595559]
 [6.4245359  2.33825946]
 [5.7860552  2.33825946]
 [6.04144748 2.21056332]
 [6.80762432 2.33825946]
 [6.4245359  2.33825946]
 [5.91375134 2.21056332]
 [5.91375134 2.21056332]
 [7.19071273 2.33825946]
 [7.70149729 2.59365173]
 [7.06301659 2.59365173]
 [6.55223204 2.46595559]
 [6.93532045 2.46595559]
 [6.93532045 2.46595559]
 [6.4245359  2.33825946]
 [6.80762432 2.59365173]
 [6.67992818 2.33825946]
 [6.29683976 2.72134787]
 [6.4245359  2.33825946]
 [5.91375134 2.33825946]
 [6.4245359  2.59365173]
 [6.55223204 2.33825946]
 [6.4245359  2.33825946]
 [6.16914362 2.33825946]
 [6.04144748 2.33825946]
 [6.4245359  2.59365173]
 [7.31840887 2.21056332]
 [7.44610501 2.33825946]
 [6.04144748 2.33825946]
 [6.16914362 2.33825946]
 [6.55223204 2.33825946]
 [6.67992818 2.21056332]
 [5.91375134 2.33825946]
 [6.4245359  2.33825946]
