# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different values of the regularization parameter $\alpha$.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of values, loop over the list to fit the model once per value, and finally compare the results.

In [None]:
import pandas as pd
import numpy as np

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alpha = [0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 1]

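# Closed-form regularized solution for each alpha: (X^T X + alpha * I)^-1 X^T y
data = {}
for i in alpha:
    data[str(i)] = ((np.linalg.inv(x.T*x + i * I)*x.T*y).ravel().tolist())[0]

# One row per alpha; a1 is the intercept and a2 is the slope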
df = pd.DataFrame.from_dict(data, orient='index', columns=['a1', 'a2'])

print(df)
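
The cell above compares the fitted coefficients, but not how well each $\alpha$ generalizes. A minimal sketch of k-fold cross-validation for the same data follows. The helper names `ridge_fit` and `cv_mse`, the choice of five folds, and the fixed random seed are assumptions made for illustration, not part of the original exercise. As in the cell above, the penalty is applied to the intercept as well.

```python
import numpy as np

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)

# Design matrix with a leading column of ones for the intercept
X = np.c_[np.ones(len(x)), x]

def ridge_fit(X, y, alpha):
    """Closed-form solution of (X^T X + alpha * I) w = X^T y (hypothetical helper)."""
    I = np.identity(X.shape[1])
    return np.linalg.solve(X.T @ X + alpha * I, X.T @ y)

def cv_mse(X, y, alpha, k=5, seed=0):
    """Mean validation MSE over k folds (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for test_idx in folds:
        # Train on every sample that is not in the current validation fold
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        w = ridge_fit(X[train_idx], y[train_idx], alpha)
        residuals = y[test_idx] - X[test_idx] @ w
        errors.append(np.mean(residuals ** 2))
    return float(np.mean(errors))

alphas = [0.01, 0.1, 1.0]
scores = {alpha: cv_mse(X, y, alpha) for alpha in alphas}
for alpha, mse in scores.items():
    print(f"alpha={alpha}: mean validation MSE = {mse:.2f}")
```

The $\alpha$ with the lowest mean validation error would be the preferred choice under this setup. With only 15 samples the estimate depends noticeably on how the folds are drawn.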


## 2. Implement the Lasso regression based on the Ridge regression example

Please implement the SGD method and compare its results with those of the sklearn Lasso regression.
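
For reference, the gradients used in the cell below are consistent with the following objective. This is a sketch that assumes an unnormalized squared-error loss, a single feature with slope $w$ and intercept $b$, and an L1 penalty with weight $\alpha$ applied to the slope only. The symbol $J$ is introduced here for illustration.

$$
J(w, b) = \sum_{i=1}^{n} \left(y_i - (w x_i + b)\right)^2 + \alpha \lvert w \rvert
$$

$$
\frac{\partial J}{\partial w} = -2 \sum_{i=1}^{n} x_i \left(y_i - (w x_i + b)\right) + \alpha \, \operatorname{sign}(w), \qquad
\frac{\partial J}{\partial b} = -2 \sum_{i=1}^{n} \left(y_i - (w x_i + b)\right)
$$

The absolute value is not differentiable at $w = 0$, where any value in $[-\alpha, \alpha]$ is a valid subgradient of the penalty term.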

In [None]:
import pandas as pd
import numpy as np

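def sgd(x_train, y, learning_rate, iterations = 3000):
    # Column norms of the design matrix; their squares scale the gradient steps
    column_norms = np.linalg.norm(x_train, axis=0)
    # The feature column does not change between iterations, so extract it once
    x = x_train[:,1].reshape(-1, 1)
    w = 1
    b = 0
    for _ in range(iterations):
        y_predict = x * w + b

        # Subgradient of the L1 penalty: added for w > 0, subtracted otherwise.
        # Note: learning_rate serves as both the penalty weight and the step size.
        if w > 0:
            w_gradient = (-2 * x.T.dot((y - y_predict)) + learning_rate ) / column_norms[1]**2
        else:
            w_gradient = (-2 * x.T.dot((y - y_predict)) - learning_rate ) / column_norms[1]**2

        b_gradient = (-2) * np.sum(y - y_predict) / column_norms[0]**2

        w = w - learning_rate * w_gradient
        b = b - learning_rate * b_gradient

    # Return the intercept first, then the slope
    return np.array([b, w.item(0)])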

In [None]:
import numpy as np

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alpha = 0.1
n = x.shape[1] 

initial_coefficients = np.zeros((n,1))

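# Store the result under a new name so that the sgd function is not overwritten
sgd_result = sgd(x, y, alpha, 710)
sgd_result = sgd_result.ravel()
print("SGD ", sgd_result)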

w = np.linalg.inv(x.T*x + alpha * I)*x.T*y # update this line
w=w.ravel()
print("Lasso ", w.tolist()[0])
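
For the comparison with sklearn, a minimal sketch could look like the following. It assumes scikit-learn is installed and reuses the data and `alpha = 0.1` from the cell above. The variable names `model` and `sklearn_result` are chosen here for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)

# sklearn fits the intercept itself, so no column of ones is needed here
model = Lasso(alpha=0.1)
model.fit(x, y)

# Same ordering as the SGD output: intercept first, then slope
sklearn_result = np.array([model.intercept_, model.coef_[0]])
print("sklearn Lasso ", sklearn_result)
```

Note that sklearn's `Lasso` minimizes $\frac{1}{2n}\lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1$ and does not penalize the intercept, so the same numeric `alpha` does not correspond to the same penalty strength as in a hand-written implementation with an unnormalized loss. Some difference between the two results is therefore expected.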


## 3. Extend Fisher's classifier

Please use the targets stored in the ``iris_data`` variable as $y$.

In [None]:
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris

iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data,columns=iris_data.feature_names)
iris_df.head()

# Selecting several columns requires a list of names (double brackets)
x = iris_df[['sepal width (cm)', 'petal width (cm)']].values # change here
y = iris_df[['sepal length (cm)', 'petal length (cm)']].values # change here

dataset_size = np.size(x)

mean_x, mean_y = np.mean(x), np.mean(y)

SS_xy = np.sum(y * x) - dataset_size * mean_y * mean_x
SS_xx = np.sum(x * x) - dataset_size * mean_x * mean_x

a = SS_xy / SS_xx
b = mean_y - a * mean_x


y_pred = a * x + b
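
One possible direction for the two-feature extension is sketched below. It keeps the least-squares approach of the cell above, uses sepal width and petal width as features, and takes `iris_data.target` as $y$. Treating the class indices 0, 1, 2 as numeric targets and rounding the prediction to the nearest class are simplifying assumptions made here, and the names `columns`, `coefficients`, and `predicted_class` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_data = load_iris()
feature_names = list(iris_data.feature_names)

# Column indices of the two chosen features
columns = [feature_names.index('sepal width (cm)'),
           feature_names.index('petal width (cm)')]

# Design matrix: a column of ones for the intercept plus the two features
X = np.c_[np.ones(len(iris_data.data)), iris_data.data[:, columns]]
y = iris_data.target.astype(float)

# Least-squares solution of X @ coefficients ≈ y
coefficients, *_ = np.linalg.lstsq(X, y, rcond=None)

# Continuous prediction, then rounded and clipped to the valid class indices 0..2
y_pred = X @ coefficients
predicted_class = np.clip(np.rint(y_pred), 0, 2).astype(int)

accuracy = np.mean(predicted_class == iris_data.target)
print("coefficients:", coefficients)
print("training accuracy:", accuracy)
```

The accuracy is measured on the training data only, so it says little about generalization. A held-out split or the cross-validation from the first exercise would give a fairer estimate.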