# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

## 1. Cross-validation linear regression

Change the variable ``alpha`` to a list of alphas, loop over the list to fit the model for each value, and then compare the results.
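The loop-and-compare idea can also be scored with k-fold cross-validation, which is what the exercise title asks for. The sketch below is one possible way to do it, not the notebook's own solution: it assumes scikit-learn's `Ridge` as the regularised model, 5 shuffled folds with a fixed seed, and mean squared error as the score. Note that `Ridge` leaves the intercept unpenalised, so its coefficients will not match a closed-form solution that penalises both entries of the weight vector.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

# Same data as in the exercise cell.
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121])

alphas = [1, 0.1, 0.01, 0.001, 0.0001]

# Assumption: 5 shuffled folds with a fixed seed; other splits are equally valid.
folds = KFold(n_splits=5, shuffle=True, random_state=0)

cv_mse = {}
for a in alphas:
    model = Ridge(alpha=a)
    # scikit-learn reports negated MSE so that larger is better; flip the sign back.
    scores = cross_val_score(model, x, y, cv=folds, scoring="neg_mean_squared_error")
    cv_mse[a] = -scores.mean()

# The alpha with the lowest mean validation error wins.
best_alpha = min(cv_mse, key=cv_mse.get)
for a, mse in cv_mse.items():
    print(f"alpha={a}: mean CV MSE = {mse:.2f}")
print("best alpha:", best_alpha)
```

With only 15 samples the fold scores are noisy, so small differences between alphas should be read with caution.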

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import linear_model


x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])


I = np.identity(2)
alpha = [1,0.1,0.01,0.001,0.0001] # change here

# add 1-3 lines of code here
results = []
regr = linear_model.LinearRegression().fit(np.asarray(x),np.asarray(y))
print(regr.coef_,regr.intercept_)

for a in alpha:
    w = np.linalg.inv(x.T*x + a * I)*x.T*y
    w=w.ravel()
    results.append(w)
    
results = np.asarray(results).reshape(len(alpha), 2)
dataframe = pd.DataFrame(data=results,index=alpha,columns=["b","a"])

dataframe

[[0.         1.61814247]] [-180.92401772]


| alpha | b | a |
|---|---|---|
| 1.0 | -20.590447 | 0.710486 |
| 0.1 | -101.723971 | 1.169788 |
| 0.01 | -167.85534 | 1.54416 |
| 0.001 | -179.526286 | 1.61023 |
| 0.0001 | -180.783266 | 1.617346 |


## 2. Implement the Lasso regression based on the Ridge regression example

Please implement the SGD method and compare the results with the sklearn Lasso regression results. 
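As a rough illustration of what an L1-penalised SGD update can look like, the sketch below runs per-sample subgradient descent on $\frac{1}{2}(wx + b - y)^2 + \alpha |w|$ for 10 epochs and compares the slope with `sklearn.linear_model.Lasso`. The function name `lasso_sgd`, the decaying learning-rate schedule, the seed, and the choice $\alpha = 0.1$ on standardised data are all assumptions made for this example, not part of the exercise.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_sgd(x, y, alpha=0.1, lr0=0.1, decay=0.05, epochs=10, seed=0):
    """Per-sample subgradient descent on 0.5*(w*x + b - y)**2 + alpha*|w| (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    w, b, step = 0.0, 0.0, 0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            lr = lr0 / (1.0 + decay * step)  # assumed schedule: shrink the step over time
            error = w * x[i] + b - y[i]
            # The L1 term contributes alpha * sign(w); the intercept is not penalised.
            w -= lr * (error * x[i] + alpha * np.sign(w))
            b -= lr * error
            step += 1
    return w, b

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)

# Standardise both variables so one learning rate suits w and b.
x_s = (x - x.mean()) / x.std()
y_s = (y - y.mean()) / y.std()

w_sgd, b_sgd = lasso_sgd(x_s, y_s, alpha=0.1)
reference = Lasso(alpha=0.1).fit(x_s.reshape(-1, 1), y_s)

print("SGD     w, b:", w_sgd, b_sgd)
print("sklearn w, b:", reference.coef_[0], reference.intercept_)
```

Both results are in standardised units. Because SGD is stochastic, the two slopes should be close but not identical, and a different seed or schedule will shift the SGD estimate slightly.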

In [2]:
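def sgd(x, y, n, epochs=1000):
    """Fit y = w * x + b by per-sample SGD on the squared error; n is the learning rate."""
    # Standardise inputs and targets so a single learning rate suits both parameters
    x_mean, x_std = np.mean(x), np.std(x)
    y_mean, y_std = np.mean(y), np.std(y)
    x = (x - x_mean) / x_std
    y = (y - y_mean) / y_std
    w, b = 0.0, 0.0

    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch
        indices = np.random.permutation(len(y))
        for i in indices:
            xi, yi = x[i], y[i]
            y_pred = w * xi + b
            error = y_pred - yi

            # Gradient step on 0.5 * error**2 (no penalty term is applied here)
            w -= n * error * xi
            b -= n * error

    # Map the parameters back to the original, unstandardised units
    w = w * (y_std / x_std)
    b = y_mean - w * x_mean

    return np.array([w, b])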

In [5]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

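I = np.identity(2)
alpha = 0.001  # used both as the SGD learning rate and as the penalty weight below

sgd_result = sgd(x, y, alpha)
sgd_result = sgd_result.ravel()

# Closed-form reference with an L2 (ridge) penalty on the bias-augmented design matrix
x = np.asmatrix(np.c_[np.ones((15,1)),x])
lasso_result = np.linalg.inv(x.T*x + alpha * I)*x.T*y
lasso_result = lasso_result.ravel()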

prepared_results = np.asarray([sgd_result[0], sgd_result[1], lasso_result.item(1), lasso_result.item(0)])
prepared_results = prepared_results.flatten()
prepared_results = prepared_results.reshape(2, 2)

final_result = pd.DataFrame(data=prepared_results, index=['sgd', 'lasso'], columns=["w", "b"])
final_result

|  | w | b |
|---|---|---|
| sgd | 1.617749 | -180.855105 |
| lasso | 1.61023 | -179.526286 |


## 3. Extend the Fisher's classifier

Take the targets from the ``iris_data`` variable and use them as $y$.
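For orientation, a two-feature Fisher discriminant can be sketched as follows. This is a minimal illustration under stated assumptions rather than the expected solution: it keeps only classes 0 and 1 so that the two-class Fisher criterion applies, uses the direction $w \propto S_W^{-1}(m_1 - m_0)$, and places the decision threshold halfway between the projected class means.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_data = load_iris()
# Columns 1 and 0 are sepal width and sepal length, the two features used in this exercise.
X = iris_data.data[:, [1, 0]]
y = iris_data.target

# Assumption: restrict to classes 0 and 1 so the two-class criterion applies.
mask = y < 2
X, y = X[mask], y[mask]

X0, X1 = X[y == 0], X[y == 1]
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)

# Within-class scatter: sum over both classes of (x - m)(x - m)^T.
S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)

# Fisher direction, proportional to S_w^{-1} (m1 - m0).
w = np.linalg.solve(S_w, m1 - m0)

# Assumed decision rule: threshold halfway between the projected class means.
threshold = 0.5 * (m0 @ w + m1 @ w)
y_pred = (X @ w > threshold).astype(int)
accuracy = (y_pred == y).mean()

print("w =", w)
print("threshold =", threshold)
print("training accuracy =", accuracy)
```

Handling all three iris classes would need either a multi-class extension of the criterion or several one-vs-rest discriminants, which this sketch does not attempt.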

In [4]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data,columns=iris_data.feature_names)
iris_df.head()
iris_df_target = pd.DataFrame(iris_data.target)

x = iris_df[['sepal width (cm)','sepal length (cm)']].values
y = iris_df_target.values
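dataset_size = np.size(x)  # counts every element of x (150 samples x 2 features = 300)

# Means are pooled over both feature columns
mean_x, mean_y = np.mean(x), np.mean(y)

# y (150, 1) broadcasts against x (150, 2), so each target is paired with both feature values
SS_xy = np.sum(y * x) - dataset_size * mean_y * mean_x
SS_xx = np.sum(x * x) - dataset_size * mean_x * mean_x

# Least-squares slope and intercept of the pooled fit
a = SS_xy / SS_xx
b = mean_y - a * mean_x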


y_pred = a * x + b
y_pred

array([[0.92478522, 1.05141831],
       [0.88521238, 1.03558917],
       [0.90104152, 1.01976004],
       [0.89312695, 1.01184547],
       [0.93269979, 1.04350374],
       [0.95644349, 1.07516201],
       [0.91687065, 1.01184547],
       [0.91687065, 1.04350374],
       [0.87729781, 0.99601633],
       [0.89312695, 1.03558917],
       [0.94061436, 1.07516201],
       [0.91687065, 1.02767461],
       [0.88521238, 1.02767461],
       [0.88521238, 0.98810177],
       [0.96435806, 1.10682029],
       [0.99601633, 1.09890572],
       [0.95644349, 1.07516201],
       [0.92478522, 1.05141831],
       [0.94852893, 1.09890572],
       [0.94852893, 1.05141831],
       [0.91687065, 1.07516201],
       [0.94061436, 1.05141831],
       [0.93269979, 1.01184547],
       [0.90895609, 1.05141831],
       [0.91687065, 1.02767461],
       [0.88521238, 1.04350374],
       [0.91687065, 1.04350374],
       [0.92478522, 1.05933288],
       [0.91687065, 1.05933288],
       [0.90104152, 1.01976004],
       [0.