# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

In [26]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import Ridge
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import Lasso

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of alphas, loop over it, and compare the results.

In [27]:
# Inputs (x) and targets (y) as column vectors
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

# Prepend a column of ones so that the first weight acts as the intercept
x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alphas = [0.1, 0.5, 1]
ws = []

# Closed-form ridge solution w = (X^T X + alpha * I)^-1 X^T y for each alpha
for alpha in alphas:
    w = np.linalg.inv(x.T*x + alpha * I)*x.T*y
    w = w.ravel()
    ws.append(w)
    
for i, alpha in enumerate(alphas):
    print(f"Alpha={alpha}, Weights= {ws[i]}")

Alpha=0.1, Weights= [[-101.72397081    1.16978757]]
Alpha=0.5, Weights= [[-36.97522016   0.80324169]]
Alpha=1, Weights= [[-20.59044706   0.71048616]]
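
The cell above fits the closed-form ridge solution for each alpha on all 15 points, so it does not yet hold out any data. A minimal sketch of a cross-validated comparison follows, assuming scikit-learn's `Ridge` and `cross_val_score` (both imported at the top of the notebook). The 5 folds, the mean squared error score, and the names `cv_mse` and `best_alpha` are assumptions made for illustration. Note that `Ridge` does not penalize the intercept, whereas the closed form above penalizes both weights, so the fitted coefficients will differ from the printed ones.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176,
              168, 192, 173, 142, 176], dtype=float).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87,
              115, 140, 82, 69, 121], dtype=float)

alphas = [0.1, 0.5, 1.0]
cv_mse = {}

for alpha in alphas:
    # cross_val_score returns negated MSE, one value per fold (assumed: 5 folds)
    scores = cross_val_score(Ridge(alpha=alpha), x, y, cv=5,
                             scoring="neg_mean_squared_error")
    cv_mse[alpha] = -scores.mean()

for alpha, mse in cv_mse.items():
    print(f"Alpha={alpha}, mean CV MSE={mse:.2f}")

# The alpha with the lowest held-out error
best_alpha = min(cv_mse, key=cv_mse.get)
print(f"Best alpha: {best_alpha}")
```

With only 15 samples each fold holds 3 points, so the scores are noisy and differences between alphas should be read with caution.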


## 2. Implement the Lasso regression, based on the Ridge regression example

Implement the SGD method and compare its results with those of the sklearn Lasso regression.

In [40]:
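def sgd(X, y, alpha, epochs=10, learning_rate=0.001):
    """SGD for Lasso: squared error plus an L1 penalty weighted by alpha."""
    w = np.zeros(X.shape[1])
    m = len(y)

    for _ in range(epochs):
        for _ in range(m):
            # Draw one sample at random (with replacement)
            random_index = np.random.randint(m)
            xi = X[random_index:random_index+1]
            yi = y[random_index:random_index+1]
            # Gradient of the squared error plus the L1 subgradient alpha * sign(w)
            gradients = -2 * xi.T.dot(yi - xi.dot(w)) + alpha * np.sign(w)
            # Note: with raw features around 180 and learning_rate=0.001,
            # each step overshoots and the weights diverge
            w = w - learning_rate * gradients
    return w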

In [41]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

# Prepend a column of ones for the intercept; keep a plain ndarray for sgd() and sklearn
x = np.c_[np.ones((15, 1)), x]

alpha = 0.1

w_sgd = sgd(x, y.ravel(), alpha)

lasso = Lasso(alpha=alpha)
lasso.fit(x, y.ravel())
# Note: x already holds a column of ones, so coef_[0] is the weight of that constant
# column (shrunk to 0 by Lasso); the slope of the feature is lasso.coef_[1]
w_sklearn = [lasso.intercept_, lasso.coef_[0]]

print(f"SGDLasso Weight: {w_sgd}")
print(f"Sklearn Lasso weights: {w_sklearn}")

SGDLasso Weight: [-1.36657679e+263 -2.40587176e+265]
Sklearn Lasso weights: [-180.85790859980537, 0.0]
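
The SGD weights printed above overflow because the raw feature values are around 180, which makes the step far too large for the chosen learning rate. One possible remedy, sketched below under stated assumptions, is to standardize the feature and center the target before running SGD, and to fit scikit-learn's `Lasso` on the same scaled data so the two results are comparable. The function name `sgd_lasso`, the learning rate of 0.05, the fixed seed, and the per-sample loss $\frac{1}{2}(y_i - x_i w)^2 + \alpha |w|$ (chosen to mirror the objective scikit-learn documents for `Lasso`) are all assumptions of this sketch, not part of the original exercise.

```python
import numpy as np
from sklearn.linear_model import Lasso

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176,
              168, 192, 173, 142, 176], dtype=float)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87,
              115, 140, 82, 69, 121], dtype=float)

# Assumption: zero-mean, unit-variance feature and centered target,
# so no explicit intercept column is needed and a fixed step size is stable
x_std = ((x - x.mean()) / x.std()).reshape(-1, 1)
y_centered = y - y.mean()


def sgd_lasso(X, y, alpha, epochs=10, learning_rate=0.05, seed=0):
    """SGD on 0.5 * (y_i - x_i . w)^2 + alpha * |w|_1, one sample per step."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # Visit every sample once per epoch, in random order
        for i in rng.permutation(len(y)):
            residual = y[i] - X[i] @ w
            gradient = -residual * X[i] + alpha * np.sign(w)
            w -= learning_rate * gradient
    return w


alpha = 0.1
w_sgd = sgd_lasso(x_std, y_centered, alpha)

lasso = Lasso(alpha=alpha).fit(x_std, y_centered)
w_sklearn = lasso.coef_

print(f"SGD Lasso weight (standardized feature): {w_sgd}")
print(f"Sklearn Lasso weight (standardized feature): {w_sklearn}")
```

With only 10 epochs and a constant step size, the SGD weight is a noisy estimate, so expect it to land near the scikit-learn coefficient rather than match it exactly.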


## 3. Extend Fisher's classifier

Extend the classifier to two features and use the targets of the ``iris_data`` variable as $y$.

In [42]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data, columns=iris_data.feature_names)
print(iris_df.head())

# Single feature (sepal width) as x, class labels 0/1/2 as y
x = iris_df['sepal width (cm)'].values
y = iris_data.target

dataset_size = np.size(x)

mean_x, mean_y = np.mean(x), np.mean(y)

# Centered sums of products used by the least-squares fit
SS_xy = np.sum(y * x) - dataset_size * mean_y * mean_x
SS_xx = np.sum(x * x) - dataset_size * mean_x * mean_x

# Slope a and intercept b of the fitted line
a = SS_xy / SS_xx
b = mean_y - a * mean_x

y_pred = a * x + b

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)
0                5.1               3.5                1.4               0.2
1                4.9               3.0                1.4               0.2
2                4.7               3.2                1.3               0.2
3                4.6               3.1                1.5               0.2
4                5.0               3.6                1.4               0.2
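
The cell above fits the class label against a single feature. A minimal sketch of the two-feature extension follows, using the same least-squares approach with an added intercept column. The second feature (`petal length (cm)`), the use of `np.linalg.lstsq`, and the rounding of the continuous prediction to the nearest class label are assumptions made for illustration; the exercise itself does not specify them.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_data = load_iris()
names = list(iris_data.feature_names)

# Assumption: keep sepal width from the cell above and add petal length
cols = [names.index("sepal width (cm)"), names.index("petal length (cm)")]
X = iris_data.data[:, cols]
y = iris_data.target.astype(float)

# Design matrix with a leading column of ones for the intercept
A = np.c_[np.ones(len(X)), X]

# Least-squares weights: [intercept, w_sepal_width, w_petal_length]
w, *_ = np.linalg.lstsq(A, y, rcond=None)

# Continuous prediction, then rounded and clipped to the class labels 0, 1, 2
y_pred = A @ w
labels = np.clip(np.rint(y_pred), 0, 2).astype(int)
accuracy = np.mean(labels == iris_data.target)

print(f"Weights: {w}")
print(f"Training accuracy: {accuracy:.3f}")
```

The accuracy here is measured on the training data, so it is an optimistic estimate of how the classifier would do on unseen samples.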
