# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

## 1. Cross-validation linear regression

Change the variable ``alpha`` to a list of alpha values, loop over the list to fit one model per value, and compare the resulting weights.
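For reference, the cell that follows evaluates the closed-form regularized least-squares solution once per $\alpha$. In matrix notation, with $X$ holding a leading column of ones (so the bias is penalized together with the slope) and $I$ the $2 \times 2$ identity, this can be written as:

$$
w = \left(X^\top X + \alpha I\right)^{-1} X^\top y
$$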

In [3]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alpha = [0.01, 0.1, 0.5] # change here

# add 1-3 lines of code here
res = []
for a in alpha:
    # regularized least-squares weights for this alpha
    w = np.linalg.inv(x.T*x + a * I)*x.T*y
    w = w.ravel()
    res.append(w)

# add 1-3 lines to compare the results
for i, a in enumerate(alpha):
    print("alpha: ", a)
    print("weights: ", res[i], "\n")


alpha:  0.01
weights:  [[-167.85534019    1.54416013]] 

alpha:  0.1
weights:  [[-101.72397081    1.16978757]] 

alpha:  0.5
weights:  [[-36.97522016   0.80324169]] 
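The loop above fits each $\alpha$ on all 15 points, so the printed weights alone do not say which value generalizes best. A minimal sketch of how k-fold cross-validation could rank the same alphas is given below. The helper names `ridge_fit` and `cv_mse`, the choice of `k=5`, and the contiguous (unshuffled) folds are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np

# Data from the exercise (x: feature, y: target).
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)


def ridge_fit(X, y, alpha):
    """Closed-form weights (X^T X + alpha * I)^-1 X^T y (hypothetical helper)."""
    I = np.identity(X.shape[1])
    return np.linalg.solve(X.T @ X + alpha * I, X.T @ y)


def cv_mse(x, y, alpha, k=5):
    """Mean validation MSE over k contiguous folds (hypothetical helper)."""
    X = np.c_[np.ones(len(x)), x]                  # prepend the bias column
    all_idx = np.arange(len(x))
    errors = []
    for val_idx in np.array_split(all_idx, k):
        train_idx = np.setdiff1d(all_idx, val_idx)  # everything outside the fold
        w = ridge_fit(X[train_idx], y[train_idx], alpha)
        errors.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(errors))


alphas = [0.01, 0.1, 0.5]
scores = {a: cv_mse(x, y, a) for a in alphas}
best_alpha = min(scores, key=scores.get)

for a in alphas:
    print(f"alpha: {a}  mean validation MSE: {scores[a]:.2f}")
print("best alpha:", best_alpha)
```

With only 15 samples the fold scores are noisy, so the ranking may change with a different `k` or with shuffled folds.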



## 2. Implement the Lasso regression, based on the Ridge regression example

Implement the SGD method and compare its results with those of the sklearn Lasso regression.

In [4]:
from sklearn import linear_model
from sklearn import metrics

In [5]:
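def sgd(x, y, alpha, epochs):
    # Note: alpha is used as the step size here, and no L1 penalty term
    # is applied to the weights.
    bias = 1.0
    weights = np.ones(x.shape[1])

    for i in range(epochs):
        # residuals of the current linear model on the full data set
        delta = y - (x * weights + bias)

        # squared-error gradients; the weight gradient is scaled by ||x||^2
        weights_gradient = -2 * sum(x * delta) / (np.linalg.norm(x) ** 2)
        bias_gradient    = -(2 / y.size) * sum(delta)

        weights -= alpha * weights_gradient
        bias    -= alpha * bias_gradient

    return weights, bias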

In [6]:
# data
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

# regularization parameter for sklearn's Lasso;
# sgd() receives the same value and uses it to scale the gradient step
alpha = 0.1

w_sgd = []
for epoch in range(100):
    w_sgd.append(sgd(x, y, alpha, epoch))

print("coef: " + str(w_sgd[100-1][0][0]))
print("intercept: " + str(w_sgd[100-1][1][0]) + "\n")

lasso_regression = linear_model.Lasso(alpha=alpha)
lasso_regression.fit(X=x, y=y)
w_sklearn = lasso_regression.coef_, lasso_regression.intercept_

print("Sklearn:")
print("coef: " + str(w_sklearn[0][0]))
print("intercept: " + str(w_sklearn[1][0]))


coef: 0.859235803334886
intercept: -47.441417412431484

Sklearn:
coef: 1.6177649901016677
intercept: -180.85790859980537
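What distinguishes Lasso from plain least squares is the L1 penalty $\alpha\,|w|$ on the weights. The sketch below shows one way a per-sample (stochastic) update with that penalty might look, using the subgradient $\operatorname{sign}(w)$. The function name `lasso_sgd`, the learning rate `lr`, the seed, and the feature standardization are assumptions added for illustration. Because the penalty is applied to the standardized weight, `alpha` here is not on the same scale as the `alpha` of sklearn's `Lasso`.

```python
import numpy as np

# Data from the exercise.
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)


def lasso_sgd(x, y, alpha, epochs=10, lr=0.05, seed=0):
    """Subgradient SGD for 0.5 * (y_i - w * z_i - b)^2 + alpha * |w| (hypothetical helper).

    z is the standardized feature; standardizing is an assumption made so
    that a fixed step size stays stable on inputs of magnitude ~150-200.
    """
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mu, sigma = x.mean(), x.std()
    z = (x - mu) / sigma

    w, b = 0.0, 0.0
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        for i in rng.permutation(len(z)):          # one sample per update
            residual = y[i] - (w * z[i] + b)
            # squared-error gradient plus the L1 subgradient alpha * sign(w)
            w -= lr * (-residual * z[i] + alpha * np.sign(w))
            b -= lr * (-residual)                  # the bias is not penalized

    coef = w / sigma                               # map back to original units
    intercept = b - coef * mu
    return float(coef), float(intercept)


coef, intercept = lasso_sgd(x, y, alpha=0.1, epochs=10)
print("coef:", coef)
print("intercept:", intercept)
```

A constant step size leaves some noise in the final estimate, so the result should be read as an approximation of the sklearn fit rather than an exact match. A decaying learning rate or a proximal (soft-thresholding) step would be natural refinements.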


## 3. Extend the Fisher's classifier

Add the targets from the ``iris_data`` variable to the data frame and use them as $y$.

In [7]:
from sklearn.datasets import load_iris

In [8]:
iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data,columns=iris_data.feature_names)
iris_df.head()

|   | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |


In [9]:
iris_df['target'] = iris_data.target

x = iris_df[['sepal width (cm)', 'sepal length (cm)']].values
y = iris_df['target'].values.reshape(-1,1)

In [10]:
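# np.size counts every entry of x (samples times features)
dataset_size = np.size(x)

# means over all entries; both feature columns are pooled
mean_x, mean_y = np.mean(x), np.mean(y)

# least-squares slope and intercept of a single line y = a * x + b
SS_xy = np.sum(y * x) - dataset_size * mean_y * mean_x
SS_xx = np.sum(x * x) - dataset_size * mean_x * mean_x

a = SS_xy / SS_xx
b = mean_y - a * mean_x

# the same line is applied to each feature column, so y_pred has two columns
y_pred = a * x + b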

In [11]:
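df = pd.DataFrame(y_pred, columns=["pred (sepal width)", "pred (sepal length)"])
df.head()

|   | pred (sepal width) | pred (sepal length) |
|---|---|---|
| 0 | 0.924785 | 1.051418 |
| 1 | 0.885212 | 1.035589 |
| 2 | 0.901042 | 1.01976 |
| 3 | 0.893127 | 1.011845 |
| 4 | 0.9327 | 1.043504 |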
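For comparison, a two-feature Fisher discriminant is usually built from the class means and the within-class scatter matrix rather than from a single regression line. The sketch below is a minimal two-class version under stated assumptions: the helper names `fisher_fit` and `fisher_predict` are hypothetical, the data are synthetic stand-ins for two feature columns (not the iris measurements), and the threshold at the midpoint of the projected class means assumes equal class priors.

```python
import numpy as np


def fisher_fit(X, y):
    """Fisher's linear discriminant for two classes labelled 0 and 1 (hypothetical helper)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the per-class scatter matrices.
    S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(S_w, m1 - m0)              # projection direction
    threshold = w @ (m0 + m1) / 2                  # midpoint of the projected means
    return w, threshold


def fisher_predict(X, w, threshold):
    """Assign class 1 when the projection exceeds the threshold."""
    return (X @ w > threshold).astype(int)


# Synthetic two-feature data, 50 samples per class (illustrative only).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[3.4, 5.0], scale=0.3, size=(50, 2)),
    rng.normal(loc=[2.8, 6.0], scale=0.3, size=(50, 2)),
])
y = np.repeat([0, 1], 50)

w, threshold = fisher_fit(X, y)
accuracy = np.mean(fisher_predict(X, w, threshold) == y)
print("direction:", w)
print("threshold:", threshold)
print("training accuracy:", accuracy)
```

To try this on the notebook's data, `X` could be replaced by two iris feature columns and `y` by the target restricted to two of the three classes. Handling all three classes at once would need a multi-class extension that this sketch does not cover.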
