# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement the Lasso regression, based on the Ridge regression example.
3. Extend Fisher's classifier to work with two features. Use the class as the $y$.

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of alphas, loop over that list, and compare the resulting weights.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn import linear_model

In [3]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])
w = []
I = np.identity(2)
alphas = [0.1, 1.0, 10.0]
# add 1-3 lines of code here
for alpha in alphas:
    w.append(np.linalg.inv(x.T*x + alpha * I)*x.T*y)
    #w = w.ravel()

# add 1-3 lines to compare the results
for alpha, w_alpha in zip(alphas, w):
    print("For alpha = {}: {}, {}".format(alpha, w_alpha[0], w_alpha[1]))


For alpha = 0.1: [[-101.72397081]], [[1.16978757]]
For alpha = 1.0: [[-20.59044706]], [[0.71048616]]
For alpha = 10.0: [[-2.29106262]], [[0.60688107]]
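
The loop above fits every alpha on all 15 points, so it shows how the weights shrink but not how well each fit generalises. A minimal leave-one-out cross-validation sketch could look like the following. It assumes the same penalised normal equations as the cell above (the intercept is penalised too), and the names `ridge_fit`, `loo_mse`, and `cv_scores` are hypothetical helpers, not part of the original notebook.

```python
import numpy as np

x_raw = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176,
                  168, 192, 173, 142, 176], dtype=float)
y_raw = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87,
                  115, 140, 82, 69, 121], dtype=float)

# Design matrix with a column of ones for the intercept.
X = np.c_[np.ones(x_raw.size), x_raw]
alphas = [0.1, 1.0, 10.0]


def ridge_fit(X, y, alpha):
    """Solve (X^T X + alpha * I) w = X^T y, as in the cell above."""
    I = np.identity(X.shape[1])
    return np.linalg.solve(X.T @ X + alpha * I, X.T @ y)


def loo_mse(X, y, alpha):
    """Leave-one-out CV: fit on n-1 points, score on the held-out point."""
    errors = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        w = ridge_fit(X[mask], y[mask], alpha)
        errors.append((y[i] - X[i] @ w) ** 2)
    return float(np.mean(errors))


cv_scores = {alpha: loo_mse(X, y_raw, alpha) for alpha in alphas}
best_alpha = min(cv_scores, key=cv_scores.get)

for alpha, score in cv_scores.items():
    print("alpha = {}: leave-one-out MSE = {:.2f}".format(alpha, score))
print("lowest CV error at alpha =", best_alpha)
```

Leave-one-out is used here only because the dataset has 15 points; with more data, a k-fold split would be the cheaper choice.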


## 2. Lasso regression based on the Ridge regression example

You only need to update the line that computes ``w`` and compare the result with the sklearn result. You should get
[1.61776499, -180.8579086].

In [38]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alpha = 0.1 

model = linear_model.Lasso(alpha = alpha)
model.fit(x, y)
w = np.linalg.inv(x.T*x + alpha * I)*x.T*y # update this line
w=w.ravel()

w_lasso = [model.coef_[1], model.intercept_[0]]
w_lasso

[1.6177649901016677, -180.85790859980537]
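
One possible way to reproduce the sklearn numbers by hand is sketched below. It assumes the objective documented for `linear_model.Lasso`, $\frac{1}{2n}\lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1$, with an unpenalised intercept. With a single feature, centring the data removes the intercept and the minimiser has a closed form based on soft-thresholding. The names `soft_threshold` and `w_manual` are hypothetical, and the column of ones is dropped because centring already handles the intercept.

```python
import numpy as np

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176,
              168, 192, 173, 142, 176], dtype=float)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87,
              115, 140, 82, 69, 121], dtype=float)
alpha = 0.1
n = x.size

# Centre the data so the unpenalised intercept drops out of the problem.
x_c = x - x.mean()
y_c = y - y.mean()


def soft_threshold(z, t):
    """Shrink z toward zero by t (proximal operator of t * |.|)."""
    return np.sign(z) * max(abs(z) - t, 0.0)


# Minimiser of (1 / (2 n)) * sum((y_c - a * x_c) ** 2) + alpha * |a|.
a = soft_threshold(x_c @ y_c / n, alpha) / (x_c @ x_c / n)
b = y.mean() - a * x.mean()  # recover the intercept from the means

w_manual = [a, b]
print(w_manual)
```

The printed slope and intercept should be close to the sklearn values quoted above. For more than one feature no such closed form exists, and the same soft-threshold step would be applied one coordinate at a time (coordinate descent).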

## 3. Extend the Fisher's classifier

Please select two feature columns from ``iris_df`` as $x$ and use the targets of the ``iris_data`` variable as the $y$.

In [21]:
iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data,columns=iris_data.feature_names)
iris_df.head()

|   | sepal length (cm) | sepal width (cm) | petal length (cm) | petal width (cm) |
|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 |


In [28]:
x = iris_df['sepal width (cm)'].values # change here
y = iris_data.target

dataset_size = np.size(x)
y

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
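
A rough sketch of a two-feature version follows. It assumes the exercise asks for a least-squares fit of the class label on the features (this is ordinary least squares on the labels, not Fisher's linear discriminant in the strict sense). The choice of sepal width and petal length is an arbitrary assumption, and `X_design`, `labels`, and `accuracy` are hypothetical names.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_data = load_iris()

# Assumption: columns 1 and 2 are sepal width and petal length;
# any other pair of columns can be substituted.
X = iris_data.data[:, [1, 2]]
y = iris_data.target.astype(float)

# Add a column of ones so the fit includes an intercept.
X_design = np.c_[np.ones(len(y)), X]

# Least-squares weights: [intercept, weight for feature 1, weight for feature 2].
w, *_ = np.linalg.lstsq(X_design, y, rcond=None)
y_pred = X_design @ w

# Turn the continuous predictions into class labels 0, 1, 2 by rounding.
labels = np.clip(np.rint(y_pred), 0, 2).astype(int)
accuracy = np.mean(labels == iris_data.target)

print("weights:", w)
print("training accuracy:", accuracy)
```

Rounding the prediction to the nearest class index only makes sense because the iris classes are encoded as 0, 1, 2; the reported accuracy is measured on the training data.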

In [27]:
mean_x, mean_y = np.mean(x), np.mean(y)

SS_xy = np.sum(y * x) - dataset_size * mean_y * mean_x
SS_xx = np.sum(x * x) - dataset_size * mean_x * mean_x

a = SS_xy / SS_xx
b = mean_y - a * mean_x


y_pred = a * x + b
y_pred

array([ 0.64501512,  1.04597696,  0.88559222,  0.96578459,  0.56482275,
        0.32424565,  0.72520749,  0.72520749,  1.12616932,  0.96578459,
        0.48463039,  0.72520749,  1.04597696,  1.04597696,  0.24405328,
       -0.07671619,  0.32424565,  0.64501512,  0.40443802,  0.40443802,
        0.72520749,  0.48463039,  0.56482275,  0.80539985,  0.72520749,
        1.04597696,  0.72520749,  0.64501512,  0.72520749,  0.88559222,
        0.96578459,  0.72520749,  0.16386092,  0.08366855,  0.96578459,
        0.88559222,  0.64501512,  0.56482275,  1.04597696,  0.72520749,
        0.64501512,  1.60732353,  0.88559222,  0.64501512,  0.40443802,
        1.04597696,  0.40443802,  0.88559222,  0.48463039,  0.80539985,
        0.88559222,  0.88559222,  0.96578459,  1.60732353,  1.20636169,
        1.20636169,  0.80539985,  1.52713116,  1.12616932,  1.28655406,
        1.84790063,  1.04597696,  1.6875159 ,  1.12616932,  1.12616932,
        0.96578459,  1.04597696,  1.28655406,  1.6875159 ,  1.44