# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of alphas, loop over the list to fit the model once per value, and then compare the results.
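
For reference, the solution cell for this exercise computes the weights for each $\alpha$ with the closed-form regularized least-squares expression below. Here $X$ is the design matrix with a leading column of ones and $I$ is the $2 \times 2$ identity, so the intercept is penalized together with the slope:

$$
w = (X^{\top} X + \alpha I)^{-1} X^{\top} y
$$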

In [5]:
import numpy as np

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

x = np.asmatrix(np.c_[np.ones((15,1)),x])

I = np.identity(2)
alpha_list = [0.1, 0.3, 0.5, 0.7, 1] # change here

# add 1-3 lines of code here
results = []
for alpha in alpha_list:
    # closed-form regularized least-squares solution for this alpha
    w = np.linalg.inv(x.T*x + alpha * I)*x.T*y
    w = w.ravel()
    results.append(w)

# add 1-3 lines to compare the results
for i, alpha in enumerate(alpha_list):
    print('Alpha:', alpha, ', results:', results[i])


Alpha: 0.1 , results: [[-101.72397081    1.16978757]]
Alpha: 0.3 , results: [[-54.23704349   0.90096184]]
Alpha: 0.5 , results: [[-36.97522016   0.80324169]]
Alpha: 0.7 , results: [[-28.04797742   0.75270394]]
Alpha: 1 , results: [[-20.59044706   0.71048616]]
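
The loop above fits on all 15 points and prints the weights; it does not hold any data out. A minimal sketch of k-fold cross-validation for the same closed-form fit might look like the following. The helper names `ridge_fit` and `cv_mse`, the 5-fold split, and the fixed seed are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np

x_raw = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y_raw = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)

# Design matrix with a leading column of ones for the intercept
X = np.c_[np.ones(len(x_raw)), x_raw]


def ridge_fit(X, y, alpha):
    """Hypothetical helper: closed-form solution of (X^T X + alpha I) w = X^T y."""
    return np.linalg.solve(X.T @ X + alpha * np.identity(X.shape[1]), X.T @ y)


def cv_mse(X, y, alpha, n_folds=5, seed=0):
    """Hypothetical helper: mean squared error averaged over n_folds held-out folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    errors = []
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        w = ridge_fit(X[train_idx], y[train_idx], alpha)
        errors.append(np.mean((X[test_idx] @ w - y[test_idx]) ** 2))
    return float(np.mean(errors))


alpha_list = [0.1, 0.3, 0.5, 0.7, 1]
# The same seed is used for every alpha, so all values are scored on identical folds
scores = {alpha: cv_mse(X, y_raw, alpha) for alpha in alpha_list}
for alpha, mse in scores.items():
    print('Alpha:', alpha, ', mean held-out MSE:', mse)

best_alpha = min(scores, key=scores.get)
print('Best alpha:', best_alpha)
```

Comparing held-out error rather than raw weights gives a single number per $\alpha$ that can be ranked directly.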


## 2. Lasso regression based on the Ridge regression example

Please implement the SGD method and compare the results with the sklearn Lasso regression results. 

In [17]:
def sgd(features, target, alpha, max_iter=1000):
    # Note: alpha is used as the learning rate here; the update below
    # contains no L1 penalty term.
    # Standardize features and target so one step size suits both parameters
    feat_mean, feat_std = np.mean(features), np.std(features)
    target_mean, target_std = np.mean(target), np.std(target)
    
    norm_features = (features - feat_mean) / feat_std
    norm_target = (target - target_mean) / target_std
    
    # Small random initial values
    coefficient = np.random.normal(scale=0.1)
    intercept = np.random.normal(scale=0.1)
    
    for iteration in range(max_iter):
        # One epoch: visit every sample once, in random order
        random_indices = np.random.permutation(len(target))
        
        for i in random_indices:
            x = norm_features[i]
            y = norm_target[i]
            prediction = coefficient * x + intercept
            residual = prediction - y
            # Gradient step on the squared error of this single sample
            coefficient -= alpha * residual * x
            intercept -= alpha * residual
    
    # Map the parameters back to the original (unstandardized) scale
    coefficient = coefficient * (target_std / feat_std)
    intercept = target_mean - coefficient * feat_mean
    
    return np.array([intercept, coefficient])

In [25]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

I = np.identity(2)
alpha = 0.001

# SGD estimate, returned as [intercept, coefficient]
sgd_res = sgd(x, y, alpha)
sgd_res = sgd_res.ravel()

# Closed-form solution with the alpha * I penalty from exercise 1, used as the reference
x = np.asmatrix(np.c_[np.ones((15,1)),x])
w = np.linalg.inv(x.T*x + alpha * I)*x.T*y
w = w.ravel()

print('sgd:', sgd_res[0], sgd_res[1])
print('lasso:', w.item(0), w.item(1))

sgd: -180.84289960253503 1.617679289698525
lasso: -179.52628555248992 1.6102298475724885
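
The `sgd` function above takes plain squared-error steps, while the exercise asks for Lasso, which adds an L1 penalty $\lambda |w|$ on the coefficient. One way this could be sketched is a subgradient step that adds $\lambda \, \mathrm{sign}(w)$ to the coefficient gradient and runs for the 10 epochs named in the exercise. The function name `lasso_sgd`, the penalty `lam=0.1`, the decaying step size, and the fixed seed are all assumptions for illustration. Instead of sklearn, the reference value here is the soft-thresholded correlation, which is the exact Lasso minimizer for a single standardized feature under the objective $\frac{1}{2n}\sum_i (y_i - w x_i - b)^2 + \lambda |w|$.

```python
import numpy as np


def lasso_sgd(features, target, lam=0.1, lr0=0.1, epochs=10, seed=0):
    """Hypothetical sketch: SGD with an L1 subgradient on standardized 1-D data."""
    rng = np.random.default_rng(seed)
    x = (features - features.mean()) / features.std()
    y = (target - target.mean()) / target.std()
    w, b = 0.0, 0.0
    for epoch in range(epochs):
        lr = lr0 / (1 + epoch)  # assumed schedule: shrink the step each epoch
        for i in rng.permutation(len(y)):
            residual = w * x[i] + b - y[i]
            # squared-error gradient plus the L1 subgradient lam * sign(w)
            w -= lr * (residual * x[i] + lam * np.sign(w))
            b -= lr * residual  # the intercept is not penalized
    return w, b


x_data = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float)
y_data = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)

lam = 0.1
w_sgd, b_sgd = lasso_sgd(x_data, y_data, lam=lam)

# Reference: soft-threshold the correlation between the standardized variables
x_n = (x_data - x_data.mean()) / x_data.std()
y_n = (y_data - y_data.mean()) / y_data.std()
rho = np.mean(x_n * y_n)
w_exact = np.sign(rho) * max(abs(rho) - lam, 0.0)

print('sgd (standardized):', w_sgd, b_sgd)
print('soft-threshold reference:', w_exact)
```

Both numbers are on the standardized scale. With only 10 epochs over 15 samples, the SGD estimate should be read as an approximation of the reference, not an exact match.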


## 3. Extend Fisher's classifier

Please use the targets of the ``iris_data`` variable as $y$.

In [49]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data,columns=iris_data.feature_names)
iris_df.head()
iris_df_target = iris_data.target

x = iris_df[['sepal width (cm)', 'sepal length (cm)']].values # change here
y = iris_df_target.reshape(-1, 1) # change here

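# np.size counts every entry of x: 150 rows * 2 features = 300
dataset_size = np.size(x)

# Means are pooled over both feature columns
mean_x, mean_y = np.mean(x), np.mean(y)

SS_xy = np.sum(y * x) - dataset_size * mean_y * mean_x
SS_xx = np.sum(x * x) - dataset_size * mean_x * mean_x

# Single least-squares slope and intercept shared by both features
a = SS_xy / SS_xx
b = mean_y - a * mean_x

# Applied element-wise, so y_pred has one column per feature
y_pred = a * x + b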
print(y_pred[:10], "\n...\n", y_pred[-10:])

[[0.92478522 1.05141831]
 [0.88521238 1.03558917]
 [0.90104152 1.01976004]
 [0.89312695 1.01184547]
 [0.93269979 1.04350374]
 [0.95644349 1.07516201]
 [0.91687065 1.01184547]
 [0.91687065 1.04350374]
 [0.87729781 0.99601633]
 [0.89312695 1.03558917]]
...
 [[0.89312695 1.1780514 ]
 [0.89312695 1.19388053]
 [0.86146868 1.10682029]
 [0.90104152 1.18596596]
 [0.90895609 1.1780514 ]
 [0.88521238 1.1780514 ]
 [0.84563954 1.14639313]
 [0.88521238 1.16222226]
 [0.91687065 1.13847856]
 [0.88521238 1.11473485]]
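
The cell above fits one pooled slope and applies it to each feature column separately. For comparison, a two-class Fisher linear discriminant projects both features onto a single direction $w = S_W^{-1}(m_1 - m_0)$, where $m_0, m_1$ are the class means and $S_W$ is the within-class scatter matrix. The sketch below is an illustration under stated assumptions: the function names `fisher_fit` and `fisher_predict`, the midpoint threshold, and the synthetic two-cluster data are not from the original notebook.

```python
import numpy as np


def fisher_fit(X, labels):
    """Hypothetical helper: Fisher direction and midpoint threshold for labels 0/1."""
    X0, X1 = X[labels == 0], X[labels == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices
    S_w = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(S_w, m1 - m0)
    # Assumed decision rule: threshold halfway between the projected class means
    threshold = 0.5 * (w @ m0 + w @ m1)
    return w, threshold


def fisher_predict(X, w, threshold):
    """Project onto w and assign class 1 to points above the threshold."""
    return (X @ w > threshold).astype(int)


# Synthetic stand-in data (not the iris measurements): two Gaussian clusters
rng = np.random.default_rng(0)
X_demo = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 2)),
    rng.normal(3.0, 1.0, size=(50, 2)),
])
labels_demo = np.repeat([0, 1], 50)

w, threshold = fisher_fit(X_demo, labels_demo)
accuracy = np.mean(fisher_predict(X_demo, w, threshold) == labels_demo)
print('direction:', w, 'threshold:', threshold, 'accuracy:', accuracy)
```

To try this on the iris arrays from the cell above, one option is to restrict to two of the three classes first, for example with `mask = y.ravel() < 2`, and then call `fisher_fit(x[mask], y.ravel()[mask])`.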
