# Exercises

There are three exercises in this notebook:

1. Use cross-validation to test the linear regression with at least three different $\alpha$ values.
2. Implement an SGD method that trains the Lasso regression for 10 epochs.
3. Extend Fisher's classifier to work with two features. Use the class as $y$.

## 1. Cross-validation linear regression

Change the variable ``alpha`` into a list of alphas, loop over that list, and then compare the results.
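For reference, the loop in the next cell evaluates the regularized normal equation once per $\alpha$. Written with $X$ as the design matrix (including the column of ones) and $I$ as the $2 \times 2$ identity, the expression computed there is:

$$
w = \left(X^{\top} X + \alpha I\right)^{-1} X^{\top} y
$$

For $\alpha > 0$ the matrix $X^{\top} X + \alpha I$ is positive definite and therefore invertible. For negative $\alpha$ that guarantee is lost, and the matrix can end up close to singular, which may produce very large coefficients.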

In [15]:
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge

In [16]:
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

# Design matrix with a leading column of ones for the intercept
x = np.c_[np.ones((15, 1)), x]

I = np.identity(2)
# Ridge regularization normally uses alpha >= 0; negative values can make the matrix ill-conditioned
alphas = [-1, -0.1, 0, 0.1, 1, 100]

results = []

for alpha in alphas:
    # Closed-form regularized least squares for this alpha
    w = np.linalg.inv(x.T @ x + alpha * I) @ x.T @ y
    w = w.ravel()
    results.append([alpha, w.item(0), w.item(1)])

df = pd.DataFrame(results, columns=["Alpha", "w0", "w1"])
df



|   | Alpha | w0 | w1 |
|---|---|---|---|
| 0 | -1.0 | 26.667097 | 0.442962 |
| 1 | -0.1 | -817.017374 | 5.219094 |
| 2 | 0.0 | -180.924018 | 1.618142 |
| 3 | 0.1 | -101.723971 | 1.169788 |
| 4 | 1.0 | -20.590447 | 0.710486 |
| 5 | 100.0 | -0.22873 | 0.595091 |
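
The loop above compares coefficients but does not hold out any data. Below is a minimal sketch of how the same $\alpha$ comparison could be cross-validated, assuming scikit-learn's `Ridge`, `KFold`, and `cross_val_score`. The 5-fold split, the `random_state`, the non-negative alpha list, and the names `cv_results` and `best_alpha` are illustrative choices, not part of the original exercise. Note that `Ridge` does not penalize the intercept, whereas the closed form above penalizes both `w0` and `w1`, so the coefficients will not match exactly.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

# Same data as in the cell above
x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121])

alphas = [0.1, 1.0, 100.0]  # assumption: only non-negative penalties
cv = KFold(n_splits=5, shuffle=True, random_state=0)  # assumption: 5 shuffled folds

cv_results = {}
for alpha in alphas:
    # scikit-learn reports negated MSE so that larger is better; flip the sign back
    scores = cross_val_score(Ridge(alpha=alpha), x, y, cv=cv, scoring="neg_mean_squared_error")
    cv_results[alpha] = -scores.mean()

# The alpha with the lowest mean held-out error
best_alpha = min(cv_results, key=cv_results.get)

for alpha, mse in cv_results.items():
    print(f"alpha={alpha}: mean CV MSE = {mse:.3f}")
print(f"best alpha: {best_alpha}")
```

With only 15 samples each fold holds three points, so the ranking of alphas can change with the shuffle seed.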


## 2. Lasso regression based on the Ridge regression example

Please implement the SGD method and compare the results with the sklearn Lasso regression results.

In [17]:
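def sgd(x, y, alpha, iteration):
    """Run `iteration` full-batch gradient steps on the squared error.

    Note: `alpha` is used as the step size and no L1 penalty term is applied.
    """
    rows, cols = x.shape
    weights = np.ones(cols)
    bias = 1

    for _ in range(iteration):
        # Residuals of the current linear model
        prediction = np.dot(x, weights) + bias
        delta = y - prediction

        # Gradient of the squared error, scaled by the squared Frobenius norm of x
        gradient_weights = -2 * np.dot(x.T, delta) / np.linalg.norm(x) ** 2
        gradient_bias = -2 * np.mean(delta)

        weights -= alpha * gradient_weights
        bias -= alpha * gradient_bias

    return bias, weights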

In [7]:
from sklearn.linear_model import Lasso

x = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176]).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121]).reshape(-1, 1)

# Design matrix with a leading column of ones
x = np.c_[np.ones((15, 1)), x]
I = np.identity(2)
alpha = 0.1  # passed to sgd() as the step size and to Lasso as the penalty

# Closed-form ridge solution, kept for reference; it is not used below
w = np.linalg.inv(x.T @ x + alpha * I) @ x.T @ y
w = w.ravel()

iterations_list = [10, 1000, 3000, 10000]

print("Comparison of SGD Weights for Different Iterations:")
for iterations in iterations_list:
    intercept_, coef_ = sgd(x, y.flatten(), alpha, iterations)
    print(f"\nIterations: {iterations}")
    print(f"  Intercept: {intercept_:.6f}")
    print(f"  Coefficient: {coef_[1]:.6f}")

lasso = Lasso(alpha=alpha)
lasso.fit(x[:, 1:], y.ravel())

print("\nLasso sklearn Weights:")
print(f"  Intercept: {lasso.intercept_:.6f}")
print(f"  Coefficient: {lasso.coef_[0]:.6f}")




Comparison of SGD Weights for Different Iterations:

Iterations: 10
  Intercept: -37.119060
  Coefficient: 0.797458

Iterations: 1000
  Intercept: -120.122501
  Coefficient: 1.266796

Iterations: 3000
  Intercept: -170.819960
  Coefficient: 1.555043

Iterations: 10000
  Intercept: -181.890861
  Coefficient: 1.617988

Lasso sklearn Weights:
  Intercept: -180.857909
  Coefficient: 1.617765
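
One way to add the L1 penalty that distinguishes Lasso from plain least squares is a per-sample subgradient step. The sketch below is a hedged illustration, not a reference implementation: the function name `lasso_sgd`, the learning rate `lr`, the `seed`, and the feature standardization are all assumptions introduced here. It visits the samples in shuffled order for 10 epochs and uses `np.sign(weights)` as a subgradient of $\lVert w \rVert_1$.

```python
import numpy as np


def lasso_sgd(x, y, alpha=0.1, lr=0.05, epochs=10, seed=0):
    """Per-sample subgradient descent on 0.5 * (y_i - w.x_i - b)^2 + alpha * ||w||_1."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = x.shape
    weights = np.zeros(n_features)
    bias = 0.0

    for _ in range(epochs):
        # One epoch = one pass over all samples in random order
        for i in rng.permutation(n_samples):
            error = x[i] @ weights + bias - y[i]
            # Squared-error gradient plus a subgradient of the L1 penalty
            weights -= lr * (error * x[i] + alpha * np.sign(weights))
            bias -= lr * error  # the intercept is not penalized

    return bias, weights


x_raw = np.array([188, 181, 197, 168, 167, 187, 178, 194, 140, 176, 168, 192, 173, 142, 176], dtype=float).reshape(-1, 1)
y = np.array([141, 106, 149, 59, 79, 136, 65, 136, 52, 87, 115, 140, 82, 69, 121], dtype=float)

# Assumption: standardize the feature so a fixed step size stays stable
mean, std = x_raw.mean(axis=0), x_raw.std(axis=0)
x_scaled = (x_raw - mean) / std

bias_s, weights_s = lasso_sgd(x_scaled, y)

# Map the parameters back to the original feature scale
coef = weights_s / std
intercept = bias_s - np.sum(weights_s * mean / std)

print(f"Intercept: {intercept:.6f}")
print(f"Coefficient: {coef[0]:.6f}")
```

Because the penalty here is applied to the standardized feature and the updates are noisy, the result should only be expected to land near the sklearn values, not to reproduce them. sklearn's `Lasso` uses coordinate descent, which can set coefficients exactly to zero; plain subgradient steps generally do not.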


## 3. Extend the Fisher's classifier

Please take the targets from the ``iris_data`` variable and use them as $y$.
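
As a point of comparison, here is a minimal sketch of the textbook two-class Fisher linear discriminant on two features, where the projection direction is $w = S_W^{-1}(m_1 - m_0)$. It rests on several assumptions that are not in the exercise text: only classes 0 and 1 are kept so that the two-class form applies, the decision threshold is the midpoint of the projected class means, and the names `s_w`, `threshold`, and `accuracy` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_data = load_iris()

# Columns 0 and 1 of the iris data are sepal length and sepal width
features = iris_data.data[:, :2]
target = iris_data.target

# Assumption: keep only classes 0 and 1 so the two-class discriminant applies
mask = target < 2
x, y = features[mask], target[mask]

# Class means
mean_0 = x[y == 0].mean(axis=0)
mean_1 = x[y == 1].mean(axis=0)

# Within-class scatter matrix S_W
centered_0 = x[y == 0] - mean_0
centered_1 = x[y == 1] - mean_1
s_w = centered_0.T @ centered_0 + centered_1.T @ centered_1

# Fisher direction: solve S_W w = (m_1 - m_0) instead of inverting S_W
w = np.linalg.solve(s_w, mean_1 - mean_0)

# Assumption: threshold at the midpoint between the projected class means
threshold = w @ (mean_0 + mean_1) / 2

y_pred = (x @ w > threshold).astype(int)
accuracy = (y_pred == y).mean()

print("Fisher direction:", w)
print(f"Training accuracy: {accuracy:.3f}")
```

Handling all three iris classes would need a multi-class extension, for example one discriminant per pair of classes.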

In [18]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris_data = load_iris()
iris_df = pd.DataFrame(iris_data.data,columns=iris_data.feature_names)
iris_df['target'] = iris_data.target

x = iris_df[['sepal width (cm)', 'sepal length (cm)']].values
y = iris_df['target'].values.reshape(-1, 1)

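# np.size counts every element of x (150 rows x 2 features = 300),
# so both features are pooled into a single one-dimensional regression.
dataset_size = np.size(x)

mean_x, mean_y = np.mean(x), np.mean(y)

# Sums for the least-squares slope; y broadcasts across both feature columns
SS_xy = np.sum(y * x) - dataset_size * mean_y * mean_x
SS_xx = np.sum(x * x) - dataset_size * mean_x * mean_x

a = SS_xy / SS_xx        # shared slope
b = mean_y - a * mean_x  # shared intercept

# The same line is applied to each feature column separately
y_pred = a * x + b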

df = pd.DataFrame({
    "Sepal Width": x[:, 0],
    "Sepal Length": x[:, 1],
    "Target": y.flatten(),
    "Predicted Value 1": y_pred[:, 0],
    "Predicted Value 2": y_pred[:, 1]  # not verified
})

df


Unnamed: 0,Sepal Width,Sepal Length,Target,Predicted Value 1,Predicted Value 2
0,3.5,5.1,0,0.924785,1.051418
1,3.0,4.9,0,0.885212,1.035589
2,3.2,4.7,0,0.901042,1.019760
3,3.1,4.6,0,0.893127,1.011845
4,3.6,5.0,0,0.932700,1.043504
...,...,...,...,...,...
145,3.0,6.7,2,0.885212,1.178051
146,2.5,6.3,2,0.845640,1.146393
147,3.0,6.5,2,0.885212,1.162222
148,3.4,6.2,2,0.916871,1.138479
