## 1. Solution of Ridge Regression and Lasso

### Ridge Regression Objective:
The ridge regression problem is

$$
\min_{\mathbf{w}, w_0} \left( \frac{1}{2} \|\mathbf{X}\mathbf{w} + w_0\mathbf{1} - \mathbf{y}\|_2^2 + \lambda \|\mathbf{w}\|_2^2 \right)
$$

Here $\mathbf{1}$ is the all-ones vector, so $w_0$ is an intercept added to every prediction; it is not penalized.

### Lasso Regression Objective:

The lasso regression problem uses the same loss with an $L_1$ penalty on the weights:

$$
\min_{\mathbf{w}, w_0} \left( \frac{1}{2} \|\mathbf{X}\mathbf{w} + w_0\mathbf{1} - \mathbf{y}\|_2^2 + \lambda \|\mathbf{w}\|_1 \right)
$$
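
As a sanity check on the ridge objective above, the following sketch solves it in closed form on small synthetic data and verifies that the gradient vanishes at the solution. The data, the seed, and the helper name `ridge_closed_form` are assumptions made for illustration and are not part of the original notebook. Because the squared loss carries a factor of 1/2 while the penalty does not, the normal equations contain $2\lambda$ rather than $\lambda$.

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """Minimize 0.5 * ||X w + w0 - y||^2 + lam * ||w||^2 (hypothetical helper)."""
    # The unpenalized intercept is handled by centering X and y.
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    d = X.shape[1]
    # Stationarity in w: Xc^T (Xc w - yc) + 2 * lam * w = 0
    w = np.linalg.solve(Xc.T @ Xc + 2.0 * lam * np.eye(d), Xc.T @ yc)
    # Optimal intercept given w
    w0 = y_mean - x_mean @ w
    return w, w0

rng = np.random.default_rng(0)  # synthetic data, for illustration only
X = rng.normal(size=(50, 5))
y = X @ np.array([1.0, -2.0, 0.0, 0.5, 3.0]) + 0.1 * rng.normal(size=50)

lam = 0.1
w, w0 = ridge_closed_form(X, y, lam)

# Gradient of the objective at the solution; both parts should be ~0 up to round-off.
residual = X @ w + w0 - y
grad_w = X.T @ residual + 2.0 * lam * w
grad_w0 = residual.sum()
print(np.abs(grad_w).max(), abs(grad_w0))
```

scikit-learn's `Ridge` minimizes $\|\mathbf{y} - \mathbf{X}\mathbf{w}\|_2^2 + \alpha \|\mathbf{w}\|_2^2$ with no factor of 1/2, so under this formulation the matching setting would be `alpha = 2 * lam`.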

In [1]:
import numpy as np
from sklearn.linear_model import Ridge, Lasso
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_svmlight_files
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

In [7]:
# Load the E2006-tfidf dataset
X_train, y_train, X_test, y_test = load_svmlight_files(('E2006.train', 'E2006.test'))  # Adjust path as needed
num_samples = X_train.shape[0] #should be 16087
print(num_samples)

16087


In [9]:
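lamda = 0.1

# Ridge regression.
# sklearn's Ridge minimizes ||y - Xw||^2 + alpha * ||w||^2 (no 1/2 on the loss),
# so the objective above with penalty lambda corresponds to alpha = 2 * lambda.
ridge = Ridge(alpha=2 * lamda)
ridge.fit(X_train, y_train)
ridge_w = ridge.coef_

# Lasso regression.
# sklearn's Lasso minimizes (1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1,
# so the objective above corresponds to alpha = lambda / n.
lasso = Lasso(alpha=lamda / len(y_train))
lasso.fit(X_train, y_train)
lasso_w = lasso.coef_

# Number of nonzero coefficients in each weight vector
ridge_nonzero = np.count_nonzero(ridge_w)
lasso_nonzero = np.count_nonzero(lasso_w)

print(f'Nonzero coefficients in Ridge: {ridge_nonzero}')
print(f'Nonzero coefficients in Lasso: {lasso_nonzero}')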

Nonzero coefficients in Ridge: 150348
Nonzero coefficients in Lasso: 102


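### Observations:
- Ridge regression shrinks the coefficients toward zero but does not set them exactly to zero: 150348 coefficients remain nonzero.
- Lasso regression, because of its $L_1$-norm penalty, sets most coefficients exactly to zero and yields a sparse solution with only 102 nonzero coefficients.
- The difference comes from the shape of the two penalties. The gradient of the $L_2$ penalty vanishes at zero, so it shrinks large coefficients proportionally without eliminating them. The $L_1$ penalty pulls every coefficient toward zero with constant strength and is not differentiable at zero, so small coefficients are driven exactly to zero.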
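
The contrast between the two penalties is easiest to see in one dimension, where both problems have closed-form solutions. The sketch below is an illustration under that simplifying assumption (a single coefficient whose unregularized estimate is `z`); it is not the solver scikit-learn uses, and the function names and toy values are hypothetical.

```python
import numpy as np

def lasso_1d(z, lam):
    """argmin_w 0.5 * (w - z)**2 + lam * |w|  (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def ridge_1d(z, lam):
    """argmin_w 0.5 * (w - z)**2 + lam * w**2  (proportional shrinkage)."""
    return z / (1.0 + 2.0 * lam)

z = np.array([-2.0, -0.05, 0.0, 0.08, 1.5])  # toy unregularized estimates
lam = 0.1

w_lasso = lasso_1d(z, lam)
w_ridge = ridge_1d(z, lam)

print("lasso:", w_lasso)
print("ridge:", w_ridge)
print("nonzero (lasso, ridge):",
      np.count_nonzero(w_lasso), np.count_nonzero(w_ridge))
```

Every entry with $|z| \le \lambda$ becomes exactly zero under the $L_1$ penalty, whereas the $L_2$ penalty only rescales each entry, so an entry is zero after shrinkage only if it was zero before.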

## 2. Training and Testing Error with Different Values of $\lambda$

### (i) RMSE Calculation:

To compute the RMSE on both the training and test data for each value of $\lambda$:

In [None]:
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

# Lambda values to sweep. Note that lambda = 0 means no regularization;
# sklearn warns that Lasso with alpha=0 may not converge well.
lambdas = [0, 1e-5, 1e-3, 1e-2, 0.1, 1, 10, 100, 1e3, 1e4, 1e5, 1e6]
ridge_errors_train = []
ridge_errors_test = []
lasso_errors_train = []
lasso_errors_test = []

n_train = len(y_train)

for lmbda in lambdas:
    # Ridge regression: sklearn's loss has no 1/2 factor, so alpha = 2 * lambda
    ridge = Ridge(alpha=2 * lmbda)
    ridge.fit(X_train, y_train)
    ridge_train_rmse = np.sqrt(mean_squared_error(y_train, ridge.predict(X_train)))
    ridge_test_rmse = np.sqrt(mean_squared_error(y_test, ridge.predict(X_test)))
    ridge_errors_train.append(ridge_train_rmse)
    ridge_errors_test.append(ridge_test_rmse)

    # Lasso regression: sklearn's loss is scaled by 1 / (2n), so alpha = lambda / n
    lasso = Lasso(alpha=lmbda / n_train)
    lasso.fit(X_train, y_train)
    lasso_train_rmse = np.sqrt(mean_squared_error(y_train, lasso.predict(X_train)))
    lasso_test_rmse = np.sqrt(mean_squared_error(y_test, lasso.predict(X_test)))
    lasso_errors_train.append(lasso_train_rmse)
    lasso_errors_test.append(lasso_test_rmse)

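# Plot train and test RMSE for Ridge and Lasso against lambda
plt.figure(figsize=(10, 6))
plt.plot(lambdas, ridge_errors_train, marker='o', label='Ridge Train RMSE')
plt.plot(lambdas, ridge_errors_test, marker='o', label='Ridge Test RMSE')
plt.plot(lambdas, lasso_errors_train, marker='s', label='Lasso Train RMSE')
plt.plot(lambdas, lasso_errors_test, marker='s', label='Lasso Test RMSE')
# lambdas contains 0, which a pure log axis cannot display; a symlog axis
# keeps that point (the linthresh keyword requires Matplotlib >= 3.3).
plt.xscale('symlog', linthresh=1e-5)
plt.xlabel('Lambda')
plt.ylabel('RMSE')
plt.legend()
plt.title('RMSE vs Lambda')
plt.show()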
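
For reference, the RMSE computed in the loop above is $\sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$. A minimal NumPy sketch of that formula follows; it uses toy numbers rather than the E2006 data, and the helper name `rmse` is hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy values for illustration: the only error is 2 on the last sample,
# so the mean squared error is 4 / 3.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.0, 5.0]
value = rmse(y_true, y_pred)
print(value)
```

This should agree with `np.sqrt(mean_squared_error(y_true, y_pred))` as used in the loop.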