# Lasso Regression

Write code for learning using Lasso Regression and give your conclusions. Use the dataset LassoReg_data.npz for this question. The file contains two matrices of size 120\*1000 and 120\*1, corresponding to 120 instance points with 1000 dimensional features and its targets.

 Split the data into train-validation-test on 50-25-25 ratio. Learn the best model using Lasso Regression (use projected gradient descent, the projection oracle code is given for your convenience). Try different learning rate parameters and L1 norm ball constraint radii. Choose an appropriate learning rate that allows for convergence of the training loss.  Train the models for different L1 norm radius parameters. Choose the L1 norm constraint that works best on the validation set. 

Report the test error of the learned model thus chosen. Also report the indices and weight values corresponding to the top 10 values of the weight vector (which is 1000 dimensional). 

In [1]:
import numpy as np
import matplotlib.pyplot as plt

In [2]:
def projection_oracle_l1(w, l1_norm):
    # first remember signs and store them. Modify w so that it is all positive then.
    signs = np.sign(w)
    w = w*signs
    # project this modified w onto the simplex in first orthant.
    d=len(w)
    # if w is already in l1 norm ball return as it is.
    if np.sum(w)<=l1_norm:
        return w*signs
    
    # using 1e-7 as zero here to avoid floating point issues
    for i in range(d):
        w_next = w+0
        w_next[w>1e-7] = w[w>1e-7] - np.min(w[w>1e-7])
        if np.sum(w_next)<=l1_norm:
            w = ((l1_norm - np.sum(w_next))*w + (np.sum(w) - l1_norm)*w_next)/(np.sum(w)-np.sum(w_next))
            return w*signs
        else:
            w=w_next



In [3]:
data = np.load("Data/LassoReg_data.npz")
X, Y= data["arr_0"], data["arr_1"]

X_train = X[:60, :]
Y_train = Y[:60]

X_val = X[60:90, :]
Y_val = Y[60:90]

X_test = X[90:120, :]
Y_test = Y[90:120]

def mean_squared_error(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def lasso_regression(X_train, Y_train, X_val, Y_val, learning_rate, l1_norm, num_iter=100, tolerance=1e-4):
    # Initialize weights
    w = np.zeros(X_train.shape[1])
    for _ in range(num_iter):
        # Compute gradient
        grad = -2 * X_train.T.dot(Y_train - X_train.dot(w))
        # Update weights using projected gradient descent
        w_new = w - learning_rate * grad
        w = projection_oracle_l1(w_new, l1_norm)
        # Evaluate validation error
        val_loss = mean_squared_error(Y_val, X_val.dot(w))
        # Check convergence
        if np.linalg.norm(w - w_new) < tolerance:
            break
    return w


In [4]:
# Write the code for the gradient descent routine on the training set mean square error loss function.
# Also write code for doing validation of the learned model using the validation set

# Hyperparameters
learning_rates = [0.001, 0.01, 0.01, 0.1]
l1_norms = [1, 5, 10]

# Grid search for best hyperparameters
best_loss = float('inf')
best_lr = None
best_l1_norm = None

for lr in learning_rates:
    for l1_norm in l1_norms:
        # Train model
        weights = lasso_regression(X_train, Y_train, X_val, Y_val, learning_rate=lr, l1_norm=l1_norm)
        # Evaluate validation error
        val_loss = mean_squared_error(Y_val, X_val.dot(weights))
        # Update best hyperparameters if necessary
        if val_loss < best_loss:
            best_loss = val_loss
            best_lr = lr
            best_l1_norm = l1_norm
print(f"Best learning rate: {best_lr}")
print(f"Best l1 norm: {best_l1_norm}")
best_weights = lasso_regression(X_train, Y_train, X_val, Y_val, learning_rate=best_lr, l1_norm=best_l1_norm)
test_loss = mean_squared_error(Y_test, X_test.dot(best_weights))

# Report test error
print("Test Mean Squared Error:", test_loss)

# Report top 10 indices and weight values
top_indices = np.argsort(best_weights)[-10:]
top_weights = best_weights[top_indices]

print("Top 10 indices:", top_indices)
print("Corresponding weight values:", top_weights)

Best learning rate: 0.001
Best l1 norm: 1
Test Mean Squared Error: 0.10451966512105049
Top 10 indices: [339 340 341 342 343 332  81 686 390 107]
Corresponding weight values: [ 0.00000000e+00 -0.00000000e+00  0.00000000e+00 -0.00000000e+00
  0.00000000e+00 -0.00000000e+00  3.20391903e-08  9.87846601e-08
  1.38372616e-01  8.61626690e-01]


**Observations**

Best learning rate: 0.001 (For number of iterations = 100)

Best l1 norm: 1

Test Mean Squared Error: 0.10451966512105049

Top 10 indices: [339 340 341 342 343 332  81 686 390 107]

Corresponding weight values: [ 0.00000000e+00    -0.00000000e+00     0.00000000e+00 -0.00000000e+00   0.00000000e+00     -0.00000000e+00     3.20391903e-08             9.87846601e-08    1.38372616e-01     8.61626690e-01]



