# NNTI Assignment 8 (Q8.3) **[Parameter Norm Penalties]**

Name 1: <br>
Student id 1: <br>
Email 1: <br>

Name 2: <br>
Student id 2:  <br>
Email 2:  <br>

Name 3: <br>
Student id 3:  <br>
Email 3: <br>

**Instructions:** Read each question carefully. <br/>
Make sure you comment your code appropriately wherever required. Your final submission should contain the completed notebook and the respective files for any additional exercises. There is no need to resubmit the data files if they are provided separately. <br>


Upload the zipped folder to CMS. Please follow the naming convention **Name1_id1_Name2_id2_Name3_id3.zip**. Only one member of the group should make the submission.


**NOTE**:
1. Pay close attention to the function docstrings; they give you clues on what to include in your implementation.

2. Set the `random_seed = 42` for any part of your code that requires randomness.

3. You are free to use the `scikit-learn` library for implementing the regression models.

In this exercise, you will work with the provided `regularization_dataset.csv` dataset. In this dataset, columns `x0` through `x47` correspond to the input and column `y` corresponds to the output. You will fit linear regression models with different norm penalties and analyze their regularizing behaviour.

The basic imports are provided to you. You are free to import additional components, **but add comments to explain why you added them**.

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

## a. Data setup [0.5 points]
First things first, let's load the dataset. Complete the function `load_data()` to load the data from the supplied CSV file.
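
The steps named in the docstring (separate features and target, normalize, split) can be sketched on a small synthetic table. This is only an illustration under stated assumptions: the `demo` DataFrame, the 80/20 split, and the use of `StandardScaler` fitted on the training rows only are choices made for this sketch, not requirements of the assignment.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the CSV: three feature columns and a target column `y`.
rng = np.random.default_rng(42)
demo = pd.DataFrame(rng.normal(size=(100, 3)), columns=["x0", "x1", "x2"])
demo["y"] = 2.0 * demo["x0"] - demo["x2"] + rng.normal(scale=0.1, size=100)

# Separate the features from the target.
X = demo.drop(columns=["y"]).to_numpy()
y = demo["y"].to_numpy()

# Split first (assumed 80/20), then fit the scaler on the training rows only
# so that no statistics from the test rows leak into training.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

print(X_tr.shape, X_te.shape)
```

After scaling, each training column has zero mean and unit variance; the test columns are only approximately standardized because they reuse the training statistics.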



In [2]:
def load_data(file_path):
    """
    Loads a dataset from the specified CSV file, separates the features and target,
    normalizes the features, and splits the data into training and testing sets.

    Parameters:
    - file_path (str): The path to the CSV file containing the dataset.

    Returns:
    - X_train (ndarray): The training set features after normalization.
    - X_test (ndarray): The testing set features after normalization.
    - y_train (ndarray): The training set target values.
    - y_test (ndarray): The testing set target values.
    """
    pass

In [None]:
X_train, X_test, y_train, y_test = load_data('regularization_dataset.csv')  # Change the file path as required

## b. Regression Model setup [0.5 points]

In this part, you will build and evaluate a Linear Regression model trained on the data without any regularization. Complete the function `train_lr()` to train the model, evaluate its performance using Mean Squared Error (MSE), and extract the model's coefficients.

**Tasks to implement**:
- Initialize the Linear Regression model.
- Train the model on `X_train` and `y_train`.
- Make predictions on the test dataset `X_test`.
- Calculate the Mean Squared Error (MSE) between the predicted and actual values `y_test`.
- Extract the coefficients of the trained model.
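
The fit, predict, score, and extract sequence listed above might look as follows on toy data. The arrays `X_demo` and `y_demo` and the 40/20 split are assumptions made for this sketch; the target is noise-free, so the fitted coefficients should recover the generating weights.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Hypothetical noise-free data: y = 3*x0 - 1.5*x1 + 0.5
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(60, 2))
y_demo = 3.0 * X_demo[:, 0] - 1.5 * X_demo[:, 1] + 0.5
X_fit, X_eval = X_demo[:40], X_demo[40:]
y_fit, y_eval = y_demo[:40], y_demo[40:]

model = LinearRegression()              # initialize
model.fit(X_fit, y_fit)                 # train
pred = model.predict(X_eval)            # predict on held-out rows
mse = mean_squared_error(y_eval, pred)  # evaluate
coeffs = model.coef_.tolist()           # coefficients as a plain list

print(mse, coeffs, model.intercept_)
```

Note that `coef_` does not include the intercept, which scikit-learn stores separately in `intercept_`.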

In [None]:
def train_lr(X_train, X_test, y_train, y_test):
    """
    This is a wrapper function for training a linear regression model.

    Parameters:
    - X_train: Features for training
    - X_test: Features for testing
    - y_train: Target values for training
    - y_test: Target values for testing

    Returns:
    - model: Trained model
    - mse: the mean-squared error at test time
    - coeffs: List of coefficients from the trained model
    """
    pass

## c. Regularized Regression Models [0.5 points]

In this part, you will build and evaluate regularized linear regression models using Ridge, Lasso, and ElasticNet. These models add a penalty term to the loss function to regularize the coefficients, helping to prevent overfitting. You will complete the function `train_lr_reg()` to train models with different regularization types and evaluate their performance.
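
For reference, the three penalties correspond to the following objectives as documented for scikit-learn's `Ridge`, `Lasso`, and `ElasticNet` estimators. The symbols are introduced here for illustration: $w$ is the coefficient vector, $n$ the number of training samples, and $\rho$ the `l1_ratio` parameter of `ElasticNet` (0.5 by default). The intercept is omitted for brevity.

$$
\begin{aligned}
\text{Ridge:} \quad & \min_w \; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2 \\
\text{Lasso:} \quad & \min_w \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1 \\
\text{ElasticNet:} \quad & \min_w \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \rho \lVert w \rVert_1 + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
\end{aligned}
$$

Ridge's data term is not divided by $2n$, so the same numerical value of $\alpha$ does not represent the same penalty strength across the three models.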

**Tasks to Implement**:

- Based on the `reg` parameter, initialize the appropriate regression model:
  - Use standard linear regression when `reg` is `None`.
  - Use Ridge when `reg` is `'ridge'`.
  - Use Lasso when `reg` is `'lasso'`.
  - Use ElasticNet when `reg` is `'elasticnet'`.
  - Raise a `ValueError` for invalid values of `reg`.
- Train the selected model on `X_train` and `y_train`.
- Make predictions on the test dataset `X_test`.
- Calculate the Mean Squared Error (MSE) between the predicted and actual values (`y_test`).
- Extract the coefficients of the trained model.
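
One possible way to organize the model selection step is a small lookup table from the `reg` string to an estimator class. The helper name `make_model` is hypothetical and only covers initialization; training and evaluation would follow as in the unregularized case.

```python
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge


def make_model(reg=None, alpha=1.0):
    """Hypothetical helper: return an untrained estimator for the given `reg`."""
    if reg is None:
        # Plain least squares takes no regularization strength.
        return LinearRegression()
    constructors = {"ridge": Ridge, "lasso": Lasso, "elasticnet": ElasticNet}
    if reg not in constructors:
        raise ValueError(f"Unknown regularization type: {reg!r}")
    return constructors[reg](alpha=alpha)


print(type(make_model("lasso", alpha=0.1)).__name__)  # prints "Lasso"
```

A chain of `if`/`elif` branches works equally well; the dictionary simply keeps the list of valid names in one place.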

In [None]:
def train_lr_reg(X_train, X_test, y_train, y_test, reg=None, alpha=1.0):
    """
    This is a wrapper function for training a linear regression model with optional regularization.

    Parameters:
    - X_train: Features for training
    - X_test: Features for testing
    - y_train: Target values for training
    - y_test: Target values for testing
    - reg: Regularization type ('ridge', 'lasso', 'elasticnet', or None for standard Linear Regression)
    - alpha: Regularization strength (applicable for Ridge, Lasso, and ElasticNet)

    Returns:
    - model: Trained model
    - mse: the mean-squared error at test time
    - coeffs: List of coefficients from the trained model
    """
    pass


## d. Hyperparameter Search [0.5 points]

Here, you will perform a grid search over a given range of values for the regularization parameter `alpha` to find the best value for each type of regularization (Ridge, Lasso, and ElasticNet). The goal is to minimize the Mean Squared Error (MSE) of the model.

For each regularization type (ridge, lasso, and elasticnet), evaluate the model performance over different values of alpha. For each model, you should report:

  - The best value of alpha that minimizes the MSE.
  - The corresponding MSE at the best alpha.

You will then compare the MSE for each regularized model with the MSE obtained from the non-regularized linear regression model (i.e., without any regularization). This will help you assess the impact of regularization on the model performance.
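
The search itself amounts to a loop over candidate values that keeps the one with the lowest MSE. The sketch below shows this for Ridge on synthetic data; `X_demo`, `true_w`, and `demo_alphas` are assumptions for illustration, while the dictionary keys `'Best Alpha'` and `'MSE'` mirror those read by the printing cell in this notebook.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Hypothetical data with two irrelevant features (zero true weight).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(80, 5))
true_w = np.array([1.5, 0.0, -2.0, 0.0, 0.5])
y_demo = X_demo @ true_w + rng.normal(scale=0.3, size=80)
X_fit, X_eval = X_demo[:60], X_demo[60:]
y_fit, y_eval = y_demo[:60], y_demo[60:]

demo_alphas = [0.001, 0.01, 0.1, 1.0]
scores = {}
for a in demo_alphas:
    m = Ridge(alpha=a).fit(X_fit, y_fit)
    scores[a] = mean_squared_error(y_eval, m.predict(X_eval))

# Pick the alpha whose held-out MSE is smallest.
best_alpha = min(scores, key=scores.get)
result = {"Best Alpha": best_alpha, "MSE": scores[best_alpha]}
print(result)
```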

In [None]:
def alpha_search(X_train, X_test, y_train, y_test, alphas, reg_type):
    """
    Evaluates a specified regularization model (Ridge, Lasso, or ElasticNet) over a range of alpha values.
    The function returns the best alpha and MSE for the chosen regularization type.

    Parameters:
    - X_train: Feature matrix for training.
    - X_test: Feature matrix for testing.
    - y_train: Target values for training.
    - y_test: Target values for testing.
    - alphas: List of regularization strengths to evaluate.
    - reg_type: The type of regularization ('ridge', 'lasso', or 'elasticnet').

    Returns:
    - result: A dictionary containing the best alpha and MSE for the selected regularization type.
    """
    pass

In [None]:
## DON'T CHANGE THIS CODE BLOCK
alphas = [0.0001, 0.001, 0.01, 0.1, 1.0]

linear_model, linear_mse, linear_coeffs = train_lr(X_train, X_test, y_train, y_test)
best_ridge = alpha_search(X_train, X_test, y_train, y_test, alphas, reg_type='ridge')
best_lasso = alpha_search(X_train, X_test, y_train, y_test, alphas, reg_type='lasso')
best_elasticnet = alpha_search(X_train, X_test, y_train, y_test, alphas, reg_type='elasticnet')

# Print the results in the desired format
print("Linear Regression (No regularization):")
print(f"  MSE: {linear_mse}\n")

print("Ridge Regression:")
print(f"  Best Alpha: {best_ridge['Best Alpha']}")
print(f"  MSE: {best_ridge['MSE']}\n")

print("Lasso Regression:")
print(f"  Best Alpha: {best_lasso['Best Alpha']}")
print(f"  MSE: {best_lasso['MSE']}\n")

print("ElasticNet Regression:")
print(f"  Best Alpha: {best_elasticnet['Best Alpha']}")
print(f"  MSE: {best_elasticnet['MSE']}\n")

## e. Visualize regularization [0.5 points]

Here, you will visualize how the regularization strength parameter $\alpha$ influences both the magnitudes of the model coefficients and the model's mean squared error (MSE). The plot generated should display the following:
- **Coefficient Magnitudes**: The effect of increasing $\alpha$ on the magnitude of the coefficients for each feature.
- **Mean Squared Error (MSE)**: The impact of different $\alpha$ values on the model's performance, measured by MSE in the same way as in `alpha_search()`.


**Tasks to Implement**:

- Train the specified regularized regression model (Ridge, Lasso, or ElasticNet) using the training data `(X_train, y_train)` for different $\alpha$ values.
- Create a plot with two y-axes:
  - Left y-axis: Display the magnitude of the coefficients for each feature across different values of $\alpha$.
  - Right y-axis: Show the MSE for each value of $\alpha$, indicating how the regularization strength affects model performance.

- For each feature, plot its coefficient magnitude as a function of $\alpha$ (**log scale**) on the left y-axis.

- Plot the MSE for each $\alpha$ on the right y-axis.
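
A dual-axis figure of this kind can be built with Matplotlib's `Axes.twinx()`. The following is a minimal sketch on synthetic data using Lasso; the data, the variable names, and the styling choices are assumptions for illustration and not a prescribed layout.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error

# Hypothetical data: four features, one of them irrelevant.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(80, 4))
y_demo = X_demo @ np.array([2.0, 0.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=80)
X_fit, X_eval = X_demo[:60], X_demo[60:]
y_fit, y_eval = y_demo[:60], y_demo[60:]

demo_alphas = [0.001, 0.01, 0.1, 1.0]
coef_paths, mses = [], []
for a in demo_alphas:
    m = Lasso(alpha=a).fit(X_fit, y_fit)
    coef_paths.append(np.abs(m.coef_))  # magnitude of each coefficient
    mses.append(mean_squared_error(y_eval, m.predict(X_eval)))
coef_paths = np.array(coef_paths)  # shape: (n_alphas, n_features)

fig, ax_coef = plt.subplots(figsize=(7, 4))
for j in range(coef_paths.shape[1]):
    ax_coef.plot(demo_alphas, coef_paths[:, j], marker="o", label=f"x{j}")
ax_coef.set_xscale("log")  # alpha on a log scale
ax_coef.set_xlabel(r"$\alpha$ (log scale)")
ax_coef.set_ylabel("Coefficient magnitude")
ax_coef.legend(loc="upper center")

# Second y-axis sharing the same x-axis, used for the MSE curve.
ax_mse = ax_coef.twinx()
ax_mse.plot(demo_alphas, mses, color="black", linestyle="--", marker="s")
ax_mse.set_ylabel("Test MSE")
fig.tight_layout()
```

In a notebook with the inline backend the figure renders at the end of the cell; in a script, call `plt.show()` afterwards. With 48 features a legend entry per feature becomes crowded, so it may be reasonable to omit the legend there.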

In [None]:
def plot_regularization_effect(X_train, X_test, y_train, y_test, alphas, reg_type='lasso'):
    """
    This function trains a regularized regression model over a range of alpha values,
    then plots the coefficient magnitudes and MSE for each value of alpha on a dual-axis plot.

    Parameters:
    - X_train: Feature matrix for training.
    - X_test: Feature matrix for testing.
    - y_train: Target values for training.
    - y_test: Target values for testing.
    - alphas: List of regularization strengths (alphas) to evaluate.
    - reg_type: Type of regularization ('ridge', 'lasso', or 'elasticnet').

    Returns:
    - None: This function generates and shows a plot.
    """
    pass

In [None]:
plot_regularization_effect(X_train, X_test, y_train, y_test, alphas, reg_type='ridge')

In [None]:
plot_regularization_effect(X_train, X_test, y_train, y_test, alphas, reg_type='lasso')

In [None]:
plot_regularization_effect(X_train, X_test, y_train, y_test, alphas, reg_type='elasticnet')

### Answer the following questions based on your plots: [0.25 + 0.25 points]

Q1: What differences do you observe in the coefficients between lasso ($L_{1}$) and ridge ($L_{2}$) regularization?<br>
Q2: ElasticNet combines $L_{1}$ and $L_{2}$ regularization. What is its effect on the model coefficients?