# Lasso Regression

Lasso regression, short for **Least Absolute Shrinkage and Selection Operator**, is a type of linear regression that uses shrinkage. Shrinkage here means that the coefficient estimates are pulled toward zero. The lasso technique encourages simple, sparse models (i.e., models with fewer nonzero parameters). It is well suited to data with many or highly correlated features, and to cases where you want to automate parts of model selection such as variable selection/parameter elimination. Keep in mind that, among a group of strongly correlated features, lasso tends to keep one and zero out the others.

## Key Features of Lasso Regression

* **Regularization Term**: The key characteristic of lasso regression is that it adds an L1 penalty to the regression model: the sum of the absolute values of the coefficients, scaled by a regularization parameter. The cost function for lasso regression is:

  $$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2 + \alpha \sum_{j=1}^{n} |\theta_j|$$

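  where $m$ is the number of samples, $n$ is the number of features, $h_\theta(x^{(i)})$ is the model's prediction for sample $i$, and $\alpha \ge 0$ is the regularization parameter.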

* **Feature Selection**: One advantage of lasso over ridge regression is that it can produce sparse models: some coefficients become exactly zero, which removes the corresponding features from the model. Ridge shrinks coefficients toward zero but generally does not make them exactly zero. This automatic feature selection is an example of an embedded method.

* **Parameter Tuning**: The strength of the L1 penalty is determined by a parameter, typically denoted as alpha or lambda. Selecting a good value for this parameter is crucial and is typically done using cross-validation.

* **Bias-Variance Tradeoff**: Similar to ridge regression, lasso also manages the bias-variance tradeoff in model training. Increasing the regularization strength increases bias but decreases variance, potentially leading to better generalization on unseen data.

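* **Scaling**: Before applying lasso, standardize the features (for example, to zero mean and unit variance). The penalty applies the same `alpha` to every coefficient, and the size of a coefficient depends on the units of its feature, so unscaled features are penalized unevenly.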
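
The cost function above can be evaluated directly, which makes the effect of the penalty term concrete. The following is a minimal sketch, not scikit-learn's implementation: the helper name `lasso_cost`, the toy data, and the unpenalized `intercept` argument are assumptions made for illustration.

```python
import numpy as np


def lasso_cost(theta, X, y, alpha, intercept=0.0):
    """Evaluate J(theta) = (1 / 2m) * sum of squared errors + alpha * sum |theta_j|.

    Hypothetical helper for illustration; the intercept is not penalized.
    """
    m = X.shape[0]
    residuals = X @ theta + intercept - y
    data_term = (residuals @ residuals) / (2 * m)
    penalty = alpha * np.sum(np.abs(theta))
    return data_term + penalty


# Toy data (assumed): y is exactly x1 + 2 * x2, so theta = [1, 2] fits perfectly.
X_toy = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y_toy = np.array([1.0, 2.0, 3.0])

perfect_fit = lasso_cost(np.array([1.0, 2.0]), X_toy, y_toy, alpha=0.1)
sparse_fit = lasso_cost(np.array([0.0, 2.0]), X_toy, y_toy, alpha=0.1)

print("Cost with theta = [1, 2]:", perfect_fit)  # zero error, penalty 0.1 * (1 + 2)
print("Cost with theta = [0, 2]:", sparse_fit)   # some error, but a smaller penalty
```

In this toy example the perfect fit costs `3 * alpha` and the sparser vector costs `1/3 + 2 * alpha`, so for any `alpha` above 1/3 the sparser vector is the cheaper one. That trade between fit error and penalty is what drives coefficients to zero.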

## Implementation in Scikit-Learn

Lasso regression can be implemented using the `Lasso` class from scikit-learn's `linear_model` module. The example below fits it alongside plain linear regression and ridge regression for comparison:

In [1]:
from sklearn.linear_model import Lasso, Ridge, LinearRegression
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Generate some regression data
X, y = make_regression(n_samples=1000, n_features=15, noise=0.1, random_state=42)

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create the three regression models
lr = LinearRegression()
lasso = Lasso(alpha=0.2)
ridge = Ridge(alpha=1.0)

# Fit each model on the training data
lr.fit(X_train, y_train)
lasso.fit(X_train, y_train)
ridge.fit(X_train, y_train)

# Make predictions
y_pred_lr = lr.predict(X_test)
y_pred_lasso = lasso.predict(X_test)
y_pred_ridge = ridge.predict(X_test)

# Evaluate each model on the test set
print("MSE of Linear Regression:", mean_squared_error(y_test, y_pred_lr))
print("MSE of Lasso:", mean_squared_error(y_test, y_pred_lasso))
print("MSE of Ridge:", mean_squared_error(y_test, y_pred_ridge))


MSE of Linear Regression: 0.0111837651150914
MSE of Lasso: 0.3847492638484584
MSE of Ridge: 0.05090866185226863


## Code Explanation - Linear, Lasso, and Ridge Regression Comparison

In this code block, I:

1. Imported necessary models (`Lasso`, `Ridge`, `LinearRegression`) and utility functions from scikit-learn
2. Generated synthetic regression data with 1000 samples and 15 features
3. Split the data into training (80%) and testing (20%) sets
4. Created three regression models:
   - Standard Linear Regression (no regularization)
   - Lasso Regression (L1 regularization with alpha=0.2)
   - Ridge Regression (L2 regularization with alpha=1.0)
5. Trained all three models on the training data
6. Made predictions on the test set with each model
7. Calculated and printed the Mean Squared Error (MSE) to compare the performance of all three models

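On this low-noise synthetic dataset, plain linear regression has the lowest test MSE (about 0.011), followed by ridge (about 0.051) and lasso (about 0.385). The data were generated from a linear model with very little noise, so there is little variance for regularization to remove and the penalty mainly adds bias.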
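
MSE is only part of the picture; the feature-selection behaviour shows up in the coefficients. The sketch below refits the same three models on the same synthetic data and counts the coefficients that are exactly zero. It relies on `make_regression` generating only 10 informative features by default (`n_informative=10`), so 5 of the 15 features here carry no signal. The dictionary names `models` and `n_zero` are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split

# Same synthetic data and split as in the comparison above
X, y = make_regression(n_samples=1000, n_features=15, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "LinearRegression": LinearRegression(),
    "Lasso": Lasso(alpha=0.2),
    "Ridge": Ridge(alpha=1.0),
}

# Count coefficients that are exactly zero after fitting each model
n_zero = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    n_zero[name] = int(np.sum(model.coef_ == 0))
    print(f"{name}: {n_zero[name]} of {model.coef_.size} coefficients are exactly zero")
```

Only lasso is expected to report exact zeros here; linear regression and ridge typically leave small but nonzero weights on the uninformative features.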

In [2]:
import numpy as np
x = np.arange(1, 10, 0.1)
len(x)

90

## Code Explanation - Creating a Range

In this code snippet, I:

1. Imported the NumPy library
2. Created an array `x` containing values from 1.0 up to, but not including, 10 in steps of 0.1 (the last value is 9.9)
3. Checked the length of this array using `len(x)`

The same `np.arange` pattern is used to build the alpha grids in the next code blocks, there with a step of 0.01.
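
Because `np.arange` excludes the stop value and works with floating-point steps, it helps to check the endpoints of a grid before using it. The snippet below is a small sketch; `np.linspace` and `np.logspace` are shown as possible alternatives and are not used elsewhere in this notebook.

```python
import numpy as np

# arange: start included, stop excluded
x = np.arange(1, 10, 0.1)
print(len(x), x[0], round(x[-1], 10))

# linspace: both endpoints included, number of points given explicitly
x_lin = np.linspace(1, 10, 91)
print(len(x_lin), x_lin[0], x_lin[-1])

# logspace: evenly spaced exponents, handy when a parameter spans orders of magnitude
alphas_log = np.logspace(-3, 1, 5)
print(alphas_log)
```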

In [3]:
%%time
# Tune alpha with cross-validated grid search
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import Lasso, Ridge
import numpy as np

# Create a Lasso regression object
lasso = Lasso()

# Create a dictionary for the grid search key and values
param_grid = {'alpha': np.arange(1, 10, 0.01)}

# Use grid search to find the best value for alpha
lasso_cv = GridSearchCV(lasso, param_grid, cv=10, 
            # n_jobs=-2
            )

# Fit the model
lasso_cv.fit(X, y)

# Print the tuned parameters and score
print("Tuned Lasso Regression Parameters: {}".format(lasso_cv.best_params_))
print("Best score is {}".format(lasso_cv.best_score_))

# Create a Ridge regression object
ridge = Ridge()

# Create a dictionary for the grid search key and values
param_grid = {'alpha': np.arange(1, 10, 0.01)}

# Use grid search to find the best value for alpha
ridge_cv = GridSearchCV(ridge, param_grid, cv=10, 
                # n_jobs=-2
                )

# Fit the model
ridge_cv.fit(X, y)

# Print the tuned parameters and score
print("Tuned Ridge Regression Parameters: {}".format(ridge_cv.best_params_))
print("Best score is {}".format(ridge_cv.best_score_))


Tuned Lasso Regression Parameters: {'alpha': np.float64(1.0)}
Best score is 0.9995685234915115
Tuned Ridge Regression Parameters: {'alpha': np.float64(1.0)}
Best score is 0.9999981195099323
CPU times: total: 48.1 s
Wall time: 49.4 s


## Code Explanation - Grid Search for Hyperparameter Tuning

In this code block, I used Grid Search Cross-Validation to find the optimal alpha value for both Lasso and Ridge regression:

1. Imported necessary tools for cross-validation and model selection
2. Created Lasso and Ridge regression objects (without specifying alpha)
3. Defined a parameter grid of alpha values from 1.00 to 9.99 in steps of 0.01 (`np.arange` excludes the endpoint 10)
4. Set up GridSearchCV with 10-fold cross-validation for both models
5. Fit both searches on the entire dataset (`X`, `y`), not just the training split
6. Printed the best alpha and its score for each model (the mean cross-validated R², the default scorer for regressors)

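Grid search evaluates every candidate in the grid, here with 10 fits per candidate, so I used the `%%time` magic command to measure how long this computationally intensive process takes. Note that both searches selected alpha = 1.0, the smallest value in the grid. A best value at the edge of the grid suggests the optimum may lie outside it, so extending the grid below 1 would be worth trying.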
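
Since the search above selects the smallest alpha in its grid, one possible follow-up is a coarser, log-spaced grid that reaches below 1. The following is a sketch under assumptions that are not in the original notebook: the grid bounds (1e-4 to 10), the 5-fold CV, the `StandardScaler` pipeline, and fitting on the training split only are all illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Same synthetic data as before
X, y = make_regression(n_samples=1000, n_features=15, noise=0.1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scaling lives inside the pipeline so it is refit on each training fold
pipeline = make_pipeline(StandardScaler(), Lasso(max_iter=10_000))

# Assumed grid: 20 log-spaced alphas between 1e-4 and 10
param_grid = {"lasso__alpha": np.logspace(-4, 1, 20)}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_alpha = search.best_params_["lasso__alpha"]
print("Best alpha:", best_alpha)
print("Mean cross-validated R^2:", search.best_score_)
print("R^2 on the held-out test set:", search.score(X_test, y_test))
```

Keeping the test split out of the search leaves it available for an honest final check, and the log spacing covers several orders of magnitude with far fewer fits than a 0.01-step linear grid.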

In [4]:
%%time
# Tune alpha with cross-validated randomized search
from sklearn.model_selection import RandomizedSearchCV
from sklearn.linear_model import Lasso, Ridge
import numpy as np

# Create a Lasso regression object
lasso = Lasso()

# Create a dictionary for the grid search key and values
param_grid = {'alpha': np.arange(1, 10, 0.01)}

# Use randomized search to find a good value for alpha
lasso_cv = RandomizedSearchCV(lasso, param_grid, cv=10,
            n_jobs=-2
            )

# Fit the model
lasso_cv.fit(X, y)

# Print the tuned parameters and score
print("Tuned Lasso Regression Parameters: {}".format(lasso_cv.best_params_))
print("Best score is {}".format(lasso_cv.best_score_))

# Create a Ridge regression object
ridge = Ridge()

# Create a dictionary for the grid search key and values
param_grid = {'alpha': np.arange(1, 10, 0.01)}

# Use randomized search to find a good value for alpha
ridge_cv = RandomizedSearchCV(ridge, param_grid, cv=10, 
                n_jobs=-2
                )

# Fit the model
ridge_cv.fit(X, y)

# Print the tuned parameters and score
print("Tuned Ridge Regression Parameters: {}".format(ridge_cv.best_params_))
print("Best score is {}".format(ridge_cv.best_score_))

Tuned Lasso Regression Parameters: {'alpha': np.float64(1.02)}
Best score is 0.9995510922555066
Tuned Ridge Regression Parameters: {'alpha': np.float64(1.9600000000000009)}
Best score is 0.9999939557904197
CPU times: total: 422 ms
Wall time: 5.96 s


## Code Explanation - Randomized Search for Faster Hyperparameter Tuning

This code block performs the same hyperparameter tuning as the previous one, but using RandomizedSearchCV instead of GridSearchCV:

1. Set up the same parameter grid for alpha values (1.00 to 9.99 with 0.01 step)
2. Used RandomizedSearchCV instead of GridSearchCV to randomly sample candidates from that grid (`n_iter` is left at its default of 10)
3. Enabled parallel processing with `n_jobs=-2` (using all but one CPU core)
4. Performed 10-fold cross-validation for both Lasso and Ridge regression
5. Printed the best parameters and scores

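RandomizedSearchCV is cheaper than GridSearchCV because it evaluates only a fixed number of sampled candidates (`n_iter`, 10 by default) instead of every combination. That accounts for most of the drop in wall time from about 49 s to about 6 s here, although this run also used parallel workers, so the two timings are not a like-for-like comparison. Because no `random_state` was set, the selected alphas can differ between runs. Randomized search is particularly useful for large parameter spaces, where it offers a good trade-off between computation time and optimization quality.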
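
The number of sampled candidates and the random seed can be set explicitly, and the candidates can be drawn from a continuous distribution instead of a fixed list. A minimal sketch, with assumed values for `n_iter`, `cv`, `random_state`, and the `loguniform(1e-4, 10)` range, none of which come from the original notebook:

```python
from scipy.stats import loguniform
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import RandomizedSearchCV

# Same synthetic data as before
X, y = make_regression(n_samples=1000, n_features=15, noise=0.1, random_state=42)

# Assumed search space: alpha drawn log-uniformly between 1e-4 and 10
param_distributions = {"alpha": loguniform(1e-4, 10)}

random_search = RandomizedSearchCV(
    Lasso(max_iter=10_000),
    param_distributions,
    n_iter=20,        # number of sampled candidates (the default is 10)
    cv=5,
    random_state=42,  # makes the sampled candidates reproducible
)
random_search.fit(X, y)

print("Candidates evaluated:", len(random_search.cv_results_["params"]))
print("Best alpha:", random_search.best_params_["alpha"])
print("Mean cross-validated R^2:", random_search.best_score_)
```

Fixing `random_state` makes the reported alpha repeatable across runs, which is useful when comparing against a grid search.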

# About the Author

<div style="background-color: #f8f9fa; border-left: 5px solid #28a745; padding: 20px; margin-bottom: 20px; border-radius: 5px;">
  <h2 style="color: #28a745; margin-top: 0; font-family: 'Poppins', sans-serif;">Muhammad Atif Latif</h2>
  <p style="font-size: 16px; color: #495057;">Data Scientist & Machine Learning Engineer</p>
  
  <p style="font-size: 15px; color: #6c757d; margin-top: 15px;">
    Passionate about building AI solutions that solve real-world problems. Specialized in machine learning, 
    deep learning, and data analytics with experience implementing production-ready models.
  </p>
</div>
