# Lasso (L1) Regularization

L1 regularization, also known as Lasso (Least Absolute Shrinkage and Selection Operator) regularization, is a technique used in statistical models such as linear regression to prevent overfitting and to perform feature selection. It adds a penalty proportional to the sum of the absolute values of the coefficients to the loss function. The penalty shrinks the coefficients and can set some of them exactly to zero, which removes the corresponding features from the model. With high-dimensional data this often yields a sparse model, meaning that only a subset of the features has non-zero coefficients and contributes to the prediction.

### Mathematical Representation

In a linear regression model, the ordinary least squares (OLS) method seeks to minimize the residual sum of squares (RSS) to find the best-fitting line. L1 regularization adds a penalty term to the RSS, leading to the following optimization problem:

\[
\text{Minimize: } RSS + \lambda \sum_{j=1}^{p} |\beta_j|
\]

where:
- \(RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2\) is the residual sum of squares over the \(n\) training samples,
- \(\beta_j\) are the coefficients of the model,
- \(p\) is the number of features,
- \(\lambda\) is a non-negative regularization parameter that controls the strength of the penalty. As \(\lambda\) increases, the penalty for having large coefficients increases, which can drive some coefficients to zero.
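
To make the objective concrete, here is a minimal NumPy sketch that evaluates RSS plus the L1 penalty for a given coefficient vector. The function name lasso_objective and the toy numbers are made up for illustration, and the model is assumed to have no intercept.

```python
import numpy as np

def lasso_objective(X, y, beta, lam):
    """Return RSS + lam * sum(|beta_j|) for a linear model without intercept."""
    residuals = y - X @ beta
    rss = np.sum(residuals ** 2)
    l1_penalty = lam * np.sum(np.abs(beta))
    return rss + l1_penalty

# Tiny hand-checkable example (made-up numbers): beta fits y exactly, so RSS = 0
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
beta = np.array([1.0, 2.0])

print(lasso_objective(X, y, beta, lam=0.0))  # 0.0: RSS only
print(lasso_objective(X, y, beta, lam=0.5))  # 1.5: 0 + 0.5 * (|1| + |2|)
```

Even though these coefficients fit the data perfectly, the objective is non-zero once \(\lambda > 0\), so the minimizer trades a little fit for smaller coefficients.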

### Effects of L1 Regularization

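- **Feature Selection**: By driving some coefficients to zero, L1 regularization performs automatic feature selection, keeping features that appear relevant and discarding the rest. When several features are strongly correlated, it tends to keep one of them and zero out the others, so a zero coefficient does not by itself prove that a feature is irrelevant.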
- **Sparsity**: The resulting model is sparse, which means it uses only a subset of all the features available, making the model simpler and potentially easier to interpret.
- **Prevention of Overfitting**: By adding a penalty on the size of the coefficients, L1 regularization helps to reduce the model's complexity, which can prevent overfitting to the training data. This can improve the model's generalization to new, unseen data.
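
One way to see why the L1 penalty produces exact zeros is the soft-thresholding operator. Under the simplifying assumption that the feature columns are orthonormal, minimizing \(RSS + \lambda \sum_j |\beta_j|\) amounts to applying this operator to each OLS coefficient with threshold \(\lambda / 2\). The sketch below uses made-up coefficient values and a hypothetical helper name, soft_threshold.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink each entry of z toward zero by t; entries with |z| <= t become zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Made-up OLS estimates: two large coefficients, two small ones
ols_coefs = np.array([3.0, -0.4, 0.2, -2.5])

# Threshold t = lambda / 2 under the orthonormal-design assumption (lambda = 1 here)
shrunk = soft_threshold(ols_coefs, t=0.5)
print(shrunk)
print("Non-zero coefficients:", np.count_nonzero(shrunk))
```

The two large coefficients are shrunk by 0.5 and the two small ones are set to zero, which shows both the shrinkage and the selection effect in one step. For general, correlated features there is no closed form, and solvers typically apply this operator repeatedly, one coordinate at a time.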

### Choosing \(\lambda\)

The choice of \(\lambda\) is critical in L1 regularization:
- If \(\lambda = 0\), the penalty term has no effect, and the solution is equivalent to the OLS solution.
- As \(\lambda\) increases, more coefficients tend to be set to zero, leading to a simpler model.
- If \(\lambda\) is too large, it can oversimplify the model, leading to underfitting.

The optimal value of \(\lambda\) is usually chosen via cross-validation, balancing the trade-off between bias and variance to achieve good predictive performance on unseen data.
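
The effect of the penalty strength can be sketched by fitting the same data with a few values and counting the non-zero coefficients. This example uses scikit-learn's Lasso class, where the parameter is called alpha. The dataset settings and the alpha grid are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data (illustrative settings): 20 features, 5 of them informative
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=1.0, random_state=0)

nonzero_counts = {}
for alpha in [1e-3, 1e-1, 1e1, 1e3]:
    # A higher max_iter gives the solver room to converge for small alpha
    model = Lasso(alpha=alpha, max_iter=10_000).fit(X, y)
    nonzero_counts[alpha] = int(np.count_nonzero(model.coef_))
    print(f"alpha={alpha:g}: {nonzero_counts[alpha]} non-zero coefficients")
```

With a very small alpha the fit is close to OLS and most coefficients are non-zero. With a very large alpha every coefficient is zero and the model predicts a constant, which is the underfitting case described above.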

### Applications

L1 regularization is widely used in machine learning and data science, especially in scenarios with high-dimensional data where feature selection is crucial. It's particularly useful when it's believed that only a small number of features are relevant for predicting the outcome, as it can help identify those features automatically.


In Python, you can apply L1 regularization (Lasso regularization) using libraries such as scikit-learn, which is a popular machine learning library for Python. The Lasso class in scikit-learn provides an easy-to-use implementation of L1 regularization for linear models.

Here's a basic example of how to use L1 regularization with a linear regression model in Python using scikit-learn:

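# Step 1: Install scikit-learn (if necessary)
If you haven't installed scikit-learn yet, you can do so with pip by running pip install scikit-learn in a terminal.

# Step 2: Import the required libraries
The examples below use the Lasso estimator, the make_regression dataset generator, and NumPy:
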
In [1]:
from sklearn.linear_model import Lasso
from sklearn.datasets import make_regression
import numpy as np


# Step 3: Create a sample dataset
For demonstration purposes, let's generate a regression dataset using scikit-learn's make_regression function:



In [2]:
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

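This code generates a dataset with 1000 samples and 20 features. The noise argument sets the standard deviation of the Gaussian noise added to the target, which simulates real-world measurement error. Because make_regression uses n_informative=10 by default, only 10 of the 20 features actually influence y.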
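
To know which features a good model should keep, make_regression can also return the coefficients it used to generate the target. The sketch below repeats the call with coef=True. The names X_demo, y_demo, and true_coef are chosen here so the X and y used in the rest of the walkthrough are not overwritten.

```python
import numpy as np
from sklearn.datasets import make_regression

# Same settings as the dataset cell, plus coef=True to also return the
# coefficients of the underlying linear model used to generate the target.
X_demo, y_demo, true_coef = make_regression(
    n_samples=1000, n_features=20, noise=0.1, random_state=42, coef=True
)

print("Shape of X:", X_demo.shape)
print("Indices of informative features:", np.flatnonzero(true_coef))
print("Number of informative features:", np.count_nonzero(true_coef))
```

Only the informative features have non-zero generating coefficients, which gives a reference for judging which coefficients the Lasso keeps.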

# Step 4: Initialize and fit the Lasso model
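You can specify the strength of the regularization using the alpha parameter, which plays the role of \(\lambda\) above; a larger value applies more regularization. Note that scikit-learn's Lasso minimizes \(\frac{1}{2n} RSS + \alpha \sum_{j=1}^{p} |\beta_j|\), where \(n\) is the number of samples, so alpha corresponds to \(\lambda / (2n)\) rather than to \(\lambda\) itself: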

In [3]:
# Initialize the Lasso regression model with an alpha value
lasso = Lasso(alpha=0.1)

# Fit the model to the data
lasso.fit(X, y)
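
Fitting on all of the data, as above, says nothing about generalization. A minimal sketch of a held-out check follows, assuming an 80/20 split. The split ratio, the random_state, and the name lasso_holdout are illustrative choices and not part of the original walkthrough.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

# Recreate the dataset so this block runs on its own
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)

# Hold out 20% of the samples for evaluation (illustrative choice)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit on the training part only, then score on the unseen part
lasso_holdout = Lasso(alpha=0.1).fit(X_train, y_train)
test_r2 = lasso_holdout.score(X_test, y_test)
print("R^2 on held-out data:", test_r2)
```

The score method of a regressor returns the coefficient of determination \(R^2\). A value close to 1 on the held-out samples suggests the penalty has not oversimplified the model.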


# Step 5: Inspect the coefficients
After fitting the model, you can look at the coefficients to see the effect of the L1 regularization:

In [4]:
print("Coefficients:", lasso.coef_)


Coefficients: [79.91183621 98.47656042  5.47173655  0.         86.35393606 -0.
 69.32137511 -0.         -0.         -0.         18.49633831 39.5328363
 -0.          2.9809935   0.         26.28106376 -0.         86.78388173
  0.          0.        ]


Coefficients that are exactly zero correspond to features that the Lasso has excluded (a printed -0. is still zero). Here 10 of the 20 coefficients are zero, which demonstrates the feature selection capability of L1 regularization.
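
Reading zeros off a printed array gets tedious with many features. The following sketch pulls out the indices of the retained features with np.flatnonzero and builds a reduced feature matrix from them. The variable names selected and X_selected are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Recreate the dataset and the fitted model so this block runs on its own
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)
lasso = Lasso(alpha=0.1).fit(X, y)

# Indices of features whose coefficient is not exactly zero
selected = np.flatnonzero(lasso.coef_)
print("Selected feature indices:", selected)
print(f"{selected.size} of {lasso.coef_.size} features kept")

# Reduced design matrix containing only the selected columns
X_selected = X[:, selected]
print("Shape after selection:", X_selected.shape)
```

The reduced matrix can be passed to another model if the Lasso is used purely as a selection step.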

# Choosing alpha
The choice of alpha (the regularization strength) is crucial. It controls the level of sparsity of the coefficients estimated by the lasso. A larger alpha means more regularization, leading to sparser coefficients. The best alpha can be found using cross-validation, for which scikit-learn provides LassoCV, a Lasso model fitted with cross-validation to determine the best alpha:

In [5]:
from sklearn.linear_model import LassoCV

# Initialize the LassoCV model
lasso_cv = LassoCV(alphas=np.logspace(-4, 4, 20), cv=5)

# Fit the model to the data
lasso_cv.fit(X, y)

# The best alpha value found
print("Best alpha:", lasso_cv.alpha_)

# Using the best model
print("Coefficients with best alpha:", lasso_cv.coef_)


Best alpha: 0.0018329807108324356
Coefficients with best alpha: [ 7.99998993e+01  9.85773573e+01  5.56550116e+00  1.05886446e-03
  8.64640763e+01 -1.80801810e-03  6.94288298e+01 -0.00000000e+00
  5.03567617e-04 -1.24509783e-03  1.86046539e+01  3.96336467e+01
  0.00000000e+00  3.10114942e+00 -2.45672152e-03  2.63846239e+01
 -7.98343894e-04  8.68809418e+01  1.86024642e-03  0.00000000e+00]


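This approach automatically selects, from the provided grid, the alpha with the lowest cross-validated error. Because the selected alpha is small here, most of the coefficients that were exactly zero with alpha=0.1 are now tiny but non-zero (on the order of 1e-3). The alpha that minimizes prediction error therefore does not necessarily give the sparsest model.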
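
To see how the choice was made, the fitted LassoCV object exposes the alpha grid in alphas_ and the per-fold mean squared errors in mse_path_. The sketch below averages the errors over the folds and looks up the minimum. It refits the model so that it runs on its own, and the printed formatting is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Recreate the dataset and the cross-validated model
X, y = make_regression(n_samples=1000, n_features=20, noise=0.1, random_state=42)
lasso_cv = LassoCV(alphas=np.logspace(-4, 4, 20), cv=5).fit(X, y)

# mse_path_ has one row per alpha and one column per fold
mean_mse = lasso_cv.mse_path_.mean(axis=1)
for alpha, mse in zip(lasso_cv.alphas_, mean_mse):
    print(f"alpha={alpha:10.4g}  mean CV MSE={mse:14.4f}")

# The alpha with the lowest mean error should match lasso_cv.alpha_
best_index = int(np.argmin(mean_mse))
print("alpha with lowest mean CV MSE:", lasso_cv.alphas_[best_index])
print("alpha reported by LassoCV:    ", lasso_cv.alpha_)
```

Looking at the full error curve, rather than only the winner, shows how flat the error is around the minimum and whether a larger, sparser alpha would cost much accuracy.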