# LASSO REGRESSION (L1 REGULARIZATION)
```
LASSO = Least Absolute Shrinkage and Selection Operator

It is a linear regression technique that adds an L1 penalty to the loss to reduce overfitting and perform feature selection.

```

Lasso modifies the linear regression loss like this:
 
$$
\text{Cost} = \sum_{i} (y_i - \hat{y}_i)^2 + \lambda \sum_{j} |\beta_j|
$$


Where:

λ (lambda) = regularization strength

L1 penalty = sum of absolute values of coefficients
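
As a small illustration of the cost, the sketch below evaluates the squared-error term plus λ times the L1 penalty with NumPy. The function name `lasso_cost` and the toy numbers are made up for this example. Note that scikit-learn's `Lasso` minimizes a rescaled version, (1 / (2 * n_samples)) * squared error + alpha * L1 penalty, so its `alpha` is not numerically identical to the λ used here.

```python
import numpy as np

def lasso_cost(X, y, beta, lam, intercept=0.0):
    """Sum of squared errors plus lam times the sum of |beta_j|."""
    residuals = y - (X @ beta + intercept)
    sse = np.sum(residuals ** 2)           # sum of squared errors
    l1_penalty = np.sum(np.abs(beta))      # the intercept is not penalized
    return sse + lam * l1_penalty

# Toy data, chosen only for illustration
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
beta = np.array([0.5, -0.25])

print("Cost with lam=0:", lasso_cost(X, y, beta, lam=0.0))
print("Cost with lam=2:", lasso_cost(X, y, beta, lam=2.0))
```

With λ = 0 the cost is just the squared error; raising λ adds a charge proportional to the total size of the coefficients.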

#### Why use Lasso Regression?

Because Lasso can:

1. Reduce overfitting (regularization)
2. Select features automatically (important!): some coefficients become exactly zero, which removes the corresponding features from the model

This makes Lasso useful when:

- You have many features
- You don't know which features matter
- You want a simpler, interpretable model


#### How Lasso works 

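With two coefficients, the L1 penalty corresponds to a diamond-shaped constraint region, and the contours of the regression loss are ellipses.

The solution is the point where the smallest loss contour touches the diamond. This often happens at a corner, where one coefficient is exactly 0.

This is why Lasso performs feature selection and Ridge does not: Ridge's L2 constraint region is a circle with no corners, so its coefficients shrink but do not land exactly on 0.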
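
The same contrast can be seen numerically in a simplified special case. The sketch below assumes orthonormal features (XᵀX = I), an assumption made only for illustration. Under it, minimizing the Lasso cost shifts each least-squares coefficient toward zero by λ/2 and clips it at 0 (soft-thresholding), while the Ridge cost (penalty λ Σ βj²) divides each coefficient by 1 + λ. The coefficient values are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Move z toward zero by t; anything within [-t, t] becomes exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# Hypothetical least-squares coefficients, one per feature
ols_coefs = np.array([3.0, 0.4, -0.2, -1.5])
lam = 1.0

# Lasso under the orthonormal assumption: subtract a fixed amount, clip at 0
lasso_coefs = soft_threshold(ols_coefs, lam / 2.0)
# Ridge under the same assumption: rescale every coefficient by the same factor
ridge_coefs = ols_coefs / (1.0 + lam)

print("Lasso:", lasso_coefs)
print("Ridge:", ridge_coefs)
```

The two small coefficients fall inside the threshold and become exactly zero under Lasso, whereas Ridge only makes them smaller.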

#### Mathematical Solution

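Unlike Ridge, Lasso has no closed-form solution in general, because the absolute value in the L1 penalty is not differentiable at 0.

Instead, it is solved with iterative optimization methods:

- Coordinate descent
- Subgradient methods
- LARS (Least Angle Regression)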
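
To make the first of these methods concrete, here is a minimal coordinate descent sketch for the cost Σ (yi − ŷi)² + λ Σ |βj|. It assumes no intercept (centered data), runs a fixed number of sweeps instead of checking convergence, and uses made-up names (`lasso_coordinate_descent`, `soft_threshold`) and synthetic data. scikit-learn's `Lasso` also uses coordinate descent, but this is a teaching sketch, not a reproduction of that implementation.

```python
import numpy as np

def soft_threshold(z, t):
    """Move z toward zero by t; anything within [-t, t] becomes exactly 0."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iters=200):
    """Minimize sum((y - X @ beta)**2) + lam * sum(|beta|), no intercept."""
    n_features = X.shape[1]
    beta = np.zeros(n_features)
    col_sq_norms = np.sum(X ** 2, axis=0)
    for _ in range(n_iters):
        for j in range(n_features):
            # Residual with feature j's current contribution added back in
            partial_residual = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ partial_residual
            # One-dimensional Lasso problem in beta[j] has this exact solution
            beta[j] = soft_threshold(rho, lam / 2.0) / col_sq_norms[j]
    return beta

# Synthetic data: only features 0 and 2 actually influence y
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_beta = np.array([3.0, 0.0, -2.0, 0.0, 0.0])
y = X @ true_beta + 0.1 * rng.normal(size=200)

beta_hat = lasso_coordinate_descent(X, y, lam=40.0)
print(np.round(beta_hat, 3))
```

Each update solves the problem exactly for one coefficient while holding the others fixed, which is why the soft-thresholding step can set a coefficient to exactly zero.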


#### Effect of λ (lambda)
| λ value | Effect |
|---|---|
| 0 | Same as Linear Regression |
| Small | Slight shrinkage |
| Larger | Many coefficients → 0 |
| Very large | Underfitting |
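
The trend in this table can be checked empirically. The sketch below is one possible setup, not a benchmark: it fits scikit-learn's `Lasso` on a synthetic dataset for a few arbitrarily chosen `alpha` values and counts the nonzero coefficients. `alpha=0` is left out because scikit-learn advises using `LinearRegression` for that case.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: 20 features, only 5 of which influence the target
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=5.0, random_state=0
)

alphas = [0.01, 1.0, 10.0, 1000.0]  # illustrative values only
nonzero_counts = []
for alpha in alphas:
    model = Lasso(alpha=alpha, max_iter=10_000).fit(X, y)
    nonzero_counts.append(int(np.count_nonzero(model.coef_)))
    print(f"alpha={alpha:>7}: {nonzero_counts[-1]} nonzero coefficients")
```

At the largest `alpha` every coefficient is driven to zero and the model predicts a constant, which is the underfitting case in the last row.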





## Lasso vs Ridge Regression



| Feature | Lasso | Ridge |
|---|---|---|
| Penalty | L1 | L2 |
| Coefficients | Some become exactly 0 | Shrink toward 0 but are never exactly 0 |
| Feature selection | Yes | No |
| Good for | Sparse models | Multicollinearity |
| Coefficient stability | Can be unstable | Stable |
| When to use | Few important features | Many small effects |
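
The "Coefficients" row can be illustrated by fitting both models on the same data. This is a sketch on a synthetic dataset with arbitrarily chosen penalty strengths. The two `alpha` values are not on a comparable scale, so only the presence or absence of exact zeros should be read from it.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 20 features, only 5 of which influence the target
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=5.0, random_state=0
)

lasso = Lasso(alpha=2.0).fit(X, y)
ridge = Ridge(alpha=2.0).fit(X, y)

lasso_zeros = int(np.sum(lasso.coef_ == 0))
ridge_zeros = int(np.sum(ridge.coef_ == 0))
print("Coefficients exactly 0 (Lasso):", lasso_zeros)
print("Coefficients exactly 0 (Ridge):", ridge_zeros)
```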


#### Advantages

- Performs automatic feature selection
- Reduces overfitting
- Improves model interpretability
- Well suited to high-dimensional data (p >> n)


#### Disadvantages

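- When features are highly correlated, Lasso tends to keep one of them, somewhat arbitrarily, and set the others to zero
- Coefficients can be unstable: small changes in the data can change which features are selected
- Not ideal when all features are important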

#### When to use Lasso Regression?

Use Lasso when you:

- Want feature selection
- Have many irrelevant features
- Need a simple model
- Are working with a high-dimensional dataset
- Have features that are not strongly correlated



``` 
Lasso uses L1 regularization

Reduces overfitting

Makes some coefficients exactly zero

Performs feature selection

Good for high-dimensional data

λ controls strength of regularization

Solved using optimization (no closed-form solution)

```


In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Load the California housing data (8 numeric features)
data = fetch_california_housing()
X = data.data
y = data.target

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Lasso model; alpha plays the role of λ (regularization strength).
# The features are left unscaled here, so the penalty affects them unevenly.
model = Lasso(alpha=0.1)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

# Evaluation: a coefficient of exactly 0 means Lasso dropped that feature
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
print("MSE:", mean_squared_error(y_test, y_pred))
print("R2 Score:", r2_score(y_test, y_pred))


Coefficients: [ 3.92693362e-01  1.50810624e-02 -0.00000000e+00  0.00000000e+00
  1.64168387e-05 -3.14918929e-03 -1.14291203e-01 -9.93076483e-02]
Intercept: -7.698845419807455
MSE: 0.6135115198058131
R2 Score: 0.5318167610318159
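
The run above fits Lasso on raw feature values. Because the L1 penalty acts on coefficient size, features measured on different scales are penalized unevenly, so standardizing first is a common precaution. The sketch below shows one way to do that with a scikit-learn pipeline. To stay self-contained it uses synthetic data whose columns are deliberately put on different scales, which is an assumption of this example and not the housing data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data: 8 features, 4 of which influence the target
X, y = make_regression(
    n_samples=500, n_features=8, n_informative=4, noise=10.0, random_state=42
)
# Put the columns on very different scales, as often happens in real data
X = X * np.array([1.0, 10.0, 100.0, 1000.0, 1.0, 10.0, 100.0, 1000.0])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The scaler is fit on the training split only, then reused at predict time
pipeline = make_pipeline(StandardScaler(), Lasso(alpha=1.0))
pipeline.fit(X_train, y_train)

lasso = pipeline.named_steps["lasso"]
test_r2 = pipeline.score(X_test, y_test)
print("Coefficients (standardized units):", np.round(lasso.coef_, 3))
print("Test R2:", round(test_r2, 3))
```

Keeping the scaler inside the pipeline avoids leaking test-set statistics into training and makes the coefficients comparable across features.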


In [3]:
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Diabetes data: 10 features, already mean-centered and scaled by load_diabetes
data = load_diabetes()
X = data.data
y = data.target

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = Lasso(alpha=0.1)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

# Evaluation: a coefficient of exactly 0 means Lasso dropped that feature
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
print("MSE:", mean_squared_error(y_test, y_pred))
print("R2 Score:", r2_score(y_test, y_pred))


Coefficients: [   0.         -152.66477923  552.69777529  303.36515791  -81.36500664
   -0.         -229.25577639    0.          447.91952518   29.64261704]
Intercept: 151.57485282893947
MSE: 2798.193485169719
R2 Score: 0.4718547867276227


In [4]:
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic regression dataset.
# n_informative defaults to 10, so all 10 features carry signal here
# and Lasso has no irrelevant feature to drop.
X, y = make_regression(
    n_samples=500,
    n_features=10,
    noise=10,
    random_state=42
)

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = Lasso(alpha=0.1)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

# Evaluation
print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
print("MSE:", mean_squared_error(y_test, y_pred))
print("R2 Score:", r2_score(y_test, y_pred))


Coefficients: [41.82204111 64.12395241 18.73201774 46.72243239 23.9565675  16.39228576
 82.69535426 64.17889918  7.4871428  28.65860581]
Intercept: -0.2927999172299187
MSE: 97.1743302138308
R2 Score: 0.9950667951885857
