# Regression 4: Lasso Regression Concepts and Applications
This notebook covers Lasso Regression, its advantages for feature selection, coefficient interpretation, tuning parameters, handling of non-linear problems and multicollinearity, and comparison with Ridge Regression.

## Q1. What is Lasso Regression, and how does it differ from other regression techniques?

**Lasso Regression** (Least Absolute Shrinkage and Selection Operator) is a linear regression technique that adds an L1 penalty (sum of absolute values of coefficients) to the loss function. This penalty can shrink some coefficients to exactly zero, effectively performing feature selection. Unlike OLS (no penalty) and Ridge (L2 penalty), Lasso can eliminate irrelevant features.
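
For reference, the three objectives can be written side by side. This is a sketch in generic notation, with $\lambda$ for the penalty strength, $\beta$ for the coefficients, and the intercept omitted for brevity. scikit-learn's `Lasso` calls the strength `alpha` and scales the squared-error term by $1/(2n)$, so its numeric values are not directly interchangeable with $\lambda$ as written here.

```latex
\begin{aligned}
\text{OLS:}   \quad & \min_{\beta}\ \sum_{i=1}^{n} \bigl(y_i - x_i^\top \beta\bigr)^2 \\
\text{Ridge:} \quad & \min_{\beta}\ \sum_{i=1}^{n} \bigl(y_i - x_i^\top \beta\bigr)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \\
\text{Lasso:} \quad & \min_{\beta}\ \sum_{i=1}^{n} \bigl(y_i - x_i^\top \beta\bigr)^2 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\end{aligned}
```

The only difference between the last two lines is the form of the penalty: squared coefficients for Ridge, absolute values for Lasso.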

In [None]:
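# Example: Lasso Regression
# X and Y are not defined elsewhere in this notebook; a synthetic dataset
# is assumed here so that the example cells can run.
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, Y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=10.0, random_state=0)

lasso = Lasso(alpha=0.1)  # alpha is the L1 penalty strength
lasso.fit(X, Y)
print('Lasso coefficients:', lasso.coef_)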

## Q2. What is the main advantage of using Lasso Regression in feature selection?

The main advantage is that Lasso can shrink some coefficients to exactly zero, automatically selecting a subset of features and simplifying the model. This helps in identifying the most important predictors and reducing overfitting.
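
A minimal sketch of this selection effect, assuming a synthetic dataset in which only a few features drive the target. The names `X_demo`, `y_demo`, and `selector` are illustrative, and `alpha=1.0` is an arbitrary choice rather than a tuned value.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Hypothetical data: 20 features, only 4 of which generate the target.
X_demo, y_demo, true_coef = make_regression(
    n_samples=300, n_features=20, n_informative=4,
    noise=5.0, coef=True, random_state=42,
)

selector = Lasso(alpha=1.0).fit(X_demo, y_demo)

selected = np.flatnonzero(selector.coef_)   # indices with non-zero coefficients
informative = np.flatnonzero(true_coef)     # indices that truly generate y

print('Features kept by Lasso:', selected)
print('Informative features:  ', informative)
print(f'Kept {len(selected)} of {X_demo.shape[1]} features')
```

Comparing the two printed index lists shows how closely the retained subset matches the features that actually generate the target for this particular `alpha`.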

## Q3. How do you interpret the coefficients of a Lasso Regression model?

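A non-zero coefficient means the feature is retained in the model. As in ordinary linear regression, it estimates the change in the prediction per unit increase in that feature with the other features held fixed, although the L1 penalty biases the estimate toward zero. A coefficient of exactly zero means the corresponding feature is excluded from the model. Because the penalty acts on coefficient size, magnitudes are comparable across features only when the features are on the same scale (for example, after standardization).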
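
The effect of scale on interpretation can be sketched as follows. The feature names (`income`, `age`, `noise_feat`) and the data-generating formula are made up for illustration; the point is only that standardizing first puts the coefficients in "per standard deviation" units.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 500

# Hypothetical features on very different scales.
income = rng.normal(50_000, 15_000, n)   # large units
age = rng.normal(40, 10, n)              # small units
noise_feat = rng.normal(0, 1, n)         # unrelated to the target
X_raw = np.column_stack([income, age, noise_feat])
y_raw = 0.001 * income + 2.0 * age + rng.normal(0, 1, n)

# Standardize so each coefficient is the effect of a one-standard-deviation change.
X_std = StandardScaler().fit_transform(X_raw)
interp_model = Lasso(alpha=0.5).fit(X_std, y_raw)

for name, coef in zip(['income', 'age', 'noise_feat'], interp_model.coef_):
    print(f'{name:>10}: {coef: .3f}')
```

On the raw scale the `income` coefficient would look tiny next to `age` simply because of its units; after standardization the two can be compared directly, and the unrelated feature is dropped.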

## Q4. What are the tuning parameters that can be adjusted in Lasso Regression, and how do they affect the model's performance?

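The main tuning parameter is **alpha** (the regularization strength). A higher alpha increases the penalty, so more coefficients are set to zero (stronger feature selection), at the risk of underfitting. A lower alpha weakens the penalty and keeps more features, at the risk of overfitting; as alpha approaches zero the fit approaches ordinary least squares. In scikit-learn, `max_iter` and `tol` can also be adjusted, but they control solver convergence rather than the model itself.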

In [None]:
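# Example: Effect of alpha in Lasso
# Assumes X and Y are already defined and Lasso is imported, as in the first example.
for a in [0.01, 0.1, 1]:
    lasso = Lasso(alpha=a)
    lasso.fit(X, Y)
    # Larger alpha values should leave fewer non-zero coefficients
    print(f'Alpha: {a}, Coefficients: {lasso.coef_}')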

## Q5. Can Lasso Regression be used for non-linear regression problems? If yes, how?

Yes, by transforming the input features (e.g., using polynomial or interaction terms), Lasso can be applied to non-linear problems. The model remains linear in the parameters, but the features can represent non-linear relationships.

In [None]:
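# Example: Lasso with polynomial features
# Assumes X and Y are already defined and Lasso is imported, as in the first example.
from sklearn.preprocessing import PolynomialFeatures

# include_bias=False: Lasso fits its own intercept, so a constant column is redundant
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)  # adds squared and pairwise interaction terms
lasso_poly = Lasso(alpha=0.1)
lasso_poly.fit(X_poly, Y)
print('Lasso coefficients (polynomial):', lasso_poly.coef_)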

## Q6. What is the difference between Ridge Regression and Lasso Regression?

- **Ridge Regression** uses an L2 penalty (squared coefficients), shrinking coefficients but rarely making them exactly zero. All features remain in the model.
- **Lasso Regression** uses an L1 penalty (absolute coefficients), which can shrink some coefficients to zero, performing feature selection.
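
The contrast can be checked by fitting both models to the same data and counting exact zeros. This is a sketch on an assumed synthetic dataset; the names are illustrative, and the two `alpha` values are not on a common scale because scikit-learn normalizes the two objectives differently.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Hypothetical data: 15 features, 3 of them informative.
X_cmp, y_cmp = make_regression(n_samples=200, n_features=15, n_informative=3,
                               noise=10.0, random_state=0)

ridge_demo = Ridge(alpha=1.0).fit(X_cmp, y_cmp)
lasso_demo = Lasso(alpha=1.0).fit(X_cmp, y_cmp)

# Count coefficients that are exactly zero in each model.
ridge_zeros = int(np.sum(ridge_demo.coef_ == 0))
lasso_zeros = int(np.sum(lasso_demo.coef_ == 0))

print('Ridge coefficients equal to zero:', ridge_zeros)
print('Lasso coefficients equal to zero:', lasso_zeros)
```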

## Q7. Can Lasso Regression handle multicollinearity in the input features? If yes, how?

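Yes, to a degree. Among a group of highly correlated features, Lasso tends to keep one and set the others to zero, which removes the redundancy. However, the choice among correlated features may be arbitrary and can change with small changes in the data, so the selected feature should not be read as the only relevant one.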
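
A small sketch of this behaviour, under assumptions chosen to keep it simple: two unit-variance features with correlation of about 0.9, and a target built from `x1` only, so the choice is not arbitrary in this particular setup. All names and constants are illustrative. When both correlated features carry signal, which one survives can depend on the data and on `alpha`.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)
n, rho = 2000, 0.9

x1 = rng.normal(size=n)
# x2 has unit variance and correlation of about rho with x1.
x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
x3 = rng.normal(size=n)  # independent feature
X_corr = np.column_stack([x1, x2, x3])

# The target depends on x1 and x3 only; x2 is a correlated proxy for x1.
y_corr = 3.0 * x1 + 1.5 * x3 + rng.normal(scale=0.5, size=n)

ols_fit = LinearRegression().fit(X_corr, y_corr)
lasso_fit = Lasso(alpha=0.5).fit(X_corr, y_corr)

print('Sample corr(x1, x2):', np.corrcoef(x1, x2)[0, 1])
print('OLS coefficients:   ', ols_fit.coef_)
print('Lasso coefficients: ', lasso_fit.coef_)
```

In this setup Lasso assigns the shared signal to `x1` and drops the proxy `x2` entirely, while least squares generally leaves a small non-zero weight on it.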

## Q8. How do you choose the optimal value of the regularization parameter (lambda) in Lasso Regression?

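The regularization parameter is usually written as lambda in the literature and is exposed as `alpha` in scikit-learn. Its value is typically chosen by cross-validation (e.g., `LassoCV` in scikit-learn): the model is fit for each candidate alpha, and the value with the lowest average validation error across the folds is selected.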

In [None]:
# Example: Choosing alpha with cross-validation
from sklearn.linear_model import LassoCV

lasso_cv = LassoCV(alphas=[0.01, 0.1, 1, 10], cv=5)
lasso_cv.fit(X, Y)
print('Best alpha:', lasso_cv.alpha_)