# Lasso Regression — Advanced Theory & Interview Q&A

## Advanced Theory

- **L1 Regularization:** Adds a penalty proportional to the sum of the absolute values of the coefficients (α · Σ|βⱼ|) to the loss function.
- **Feature Selection:** The L1 penalty can shrink some coefficients to exactly zero, which performs automatic feature selection.
- **Bias-Variance Tradeoff:** Increasing α adds bias but reduces variance; at a well-chosen α this usually improves generalization.
- **Multicollinearity:** With highly correlated predictors, Lasso tends to keep one of them somewhat arbitrarily, and the choice can change from sample to sample; ElasticNet, which adds an L2 term, is often preferred.
- **Hyperparameter Tuning:** Use cross-validation to select alpha (α).
- **Standardization:** The penalty depends on the scale of each coefficient, so features should be standardized before applying Lasso.
- **Extensions:** Adaptive Lasso (per-coefficient penalty weights), ElasticNet (combined L1 and L2 penalty).
- **Diagnostics:** Coefficient paths, learning curves, validation curves.
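
The shrinkage and coefficient-path points in the list above can be illustrated with a short sketch. It runs on synthetic data; the dataset, the alpha grid, and all variable names are illustrative assumptions rather than part of the original notebook. It uses scikit-learn's `lasso_path` to count how many coefficients stay non-zero as alpha decreases.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import lasso_path
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 10 features, only 3 of which drive the target.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# lasso_path does not fit an intercept, so center y and standardize X first.
X_std = StandardScaler().fit_transform(X)
y_centered = y - y.mean()

# Smallest alpha that zeros every coefficient under scikit-learn's scaling.
alpha_max = np.max(np.abs(X_std.T @ y_centered)) / len(y_centered)

# Grid from slightly above alpha_max down to alpha_max / 1000.
alphas = alpha_max * np.logspace(0.1, -3, 30)
path_alphas, coefs, _ = lasso_path(X_std, y_centered, alphas=alphas)

# coefs has shape (n_features, n_alphas); path_alphas is sorted high to low.
nonzero_counts = np.count_nonzero(coefs, axis=0)
for alpha, count in zip(path_alphas[::6], nonzero_counts[::6]):
    print(f"alpha={alpha:10.4f}  non-zero coefficients={count}")
```

Plotting each row of `coefs` against `path_alphas` on a log-scaled x-axis would give the usual coefficient-path diagram.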

## Advanced Interview Q&A

**Q1: How does Lasso Regression perform feature selection?**  
A: The L1 penalty shrinks some coefficients to exactly zero, so the corresponding features drop out of the model.
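
One way to see where the exact zeros come from: in the special case of an orthonormal design matrix (a simplifying assumption made only for this sketch), minimizing ½‖y − Xβ‖² + λ‖β‖₁ gives the soft-thresholded least-squares estimate, while the Ridge counterpart ½‖y − Xβ‖² + (λ/2)‖β‖² only rescales it. The function name and the example numbers below are hypothetical.

```python
import numpy as np

def soft_threshold(z, threshold):
    """Shrink z toward zero by `threshold`; entries within the threshold become 0."""
    return np.sign(z) * np.maximum(np.abs(z) - threshold, 0.0)

# Hypothetical least-squares coefficients under an orthonormal design.
ols_coefs = np.array([2.5, -0.3, 0.05, -1.2])
penalty = 0.5

lasso_like = soft_threshold(ols_coefs, penalty)  # small entries become exactly 0
ridge_like = ols_coefs / (1.0 + penalty)         # every entry shrinks, none hits 0

print("Lasso-style:", lasso_like)
print("Ridge-style:", ridge_like)
```

Any coefficient whose least-squares magnitude is below the threshold is removed outright, which is the selection behaviour described in the answer.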

**Q2: How do you select the regularization parameter alpha?**  
A: Use cross-validation, for example a grid search over candidate values or scikit-learn's `LassoCV`, and inspect validation curves.
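
As a rough illustration of cross-validated alpha selection, the sketch below uses scikit-learn's `LassoCV` on synthetic data. The dataset and its parameters are assumptions chosen only to keep the example self-contained.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 20 features, 5 of them informative.
X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

# LassoCV picks alpha by 5-fold CV over an automatically generated grid.
model = make_pipeline(StandardScaler(), LassoCV(cv=5, max_iter=10000))
model.fit(X, y)

lasso_cv = model.named_steps["lassocv"]
print(f"Selected alpha: {lasso_cv.alpha_:.4f}")
print(f"Candidates tried: {len(lasso_cv.alphas_)}")
print(f"Non-zero coefficients: {(lasso_cv.coef_ != 0).sum()} of {X.shape[1]}")
```

Note that in this arrangement the scaler is fitted once on all rows before `LassoCV` splits them into folds. Running `GridSearchCV` over a scaler-plus-`Lasso` pipeline refits the scaler inside each fold, which avoids that small leak.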

**Q3: What is the impact of multicollinearity?**  
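A: With highly correlated predictors, Lasso tends to keep one and drop the others, and which one it keeps can change with small changes in the data. ElasticNet is often preferred because its L2 term spreads weight across correlated predictors.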
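
A small experiment can make this concrete. The sketch below builds two nearly identical predictors from one latent signal (a synthetic setup assumed purely for illustration) and fits both models with the same `alpha`. The features are already on a unit scale, so no scaler is used.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
n_samples = 200

# Two almost identical predictors built from one latent signal (assumption).
latent = rng.normal(size=n_samples)
x1 = latent
x2 = latent + 0.01 * rng.normal(size=n_samples)
X = np.column_stack([x1, x2])
y = 3.0 * latent + rng.normal(size=n_samples)

lasso = Lasso(alpha=0.1).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

print("Lasso coefficients:     ", lasso.coef_)
print("ElasticNet coefficients:", enet.coef_)
```

ElasticNet should give the two columns nearly equal weight, whereas how Lasso divides the total weight between them is determined by tiny differences in the data.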

**Q4: What is the difference between Lasso and Ridge?**  
A: Lasso (L1 penalty) can eliminate features by setting their coefficients to exactly zero; Ridge (L2 penalty) only shrinks coefficients toward zero.
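
The difference is easy to check empirically. The following sketch fits both models on the same synthetic data (dataset and penalty values are assumptions) and counts coefficients that are exactly zero.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 20 features, 5 of them informative.
X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)
X_std = StandardScaler().fit_transform(X)

# The two alpha values are not on a comparable scale; they only need to be
# large enough to show the qualitative difference.
lasso = Lasso(alpha=5.0).fit(X_std, y)
ridge = Ridge(alpha=5.0).fit(X_std, y)

print("Exact zeros, Lasso:", int(np.sum(lasso.coef_ == 0)))
print("Exact zeros, Ridge:", int(np.sum(ridge.coef_ == 0)))
print("Smallest |coef|, Ridge:", np.abs(ridge.coef_).min())
```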

**Q5: How do you interpret Lasso coefficients?**  
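A: Features with non-zero coefficients are selected; features with zero coefficients are removed. The non-zero values are shrunk toward zero and, if the features were standardized, are expressed per standard deviation of each feature.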

**Q6: What is Adaptive Lasso?**  
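A: It weights the penalty differently for each coefficient, typically with weights derived from an initial estimate so that large coefficients are penalized less, which can improve selection.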
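
scikit-learn has no dedicated Adaptive Lasso estimator, so the following is only a minimal two-stage sketch under stated assumptions: synthetic unit-scale data, a least-squares initial estimate, weights of the form 1 / |initial coefficient|^γ with γ = 1, and a column-rescaling trick that turns the weighted penalty into a plain `Lasso` fit. All names and values are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(0)

# Synthetic data (assumption): 8 unit-scale features, 3 with a true effect.
X = rng.normal(size=(200, 8))
true_coef = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.5])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

# Stage 1: initial estimate (plain least squares here).
initial_coef = LinearRegression().fit(X, y).coef_

# Stage 2: per-coefficient penalty weights w_j = 1 / |initial_j|^gamma.
gamma = 1.0
weights = 1.0 / (np.abs(initial_coef) ** gamma + 1e-8)

# A weighted L1 penalty equals a plain Lasso on columns divided by w_j.
X_rescaled = X / weights
lasso = Lasso(alpha=0.05).fit(X_rescaled, y)

# Map the coefficients back to the original feature scale.
adaptive_coef = lasso.coef_ / weights
print("Selected feature indices:", np.flatnonzero(adaptive_coef))
print("Adaptive Lasso coefficients:", np.round(adaptive_coef, 3))
```

Features with a small initial estimate receive a large weight and are removed easily, while strong features are penalized only lightly.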

**Q7: How do you assess model fit?**  
A: Use R² and RMSE on held-out data, together with learning curves and validation curves.
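
A validation curve over alpha is one of the diagnostics mentioned in the answer. The sketch below computes one with scikit-learn's `validation_curve`; the synthetic dataset and the alpha grid are assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import validation_curve
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 20 features, 5 of them informative.
X, y = make_regression(n_samples=300, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("lasso", Lasso(max_iter=10000)),
])

alphas = np.array([0.01, 0.1, 1.0, 10.0, 100.0])
train_scores, val_scores = validation_curve(
    pipeline, X, y,
    param_name="lasso__alpha", param_range=alphas,
    cv=5, scoring="r2",
)

# Each result has shape (n_alphas, n_folds); average over the folds.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)
for alpha, tr, va in zip(alphas, train_mean, val_mean):
    print(f"alpha={alpha:7.2f}  train R2={tr:.3f}  validation R2={va:.3f}")
```

A training score far above the validation score suggests overfitting (alpha too small), while both scores being low suggests underfitting (alpha too large).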

**Q8: What are the limitations of Lasso Regression?**  
A: Selection can be unstable with correlated predictors, results are sensitive to alpha, and the surviving coefficients are biased toward zero.

**Q9: How do you handle categorical variables?**  
A: Use one-hot encoding before applying Lasso.

**Q10: What is the effect of increasing alpha?**  
A: More coefficients are set to zero, so fewer features remain in the model; a large enough alpha zeros all of them.

# Lasso Regression — Theory & Interview Q&A

Lasso Regression is a regularized linear regression technique that adds an L1 penalty to the loss function, enabling both regularization and feature selection.

| Aspect                | Details                                                                 |
|-----------------------|------------------------------------------------------------------------|
| **Definition**        | Linear regression with L1 regularization (can set coefficients to zero).|
| **Equation**          | Minimize: RSS + α · Σⱼ \|βⱼ\|                                           |
| **Use Cases**         | Feature selection, high-dimensional data, regularization                |
| **Assumptions**       | Same as linear regression                                               |
| **Pros**              | Performs feature selection, reduces overfitting, interpretable          |
| **Cons**              | Can be unstable with correlated features, α needs tuning                |
| **Key Parameters**    | Regularization strength (α)                                             |
| **Evaluation Metrics**| MSE, RMSE, R² Score                                                     |
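
For reference, the objective that scikit-learn's `Lasso` minimizes scales the residual sum of squares by 1/(2n), so its α is not numerically interchangeable with the α in the unscaled table formula. Written out, with n samples, p features, and an intercept β₀ (notation introduced here):

$$
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\; \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^\top\beta\bigr)^2 \;+\; \alpha\sum_{j=1}^{p}\lvert\beta_j\rvert
$$

The intercept β₀ is not penalized.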

## Interview Q&A

**Q1: What is Lasso Regression?**  
A: It is linear regression with L1 regularization, which can shrink some coefficients to zero.

**Q2: When should you use Lasso Regression?**  
A: When you want automatic feature selection or have many predictors.

**Q3: How does the regularization parameter α affect the model?**  
A: A higher α increases the penalty, so more coefficients are set to zero.

**Q4: What is the difference between Lasso and Ridge Regression?**  
A: Lasso can eliminate features (set coefficients to zero); Ridge only shrinks them.

**Q5: How do you select α?**  
A: Use cross-validation to find the optimal value.

**Q6: Can Lasso Regression be used for feature selection?**  
A: Yes. Features whose coefficients are shrunk to zero are dropped, so the model keeps only the features it finds useful at the chosen α.

```python
# Lasso Regression with Pipeline & Hyperparameter Tuning

import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Lasso
from sklearn.pipeline import Pipeline
from sklearn.metrics import mean_squared_error, r2_score

# 1. Load dataset
data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

# 2. Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# 3. Pipeline: scaling + Lasso regression
pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('lasso', Lasso(max_iter=10000))
])

# 4. Hyperparameter grid for alpha (regularization strength)
param_grid = {
    'lasso__alpha': [0.001, 0.01, 0.1, 1.0, 10.0]
}

# 5. GridSearchCV setup
grid_search = GridSearchCV(
    estimator=pipeline,
    param_grid=param_grid,
    scoring='r2',
    cv=5,
    n_jobs=-1,
    verbose=1
)

# 6. Fit GridSearchCV
grid_search.fit(X_train, y_train)

# 7. Best model & predictions
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)

# 8. Metrics
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Best alpha: {grid_search.best_params_['lasso__alpha']}")
print(f"Mean Squared Error: {mse:.4f}")
print(f"R² Score: {r2:.4f}")

# 9. Optional: Coefficients to inspect feature importance / selection
coef = best_model.named_steps['lasso'].coef_
print("Lasso coefficients:", coef)
```


Output:

```text
Fitting 5 folds for each of 5 candidates, totalling 25 fits
Best alpha: 0.001
Mean Squared Error: 0.5545
R² Score: 0.5769
Lasso coefficients: [ 0.84914038  0.12334631 -0.28127333  0.32604963 -0.00106185 -0.03988954
 -0.88582217 -0.85809324]
```
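
The raw coefficient array is easier to read when paired with feature names. The sketch below copies the printed coefficients from the output above and labels them with the California housing feature names in column order; it does not refit anything.

```python
import pandas as pd

# Feature names of the California housing dataset, in column order.
feature_names = ["MedInc", "HouseAge", "AveRooms", "AveBedrms",
                 "Population", "AveOccup", "Latitude", "Longitude"]

# Coefficients copied from the printed output above.
coef = [0.84914038, 0.12334631, -0.28127333, 0.32604963,
        -0.00106185, -0.03988954, -0.88582217, -0.85809324]

coef_by_feature = pd.Series(coef, index=feature_names)

# Rank features by absolute coefficient size (features were standardized).
order = coef_by_feature.abs().sort_values(ascending=False).index
ranked = coef_by_feature.reindex(order)
print(ranked)
print("Coefficients exactly zero:", int((coef_by_feature == 0).sum()))
```

In this run no coefficient is exactly zero, so at the selected alpha of 0.001 Lasso removed no features. That value is also the smallest one in the grid, which suggests the search range may be worth extending toward smaller values.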
