# Q1. What is Lasso Regression, and How Does It Differ from Other Regression Techniques?
# Lasso Regression (Least Absolute Shrinkage and Selection Operator) is a linear regression technique that adds an L1 regularization term to the loss function.
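# The objective is to minimize:
# Loss function = RSS (Residual Sum of Squares) + λ * (sum of absolute values of coefficients)
# Note: scikit-learn's Lasso scales the RSS term by 1 / (2 * n_samples) and calls λ "alpha",
# so alpha values are not directly comparable to λ values from an unscaled formulation.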
# Lasso can set some of the coefficients to zero, effectively performing feature selection by eliminating irrelevant predictors.
# It differs from ordinary least squares regression (OLS) and Ridge Regression:
# - OLS minimizes the residual sum of squares only (no regularization term).
# - Ridge uses L2 regularization (penalizes large coefficients but does not eliminate them).
# - Lasso uses L1 regularization (can shrink coefficients to zero, performing feature selection).

from sklearn.linear_model import Lasso
import numpy as np

# Example of Lasso Regression
X = np.array([[1], [2], [3], [4], [5]])  # Independent variable
y = np.array([1, 2, 3, 4, 5])            # Dependent variable

lasso_model = Lasso(alpha=0.1)
lasso_model.fit(X, y)
print(f"Lasso Coefficients: {lasso_model.coef_}")
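
# As a sanity check on the objective, the one-feature case has a closed-form solution.
# This is a minimal sketch, assuming scikit-learn's scaling of the objective:
# (1 / (2 * n_samples)) * RSS + alpha * sum(|coef|).
# The names x_demo, y_demo, alpha_demo and soft_threshold are illustrative only.
import numpy as np
from sklearn.linear_model import Lasso

x_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
alpha_demo = 0.1

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, returning zero if it would cross zero."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

# Center the data because Lasso fits an unpenalized intercept by default.
x_centered = x_demo - x_demo.mean()
y_centered = y_demo - y_demo.mean()
n_samples = len(x_demo)

# One-feature solution: soft-threshold the covariance, then divide by the variance.
covariance = x_centered @ y_centered / n_samples   # 2.0 for this data
variance = x_centered @ x_centered / n_samples     # 2.0 for this data
closed_form_coef = soft_threshold(covariance, alpha_demo) / variance
closed_form_intercept = y_demo.mean() - closed_form_coef * x_demo.mean()

sklearn_fit = Lasso(alpha=alpha_demo).fit(x_demo.reshape(-1, 1), y_demo)
print(f"Closed-form coefficient: {closed_form_coef:.4f}")
print(f"sklearn coefficient:     {sklearn_fit.coef_[0]:.4f}")
# The slope is shrunk below the OLS slope of 1.0 by the L1 penalty.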

# Q2. What is the Main Advantage of Using Lasso Regression in Feature Selection?
# The main advantage of Lasso Regression in feature selection is its ability to shrink the coefficients of some features to zero.
# This means that Lasso automatically performs feature selection by identifying and eliminating irrelevant features.
# This is particularly useful when there are many features, as Lasso can simplify the model and reduce overfitting.
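
# A minimal sketch of this behavior on synthetic data (not part of the original example).
# Assumptions: 10 random features, of which only the first three affect the target;
# the names X_fs, y_fs, true_coef and lasso_fs are illustrative.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X_fs = rng.normal(size=(200, 10))
true_coef = np.array([3.0, -2.0, 1.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y_fs = X_fs @ true_coef + rng.normal(scale=0.5, size=200)

lasso_fs = Lasso(alpha=0.1).fit(X_fs, y_fs)

# Features whose coefficient is exactly zero have been dropped from the model.
selected_features = np.flatnonzero(lasso_fs.coef_)
print(f"Coefficients: {np.round(lasso_fs.coef_, 3)}")
print(f"Selected feature indices: {selected_features}")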

# Q3. How Do You Interpret the Coefficients of a Lasso Regression Model?
# The interpretation of the coefficients in a Lasso Regression model is similar to that in ordinary linear regression:
# - A positive coefficient means that as the corresponding independent variable increases, the dependent variable increases.
# - A negative coefficient means that as the independent variable increases, the dependent variable decreases.
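# - The magnitude of the coefficient indicates the strength of the relationship, but magnitudes are only
#   comparable across features when the features are on the same scale (for example, after standardization).
# Lasso coefficients are also shrunk toward zero by the penalty, so they tend to understate the OLS effect sizes.
# Since Lasso may shrink some coefficients to zero, variables with zero coefficients are effectively excluded from the model.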

# Example interpretation
print(f"Lasso Intercept: {lasso_model.intercept_}")
print(f"Lasso Coefficients: {lasso_model.coef_}")
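
# Coefficient magnitudes depend on the units of each feature. The sketch below is one way to
# make them comparable, assuming a StandardScaler + Lasso pipeline; the synthetic data and the
# names feature_small, feature_large, raw_fit and std_fit are illustrative.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
feature_small = rng.normal(scale=1.0, size=300)      # unit-scale feature
feature_large = rng.normal(scale=1000.0, size=300)   # same kind of signal, larger units
X_units = np.column_stack([feature_small, feature_large])
# Both features contribute about equally per standard deviation of the feature.
y_units = 2.0 * feature_small + 0.002 * feature_large + rng.normal(scale=0.1, size=300)

raw_fit = Lasso(alpha=0.05).fit(X_units, y_units)
std_fit = make_pipeline(StandardScaler(), Lasso(alpha=0.05)).fit(X_units, y_units)
std_coef = std_fit.named_steps["lasso"].coef_

print(f"Raw coefficients:          {raw_fit.coef_}")
print(f"Standardized coefficients: {std_coef}")
# On the raw scale the two coefficients differ by orders of magnitude;
# after standardization they are of similar size and can be compared directly.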

# Q4. What are the Tuning Parameters That Can Be Adjusted in Lasso Regression, and How Do They Affect the Model's Performance?
# The main tuning parameter in Lasso Regression is the regularization parameter, λ (alpha in sklearn).
# - A larger value of λ increases regularization, resulting in more shrinkage of coefficients and more features being eliminated.
# - A smaller value of λ reduces the effect of regularization, allowing the model to fit more closely to the data, but this can lead to overfitting.
# Cross-validation is typically used to select the optimal value of λ.

from sklearn.model_selection import GridSearchCV

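# Example: Grid search for optimal alpha (regularization parameter)
# With only 5 samples, cv=5 leaves a single sample in each test fold, where the default
# R^2 score is undefined, so each fold is scored by negative mean squared error instead.
parameters = {'alpha': [0.01, 0.1, 1, 10, 100]}
lasso_grid = GridSearchCV(Lasso(), parameters, cv=5, scoring='neg_mean_squared_error')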
lasso_grid.fit(X, y)
print(f"Best alpha value: {lasso_grid.best_params_['alpha']}")
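
# To illustrate how alpha controls sparsity, the sketch below refits Lasso over a few alpha
# values and counts the non-zero coefficients. The synthetic data and the names X_sweep,
# y_sweep and nonzero_counts are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
X_sweep = rng.normal(size=(150, 8))
sweep_coef = np.array([4.0, -3.0, 2.0, 1.0, 0.5, 0.0, 0.0, 0.0])
y_sweep = X_sweep @ sweep_coef + rng.normal(scale=0.5, size=150)

nonzero_counts = {}
for alpha_value in [0.01, 0.1, 1.0, 10.0]:
    sweep_fit = Lasso(alpha=alpha_value).fit(X_sweep, y_sweep)
    nonzero_counts[alpha_value] = int(np.count_nonzero(sweep_fit.coef_))
    print(f"alpha={alpha_value:<5} non-zero coefficients: {nonzero_counts[alpha_value]}")
# Larger alpha values generally leave fewer non-zero coefficients;
# a large enough alpha removes every feature and leaves only the intercept.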

# Q5. Can Lasso Regression be Used for Non-Linear Regression Problems? If Yes, How?
# Lasso Regression is inherently a linear regression technique, meaning it models linear relationships between the features and the target.
# However, it can be extended to handle non-linear relationships by fitting it on non-linearly transformed features:
# - Polynomial features: By transforming the input features into higher-degree polynomials, Lasso can model non-linear relationships.
# - Other basis expansions: Non-linear transformations of the input data (for example splines or approximate kernel feature maps)
#   can be applied before Lasso. The model stays linear in the transformed features, and the L1 penalty selects among them.

from sklearn.preprocessing import PolynomialFeatures

# Example: Using polynomial features with Lasso Regression
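# include_bias=False: Lasso already fits an intercept, so the constant column is redundant
poly = PolynomialFeatures(degree=2, include_bias=False)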
X_poly = poly.fit_transform(X)

lasso_poly = Lasso(alpha=0.1)
lasso_poly.fit(X_poly, y)
print(f"Lasso Coefficients with Polynomial Features: {lasso_poly.coef_}")
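
# The toy data above is perfectly linear, so the polynomial terms add little there.
# The sketch below uses a curved synthetic target to show the effect more clearly.
# Assumptions: a quadratic ground truth and a PolynomialFeatures + StandardScaler + Lasso
# pipeline; the names x_curve, y_curve, linear_lasso and poly_lasso are illustrative.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(3)
x_curve = rng.uniform(-3, 3, size=(200, 1))
y_curve = 0.5 * x_curve[:, 0] ** 2 - x_curve[:, 0] + rng.normal(scale=0.2, size=200)

# Plain Lasso can only fit a straight line through the curve.
linear_lasso = Lasso(alpha=0.01).fit(x_curve, y_curve)

# Expanding to polynomial terms lets the same linear model follow the curvature.
poly_lasso = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    StandardScaler(),
    Lasso(alpha=0.01),
).fit(x_curve, y_curve)

linear_r2 = linear_lasso.score(x_curve, y_curve)
poly_r2 = poly_lasso.score(x_curve, y_curve)
print(f"Training R^2, linear features:     {linear_r2:.3f}")
print(f"Training R^2, polynomial features: {poly_r2:.3f}")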

# Q6. What is the Difference Between Ridge Regression and Lasso Regression?
# The main difference between Ridge and Lasso Regression lies in the type of regularization they use:
# - Ridge Regression uses L2 regularization (penalizes the sum of squared coefficients), which shrinks coefficients but does not set them to zero.
# - Lasso Regression uses L1 regularization (penalizes the sum of absolute values of coefficients), which can set some coefficients to zero, performing feature selection.
# Lasso is more suited for sparse models where many features are irrelevant, while Ridge is better when all features contribute to the model.

# Example of Ridge and Lasso comparison
from sklearn.linear_model import Ridge

ridge_model = Ridge(alpha=0.1)
ridge_model.fit(X, y)
print(f"Ridge Coefficients: {ridge_model.coef_}")
print(f"Lasso Coefficients: {lasso_model.coef_}")
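
# With a single feature the two models look almost identical. The sketch below compares them
# on synthetic data with several irrelevant features, where the difference in sparsity shows up.
# The data and the names X_cmp, y_cmp, ridge_cmp and lasso_cmp are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(4)
X_cmp = rng.normal(size=(200, 10))
# Only the first two features influence the target.
y_cmp = 3.0 * X_cmp[:, 0] - 2.0 * X_cmp[:, 1] + rng.normal(scale=0.5, size=200)

ridge_cmp = Ridge(alpha=1.0).fit(X_cmp, y_cmp)
lasso_cmp = Lasso(alpha=0.2).fit(X_cmp, y_cmp)

# Count coefficients that are exactly zero in each model.
ridge_zeros = int(np.sum(ridge_cmp.coef_ == 0))
lasso_zeros = int(np.sum(lasso_cmp.coef_ == 0))
print(f"Ridge coefficients set to zero: {ridge_zeros}")
print(f"Lasso coefficients set to zero: {lasso_zeros}")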

# Q7. Can Lasso Regression Handle Multicollinearity in the Input Features? If Yes, How?
# Lasso Regression can handle multicollinearity to some extent. Multicollinearity occurs when independent variables are highly correlated.
# Lasso tends to keep one feature from a group of correlated features and shrink the others to zero.
# This removes redundant predictors, but the choice among correlated features can be arbitrary and unstable across samples,
# so the selected feature should not be read as the "true" driver. Ridge or Elastic Net are common alternatives when
# correlated features should share the weight instead.
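
# A minimal sketch of this behavior with two nearly identical features (synthetic data;
# the names x_base, x_copy, lasso_corr and ridge_corr are illustrative assumptions).
# Which of the two features Lasso favors depends on the data, so no specific output is claimed.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(5)
x_base = rng.normal(size=200)
x_copy = x_base + rng.normal(scale=0.01, size=200)   # almost a duplicate of x_base
X_corr = np.column_stack([x_base, x_copy])
y_corr = 3.0 * x_base + rng.normal(scale=0.5, size=200)

lasso_corr = Lasso(alpha=0.1).fit(X_corr, y_corr)
ridge_corr = Ridge(alpha=1.0).fit(X_corr, y_corr)

# Lasso is free to put the shared effect on either feature; Ridge splits it almost evenly.
print(f"Lasso coefficients: {lasso_corr.coef_}")
print(f"Ridge coefficients: {ridge_corr.coef_}")
print(f"Sum of Lasso coefficients: {lasso_corr.coef_.sum():.3f}")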

# Q8. How Do You Choose the Optimal Value of the Regularization Parameter (λ) in Lasso Regression?
# The optimal value of the regularization parameter λ is typically chosen using cross-validation.
# A smaller value of λ allows the model to fit the data more closely (risk of overfitting), while a larger λ shrinks coefficients more, potentially underfitting.
# Grid search or randomized search over candidate values, scored by cross-validation, is commonly used to find a λ
# that balances bias and variance. The selected λ is the best among the candidates tried, not a guaranteed optimum.

# Example of cross-validation for tuning alpha
from sklearn.model_selection import cross_val_score

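# Evaluate performance using cross-validation
# With 5 samples and cv=5 each test fold holds one sample, where the default R^2 score
# is undefined, so the folds are scored by negative mean squared error instead.
lasso_cv = Lasso(alpha=0.1)
scores = cross_val_score(lasso_cv, X, y, cv=5, scoring='neg_mean_squared_error')
print(f"Mean cross-validated MSE: {-scores.mean()}")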
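
# scikit-learn also provides LassoCV, which searches a path of alpha values with built-in
# cross-validation. The sketch below is one possible setup on synthetic data; the data and
# the names X_path, y_path and lasso_path_search are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(6)
X_path = rng.normal(size=(200, 10))
y_path = 3.0 * X_path[:, 0] - 2.0 * X_path[:, 1] + rng.normal(scale=0.5, size=200)

# LassoCV builds its own grid of alphas and keeps the one with the lowest mean CV error.
lasso_path_search = LassoCV(cv=5, random_state=0).fit(X_path, y_path)

print(f"Selected alpha: {lasso_path_search.alpha_}")
print(f"Non-zero coefficients: {np.count_nonzero(lasso_path_search.coef_)}")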
