Ridge Regression and Lasso Regression are two popular techniques used in linear regression to address issues related to overfitting and multicollinearity, which are common challenges in predictive modeling and regression analysis.


In [33]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt


In [34]:
iris = load_iris()
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=12)


In [41]:
# 5-fold cross-validation, scored with negative mean squared error
linear_regressor=LinearRegression()
linear_regressor.fit(X,y)
mse_score = cross_val_score(
    linear_regressor, X, y, scoring="neg_mean_squared_error", cv=5)
mean_mse = mse_score.mean()

print(f"Mean of MSE: {mean_mse}")

# The score is the negated MSE, so values closer to 0 indicate a better fit


Mean of MSE: -0.0689701255462469
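
The negative sign in the score above comes from scikit-learn's convention that a higher score is always better, so error metrics are negated; the recorded value of about -0.069 corresponds to an MSE of about 0.069. The following is a minimal sketch of that convention in plain Python. The helper `mse` and the two sets of predictions are hypothetical and exist only for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


# Hypothetical targets and predictions from two imaginary models.
y_true = [0, 1, 2, 2]
pred_a = [0.1, 0.9, 1.8, 2.2]   # close to the targets
pred_b = [0.5, 0.5, 1.5, 2.5]   # further from the targets

# Negating the error turns "lower is better" into "higher is better".
scores = {
    "model_a": -mse(y_true, pred_a),
    "model_b": -mse(y_true, pred_b),
}

# With negated errors, picking the maximum score picks the smallest error.
best = max(scores, key=scores.get)
print(scores, best)
```

Multiplying a negated score by -1 recovers the ordinary MSE when reporting results.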



# Ridge and Lasso Regression

**Ridge Regression** and **Lasso Regression** are two techniques used in linear regression to address issues related to overfitting and multicollinearity, common challenges in predictive modeling. They modify traditional linear regression by introducing regularization, adding penalty terms to the cost function.


In [36]:
from sklearn.linear_model import Ridge,Lasso
from sklearn.model_selection import GridSearchCV

parameters={"alpha":(1e-15,1e-10,1e-5,1e-3,1e-2,1,5,3,10,20,25,50,75,100)}

## Ridge Regression

- **Objective:** Ridge Regression applies L2 regularization: it seeks model parameters that fit the data well while keeping their magnitudes small.

- **Cost Function:** It adds a regularization term to the mean squared error (MSE) cost function:

   $$J(\theta) = \mathrm{MSE}(\theta) + \alpha \sum_{i=1}^{n} \theta_i^2$$

- **Why Use Ridge Regression:**
  
  1. **Prevents Overfitting:** Ridge Regression mitigates overfitting by penalizing large coefficients, leading to a simpler, more generalizable model.
  
  2. **Handles Multicollinearity:** When features are highly correlated, the penalty stabilizes the coefficient estimates, which would otherwise be very sensitive to small changes in the data.
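
To make the effect of the penalty concrete, here is a minimal sketch in plain Python for the simplest possible case: one feature and no intercept. Under that assumption the ridge cost has a closed-form minimizer. The functions `ridge_cost` and `ridge_solution` and the toy data are hypothetical and are not part of the notebook's scikit-learn workflow.

```python
def ridge_cost(theta, xs, ys, alpha):
    """J(theta) = MSE(theta) + alpha * theta^2 for one feature, no intercept."""
    n = len(xs)
    mse = sum((y - theta * x) ** 2 for x, y in zip(xs, ys)) / n
    return mse + alpha * theta ** 2


def ridge_solution(xs, ys, alpha):
    """Closed-form minimizer of ridge_cost: sum(x*y) / (sum(x^2) + n*alpha)."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + n * alpha)


# Hypothetical toy data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Solve for several penalty strengths and compare the coefficients.
ridge_thetas = {a: ridge_solution(xs, ys, a) for a in (0.0, 1.0, 10.0, 100.0)}
for alpha, theta in ridge_thetas.items():
    cost = ridge_cost(theta, xs, ys, alpha)
    print(f"alpha={alpha:>5}: theta={theta:.4f}, cost={cost:.4f}")
```

The coefficient shrinks steadily as α grows but stays non-zero. Note that scikit-learn's `Ridge` penalizes the sum of squared errors rather than the mean, so its `alpha` is on a different scale from the α in this sketch.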

In [37]:
ridge = Ridge()
ridge_regressor = GridSearchCV(
    ridge, parameters, scoring="neg_mean_squared_error", cv=5)

# Fit the grid search on the training data to find the best alpha and its score
ridge_regressor.fit(X_train,y_train)


GridSearchCV(cv=5, estimator=Ridge(),
             param_grid={'alpha': (1e-15, 1e-10, 1e-05, 0.001, 0.01, 1, 5, 3,
                                   10, 20, 25, 50, 75, 100)},
             scoring='neg_mean_squared_error')

In [38]:
print(ridge_regressor.best_params_)
print(ridge_regressor.best_score_)

{'alpha': 1}
-0.04961936142257062
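
Conceptually, the search above evaluates every candidate `alpha` with cross-validation and keeps the one with the highest (least negative) mean score. The following plain-Python sketch mimics that loop for a one-feature ridge model without an intercept. It is not how `GridSearchCV` is implemented: the helper names, the interleaved fold assignment, and the toy data are all assumptions made for illustration.

```python
def ridge_fit(xs, ys, alpha):
    """One-feature, no-intercept ridge minimizer for MSE + alpha * theta^2."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + n * alpha)


def neg_mse(theta, xs, ys):
    """Negated mean squared error, so that higher is better."""
    return -sum((y - theta * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_val_neg_mse(xs, ys, alpha, k):
    """Mean negated MSE over k interleaved folds (an illustrative split)."""
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(xs), k))
        train = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i not in test_idx]
        test = [(x, y) for i, (x, y) in enumerate(zip(xs, ys)) if i in test_idx]
        theta = ridge_fit([x for x, _ in train], [y for _, y in train], alpha)
        scores.append(neg_mse(theta, [x for x, _ in test], [y for _, y in test]))
    return sum(scores) / k


def grid_search(xs, ys, alphas, k):
    """Score every alpha and return the best one with its score and all results."""
    results = {alpha: cross_val_neg_mse(xs, ys, alpha, k) for alpha in alphas}
    best_alpha = max(results, key=results.get)
    return best_alpha, results[best_alpha], results


# Hypothetical toy data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.2, 3.9, 6.1, 8.3, 9.8, 12.1, 14.2, 15.9]
alphas = (0.0, 0.01, 0.1, 1.0, 10.0)

best_alpha, best_score, results = grid_search(xs, ys, alphas, k=4)
print(best_alpha, best_score)
```

The returned pair plays the same role as `best_params_` and `best_score_` in the cells above.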


## Lasso Regression

- **Objective:** Lasso Regression applies L1 regularization: it seeks a model that fits the data well while shrinking coefficients, driving some of them exactly to zero.

- **Cost Function:** It introduces a regularization term to the cost function:

  $$J(\theta) = \mathrm{MSE}(\theta) + \alpha \sum_{i=1}^{n} |\theta_i|$$

- **Why Use Lasso Regression:**

  1. **Feature Selection:** Lasso Regression automatically selects relevant features by zeroing out less important coefficients.
  
  2. **Handles Multicollinearity:** Among a group of highly correlated features, Lasso tends to keep one and shrink the others to zero.
  
  3. **Simplifies the Model:** Lasso produces a simplified model with fewer variables, aiding interpretation.
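
The zeroing behavior can be seen directly in the simplest case. The sketch below, in plain Python, assumes one feature and no intercept, where the Lasso cost is minimized by a soft-thresholding rule. The functions `soft_threshold` and `lasso_solution` and the toy data are hypothetical and are not part of the notebook's scikit-learn workflow.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 if it would cross zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_solution(xs, ys, alpha):
    """Minimizer of MSE(theta) + alpha * |theta| for one feature, no intercept."""
    n = len(xs)
    b = sum(x * y for x, y in zip(xs, ys)) / n   # mean of x * y
    a = sum(x * x for x in xs) / n               # mean of x^2
    return soft_threshold(b, alpha / 2) / a


# Hypothetical toy data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Solve for several penalty strengths and compare the coefficients.
lasso_thetas = {a: lasso_solution(xs, ys, a) for a in (0.0, 1.0, 10.0, 30.0, 100.0)}
for alpha, theta in lasso_thetas.items():
    print(f"alpha={alpha:>5}: theta={theta:.4f}")
```

Once α is large enough, the coefficient becomes exactly zero rather than merely small, which is what lets Lasso act as a feature selector.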

In [39]:
lasso = Lasso()
# Search over the Lasso estimator
lasso_regressor = GridSearchCV(
    lasso, parameters, scoring="neg_mean_squared_error", cv=5)

# Fit the grid search on the training data to find the best alpha and its score
lasso_regressor.fit(X_train, y_train)


GridSearchCV(cv=5, estimator=Lasso(),
             param_grid={'alpha': (1e-15, 1e-10, 1e-05, 0.001, 0.01, 1, 5, 3,
                                   10, 20, 25, 50, 75, 100)},
             scoring='neg_mean_squared_error')

In [40]:
print(lasso_regressor.best_params_)
print(lasso_regressor.best_score_)


Ridge and Lasso Regression are valuable tools when working with linear regression models, especially where feature selection, multicollinearity, or overfitting is a concern. Ridge is a natural choice when most features are expected to contribute, while Lasso suits problems where only a few features matter. In either case, the regularization parameter α needs to be tuned, for example with cross-validation, to find a good model for your data.