<a href="https://colab.research.google.com/github/ravirajgm/Raviraj_Data_Science_Case_Studies/blob/main/Lasso_Ridge_Regression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Types of Regularized Regression
Two commonly used types of regularized regression methods are ridge regression and lasso regression.

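Ridge regression adds an L2 penalty (the sum of squared coefficients, scaled by a tuning parameter) to the least-squares objective. The penalty shrinks the coefficient estimates toward zero without setting any of them exactly to zero, which stabilizes the fit when the number of predictor variables exceeds the number of observations, or when a data set has multicollinearity (correlations between predictor variables).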
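As a minimal sketch of how ridge behaves under multicollinearity, the estimate can be computed in closed form by solving (XᵀX + αI)β = Xᵀy. The helper name `ridge_closed_form`, the synthetic two-column data, and the choice to omit the intercept are assumptions made for illustration; none of them come from the notebook.

```python
import numpy as np

def ridge_closed_form(X, y, alpha):
    """Hypothetical helper: solve (X^T X + alpha * I) beta = X^T y (no intercept)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

# Synthetic data (assumption): x2 is nearly a copy of x1, so the columns are collinear
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.01 * rng.normal(size=200)
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.5, size=200)

# alpha = 0 is ordinary least squares; larger alpha shrinks the coefficient vector
for alpha in [0.0, 1.0, 100.0]:
    beta = ridge_closed_form(X, y, alpha)
    print("alpha:", alpha, "coefficients:", beta, "norm:", np.linalg.norm(beta))
```

The norm of the coefficient vector gets smaller as alpha grows, but neither coefficient is forced to exactly zero.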

Lasso regression is a type of linear regression that uses shrinkage through an L1 penalty (the sum of absolute coefficient values). Shrinkage here means that the coefficient estimates are pulled toward zero; with a large enough penalty, some of them are set exactly to zero. This makes lasso useful when you have high levels of multicollinearity or when you want to automate certain parts of model selection, like variable selection/parameter elimination.
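For intuition about why lasso can eliminate variables while ridge only shrinks them, consider the one-coefficient case. Minimizing 0.5*(x - b)² + lam*|x| gives the soft-thresholding rule, while minimizing 0.5*(x - b)² + 0.5*lam*x² gives a simple rescaling. The function names, the example coefficients in `b_ols`, and the value of `lam` below are hypothetical; this is a sketch of the shrinkage rules, not of scikit-learn's solver.

```python
import numpy as np

def soft_threshold(b, lam):
    """Minimizer of 0.5*(x - b)**2 + lam*|x| (lasso-style shrinkage)."""
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

def ridge_shrink(b, lam):
    """Minimizer of 0.5*(x - b)**2 + 0.5*lam*x**2 (ridge-style shrinkage)."""
    return b / (1.0 + lam)

# Hypothetical unpenalized coefficients, chosen only for illustration
b_ols = np.array([2.0, -0.8, 0.3, -0.05])
lam = 0.5

lasso_like = soft_threshold(b_ols, lam)
ridge_like = ridge_shrink(b_ols, lam)

print("lasso-like:", lasso_like)
print("ridge-like:", ridge_like)
print("exact zeros (lasso-like):", int(np.sum(lasso_like == 0)))
print("exact zeros (ridge-like):", int(np.sum(ridge_like == 0)))
```

Coefficients whose magnitude is below `lam` become exactly zero under soft-thresholding, whereas the ridge-style rule keeps every coefficient nonzero.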

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn.linear_model import Ridge
from sklearn.linear_model import Lasso
from sklearn.linear_model import LinearRegression
import statsmodels.api as sm



In [2]:
# Ridge Regression

wine_quality = pd.read_csv("/content/drive/MyDrive/Colab Notebooks/Data/winequality-red.csv",sep=';')  
wine_quality.rename(columns=lambda x: x.replace(" ", "_"), inplace=True)

all_colnms = ['fixed_acidity', 'volatile_acidity', 'citric_acid', 'residual_sugar',
 'chlorides', 'free_sulfur_dioxide', 'total_sulfur_dioxide', 'density',
 'pH', 'sulphates', 'alcohol']


pdx = wine_quality[all_colnms]
pdy = wine_quality["quality"]

x_train,x_test,y_train,y_test = train_test_split(pdx,pdy,train_size = 0.7,random_state=42)

alphas = [1e-4,1e-3,1e-2,0.1,0.5,1.0,5.0,10.0]

# Best test R-squared seen so far; a row is printed only when this improves
initrsq = 0

print("\nRidge Regression: Best Parameters\n")
for alph in alphas:
    ridge_reg = Ridge(alpha=alph)
    ridge_reg.fit(x_train, y_train)
    tr_rsqrd = ridge_reg.score(x_train, y_train)
    ts_rsqrd = ridge_reg.score(x_test, y_test)

    # "Lambda" in the printout is scikit-learn's alpha (the penalty strength)
    if ts_rsqrd > initrsq:
        print("Lambda: ", alph, "Train R-Squared value:", round(tr_rsqrd, 5), "Test R-squared value:", round(ts_rsqrd, 5))
        initrsq = ts_rsqrd

# Coefficients of the ridge model at alpha = 0.001
# (the search above printed its best test R-squared at alpha = 0.0001)
ridge_reg = Ridge(alpha=0.001)
ridge_reg.fit(x_train, y_train)

print("\nRidge Regression coefficient values of Alpha = 0.001\n")
for name, coef in zip(all_colnms, ridge_reg.coef_):
    print(name, ": ", coef)



Ridge Regression: Best Parameters

Lambda:  0.0001 Train R-Squared value: 0.3612 Test R-squared value: 0.35135

Ridge Regression coefficient values of Alpha = 0.001

fixed_acidity :  0.015506587508043793
volatile_acidity :  -1.1050982354876895
citric_acid :  -0.2487986553235105
residual_sugar :  0.004018895392835028
chlorides :  -1.684383962086347
free_sulfur_dioxide :  0.004636901710963127
total_sulfur_dioxide :  -0.0032837679041055035
density :  -5.567271746802898
pH :  -0.362480017204004
sulphates :  0.8009191228025629
alcohol :  0.2999182442952101
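
The loop above picks alpha by comparing R-squared on the test set, which means the same data is used both to tune the model and to report its performance. One common alternative is to choose alpha by cross-validation on the training split only. The sketch below uses scikit-learn's `RidgeCV` for that; the synthetic data from `make_regression` is an assumption standing in for the wine data so that the example runs on its own, and the variable names are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the wine data (assumption): 11 predictors, noisy target
X, y = make_regression(n_samples=300, n_features=11, noise=10.0, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, random_state=42)

alphas = [1e-4, 1e-3, 1e-2, 0.1, 0.5, 1.0, 5.0, 10.0]

# 5-fold cross-validation on the training split selects alpha;
# the test split is touched only once, for the final score
ridge_cv = RidgeCV(alphas=alphas, cv=5)
ridge_cv.fit(X_tr, y_tr)

print("alpha chosen by CV:", ridge_cv.alpha_)
print("held-out test R-squared:", round(ridge_cv.score(X_te, y_te), 5))
```

With the wine data, `x_train`, `y_train`, `x_test`, and `y_test` from the cell above would take the place of the synthetic splits.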


In [3]:
# Lasso Regression (Lasso was already imported in the first cell)

alphas = [1e-4, 1e-3, 1e-2, 0.1, 0.5, 1.0, 5.0, 10.0]
# Best test R-squared seen so far; a row is printed only when this improves
initrsq = 0
print("\nLasso Regression: Best Parameters\n")

for alph in alphas:
    lasso_reg = Lasso(alpha=alph)
    lasso_reg.fit(x_train, y_train)
    tr_rsqrd = lasso_reg.score(x_train, y_train)
    ts_rsqrd = lasso_reg.score(x_test, y_test)

    # "Lambda" in the printout is scikit-learn's alpha (the penalty strength)
    if ts_rsqrd > initrsq:
        print("Lambda: ", alph, "Train R-Squared value:", round(tr_rsqrd, 5), "Test R-squared value:", round(ts_rsqrd, 5))
        initrsq = ts_rsqrd

# Coefficients of the lasso model at alpha = 0.001
# (the search above printed its best test R-squared at alpha = 0.0001)
lasso_reg = Lasso(alpha=0.001)
lasso_reg.fit(x_train, y_train)

print("\nLasso Regression coefficient values of Alpha = 0.001\n")
for name, coef in zip(all_colnms, lasso_reg.coef_):
    print(name, ": ", coef)


Lasso Regression: Best Parameters

Lambda:  0.0001 Train R-Squared value: 0.36101 Test R-squared value: 0.35057

Lasso Regression coefficient values of Alpha = 0.001

fixed_acidity :  0.014149546369062422
volatile_acidity :  -1.0906236090493848
citric_acid :  -0.18529515004737027
residual_sugar :  -0.00013661024678723296
chlorides :  -1.058775797041006
free_sulfur_dioxide :  0.0048316481751489865
total_sulfur_dioxide :  -0.0032672288559592293
density :  -0.0
pH :  -0.25690192587072963
sulphates :  0.694487540316411
alcohol :  0.3077561491242808
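
In the output above, lasso sets the `density` coefficient to zero while ridge keeps it at about -5.57. One plausible contributor is feature scale: the L1 penalty acts on the raw coefficient, so a predictor that varies over a very narrow range needs a large coefficient to matter and is penalized heavily for it. The sketch below illustrates that effect on synthetic data and shows how standardizing the predictors changes the result. The data, the feature names `wide` and `narrow`, and the alpha value are assumptions for illustration, not a diagnosis of the wine data.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): both features contribute to y,
# but "narrow" varies over a much smaller range than "wide"
rng = np.random.default_rng(0)
n = 500
wide = rng.normal(scale=10.0, size=n)
narrow = rng.normal(scale=0.002, size=n)
X = np.column_stack([wide, narrow])
y = 0.3 * wide + 500.0 * narrow + rng.normal(scale=0.5, size=n)

# Same penalty strength, with and without standardizing the predictors
raw = Lasso(alpha=0.01).fit(X, y)
scaled = make_pipeline(StandardScaler(), Lasso(alpha=0.01)).fit(X, y)

print("raw coefficients:   ", raw.coef_)
print("scaled coefficients:", scaled.named_steps["lasso"].coef_)
```

On the raw features the narrow-range predictor is dropped; after standardization it is retained. Note that coefficients from the scaled pipeline are expressed per standard deviation of each predictor, so they are not directly comparable to the raw ones.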
