# Performance Comparison of Regularized and Unregularized Regression Models

### Introduction
Regularization is an essential technique for reducing overfitting, especially in regression problems. An overfitted model shows a large gap between its Root Mean Square Error (RMSE) on the train set and its RMSE on the test set. A Regularized Regression Model tends to show a smaller difference between Train and Test RMSE than the Classical Regression Model.

We will evaluate and compare the Unregularized Classical Multilinear Regression Model with Regularized Multilinear Regression Models on a single dataset. We will compare the RMSE on the Train and Test sets and try to determine which Regression Model performs best for this dataset. We are going to use four Regression Models:

1. Linear Regression Model
2. Lasso Regression Model
3. Ridge Regression Model
4. ElasticNet Regression Model


Regularization in linear regression is a technique that reduces overfitting by penalizing the coefficients of the regression equation. Coefficients in an overfitted model tend to be inflated, so adding a penalty on them keeps them from growing too large. Overfitted models perform well on the training data but fail on test or new data, which makes them of little practical use. We add a penalty term based on the coefficients to the cost function, which is the Mean Squared Error (Sum of Squared Residuals Divided by Degrees of Freedom) of the regression model, so large coefficients increase the cost. The optimizer therefore shrinks the coefficients to reduce the cost function. Regularization penalizes all the parameters except the intercept.
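The penalized cost described above can be written compactly. The notation below is an assumption made for illustration (the symbols $\beta_0$, $\beta_j$, $\alpha$, and $P$ do not appear in the original text), and the exact scaling of each term differs between libraries:

$$
J(\beta_0, \beta_1, \dots, \beta_p) = \mathrm{MSE}(\beta_0, \beta_1, \dots, \beta_p) + \alpha \, P(\beta_1, \dots, \beta_p), \qquad \alpha \ge 0
$$

Here $P$ is a non-negative penalty that grows with the size of the coefficients. The intercept $\beta_0$ is left out of $P$, so only the slope coefficients are shrunk, and setting $\alpha = 0$ recovers the unregularized cost.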

Two Regularization techniques can be used to prevent overfitting. The L1 Regularization, or LASSO, adds the sum of the absolute values of the coefficients as a penalty to the cost function. The L2 Regularization, or Ridge, adds the sum of the squared values of the coefficients as a penalty to the cost function. The alpha value controls how strongly we penalize the coefficients.
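As a small numeric illustration of the two penalties, the sketch below computes both for a made-up coefficient vector. The values of `coefs` and `alpha` are hypothetical and are not taken from any model in this notebook.

```python
import numpy as np

# Hypothetical coefficient vector (intercept excluded) and penalty strength.
coefs = np.array([3.0, -4.0, 0.5])
alpha = 1.0

# L1 (Lasso) penalty: alpha times the sum of absolute coefficient values.
l1_penalty = alpha * np.sum(np.abs(coefs))

# L2 (Ridge) penalty: alpha times the sum of squared coefficient values.
l2_penalty = alpha * np.sum(coefs ** 2)

print("L1 penalty:", l1_penalty)  # 7.5
print("L2 penalty:", l2_penalty)  # 25.25
```

Because the L2 penalty squares each coefficient, the large coefficients (3.0 and -4.0) dominate it, while the L1 penalty grows only linearly with coefficient size.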

The unregularized Regression Model is our Classical Linear (or Multilinear) Regression Model, while the Regularized Regression Models are based on these Regularization techniques. The Ridge Regression Model is based on the L2 Regularization technique, and the Lasso Regression Model is based on the L1 Regularization technique. The ElasticNet Regression Model combines both L1 and L2 Regularization techniques. Let’s compare the performance of the Unregularized Regression Model with that of the Regularized Regression Models.

### Classic Multilinear Regression Model
Firstly, we build a Classic Multilinear Regression Model for the specified dataset. We will calculate the Train and Test RMSE and later compare them with those of the Regularized Regression Models.

In [45]:
#importing all libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
import warnings
warnings.filterwarnings('ignore')

We picked a dataset from [Kaggle](https://www.kaggle.com/aungpyaeap/fish-market) that contains data about common fish species.

In [46]:
#file location
location = 'P:/Projects/Fish.csv'

In [47]:
#reading csv data
df = pd.read_csv(location)

In [48]:
#fetching rows
df.head()

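|   | Species | Weight | Length1 | Length2 | Length3 | Height | Width |
|---|---------|--------|---------|---------|---------|--------|-------|
| 0 | Bream | 242.0 | 23.2 | 25.4 | 30.0 | 11.52 | 4.02 |
| 1 | Bream | 290.0 | 24.0 | 26.3 | 31.2 | 12.48 | 4.3056 |
| 2 | Bream | 340.0 | 23.9 | 26.5 | 31.1 | 12.3778 | 4.6961 |
| 3 | Bream | 363.0 | 26.3 | 29.0 | 33.5 | 12.73 | 4.4555 |
| 4 | Bream | 430.0 | 26.5 | 29.0 | 34.0 | 12.444 | 5.134 |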


In [49]:
#fish species
df.Species.value_counts()

Perch        56
Bream        35
Roach        20
Pike         17
Smelt        14
Parkki       11
Whitefish     6
Name: Species, dtype: int64

For this dataset, our task is to predict the Weight of a fish based on several features. Thus, ‘Weight’ is our target variable. For the x variable, we take every feature except ‘Weight’, and for the y variable, we take only the target ‘Weight’.

In [50]:
#selecting predictor variables and target variable
x = df.drop('Weight', axis = 1)
y = df['Weight']
y = y.values.reshape(-1,1)

Since our dataset contains a categorical feature (Species), we will use the pandas get_dummies() function to create dummy variables for its unique classes. We set the drop_first parameter to True to avoid the dummy variable trap, in which the full set of dummy columns is perfectly collinear with the intercept.
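To see what drop_first does, here is a minimal sketch on a toy frame. The rows are made up for illustration; only the column names mirror the dataset.

```python
import pandas as pd

# Toy frame with hypothetical values; only the column names mirror the dataset.
toy = pd.DataFrame({
    "Species": ["Bream", "Perch", "Roach", "Perch"],
    "Height": [11.5, 6.2, 5.6, 7.3],
})

# drop_first=True drops the first category in sorted order ("Bream"),
# which then acts as the baseline encoded by all-zero dummy columns.
encoded = pd.get_dummies(toy, drop_first=True)
print(list(encoded.columns))  # ['Height', 'Species_Perch', 'Species_Roach']
```

Three species produce only two dummy columns, so a Bream row is identified by both dummies being zero.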

In [51]:
# Creating Dummies for Categorical Features
x = pd.get_dummies(x, drop_first = True)

In [52]:
# train-test splitting data
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.3, random_state = 55)

### Unregularized Regression Model Performance

In [53]:
#instantiating model class
lr = LinearRegression()

In [54]:
#fitting training data
lr.fit(x_train, y_train)

LinearRegression()

In [55]:
#printing RMSE scores on train and test set
print("Train RMSE: ", np.round(np.sqrt(metrics.mean_squared_error(y_train, lr.predict(x_train))), 5))
print("Test RMSE: ", np.round(np.sqrt(metrics.mean_squared_error(y_test, lr.predict(x_test))), 5))

Train RMSE:  75.33595
Test RMSE:  182.23395


**RMSE Difference = ~106.9**
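For reference, the RMSE and the train-test gap can be reproduced by hand. The helper name `rmse` and the toy arrays below are assumptions for illustration; the two scores are copied from the output above.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equally sized arrays."""
    # ravel() guards against shape mismatches such as (n, 1) versus (n,),
    # which would otherwise broadcast to an (n, n) matrix of differences.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy check with made-up values: errors are 2 and 0, so RMSE = sqrt(4 / 2).
toy_rmse = rmse([3.0, 5.0], [1.0, 5.0])

# Gap for the unregularized model, using the scores printed above.
train_rmse, test_rmse = 75.33595, 182.23395
gap = test_rmse - train_rmse
print(round(toy_rmse, 5), round(gap, 3))  # 1.41421 106.898
```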

As discussed earlier, a model that generalizes well shows nearly equal Train and Test Errors. We will now compare this Train-Test gap with those of the Regularized Regression Models to judge how well the Classical Regression Model generalizes.

### Regularized Regression Model Performance

For the Regularized Regression Models, we will look at LASSO Regression, Ridge Regression, and ElasticNet Regression and compare their performance (the difference between Train and Test error values) with that of the Unregularized Classical Regression Model evaluated above.

### 1.  LASSO Regression
LASSO (Least Absolute Shrinkage and Selection Operator), or L1, is a regularization technique in which the sum of the absolute coefficient values is added to the cost function (or MSE) as a penalty. Lasso Regression is based on this L1 Regularization technique.
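The "selection" part of the name refers to the tendency of the L1 penalty to set some coefficients exactly to zero. The sketch below illustrates this on synthetic data, which is an assumption made for demonstration and unrelated to the fish dataset; `ols` and `lasso_demo` are illustrative names.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data (an assumption for illustration): only the first of five
# features actually influences the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

ols = LinearRegression().fit(X, y)
lasso_demo = Lasso(alpha=1.0).fit(X, y)

# Compare the fitted coefficients of the two models side by side.
print("OLS coefficients:  ", np.round(ols.coef_, 3))
print("Lasso coefficients:", np.round(lasso_demo.coef_, 3))
```

In this setup the Lasso coefficient of the informative feature is shrunk toward zero, and the coefficients of the four uninformative features are driven to exactly zero, whereas ordinary least squares leaves them small but non-zero.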

In [56]:
#importing Lasso Regression
from sklearn.linear_model import Lasso

In [57]:
#instantiating model class and fitting training data
lasso = Lasso()
lasso.fit(x_train, y_train)

Lasso()

In [58]:
#printing RMSE scores for train and test set
print("Lasso Train RMSE: ", np.round(np.sqrt(metrics.mean_squared_error(y_train, lasso.predict(x_train))), 5))
print("Lasso Test RMSE: ", np.round(np.sqrt(metrics.mean_squared_error(y_test, lasso.predict(x_test))), 5))

Lasso Train RMSE:  77.52916
Lasso Test RMSE:  176.92357


**RMSE Difference = ~99.4**

### 2. Ridge Regression
Ridge, or L2, is a Regularization technique in which the sum of the squared values of the coefficients of the regression equation is added to the cost function (or MSE) as a penalty. Ridge Regression is based on this L2 Regularization technique.
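The shrinking effect of the L2 penalty can be seen by comparing coefficient norms. The sketch below uses synthetic data with two nearly identical features, an assumption chosen for illustration because such collinearity tends to inflate unregularized coefficients; `ols` and `ridge_demo` are illustrative names.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data (an assumption for illustration): two nearly identical
# features, which tends to inflate unregularized coefficients.
rng = np.random.default_rng(0)
x0 = rng.normal(size=50)
X = np.column_stack([x0, x0 + rng.normal(scale=0.01, size=50)])
y = x0 + rng.normal(scale=0.5, size=50)

ols = LinearRegression().fit(X, y)
ridge_demo = Ridge(alpha=1.0).fit(X, y)

# The Ridge coefficient vector has a smaller Euclidean norm than the OLS one.
print("OLS coefficient norm:  ", np.linalg.norm(ols.coef_))
print("Ridge coefficient norm:", np.linalg.norm(ridge_demo.coef_))
```

Unlike Lasso, Ridge typically shrinks coefficients without setting them exactly to zero.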

In [59]:
#importing Ridge Regression
from sklearn.linear_model import Ridge

In [60]:
#instantiating model class and fitting training data
ridge = Ridge()
ridge.fit(x_train, y_train)

Ridge()

In [61]:
#printing RMSE scores for train and test set
print("Ridge Train RMSE: ", np.round(np.sqrt(metrics.mean_squared_error(y_train, ridge.predict(x_train))), 5))
print("Ridge Test RMSE: ", np.round(np.sqrt(metrics.mean_squared_error(y_test, ridge.predict(x_test))), 5))

Ridge Train RMSE:  77.53143
Ridge Test RMSE:  181.04635


**RMSE Difference = ~103.5**

### 3. ElasticNet Regression
ElasticNet Regression is based on both L1 and L2 Regularization techniques. In ElasticNet regression, both L1 and L2 penalties are added to the cost function. 
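In scikit-learn, the mix between the two penalties is controlled by the l1_ratio parameter of ElasticNet, which is not mentioned in the text above. The sketch below inspects the defaults that apply when ElasticNet() is called without arguments and, on synthetic data (an assumption for illustration), checks that l1_ratio=1.0 reduces ElasticNet to Lasso.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

# Defaults used when calling ElasticNet() with no arguments.
enet_default = ElasticNet()
print(enet_default.alpha, enet_default.l1_ratio)  # 1.0 0.5

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([2.0, -1.0, 0.0, 0.5]) + rng.normal(scale=0.1, size=100)

# With l1_ratio=1.0 the L2 part vanishes and ElasticNet matches Lasso.
enet_l1 = ElasticNet(alpha=0.1, l1_ratio=1.0).fit(X, y)
lasso_ref = Lasso(alpha=0.1).fit(X, y)
print(np.allclose(enet_l1.coef_, lasso_ref.coef_))  # True
```

With the default l1_ratio of 0.5, the L1 and L2 penalties are both active, so the model inherits some shrinkage behaviour from each.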

In [62]:
#importing ElasticNet
from sklearn.linear_model import ElasticNet

In [63]:
#instantiating model class and fitting training data
enet = ElasticNet()
enet.fit(x_train, y_train)

ElasticNet()

In [64]:
#printing RMSE scores for train and test set
print("ElasticNet Train RMSE:", np.round(np.sqrt(metrics.mean_squared_error(y_train, enet.predict(x_train))), 5))
print("ElasticNet Test RMSE:", np.round(np.sqrt(metrics.mean_squared_error(y_test, enet.predict(x_test))), 5))

ElasticNet Train RMSE: 98.2781
ElasticNet Test RMSE: 186.52126


**RMSE Difference = ~88.2**

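As discussed earlier, a model that generalizes well shows nearly equal Train and Test Errors.
Thus, based on the RMSE scores of the Train and Test sets above, the **ElasticNet Regression Model** has the **least difference** between Train and Test RMSE (~88.2), and every Regularized Regression Model shows a smaller difference than the Unregularized Model. Note, however, that ElasticNet's smaller difference comes from a higher Train RMSE (98.28) rather than a lower Test RMSE: its Test RMSE (186.52) is the highest of the four models, while Lasso has the lowest Test RMSE (176.92).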
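To double-check the ranking, the gaps can be recomputed directly from the scores printed in the cells above. This is a small bookkeeping sketch; the dictionary layout and variable names are assumptions for illustration.

```python
# Train and test RMSE values copied from the outputs above.
scores = {
    "Linear": (75.33595, 182.23395),
    "Lasso": (77.52916, 176.92357),
    "Ridge": (77.53143, 181.04635),
    "ElasticNet": (98.2781, 186.52126),
}

# Gap = test RMSE minus train RMSE for each model.
gaps = {name: test - train for name, (train, test) in scores.items()}

# Print the models from smallest to largest gap.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:<11} gap = {gap:.1f}")

smallest_gap = min(gaps, key=gaps.get)
lowest_test = min(scores, key=lambda name: scores[name][1])
print("Smallest gap:", smallest_gap)        # ElasticNet
print("Lowest test RMSE:", lowest_test)     # Lasso
```

All four models were fitted with their default settings and a single train-test split, so these rankings describe this particular run rather than a general property of the methods.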