Overfitting is a common problem in machine learning, and L1 and L2 regularization are two techniques for addressing it. In my previous notebook (https://www.kaggle.com/saharpourahmad/filoger-week-3-real-estate-price-prediction) I built a multivariate model and a polynomial model on this dataset and calculated the MAE, MSE and RMSE for each. In this notebook I build the polynomial model again, this time with regularization. Let's see whether these techniques reduce the errors!
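For reference, here is a short sketch of what "L2" and "L1" mean. These are the objectives minimized by scikit-learn's Ridge and Lasso estimators. The symbols are notation introduced here, not taken from the notebook: $X$ is the feature matrix, $y$ the target vector, $w$ the coefficient vector, $n$ the number of samples, and $\alpha$ the regularization strength.

```latex
% Ridge (L2 penalty): squared error plus the squared L2 norm of the coefficients
\min_{w} \; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2

% Lasso (L1 penalty): scaled squared error plus the L1 norm of the coefficients
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_1
```

In both cases a larger $\alpha$ shrinks the coefficients more strongly, which is what limits overfitting.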

Let's see what we had in the previous notebook:

Polynomial regression model

MAE: 4.304236 / MSE: 29.018488 / RMSE: 5.386881 

Multivariate regression model

MAE: 5.392294 / MSE: 46.211798 / RMSE: 6.797926

## Step 1: Importing the dataset and necessary libraries

In [None]:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline

In [None]:
df=pd.read_csv('/kaggle/input/real-estate-price-prediction/Real estate.csv')

## Step 2: Getting to know the data better

In [None]:
df.head()

In [None]:
df.shape

In [None]:
df.info()

In [None]:
df.describe()

## Step 3: Exploratory data analysis

In [None]:
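# displot is a figure-level function that creates its own figure,
# so a preceding plt.figure(...) call would only leave an empty figure behind.
sns.displot(x=df['Y house price of unit area'], kde=True, aspect=2, color='purple')
plt.xlabel('house price of unit area')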

In [None]:
ax = sns.heatmap(df.corr(),annot=True,linewidths=.5)

In [None]:
sns.pairplot(df)

## Step 4: Preprocessing

Splitting the features and labels:

In [None]:
X = df.drop('Y house price of unit area',axis=1)
y = df['Y house price of unit area']

Creating polynomial features (in the previous notebook I compared several degrees, and degree=3 performed best on this data):

In [None]:
from sklearn.preprocessing import PolynomialFeatures
polynomial_converter= PolynomialFeatures(degree=3, include_bias=False)
poly_features= polynomial_converter.fit_transform(X)
poly_features.shape

Splitting the data into training and test sets:

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(poly_features, y, test_size=0.3, random_state=101)

Scaling the data (the scaler is fit on the training set only, so no information from the test set leaks into it):

In [None]:
from sklearn.preprocessing import StandardScaler
scaler= StandardScaler()
scaler.fit(X_train)
X_train= scaler.transform(X_train)
X_test= scaler.transform(X_test)
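The effect of fitting the scaler on the training set only can be checked on synthetic data. This is a sketch: the random data and all `*_demo` names are assumptions for illustration and do not touch the notebook's variables.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the real features
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(200, 3))

X_tr_demo, X_te_demo = train_test_split(X_demo, test_size=0.3, random_state=101)

# Learn mean and standard deviation from the training rows only
demo_scaler = StandardScaler().fit(X_tr_demo)
X_tr_scaled = demo_scaler.transform(X_tr_demo)
X_te_scaled = demo_scaler.transform(X_te_demo)

# Training columns have mean 0 and std 1; test columns are only approximately so
print(X_tr_scaled.mean(axis=0), X_tr_scaled.std(axis=0))
print(X_te_scaled.mean(axis=0), X_te_scaled.std(axis=0))
```

The test set is transformed with the training statistics, so its mean and standard deviation are close to, but not exactly, 0 and 1.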

## Step 5: Building a polynomial model with regularization

### A. Ridge

In [None]:
from sklearn.linear_model import Ridge
ridge_model = Ridge(alpha=40)

In [None]:
ridge_model.fit(X_train, y_train)

In [None]:
y_pred= ridge_model.predict(X_test)

Evaluating the model:

In [None]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

MAE= mean_absolute_error(y_test, y_pred)
MSE= mean_squared_error(y_test, y_pred)
RMSE= np.sqrt(MSE)

pd.DataFrame([MAE, MSE, RMSE], index=['MAE', 'MSE', 'RMSE'], columns=['metrics'])
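As a sanity check on what these three metrics measure, here is a small sketch that computes them by hand on made-up numbers and compares the result with scikit-learn. The arrays are invented for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Made-up targets and predictions
y_true_demo = np.array([10.0, 20.0, 30.0, 40.0])
y_pred_demo = np.array([12.0, 18.0, 33.0, 36.0])

errors = y_true_demo - y_pred_demo       # [-2, 2, -3, 4]
mae_manual = np.mean(np.abs(errors))     # (2 + 2 + 3 + 4) / 4
mse_manual = np.mean(errors ** 2)        # (4 + 4 + 9 + 16) / 4
rmse_manual = np.sqrt(mse_manual)        # same units as the target

print(mae_manual, mean_absolute_error(y_true_demo, y_pred_demo))
print(mse_manual, mean_squared_error(y_true_demo, y_pred_demo))
print(rmse_manual)
```

MSE penalizes large errors more heavily than MAE, and RMSE brings the value back to the units of the target.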

How do we choose the optimal value of alpha?

We choose alpha with cross-validation. The RidgeCV class from sklearn.linear_model is Ridge regression with built-in cross-validation: it evaluates each candidate value in alphas and keeps the one with the best score.

You can read more in the sklearn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RidgeCV.html

In [None]:
from sklearn.linear_model import RidgeCV
ridge_cv_model=RidgeCV(alphas=(0.1, 1.0, 10.0), scoring='neg_mean_absolute_error')

In [None]:
ridge_cv_model.fit(X_train, y_train)

In [None]:
ridge_cv_model.alpha_
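The grid (0.1, 1.0, 10.0) is fairly coarse. One possible refinement, shown here only as a sketch on synthetic data, is to search a wider logarithmic grid and check whether the selected value sits on the edge of the grid. The data from `make_regression`, the grid bounds and the names `alpha_grid` and `wide_ridge_cv` are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

# Synthetic regression problem standing in for the real data
X_syn, y_syn = make_regression(n_samples=150, n_features=10, noise=15.0, random_state=0)

# 13 candidate values from 0.001 to 1000, evenly spaced on a log scale
alpha_grid = np.logspace(-3, 3, 13)

wide_ridge_cv = RidgeCV(alphas=alpha_grid, scoring='neg_mean_absolute_error')
wide_ridge_cv.fit(X_syn, y_syn)
print(wide_ridge_cv.alpha_)

# If the winner is the smallest or largest candidate, the grid may be too narrow
on_edge = bool(np.isclose(wide_ridge_cv.alpha_, [alpha_grid[0], alpha_grid[-1]]).any())
print(on_edge)
```

If `on_edge` is True, extending the grid in that direction is usually worth trying.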

In [None]:
y_pred_ridge= ridge_cv_model.predict(X_test)

Evaluating the model once again:

In [None]:
MAE_ridge= mean_absolute_error(y_test, y_pred_ridge)
MSE_ridge= mean_squared_error(y_test, y_pred_ridge)
RMSE_ridge= np.sqrt(MSE_ridge)
pd.DataFrame([MAE_ridge, MSE_ridge, RMSE_ridge], index=['MAE', 'MSE', 'RMSE'], columns=['Ridge Metrics'])

We can see that the errors have decreased.

In [None]:
ridge_cv_model.coef_

### B. Lasso

Here we use LassoCV, a Lasso linear model with iterative fitting along a regularization path. The best model is selected by cross-validation.

You can read more in the sklearn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LassoCV.html

In [None]:
from sklearn.linear_model import LassoCV
lasso_cv_model= LassoCV(eps=0.01, n_alphas=100, cv=5)

In [None]:
lasso_cv_model.fit(X_train, y_train)

In [None]:
lasso_cv_model.alpha_
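To see what the "regularization path" looks like, here is a sketch on synthetic data that inspects the candidate alphas LassoCV generates. The data and the names `X_path`, `y_path` and `lasso_path_demo` are assumptions for illustration. The number of candidates is left at its default of 100, which matches the value used above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Synthetic regression problem standing in for the real data
X_path, y_path = make_regression(n_samples=150, n_features=10, noise=15.0, random_state=0)

lasso_path_demo = LassoCV(eps=0.01, cv=5).fit(X_path, y_path)

# alphas_ holds the automatically generated candidates, from largest to smallest
path_alphas = lasso_path_demo.alphas_
print(len(path_alphas))

# eps is the ratio between the smallest and the largest candidate
print(path_alphas.min() / path_alphas.max())

# The selected alpha is one of the candidates on the path
print(lasso_path_demo.alpha_)
```

A smaller `eps` extends the path toward weaker regularization, at the cost of more fitting time.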

In [None]:
y_pred_lasso= lasso_cv_model.predict(X_test)

In [None]:
MAE_Lasso= mean_absolute_error(y_test, y_pred_lasso)
MSE_Lasso= mean_squared_error(y_test, y_pred_lasso)
RMSE_Lasso= np.sqrt(MSE_Lasso)

pd.DataFrame([MAE_Lasso, MSE_Lasso, RMSE_Lasso], index=['MAE', 'MSE', 'RMSE'], columns=['Lasso Metrics'])

In [None]:
lasso_cv_model.coef_

We can see that many of the coefficients are zero in this model: the L1 penalty drives the coefficients of less useful features to exactly zero.
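The sparsity of Lasso compared with Ridge can be counted rather than read off the printed array. This is a sketch on synthetic data in which only 3 of 20 features carry signal. The data, the alpha value of 5.0 and the `sparse_*` names are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# 20 features, of which only 3 actually influence the target
X_sparse, y_sparse = make_regression(
    n_samples=200, n_features=20, n_informative=3, noise=10.0, random_state=0
)

sparse_ridge = Ridge(alpha=5.0).fit(X_sparse, y_sparse)
sparse_lasso = Lasso(alpha=5.0).fit(X_sparse, y_sparse)

# Count coefficients that are exactly zero in each model
ridge_zeros = int(np.sum(sparse_ridge.coef_ == 0))
lasso_zeros = int(np.sum(sparse_lasso.coef_ == 0))
print(ridge_zeros, lasso_zeros)
```

On the notebook's own model the same count would be `np.sum(lasso_cv_model.coef_ == 0)`.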

### C. Elastic Net

In [None]:
from sklearn.linear_model import ElasticNetCV
elastic_model= ElasticNetCV(l1_ratio=[0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1],cv=5, max_iter=100000)
elastic_model.fit(X_train, y_train)

In [None]:
elastic_model.l1_ratio_
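The `l1_ratio` list above ends with 1, which corresponds to a pure L1 penalty, so Lasso is one of the candidates. A sketch on synthetic data confirms that Elastic Net with `l1_ratio=1.0` gives the same coefficients as Lasso. The data, the alpha value and the variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic regression problem standing in for the real data
X_en, y_en = make_regression(n_samples=150, n_features=10, noise=15.0, random_state=0)

# Elastic Net with only the L1 part of the penalty switched on
en_as_lasso = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X_en, y_en)
plain_lasso = Lasso(alpha=0.5).fit(X_en, y_en)

same_coefficients = bool(np.allclose(en_as_lasso.coef_, plain_lasso.coef_))
print(same_coefficients)
```

If `elastic_model.l1_ratio_` comes out as 1, the selected Elastic Net model is therefore effectively a Lasso model.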

In [None]:
y_pred_elastic=elastic_model.predict(X_test)

In [None]:
MAE_Elastic= mean_absolute_error(y_test, y_pred_elastic)
MSE_Elastic= mean_squared_error(y_test, y_pred_elastic)
RMSE_Elastic= np.sqrt(MSE_Elastic)

pd.DataFrame([MAE_Elastic, MSE_Elastic, RMSE_Elastic], index=['MAE', 'MSE', 'RMSE'], columns=['Elastic Metrics'])

In [None]:
elastic_model.coef_

We can see that many of the coefficients are zero in this model as well, because Elastic Net keeps an L1 component in its penalty.

## Step 6: Comparing the different errors in the models

In [None]:
data = {
    'Polynomial regression': [4.304236, 29.018488, 5.386881],
    'Multivariate regression': [5.392294, 46.211798, 6.797926],
    'Ridge Metrics': [MAE_ridge, MSE_ridge, RMSE_ridge],
    'Lasso Metrics': [MAE_Lasso, MSE_Lasso, RMSE_Lasso],
    'Elastic Metrics': [MAE_Elastic, MSE_Elastic, RMSE_Elastic],
}

pd.DataFrame( data, index=['MAE', 'MSE', 'RMSE'])

We can see that Ridge regularization gave us the best metrics!
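A conclusion like this can also be read off programmatically with `idxmin`. The sketch below uses only the two baseline columns whose values are reported at the top of this notebook, since the regularized metrics depend on the run. The name `demo_scores` is an assumption for illustration.

```python
import pandas as pd

# Only the two baselines from the previous notebook are used here
demo_scores = pd.DataFrame(
    {
        'Polynomial regression': [4.304236, 29.018488, 5.386881],
        'Multivariate regression': [5.392294, 46.211798, 6.797926],
    },
    index=['MAE', 'MSE', 'RMSE'],
)

# For each metric (row), the column with the lowest error
best_per_metric = demo_scores.idxmin(axis=1)
print(best_per_metric)
```

Calling `idxmin(axis=1)` on the full comparison table would name the best model per metric. Keep in mind that these numbers come from a single train/test split, so small differences may not be significant.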