### Regularization
Have you ever trained a machine learning model that fits the training data exceptionally well but performs poorly on the test data, i.e. fails to predict examples it has not seen? This problem is called overfitting, and regularization is one way to deal with it.

Overfitting happens when a model learns the specific patterns and noise in the training data to such an extent that it hurts the model's ability to generalize from the training data to new (“unseen”) data. By noise, we mean the irrelevant information or randomness in a dataset.
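
The effect can be reproduced on a small synthetic problem. The sketch below is only an illustration: the sine-shaped data, the noise level, and the two polynomial degrees are assumptions chosen for the demo, not part of the real estate dataset used later. It fits a simple and a very flexible polynomial and compares the training and test errors.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumption for the demo): a smooth signal plus random noise
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(40, 1))
y = np.sin(3 * x[:, 0]) + rng.normal(scale=0.3, size=40)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.5, random_state=0
)

def fit_and_score(degree):
    """Fit an unregularized polynomial model and return (train MSE, test MSE)."""
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(x_train))
    test_mse = mean_squared_error(y_test, model.predict(x_test))
    return train_mse, test_mse

simple_train, simple_test = fit_and_score(3)     # modest flexibility
complex_train, complex_test = fit_and_score(15)  # very flexible model

print(f"degree  3: train MSE = {simple_train:.3f}, test MSE = {simple_test:.3f}")
print(f"degree 15: train MSE = {complex_train:.3f}, test MSE = {complex_test:.3f}")
```

The flexible model reaches a lower training error because it can also fit the noise, and the gap between its training and test error is the symptom of overfitting.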

Preventing overfitting is essential if the model has to perform well on data it was not trained on.

**What is Regularization?**

In general, to regularize means to make things regular or acceptable. In the context of machine learning, regularization is a technique that constrains a model by shrinking its coefficients towards zero. In simple words, regularization discourages learning a more complex or flexible model, which helps prevent overfitting.
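
As a minimal sketch of this shrinkage, the example below fits `Ridge` with increasing values of `alpha` and records the size of the coefficient vector. The synthetic data and the list of `alpha` values are assumptions made for the illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic regression problem (assumption): 5 features, 2 of them irrelevant
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 5))
true_coef = np.array([4.0, -3.0, 2.0, 0.0, 0.0])
y = X @ true_coef + rng.normal(scale=1.0, size=100)

alphas = [0.01, 1.0, 100.0, 10000.0]
coef_norms = []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X, y)
    # The L2 norm summarizes how large the coefficients are overall
    coef_norms.append(float(np.linalg.norm(model.coef_)))
    print(f"alpha={alpha:>8}: ||coef|| = {coef_norms[-1]:.3f}")
```

The stronger the penalty, the smaller the coefficient vector becomes, which is what "shrinking towards zero" means in practice.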


**How Does Regularization Work?**

The basic idea is to penalize complex models, i.e. to add a complexity term to the loss so that complex models incur a bigger loss. To understand it, let’s start from the loss that ordinary linear regression minimizes.

#### Residual Sum of Squares (RSS)
![](https://miro.medium.com/max/908/1*DY3-IaGcHjjLg7oYXx1O3A.png)



#### Ridge Regression
![](https://miro.medium.com/max/1106/1*CiqZ8lhwxi5c4d1nV24w4g.png)

#### Lasso Regression
![](https://miro.medium.com/max/1094/1*tHJ4sSPYV0bDr8xxEdiwXA.png)
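
In case the images above do not load, the three objectives are commonly written as follows. This is the standard textbook notation and is an assumption here, so the symbols may differ from those in the images: $y_i$ is the target of observation $i$, $x_{ij}$ is the value of feature $j$, $\beta_0$ is the intercept, $\beta_j$ are the coefficients, and $\lambda \ge 0$ is the tuning parameter (called `alpha` in scikit-learn).

$$
\mathrm{RSS} = \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2
$$

$$
\text{Ridge:} \quad \mathrm{RSS} + \lambda \sum_{j=1}^{p} \beta_j^2
$$

$$
\text{Lasso:} \quad \mathrm{RSS} + \lambda \sum_{j=1}^{p} |\beta_j|
$$

With $\lambda = 0$ both penalized objectives reduce to ordinary least squares. As $\lambda$ grows, large coefficients become more expensive and the fitted coefficients shrink.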

### Import libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [None]:
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

In [None]:
# Ridge regression with a fixed alpha
from sklearn.linear_model import Ridge
# Metrics for evaluating the models
from sklearn.metrics import mean_absolute_error, mean_squared_error
# Cross-validated estimators that choose the regularization strength
from sklearn.linear_model import RidgeCV
from sklearn.linear_model import LassoCV
from sklearn.linear_model import ElasticNetCV

### Load Dataset

In [None]:
df= pd.read_csv('../input/real-estate-price-prediction/Real estate.csv')

In [None]:
df.head()

### Dataset Information

In [None]:
df.shape


In [None]:
df.info()


In [None]:
df.describe()


### Exploratory Data Analysis (EDA)

In [None]:
sns.pairplot(df)


In [None]:
sns.heatmap(df.corr(), annot=True,cmap='RdYlBu_r')


### Data Preprocessing

In [None]:
X= df.drop(['Y house price of unit area','No'], axis=1)
y=df['Y house price of unit area']

### Generate polynomial and interaction features

In [None]:
polynomial_converter= PolynomialFeatures(degree=3, include_bias=False)

In [None]:
poly_features= polynomial_converter.fit_transform(X)

In [None]:
poly_features.shape

### Train - Test Split

In [None]:
X_train, X_test, y_train, y_test = train_test_split(poly_features, y, test_size=0.3, random_state=101)

### Scaling the data

In [None]:
scaler= StandardScaler()

In [None]:
scaler.fit(X_train)

In [None]:
X_train= scaler.transform(X_train)
X_test= scaler.transform(X_test)
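
The scaler is fitted on the training split only, so no statistics from the test set leak into the model. One way to keep that guarantee automatically is to bundle the steps in a scikit-learn `Pipeline`. The sketch below uses synthetic data (an assumption, not the real estate dataset) and mirrors the degree and `alpha` used in this notebook.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data (assumption): a target with a quadratic and a linear term
rng = np.random.default_rng(101)
X_demo = rng.normal(size=(200, 3))
y_demo = X_demo[:, 0] ** 2 + 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=101
)

# Each step is fitted on the training data only when pipeline.fit is called
pipeline = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    StandardScaler(),
    Ridge(alpha=10),
)
pipeline.fit(X_tr, y_tr)

# The same fitted transformations are re-applied to the test data here
test_r2 = pipeline.score(X_te, y_te)
print(f"R^2 on the held-out split: {test_r2:.3f}")
```

A pipeline is also convenient with cross-validation, because the scaler is then refitted inside every fold.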

### Regularization

#### 1: Ridge Regression

In [None]:
ridge_model= Ridge(alpha=10)

In [None]:
ridge_model.fit(X_train, y_train)

In [None]:
# Predict on the test data
y_pred= ridge_model.predict(X_test)

In [None]:
MAE= mean_absolute_error(y_test, y_pred)
MSE= mean_squared_error(y_test, y_pred)
RMSE= np.sqrt(MSE)

In [None]:
pd.DataFrame([MAE, MSE, RMSE], index=['MAE', 'MSE', 'RMSE'], columns=['metrics'])
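
For reference, the three metrics can be computed by hand. The sketch below uses three made-up values (an assumption for the demo) and checks the manual results against scikit-learn.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Tiny made-up example: true values and predictions
y_true = np.array([3.0, 5.0, 2.0])
y_hat = np.array([2.0, 5.0, 4.0])

errors = y_true - y_hat                # [1, 0, -2]
mae_manual = np.mean(np.abs(errors))   # (1 + 0 + 2) / 3
mse_manual = np.mean(errors ** 2)      # (1 + 0 + 4) / 3
rmse_manual = np.sqrt(mse_manual)      # same unit as the target

print(mae_manual, mean_absolute_error(y_true, y_hat))
print(mse_manual, mean_squared_error(y_true, y_hat))
print(rmse_manual)
```

MAE and RMSE are in the unit of the target, while MSE is in squared units. RMSE weights large errors more heavily than MAE.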

#### Ridge Regression (Choosing an alpha value with Cross-Validation)

In [None]:
ridge_cv_model=RidgeCV(alphas=(0.1, 1.0, 10.0), scoring='neg_mean_absolute_error')

In [None]:
ridge_cv_model.fit(X_train, y_train)

In [None]:
ridge_cv_model.alpha_
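
If the selected `alpha_` sits at the edge of a three-value grid, a better value may lie outside it. A possible refinement, sketched here on synthetic data (the data and the grid bounds are assumptions), is to search a wider logarithmic grid and check whether the winner lands on a boundary.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Synthetic data (assumption) standing in for the scaled training features
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(120, 10))
y_demo = X_demo @ rng.normal(size=10) + rng.normal(scale=2.0, size=120)

# 13 candidate values spaced evenly on a log scale from 0.001 to 1000
alpha_grid = np.logspace(-3, 3, 13)

model = RidgeCV(alphas=alpha_grid, scoring="neg_mean_absolute_error")
model.fit(X_demo, y_demo)

best_alpha = model.alpha_
# If the best value is the smallest or largest candidate, widen the grid
hit_edge = best_alpha in (alpha_grid[0], alpha_grid[-1])
print("selected alpha:", best_alpha, "| on grid boundary:", hit_edge)
```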

In [None]:
# Predict on the test data
y_pred_ridge= ridge_cv_model.predict(X_test)

In [None]:
MAE_ridge= mean_absolute_error(y_test, y_pred_ridge)
MSE_ridge= mean_squared_error(y_test, y_pred_ridge)
RMSE_ridge= np.sqrt(MSE_ridge)

In [None]:
pd.DataFrame([MAE_ridge, MSE_ridge, RMSE_ridge], index=['MAE', 'MSE', 'RMSE'], columns=['Ridge Metrics'])

In [None]:
ridge_cv_model.coef_

#### 2: Lasso Regression

In [None]:
lasso_cv_model= LassoCV(eps=0.01, n_alphas=100, cv=5)

In [None]:
lasso_cv_model.fit(X_train, y_train)

In [None]:
lasso_cv_model.alpha_

In [None]:
y_pred_lasso= lasso_cv_model.predict(X_test)

In [None]:
MAE_Lasso= mean_absolute_error(y_test, y_pred_lasso)
MSE_Lasso= mean_squared_error(y_test, y_pred_lasso)
RMSE_Lasso= np.sqrt(MSE_Lasso)

In [None]:
pd.DataFrame([MAE_Lasso, MSE_Lasso, RMSE_Lasso], index=['MAE', 'MSE', 'RMSE'], columns=['Lasso Metrics'])

In [None]:
lasso_cv_model.coef_
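
Unlike Ridge, the Lasso penalty can set coefficients to exactly zero, so the coefficient array usually contains zeros. The sketch below illustrates the difference on synthetic data where only the first two of twenty features matter. The data and the fixed `alpha=0.5` are assumptions; plain `Ridge` and `Lasso` are used instead of the cross-validated versions to keep the comparison simple.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data (assumption): only features 0 and 1 influence the target
rng = np.random.default_rng(7)
X_demo = rng.normal(size=(200, 20))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.5, size=200)

ridge = Ridge(alpha=0.5).fit(X_demo, y_demo)
lasso = Lasso(alpha=0.5).fit(X_demo, y_demo)

# Ridge shrinks coefficients but keeps them non-zero; Lasso removes some entirely
ridge_nonzero = int(np.count_nonzero(ridge.coef_))
lasso_nonzero = int(np.count_nonzero(lasso.coef_))

print("non-zero Ridge coefficients:", ridge_nonzero)
print("non-zero Lasso coefficients:", lasso_nonzero)
```

This is why Lasso is often described as performing feature selection in addition to shrinkage.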

#### 3: Elastic Net

In [None]:
elastic_model= ElasticNetCV(l1_ratio=[0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1],cv=5, max_iter=100000)

In [None]:
elastic_model.fit(X_train, y_train)

In [None]:
elastic_model.l1_ratio_
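
To interpret `l1_ratio_`, it helps to look at the objective that scikit-learn documents for Elastic Net. Writing $w$ for the coefficient vector, $\alpha$ for `alpha`, and $\rho$ for `l1_ratio` (the symbols are chosen here for the illustration):

$$
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \rho \lVert w \rVert_1 + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
$$

With $\rho = 1$ the penalty is pure Lasso, and as $\rho$ approaches 0 it approaches a pure Ridge penalty. A selected `l1_ratio_` of 1 therefore means that cross-validation preferred the Lasso solution among the candidates tried.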

In [None]:
y_pred_elastic=elastic_model.predict(X_test)

In [None]:
MAE_Elastic= mean_absolute_error(y_test, y_pred_elastic)
MSE_Elastic= mean_squared_error(y_test, y_pred_elastic)
RMSE_Elastic= np.sqrt(MSE_Elastic)

In [None]:
pd.DataFrame([MAE_Elastic, MSE_Elastic, RMSE_Elastic], index=['MAE', 'MSE', 'RMSE'], columns=['Elastic Metrics'])

In [None]:
elastic_model.coef_

In [None]:
list_elastic_model_coef = elastic_model.coef_


In [None]:
# Count the coefficients that Elastic Net did not shrink to exactly zero
count = int(np.count_nonzero(list_elastic_model_coef))
print("After Elastic Net, only", count, "non-zero coefficients remain.")