<h1 style="font-family:verdana;">Introduction</h1>

<p>One of the most common problems data science professionals face is overfitting. Have you come across a situation where your model performed exceptionally well on the training data but could not predict the test data? Or were you at the top of a competition's public leaderboard, only to fall hundreds of places in the final rankings? This notebook addresses that problem using regularization.</p>

<b>What is Regularization?</b><br>
Regularization is a technique, applied here to linear regression, that constrains (regularizes) the coefficient estimates by shrinking them towards zero. In other words, it discourages learning a more complex or flexible model, which reduces the risk of overfitting.
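
The shrinkage effect can be illustrated with a small sketch. It assumes synthetic data (not the notebook's dataset) and uses hypothetical names such as `X_demo` and `alphas_demo`; it only shows that the size of the ridge coefficient vector drops as the penalty strength `alpha` grows.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data (an assumption for illustration, not the notebook's dataset)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 5))
true_coef = np.array([3.0, -2.0, 0.0, 1.5, 0.0])
y_demo = X_demo @ true_coef + rng.normal(scale=0.5, size=60)

# Unregularized least squares as the baseline
ols_norm = np.linalg.norm(LinearRegression().fit(X_demo, y_demo).coef_)

# Ridge with increasingly strong penalties
alphas_demo = [0.1, 10.0, 1000.0]
ridge_norms = [
    np.linalg.norm(Ridge(alpha=a).fit(X_demo, y_demo).coef_) for a in alphas_demo
]

print("OLS coefficient norm:", round(ols_norm, 3))
for a, n in zip(alphas_demo, ridge_norms):
    print(f"alpha={a}: coefficient norm = {n:.3f}")
```

A larger `alpha` pulls the coefficients closer to zero, which is the "less flexible model" described above.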



Some of the regularization techniques are given below:
<ul>
<li>L1 Regularization -> <a href="#Lasso">Lasso Regression</a></li>
<li>L2 Regularization -> <a href="#Ridge">Ridge Regression</a></li>
<li>Combining L1 &amp; L2 -> <a href="#Elastic">Elastic Net</a></li>
</ul>



<h1 style="font-family:verdana;">Dataset</h1>
The data set is the Real estate price prediction dataset, which is used for regression analysis, multiple regression, linear regression, and prediction. Since house price is a continuous variable, this is a regression problem. The data contains 8 columns, including six features (X) and one label (y): the house price of unit area.



<h3>1-Import all Necessary Libraries</h3>


In [None]:
import numpy as np #Import all necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

<h3>2-Import the dataset:</h3>

In [None]:
df=pd.read_csv('../input/real-estate-price-prediction/Real estate.csv')


<h3>3- Data Overview</h3>

In [None]:
df.head()

In [None]:
df.info()

In [None]:
df.columns

In [None]:
df.shape

In [None]:
df.describe()

<h3>4-EDA</h3>

In [None]:
g= sns.pairplot(df)
g.map_upper(plt.scatter)

In [None]:
# displot is figure-level and creates its own figure, so size it with height and aspect
sns.displot(df['Y house price of unit area'], kde=True, bins=20, height=4, aspect=2)
plt.xlabel('house price of unit area')

In [None]:
plt.figure(figsize=(5, 5), dpi=100)
sns.scatterplot(data=df, x='X1 transaction date', y='Y house price of unit area', hue='X2 house age', palette='rocket')


<h3>5-Determine the Features & Target Variable (Label)</h3>

In [None]:
X= df.drop('Y house price of unit area', axis=1)
y=df['Y house price of unit area']

<h3>6-Preprocessing (Polynomial Conversion)</h3>

In [None]:
def missing_percent(train_set):
    """Return the percentage of missing values per column (columns with none are omitted)."""
    nan_percent = 100 * train_set.isnull().sum() / len(train_set)
    # Keep only columns with missing values, sorted by percentage in descending order
    nan_percent = nan_percent[nan_percent > 0].sort_values(ascending=False).round(1)
    # Convert the Series to a one-column table with a descriptive header
    return nan_percent.to_frame(name='% of Missing Values')

In [None]:
from sklearn.preprocessing import PolynomialFeatures


In [None]:
polynomial_converter= PolynomialFeatures(degree=3, include_bias=False)

In [None]:
poly_features= polynomial_converter.fit_transform(X)


In [None]:
poly_features.shape

<h3>7-Split the Data to Train & Test</h3>

In [None]:
from sklearn.model_selection import train_test_split


In [None]:
X_train, X_test, y_train, y_test = train_test_split(poly_features, y, test_size=0.3, random_state=101)


<h3>8-Scaling the Data</h3>

In [None]:
from sklearn.preprocessing import StandardScaler


In [None]:
scaler= StandardScaler()

In [None]:
scaler.fit(X_train)

In [None]:
X_train= scaler.transform(X_train)
X_test= scaler.transform(X_test)
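
The scaler above is fitted on the training split only and then reused for the test split. The following sketch, on synthetic arrays with hypothetical names (`train_demo`, `test_demo`, `scaler_demo`), illustrates what that implies: the training columns end up with mean 0 and standard deviation 1, while the test columns are only approximately standardized because they reuse the training statistics.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic splits (an assumption for illustration)
rng = np.random.default_rng(1)
train_demo = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
test_demo = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

# Mean and standard deviation are learned from the training rows only
scaler_demo = StandardScaler().fit(train_demo)
train_scaled = scaler_demo.transform(train_demo)
# The test rows are transformed with the training statistics
test_scaled = scaler_demo.transform(test_demo)

print("train mean/std:", train_scaled.mean(axis=0).round(3), train_scaled.std(axis=0).round(3))
print("test mean/std: ", test_scaled.mean(axis=0).round(3), test_scaled.std(axis=0).round(3))
```

Fitting on the training split alone keeps information about the test set out of the preprocessing step.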

<h3>9-Regularization</h3>

<h5><p id="Ridge">A-Ridge Regression</p></h5>

In [None]:
#Train the Model
from sklearn.linear_model import Ridge

In [None]:
ridge_model= Ridge(alpha=10)

In [None]:
ridge_model.fit(X_train, y_train)

In [None]:
#predict Test Data
y_pred= ridge_model.predict(X_test)

In [None]:
#Evaluating the Model
from sklearn.metrics import mean_absolute_error, mean_squared_error

MAE= mean_absolute_error(y_test, y_pred)
MSE= mean_squared_error(y_test, y_pred)
RMSE= np.sqrt(MSE)

In [None]:
pd.DataFrame([MAE, MSE, RMSE], index=['MAE', 'MSE', 'RMSE'], columns=['metrics'])

<h3>Ridge Regression (Choosing an alpha value with Cross-Validation)</h3>

In [None]:
#Train the Model
from sklearn.linear_model import RidgeCV

In [None]:
ridge_cv_model=RidgeCV(alphas=(0.1, 1.0, 10.0), scoring='neg_mean_absolute_error')

In [None]:
ridge_cv_model.fit(X_train, y_train)

In [None]:
ridge_cv_model.alpha_
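
As a rough illustration of how the selection works, the sketch below runs `RidgeCV` on synthetic data with hypothetical names (`X_demo`, `cv_demo`). Unlike the cell above, which leaves `cv` at its default (an efficient leave-one-out scheme in scikit-learn), this sketch assumes an explicit 5-fold split; in both cases the chosen `alpha_` is one of the candidate values.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Synthetic data (an assumption for illustration, not the notebook's dataset)
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(80, 6))
coef_demo = np.array([2.0, 0.0, -1.0, 0.0, 0.5, 0.0])
y_demo = X_demo @ coef_demo + rng.normal(scale=1.0, size=80)

# Candidate penalties, scored by (negated) mean absolute error over 5 folds
alphas_demo = (0.1, 1.0, 10.0)
cv_demo = RidgeCV(alphas=alphas_demo, scoring='neg_mean_absolute_error', cv=5)
cv_demo.fit(X_demo, y_demo)

print("selected alpha:", cv_demo.alpha_)
print("best CV score (negated MAE):", cv_demo.best_score_)
```

If the selected value sits at the edge of the candidate grid, it may be worth widening the grid and fitting again.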

In [None]:
#Predicting Test Data
y_pred_ridge= ridge_cv_model.predict(X_test)

In [None]:
MAE_ridge= mean_absolute_error(y_test, y_pred_ridge)
MSE_ridge= mean_squared_error(y_test, y_pred_ridge)
RMSE_ridge= np.sqrt(MSE_ridge)

In [None]:
pd.DataFrame([MAE_ridge, MSE_ridge, RMSE_ridge], index=['MAE', 'MSE', 'RMSE'], columns=['Ridge Metrics'])

In [None]:
ridge_cv_model.coef_

<h3><p id="Lasso">B-Lasso Regression</p></h3>

Lasso, or Least Absolute Shrinkage and Selection Operator, is conceptually quite similar to ridge regression. It also adds a penalty for non-zero coefficients, but unlike ridge regression, which penalizes the sum of squared coefficients (the so-called L2 penalty), lasso penalizes the sum of their absolute values (the L1 penalty). As a result, for high values of the penalty strength λ (the <code>alpha</code> parameter in scikit-learn), many coefficients are set exactly to zero under lasso, which is generally not the case in ridge regression.
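
The difference in sparsity can be seen in a minimal sketch. It assumes synthetic data in which only three of eight features matter, and the names (`X_demo`, `lasso_demo`, `ridge_demo`) and the value `alpha=0.5` are chosen purely for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data (an assumption for illustration): 3 informative features out of 8
rng = np.random.default_rng(3)
X_demo = rng.normal(size=(100, 8))
true_coef = np.array([4.0, 0.0, 0.0, -3.0, 0.0, 0.0, 0.0, 2.0])
y_demo = X_demo @ true_coef + rng.normal(scale=0.5, size=100)

# Same penalty strength for both models, different penalty type
lasso_demo = Lasso(alpha=0.5).fit(X_demo, y_demo)
ridge_demo = Ridge(alpha=0.5).fit(X_demo, y_demo)

# Count coefficients that are exactly zero
n_zero_lasso = int(np.sum(lasso_demo.coef_ == 0))
n_zero_ridge = int(np.sum(ridge_demo.coef_ == 0))

print("lasso coefficients:", lasso_demo.coef_.round(3))
print("ridge coefficients:", ridge_demo.coef_.round(3))
print("exact zeros (lasso, ridge):", n_zero_lasso, n_zero_ridge)
```

Because lasso drops some features entirely, it can also serve as a simple form of feature selection.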



In [None]:
from sklearn.linear_model import LassoCV

In [None]:
lasso_cv_model= LassoCV(eps=0.1, n_alphas=100, cv=5)

In [None]:
lasso_cv_model.fit(X_train, y_train)

In [None]:
lasso_cv_model.alpha_

In [None]:
y_pred_lasso= lasso_cv_model.predict(X_test)

In [None]:
MAE_Lasso= mean_absolute_error(y_test, y_pred_lasso)
MSE_Lasso= mean_squared_error(y_test, y_pred_lasso)
RMSE_Lasso= np.sqrt(MSE_Lasso)

In [None]:
pd.DataFrame([MAE_Lasso, MSE_Lasso, RMSE_Lasso], index=['MAE', 'MSE', 'RMSE'], columns=['Lasso Metrics'])

In [None]:
lasso_cv_model.coef_

<h3><p id ="Elastic">C-Elastic Net</p></h3>

In [None]:
from sklearn.linear_model import ElasticNetCV

In [None]:
elastic_model= ElasticNetCV(l1_ratio=[0.1, 0.5, 0.7, 0.9, 0.95, 0.99, 1],cv=5, max_iter=100000)

In [None]:
elastic_model.fit(X_train, y_train)

In [None]:
elastic_model.l1_ratio_
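
To help read the selected `l1_ratio_`, the sketch below (synthetic data, hypothetical names such as `enet_as_lasso` and `alpha_demo`) illustrates the two ends of the mix: in scikit-learn, `l1_ratio=1` corresponds to a pure L1 penalty, so the fit should match `Lasso` with the same `alpha`, while smaller values blend in an L2 term.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

# Synthetic data (an assumption for illustration, not the notebook's dataset)
rng = np.random.default_rng(4)
X_demo = rng.normal(size=(100, 6))
coef_demo = np.array([3.0, 0.0, -2.0, 0.0, 0.0, 1.0])
y_demo = X_demo @ coef_demo + rng.normal(scale=0.5, size=100)

alpha_demo = 0.1

# l1_ratio=1.0 means the penalty is purely L1, i.e. the lasso penalty
enet_as_lasso = ElasticNet(alpha=alpha_demo, l1_ratio=1.0).fit(X_demo, y_demo)
lasso_demo = Lasso(alpha=alpha_demo).fit(X_demo, y_demo)

# l1_ratio=0.5 splits the penalty between the L1 and L2 terms
enet_mixed = ElasticNet(alpha=alpha_demo, l1_ratio=0.5).fit(X_demo, y_demo)

same_as_lasso = np.allclose(enet_as_lasso.coef_, lasso_demo.coef_)
print("l1_ratio=1 matches Lasso:", same_as_lasso)
print("mixed-penalty coefficients:", enet_mixed.coef_.round(3))
```

A selected `l1_ratio_` close to 1 therefore suggests the cross-validation favored lasso-like behavior on this data.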

In [None]:
y_pred_elastic=elastic_model.predict(X_test)

In [None]:
MAE_Elastic= mean_absolute_error(y_test, y_pred_elastic)
MSE_Elastic= mean_squared_error(y_test, y_pred_elastic)
RMSE_Elastic= np.sqrt(MSE_Elastic)


In [None]:
pd.DataFrame([MAE_Elastic, MSE_Elastic, RMSE_Elastic], index=['MAE', 'MSE', 'RMSE'], columns=['Elastic Metrics'])

In [None]:
elastic_model.coef_