# Regularisation

Regularization means constraining a model to make it simpler and reduce the risk of overfitting. It deliberately adds some bias so that the model does not fit the training data too closely. The goal is a model that still performs well on the training data but generalizes better to examples it has not seen during training. Regularization also helps to stabilize the coefficient estimates, especially when there is collinearity in the data. The regularization term should only be added to the cost function during training. Once the model is trained, use the unregularized performance measure to evaluate the model’s performance.
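To make the last point concrete, here is a minimal sketch on synthetic data. The data, `alpha=10.0`, and all variable names are illustrative assumptions and are not part of the notebook below. The penalty only influences how `Ridge` is fitted; the model is then scored with plain mean squared error, which has no penalty term.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic regression data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([3.0, -2.0, 0.0, 0.0, 1.0])
y = X @ true_w + rng.normal(scale=0.5, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Training: Ridge minimises squared error plus an L2 penalty on the weights
alpha = 10.0
model = Ridge(alpha=alpha).fit(X_tr, y_tr)

# Evaluation: plain MSE on held-out data, with no penalty term added
test_mse = mean_squared_error(y_te, model.predict(X_te))

# For comparison, the penalised objective that was minimised during training
train_sse = np.sum((y_tr - model.predict(X_tr)) ** 2)
penalised_objective = train_sse + alpha * np.sum(model.coef_ ** 2)

print(f"Test MSE (unregularized): {test_mse:.3f}")
print(f"Training objective (with penalty): {penalised_objective:.3f}")
```

The penalised objective is only a training device; reporting it as a performance figure would mix the fit quality with the size of the weights.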

The fewer degrees of freedom a model has, the harder it is for it to overfit the data.

A simple way to regularize a polynomial model is to reduce its polynomial degree. For a linear model, regularization is typically achieved by constraining the weights of the model. We will now look at three ways of doing this: Ridge Regression, Lasso Regression, and Elastic Net.
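The effect of the polynomial degree can be sketched on synthetic data. Everything here (the noisy quadratic target, the sample sizes, and the degrees 2 and 10) is an assumption chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(42)
x_train = rng.uniform(-1, 1, size=(30, 1))
x_test = rng.uniform(-1, 1, size=(200, 1))

def noisy_quadratic(x):
    # True relationship is quadratic; Gaussian noise is added on top
    return 2.0 * x[:, 0] ** 2 - x[:, 0] + rng.normal(scale=0.2, size=len(x))

y_train = noisy_quadratic(x_train)
y_test = noisy_quadratic(x_test)

errors = {}
for degree in (2, 10):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    errors[degree] = (
        mean_squared_error(y_train, model.predict(x_train)),
        mean_squared_error(y_test, model.predict(x_test)),
    )
    print(f"degree={degree:2d}  train MSE={errors[degree][0]:.4f}  test MSE={errors[degree][1]:.4f}")
```

The degree-10 model contains the quadratic as a special case, so its training error cannot be higher. Whether its test error is worse depends on the noise and the sample, which is why the comparison is made on held-out data.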

### Ridge Regression

Ridge Regression is also known as L2 regularization. It forces the learning algorithm not only to fit the data but also to keep the model weights as small as possible. Less significant features stay in the model, but their weights are shrunk so that they have little influence.
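A small sketch of this shrinkage, assuming synthetic data and an arbitrary grid of `alpha` values (neither comes from the notebook below): as `alpha` grows, the weights get smaller but none of them becomes exactly zero.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
# Only the first two features matter in this synthetic target
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=1.0, size=100)

alphas = [0.01, 1.0, 100.0, 10000.0]
norms = []
for alpha in alphas:
    coef = Ridge(alpha=alpha).fit(X, y).coef_
    norms.append(np.linalg.norm(coef))  # overall size of the weight vector
    print(f"alpha={alpha:>8}: coef={np.round(coef, 3)}, L2 norm={norms[-1]:.3f}")
```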

### Lasso Regression

Lasso stands for Least Absolute Shrinkage and Selection Operator; it is also known as L1 regularization. An important characteristic of Lasso Regression is that it tends to eliminate the weights of the least important features by setting them to zero. In other words, Lasso Regression automatically performs feature selection and outputs a sparse model (i.e., with few nonzero feature weights).
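The sparsity can be seen in a minimal sketch that compares Lasso with Ridge. The synthetic data (two informative features, eight noise features) and `alpha=0.5` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
# Only features 0 and 1 drive the target; the other eight are pure noise
y = 4.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

lasso_zeros = int(np.sum(lasso.coef_ == 0))
ridge_zeros = int(np.sum(ridge.coef_ == 0))

print("Lasso zero weights:", lasso_zeros)
print("Ridge zero weights:", ridge_zeros)
print("Lasso weights:", np.round(lasso.coef_, 2))
```

Note that the two `alpha` values are not directly comparable, because scikit-learn scales the Lasso and Ridge objectives differently; the point is only that the L1 penalty produces exact zeros while the L2 penalty does not.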

### Elastic Net

Elastic Net is a middle ground between Ridge Regression and Lasso Regression. The regularization term is a simple mix of both Ridge and Lasso’s regularization terms, and you can control the mix ratio r. When r = 0, Elastic Net is equivalent to Ridge Regression, and when r = 1, it is equivalent to Lasso Regression.
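Written out, one common formulation of the Elastic Net cost function is shown below. The notation is an assumption for illustration: $\alpha$ is the overall regularization strength, $\theta_i$ are the feature weights, and $n$ is the number of features. Libraries may scale the individual terms differently.

```latex
J(\boldsymbol{\theta}) = \mathrm{MSE}(\boldsymbol{\theta})
  + r\,\alpha \sum_{i=1}^{n} \lvert \theta_i \rvert
  + \frac{1 - r}{2}\,\alpha \sum_{i=1}^{n} \theta_i^{2}
```

Setting $r = 0$ leaves only the squared (Ridge) penalty, and $r = 1$ leaves only the absolute-value (Lasso) penalty. In scikit-learn's `ElasticNet`, the `l1_ratio` parameter plays the role of $r$.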

In [29]:
#data analysis and viz
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

#ML_libraries
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Lasso, LogisticRegression, Ridge, ElasticNet
from sklearn.feature_selection import SelectFromModel
from sklearn.preprocessing import StandardScaler

In [10]:
df = pd.read_csv('C:\\Users\\danielogbu\\train (1).csv')

In [39]:
# Keep only the numeric variables (integer and float columns)

numerics_dtypes = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']
numerical_vars = list(df.select_dtypes(include=numerics_dtypes).columns)

df = df[numerical_vars]

df.shape

(1460, 38)

In [40]:
# Split features and target into training and test sets

X = df.drop(columns=['SalePrice'])
y = df['SalePrice']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.35, random_state=0)

X_train.shape, X_test.shape

((949, 37), (511, 37))

### LASSO

In [41]:
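# Standardize the features first: the L1/L2 penalties act on the size of the weights,
# so the features should share a common scale. Missing values are simply filled with 0 here.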

scaler = StandardScaler()
scaler.fit(X_train.fillna(0))

L1Fea=SelectFromModel(Lasso(alpha=100))
L1Fea.fit(scaler.transform(X_train.fillna(0)), y_train)

SelectFromModel(estimator=Lasso(alpha=100))

In [42]:
print('Total features-->',X_train.shape[1])
print('Selected features-->',sum(L1Fea.get_support()))
print('Removed features-->',np.sum(L1Fea.estimator_.coef_==0))

Total features--> 37
Selected features--> 33
Removed features--> 4


### Ridge

In [43]:
L2Fea=SelectFromModel(Ridge(alpha=100))
L2Fea.fit(scaler.transform(X_train.fillna(0)), y_train)

SelectFromModel(estimator=Ridge(alpha=100))

In [44]:
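print('Total features-->',X_train.shape[1])
print('Selected features-->',sum(L2Fea.get_support()))
# Ridge shrinks weights without setting them exactly to zero, so this counts zero weights,
# not dropped features. SelectFromModel drops features whose |weight| is below its threshold
# (by default the mean absolute weight for estimators that are not L1-penalized).
print('Zero-weight features-->',np.sum(L2Fea.estimator_.coef_==0))

Total features--> 37
Selected features--> 17
Zero-weight features--> 0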
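
The Ridge output above reports 17 selected features even though no weight is exactly zero. As far as the scikit-learn documentation describes it, when no `threshold` is passed and the estimator is not L1-penalized, `SelectFromModel` keeps the features whose absolute weight is at least the mean absolute weight. The sketch below checks this on synthetic data; the data and variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
# Two strong features, one weak feature, five noise features
y = 6.0 * X[:, 0] + 4.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(size=300)

selector = SelectFromModel(Ridge(alpha=100)).fit(X, y)
abs_coef = np.abs(selector.estimator_.coef_)
support = selector.get_support()

print("Threshold used:", selector.threshold_)
print("Mean |weight|: ", abs_coef.mean())
print("Selected:", int(support.sum()))
print("Not selected:", int((~support).sum()))
print("Exactly zero:", int(np.sum(selector.estimator_.coef_ == 0)))
```

Counting `~get_support()` therefore gives the number of features actually dropped, whereas counting zero weights only matches it for L1-penalized models such as Lasso.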


### Elastic Net

In [45]:
ElaNet=SelectFromModel(ElasticNet(alpha=100, l1_ratio=0.9))
ElaNet.fit(scaler.transform(X_train.fillna(0)), y_train)

SelectFromModel(estimator=ElasticNet(alpha=100, l1_ratio=0.9))

In [46]:
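print('Total features-->',X_train.shape[1])
print('Selected features-->',sum(ElaNet.get_support()))
# As with Ridge, this counts weights that are exactly zero, not the features dropped:
# with l1_ratio below 1, SelectFromModel drops features whose |weight| is below its threshold
# (by default the mean absolute weight).
print('Zero-weight features-->',np.sum(ElaNet.estimator_.coef_==0))

Total features--> 37
Selected features--> 16
Zero-weight features--> 0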
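
The `l1_ratio` argument used above corresponds to the mix ratio r described earlier. A minimal sketch on synthetic data (the data and `alpha=0.3` are illustrative assumptions) checks the r = 1 end of the range, where `ElasticNet` should give the same weights as `Lasso`.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=150)

lasso = Lasso(alpha=0.3).fit(X, y)
enet_as_lasso = ElasticNet(alpha=0.3, l1_ratio=1.0).fit(X, y)
enet_mixed = ElasticNet(alpha=0.3, l1_ratio=0.5).fit(X, y)

same_as_lasso = np.allclose(lasso.coef_, enet_as_lasso.coef_)
print("Same weights as Lasso when l1_ratio=1:", same_as_lasso)
print("Zero weights, l1_ratio=1.0:", int(np.sum(enet_as_lasso.coef_ == 0)))
print("Zero weights, l1_ratio=0.5:", int(np.sum(enet_mixed.coef_ == 0)))
```

The r = 0 end is left out on purpose: scikit-learn's coordinate-descent solver is not well suited to a pure L2 penalty, and `Ridge` is the usual choice in that case.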
