# ML Regularization Techniques for Polynomial Regression

The aim of this exercise is to compare how different types of regularization affect the polynomial (cubic) regression models fitted to the Advertising dataset.
- For L1 regularization, use `LassoCV(eps=0.1, n_alphas=100, cv=5)`.
- For L2 regularization, use `RidgeCV(alphas=(0.1, 1.0, 10.0), scoring='neg_mean_absolute_error')`.
- For combined L1 and L2 regularization, use `ElasticNetCV(l1_ratio=[.1, .5, .7, .9, .95, .99, 1], tol=0.01)`.
- We will output the RMSE and coefficients, and plot the output, for each model.
- We will also scale the features before passing the data to the models.


## Imports

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

## Data and Setup

In [4]:
df = pd.read_csv("Advertising.csv")
X = df.drop('sales',axis=1)
y = df['sales']

### Polynomial Conversion

In [5]:
from sklearn.preprocessing import PolynomialFeatures
polynomial_converter = PolynomialFeatures(degree=3,include_bias=False)
poly_features = polynomial_converter.fit_transform(X)

### Train | Test Split

In [6]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(poly_features, y, test_size=0.3, random_state=101)



## Scaling the Data

In [7]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
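
The scaler is fitted on the training rows only, so no test-set statistics leak into the model. The sketch below illustrates the consequence on synthetic data: the scaled training columns have mean 0 and standard deviation 1, while the scaled test columns are only approximately centered. The arrays `train_demo` and `test_demo` are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a train/test split (not the Advertising data).
rng = np.random.default_rng(0)
train_demo = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
test_demo = rng.normal(loc=50.0, scale=10.0, size=(30, 3))

demo_scaler = StandardScaler()
demo_scaler.fit(train_demo)                       # statistics come from training rows only
train_scaled = demo_scaler.transform(train_demo)
test_scaled = demo_scaler.transform(test_demo)    # reuses the training mean and std

print(train_scaled.mean(axis=0))   # approximately 0 in every column
print(train_scaled.std(axis=0))    # approximately 1 in every column
print(test_scaled.mean(axis=0))    # near 0, but not exactly
```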

-----

## RidgeCV Regression


In [8]:
from sklearn.linear_model import RidgeCV
ridge_cv_model = RidgeCV(alphas=(0.1, 1.0, 10.0),scoring='neg_mean_absolute_error')
ridge_cv_model.fit(X_train,y_train)
ridge_cv_model.coef_
test_predictions = ridge_cv_model.predict(X_test)

In [9]:
from sklearn.metrics import mean_absolute_error,mean_squared_error
MSE = mean_squared_error(y_test,test_predictions)
RMSE = np.sqrt(MSE)
RMSE

0.6180719926938822
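
The RMSE alone does not show which of the three candidate alphas was chosen. A minimal sketch of how that could be inspected, using synthetic data rather than the Advertising set (`X_demo`, `y_demo` and `demo_ridge` are hypothetical names):

```python
import numpy as np
from sklearn.linear_model import RidgeCV

# Synthetic regression problem (made-up coefficients and noise level).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 5))
true_coef = np.array([1.5, 0.0, -2.0, 0.5, 0.0])
y_demo = X_demo @ true_coef + rng.normal(scale=0.1, size=60)

alphas = (0.1, 1.0, 10.0)
demo_ridge = RidgeCV(alphas=alphas, scoring='neg_mean_absolute_error')
demo_ridge.fit(X_demo, y_demo)

chosen_alpha = float(demo_ridge.alpha_)   # the grid value with the best CV score
best_score = demo_ridge.best_score_       # negated MAE, so closer to 0 is better

print(chosen_alpha)
print(best_score)
```

If `alpha_` lands on the smallest or largest value in the grid, it may be worth extending the grid in that direction before drawing conclusions.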


-----

## LassoCV Regression

In [10]:
from sklearn.linear_model import LassoCV
lasso_cv_model = LassoCV(eps=0.1,n_alphas=100,cv=5)
lasso_cv_model.fit(X_train,y_train)
lasso_cv_model.coef_

array([1.002651  , 0.        , 0.        , 0.        , 3.79745279,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        ])

In [11]:
test_predictions = lasso_cv_model.predict(X_test)
MSE = mean_squared_error(y_test,test_predictions)
RMSE = np.sqrt(MSE)
RMSE

1.1308001022762548
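
Only two of the 19 coefficients survive, and the RMSE is noticeably higher than for ridge. One plausible reading is that `eps=0.1` makes the alpha search very coarse: `eps` is the ratio of the smallest to the largest alpha on the automatically generated path, so the weakest penalty tried is still one tenth of the penalty that zeroes every coefficient (the scikit-learn default is `eps=1e-3`). The sketch below checks that ratio on synthetic data; all names and values in it are hypothetical, and `n_alphas` is left at its default of 100.

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Synthetic data: only the first two columns actually drive the target.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(80, 6))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=80)

demo_lasso = LassoCV(eps=0.1, cv=5)
demo_lasso.fit(X_demo, y_demo)

# Ratio of weakest to strongest penalty on the path; should equal eps.
path_ratio = demo_lasso.alphas_.min() / demo_lasso.alphas_.max()
n_nonzero = int(np.count_nonzero(demo_lasso.coef_))

print(path_ratio)          # ratio alpha_min / alpha_max
print(demo_lasso.alpha_)   # penalty strength selected by cross-validation
print(n_nonzero)           # number of coefficients that were not zeroed
```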

-----

## Elastic Net

Elastic Net combines the L1 penalty of lasso with the L2 penalty of ridge regression. The `l1_ratio` parameter sets the mix between the two, and `l1_ratio=1` corresponds to a pure lasso penalty.
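
For reference, the objective that scikit-learn documents for its elastic net estimators can be written as follows, where $n$ is the number of samples, $w$ the coefficient vector, $\alpha$ the overall penalty strength and $\rho$ the `l1_ratio` (the symbol $\rho$ is a notational choice made here, not one used by the library):

$$
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 \;+\; \alpha \rho \lVert w \rVert_1 \;+\; \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
$$

Setting $\rho = 1$ removes the squared L2 term and leaves the lasso objective, which is why the `l1_ratio` grid below includes 1.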

In [12]:
from sklearn.linear_model import ElasticNetCV
elastic_model = ElasticNetCV(l1_ratio=[.1, .5, .7,.9, .95, .99, 1],tol=0.01)
elastic_model.fit(X_train,y_train)
elastic_model.coef_

array([ 3.78993643,  0.89232919,  0.28765395, -1.01843566,  2.15516144,
       -0.3567547 , -0.271502  ,  0.09741081,  0.        , -1.05563151,
        0.2362506 ,  0.07980911,  1.26170778,  0.01464706,  0.00462336,
       -0.39986069,  0.        ,  0.        , -0.05343757])

In [13]:
test_predictions = elastic_model.predict(X_test)
MSE = mean_squared_error(y_test,test_predictions)
RMSE = np.sqrt(MSE)
RMSE

0.7485546215633726
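
To see which mix of penalties cross-validation preferred, the fitted model exposes `l1_ratio_` and `alpha_`. A minimal sketch on synthetic data (the names `X_demo`, `y_demo` and `demo_enet` are hypothetical, and the result on the Advertising data may differ):

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

# Synthetic data: only the first two columns actually drive the target.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(80, 6))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=80)

ratios = [.1, .5, .7, .9, .95, .99, 1]
demo_enet = ElasticNetCV(l1_ratio=ratios, tol=0.01)
demo_enet.fit(X_demo, y_demo)

chosen_ratio = float(demo_enet.l1_ratio_)   # selected L1/L2 mix from the grid
chosen_alpha = float(demo_enet.alpha_)      # selected overall penalty strength

print(chosen_ratio)
print(chosen_alpha)
```

A selected ratio of 1 would mean the search effectively fell back to lasso; values near 0.1 indicate a mostly ridge-like penalty.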