# Regularization with SciKit-Learn

Regularization minimizes the RSS (residual sum of squares) *plus* a penalty term. The penalty term penalizes models whose coefficients are too large. Some regularization methods can shrink the coefficients of non-useful features all the way to zero, in which case the model effectively ignores those features.

Let's explore Elastic Net, which combines the penalties of two regularization methods: Ridge Regression (L2) and Lasso (L1). Regularization tends to pay off more on a larger feature set, such as polynomial features, than on a small original feature set; this notebook fits the model directly on the original X.
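
For reference, the objective that scikit-learn documents for `ElasticNet` can be written as below, where $w$ is the coefficient vector, $n$ is the number of training samples, $\alpha$ is the overall penalty strength (`alpha`), and $\rho$ is the L1 share (`l1_ratio`). The symbols $w$, $n$, and $\rho$ are notation introduced here for illustration, not taken from the notebook.

$$
\min_{w} \; \frac{1}{2n} \lVert y - Xw \rVert_2^2 + \alpha \rho \lVert w \rVert_1 + \frac{\alpha (1 - \rho)}{2} \lVert w \rVert_2^2
$$

Setting $\rho = 1$ leaves only the L1 (Lasso) penalty, while values of $\rho$ close to 0 make the penalty behave like an L2 (Ridge-style) penalty.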

## Imports

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [2]:
from google.colab import drive 
drive.mount('/content/drive')

Mounted at /content/drive


## Data and Setup

In [3]:
df = pd.read_csv("/content/drive/MyDrive/Datasets/Advertising.csv")
X = df.drop('sales',axis=1)
y = df['sales']

### Train | Test Split

In [4]:
from sklearn.model_selection import train_test_split

In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)

----
----

## Scaling the Data

While our particular data set has all the values in the same order of magnitude (1000s of dollars spent), that typically won't be the case. Because the penalty term in regularized models sums the coefficient magnitudes across features, features on different scales would be penalized unevenly, so it's important to standardize the features. Review the theory videos for more info, as well as a discussion on why we only **fit** to the training data and then **transform** both sets separately.

In [6]:
from sklearn.preprocessing import StandardScaler

In [7]:
scaler = StandardScaler()

In [8]:
scaler.fit(X_train)

StandardScaler()

In [9]:
X_train = scaler.transform(X_train)

In [10]:
X_test = scaler.transform(X_test)
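
The fit-on-train, transform-both pattern can be checked with a minimal sketch. It uses synthetic stand-in data (an assumption, not the Advertising dataset), and the `_demo`, `X_tr`, and `X_te` names are hypothetical. The training set ends up with mean 0 and standard deviation 1 by construction, while the test set is only approximately standardized because it reuses the training statistics.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in features on different scales (assumption for illustration)
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=[100.0, 20.0, 30.0], scale=[50.0, 10.0, 15.0], size=(200, 3))

X_tr, X_te = train_test_split(X_demo, test_size=0.3, random_state=101)

demo_scaler = StandardScaler()
X_tr_scaled = demo_scaler.fit_transform(X_tr)  # learn mean/std from training rows only
X_te_scaled = demo_scaler.transform(X_te)      # reuse the training mean/std, no refit

print(X_tr_scaled.mean(axis=0))  # approximately 0 for every feature
print(X_tr_scaled.std(axis=0))   # approximately 1 for every feature
print(X_te_scaled.mean(axis=0))  # close to 0, but not exactly
```

Fitting the scaler on the full data set would let test-set statistics leak into training, which is why the scaler here never sees `X_te` during `fit`.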

## ElasticNet Regression

In [11]:
from sklearn.linear_model import ElasticNet

In [12]:
elastic_model = ElasticNet(alpha=10,l1_ratio=0.5)

In [13]:
elastic_model.fit(X_train,y_train)

ElasticNet(alpha=10)

In [14]:
test_predictions = elastic_model.predict(X_test)

In [15]:
from sklearn.metrics import mean_absolute_error,mean_squared_error
MAE = mean_absolute_error(y_test,test_predictions)
MSE = mean_squared_error(y_test,test_predictions)
RMSE = np.sqrt(MSE)
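
As a small worked example of how these three metrics relate, the sketch below computes them on a toy pair of arrays (made up for illustration, not taken from the dataset) and compares the scikit-learn results with the same quantities computed by hand.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Toy values chosen for easy arithmetic (hypothetical, not from the notebook)
y_true_demo = np.array([3.0, 5.0, 7.0])
y_pred_demo = np.array([2.0, 5.0, 9.0])

mae_demo = mean_absolute_error(y_true_demo, y_pred_demo)
mse_demo = mean_squared_error(y_true_demo, y_pred_demo)
rmse_demo = np.sqrt(mse_demo)  # RMSE is back in the units of the target

# The same quantities by hand: errors are 1, 0, -2
errors = y_true_demo - y_pred_demo
mae_by_hand = np.abs(errors).mean()     # (1 + 0 + 2) / 3
mse_by_hand = (errors ** 2).mean()      # (1 + 0 + 4) / 3

print(mae_demo, mse_demo, rmse_demo)
```

Because MSE squares the errors, the single error of 2 contributes most of the MSE, which is why RMSE comes out larger than MAE here.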



In [16]:
MAE

4.618428571428571

In [17]:
MSE

29.159716326530617

In [18]:
RMSE

5.399973733874139
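
One possible reading of these large errors is that `alpha=10` over-penalizes the coefficients. The notebook does not inspect `elastic_model.coef_`, so this is not confirmed here. The sketch below illustrates the effect on synthetic data (an assumption, with hypothetical `strong` and `weak` model names): with standardized features, a strong penalty can push every coefficient to zero, leaving a model that predicts the mean of the target.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.preprocessing import StandardScaler

# Synthetic data: two informative features and one pure-noise feature (assumption)
rng = np.random.default_rng(42)
X_raw = rng.normal(size=(200, 3))
y_demo = 14.0 + 3.0 * X_raw[:, 0] + 2.0 * X_raw[:, 1] + rng.normal(scale=0.5, size=200)
X_demo = StandardScaler().fit_transform(X_raw)

strong = ElasticNet(alpha=10, l1_ratio=0.5).fit(X_demo, y_demo)   # heavy penalty
weak = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X_demo, y_demo)    # light penalty

print(strong.coef_)       # all coefficients shrunk to zero on this data
print(strong.intercept_)  # equals the mean of y_demo when every coefficient is zero
print(weak.coef_)         # the informative features keep non-zero coefficients
```

Checking `coef_` after fitting is a quick way to see whether a chosen `alpha` has removed all of the signal.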

### Choosing an alpha value with Cross-Validation

In [19]:
from sklearn.linear_model import ElasticNetCV

In [21]:
elastic_cv_model = ElasticNetCV(alphas=(0.1, 1.0, 10.0),l1_ratio=(0.25,0.5,0.75))

In [22]:
elastic_cv_model.fit(X_train,y_train)

ElasticNetCV(alphas=(0.1, 1.0, 10.0), l1_ratio=(0.25, 0.5, 0.75))

In [23]:
elastic_cv_model.alpha_

0.1
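
When `l1_ratio` is given as a list, `ElasticNetCV` cross-validates over it as well and stores the selected value in `l1_ratio_`. The sketch below shows how both selected hyperparameters and the per-fold error grid could be inspected. It runs on synthetic data (an assumption), with hypothetical names such as `cv_demo`, and uses an explicit `cv=5`.

```python
import numpy as np
from sklearn.linear_model import ElasticNetCV

# Synthetic, roughly standardized data (assumption for illustration)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = 3.0 * X_demo[:, 0] + 2.0 * X_demo[:, 1] + rng.normal(scale=0.5, size=200)

alphas = [0.1, 1.0, 10.0]
l1_ratios = [0.25, 0.5, 0.75]

cv_demo = ElasticNetCV(alphas=alphas, l1_ratio=l1_ratios, cv=5)
cv_demo.fit(X_demo, y_demo)

print(cv_demo.alpha_)           # selected penalty strength
print(cv_demo.l1_ratio_)        # selected L1 share
print(cv_demo.mse_path_.shape)  # (n_l1_ratio, n_alphas, n_folds)
```

If the selected `alpha_` sits at the edge of the grid, as `0.1` does in the notebook, it may be worth extending the grid in that direction before settling on a value.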

In [24]:
test_predictions = elastic_cv_model.predict(X_test)

In [25]:
MAE = mean_absolute_error(y_test,test_predictions)
MSE = mean_squared_error(y_test,test_predictions)
RMSE = np.sqrt(MSE)

In [26]:
MAE

1.2329000355831425

In [27]:
MSE

2.4572892113717093

In [28]:
RMSE

1.56757430808613

In [29]:
elastic_cv_model.coef_

array([3.60938565, 2.63372507, 0.        ])
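
Because the scaler returns a plain NumPy array, the coefficients above no longer carry column names. A minimal sketch for pairing them back up is shown below. The feature names are an assumption about the column order of `Advertising.csv` (in the notebook, `X.columns` taken before scaling would be the reliable source), and `coef_series` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Assumed column names and order; in the notebook, use X.columns captured before scaling
feature_names = ["TV", "radio", "newspaper"]

# Coefficient values copied from the output above
coefs = np.array([3.60938565, 2.63372507, 0.0])

coef_series = pd.Series(coefs, index=feature_names)
dropped = coef_series[coef_series == 0].index.tolist()  # features the L1 part zeroed out
kept = coef_series[coef_series != 0].index.tolist()

print(coef_series)
print("Dropped:", dropped)
```

A coefficient of exactly zero means the model ignores that feature, which is the feature-selection effect described in the introduction.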