# Regularization

Regularization is a technique used in machine learning and statistics to prevent overfitting of models by adding a penalty term to the loss function. Overfitting occurs when a model fits the training data too closely and captures noise, resulting in poor generalization to new, unseen data. Regularization methods introduce a trade-off between fitting the training data well and keeping the model's complexity in check.

There are several common types of regularization, including Ridge, Lasso, and Elastic Net regularization.
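The three penalties can be written out side by side. The notation here is an assumption introduced for illustration: $\beta_j$ are the $p$ model coefficients, $\hat{y}_i$ the model's predictions, $\alpha \ge 0$ the regularization strength, and $\rho \in [0, 1]$ the Elastic Net mixing weight. Libraries differ in constant scaling factors, so this is the general shape rather than any one library's exact objective.

$$
\begin{aligned}
\text{Ridge (L2):} \quad & \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \alpha \sum_{j=1}^{p} \beta_j^2 \\
\text{Lasso (L1):} \quad & \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \alpha \sum_{j=1}^{p} |\beta_j| \\
\text{Elastic Net:} \quad & \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \alpha \left( \rho \sum_{j=1}^{p} |\beta_j| + (1 - \rho) \sum_{j=1}^{p} \beta_j^2 \right)
\end{aligned}
$$

Setting $\alpha = 0$ removes the penalty and recovers ordinary least squares in all three cases.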

## Ridge Regularization (L2 Regularization)

Ridge regularization adds a penalty term proportional to the sum of the squared model coefficients (the squared L2 norm). The added term encourages smaller coefficient values, shrinking them towards zero without setting them exactly to zero. This helps prevent the model from relying too heavily on any one feature, reducing the risk of overfitting.
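The shrinking effect can be seen directly from the closed-form ridge solution. The following is a minimal sketch on synthetic data, not the notebook's dataset: `X_demo`, `y_demo`, `true_coef` and the helper `ridge_closed_form` are hypothetical names introduced here, and the helper omits the intercept that scikit-learn's `Ridge` fits by default.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (hypothetical, for illustration only): 50 samples, 3 features
X_demo = rng.normal(size=(50, 3))
true_coef = np.array([3.0, -2.0, 0.5])
y_demo = X_demo @ true_coef + rng.normal(scale=0.5, size=50)


def ridge_closed_form(X, y, alpha):
    """Solve (X^T X + alpha * I) beta = X^T y. No intercept term is fitted."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


# Record the L2 norm of the coefficient vector for increasing alpha
coef_norms = {}
for alpha in (0.0, 10.0, 1000.0):
    beta = ridge_closed_form(X_demo, y_demo, alpha)
    coef_norms[alpha] = float(np.linalg.norm(beta))
    print(f"alpha={alpha}: coefficients={np.round(beta, 3)}")
```

As `alpha` grows, the coefficient vector's norm gets smaller, while `alpha=0.0` reproduces the ordinary least squares fit.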

**Importing Libraries**

In [20]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import Ridge, RidgeCV
from sklearn.metrics import mean_squared_error, r2_score

**Importing Dataset**

In [2]:
df = pd.read_csv('/kaggle/input/advertising/Advertising.csv')

df.head()

| Unnamed: 0 | TV | radio | newspaper | sales |
|---|---|---|---|---|
| 0 | 230.1 | 37.8 | 69.2 | 22.1 |
| 1 | 44.5 | 39.3 | 45.1 | 10.4 |
| 2 | 17.2 | 45.9 | 69.3 | 9.3 |
| 3 | 151.5 | 41.3 | 58.5 | 18.5 |
| 4 | 180.8 | 10.8 | 58.4 | 12.9 |


**Setting Up Features and Labels**

In [3]:
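# Features: every column except the target.
# Note: this keeps the 'Unnamed: 0' column from the CSV as a feature.
X = df.drop('sales', axis=1)

# Target
y = df['sales']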
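The `Unnamed: 0` column in the preview looks like a row index that was saved into the CSV, and dropping only `sales` leaves it in `X`. Below is a sketch of one way to keep it out of the features, assuming the column carries no predictive information. It reads a few inline rows that mirror the preview (hypothetical stand-ins, so the example runs without the Kaggle file); `csv_text`, `demo_df`, `demo_indexed`, `X_demo` and `y_demo` are names introduced here.

```python
import io

import pandas as pd

# Inline rows in the same layout as the preview; the empty first header
# field is what pandas reports as 'Unnamed: 0'
csv_text = """,TV,radio,newspaper,sales
0,230.1,37.8,69.2,22.1
1,44.5,39.3,45.1,10.4
2,17.2,45.9,69.3,9.3
"""

# Default read: the index column shows up as an ordinary column
demo_df = pd.read_csv(io.StringIO(csv_text))
print(demo_df.columns.tolist())

# index_col=0 tells pandas to use the first column as the row index instead
demo_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

X_demo = demo_indexed.drop(columns="sales")
y_demo = demo_indexed["sales"]
print(X_demo.columns.tolist())
```

Applying the same change to the real file would alter the feature matrix, so the metrics reported further down would need to be recomputed.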

**Making Train-Test Split**

In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

**Making Polynomial Features**

In [7]:
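# Expands the features with all polynomial and interaction terms up to degree 3;
# include_bias=False omits the constant column
polynomial_converter = PolynomialFeatures(degree=3, include_bias=False)

# Note: as written, the later cells scale and fit on X_train/X_test from the split
# above, so poly_features is not used by the model below
poly_features = polynomial_converter.fit_transform(X)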

**Scaling The Values**

In [9]:
# Creating a scaler object
scaler = StandardScaler()

# Fit on the training set only, so that test-set statistics do not leak into the scaling
scaler.fit(X_train)

In [10]:
# Transforming the data
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
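The expand, scale, and fit steps can also be chained so that every transformer is fitted on the training split only. The following is a sketch under stated assumptions, not the notebook's own code: it uses scikit-learn's `make_pipeline` and synthetic data in place of the advertising file, and `X_syn`, `y_syn`, `X_tr`, `X_te`, `y_tr`, `y_te`, `pipeline` and `preds` are names introduced here.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in data (hypothetical): 200 rows, 3 features, one interaction effect
X_syn = rng.uniform(0, 100, size=(200, 3))
y_syn = (
    0.05 * X_syn[:, 0]
    + 0.1 * X_syn[:, 1]
    + 0.001 * X_syn[:, 0] * X_syn[:, 1]
    + rng.normal(scale=1.0, size=200)
)

# Split first, so nothing downstream ever sees the test rows during fitting
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.3, random_state=42
)

# Polynomial expansion -> scaling -> ridge, fitted as one estimator
pipeline = make_pipeline(
    PolynomialFeatures(degree=3, include_bias=False),
    StandardScaler(),
    Ridge(alpha=10),
)
pipeline.fit(X_tr, y_tr)

# predict() applies the already-fitted transformers to the test rows
preds = pipeline.predict(X_te)
print(preds.shape)
```

With this layout the polynomial features actually reach the model, and the scaler cannot be fitted on test data by accident.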

**Applying Ridge Regularization**

In [18]:
# Creating a Ridge (L2 regularization) model

# The alpha parameter controls the regularization strength: the larger the value,
# the stronger the shrinkage of the coefficients. For now it is set to 10; later
# we will see how to choose it through cross-validation

ridge_model = Ridge(alpha=10)

ridge_model.fit(X_train, y_train)

test_predictions = ridge_model.predict(X_test)

test_predictions

array([16.50249112, 20.81554936, 21.13973058, 11.15778468, 21.59441934,
       13.49388297, 20.73958974,  7.77642646, 13.3126007 , 15.08894404,
        9.26736097,  7.04807098, 14.51785597,  9.24574187,  9.78494176,
       12.19412986,  9.30509884, 15.99510771, 10.57402776, 18.3849271 ,
       19.57270247, 14.09359132, 12.54580315, 21.21640484,  7.96072489,
        6.16803043, 20.50139222, 11.9173088 ,  9.42176627,  8.79630754,
       12.31170051, 10.21970273, 21.34545432, 13.07005854, 17.90201342,
       19.62424852, 14.18104606, 20.62171767, 10.99563434,  5.01448519,
        9.82087283, 12.46943071, 10.45972618,  8.46842026, 13.21756003,
        5.77773408,  9.66359733, 14.08010366,  9.09065376, 11.78949041,
       15.63323844, 11.76930757, 13.55543071, 11.42142492,  6.82476481,
       10.08301484,  9.68995129, 23.68572974,  8.08684949, 12.23735929])

In [19]:
# Calculate metrics to evaluate the model
mse = mean_squared_error(y_test, test_predictions)
r2 = r2_score(y_test, test_predictions)

print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")

# An R-squared of about 0.86 means the model explains roughly 86% of the
# variance in sales on the test set.

Mean Squared Error: 3.9176600098141585
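Both metrics are simple functions of the residuals, and the MSE above corresponds to a root mean squared error of about 1.98 in the units of `sales`. The sketch below computes the metrics by hand and compares them with scikit-learn. The `y_true` values are the five `sales` figures from the preview; the `y_hat` predictions are made up for illustration and are not model output.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([22.1, 10.4, 9.3, 18.5, 12.9])  # sales values from the preview
y_hat = np.array([21.0, 11.0, 10.0, 17.5, 13.5])  # hypothetical predictions

residuals = y_true - y_hat

# MSE: average squared residual; RMSE puts it back in the target's units
mse_manual = np.mean(residuals ** 2)
rmse_manual = np.sqrt(mse_manual)

# R-squared: 1 minus (residual sum of squares / total sum of squares)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print(mse_manual, mean_squared_error(y_true, y_hat))
print(r2_manual, r2_score(y_true, y_hat))
print(rmse_manual)
```

The hand-computed values match the library functions, which shows that R-squared compares the model's squared error with that of always predicting the mean.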
R-squared: 0.856520190218898


**Getting The Alpha Value Using Cross Validation**

In [22]:
# Setting up the model

# RidgeCV evaluates each candidate alpha with cross-validation (efficient
# leave-one-out by default) and stores the best-performing one in alpha_

ridge_cv_model = RidgeCV(alphas=(0.1, 1.0, 10.0))

ridge_cv_model.fit(X_train, y_train)

print(ridge_cv_model.alpha_)

0.1
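The selected value, 0.1, is the smallest candidate in the grid, so the best alpha may lie at or below that boundary. A common follow-up is to search a wider, logarithmically spaced grid. The sketch below shows the idea on synthetic data; the grid bounds, the `scoring` choice, and the names `X_syn`, `y_syn`, `alphas` and `wide_cv_model` are assumptions introduced here, not part of the original notebook.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(0)

# Synthetic stand-in for the scaled training data (hypothetical)
X_syn = rng.normal(size=(140, 3))
y_syn = X_syn @ np.array([4.0, 3.0, 0.1]) + rng.normal(scale=1.5, size=140)

# 13 candidates from 0.001 to 1000, evenly spaced on a log scale
alphas = np.logspace(-3, 3, 13)

wide_cv_model = RidgeCV(alphas=alphas, scoring="neg_mean_squared_error")
wide_cv_model.fit(X_syn, y_syn)

# The chosen alpha is always one of the supplied candidates
print(wide_cv_model.alpha_)
```

If the chosen value again sits at an edge of the grid, the grid can be extended further in that direction.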
