# Regularization

## Libraries

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso, Ridge, LinearRegression
from sklearn.metrics import mean_squared_error

## 1. Regularization in Machine Learning

**Regularization** is a technique used in machine learning to prevent **overfitting**. Overfitting happens when a model learns not just the underlying pattern of the data but also its noise. As a result, the model performs well on the training data but poorly on unseen data (e.g., validation or test data).

To counter this, regularization adds a penalty term to the loss function that the algorithm optimizes. The penalty discourages overly complex models, a common cause of overfitting. In essence, the algorithm still aims to fit the data accurately, but regularization keeps it from fitting "too perfectly," so the model generalizes better.
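To make the idea of "loss plus penalty" concrete, here is a minimal sketch. The function `penalized_mse` and the toy arrays are hypothetical names introduced only for illustration, and the squared-weights penalty is just one possible choice.

```python
import numpy as np

def penalized_mse(w, X, y, alpha):
    """Mean squared error plus alpha times a penalty on the weights."""
    data_fit = np.mean((X @ w - y) ** 2)
    penalty = np.sum(w ** 2)  # assumed penalty for this sketch; other choices exist
    return data_fit + alpha * penalty

# Toy data (made up for illustration)
X_toy = np.array([[1.0, 0.0], [0.0, 1.0]])
y_toy = np.array([1.0, 2.0])

w_exact = np.array([1.0, 2.0])  # reproduces y_toy exactly, but has larger weights
w_small = np.array([0.5, 1.0])  # smaller weights, imperfect fit

for alpha in (0.0, 1.0):
    print(
        f"alpha={alpha}: "
        f"exact fit -> {penalized_mse(w_exact, X_toy, y_toy, alpha)}, "
        f"small weights -> {penalized_mse(w_small, X_toy, y_toy, alpha)}"
    )
```

Without a penalty (`alpha=0.0`) the exact fit has the lower loss. With `alpha=1.0` the smaller weights win, even though they fit the data less closely.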

There are various regularization techniques, with the most prevalent being:

## L1 Regularization (Lasso Regression)
This method adds a penalty proportional to the sum of the absolute values of the coefficients. Because this penalty can drive some coefficients exactly to zero, Lasso effectively performs feature selection.

$$ L = \text{Loss}(Data|Model) + \alpha \sum |w_i| $$
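The feature-selection effect can be seen on a small synthetic problem. This is a sketch under assumed settings: the data, the `_demo` variable names, and `alpha=0.5` are made up for illustration, and only the first two of five features actually influence the target.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
# Only features 0 and 1 carry signal; features 2-4 are irrelevant
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=200)

lasso_demo = Lasso(alpha=0.5).fit(X_demo, y_demo)

print("Lasso coefficients:", lasso_demo.coef_)
print("Coefficients set exactly to zero:", int(np.sum(lasso_demo.coef_ == 0.0)))
```

The coefficients of the irrelevant features come out as exact zeros, while the two informative ones keep their signs but are shrunk toward zero.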

## L2 Regularization (Ridge Regression)
This technique adds a penalty proportional to the sum of the squared coefficients. It shrinks coefficients toward zero and keeps any single one from dominating the output, but it generally does not set them exactly to zero.

$$ L = \text{Loss}(Data|Model) + \alpha \sum w_i^2 $$
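As a rough illustration of shrinkage without elimination, the sketch below compares ordinary least squares with Ridge on the same synthetic data. The data, the `_demo` names, and `alpha=10.0` are assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 5))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.5, size=50)

ols_demo = LinearRegression().fit(X_demo, y_demo)
ridge_demo = Ridge(alpha=10.0).fit(X_demo, y_demo)

# Compare the overall size of the coefficient vectors
print("OLS coefficient norm:  ", np.linalg.norm(ols_demo.coef_))
print("Ridge coefficient norm:", np.linalg.norm(ridge_demo.coef_))
print("Ridge coefficients equal to zero:", int(np.sum(ridge_demo.coef_ == 0.0)))
```

The Ridge coefficient vector is smaller than the unregularized one, yet every coefficient stays nonzero, including those of the features that do not affect the target.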

## Elastic Net
Elastic Net is a hybrid approach that adds both the L1 and the L2 penalty to the loss, combining Lasso's feature selection with Ridge's shrinkage.
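In scikit-learn, the mix between the two penalties is set by the `l1_ratio` parameter of `ElasticNet`. The sketch below uses made-up data and `_demo` names. It fits an even mix (`l1_ratio=0.5`) and checks that `l1_ratio=1.0` behaves like a pure L1 penalty by comparing against `Lasso`.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=200)

# Even mix of L1 and L2 penalties
enet_mixed = ElasticNet(alpha=0.5, l1_ratio=0.5).fit(X_demo, y_demo)

# l1_ratio=1.0 keeps only the L1 part, so it should match Lasso
enet_l1_only = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X_demo, y_demo)
lasso_demo = Lasso(alpha=0.5).fit(X_demo, y_demo)

print("Elastic Net (l1_ratio=0.5):", enet_mixed.coef_)
print("Elastic Net (l1_ratio=1.0):", enet_l1_only.coef_)
print("Lasso:                     ", lasso_demo.coef_)
```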

In all three methods, the strength of the regularization is controlled by the parameter $\alpha$. At $\alpha = 0$ the penalty vanishes, and a value near 0 means minimal regularization. As $\alpha$ grows, so does the strength of the regularization, pushing the coefficients closer to zero.
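The effect of increasing the regularization strength can be checked with a small sweep. This is a sketch on assumed synthetic data, using Ridge as the example model. The `alphas` grid and the `_demo` names are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo @ np.array([4.0, -3.0, 2.0]) + rng.normal(scale=0.5, size=50)

alphas = [0.01, 1.0, 100.0, 10000.0]
coef_norms = []
for alpha in alphas:
    model = Ridge(alpha=alpha).fit(X_demo, y_demo)
    coef_norms.append(np.linalg.norm(model.coef_))
    print(f"alpha={alpha:>8}: coefficients={np.round(model.coef_, 3)}")
```

The printed coefficients shrink steadily as `alpha` increases, and for the largest value they are close to zero.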


## 2. Example of Regularization in Python

In [None]:
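# Generate 40 noisy samples of y = 2x + 1 + 10*sin(x) with x in [0, 10)
np.random.seed(0)  # fixed seed so the results are reproducible
X = np.sort(np.random.rand(40, 1) * 10, axis=0)
y = 2 * X.squeeze() + 1 + np.sin(X.squeeze()) * 10 + np.random.randn(40) * 2

# X is sorted, so this positional split trains on the 30 smallest x values
# and tests on the 10 largest ones (the models are evaluated on extrapolation)
X_train, X_test = X[:30], X[30:]
y_train, y_test = y[:30], y[30:]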

# Linear regression without regularization
linear_model = LinearRegression().fit(X_train, y_train)
y_pred_linear = linear_model.predict(X_test)

# Linear regression with L1 regularization (Lasso)
lasso_model = Lasso(alpha=1.0).fit(X_train, y_train)
y_pred_lasso = lasso_model.predict(X_test)

# Linear regression with L2 regularization (Ridge)
ridge_model = Ridge(alpha=1.0).fit(X_train, y_train)
y_pred_ridge = ridge_model.predict(X_test)

# Plotting the results
plt.figure(figsize=(10, 6))
plt.scatter(X_train, y_train, color='blue', s=10, label='Train data')
plt.scatter(X_test, y_test, color='red', s=10, label='Test data')
plt.plot(X_test, y_pred_linear, color='cyan', label='Linear Regression')
plt.plot(X_test, y_pred_lasso, color='magenta', label='Lasso Regression')
plt.plot(X_test, y_pred_ridge, color='green', label='Ridge Regression')
plt.legend()
plt.show()

# Mean squared error on the test set (lower is better)
print("Linear Regression MSE:", mean_squared_error(y_test, y_pred_linear))
print("Lasso Regression MSE:", mean_squared_error(y_test, y_pred_lasso))
print("Ridge Regression MSE:", mean_squared_error(y_test, y_pred_ridge))