# Ridge Regression

Ridge Regression is a regularized version of Linear Regression in which a regularization term equal to $\lambda \sum_{j} \beta_j^2$ is added to the cost function. This forces the learning algorithm not only to fit the data but also to keep the model weights as small as possible. Note that the regularization term is added to the cost function only during training. Once the model is trained, its performance should be evaluated with the unregularized performance measure.
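
To make the distinction between the training cost and the evaluation measure concrete, here is a minimal sketch. The synthetic data, the 60/20 split, and `alpha = 1.0` are assumptions chosen only for illustration; the penalized objective is computed by hand from the fitted coefficients.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Synthetic data (an assumption for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=80)
X_train, X_test = X[:60], X[60:]
y_train, y_test = y[:60], y[60:]

alpha = 1.0
model = Ridge(alpha=alpha).fit(X_train, y_train)

# Training objective: squared error plus the L2 penalty on the coefficients
train_residuals = y_train - model.predict(X_train)
penalty = alpha * np.sum(model.coef_ ** 2)
training_objective = np.sum(train_residuals ** 2) + penalty

# Evaluation: plain mean squared error on held-out data, no penalty term
test_mse = mean_squared_error(y_test, model.predict(X_test))

print("Penalized training objective:", training_objective)
print("Unregularized test MSE:", test_mse)
```

The penalty only influences which coefficients are chosen during fitting; the reported test metric contains no penalty term.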

## What is Regularization?

Regularization is a technique used in machine learning and statistics to prevent overfitting of models on training data. Overfitting occurs when a model learns the training data too well, including its noise and outliers, leading to poor generalization to new, unseen data. Regularization helps to solve this problem by adding a penalty to the model's complexity.

## Ridge Regression Explained

Ridge regression, also known as Tikhonov regularization, is a type of linear regression that includes a regularization term. The key idea is to accept a slightly worse fit on the training data than ordinary least squares would give (a small increase in bias) in exchange for coefficients that vary less from sample to sample, and therefore better generalization to new data. This is particularly useful when:

- Dealing with multicollinearity (independent variables are highly correlated)
- The number of predictors (features) exceeds the number of observations
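
The multicollinearity case can be sketched with two almost identical predictors. The data below is synthetic and the noise scales and `alpha=1.0` are assumptions made for illustration; the point is only to compare how the two estimators distribute weight across the correlated columns.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data (assumed for illustration): x2 is nearly a copy of x1
rng = np.random.default_rng(42)
n = 30
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.5, size=n)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

# OLS can assign large offsetting weights to the two columns;
# ridge tends to split the weight between them more evenly.
print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge.coef_)
print("OLS coefficient norm:  ", np.linalg.norm(ols.coef_))
print("Ridge coefficient norm:", np.linalg.norm(ridge.coef_))
```

The ridge coefficient vector never has a larger norm than the least squares one, which is what stabilizes the fit when predictors are highly correlated.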

## Key Concept

**Regularization**: Ridge regression adds a penalty equal to $\lambda$ times the sum of the squared coefficients (the squared L2 norm of the coefficient vector). This penalty shrinks the coefficients towards zero, but it does not make them exactly zero.
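
The shrinkage behaviour can be observed by refitting the same small dataset with increasing penalty strengths. This is a minimal sketch; the list of `alpha` values is an arbitrary choice for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Small example data: y = 1*x1 + 2*x2 + 3
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3

alphas = [0.01, 0.5, 10, 1000]  # arbitrary values, weakest to strongest
coefs = []
for alpha in alphas:
    coef = Ridge(alpha=alpha).fit(X, y).coef_
    coefs.append(coef)
    print(f"alpha={alpha:<7} coefficients={coef}")

# Norm of the coefficient vector for each alpha
norms = [np.linalg.norm(c) for c in coefs]
print("Coefficient norms:", norms)
```

The norms decrease as `alpha` grows, yet the individual coefficients stay nonzero even under a very strong penalty.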

## Mathematical Representation

Ridge regression modifies the least squares objective function by adding a penalty term. The coefficients are chosen to minimize:

$$\sum_{i=1}^{n}(y_i - \sum_{j=1}^{p} x_{ij}\beta_j)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$

where:

- $y_i$ is the response value for the $i$th observation.
- $x_{ij}$ is the value of the $j$th predictor for the $i$th observation.
- $\beta_j$ is the regression coefficient for the $j$th predictor.
- $\lambda$ is the tuning parameter that controls the strength of the penalty; $\lambda \geq 0$.

In scikit-learn, the `alpha` parameter of `Ridge` is the regularization strength ($\lambda$). A larger alpha enforces stronger regularization (leading to smaller coefficients), while an alpha close to zero gives a model close to ordinary linear regression. The intercept, when fitted, is not penalized.
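
For the objective above (written without an intercept), the minimizer has the closed form $\hat{\beta} = (X^\top X + \lambda I)^{-1} X^\top y$. The sketch below computes it with NumPy and compares it with scikit-learn's `Ridge` fitted with `fit_intercept=False`. The synthetic data and the name `lam` are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (assumed for illustration), no intercept in the true model
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([2.0, -1.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=100)

lam = 0.5  # plays the role of lambda in the formula

# Closed-form ridge solution: solve (X^T X + lambda * I) beta = X^T y
n_features = X.shape[1]
beta_closed = np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Same problem with scikit-learn; fit_intercept=False matches the formula
beta_sklearn = Ridge(alpha=lam, fit_intercept=False).fit(X, y).coef_

print("Closed form: ", beta_closed)
print("scikit-learn:", beta_sklearn)
```

Using `np.linalg.solve` rather than an explicit matrix inverse is the usual numerically safer choice for this kind of linear system.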

## Key Points

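- **Choosing Alpha**: Selecting the right value of alpha is crucial. It is usually chosen by cross-validation, for example with scikit-learn's `RidgeCV` estimator.
- **Standardization**: It's often recommended to standardize the predictors before applying ridge regression, because the penalty depends on the scale of each coefficient and would otherwise treat features measured in different units unequally.
- **Bias-Variance Tradeoff**: Increasing alpha adds bias but reduces variance; the aim is the value that minimizes error on unseen data.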
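
The first two points can be combined in one small pipeline: standardize the predictors, then let `RidgeCV` pick alpha from a candidate grid. This is a sketch under stated assumptions: the synthetic data with deliberately mismatched feature scales and the grid `np.logspace(-3, 3, 13)` are illustrative choices, not recommendations.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumed): five features on very different scales
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5)) * np.array([1, 10, 100, 0.1, 1])
y = X @ np.array([1.0, 0.2, 0.03, 5.0, 0.0]) + rng.normal(size=200)

# Candidate regularization strengths (an arbitrary grid for illustration)
alphas = np.logspace(-3, 3, 13)

# Standardize first so the penalty treats every feature on the same scale
model = make_pipeline(StandardScaler(), RidgeCV(alphas=alphas))
model.fit(X, y)

# make_pipeline names each step after its lowercased class name
best_alpha = model.named_steps["ridgecv"].alpha_
print("Selected alpha:", best_alpha)
```

Placing the scaler inside the pipeline means it is fitted together with the model on the data passed to `fit`, so the same transformation is applied automatically at prediction time.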

In [1]:
from sklearn.linear_model import LinearRegression, Ridge
import numpy as np

# Example data
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
# Target values
y = np.dot(X, np.array([1, 2])) + 3

# Linear regression
lr = LinearRegression()
lr.fit(X, y)

# Coefficients
print("Coefficients:", lr.coef_)
# Intercept
print("Intercept:", lr.intercept_)

Coefficients: [1. 2.]
Intercept: 3.0000000000000018


In [2]:
from sklearn.linear_model import Ridge
import numpy as np

# Example data
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
# Target values
y = np.dot(X, np.array([1, 2])) + 3

# Ridge Regression Model
ridge_reg = Ridge(alpha=0.5)  # alpha is the equivalent of lambda in the formula
ridge_reg.fit(X, y)

# Coefficients
print("Coefficients:", ridge_reg.coef_)
# Intercept
print("Intercept:", ridge_reg.intercept_)

Coefficients: [0.90909091 1.63636364]
Intercept: 3.8636363636363633


## Comparing Linear Regression vs. Ridge Regression
### Import libraries and load the data

In [3]:
import seaborn as sns
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error, mean_absolute_percentage_error
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
import numpy as np

# Load the data set
df = sns.load_dataset('diamonds')

### Preprocess the data

In [4]:
# separate the features X and the target/labels y
X = df.drop('price', axis=1)
y = df['price']

# numeric features
numeric_features = ['carat', 'depth', 'table', 'x', 'y', 'z']
# categorical features
categorical_features = ['cut', 'color', 'clarity']

# preprocess the data
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), numeric_features),
        ('cat', OneHotEncoder(), categorical_features)
    ]
)

# train test split
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8, random_state=42)

### Creating a pipeline

In [5]:
# Linear Regression Pipeline
lr_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                              ('regressor', LinearRegression())])

# Ridge Regression Pipeline
ridge_pipeline = Pipeline(steps=[('preprocessor', preprocessor),
                                 ('regressor', Ridge(alpha=0.5))])

### Train and evaluate the models

In [6]:
# Train and evaluate Linear Regression
lr_pipeline.fit(X_train, y_train)
lr_pred = lr_pipeline.predict(X_test)
lr_mse = mean_squared_error(y_test, lr_pred)
lr_r2 = r2_score(y_test, lr_pred)
lr_mae = mean_absolute_error(y_test, lr_pred)
lr_mape = mean_absolute_percentage_error(y_test, lr_pred)
lr_rmse = np.sqrt(lr_mse)

# Train and evaluate Ridge Regression
ridge_pipeline.fit(X_train, y_train)
ridge_pred = ridge_pipeline.predict(X_test)
ridge_mse = mean_squared_error(y_test, ridge_pred)
ridge_r2 = r2_score(y_test, ridge_pred)
ridge_mae = mean_absolute_error(y_test, ridge_pred)
ridge_mape = mean_absolute_percentage_error(y_test, ridge_pred)
ridge_rmse = np.sqrt(ridge_mse)

# Print each metric for both models, side by side
print("Linear Regression MSE:", lr_mse)
print("Ridge Regression MSE:", ridge_mse)
print("------------------------")
print("Linear Regression R2:", lr_r2)
print("Ridge Regression R2:", ridge_r2)
print("------------------------")
print("Linear Regression MAE:", lr_mae)
print("Ridge Regression MAE:", ridge_mae)
print("------------------------")
print("Linear Regression MAPE:", lr_mape)
print("Ridge Regression MAPE:", ridge_mape)
print("------------------------")
print("Linear Regression RMSE:", lr_rmse)
print("Ridge Regression RMSE:", ridge_rmse)


Linear Regression MSE: 1288705.4778516763
Ridge Regression MSE: 1288691.2489788432
------------------------
Linear Regression R2: 0.9189331350419386
Ridge Regression R2: 0.9189340301185347
------------------------
Linear Regression MAE: 737.1513665933285
Ridge Regression MAE: 737.1455056792028
------------------------
Linear Regression MAPE: 0.3952933516494362
Ridge Regression MAPE: 0.3952510372855173
------------------------
Linear Regression RMSE: 1135.2116445190634
Ridge Regression RMSE: 1135.2053774444707


## Summary of the notebook

1. Introduced Ridge Regression and regularization concepts.
2. Demonstrated basic Linear and Ridge Regression with small numpy arrays.
3. Loaded the diamonds dataset and split it into features (X) and target (y).
4. Identified numeric and categorical features for preprocessing.
5. Built a preprocessing pipeline using StandardScaler for numeric and OneHotEncoder for categorical features.
6. Split the data into training and test sets.
7. Created pipelines for both Linear Regression and Ridge Regression models.
8. Trained and evaluated both models, comparing metrics such as MSE, R2, MAE, MAPE, and RMSE on the test set.

# About the Author

<div style="background-color: #f8f9fa; border-left: 5px solid #28a745; padding: 20px; margin-bottom: 20px; border-radius: 5px;">
  <h2 style="color: #28a745; margin-top: 0; font-family: 'Poppins', sans-serif;">Muhammad Atif Latif</h2>
  <p style="font-size: 16px; color: #495057;">Data Scientist & Machine Learning Engineer</p>
  
  <p style="font-size: 15px; color: #6c757d; margin-top: 15px;">
    Passionate about building AI solutions that solve real-world problems. Specialized in machine learning, 
    deep learning, and data analytics with experience implementing production-ready models.
  </p>
</div>
