<h1 style="background-color: #f8f0fa;
            border-left: 5px solid #1b4332;
            font-family: 'Trebuchet MS', sans-serif;
            border-right: 5px solid #1b4332;
            padding: 12px;
            border-radius: 50px 50px;
            color: #1b4332;
            text-align:center;
            font-size:45px;"><strong>😊Gradient Boosting Algorithm🌟</strong></h1>
<hr style="border-top: 5px solid #264653;">

This notebook demonstrates the implementation of the **Gradient Boosting algorithm** from scratch.
Gradient Boosting is an ensemble method that builds a strong predictor by sequentially adding weak models (such as decision stumps).
In each iteration, a new weak model is fit to the residuals (errors) of the current ensemble, so the remaining error shrinks from one iteration to the next.

In this example, we use a simple regression dataset with **10 samples** and **3 features** to illustrate Gradient Boosting.



## Gradient Boosting Algorithm Steps

1. **Initialize with a Base Model**: Start by predicting the mean of the target variable.

2. **Calculate Residuals**: Compute residuals, the differences between actual and predicted values.

3. **Train a Weak Model on Residuals**: Train a weak model (e.g., a decision stump) on residuals to predict errors.

4. **Update Predictions with Learning Rate**: Add the weak model’s predictions (scaled by a learning rate) to current predictions.

5. **Repeat**: Iteratively repeat steps 2-4 to reduce residuals further.
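
The steps above can be written compactly as follows. The notation is introduced here for illustration and is not used elsewhere in the notebook: $F_m$ is the ensemble after $m$ rounds, $h_m$ is the weak model fit in round $m$, $\nu$ is the learning rate, and $n$ is the number of samples.

$$
F_0(x) = \frac{1}{n}\sum_{i=1}^{n} y_i, \qquad
r_i^{(m)} = y_i - F_{m-1}(x_i), \qquad
F_m(x) = F_{m-1}(x) + \nu \, h_m(x)
$$

Here $h_m$ is trained to predict the residuals $r_i^{(m)}$ from the features $x_i$, which corresponds to steps 2 to 4.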



## Example Dataset

The dataset has **10 samples**, **3 features**, and one numeric target.

| Sample | Feature 1 | Feature 2 | Feature 3 | Target |
|--------|-----------|-----------|-----------|--------|
| 1      | 1.0       | 2.0       | 3.0       | 10.0   |
| 2      | 2.0       | 1.0       | 2.0       | 12.0   |
| 3      | 3.0       | 2.0       | 1.0       | 15.0   |
| 4      | 1.0       | 3.0       | 2.0       | 18.0   |
| 5      | 2.0       | 1.0       | 1.0       | 10.0   |
| 6      | 3.0       | 3.0       | 2.0       | 20.0   |
| 7      | 1.0       | 1.0       | 1.0       | 8.0    |
| 8      | 2.0       | 2.0       | 3.0       | 14.0   |
| 9      | 3.0       | 1.0       | 2.0       | 16.0   |
| 10     | 1.0       | 3.0       | 3.0       | 11.0   |


In [5]:

import pandas as pd
import numpy as np
# Define the dataset
data = {
    'Sample': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Feature_1': [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0],
    'Feature_2': [2.0, 1.0, 2.0, 3.0, 1.0, 3.0, 1.0, 2.0, 1.0, 3.0],
    'Feature_3': [3.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 3.0, 2.0, 3.0],
    'Target': [10.0, 12.0, 15.0, 18.0, 10.0, 20.0, 8.0, 14.0, 16.0, 11.0]
}

df = pd.DataFrame(data)
df.set_index('Sample', inplace=True)
df


Sample,Feature_1,Feature_2,Feature_3,Target
1,1.0,2.0,3.0,10.0
2,2.0,1.0,2.0,12.0
3,3.0,2.0,1.0,15.0
4,1.0,3.0,2.0,18.0
5,2.0,1.0,1.0,10.0
6,3.0,3.0,2.0,20.0
7,1.0,1.0,1.0,8.0
8,2.0,2.0,3.0,14.0
9,3.0,1.0,2.0,16.0
10,1.0,3.0,3.0,11.0



## Step 1: Initialize with a Base Model

The initial model predicts the mean of the target values as a starting point.


In [6]:

# Calculate the initial prediction as the mean of the target
initial_prediction = df['Target'].mean()
predictions = np.full(len(df), initial_prediction)
predictions


array([13.4, 13.4, 13.4, 13.4, 13.4, 13.4, 13.4, 13.4, 13.4, 13.4])
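
One way to see why the mean is a sensible starting point: among all constant predictions, the mean gives the smallest sum of squared errors. The short check below is only a sketch. It repeats the target values so that it runs on its own, and the candidate grid and the helper name `sse` are choices made for this illustration.

```python
import numpy as np

# Target values copied from the example dataset
target = np.array([10.0, 12.0, 15.0, 18.0, 10.0, 20.0, 8.0, 14.0, 16.0, 11.0])

def sse(constant):
    """Sum of squared errors when every sample is predicted as `constant`."""
    return float(((target - constant) ** 2).sum())

mean_value = target.mean()

# Try constants from 8.0 to 20.0 in steps of 0.1 and keep the one with the lowest error
candidates = np.linspace(8.0, 20.0, 121)
best_candidate = candidates[np.argmin([sse(c) for c in candidates])]

print(mean_value, best_candidate)
```

On this grid, the best constant coincides with the mean of the target.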


## Step 2: Calculate Residuals

Residuals are calculated as the difference between the target values and the current predictions. 
These residuals represent the errors the model will try to minimize.


In [7]:

# Calculate residuals
residuals = df['Target'] - predictions
residuals


Sample
1    -3.4
2    -1.4
3     1.6
4     4.6
5    -3.4
6     6.6
7    -5.4
8     0.6
9     2.6
10   -2.4
Name: Target, dtype: float64
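
The link between residuals and the "gradient" in Gradient Boosting can be made explicit. Assuming a squared-error loss (the notebook does not name a loss, but working with residuals and a mean baseline matches this choice), the negative gradient of the loss with respect to the current prediction is exactly the residual:

$$
L\bigl(y_i, F(x_i)\bigr) = \tfrac{1}{2}\bigl(y_i - F(x_i)\bigr)^2, \qquad
-\frac{\partial L}{\partial F(x_i)} = y_i - F(x_i)
$$

For sample 1, this gives $10.0 - 13.4 = -3.4$, the first value in the output above.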


## Step 3: Train a Weak Model on Residuals

To improve on the baseline, a weak learner (e.g., a decision stump) is trained on the residuals. Here, as a simple stand-in for a stump, we group the samples by their Feature 1 value and predict the mean residual of each group.
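
For comparison, a decision stump in the strict sense makes a single threshold split on one feature and predicts a constant on each side. The sketch below illustrates that idea on Feature 1 only. It is not the notebook's method: the helper names `fit_stump` and `predict_stump` are hypothetical, and the data is repeated so that the block runs on its own.

```python
import numpy as np

def fit_stump(x, r):
    """Find the threshold on x that minimizes the squared error on r."""
    best = None
    values = np.unique(x)
    # Candidate thresholds are midpoints between consecutive distinct values
    for t in (values[:-1] + values[1:]) / 2:
        left, right = r[x <= t], r[x > t]
        error = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or error < best[0]:
            best = (error, t, left.mean(), right.mean())
    _, threshold, left_value, right_value = best
    return threshold, left_value, right_value

def predict_stump(x, stump):
    """Predict the left or right constant depending on the threshold."""
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

# Feature 1 and target copied from the example dataset
feature_1 = np.array([1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0])
target = np.array([10.0, 12.0, 15.0, 18.0, 10.0, 20.0, 8.0, 14.0, 16.0, 11.0])
residuals = target - target.mean()

stump = fit_stump(feature_1, residuals)
stump_output = predict_stump(feature_1, stump)
print(stump)
```

On this data the search picks the threshold 2.5, separating the samples with Feature 1 equal to 3.0 from the rest. Grouping by every distinct value, as the cell below does, produces three constant regions instead of two.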


In [8]:

# Fit a simple weak learner on the residuals: group samples by Feature_1
# and predict the mean residual within each group
stump_predictions = residuals.groupby(df['Feature_1']).transform('mean')
stump_predictions


Sample
1    -1.65
2    -1.40
3     3.60
4    -1.65
5    -1.40
6     3.60
7    -1.65
8    -1.40
9     3.60
10   -1.65
Name: Target, dtype: float64


## Step 4: Update Predictions with Learning Rate

Scale the weak learner's predictions by a learning rate (e.g., 0.1) and add them to the current predictions.


In [9]:

learning_rate = 0.1
# Add the scaled residual predictions to the current predictions
predictions = predictions + learning_rate * stump_predictions
predictions


Sample
1     13.235
2     13.260
3     13.760
4     13.235
5     13.260
6     13.760
7     13.235
8     13.260
9     13.760
10    13.235
Name: Target, dtype: float64


## Step 5: Repeat the Process

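Steps 2 to 4 are repeated for a fixed number of iterations. Each round recomputes the residuals from the updated predictions, fits a new weak learner to them, and adds its scaled output to the predictions, so the model improves incrementally.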
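
The repetition can be sketched as a loop. The example below is a minimal illustration under several assumptions that are not part of the notebook: the weak learner is a single-split stump searched over all three features, the number of rounds is fixed at 50, and the names `fit_stump`, `predict_stump`, `n_rounds`, and `mse_history` are hypothetical. The data is repeated so that the block runs on its own.

```python
import numpy as np

# Features and target copied from the example dataset
X = np.array([
    [1.0, 2.0, 3.0], [2.0, 1.0, 2.0], [3.0, 2.0, 1.0], [1.0, 3.0, 2.0],
    [2.0, 1.0, 1.0], [3.0, 3.0, 2.0], [1.0, 1.0, 1.0], [2.0, 2.0, 3.0],
    [3.0, 1.0, 2.0], [1.0, 3.0, 3.0],
])
y = np.array([10.0, 12.0, 15.0, 18.0, 10.0, 20.0, 8.0, 14.0, 16.0, 11.0])

def fit_stump(X, r):
    """Search every feature and threshold for the split with the lowest squared error."""
    best = None
    for j in range(X.shape[1]):
        values = np.unique(X[:, j])
        for t in (values[:-1] + values[1:]) / 2:
            mask = X[:, j] <= t
            left_value, right_value = r[mask].mean(), r[~mask].mean()
            error = ((r[mask] - left_value) ** 2).sum() + ((r[~mask] - right_value) ** 2).sum()
            if best is None or error < best[0]:
                best = (error, j, t, left_value, right_value)
    return best[1:]

def predict_stump(X, stump):
    """Apply a fitted stump (feature index, threshold, left value, right value)."""
    j, t, left_value, right_value = stump
    return np.where(X[:, j] <= t, left_value, right_value)

learning_rate = 0.1
n_rounds = 50

# Step 1: start from the mean of the target
predictions = np.full(len(y), y.mean())
mse_history = [np.mean((y - predictions) ** 2)]

for _ in range(n_rounds):
    residuals = y - predictions                      # Step 2
    stump = fit_stump(X, residuals)                  # Step 3
    predictions = predictions + learning_rate * predict_stump(X, stump)  # Step 4
    mse_history.append(np.mean((y - predictions) ** 2))

print(mse_history[0], mse_history[-1])
```

The recorded values are training errors only, so they show that the loop reduces the residuals on these 10 samples, not how well the model would generalize. A smaller learning rate makes each round's correction smaller and typically needs more rounds to reach the same training error.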
