### 🔹 Gradient Boosting Regression  

Gradient Boosting Regression is a powerful **boosting ensemble technique** that builds a strong predictive model by combining multiple weak learners (typically Decision Trees) sequentially.  
Each new tree is fit to the **negative gradient** of the loss with respect to the current predictions; for squared-error loss, this is exactly the **residual errors** left by the previous ensemble.  

The main idea:  
- The model starts with an initial prediction (often the mean of the target values).  
- At each iteration, a new tree is trained to predict the **residuals** (errors) of the current model.  
- The new tree’s predictions are scaled by a learning rate and **added** to the existing model.  
- This process repeats for a fixed number of trees, or until the validation error stops improving if early stopping is enabled.  

Mathematically:  

$$
\hat{Y}_m = \hat{Y}_{m-1} + \eta \, f_m(X)
$$  

where:  
- $\hat{Y}_{m-1}$ is the prediction from the previous iteration,  
- $f_m(X)$ is the new regression tree trained on residuals,  
- $\eta$ is the **learning rate**, controlling how much each tree contributes,  
- $\hat{Y}_m$ is the updated prediction after iteration $m$.  
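
The update rule above can be sketched from scratch for squared-error loss, where the negative gradient equals the residual. This is a minimal illustration, not how scikit-learn implements it internally; the helper names `fit_gradient_boosting` and `predict_gradient_boosting` and the synthetic data are assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_gradient_boosting(X, y, n_trees=50, eta=0.1, max_depth=2):
    """Toy gradient boosting for squared-error loss (hypothetical helper)."""
    init = float(np.mean(y))               # initial prediction: mean of the target
    y_hat = np.full(len(y), init)
    trees = []
    for _ in range(n_trees):
        residuals = y - y_hat              # negative gradient of 1/2 * (y - y_hat)^2
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residuals)             # weak learner f_m trained on residuals
        y_hat = y_hat + eta * tree.predict(X)   # Y_m = Y_{m-1} + eta * f_m(X)
        trees.append(tree)
    return init, trees


def predict_gradient_boosting(X, init, trees, eta=0.1):
    """Sum the initial prediction and the scaled contribution of every tree."""
    y_hat = np.full(X.shape[0], init)
    for tree in trees:
        y_hat = y_hat + eta * tree.predict(X)
    return y_hat


# Synthetic data (an assumption, not the notebook's dataset.csv)
rng = np.random.default_rng(0)
X_demo = rng.uniform(1, 6, size=(200, 1))
y_demo = 40 * X_demo[:, 0] + rng.normal(0, 10, size=200)

init, trees = fit_gradient_boosting(X_demo, y_demo)
pred = predict_gradient_boosting(X_demo, init, trees)

mse_mean = np.mean((y_demo - init) ** 2)    # error of the constant initial model
mse_boost = np.mean((y_demo - pred) ** 2)   # error after 50 boosting steps
print(f"MSE (mean only): {mse_mean:.1f}")
print(f"MSE (boosted):   {mse_boost:.1f}")
```

The same `eta` must be used when fitting and predicting, because each tree was trained against residuals that already assumed that step size.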

Gradient Boosting Regression helps us to:  
- Achieve **high predictive accuracy** on complex datasets.  
- Handle **non-linear relationships** effectively.  
- Provide **flexible control** via hyperparameters like learning rate, number of trees, and tree depth.  

In this notebook, we will implement **Gradient Boosting Regression** and compare its performance with AdaBoost, Random Forest, and LightGBM models 🚀.  


# --------------------------------------------------------------------------

# import dataset

In [1]:
# from google.colab import files, drive

# up = files.upload()
# drive.mount('/content/drive')

In [2]:
import pandas as pd

df = pd.read_csv('dataset.csv')
df.head(3)

Unnamed: 0,A,B,C,T
0,2.0,4,8.5,196
1,2.4,4,9.6,221
2,1.5,4,5.9,136


# cleaning

In [3]:
# clean the data

# encoding

In [4]:
# encode the data

# define x, y

In [5]:
import numpy as np

x = df[['A']].values   # 2D
y = df['T'].values     # 1D

# splitting

In [6]:
# # finding best random state 

# from sklearn.model_selection import train_test_split
# from sklearn.ensemble import GradientBoostingRegressor
# gbr = GradientBoostingRegressor()
# from sklearn.metrics import r2_score

# import time
# t1 = time.time()
# lst = []
# for i in range(1,10):
#     x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=i) 
#     gbr.fit(x_train, y_train)
#     yhat_test = gbr.predict(x_test)
#     r2 = r2_score(y_test, yhat_test)
#     lst.append(r2)
# t2 = time.time()

# print(f"run time: {round((t2 - t1) / 60 , 0)} min")
# print(f"R2_score = {round(max(lst),2)}")
# print(f"random_state = {np.argmax(lst) + 1}")
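
Picking the `random_state` with the highest test score tends to give an optimistic estimate, because the split is chosen using the test data. A less split-dependent alternative is to average the score over several folds. The sketch below assumes synthetic data in place of `dataset.csv`; the names `X_demo`, `y_demo`, and `cv` are illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic single-feature data (an assumption, not the notebook's dataset)
rng = np.random.default_rng(0)
X_demo = rng.uniform(1, 6, size=(200, 1))
y_demo = 40 * X_demo[:, 0] + rng.normal(0, 10, size=200)

# Shuffled 5-fold split; every sample is used for testing exactly once
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(
    GradientBoostingRegressor(random_state=42),
    X_demo, y_demo, cv=cv, scoring='r2',
)

print(f"R2 per fold: {np.round(scores, 3)}")
print(f"R2 mean = {scores.mean():.3f}, std = {scores.std():.3f}")
```

The spread across folds gives a rough idea of how much a single train/test split can move the reported score.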

In [7]:
from sklearn.model_selection import train_test_split

x_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=4)

# scaling

In [8]:
# No need for scaling in Gradient Boosting Regression (tree splits are unaffected by monotonic feature scaling)

# fit the model

In [9]:
# # K-fold cross validation

# from sklearn.ensemble import GradientBoostingRegressor
# from sklearn.model_selection import GridSearchCV

# parameters = {
#     '': [],
#     '': []
# }

# gb = GradientBoostingRegressor(random_state=1)
# gs = GridSearchCV(estimator=gb, param_grid=parameters, cv=5)

# gs.fit(x_train, y_train)

# best_params = gs.best_params_
# print(best_params)
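
The grid template above leaves the parameter names blank. One possible way to fill it in is sketched below. The chosen parameters and candidate values are illustrative assumptions, not tuned recommendations, and the data is synthetic.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic single-feature data (an assumption, not the notebook's dataset)
rng = np.random.default_rng(0)
X_demo = rng.uniform(1, 6, size=(120, 1))
y_demo = 40 * X_demo[:, 0] + rng.normal(0, 10, size=120)

# Illustrative grid over three commonly tuned hyperparameters
param_grid = {
    'n_estimators': [50, 100],
    'learning_rate': [0.05, 0.1],
    'max_depth': [2, 3],
}

gb_demo = GradientBoostingRegressor(random_state=1)
gs_demo = GridSearchCV(estimator=gb_demo, param_grid=param_grid, cv=5, scoring='r2')
gs_demo.fit(X_demo, y_demo)   # 8 combinations x 5 folds = 40 fits

print(gs_demo.best_params_)
print(f"best mean CV R2: {gs_demo.best_score_:.3f}")
```

Learning rate and number of trees trade off against each other, so it usually makes sense to search them together rather than one at a time.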

In [10]:
# default parameters of GradientBoostingRegressor
# loss='squared_error', learning_rate=0.1, n_estimators=100, subsample=1.0, criterion='friedman_mse', 
# min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_depth=3, 
# min_impurity_decrease=0.0, init=None, random_state=None, max_features=None, alpha=0.9, verbose=0, 
# max_leaf_nodes=None, warm_start=False, validation_fraction=0.1, n_iter_no_change=None, tol=0.0001, ccp_alpha=0.0
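
Among the defaults listed above, `n_iter_no_change=None` means that all `n_estimators` trees are always built. Setting it enables early stopping on an internal validation split of size `validation_fraction`. The sketch below uses synthetic data and illustrative values (500 trees, patience of 10); both are assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic single-feature data (an assumption, not the notebook's dataset)
rng = np.random.default_rng(0)
X_demo = rng.uniform(1, 6, size=(300, 1))
y_demo = 40 * X_demo[:, 0] + rng.normal(0, 10, size=300)

gbr_es = GradientBoostingRegressor(
    n_estimators=500,          # upper bound on the number of trees
    learning_rate=0.1,
    max_depth=3,
    validation_fraction=0.2,   # held out internally to monitor the loss
    n_iter_no_change=10,       # stop after 10 rounds without improvement
    tol=1e-4,
    random_state=42,
)
gbr_es.fit(X_demo, y_demo)

# n_estimators_ holds the number of trees actually fitted
print(f"trees fitted: {gbr_es.n_estimators_} of {gbr_es.n_estimators}")
```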

In [11]:
from sklearn.ensemble import GradientBoostingRegressor

gbr = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42)
gbr.fit(x_train, y_train)

# predict test data

In [12]:
yhat_test = gbr.predict(X_test)

# Evaluate the model

In [13]:
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
print("r2-score (train data): %0.4f" % r2_score(y_train, gbr.predict(x_train)))
print("r2-score (test data): %0.4f" % r2_score(y_test, yhat_test))

r2-score (train data): 0.8105
r2-score (test data): 0.8355


In [14]:
print(f"MSE (train data): {mean_squared_error(y_train, gbr.predict(x_train))}")
print(f"MAE (train data): {mean_absolute_error(y_train, gbr.predict(x_train))}")
print(f"MSE (test data): {mean_squared_error(y_test, yhat_test)}")
print(f"MAE (test data): {mean_absolute_error(y_test, yhat_test)}")

MSE (train data): 779.8427528030512
MAE (train data): 21.409916260555065
MSE (test data): 608.6308478161832
MAE (test data): 19.28962521272492
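
MSE is expressed in squared target units, which makes it hard to compare with MAE. Taking the square root gives the RMSE, which is in the same units as `T`. The short sketch below reuses the MSE and MAE values printed above; the variable names are illustrative.

```python
import math

# Values copied from the output of the previous cell
mse_train, mae_train = 779.8427528030512, 21.409916260555065
mse_test, mae_test = 608.6308478161832, 19.28962521272492

rmse_train = math.sqrt(mse_train)
rmse_test = math.sqrt(mse_test)

print(f"RMSE (train data): {rmse_train:.2f}  vs MAE: {mae_train:.2f}")
print(f"RMSE (test data): {rmse_test:.2f}  vs MAE: {mae_test:.2f}")
```

RMSE is never smaller than MAE on the same predictions, and a large gap between the two usually points to a few large errors dominating the squared loss.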


# save the model

In [15]:
# import joblib
# joblib.dump(gbr, 'gbr_model.pkl')

# load the model

In [16]:
# import joblib
# gbr = joblib.load('gbr_model.pkl')
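
A quick way to check that saving and loading preserve the model is a round trip that compares predictions before and after. The sketch below writes to a temporary directory and trains on synthetic data, both assumptions made so that it runs on its own.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic single-feature data (an assumption, not the notebook's dataset)
rng = np.random.default_rng(0)
X_demo = rng.uniform(1, 6, size=(100, 1))
y_demo = 40 * X_demo[:, 0] + rng.normal(0, 10, size=100)

model = GradientBoostingRegressor(random_state=42).fit(X_demo, y_demo)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'gbr_model.pkl')
    joblib.dump(model, path)        # serialize the fitted model
    restored = joblib.load(path)    # load it back from disk

# The restored model should reproduce the original predictions
same = np.allclose(model.predict(X_demo), restored.predict(X_demo))
print(f"predictions identical after reload: {same}")
```

A model saved this way should be loaded with the same scikit-learn version that produced it, and only from files you trust, since loading executes pickled code.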