# Gradient Boosting
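1. Gradient Boosting is a boosting technique that builds models sequentially, like AdaBoost. Where AdaBoost reweights the training samples, gradient boosting fits each new model to the negative gradient of a loss function with respect to the current predictions, which amounts to gradient descent in function space. It is a strong general-purpose method for both regression and classification tasks.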

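2. Core Idea:
Train each new model to predict the errors (residuals) of the ensemble built so far. For squared-error loss, these residuals are exactly the negative gradient of the loss. Each new model's predictions are scaled by a learning rate and added to the running prediction, so the combined model is stronger than any single one.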
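
The core idea can be sketched by hand before turning to the library. The block below is a minimal illustration, not scikit-learn's internal implementation: it assumes squared-error loss, uses shallow `DecisionTreeRegressor` trees as the weak learners, and runs on synthetic data. The names `X_demo`, `y_demo`, `n_rounds`, and `mse_history` are made up for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic 1-D regression data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X_demo = rng.uniform(-3, 3, size=(200, 1))
y_demo = np.sin(X_demo[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
n_rounds = 50

# Start from a constant prediction: the mean minimizes squared error
prediction = np.full_like(y_demo, y_demo.mean())
mse_history = [np.mean((y_demo - prediction) ** 2)]

trees = []
for _ in range(n_rounds):
    # For squared-error loss, the residuals are the negative gradient
    residuals = y_demo - prediction
    tree = DecisionTreeRegressor(max_depth=2, random_state=0)
    tree.fit(X_demo, residuals)
    # Add a damped version of the new tree's output to the running prediction
    prediction += learning_rate * tree.predict(X_demo)
    trees.append(tree)
    mse_history.append(np.mean((y_demo - prediction) ** 2))

print("Training MSE before boosting:", mse_history[0])
print("Training MSE after boosting: ", mse_history[-1])
```

Each round reduces (or at worst leaves unchanged) the training error, because every tree is fit to whatever the current ensemble still gets wrong. The learning rate trades off the size of each correction against the number of rounds needed.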

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns


In [4]:
from sklearn.datasets import fetch_california_housing
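# Step 1: Load the dataset (downloaded and cached on first use)
data = fetch_california_housing()

# Collect features and target in one DataFrame;
# the target is the median house value in units of $100,000
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target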

In [6]:
# Step 2: Separate the independent features (x) from the dependent target (y)

x = df.drop('target', axis = 'columns')
y = df.target

In [8]:
# Step 3: Split the data into train (80%) and test (20%) sets
from sklearn.model_selection import train_test_split, GridSearchCV
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

In [10]:
# Step 4: Initialize and train model
from sklearn.ensemble import GradientBoostingRegressor

model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42)
model.fit(X_train, y_train)


In [12]:
# Step 5: Predictions
y_pred = model.predict(X_test)

In [14]:
# Step 6: Evaluate
from sklearn.metrics import mean_squared_error, r2_score

print("MSE:", mean_squared_error(y_test, y_pred))
print("R² Score:", r2_score(y_test, y_pred))

MSE: 0.2939973248643864
R² Score: 0.7756446042829697
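
The scores above describe the ensemble after all 100 trees. To see how the error changes as trees are added, `staged_predict` yields the predictions after each boosting stage. The sketch below uses synthetic data from `make_regression` as a stand-in so it runs without a download; the variable names (`X_syn`, `gbr`, `staged_mse`, and so on) are assumptions for this example, and the numbers it prints will not match the California housing results.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption: used so the sketch is self-contained)
X_syn, y_syn = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=42)
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=42
)

gbr = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=42
)
gbr.fit(Xs_train, ys_train)

# staged_predict yields the ensemble's predictions after 1, 2, ..., n_estimators trees
staged_mse = [
    mean_squared_error(ys_test, y_stage) for y_stage in gbr.staged_predict(Xs_test)
]

best_n = int(np.argmin(staged_mse)) + 1  # stages are 1-indexed
print("Test MSE after 1 tree:   ", staged_mse[0])
print("Test MSE after all trees:", staged_mse[-1])
print("Number of trees with the lowest test MSE:", best_n)
```

If the lowest test error occurs well before the last stage, that suggests fewer trees (or a smaller learning rate) may be enough. Choosing `n_estimators` this way should be done on a validation split rather than the final test set.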


## Hyperparameter Tuning with GridSearchCV

In [17]:
# Step 7: Set up the hyperparameter grid
param_grid = {
    'n_estimators': [100, 150],
    'learning_rate': [0.1, 0.05],
    'max_depth': [3, 4],
    'subsample': [1.0, 0.8]
}
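
A grid search tries every combination of the listed values, so the cost grows multiplicatively with each added hyperparameter. As a quick check, `ParameterGrid` can count the candidates before fitting anything. The sketch below copies the grid into a separate name, `param_grid_demo`, and assumes 5-fold cross-validation.

```python
from sklearn.model_selection import ParameterGrid

# Same values as the grid above, copied so this block is self-contained
param_grid_demo = {
    'n_estimators': [100, 150],
    'learning_rate': [0.1, 0.05],
    'max_depth': [3, 4],
    'subsample': [1.0, 0.8],
}

n_candidates = len(ParameterGrid(param_grid_demo))  # 2 * 2 * 2 * 2 combinations
cv_folds = 5                                        # assumed number of CV folds
n_cv_fits = n_candidates * cv_folds

print("Candidate settings:", n_candidates)
print("Cross-validation fits:", n_cv_fits)
# With the default refit=True, GridSearchCV also fits the best setting
# once more on the full training set.
```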


In [19]:
# Step 8: Initialize the model and GridSearchCV (scored by negative MSE, so higher is better)
model = GradientBoostingRegressor(random_state=42)
grid_search = GridSearchCV(model, param_grid, cv=5, scoring='neg_mean_squared_error', n_jobs=-1)


In [21]:
# Step 9: Run the grid search (cross-validation on the training set only)
grid_search.fit(X_train, y_train)


In [23]:
# Step 10: Take the best model (refit on the full training set) and predict on the test set
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)

In [25]:
print("Best Parameters:", grid_search.best_params_)
print("MSE:", mean_squared_error(y_test, y_pred))
print("R² Score:", r2_score(y_test, y_pred))

Best Parameters: {'learning_rate': 0.1, 'max_depth': 4, 'n_estimators': 150, 'subsample': 0.8}
MSE: 0.2466396698034418
R² Score: 0.8117842033297633
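
Because the search is scored with `'neg_mean_squared_error'`, `best_score_` is a negative number, and flipping its sign gives the mean cross-validated MSE. The sketch below shows this, along with a ranked view of `cv_results_`, on a deliberately small grid and synthetic data so it runs quickly. The names `small_grid`, `search`, `cv_mse`, and `results` are assumptions for this example, and its numbers are unrelated to the results above.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption: keeps the sketch self-contained and fast)
X_syn, y_syn = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=42)

# A reduced grid, for illustration only
small_grid = {'n_estimators': [50, 100], 'max_depth': [2, 3]}

search = GridSearchCV(
    GradientBoostingRegressor(random_state=42),
    small_grid,
    cv=3,
    scoring='neg_mean_squared_error',
)
search.fit(X_syn, y_syn)

# best_score_ is the mean negated MSE across folds; flip the sign to get MSE
cv_mse = -search.best_score_
print("Best parameters:", search.best_params_)
print("Cross-validated MSE of best setting:", cv_mse)

# Rank all candidates by their mean cross-validated score
results = (
    pd.DataFrame(search.cv_results_)[['params', 'mean_test_score', 'rank_test_score']]
    .sort_values('rank_test_score')
)
print(results)
```

The cross-validated MSE is computed on held-out folds of the training data, so it will generally differ from the MSE measured on the separate test set.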
