# Gradient Boosting Regressor

Gradient Boosting Regressor is an ensemble method that builds many weak decision tree models sequentially, where each new tree tries to fix the errors made by the previous trees.

Popular implementations include:

- XGBoost
- LightGBM
- CatBoost

### How It Works

Instead of training all trees independently (as Random Forest does), Gradient Boosting trains them one after another.

Each new tree is fit to the "residuals" (errors) of the ensemble built so far, and its prediction, scaled by the learning rate, is added to the running total.
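The sequential idea can be sketched by hand for squared-error loss. This is a minimal illustration, not how scikit-learn implements it internally: the synthetic data, the variable names, and the choice of 50 depth-2 trees are all assumptions made for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic 1-D regression problem (illustrative data, not from this notebook)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
n_trees = 50

# Start from a constant prediction: the mean of the targets
prediction = np.full_like(y, y.mean())
trees = []
mse_history = [np.mean((y - prediction) ** 2)]

for _ in range(n_trees):
    residuals = y - prediction                      # errors of the ensemble so far
    tree = DecisionTreeRegressor(max_depth=2, random_state=0)
    tree.fit(X, residuals)                          # the new tree models those errors
    prediction += learning_rate * tree.predict(X)   # take a small corrective step
    trees.append(tree)
    mse_history.append(np.mean((y - prediction) ** 2))

print(f"Training MSE before boosting: {mse_history[0]:.4f}")
print(f"Training MSE after {n_trees} trees: {mse_history[-1]:.4f}")
```

Each pass shrinks the training error a little, which is why a small learning rate usually needs more trees to reach the same fit.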


### Common Pitfalls and Solutions

#### 1. Overfitting

Symptoms:

- Training error keeps decreasing while validation error increases
- The model becomes too complex and starts fitting noise

Solutions:

- Use a smaller learning_rate with more n_estimators
- Apply stronger regularization (lower max_depth, higher min_samples_split)
- Use subsample < 1.0 (stochastic gradient boosting)
- Implement early stopping
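Several of these remedies can be combined in a single scikit-learn estimator. The sketch below assumes synthetic data from `make_regression` and illustrative parameter values; `n_iter_no_change`, `validation_fraction`, and `subsample` are existing `GradientBoostingRegressor` parameters, but the numbers chosen here are not tuned recommendations.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic data, used only to keep the example self-contained
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

model = GradientBoostingRegressor(
    n_estimators=1000,        # upper bound; early stopping may use fewer trees
    learning_rate=0.05,       # small steps
    max_depth=3,              # shallow trees act as regularization
    subsample=0.8,            # stochastic gradient boosting
    validation_fraction=0.1,  # share of the training data held out internally
    n_iter_no_change=10,      # stop after 10 rounds without validation improvement
    tol=1e-4,
    random_state=0,
)
model.fit(X, y)

# n_estimators_ reports how many trees were kept after early stopping
print("Trees actually fitted:", model.n_estimators_)
```

Setting `n_estimators` generously and letting early stopping pick the cut-off is often easier than tuning the tree count by hand.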

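#### 2. Underfitting

Symptoms:

- Both training and validation errors are high
- The model is too simple to capture the underlying pattern

Solutions:

- Increase max_depth
- Increase n_estimators
- Increase learning_rate (with caution: large values can overshoot and hurt generalization)
- Reduce regularization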
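One way to recognize underfitting is to compare a deliberately weak configuration with a higher-capacity one on the same split. The example below is a sketch under assumptions: the `make_friedman1` synthetic dataset and both parameter sets are chosen only for illustration.

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic nonlinear regression data (illustrative)
X, y = make_friedman1(n_samples=1000, noise=1.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical configurations: a few stumps versus many depth-3 trees
configs = {
    "too simple": dict(n_estimators=5, max_depth=1, learning_rate=0.1),
    "more capacity": dict(n_estimators=300, max_depth=3, learning_rate=0.1),
}

results = {}
for name, params in configs.items():
    model = GradientBoostingRegressor(random_state=0, **params)
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    val_mse = mean_squared_error(y_val, model.predict(X_val))
    results[name] = (train_mse, val_mse)
    print(f"{name}: train MSE = {train_mse:.3f}, validation MSE = {val_mse:.3f}")
```

When both errors are high and close together, as for the weak configuration, adding capacity is the usual next step.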

#### 3. Computational Efficiency

Optimizations:

- Use histogram-based boosting (as in LightGBM) for large datasets
- Reduce n_estimators and compensate with a higher learning_rate
- Set max_features to limit the features considered at each split

### Best Practices

- Start Simple: Begin with default parameters and iterate
- Feature Scaling: Tree splits depend only on the ordering of feature values, so GBR is largely insensitive to feature scaling; normalization is rarely required
- Handle Missing Values: Some implementations handle NaNs natively, for example XGBoost, LightGBM, and scikit-learn's HistGradientBoostingRegressor
- Monitor Learning: Use validation curves to detect overfitting
- Ensemble Diversity: Consider blending with other models for maximum performance
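A validation curve for a fitted model can be computed with `staged_predict`, which yields the ensemble's predictions after each added tree. The sketch below assumes synthetic `make_friedman1` data and illustrative hyperparameters; the curves could then be plotted or, as here, used to find the tree count with the lowest validation error.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (illustrative)
X, y = make_friedman1(n_samples=1000, noise=1.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = GradientBoostingRegressor(
    n_estimators=300, learning_rate=0.1, max_depth=3, random_state=0
)
model.fit(X_train, y_train)

# Error after 1, 2, ..., n_estimators trees on each split
train_curve = [
    mean_squared_error(y_train, pred) for pred in model.staged_predict(X_train)
]
val_curve = [
    mean_squared_error(y_val, pred) for pred in model.staged_predict(X_val)
]

# Tree count (1-based) at which validation error is lowest
best_n = int(np.argmin(val_curve)) + 1
print(f"Lowest validation MSE {min(val_curve):.3f} at {best_n} trees")
print(f"Final training MSE: {train_curve[-1]:.3f}")
```

A validation curve that turns upward while the training curve keeps falling is the overfitting signature described earlier.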

I don't know anything more about this.



```python
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Load the California housing dataset (downloaded on first use)
data = fetch_california_housing()
X, y = data.data, data.target

# Hold out 20% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Many shallow trees combined with a small learning rate
model = GradientBoostingRegressor(
    n_estimators=300,
    learning_rate=0.05,
    max_depth=3,
    random_state=42,
)

# Train
model.fit(X_train, y_train)

# Predict on the held-out set
y_pred = model.predict(X_test)

# Evaluate
print("MSE:", mean_squared_error(y_test, y_pred))
print("R2 Score:", r2_score(y_test, y_pred))

# Impurity-based importances, paired with their feature names
for name, importance in zip(data.feature_names, model.feature_importances_):
    print(f"{name}: {importance:.3f}")
```
