# M72 Boosting2 Assignment

# Q1. What is Gradient Boosting Regression?

Gradient Boosting Regression is a machine learning technique for regression tasks that builds an ensemble of weak learners (usually shallow decision trees) sequentially. Each new tree is fit to the negative gradient of the loss function with respect to the current ensemble's predictions, so it corrects the errors that the previous trees still make.
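
For squared-error loss, the link between "errors" and "gradients" can be made explicit in one line. The notation below is introduced here for illustration and does not come from the assignment: $F_{m-1}$ is the ensemble after $m-1$ trees, and $r_{im}$ is the target that the $m$-th tree is fit to for sample $i$.

$$
L\bigl(y_i, F(x_i)\bigr) = \tfrac{1}{2}\bigl(y_i - F(x_i)\bigr)^2
\quad\Longrightarrow\quad
r_{im} = -\left.\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}\right|_{F = F_{m-1}} = y_i - F_{m-1}(x_i)
$$

So with this loss, fitting a tree to the negative gradient is the same as fitting it to the ordinary residuals.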

# Q2. Implement a simple gradient boosting algorithm from scratch using Python and NumPy

In [3]:
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, r2_score

class SimpleGradientBoostingRegressor:
    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth
        self.models = []

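    def fit(self, X, y):
        # Reset the ensemble so that calling fit() again does not stack onto old trees
        self.models = []
        # Start from a constant prediction: the mean minimizes squared error
        self.initial_prediction = np.mean(y)
        # For squared error, the residuals are the negative gradient of the loss
        residuals = y - self.initial_prediction

        for _ in range(self.n_estimators):
            # Fit a shallow tree to the current residuals
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, residuals)
            # Shrink the tree's contribution and update the residuals
            residuals = residuals - self.learning_rate * tree.predict(X)
            self.models.append(tree)
        return self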
    
    def predict(self, X):
        y_pred = np.full(X.shape[0], self.initial_prediction)
        for model in self.models:
            y_pred += self.learning_rate * model.predict(X)
        return y_pred

# Generate a simple regression dataset
X = np.random.rand(100, 1)
y = 4 * X.squeeze() + np.random.randn(100) * 0.5

# Train the simple gradient boosting regressor
gb = SimpleGradientBoostingRegressor(n_estimators=50, learning_rate=0.1, max_depth=3)
gb.fit(X, y)
y_pred = gb.predict(X)

# Evaluate the model on the training data (an optimistic estimate of generalization error)
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)
print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")


Mean Squared Error: 0.07829529694531265
R-squared: 0.9511116376712166


# Q3. Experiment with different hyperparameters

In [2]:
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV

# Generate a simple regression dataset
X = np.random.rand(100, 1)
y = 4 * X.squeeze() + np.random.randn(100) * 0.5

# Define the parameter grid
param_grid = {
    'n_estimators': [50, 100, 200],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [3, 4, 5]
}

# Perform grid search
gb = GradientBoostingRegressor()
grid_search = GridSearchCV(estimator=gb, param_grid=param_grid, cv=5, scoring='neg_mean_squared_error')
grid_search.fit(X, y)

# Output the best parameters
print(f"Best parameters: {grid_search.best_params_}")

# Evaluate the best model on the training data (an optimistic estimate of generalization error)
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X)
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)
print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")


Best parameters: {'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 200}
Mean Squared Error: 0.18601737824086217
R-squared: 0.8791502836640005
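
The scores above are computed on the same data the grid search was fit on. A minimal sketch of a held-out evaluation follows, assuming a 75/25 split is acceptable; the synthetic data, the smaller grid, and the names `X_demo`, `y_demo`, and `search` are illustrative choices, not part of the original experiment.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data with the same shape as the assignment's example (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.random((200, 1))
y_demo = 4 * X_demo.squeeze() + rng.normal(scale=0.5, size=200)

# Hold out a test set that the grid search never sees
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

search = GridSearchCV(
    estimator=GradientBoostingRegressor(random_state=0),
    param_grid={
        'n_estimators': [50, 100],
        'learning_rate': [0.01, 0.1],
        'max_depth': [2, 3],
    },
    cv=5,
    scoring='neg_mean_squared_error',
)
search.fit(X_train, y_train)

# best_score_ is the mean cross-validated score; negate it to get an MSE
cv_mse = -search.best_score_
train_mse = mean_squared_error(y_train, search.predict(X_train))
test_mse = mean_squared_error(y_test, search.predict(X_test))

print(f"Best parameters: {search.best_params_}")
print(f"CV MSE: {cv_mse:.4f}  train MSE: {train_mse:.4f}  test MSE: {test_mse:.4f}")
```

Comparing the training MSE with the cross-validated and test MSE gives a rough sense of how much the training score overstates performance.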


# Q4. What is a weak learner in Gradient Boosting?
A weak learner in Gradient Boosting is a simple model that performs only slightly better than a trivial baseline (random guessing in classification, or a constant prediction in regression). Shallow decision trees are the usual choice; a tree of depth 1 is called a stump.
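
To make "slightly better than a trivial baseline" concrete, the sketch below compares a depth-1 tree with a model that always predicts the mean. The synthetic data and the names `X_demo`, `y_demo`, and `stump` are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Synthetic data with a linear trend plus noise (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.random((200, 1))
y_demo = 4 * X_demo.squeeze() + rng.normal(scale=0.5, size=200)

# Trivial baseline: always predict the mean of the targets
baseline_pred = np.full_like(y_demo, y_demo.mean())
baseline_mse = mean_squared_error(y_demo, baseline_pred)

# Weak learner: a stump makes a single split, so it can output only two values
stump = DecisionTreeRegressor(max_depth=1).fit(X_demo, y_demo)
stump_mse = mean_squared_error(y_demo, stump.predict(X_demo))

print(f"Baseline MSE: {baseline_mse:.4f}")
print(f"Stump MSE:    {stump_mse:.4f}")
```

On its own the stump is a poor model of a linear trend, which is why boosting combines many of them.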

# Q5. What is the intuition behind the Gradient Boosting algorithm?
The intuition behind Gradient Boosting is to add models one at a time, each correcting the errors that the current ensemble still makes. Every new model is trained on the residual errors of the ensemble built so far, so the combined prediction improves a little at each step.
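
This step-by-step improvement can be observed with scikit-learn's `staged_predict`, which yields the ensemble's predictions after each added tree. The sketch below uses made-up data and hyperparameters, and it tracks the training error only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Synthetic data (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.random((200, 1))
y_demo = 4 * X_demo.squeeze() + rng.normal(scale=0.5, size=200)

model = GradientBoostingRegressor(
    n_estimators=50, learning_rate=0.1, max_depth=2, random_state=0
)
model.fit(X_demo, y_demo)

# Training MSE of the ensemble after 1, 2, ..., 50 trees
staged_mse = [
    mean_squared_error(y_demo, stage_pred)
    for stage_pred in model.staged_predict(X_demo)
]

for n_trees in (1, 10, 50):
    print(f"{n_trees:>2} trees: training MSE = {staged_mse[n_trees - 1]:.4f}")
```

The training error shrinks as trees are added; error on unseen data can stop improving earlier, which is why the number of trees is tuned.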

# Q6. How does the Gradient Boosting algorithm build an ensemble of weak learners?
Gradient Boosting builds an ensemble of weak learners by sequentially adding new models trained on the residuals (errors) of the current ensemble's predictions. More generally, each model is fit to the negative gradient of the loss function, which for squared-error loss is exactly the residual. The final prediction is the initial constant plus the sum of all weak-learner outputs, each scaled by the learning rate.
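
One way to see this additive structure is to rebuild a fitted scikit-learn model's prediction by hand. The sketch below assumes the default squared-error loss and the default initial estimator, in which case `init_` predicts the training mean and `estimators_` holds one regression tree per stage; the demo data and variable names are made up for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic data (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.random((200, 1))
y_demo = 4 * X_demo.squeeze() + rng.normal(scale=0.5, size=200)

model = GradientBoostingRegressor(
    n_estimators=30, learning_rate=0.1, max_depth=2, random_state=0
)
model.fit(X_demo, y_demo)

# Start from the constant initial prediction
manual_pred = model.init_.predict(X_demo).astype(float).ravel()

# Add each tree's output, shrunk by the learning rate
for stage in model.estimators_:  # array of shape (n_estimators, 1) for regression
    tree = stage[0]
    manual_pred += model.learning_rate * tree.predict(X_demo)

print("Matches model.predict:", np.allclose(manual_pred, model.predict(X_demo)))
```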

# Q7. What are the steps involved in constructing the mathematical intuition of the Gradient Boosting algorithm?
1. Initialize the model with a constant value (for squared-error regression, the mean of the target variable).
2. Compute the residuals between the true values and the current model's predictions. For squared-error loss these equal the negative gradient of the loss; for other losses, use the negative gradient (the pseudo-residuals) instead.
3. Fit a weak learner to the residuals.
4. Update the model by adding the weak learner's predictions, scaled by the learning rate.
5. Repeat steps 2-4 for a specified number of iterations or until the error stops improving.
6. Form the final prediction as the initial constant plus the sum of the scaled weak-learner predictions.
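
The steps above can be sketched with the loss left pluggable, which shows where the gradient enters. This is a simplified illustration, not a reference implementation: `boost` is a hypothetical helper, the data is synthetic, and for absolute-error loss it omits the per-leaf value adjustment that full implementations perform.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def boost(X, y, negative_gradient, initial_value,
          n_rounds=100, learning_rate=0.1, max_depth=2):
    """Hypothetical helper: gradient boosting with a user-supplied negative gradient."""
    # Step 1: start from a constant prediction
    current = np.full(len(y), initial_value, dtype=float)
    trees = []
    for _ in range(n_rounds):  # Step 5: repeat steps 2-4
        # Step 2: pseudo-residuals are the negative gradient at the current predictions
        pseudo_residuals = negative_gradient(y, current)
        # Step 3: fit a weak learner to the pseudo-residuals
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, pseudo_residuals)
        # Step 4: add the tree's prediction, scaled by the learning rate
        current += learning_rate * tree.predict(X)
        trees.append(tree)

    def predict(X_new):
        # Step 6: initial constant plus the sum of the scaled tree predictions
        out = np.full(X_new.shape[0], initial_value, dtype=float)
        for tree in trees:
            out += learning_rate * tree.predict(X_new)
        return out

    return predict

# Synthetic data (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.random((200, 1))
y_demo = 4 * X_demo.squeeze() + rng.normal(scale=0.5, size=200)

# Squared error: the negative gradient is the residual, the best constant is the mean
predict_l2 = boost(X_demo, y_demo, lambda y, f: y - f, y_demo.mean())
# Absolute error: the negative gradient is the sign of the residual, the best constant is the median
predict_l1 = boost(X_demo, y_demo, lambda y, f: np.sign(y - f), np.median(y_demo))

mse_start = np.mean((y_demo - y_demo.mean()) ** 2)
mse_l2 = np.mean((y_demo - predict_l2(X_demo)) ** 2)
mae_start = np.mean(np.abs(y_demo - np.median(y_demo)))
mae_l1 = np.mean(np.abs(y_demo - predict_l1(X_demo)))

print(f"Squared error:  MSE {mse_start:.4f} -> {mse_l2:.4f}")
print(f"Absolute error: MAE {mae_start:.4f} -> {mae_l1:.4f}")
```

Only the `negative_gradient` function and the initial constant change between the two losses; the loop itself stays the same.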