Q1. What is Gradient Boosting Regression?


Gradient Boosting Regression is a machine learning technique for regression tasks. It builds an ensemble of weak learners, typically shallow decision trees, sequentially: each new tree is trained to correct the errors of the ensemble built so far, so the model improves step by step. Formally, each tree is fit to the negative gradient of the loss function with respect to the current predictions, which makes the procedure a form of gradient descent on the loss.
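
One common way to write this down is sketched below. The symbols $F_m$ (model after $m$ trees), $h_m$ (the $m$-th tree), $\nu$ (learning rate), $L$ (loss) and $r_i^{(m)}$ (pseudo-residuals) are notation introduced here for illustration, not taken from the text above.

```latex
F_0(x) = \arg\min_{c} \sum_{i=1}^{n} L(y_i, c), \qquad
r_i^{(m)} = -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F = F_{m-1}}, \qquad
F_m(x) = F_{m-1}(x) + \nu \, h_m(x)
```

Here $h_m$ is the tree fit to the pairs $(x_i, r_i^{(m)})$. For the squared-error loss $L(y, F) = \tfrac{1}{2}(y - F)^2$, the pseudo-residual reduces to the ordinary residual $r_i^{(m)} = y_i - F_{m-1}(x_i)$, and $F_0$ is the mean of the targets.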

Q2. Implement a simple gradient boosting algorithm from scratch using Python and NumPy.

Here is a simple implementation of Gradient Boosting Regression in Python and NumPy. The boosting loop is written from scratch; the individual trees are scikit-learn's DecisionTreeRegressor:


import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.tree import DecisionTreeRegressor

class GradientBoostingRegressor:
    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth
        self.trees = []
        self.initial_prediction = None

    def fit(self, X, y):
        # Reset state so that calling fit() twice does not accumulate trees
        self.trees = []
        # Start from a constant model: the mean of the targets
        self.initial_prediction = np.mean(y)
        predictions = np.full(y.shape, self.initial_prediction, dtype=float)

        for _ in range(self.n_estimators):
            # For squared-error loss, the residuals are the negative gradient
            residuals = y - predictions
            tree = self._build_tree(X, residuals)
            self.trees.append(tree)
            # Shrink each tree's contribution by the learning rate
            predictions += self.learning_rate * tree.predict(X)
        return self

    def predict(self, X):
        # Initial constant plus the scaled contribution of every tree
        predictions = np.full((X.shape[0],), self.initial_prediction, dtype=float)
        for tree in self.trees:
            predictions += self.learning_rate * tree.predict(X)
        return predictions

    def _build_tree(self, X, residuals):
        # Weak learner: a shallow regression tree fit to the current residuals
        tree = DecisionTreeRegressor(max_depth=self.max_depth)
        tree.fit(X, residuals)
        return tree

# Create a small dataset
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1.5, 2.0, 2.5, 3.5, 3.0])

# Train the model
gbr = GradientBoostingRegressor(n_estimators=50, learning_rate=0.1, max_depth=3)
gbr.fit(X, y)

# Predict
y_pred = gbr.predict(X)

# Evaluate on the training data (optimistic, since no held-out set is used here)
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)
print("Mean Squared Error:", mse)
print("R-squared:", r2)

Q3. Experiment with different hyperparameters such as learning rate, number of trees, and tree depth to optimise the performance of the model.

One way to tune these hyperparameters is an exhaustive grid search with cross-validation. Here is an example using scikit-learn's GridSearchCV:

from sklearn.model_selection import GridSearchCV

param_grid = {
    'n_estimators': [50, 100, 200],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [1, 3, 5]
}

# GridSearchCV needs get_params/set_params and a score method, so we wrap
# our GradientBoostingRegressor class with scikit-learn's base classes.
from sklearn.base import BaseEstimator, RegressorMixin

# Mixins go to the left of BaseEstimator, as scikit-learn recommends
class GradientBoostingRegressorWrapper(RegressorMixin, BaseEstimator, GradientBoostingRegressor):
    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        # Resolves to GradientBoostingRegressor.__init__, which stores the hyperparameters
        super().__init__(n_estimators, learning_rate, max_depth)

gbr_wrapper = GradientBoostingRegressorWrapper()

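# With only 5 samples, 3-fold cross-validation is purely illustrative; use a larger dataset in practice
grid_search = GridSearchCV(estimator=gbr_wrapper, param_grid=param_grid, cv=3, scoring='neg_mean_squared_error')
grid_search.fit(X, y)

print("Best parameters found: ", grid_search.best_params_)
# best_score_ is the negated MSE, so flip the sign to report it
print("Best cross-validated MSE: ", -grid_search.best_score_)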
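
Another way to experiment, sketched here under some assumptions, is to watch how the held-out error changes as trees are added for several learning rates. This sketch uses scikit-learn's built-in estimator (imported under the alias SklearnGBR to avoid clashing with the class defined above) rather than the implementation in this notebook, and the noisy sine dataset and the names X_demo, y_demo and best_n_trees are made up for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor as SklearnGBR
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for illustration): a noisy sine curve
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 6, size=(300, 1))
y_demo = np.sin(X_demo[:, 0]) + rng.normal(scale=0.2, size=300)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0
)

best_n_trees = {}
for lr in [0.01, 0.1, 0.5]:
    model = SklearnGBR(n_estimators=200, learning_rate=lr, max_depth=2, random_state=0)
    model.fit(X_tr, y_tr)
    # staged_predict yields the test predictions after 1, 2, ..., 200 trees
    test_mse = [mean_squared_error(y_te, p) for p in model.staged_predict(X_te)]
    best_n_trees[lr] = int(np.argmin(test_mse)) + 1
    print(f"learning_rate={lr}: lowest test MSE {min(test_mse):.4f} "
          f"at {best_n_trees[lr]} trees")
```

Comparing the printed tree counts across learning rates shows how the two hyperparameters interact on this particular dataset; the numbers will differ on other data.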


Q4. What is a weak learner in Gradient Boosting?

A weak learner in Gradient Boosting is a simple model that performs only slightly better than a trivial baseline (random guessing in classification, or always predicting the mean in regression).
Shallow decision trees are commonly used as weak learners; the extreme case is a stump, a tree with a single split. The idea is that by combining many weak learners,
the boosting algorithm can create a strong learner with much better performance.
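
To make the contrast concrete, here is a small sketch that compares a single stump with 100 boosted stumps. It relies on scikit-learn's built-in estimator (imported as SklearnGBR), and the noisy sine data and the variable names are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor as SklearnGBR
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (an assumption for illustration): a noisy sine curve
rng = np.random.default_rng(0)
X_wl = np.linspace(0, 6, 200).reshape(-1, 1)
y_wl = np.sin(X_wl[:, 0]) + rng.normal(scale=0.1, size=200)

# A single stump: one split, so at most two distinct predicted values
stump = DecisionTreeRegressor(max_depth=1).fit(X_wl, y_wl)
stump_pred = stump.predict(X_wl)

# 100 stumps combined by gradient boosting
boosted = SklearnGBR(n_estimators=100, learning_rate=0.1, max_depth=1, random_state=0)
boosted_pred = boosted.fit(X_wl, y_wl).predict(X_wl)

stump_mse = mean_squared_error(y_wl, stump_pred)
boosted_mse = mean_squared_error(y_wl, boosted_pred)
print("distinct stump predictions:", np.unique(stump_pred).size)
print("training MSE, single stump:  ", stump_mse)
print("training MSE, boosted stumps:", boosted_mse)
```

Both errors are measured on the training data, so they show fitting capacity only, not generalization.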

Q5. What is the intuition behind the Gradient Boosting algorithm?

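The intuition behind Gradient Boosting is to build a model sequentially by adding new models that correct the errors made by the previous models.
Each new model is trained to predict the residuals (errors) of the combined ensemble of previous models.
For squared-error loss, these residuals are exactly the negative gradient of the loss with respect to the current predictions, which is where the name comes from.
By shrinking the residuals a little in every round, the algorithm improves the overall prediction accuracy step by step.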
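
The link between residuals and gradients can be checked numerically. The sketch below assumes the loss is half the sum of squared errors and reuses the five target values from the small dataset in Q2; the helper half_squared_error and the other names are hypothetical.

```python
import numpy as np

y_true = np.array([1.5, 2.0, 2.5, 3.5, 3.0])
current = np.full_like(y_true, y_true.mean())  # constant initial prediction

def half_squared_error(pred):
    # Assumed loss: L = 1/2 * sum((y - pred)^2)
    return 0.5 * np.sum((y_true - pred) ** 2)

# Central finite differences: dL/dpred_i for each prediction
eps = 1e-6
numeric_grad = np.zeros_like(current)
for i in range(current.size):
    up, down = current.copy(), current.copy()
    up[i] += eps
    down[i] -= eps
    numeric_grad[i] = (half_squared_error(up) - half_squared_error(down)) / (2 * eps)

residuals = y_true - current
print("residuals:        ", residuals)
print("negative gradient:", -numeric_grad)
print("match:", np.allclose(-numeric_grad, residuals, atol=1e-6))
```

With a different loss (for example absolute error), the negative gradient would no longer equal the plain residual, and the trees would be fit to that gradient instead.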



Q6. How does the Gradient Boosting algorithm build an ensemble of weak learners?


Gradient Boosting builds an ensemble of weak learners as follows:

1. Initialize the model with a constant value, typically the mean of the target variable.
2. Iterate over a specified number of boosting rounds (n_estimators):
   a. Compute the residuals between the actual values and the current predictions.
   b. Train a weak learner (e.g., a decision tree) to predict these residuals.
   c. Update the model by adding the weak learner's predictions multiplied by the learning rate.
3. Combine the predictions of all weak learners to form the final strong learner.
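
The procedure above can be sketched in pure NumPy by using a hand-written stump as the weak learner. The helpers fit_stump and predict_stump are hypothetical, handle a single feature only, and assume the feature has at least two distinct values; this is an illustration, not a replacement for a real tree implementation.

```python
import numpy as np

def fit_stump(x, r):
    """Pick the threshold on the 1-D feature x that best fits r in least squares."""
    best = None
    # The largest value is skipped because it would leave the right side empty
    for t in np.unique(x)[:-1]:
        left = x <= t
        left_val, right_val = r[left].mean(), r[~left].mean()
        sse = np.sum((r[left] - left_val) ** 2) + np.sum((r[~left] - right_val) ** 2)
        if best is None or sse < best[0]:
            best = (sse, t, left_val, right_val)
    return best[1:]

def predict_stump(stump, x):
    t, left_val, right_val = stump
    return np.where(x <= t, left_val, right_val)

x_steps = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_steps = np.array([1.5, 2.0, 2.5, 3.5, 3.0])
learning_rate, n_rounds = 0.1, 50

# Initialize with a constant: the mean of the targets
pred = np.full_like(y_steps, y_steps.mean())
stumps = []
mse_history = [np.mean((y_steps - pred) ** 2)]

for _ in range(n_rounds):
    residuals = y_steps - pred                    # errors of the current model
    stump = fit_stump(x_steps, residuals)         # weak learner fit to the residuals
    stumps.append(stump)
    pred = pred + learning_rate * predict_stump(stump, x_steps)  # shrunken update
    mse_history.append(np.mean((y_steps - pred) ** 2))

# Final model: initial constant plus the scaled sum of all weak learners
final = y_steps.mean() + learning_rate * sum(predict_stump(s, x_steps) for s in stumps)
print("training MSE before boosting:", mse_history[0])
print("training MSE after boosting: ", mse_history[-1])
```

Recording the training MSE after every round makes it easy to see that, for squared-error loss, each update can only keep the training error the same or lower it.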


