## Q1. What is Gradient Boosting Regression?

Gradient Boosting Regression (GBR) is a supervised machine learning algorithm from the boosting family of ensemble methods. It applies gradient boosting to regression problems, where the target is a continuous value; the same framework also handles classification when paired with a classification loss function.

In GBR, decision trees are fitted to the data one at a time, with each new tree attempting to correct the errors of the ensemble built so far. The prediction of the ensemble is an initial estimate (e.g., the mean of the targets) plus the sum of the individual trees' predictions, each scaled by the learning rate.

The "gradient" in gradient boosting refers to gradient descent performed in function space to minimize the model's loss function (e.g., mean squared error). At each step, the gradient of the loss with respect to the current predictions is computed, and the new tree is fitted to the negative gradient, which is the direction of steepest descent.
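For the squared-error loss, the negative gradient reduces to the ordinary residual, which is why fitting each tree to the residuals is a form of gradient descent. A short derivation follows; the symbols $F_{m-1}$ (the ensemble after $m-1$ trees), $h_m$ (the new tree), and $\nu$ (the learning rate) are notation introduced here for illustration, and the factor $\tfrac{1}{2}$ only rescales the gradient:

$$
L(y, F) = \tfrac{1}{2}\,(y - F)^2
\qquad\Longrightarrow\qquad
-\left.\frac{\partial L}{\partial F}\right|_{F = F_{m-1}(x)} = y - F_{m-1}(x)
$$

$$
F_m(x) = F_{m-1}(x) + \nu\, h_m(x)
$$

Here $h_m$ is the tree fitted to the residuals $y - F_{m-1}(x)$, so each update moves the predictions a step of size $\nu$ toward the targets.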

GBR can handle a large number of features and nonlinear relationships between the features and the target variable. However, it is prone to overfitting when the ensemble is too complex (too many trees, or trees that are too deep) or when the data is noisy. It is therefore important to tune the hyperparameters of the model (e.g., the number of trees, the learning rate, the maximum depth of the trees) against a validation set or with cross-validation.
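One way to see the effect of the number of trees is to track the validation error as trees are added. The sketch below is illustrative only: it uses scikit-learn's own `GradientBoostingRegressor` (imported under the alias `SkGBR`, a name chosen here) and its `staged_predict` method, and the dataset sizes, noise level, and random seed are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor as SkGBR
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data; sizes, noise level, and seed are arbitrary choices for this sketch
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

model = SkGBR(n_estimators=300, learning_rate=0.1, max_depth=3, random_state=0)
model.fit(X_tr, y_tr)

# staged_predict yields the ensemble's predictions after 1, 2, ..., n_estimators trees
val_mse = np.array([mean_squared_error(y_val, p) for p in model.staged_predict(X_val)])
train_mse = np.array([mean_squared_error(y_tr, p) for p in model.staged_predict(X_tr)])

best_n_trees = int(np.argmin(val_mse)) + 1
print(f"Trees with lowest validation MSE: {best_n_trees}")
print(f"Validation MSE at that point: {val_mse[best_n_trees - 1]:.2f}")
print(f"Training MSE with all trees: {train_mse[-1]:.2f}")
```

The training error keeps falling as trees are added, so the validation curve, not the training curve, is the one to use when choosing the number of trees.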

## Q2. Implement a simple gradient boosting algorithm from scratch using Python and NumPy. Use a simple regression problem as an example and train the model on a small dataset. Evaluate the model's performance using metrics such as mean squared error and R-squared.

In [1]:
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.tree import DecisionTreeRegressor
import numpy as np

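class GradientBoostingRegressor:
    """Minimal gradient boosting for squared-error loss, built on shallow regression trees."""

    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth
        self.trees = []
        self.intercept = None

    def fit(self, X, y):
        # Reset state so that calling fit() again does not keep stale trees
        self.trees = []
        y = np.asarray(y, dtype=float)
        # Initial prediction: the mean of the targets
        self.intercept = np.mean(y)
        # For squared-error loss, the negative gradient is the residual
        residual = y - self.intercept

        for _ in range(self.n_estimators):
            # Fit a shallow tree to the current residuals
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, residual)
            self.trees.append(tree)
            # Shrink the tree's contribution and update the residuals
            residual -= self.learning_rate * tree.predict(X)
        return self

    def predict(self, X):
        # Intercept plus the shrunken sum of all tree predictions
        preds = np.array([tree.predict(X) for tree in self.trees])
        return self.intercept + self.learning_rate * np.sum(preds, axis=0)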

# Generate a random regression problem
X, y = make_regression(n_samples=100, n_features=5, noise=0.5)

# Split data into training and testing sets
n_train = int(0.8 * len(X))
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

# Train a gradient boosting regressor on the training set
gb = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3)
gb.fit(X_train, y_train)

# Evaluate the model on the testing set
y_pred = gb.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean squared error: {mse:.2f}")
print(f"R-squared: {r2:.2f}")

Mean squared error: 2998.32
R-squared: 0.87


## Q3. Experiment with different hyperparameters such as learning rate, number of trees, and tree depth to optimise the performance of the model. Use grid search or random search to find the best hyperparameters.

Here is an implementation of grid search to find the best hyperparameters for the GradientBoostingRegressor model:

In [None]:
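from sklearn.datasets import make_regression
# GridSearchCV requires an estimator that implements get_params/set_params,
# so this cell uses scikit-learn's GradientBoostingRegressor rather than the
# from-scratch class defined in Q2
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV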

# Generate a random regression problem
X, y = make_regression(n_samples=100, n_features=5, noise=0.5)

# Split data into training and testing sets
n_train = int(0.8 * len(X))
X_train, y_train = X[:n_train], y[:n_train]
X_test, y_test = X[n_train:], y[n_train:]

# Define hyperparameters for grid search
param_grid = {
    'n_estimators': [50, 100, 150],
    'learning_rate': [0.01, 0.1, 0.5],
    'max_depth': [2, 3, 4],
}

# Create a GradientBoostingRegressor object
gb = GradientBoostingRegressor()

# Use GridSearchCV to search for the best hyperparameters
grid_search = GridSearchCV(gb, param_grid=param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Print the best hyperparameters
print(f"Best hyperparameters: {grid_search.best_params_}")

# Evaluate the best model on the testing set
y_pred = grid_search.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean squared error: {mse:.2f}")
print(f"R-squared: {r2:.2f}")

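In this implementation, we use scikit-learn's GridSearchCV to search for the best hyperparameters. We define a dictionary of hyperparameter values to search over, then pass this dictionary and a GradientBoostingRegressor estimator to GridSearchCV, which fits and scores every combination using 5-fold cross-validation on the training set. The estimator must follow scikit-learn's estimator interface (`get_params`, `set_params`, `fit`, `predict`), because GridSearchCV clones it and sets its parameters for each combination.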
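A from-scratch booster can also be tuned with GridSearchCV if it follows the estimator interface. The sketch below is one possible way to do that, not the only one: the class name `ScratchGBR` is hypothetical, the class inherits `get_params`/`set_params` from scikit-learn's `BaseEstimator`, and the small grid, `cv=3`, and the explicit `scoring` choice are assumptions made to keep the example quick.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor


class ScratchGBR(RegressorMixin, BaseEstimator):
    """Hypothetical name: a from-scratch booster that follows the estimator API."""

    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        # Store hyperparameters unchanged so get_params/set_params and clone work
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth

    def fit(self, X, y):
        y = np.asarray(y, dtype=float)
        self.intercept_ = y.mean()
        self.trees_ = []
        residual = y - self.intercept_
        for _ in range(self.n_estimators):
            # Each tree is fitted to the residuals left by the ensemble so far
            tree = DecisionTreeRegressor(max_depth=self.max_depth).fit(X, residual)
            self.trees_.append(tree)
            residual -= self.learning_rate * tree.predict(X)
        return self

    def predict(self, X):
        preds = np.sum([tree.predict(X) for tree in self.trees_], axis=0)
        return self.intercept_ + self.learning_rate * preds


X, y = make_regression(n_samples=100, n_features=5, noise=0.5, random_state=0)
param_grid = {"n_estimators": [20, 50], "learning_rate": [0.1, 0.5], "max_depth": [2, 3]}

# neg_mean_squared_error: higher is better, so the sign is flipped when reporting
search = GridSearchCV(ScratchGBR(), param_grid=param_grid, cv=3,
                      scoring="neg_mean_squared_error")
search.fit(X, y)
print(search.best_params_)
print(f"Best cross-validated MSE: {-search.best_score_:.2f}")
```

Keeping `__init__` free of any logic beyond storing its arguments is what lets `clone` rebuild the estimator for each parameter combination.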

You can also use RandomizedSearchCV to perform a random search over the hyperparameter space. Here's an example of how to use RandomizedSearchCV: