# Boosting-2
Assignment Questions

Q1. What is Gradient Boosting Regression?

Gradient Boosting Regression (GBR) is a machine learning algorithm for regression problems, which involve predicting a continuous numerical value rather than a category. GBR is a boosting algorithm: it builds an ensemble by iteratively adding weak regression models (usually shallow decision trees), where each subsequent model is trained to correct the errors made by the models before it.

At a high level, the GBR algorithm works as follows:

1. An initial regression model is trained on the training data (in practice this is often just a constant, such as the mean of the target).
2. The residuals (i.e., the actual values minus the predicted values) of the first model are calculated for each training sample.
3. A second regression model is trained on the residuals, with the goal of predicting the errors that the first model left behind.
4. The predictions of the second model, usually scaled by a learning rate, are added to the predictions of the first model, producing a new set of predicted values.
5. The residuals of the new set of predictions are calculated, and the process is repeated by training a third regression model on these residuals.
6. The predictions of the third model are added to the predictions of the previous models, and the residuals of the new set of predictions are calculated.
7. This process is repeated until a stopping criterion is met (e.g., a maximum number of iterations is reached, or the performance on the validation set does not improve).

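At each iteration, the new regression model is trained to reduce a loss function, which is typically the mean squared error (MSE) between the predicted and actual values. More precisely, each new model is fit to the negative gradient of the loss with respect to the current predictions; for squared error, this negative gradient is (up to a constant factor) the residual, which is why the steps above fit residuals. The loss function measures how well the current ensemble fits the training data and guides the training process.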
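For the squared-error loss, the link between residuals and gradients can be written out in a few lines. The notation here is introduced for illustration only and does not come from the text above: $F_m$ is the ensemble after $m$ rounds, $h_m$ is the $m$-th weak learner, $r_{i,m}$ is the residual of sample $i$ at round $m$, and $\nu$ is the learning rate.

$$
L(y_i, F) = \tfrac{1}{2}\,(y_i - F)^2
\quad\Longrightarrow\quad
-\left.\frac{\partial L(y_i, F)}{\partial F}\right|_{F = F_{m-1}(x_i)} = y_i - F_{m-1}(x_i) = r_{i,m}
$$

$$
h_m \text{ is fit to } \{(x_i, r_{i,m})\}_{i=1}^{n},
\qquad
F_m(x) = F_{m-1}(x) + \nu\, h_m(x)
$$

With a different loss (for example absolute error), the same update applies, but the targets $r_{i,m}$ are the negative gradients of that loss rather than plain residuals.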

The final prediction of the GBR algorithm is obtained by summing the contributions of all the individual models: the initial prediction plus the prediction of every model added afterwards. Unlike AdaBoost, gradient boosting does not weight models by their individual performance on the training data. Instead, each model's contribution is typically scaled by the same learning rate (shrinkage), which slows down fitting in exchange for better generalization.
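One way to check how the final prediction is assembled is to rebuild it by hand from a fitted model. The sketch below uses scikit-learn's built-in GradientBoostingRegressor with its default squared-error loss and a synthetic dataset from make_regression; the dataset and the hyperparameter values are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic data, used only for illustration
X, y = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)

model = GradientBoostingRegressor(
    n_estimators=30, learning_rate=0.1, max_depth=2, random_state=0
)
model.fit(X, y)

# Constant baseline fitted before any tree (the mean of y for squared error)
baseline = model.init_.predict(X).ravel()

# estimators_ has shape (n_estimators, 1) for regression; column 0 holds the trees
tree_sum = sum(tree.predict(X) for tree in model.estimators_[:, 0])

# Baseline plus the learning-rate-scaled sum of all tree outputs
manual = baseline + model.learning_rate * tree_sum

print("Manual sum matches model.predict:", np.allclose(manual, model.predict(X)))
```

Every tree is scaled by the same learning rate; no per-model performance weights appear in the sum.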

Q2. Implement a simple gradient boosting algorithm from scratch using Python and NumPy. Use a
simple regression problem as an example and train the model on a small dataset. Evaluate the model's
performance using metrics such as mean squared error and R-squared.

- Here's an implementation of gradient boosting regression from scratch using Python and NumPy, with scikit-learn's DecisionTreeRegressor as the weak learner. We'll use the Boston Housing dataset as our example dataset.

In [1]:
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

# Load the Boston Housing dataset from its original source
# (load_boston was removed from scikit-learn in version 1.2)
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
X = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
y = raw_df.values[1::2, 2]

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define the gradient boosting regression class
class GradientBoostingRegressor:
    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth
        self.initial_prediction = None
        self.estimators = []

    def fit(self, X, y):
        # Start from a constant model: the mean of the training targets
        self.initial_prediction = np.mean(y)
        self.estimators = []  # reset so that refitting does not keep old trees

        # Residual = true y minus the current ensemble prediction
        residual = y - self.initial_prediction

        # Iterate over the number of estimators
        for _ in range(self.n_estimators):
            # Fit a decision tree to the current residual
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, residual)

            # Add the tree's scaled predictions to the ensemble, i.e. shrink the residual
            residual -= self.learning_rate * tree.predict(X)

            # Add the tree to the list of estimators
            self.estimators.append(tree)
        return self

    def predict(self, X):
        # Start from the constant baseline learned in fit; omitting it would
        # shift every prediction by the mean of y
        predictions = np.full(len(X), self.initial_prediction, dtype=float)
        # Add the learning-rate-scaled prediction of every tree
        for tree in self.estimators:
            predictions += self.learning_rate * tree.predict(X)
        return predictions

# Train the gradient boosting regression model
model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3)
model.fit(X_train, y_train)

# Evaluate the model on the test data
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# Print the evaluation metrics
print("Mean squared error: %.2f" % mse)
print("R-squared: %.2f" % r2)



Note: scikit-learn deprecated load_boston in version 1.0 and removed it in version 1.2 because the Boston housing prices dataset has an ethical problem. The scikit-learn maintainers strongly discourage the use of this dataset unless the purpose of the code is to study and educate about ethical issues in data science and machine learning. In that special case, the data can be fetched from the original source (http://lib.stat.cmu.edu/datasets/boston). Alternative datasets include the California housing dataset (sklearn.datasets.fetch_california_housing) and the Ames housing dataset.

In this implementation, we define a GradientBoostingRegressor class that contains the hyperparameters for the algorithm (n_estimators, learning_rate, and max_depth) and the list of decision trees (estimators). The fit method trains the model by iterating over the number of estimators and fitting a decision tree to the residual of the previous prediction. The predict method computes the sum of the predictions of all the trees in the ensemble. Finally, we train the model on the Boston Housing dataset, evaluate it on the test data using mean squared error and R-squared, and print the evaluation metrics.

Note that this implementation is a simplified version of gradient boosting regression, and there are many ways to improve it (e.g., by adding early stopping, regularization, or using different types of weak learners).
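As one example of such an improvement, early stopping can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name fit_with_early_stopping, the patience parameter, and the synthetic dataset are all assumptions made for this sketch. Boosting stops once the validation MSE has not improved for patience consecutive rounds, and only the trees up to the best round are kept.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def fit_with_early_stopping(X_tr, y_tr, X_val, y_val, max_estimators=300,
                            learning_rate=0.1, max_depth=3, patience=10):
    """Boost until validation MSE has not improved for `patience` rounds."""
    baseline = float(np.mean(y_tr))               # constant initial model
    pred_tr = np.full(len(y_tr), baseline)
    pred_val = np.full(len(y_val), baseline)
    trees = []
    best_mse, best_round = mean_squared_error(y_val, pred_val), 0

    for round_idx in range(1, max_estimators + 1):
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X_tr, y_tr - pred_tr)            # fit the current residuals
        pred_tr += learning_rate * tree.predict(X_tr)
        pred_val += learning_rate * tree.predict(X_val)
        trees.append(tree)

        val_mse = mean_squared_error(y_val, pred_val)
        if val_mse < best_mse:
            best_mse, best_round = val_mse, round_idx
        elif round_idx - best_round >= patience:
            break                                 # no recent improvement

    # Keep only the trees up to the best validation round
    return baseline, trees[:best_round], best_mse


# Synthetic data, used only for illustration
X, y = make_regression(n_samples=300, n_features=5, noise=15.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

baseline, trees, best_mse = fit_with_early_stopping(X_tr, y_tr, X_val, y_val)
print("Trees kept:", len(trees))
print("Best validation MSE: %.2f" % best_mse)
```

The validation set used for stopping should be separate from the final test set, otherwise the reported test error is optimistically biased.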

Q3. Experiment with different hyperparameters such as learning rate, number of trees, and tree depth to
optimise the performance of the model. Use grid search or random search to find the best
hyperparameters.

- Grid search can be used to find the best hyperparameters for the gradient boosting regression model we implemented earlier.
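A minimal sketch of that grid search is given below. Two things in it are assumptions rather than part of the original answer. First, GridSearchCV needs to clone the estimator and set its parameters, so the from-scratch booster is rewritten here as a hypothetical class SimpleGBR that inherits from scikit-learn's BaseEstimator and RegressorMixin. Second, the data is a synthetic set from make_regression so that the block runs on its own; the grid values are arbitrary.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor


class SimpleGBR(RegressorMixin, BaseEstimator):
    """From-scratch booster that follows scikit-learn's estimator conventions."""

    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        # Only store hyperparameters here so that clone() and set_params() work
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth

    def fit(self, X, y):
        self.init_ = float(np.mean(y))            # constant initial model
        self.estimators_ = []
        residual = np.asarray(y, dtype=float) - self.init_
        for _ in range(self.n_estimators):
            tree = DecisionTreeRegressor(max_depth=self.max_depth, random_state=0)
            tree.fit(X, residual)
            residual -= self.learning_rate * tree.predict(X)
            self.estimators_.append(tree)
        return self

    def predict(self, X):
        pred = np.full(len(X), self.init_)
        for tree in self.estimators_:
            pred += self.learning_rate * tree.predict(X)
        return pred


# Synthetic data, used only for illustration
X, y = make_regression(n_samples=300, n_features=3, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Candidate values for each hyperparameter (arbitrary choices)
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [2, 3],
}

# 5-fold cross-validated search over all 8 combinations
search = GridSearchCV(SimpleGBR(), param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test set
y_pred = search.best_estimator_.predict(X_test)
print("Best hyperparameters:", search.best_params_)
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_pred))
print("R-squared: %.2f" % r2_score(y_test, y_pred))
```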

In this example, we define a parameter grid with different values for the n_estimators, learning_rate, and max_depth hyperparameters. We then create a GridSearchCV object with the GradientBoostingRegressor model and the parameter grid, and fit it to the training data using 5-fold cross-validation. Finally, we print the best hyperparameters and evaluate the best model on the test data using mean squared error and R-squared.

You can also use random search instead of grid search to explore the hyperparameter space more efficiently. Here's an example of how to use random search:
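A minimal sketch of random search follows. To keep it short, it swaps in scikit-learn's built-in GradientBoostingRegressor as a stand-in for the from-scratch class, and it uses a synthetic dataset from make_regression; both are assumptions made for illustration, as are the sampling ranges. RandomizedSearchCV draws n_iter parameter combinations from the given distributions instead of trying every point on a grid.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic data, used only for illustration
X, y = make_regression(n_samples=300, n_features=3, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Distributions to sample from (ranges are arbitrary choices)
param_distributions = {
    "n_estimators": randint(50, 201),       # integers from 50 to 200
    "learning_rate": uniform(0.01, 0.29),   # floats between 0.01 and 0.30
    "max_depth": randint(2, 6),             # integers from 2 to 5
}

# Sample 10 combinations and score each with 5-fold cross-validation
search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=42),
    param_distributions,
    n_iter=10,
    cv=5,
    scoring="neg_mean_squared_error",
    random_state=42,
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test set
y_pred = search.best_estimator_.predict(X_test)
print("Best hyperparameters:", search.best_params_)
print("Mean squared error: %.2f" % mean_squared_error(y_test, y_pred))
print("R-squared: %.2f" % r2_score(y_test, y_pred))
```

Because the number of sampled combinations is fixed by n_iter, the cost stays the same when more hyperparameters or wider ranges are added, which is where random search tends to be cheaper than an exhaustive grid.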

Q4. What is a weak learner in Gradient Boosting?