# Q1. What is Gradient Boosting Regression?

Gradient Boosting Regression is an ensemble learning technique for regression tasks. It combines the predictions of many weak learners (typically shallow decision trees) into a single strong predictive model. The model is built iteratively: each new learner is fit to the residuals (the actual values minus the current predictions) of the ensemble built so far, so every step reduces the remaining error between predicted and actual values.
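
To make the idea of residuals concrete, here is a minimal sketch on a tiny made-up dataset. The variable names are illustrative, and the starting model is assumed to be the mean of the targets.

```python
import numpy as np

# Tiny illustrative dataset (an assumption for this sketch)
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Baseline model: predict the mean of the targets for every sample
baseline = np.full_like(y, y.mean())

# Residuals are what the baseline gets wrong: actual minus predicted
residuals = y - baseline

print("Baseline prediction:", baseline[0])
print("Residuals:", residuals)
print("Baseline MSE:", np.mean(residuals ** 2))
```

The next learner in the ensemble would be trained to predict these residuals rather than `y` itself.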

# Q2. Implement a simple gradient boosting algorithm from scratch using Python and NumPy. Use a simple regression problem as an example and train the model on a small dataset. Evaluate the model's performance using metrics such as mean squared error and R-squared.

In [4]:
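import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.tree import DecisionTreeRegressor

# Generate a simple dataset (one feature, five samples)
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 4, 5, 4, 5])

# Initialize the prediction with the mean of the target values
initial_pred = np.mean(y)
pred = np.full(y.shape, initial_pred, dtype=float)

# Number of trees in the ensemble
n_trees = 100

# Learning rate (shrinkage applied to each tree's contribution)
learning_rate = 0.1

# Keep the fitted trees so the ensemble can also predict on new inputs
trees = []

for i in range(n_trees):
    # Calculate residuals: actual values minus current predictions
    residuals = y - pred

    # Fit a shallow decision tree to the residuals
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residuals)
    trees.append(tree)

    # Add the tree's shrunken prediction to the ensemble prediction
    pred += learning_rate * tree.predict(X)

# Evaluate on the training data (this toy example has no held-out set)
print("Mean Squared Error:", mean_squared_error(y, pred))
print("R-squared:", r2_score(y, pred))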


# Q3. Experiment with different hyperparameters such as learning rate, number of trees, and tree depth to optimise the performance of the model. Use grid search or random search to find the best hyperparameters.

In [2]:
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Generate a simple dataset
X = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])

# Create a Gradient Boosting Regressor
regressor = GradientBoostingRegressor()

# Define the hyperparameter grid to search
param_grid = {
    'n_estimators': [50, 100, 200],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [2, 3, 4]
}

# Create the GridSearchCV object
grid_search = GridSearchCV(regressor, param_grid, cv=3, scoring='neg_mean_squared_error')

# Perform the grid search
grid_search.fit(X.reshape(-1, 1), y)

# Get the best hyperparameters
best_params = grid_search.best_params_
print("Best Hyperparameters:", best_params)

# Get the best trained model
best_model = grid_search.best_estimator_

# Make predictions with the best model
y_pred = best_model.predict(X.reshape(-1, 1))

# Evaluate the model's performance (on the training data; this toy dataset has no held-out split)
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)

print("Mean Squared Error:", mse)
print("R-squared:", r2)


Best Hyperparameters: {'learning_rate': 0.01, 'max_depth': 2, 'n_estimators': 50}
Mean Squared Error: 0.5172019250760084
R-squared: 0.5689983957699931
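
The question also mentions random search. A minimal sketch of that alternative follows, using scikit-learn's `RandomizedSearchCV` with SciPy distributions. The synthetic dataset, the sampling ranges, and `n_iter=10` are assumptions chosen for illustration, not tuned values.

```python
import numpy as np
from scipy.stats import randint, uniform
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data (an assumption for this sketch): a noisy sine curve
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(60, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=60)

# Distributions to sample from, instead of a fixed grid
param_distributions = {
    "n_estimators": randint(50, 201),      # integers in [50, 200]
    "learning_rate": uniform(0.01, 0.19),  # floats in [0.01, 0.20]
    "max_depth": randint(2, 5),            # integers in [2, 4]
}

random_search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions,
    n_iter=10,  # number of sampled configurations
    cv=3,
    scoring="neg_mean_squared_error",
    random_state=0,
)
random_search.fit(X, y)

print("Best Hyperparameters:", random_search.best_params_)
print("Best CV MSE:", -random_search.best_score_)
```

Random search evaluates a fixed number of sampled configurations, so its cost does not grow with the number of candidate values per hyperparameter the way a full grid does.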


# Q4. What is a weak learner in Gradient Boosting?

In Gradient Boosting, a weak learner is a simple model, most often a shallow decision tree, that performs only slightly better than a trivial baseline (for regression, for example, always predicting the mean of the targets). Weak learners are the building blocks of the ensemble: in each boosting iteration, a new weak learner is trained to correct the errors made by the models before it.
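
As a rough illustration, the sketch below compares a decision stump (a tree with a single split) against a constant mean prediction. The synthetic data and variable names are assumptions made for this example.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (an assumption for this sketch): a noisy sine curve
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=50)

# Trivial baseline: always predict the mean of y
baseline_mse = mean_squared_error(y, np.full_like(y, y.mean()))

# Weak learner: a decision stump (max_depth=1 allows one split)
stump = DecisionTreeRegressor(max_depth=1).fit(X, y)
stump_mse = mean_squared_error(y, stump.predict(X))

print("Baseline MSE:", baseline_mse)
print("Stump MSE:   ", stump_mse)
```

On its own the stump is a poor model of the curve, but it does beat the baseline on the training data, which is all boosting requires of each individual learner.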

# Q5. What is the intuition behind the Gradient Boosting algorithm?

The intuition behind the Gradient Boosting algorithm is to improve the predictive performance of an ensemble of weak learners step by step, by focusing on the mistakes made so far. In each iteration, a new weak learner is added to the ensemble to correct the errors of the existing ensemble. This process continues until a predefined number of iterations is reached or a performance criterion is met.
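
One way to watch this happen is to track the training error as trees are added. The sketch below uses the `staged_predict` method of scikit-learn's `GradientBoostingRegressor`; the synthetic data and hyperparameter values are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Synthetic data (an assumption for this sketch): a noisy sine curve
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=50)

model = GradientBoostingRegressor(
    n_estimators=50, learning_rate=0.1, max_depth=2, random_state=0
)
model.fit(X, y)

# staged_predict yields the ensemble's prediction after each added tree
stage_mse = [mean_squared_error(y, p) for p in model.staged_predict(X)]

for n in (1, 10, 50):
    print(f"Trees: {n:2d}  training MSE: {stage_mse[n - 1]:.4f}")
```

This tracks training error only. On held-out data the error can stop improving or rise again, which is why the stopping criterion is usually based on a validation set.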

# Q6. How does the Gradient Boosting algorithm build an ensemble of weak learners?

1. An initial simple model is fit to the data (often just a constant such as the mean of the targets, or a shallow decision tree).
2. The residuals of this model (the actual values minus the predictions) are calculated.
3. A new weak learner is fit to these residuals to predict the remaining errors.
4. The new learner is added to the ensemble, and steps 2 and 3 are repeated, with each new weak learner focusing on the errors of the ensemble built so far.
5. The predictions of all weak learners in the ensemble are combined to make the final prediction.
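
The procedure above can be sketched as a small class that stores every fitted tree, so the ensemble can also predict on inputs it has not seen. `SimpleGradientBoosting` is a hypothetical name, and the sketch assumes squared-error loss, a constant starting model, and arbitrary default hyperparameters.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


class SimpleGradientBoosting:
    """Hypothetical minimal booster for squared-error regression."""

    def __init__(self, n_trees=50, learning_rate=0.1, max_depth=2):
        self.n_trees = n_trees
        self.learning_rate = learning_rate
        self.max_depth = max_depth
        self.trees = []

    def fit(self, X, y):
        # Start from a constant model: the mean of the targets
        self.initial_pred = float(np.mean(y))
        pred = np.full(len(y), self.initial_pred)
        self.trees = []
        for _ in range(self.n_trees):
            # Each new tree is trained on the current ensemble's errors
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, y - pred)
            pred += self.learning_rate * tree.predict(X)
            self.trees.append(tree)
        return self

    def predict(self, X):
        # Combine: constant start plus every tree's shrunken contribution
        pred = np.full(X.shape[0], self.initial_pred)
        for tree in self.trees:
            pred += self.learning_rate * tree.predict(X)
        return pred


X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 5, 4, 5])
model = SimpleGradientBoosting().fit(X, y)
print(model.predict(np.array([[2.5], [6.0]])))
```

Keeping the trees in a list is the design choice that matters here: the ensemble is nothing more than the starting constant plus that list, applied in order.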

# Q7. What are the steps involved in constructing the mathematical intuition of the Gradient Boosting algorithm?

1. Start with an initial prediction, often the mean of the target values.
2. Calculate the residuals: the actual target values minus the current predictions. For squared-error loss, these are equal (up to a constant factor) to the negative gradient of the loss with respect to the predictions, which is where the "gradient" in Gradient Boosting comes from.
3. Fit a weak learner (e.g., a decision tree) to the residuals to capture the patterns in the errors.
4. Update the prediction by adding a fraction of the weak learner's output to the current predictions. This fraction is controlled by the learning rate.
5. Repeat steps 2-4 for a specified number of iterations, gradually reducing the errors in the predictions.
6. The final prediction is the sum of the initial prediction and the scaled contributions from all weak learners.

By fitting each new learner to the current residuals, Gradient Boosting builds an ensemble of weak learners that work together to improve predictive accuracy.
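
The steps above can be written compactly as follows, assuming squared-error loss with a factor of one half. The notation is introduced here for illustration and is not from the original text: $F_m$ is the ensemble after $m$ iterations, $h_m$ is the weak learner fit at iteration $m$, $r_{i,m}$ is the residual of sample $i$, $\nu$ is the learning rate, and $M$ is the number of iterations.

```latex
\begin{align*}
L\bigl(y_i, F(x_i)\bigr) &= \tfrac{1}{2}\bigl(y_i - F(x_i)\bigr)^2 \\
F_0(x) &= \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \\
r_{i,m} &= -\left.\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}\right|_{F = F_{m-1}} = y_i - F_{m-1}(x_i) \\
F_m(x) &= F_{m-1}(x) + \nu\, h_m(x), \qquad m = 1, \dots, M \\
F_M(x) &= F_0(x) + \nu \sum_{m=1}^{M} h_m(x)
\end{align*}
```

Here $h_m$ is the weak learner fit to the pairs $(x_i, r_{i,m})$. With a different loss function, the residuals in the third line are replaced by that loss's negative gradient, and the rest of the procedure is unchanged.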