In [None]:
Q1


Gradient boosting regression is a machine learning technique that uses an ensemble of weak learners to create a strong learner. The weak learners are typically decision trees, and they are added sequentially to the model, each one trying to correct the errors of the previous models.

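The gradient boosting algorithm works by iteratively fitting a new model to the residual errors of the current ensemble. The residual errors are the differences between the actual values and the predicted values. Each new model is fit to the negative gradient of the loss function with respect to the current predictions, which amounts to gradient descent in the space of predictions. For the squared-error loss, the negative gradient is proportional to the residuals.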

The loss function is a measure of how well the model fits the data. The most common loss function for regression is the mean squared error (MSE). The MSE is the mean of the squared differences between the predicted values and the actual values.
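
As a small illustration of the link between residuals and the loss, the sketch below checks numerically that the residuals (actual minus predicted) equal the negative gradient of the half squared error, and computes the MSE as a mean. The toy values and the helper name `half_squared_error` are assumptions made for this example only.

```python
import numpy as np

# Hypothetical toy values, chosen only for illustration.
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])   # actual targets
y_hat = np.full_like(y, y.mean())          # current predictions (the mean, 4.0)


def half_squared_error(pred):
    """Loss 0.5 * (y - pred)^2, summed over all samples."""
    return 0.5 * np.sum((y - pred) ** 2)


# Residuals: actual minus predicted.
residuals = y - y_hat

# Central finite-difference estimate of dL/d(pred_i) for each sample i.
eps = 1e-6
basis = np.eye(len(y))
numeric_grad = np.array([
    (half_squared_error(y_hat + eps * basis[i])
     - half_squared_error(y_hat - eps * basis[i])) / (2 * eps)
    for i in range(len(y))
])

mse = np.mean(residuals ** 2)   # mean, not sum, of the squared differences

print("residuals:        ", residuals)
print("negative gradient:", -numeric_grad)
print("MSE:", mse)
```

The two printed vectors should agree up to floating-point error, which is why fitting a tree to the residuals is a gradient step for this particular loss.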

Gradient boosting regression is a powerful machine learning technique that can be used to solve a variety of regression problems. It is particularly well-suited for problems where the relationship between the independent and dependent variables is nonlinear.

In [None]:
Q2

In [None]:
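import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.tree import DecisionTreeRegressor


# Subclassing BaseEstimator provides get_params/set_params, which tools
# such as GridSearchCV need in order to clone and reconfigure the model.
class GradientBoostingRegressor(RegressorMixin, BaseEstimator):
    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth

    def fit(self, X, y):
        y = np.asarray(y, dtype=float)
        # Start from a constant model: the mean of the training targets.
        self.initial_prediction = np.mean(y)
        self.estimators = []
        y_pred = np.full(y.shape, self.initial_prediction)

        for _ in range(self.n_estimators):
            # For squared-error loss, the residuals are the negative gradient.
            residuals = y - y_pred
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, residuals)
            # Shrink each tree's contribution by the learning rate.
            y_pred += self.learning_rate * tree.predict(X)
            self.estimators.append(tree)
        return self

    def predict(self, X):
        # Reuse the training mean stored by fit(); y is not available here.
        y_pred = np.full(X.shape[0], self.initial_prediction)
        for tree in self.estimators:
            y_pred += self.learning_rate * tree.predict(X)
        return y_pred


def mean_squared_error(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)


def r_squared(y_true, y_pred):
    ss_total = np.sum((y_true - np.mean(y_true)) ** 2)
    ss_residual = np.sum((y_true - y_pred) ** 2)
    return 1 - (ss_residual / ss_total)


# Small toy dataset.
X_train = np.array([[1], [2], [3], [4], [5]])
y_train = np.array([2, 4, 5, 4, 5])

gb_regressor = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3)
gb_regressor.fit(X_train, y_train)

# Predictions for unseen inputs.
X_test = np.array([[6], [7], [8]])
y_pred = gb_regressor.predict(X_test)

# There are no test targets, so the metrics are computed on the training data.
train_pred = gb_regressor.predict(X_train)
mse = mean_squared_error(y_train, train_pred)
r2 = r_squared(y_train, train_pred)
print(f"Training Mean Squared Error: {mse}")
print(f"Training R-squared: {r2}")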


In [None]:
Q3

In [None]:
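from sklearn.model_selection import GridSearchCV

# Candidate values for each hyperparameter; every combination is evaluated.
param_grid = {
    'n_estimators': [50, 100, 150],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [3, 4, 5]
}

# GridSearchCV clones the estimator, so the class must expose
# get_params/set_params (for example by subclassing sklearn.base.BaseEstimator).
gb_regressor = GradientBoostingRegressor()

# With only five training samples, cv=5 amounts to leave-one-out cross-validation.
grid_search = GridSearchCV(gb_regressor, param_grid, cv=5, scoring='neg_mean_squared_error', n_jobs=-1)
grid_search.fit(X_train, y_train)

best_params = grid_search.best_params_
best_estimator = grid_search.best_estimator_

# best_estimator_ is refit on all of X_train, so these are training-set metrics.
best_mse = mean_squared_error(y_train, best_estimator.predict(X_train))
best_r2 = r_squared(y_train, best_estimator.predict(X_train))
print("Best Hyperparameters:", best_params)
print("Training Mean Squared Error:", best_mse)
print("Training R-squared:", best_r2)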


In [None]:
Q4

A weak learner is a machine learning model that performs only slightly better than a trivial baseline: random guessing in classification, or always predicting the mean in regression. In gradient boosting, weak learners are typically shallow decision trees. The idea is to combine many weak learners to create a strong learner.

The gradient boosting algorithm works by iteratively fitting a new model to the residual errors of the current ensemble. The residual errors are the differences between the actual values and the predicted values. Each new model is fit to the negative gradient of the loss function with respect to the current predictions, which amounts to gradient descent in the space of predictions.

The loss function is a measure of how well the model fits the data. The most common loss function for regression is the mean squared error (MSE). The MSE is the mean of the squared differences between the predicted values and the actual values.

The gradient boosting algorithm starts from a simple initial prediction, such as the mean of the target values. The residual errors are then calculated, and a weak learner is fit to them. This process is repeated for a fixed number of iterations or until the error stops improving.

Each weak learner on its own captures only a small part of the structure in the data. However, by combining many of them, the gradient boosting algorithm can create a strong learner that is much more accurate than any individual weak learner.
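
To make the weak-versus-strong contrast concrete, the sketch below compares a single depth-1 tree (a stump) with an ensemble of 200 boosted stumps. It uses scikit-learn's own `GradientBoostingRegressor`, imported under the alias `SklearnGBR` so it does not shadow the class defined for Q2. The synthetic sine data and all parameter values are assumptions for illustration, and the errors are measured on the training data only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor as SklearnGBR
from sklearn.tree import DecisionTreeRegressor

# Synthetic nonlinear data (assumed for this example).
rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# A single weak learner: one split, two constant predictions.
stump = DecisionTreeRegressor(max_depth=1).fit(X, y)

# Many weak learners of the same depth, combined by boosting.
ensemble = SklearnGBR(
    n_estimators=200, learning_rate=0.1, max_depth=1, random_state=0
).fit(X, y)

stump_mse = np.mean((y - stump.predict(X)) ** 2)
ensemble_mse = np.mean((y - ensemble.predict(X)) ** 2)

print(f"single stump  training MSE: {stump_mse:.4f}")
print(f"boosted stumps training MSE: {ensemble_mse:.4f}")
```

A single stump can only produce two distinct output values, so it cannot follow the sine curve. The boosted ensemble reaches a much lower training error with the same kind of base model.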

In [None]:
Q5

The intuition behind the gradient boosting algorithm is to build an ensemble of weak learners that collectively learn to minimize a loss function. The weak learners are typically decision trees, and they are added sequentially to the model, each one trying to correct the errors of the previous models.

The algorithm works by iteratively fitting a new model to the residual errors of the current ensemble. The residual errors are the differences between the actual values and the predicted values. Each new model is fit to the negative gradient of the loss function with respect to the current predictions, which amounts to gradient descent in the space of predictions.

The loss function is a measure of how well the model fits the data. The most common loss function for regression is the mean squared error (MSE). The MSE is the mean of the squared differences between the predicted values and the actual values.

The algorithm starts from a simple initial prediction, such as the mean of the target values. The residual errors are then calculated, and a weak learner is fit to them. This process is repeated for a fixed number of iterations or until the error stops improving.
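
One way to see this intuition is to watch the training loss fall as trees are added. The sketch below uses `staged_predict` from scikit-learn's `GradientBoostingRegressor` (imported as `SklearnGBR` to avoid clashing with the Q2 class), which yields the ensemble's predictions after each added tree. The quadratic toy data and the parameter values are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor as SklearnGBR

# Synthetic data with a nonlinear relationship (assumed for this example).
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(150, 1))
y = X.ravel() ** 2 + rng.normal(scale=0.3, size=150)

model = SklearnGBR(
    n_estimators=50, learning_rate=0.1, max_depth=2, random_state=0
).fit(X, y)

# staged_predict yields predictions after 1, 2, ..., n_estimators trees.
train_mse = [np.mean((y - pred) ** 2) for pred in model.staged_predict(X)]

for i in (0, 9, 49):
    print(f"after {i + 1:2d} trees: training MSE = {train_mse[i]:.4f}")
```

Each added tree corrects part of what the previous trees left unexplained, so the training error shrinks step by step. The error on held-out data would need to be checked separately to detect overfitting.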


In [None]:
Q6

The gradient boosting algorithm builds an ensemble of weak learners by iteratively fitting new models to the residual errors of the current ensemble. The residual errors are the differences between the actual values and the predicted values.

The algorithm starts from a simple initial prediction, such as the mean of the target values. The residual errors are then calculated, and a weak learner is fit to them. This process is repeated for a fixed number of iterations or until the error stops improving.

The gradient boosting algorithm can be summarized in the following steps:

1. Initialize the predictions to the mean of the target values.
2. For each iteration:
   a. Calculate the residual errors (actual values minus current predictions).
   b. Fit a new weak learner to the residual errors.
   c. Update the predictions by adding the new weak learner's predictions, scaled by the learning rate.

The final prediction is the initial mean plus the sum of the learning-rate-scaled predictions of the individual weak learners. The weak learners are typically decision trees, but other models can also be used.
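
The steps above can be traced by hand on a tiny dataset. The following sketch runs three iterations with depth-1 trees and prints the sum of squared errors (SSE) after each one. The learning rate of 0.5 is an assumed value, larger than usual so that progress is visible in a few steps, and the variable names are chosen for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
learning_rate = 0.5   # assumed value for illustration

# Initialize every prediction to the mean of the targets.
initial = y.mean()
pred = np.full_like(y, initial)
trees = []
sse_history = [np.sum((y - pred) ** 2)]

for step in range(3):
    # Calculate the residual errors: actual minus current prediction.
    residuals = y - pred
    # Fit a new weak learner (a stump) to the residuals.
    tree = DecisionTreeRegressor(max_depth=1).fit(X, residuals)
    # Update the predictions with the scaled output of the new learner.
    pred = pred + learning_rate * tree.predict(X)
    trees.append(tree)
    sse_history.append(np.sum((y - pred) ** 2))
    print(f"iteration {step + 1}: SSE = {sse_history[-1]:.3f}")

# The final prediction is the initial constant plus the scaled tree outputs.
recomposed = initial + learning_rate * sum(t.predict(X) for t in trees)
print("matches running prediction:", np.allclose(pred, recomposed))
```

The last line confirms that the running prediction built up inside the loop is the same as the initial mean plus the scaled sum of the individual trees.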

In [None]:
Q7

The mathematical intuition of the gradient boosting algorithm can be constructed in the following steps:

1. Define a loss function that measures the error between the predicted values and the actual values. The most common loss function for regression is the mean squared error (MSE).
2. Find the gradient of the loss function with respect to the predicted values. The gradient is a vector that points in the direction of steepest ascent, so the negative gradient points in the direction of steepest descent. For the squared error, the negative gradient is proportional to the residuals.
3. Fit a weak learner to the negative gradient and update the predicted values by taking a small step, scaled by the learning rate, in that direction.
4. Repeat steps 2 and 3 for a fixed number of iterations or until the loss stops improving.

The gradient boosting algorithm can be thought of as a way to minimize the loss function iteratively. By repeatedly moving the predicted values against the gradient, the algorithm gradually improves its fit to the data.
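
The descent itself can be sketched without any trees by treating the training predictions as free parameters and stepping them against the gradient of the half squared error. This is a simplification: real gradient boosting fits a weak learner to the negative gradient so that the step can also be applied to unseen inputs. The names `F`, `step_size`, `loss` and `gradient` are assumptions for this example.

```python
import numpy as np

y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])   # actual values
F = np.full_like(y, y.mean())              # initial predictions
step_size = 0.1                            # plays the role of the learning rate


def loss(F):
    """Half squared error, summed over samples."""
    return 0.5 * np.sum((y - F) ** 2)


def gradient(F):
    """dL/dF for the half squared error; its negative is the residual y - F."""
    return F - y


losses = [loss(F)]
for _ in range(50):
    # Move the predictions a small step against the gradient.
    F = F - step_size * gradient(F)
    losses.append(loss(F))

print(f"initial loss: {losses[0]:.4f}")
print(f"final loss:   {losses[-1]:.6f}")
print("final predictions:", np.round(F, 3))
```

With this loss, each step shrinks every remaining error by a factor of `1 - step_size`, so the predictions approach the actual values geometrically.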