In [None]:
1:
  Gradient Boosting Regression is a machine learning algorithm used for regression problems. It works
by building multiple decision trees iteratively, with each new tree attempting to correct the errors 
made by the previous ones. At each iteration, the algorithm calculates the gradient of the loss function 
with respect to the current model's predictions and trains a new tree to predict the residual errors. 
The final predictions are obtained by combining the predictions of all the trees. Gradient Boosting
Regression is known for producing highly accurate and robust models, especially when dealing with large
and complex datasets  

In [None]:
2:
  

In [12]:
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class GradientBoostingRegressor:
    
    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth
        self.trees = []
    
    def fit(self, X, y):
        #initialize the prediction to the mean of target variable
        self.prediction = np.mean(y)*np.ones(X.shape[0])
        
        for i in range(self.n_estimators):
            #Calculate the negative gradient
            residual = y - self.prediction
            
            #Fit a decision tree to the negative gradient
            tree = DecisionTreeRegressor(max_depth=self.max_depth)
            tree.fit(X, residual)
            
            #Make a prediction using the newly fitted decision tree
            prediction_i = tree.predict(X)
            
            #Update the prediction for the next iteration by adding a fraction of the new prediction
            self.prediction += self.learning_rate * prediction_i
            
            #Save the tree for later use
            self.trees.append(tree)
            
    def predict(self, X):
        #Make a prediction by summing the predictions of all decision trees
        prediction = np.zeros(X.shape[0])
        for tree in self.trees:
            prediction += self.learning_rate * tree.predict(X)
        return prediction


In [13]:
import numpy as np

#generate a random dataset
np.random.seed(0)
n_samples = 100
X = np.sort(np.random.rand(n_samples))*5
y = np.sin(X) + np.random.randn(n_samples) * 0.1


In [14]:
from sklearn.model_selection import train_test_split

#split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)


In [15]:
#train the gradient boosting model
gb = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, max_depth=3)
gb.fit(X_train.reshape(-1,1), y_train)

#make predictions on the testing data
y_pred = gb.predict(X_test.reshape(-1,1))


In [16]:
from sklearn.metrics import mean_squared_error, r2_score

print("Mean squared error: %.2f"
      % mean_squared_error(y_test, y_pred))

print('R-squared score: %.2f'
      % r2_score(y_test, y_pred))


Mean squared error: 0.06
R-squared score: 0.88


In [None]:
3:
    

In [None]:
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV

# generate a random dataset
np.random.seed(0)
X = np.random.rand(100, 10)
y = np.random.rand(100)

# split the dataset into training and testing sets
train_X, test_X = X[:80], X[80:]
train_y, test_y = y[:80], y[80:]

# define the parameter grid for grid search
param_grid = {
    'n_estimators': [50, 100, 200],
    'learning_rate': [0.01, 0.1, 0.5],
    'max_depth': [3, 5, 7]
}

# perform grid search with cross-validation
model = GradientBoostingRegressor()
grid_search = GridSearchCV(model, param_grid, cv=5)
grid_search.fit(train_X, train_y)

# evaluate the best model on the test set
best_model = grid_search.best_estimator_
mse = mean_squared_error(test_y, best_model.predict(test_X))
r2 = r2_score(test_y, best_model.predict(test_X))
print(f"Best hyperparameters: {grid_search.best_params_}")
print(f"Test MSE: {mse:.4f}, R^2: {r2:.4f}")


In [None]:
4:
 In Gradient Boosting, a weak learner is a simple and relatively weak model that can only make slightly
better predictions than random guessing. Examples of weak learners include decision stumps, which are 
decision trees with only one split, or linear models with a small number of parameters.
The idea behind Gradient Boosting is to combine multiple weak learners to create a strong and accurate 
model. At each iteration, the algorithm fits a new weak learner to the residual errors of the previous
model, and then adds the two models together. By repeating this process many times, the algorithm can
gradually improve the overall accuracy of the model.
The strength of Gradient Boosting lies in its ability to combine many weak learners into a powerful 
ensemble model that can capture complex relationships in the data.   

In [None]:
5:
    The intuition behind Gradient Boosting algorithm is to iteratively improve the predictions of 
a model by adding new weak learners, which learn from the errors made by the previous models. 
The idea is to build an ensemble of models, where each model is trained to improve the predictions
of the previous model. In this way, the algorithm is able to learn complex relationships between the
input variables and the target variable, and produce highly accurate predictions. 

In [None]:
6:
   Gradient Boosting algorithm builds an ensemble of weak learners by training each learner to predict
the residual errors of the previous model. In other words, the algorithm starts with a simple model, 
such as a decision tree, and then trains a new model to predict the difference between the target variable
and the predictions of the previous model. This process is repeated iteratively until the desired level of
accuracy is achieved.  

In [None]:
7:
   The steps involved in constructing the mathematical intuition of Gradient Boosting algorithm are as follows:
1. Initialize the model with a constant value, such as the mean of the target variable.
2. Calculate the gradient of the loss function with respect to the current model's predictions.
3. Train a new model to predict the gradient, or error, of the previous model's predictions.
4. Combine the predictions of the previous model and the new model to obtain the updated predictions.
5. Repeat steps 2-4 until the desired level of accuracy is achieved.

The idea behind this process is to iteratively improve the model by learning from the errors made by
the previous models. The final model is an ensemble of weak learners, which work together to produce
highly accurate predictions.
