# Q1.
### What is Gradient Boosting Regression?

- Gradient Boosting Regression is a machine learning technique for regression tasks that combines an ensemble of weak prediction models, typically shallow decision trees, into a strong predictive model. Trees are added sequentially, and each new tree is trained to correct the errors of the ensemble built so far. The "gradient" in Gradient Boosting refers to gradient descent on the loss function: each new tree is fit to the negative gradient of the loss with respect to the current predictions, so adding a tree acts as one descent step.
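
As a rough illustration of the sequential behaviour described above, the sketch below uses scikit-learn's `GradientBoostingRegressor` and its `staged_predict` method to track the training error as trees are added. The hyperparameter values are arbitrary choices for illustration, and the data is the same five-point toy set used in the later questions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Toy data: the five points used elsewhere in this notebook.
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1.5, 3.2, 2.8, 5.0, 7.2])

# Illustrative settings only: 20 decision stumps with a small learning rate.
gbr = GradientBoostingRegressor(
    n_estimators=20, learning_rate=0.1, max_depth=1, random_state=0
)
gbr.fit(X, y)

# staged_predict yields the ensemble's prediction after each added tree.
train_mse = [mean_squared_error(y, y_stage) for y_stage in gbr.staged_predict(X)]

for n_trees in (1, 5, 10, 20):
    print(f"{n_trees:2d} trees -> training MSE {train_mse[n_trees - 1]:.4f}")
```

The training error should shrink as trees are added, which is the "each new tree corrects the previous ones" behaviour in practice. It says nothing about performance on unseen data.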

# Q2.
### Implement a simple gradient boosting algorithm from scratch using Python and NumPy. Use a simple regression problem as an example and train the model on a small dataset. Evaluate the model's performance using metrics such as mean squared error and R-squared.

In [1]:
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, r2_score

class SimpleGradientBoostingRegressor:
    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth
        self.models = []

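    def fit(self, X, y):
        self.models = []
        # Start from a zero prediction, so the first residuals are the targets
        # themselves (initializing with y.mean() is the more common choice).
        # The float cast keeps the in-place update below valid for integer targets.
        residuals = np.array(y, dtype=float)

        for _ in range(self.n_estimators):
            # Fit a shallow tree to the current residuals.
            model = DecisionTreeRegressor(max_depth=self.max_depth)
            model.fit(X, residuals)
            predictions = model.predict(X)
            # Remove the scaled contribution of this tree from the residuals.
            residuals -= self.learning_rate * predictions
            self.models.append(model)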

    def predict(self, X):
        y_pred = np.zeros(X.shape[0])
        for model in self.models:
            y_pred += self.learning_rate * model.predict(X)
        return y_pred

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1.5, 3.2, 2.8, 5.0, 7.2])

# Train model
model = SimpleGradientBoostingRegressor(n_estimators=10, learning_rate=0.1, max_depth=2)
model.fit(X, y)

# Predictions
predictions = model.predict(X)

# Evaluate model on the training data (no held-out set is used here)
print(mean_squared_error(y, predictions))
print(r2_score(y, predictions))

2.3767754788396735
0.39219121347185104


# Q3.
### Experiment with different hyperparameters such as learning rate, number of trees, and tree depth to optimise the performance of the model. Use grid search or random search to find the best hyperparameters.

In [2]:
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import GradientBoostingRegressor

In [3]:
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1.5, 3.2, 2.8, 5.0, 7.2])

In [4]:
param_grid = {
    'n_estimators': [10, 50, 100],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [2, 3, 4]
}


In [5]:
model = GradientBoostingRegressor()
grid_search = GridSearchCV(estimator=model, param_grid=param_grid, scoring='neg_mean_squared_error', cv=3)
grid_search.fit(X, y)

In [8]:
best_params = grid_search.best_params_
best_params

{'learning_rate': 0.2, 'max_depth': 3, 'n_estimators': 100}

In [9]:
best_model = grid_search.best_estimator_
best_model

In [10]:
# Scores below are computed on the training data, so they measure fit, not generalization.
predictions = best_model.predict(X)
print(mean_squared_error(y, predictions))
print(r2_score(y, predictions))

2.1169863720527904e-16
1.0
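
The question also mentions random search. A minimal sketch, assuming the same candidate values as the grid above, uses scikit-learn's `RandomizedSearchCV` to sample a subset of the combinations instead of trying all of them. The values of `n_iter` and `random_state` are arbitrary choices made here for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1.5, 3.2, 2.8, 5.0, 7.2])

# Same candidate values as the grid search, sampled rather than enumerated.
param_distributions = {
    'n_estimators': [10, 50, 100],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [2, 3, 4],
}

random_search = RandomizedSearchCV(
    estimator=GradientBoostingRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=8,  # evaluate 8 of the 27 possible combinations
    scoring='neg_mean_squared_error',
    cv=3,
    random_state=0,
)
random_search.fit(X, y)

print(random_search.best_params_)
# best_score_ is the mean cross-validated negated MSE, so flip the sign.
print(-random_search.best_score_)
```

Unlike the training-set scores printed above, `best_score_` is averaged over held-out folds, so it is usually a more honest estimate of how the chosen hyperparameters generalize. With only five samples it will still be very noisy.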


# Q4.
### What is a weak learner in Gradient Boosting?

- A weak learner in Gradient Boosting is a simple model that performs only slightly better than a trivial baseline (random guessing in classification, or always predicting the mean in regression). It has limited predictive power on its own, but many weak learners combined can form a much stronger model. In Gradient Boosting, shallow decision trees are the usual choice of weak learner.
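
To make this concrete, the sketch below compares a decision stump (a tree with a single split, used here as an example of a weak learner) against the trivial baseline of always predicting the mean. The toy data matches the rest of the notebook; the choice of `max_depth=1` is an assumption for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1.5, 3.2, 2.8, 5.0, 7.2])

# Trivial baseline: always predict the mean of the targets.
baseline_pred = np.full_like(y, y.mean())
baseline_mse = mean_squared_error(y, baseline_pred)

# Weak learner: a decision stump (one split, two leaves).
stump = DecisionTreeRegressor(max_depth=1)
stump.fit(X, y)
stump_mse = mean_squared_error(y, stump.predict(X))

print(f"baseline MSE: {baseline_mse:.3f}")
print(f"stump MSE:    {stump_mse:.3f}")
```

The stump beats the baseline on the training data but can only output two distinct values, which is why boosting stacks many such learners.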

# Q5.
### What is the intuition behind the Gradient Boosting algorithm?

- The intuition behind Gradient Boosting is to add models to an ensemble one at a time, with each new model trained to correct the errors made by the models before it. Each new model is fit to the negative gradient of the loss function with respect to the ensemble's current predictions. For squared-error loss this negative gradient is exactly the residual (target minus current prediction), so repeatedly fitting and subtracting residuals shrinks the training error step by step.
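
The link between "residuals" and "negative gradient" can be checked in one line for squared-error loss. The factor of one half is a common convention assumed here so that the derivative has no leading constant:

```latex
L\bigl(y, F(x)\bigr) = \tfrac{1}{2}\,\bigl(y - F(x)\bigr)^2
\quad\Longrightarrow\quad
-\frac{\partial L}{\partial F(x)} = y - F(x)
```

For other losses the negative gradient is no longer the plain residual, which is why it is often called a pseudo-residual.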

# Q6.
### How does the Gradient Boosting algorithm build an ensemble of weak learners?

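- Initialize the model with a constant prediction, typically the mean of the target values.
- Compute the residuals (the negative gradient of the loss) of the current ensemble.
- Fit a weak learner, usually a shallow decision tree, to those residuals.
- Update the model by adding the weak learner's predictions, scaled by the learning rate.
- Repeat the residual, fit, and update steps for the chosen number of iterations.
- Final model: the sum of the initial model and the scaled predictions of all the weak learners.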
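
The steps above can be sketched in a few lines of Python for squared-error loss. The function names `fit_boosted_trees` and `predict_boosted_trees` are hypothetical helpers introduced for this illustration, and the default hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_boosted_trees(X, y, n_estimators=50, learning_rate=0.1, max_depth=2):
    """Return (initial_prediction, trees) for squared-error boosting."""
    initial_prediction = float(np.mean(y))           # initialize the model
    current = np.full(len(y), initial_prediction)
    trees = []
    for _ in range(n_estimators):                    # repeat
        residuals = y - current                      # compute residuals
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)                       # fit a weak learner
        current += learning_rate * tree.predict(X)   # update the model
        trees.append(tree)
    return initial_prediction, trees

def predict_boosted_trees(X, initial_prediction, trees, learning_rate=0.1):
    # Final model: initial prediction plus the scaled output of every tree.
    y_pred = np.full(X.shape[0], initial_prediction)
    for tree in trees:
        y_pred += learning_rate * tree.predict(X)
    return y_pred

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1.5, 3.2, 2.8, 5.0, 7.2])

f0, trees = fit_boosted_trees(X, y)
y_pred = predict_boosted_trees(X, f0, trees)
print(np.mean((y - y_pred) ** 2))  # training MSE
```

The same `learning_rate` must be used for fitting and prediction, since the stored trees were trained against residuals that assume that scaling.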

# Q7.
### What are the steps involved in constructing the mathematical intuition of the Gradient Boosting algorithm?

- Initialize with a simple model, such as the mean of the target values.
- Compute residuals for the current model.
- Fit a weak learner to the residuals.
- Update the model by adding the weak learner's predictions, scaled by a learning rate.
- Repeat the process for a specified number of iterations.
- Combine all weak learners to form the final model.
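
Written out for squared-error loss, the steps above might look like the following. The symbols are notation assumed here rather than taken from the text: $F_m$ is the ensemble after $m$ rounds, $h_m$ the weak learner fitted in round $m$, $\nu$ the learning rate, $M$ the number of rounds, and $n$ the number of training points.

```latex
\begin{align*}
F_0(x) &= \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i
  && \text{(initialize with the mean)} \\
r_i^{(m)} &= y_i - F_{m-1}(x_i)
  && \text{(residuals of the current model)} \\
h_m &= \arg\min_{h} \sum_{i=1}^{n} \bigl(r_i^{(m)} - h(x_i)\bigr)^2
  && \text{(fit a weak learner to the residuals)} \\
F_m(x) &= F_{m-1}(x) + \nu\, h_m(x)
  && \text{(scaled update)} \\
F_M(x) &= F_0(x) + \nu \sum_{m=1}^{M} h_m(x)
  && \text{(final model)}
\end{align*}
```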