<div class="alert alert-block alert-info" align="center" style="padding: 10px;">
<h1><b><u>Boosting-2</u></b></h1>
</div>

## Q1. What is Gradient Boosting Regression?

Gradient Boosting Regression is a machine learning algorithm that applies gradient boosting to regression tasks; the same boosting framework also handles classification when paired with a classification loss. It builds a predictive model by sequentially combining the predictions of multiple weak learners, typically shallow decision trees. The key idea is to reduce the model's error at each step by fitting a new weak learner to the residuals of the current ensemble. This process continues until a predefined number of iterations is reached or until the model reaches a target level of performance.

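**Here are some examples of problems that can be solved with Gradient Boosting Regression:**

* Predicting house prices
* Predicting a numeric churn-related quantity, such as a customer's expected remaining lifetime
* Predicting a numeric fraud-related quantity, such as the expected loss amount of a transaction

Predicting whether a customer churns or whether a transaction is fraudulent is a yes/no outcome, so those versions of the problems are classification tasks and call for the classification variant of gradient boosting.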
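
The sequential fitting described above can be observed with scikit-learn's `GradientBoostingRegressor`. The sketch below is illustrative only: the synthetic data and the names `X_demo`, `y_demo`, and `stage_mse` are assumptions made for this example. It uses `staged_predict` to read off the ensemble's training error after each added tree.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Synthetic data (an assumption for illustration): noisy linear target
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 1, size=(200, 1))
y_demo = 4 * X_demo.squeeze() + rng.normal(0, 1, size=200)

# 50 decision stumps added one after another
model = GradientBoostingRegressor(
    n_estimators=50, learning_rate=0.1, max_depth=1, random_state=0
)
model.fit(X_demo, y_demo)

# staged_predict yields the ensemble's prediction after each added tree
stage_mse = [
    mean_squared_error(y_demo, stage_pred)
    for stage_pred in model.staged_predict(X_demo)
]

print(f"Training MSE after  1 tree : {stage_mse[0]:.3f}")
print(f"Training MSE after 50 trees: {stage_mse[-1]:.3f}")
```

The training error falls as trees are added; whether the same holds on unseen data depends on the learning rate and the number of trees.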

## Q2. Implement a simple gradient boosting algorithm from scratch using Python and NumPy. Use a simple regression problem as an example and train the model on a small dataset. Evaluate the model's performance using metrics such as mean squared error and R-squared.

In [1]:
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Generate synthetic data
np.random.seed(0)
X = np.random.rand(100, 1)
y = 4 * X.squeeze() + np.random.randn(100)

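# Initialize the ensemble's predictions F at zero; each round fits a tree
# to the residuals y - F (starting from the mean of y is a common alternative)
F = np.zeros_like(y)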

# Set the number of boosting iterations
n_iterations = 100
learning_rate = 0.1

# Create weak learners
weak_learners = []

for _ in range(n_iterations):
    # Calculate residuals
    residuals = y - F
    weak_learner = DecisionTreeRegressor(max_depth=1)
    weak_learner.fit(X, residuals)
    F += learning_rate * weak_learner.predict(X)
    weak_learners.append(weak_learner)

# Predict with the ensemble: sum the scaled outputs of all weak learners.
# This reproduces F and is evaluated on the training data itself.
y_pred = sum(learning_rate * weak_learner.predict(X) for weak_learner in weak_learners)

# Calculate mean squared error
mse = mean_squared_error(y, y_pred)
print(f"Mean Squared Error: {mse:.2f}")

# Calculate R-squared
r_squared = r2_score(y, y_pred)
print(f"R-squared: {r_squared:.2f}")

Mean Squared Error: 0.82
R-squared: 0.64
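
The scores above are measured on the same data the model was trained on, so they are optimistic. As a hedged variation, the sketch below wraps the same loop in two hypothetical helpers, `fit_gradient_boosting` and `predict_gradient_boosting` (names invented for this example), starts from the mean of the targets instead of zero, and evaluates on a held-out split. The data follows the same form as above but is drawn with a different random generator, so the numbers will differ.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def fit_gradient_boosting(X, y, n_iterations=100, learning_rate=0.1, max_depth=1):
    """Fit boosted trees for squared-error loss; returns (initial value, trees)."""
    initial_prediction = float(np.mean(y))  # constant starting model
    F = np.full(len(y), initial_prediction)
    trees = []
    for _ in range(n_iterations):
        residuals = y - F  # what the current ensemble still gets wrong
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)
        F += learning_rate * tree.predict(X)
        trees.append(tree)
    return initial_prediction, trees


def predict_gradient_boosting(X, initial_prediction, trees, learning_rate=0.1):
    """Initial constant plus the learning-rate-scaled sum of tree outputs."""
    prediction = np.full(X.shape[0], initial_prediction)
    for tree in trees:
        prediction += learning_rate * tree.predict(X)
    return prediction


# Noisy linear data with a held-out test split
rng = np.random.default_rng(0)
X_all = rng.uniform(0, 1, size=(200, 1))
y_all = 4 * X_all.squeeze() + rng.normal(0, 1, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X_all, y_all, test_size=0.25, random_state=0
)

f0, trees = fit_gradient_boosting(X_train, y_train)
y_test_pred = predict_gradient_boosting(X_test, f0, trees)

print(f"Test MSE: {mean_squared_error(y_test, y_test_pred):.2f}")
print(f"Test R-squared: {r2_score(y_test, y_test_pred):.2f}")
```

The learning rate passed to the prediction helper must match the one used during fitting, since the trees were trained against predictions scaled by that value.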


## Q3. Experiment with different hyperparameters such as learning rate, number of trees, and tree depth to optimise the performance of the model. Use grid search or random search to find the best hyperparameters.

In [2]:
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Create a Gradient Boosting Regressor
gbm = GradientBoostingRegressor()

# Define hyperparameter grid
param_grid = {
    'n_estimators': [50, 100],
    'learning_rate': [0.1, 0.2],
    'max_depth': [1, 2]
}

# Create a grid search object
grid_search = GridSearchCV(gbm, param_grid, cv=5, scoring='neg_mean_squared_error', n_jobs=-1)
grid_search.fit(X, y)

# Get the best hyperparameters
best_params = grid_search.best_params_
print("Best Hyperparameters:", best_params)

# Get the best model
best_model = grid_search.best_estimator_

# Evaluate the best model on the training data
# (optimistic: no held-out set is used here)
y_pred = best_model.predict(X)
mse = ((y - y_pred) ** 2).mean()
r_squared = best_model.score(X, y)
print(f"Mean Squared Error: {mse:.2f}")
print(f"R-squared: {r_squared:.2f}")

Best Hyperparameters: {'learning_rate': 0.1, 'max_depth': 1, 'n_estimators': 50}
Mean Squared Error: 0.85
R-squared: 0.63
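
The question also mentions random search. A minimal sketch with scikit-learn's `RandomizedSearchCV` follows; the candidate values, the data, and the names `X_rs`, `y_rs`, and `param_distributions` are illustrative assumptions, not tuned recommendations. Random search samples a fixed number of combinations (`n_iter`) instead of trying every one, which helps when the grid is large.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Noisy linear data, split so the final score comes from unseen points
rng = np.random.default_rng(0)
X_rs = rng.uniform(0, 1, size=(200, 1))
y_rs = 4 * X_rs.squeeze() + rng.normal(0, 1, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X_rs, y_rs, test_size=0.25, random_state=0
)

# Candidate values are illustrative assumptions
param_distributions = {
    "n_estimators": [25, 50, 100, 200],
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "max_depth": [1, 2, 3],
}

# Sample 8 of the 48 possible combinations, scoring each with 5-fold CV
random_search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions,
    n_iter=8,
    cv=5,
    scoring="neg_mean_squared_error",
    random_state=0,
)
random_search.fit(X_train, y_train)

print("Best hyperparameters:", random_search.best_params_)
test_mse = mean_squared_error(y_test, random_search.predict(X_test))
print(f"Held-out MSE: {test_mse:.2f}")
```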


## Q4. What is a weak learner in Gradient Boosting?

A weak learner in Gradient Boosting is a simple base model whose predictions are only modestly better than a trivial baseline: random guessing in classification, or always predicting the mean in regression. It is not a strong or highly accurate model on its own. Weak learners are often simple models, such as decision stumps (one-split trees) or linear models. Because each learner captures only part of the pattern, Gradient Boosting can fit the next learner to the errors that remain and improve the predictions iteratively.
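
To make the contrast concrete, the sketch below compares a single decision stump with an ensemble of boosted stumps on a synthetic sine-shaped target. The data and the names `X_wl`, `y_wl`, `stump`, and `boosted` are assumptions made for this example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Nonlinear target that a single split cannot capture well
rng = np.random.default_rng(0)
X_wl = rng.uniform(0, 1, size=(300, 1))
y_wl = np.sin(2 * np.pi * X_wl.squeeze()) + rng.normal(0, 0.1, size=300)
X_train, X_test, y_train, y_test = train_test_split(
    X_wl, y_wl, test_size=0.25, random_state=0
)

# One weak learner on its own: a tree with a single split
stump = DecisionTreeRegressor(max_depth=1).fit(X_train, y_train)

# Many of the same weak learners combined by gradient boosting
boosted = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.1, max_depth=1, random_state=0
).fit(X_train, y_train)

stump_r2 = stump.score(X_test, y_test)      # R-squared on held-out data
boosted_r2 = boosted.score(X_test, y_test)

print(f"Single stump   R-squared: {stump_r2:.2f}")
print(f"Boosted stumps R-squared: {boosted_r2:.2f}")
```

A single stump can only output two distinct values, while the boosted ensemble builds a finer step-wise approximation of the curve from many such stumps.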

## Q5. What is the intuition behind the Gradient Boosting algorithm?

The intuition behind the Gradient Boosting algorithm is to build a strong predictive model by sequentially combining the predictions of multiple weak learners.

Here's a high-level intuition:

1. Start with an initial prediction, such as the mean of the target values.
2. Fit a weak learner to the residuals, that is, the differences between the targets and the current predictions.
3. Update the predictions by adding a fraction (the learning rate) of the weak learner's predictions to the current predictions.
4. Repeat steps 2 and 3 iteratively, each time focusing on the errors left by the previous models.
5. The final prediction is the initial prediction plus the scaled predictions of all weak learners.

Each step reduces the error on the training data, so the ensemble gradually becomes a strong predictive model.
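
The claim that the error shrinks at each step can be checked directly. The following is a minimal sketch under stated assumptions: squared-error loss, decision stumps as weak learners, and synthetic data; the names `X_int`, `y_int`, `F_int`, and `mse_history` are invented for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (an assumption for illustration)
rng = np.random.default_rng(0)
X_int = rng.uniform(0, 1, size=(100, 1))
y_int = 4 * X_int.squeeze() + rng.normal(0, 1, size=100)

step_size = 0.1  # the learning rate
F_int = np.full(len(y_int), y_int.mean())  # initial prediction
mse_history = [np.mean((y_int - F_int) ** 2)]

for _ in range(20):
    residuals = y_int - F_int
    # Fit a weak learner to the residuals
    stump = DecisionTreeRegressor(max_depth=1).fit(X_int, residuals)
    # Move the predictions a fraction of the way toward the targets
    F_int += step_size * stump.predict(X_int)
    mse_history.append(np.mean((y_int - F_int) ** 2))

for round_index in (0, 1, 5, 10, 20):
    print(f"round {round_index:2d}: training MSE = {mse_history[round_index]:.4f}")
```

With squared-error loss, each stump predicts the mean residual of its leaf, so adding a fraction of it cannot increase the training error. This says nothing about error on unseen data, which can start rising once the ensemble overfits.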

## Q6. How does the Gradient Boosting algorithm build an ensemble of weak learners?

Gradient Boosting builds an ensemble of weak learners by iteratively training and combining them. The process can be summarized as follows:

* Initialize the ensemble with a constant prediction.
* For each of a predefined number of iterations:
    * Calculate the negative gradient of the loss with respect to the current predictions. This represents the direction and magnitude of the error to be corrected.
    * Fit a weak learner to the negative gradient to predict how the current model's predictions should be adjusted.
    * Update the ensemble's predictions by adding a fraction (the learning rate) of the weak learner's predictions.

The final prediction is the initial constant plus the scaled sum of predictions from all weak learners, which collectively form a strong predictive model.
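
Fitting to the negative gradient rather than to plain residuals is what lets the same loop work with different loss functions. The sketch below illustrates this with a hypothetical helper `boost` that accepts the negative gradient as a function; all names and the data are assumptions for this example. It is a simplification: for absolute-error loss, full implementations typically also re-estimate each leaf's value after fitting the tree, which is omitted here.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def negative_gradient_squared(y, F):
    # L = 0.5 * (y - F)^2  ->  -dL/dF = y - F (the ordinary residual)
    return y - F


def negative_gradient_absolute(y, F):
    # L = |y - F|  ->  -dL/dF = sign(y - F)
    return np.sign(y - F)


def boost(X, y, negative_gradient, initial_value, n_rounds=100, learning_rate=0.1):
    """Hypothetical helper: fit trees to the negative gradient of a chosen loss."""
    F = np.full(len(y), float(initial_value))  # constant starting prediction
    trees = []
    for _ in range(n_rounds):
        pseudo_residuals = negative_gradient(y, F)
        tree = DecisionTreeRegressor(max_depth=2).fit(X, pseudo_residuals)
        F += learning_rate * tree.predict(X)  # take a small step along the fit
        trees.append(tree)
    return F, trees


rng = np.random.default_rng(0)
X_ng = rng.uniform(0, 1, size=(200, 1))
y_ng = 4 * X_ng.squeeze() + rng.normal(0, 1, size=200)

# The mean minimizes squared error; the median minimizes absolute error
F_sq, _ = boost(X_ng, y_ng, negative_gradient_squared, initial_value=y_ng.mean())
F_abs, _ = boost(X_ng, y_ng, negative_gradient_absolute, initial_value=np.median(y_ng))

print(f"Squared loss : training MSE = {np.mean((y_ng - F_sq) ** 2):.3f}")
print(f"Absolute loss: training MAE = {np.mean(np.abs(y_ng - F_abs)):.3f}")
```

For squared error the negative gradient is the residual itself, which is why the residual-fitting description and the gradient-based description coincide in that case.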

## Q7. What are the steps involved in constructing the mathematical intuition of the Gradient Boosting algorithm?

Here are the key steps involved in constructing the mathematical intuition of the Gradient Boosting algorithm:

1. Define a loss function to measure prediction errors (e.g., mean squared error).
2. Initialize predictions with a constant value (e.g., mean of target values).
3. Calculate the negative gradient of the loss function with respect to current predictions.
4. Train a weak learner (e.g., decision tree) to predict the negative gradient (correction).
5. Update predictions by adding a fraction (learning rate) of the weak learner's predictions.
6. Repeat steps 3-5 iteratively for a predefined number of boosting rounds.

The final prediction is the initial constant plus the learning-rate-scaled sum of predictions from all weak learners, forming a strong predictive model.
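
The steps above can be written out for the squared-error case. The notation is an assumption introduced here for illustration: $F_m$ is the ensemble after $m$ rounds, $h_m$ is the weak learner fitted in round $m$, $\nu$ is the learning rate, $M$ is the number of boosting rounds, and $n$ is the number of training points. The loss carries a factor of $\tfrac{1}{2}$ so that the negative gradient equals the residual exactly; without it the gradient is simply twice as large.

```latex
% Step 1: loss function (squared error with a convenience factor of 1/2)
L\bigl(y, F(x)\bigr) = \tfrac{1}{2}\,\bigl(y - F(x)\bigr)^{2}

% Step 2: the best constant prediction is the mean of the targets
F_{0}(x) = \arg\min_{c} \sum_{i=1}^{n} L(y_i, c) = \frac{1}{n} \sum_{i=1}^{n} y_i

% Step 3: negative gradient at the current predictions (the residual)
r_{im} = -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F = F_{m-1}}
       = y_i - F_{m-1}(x_i)

% Step 4: fit the weak learner h_m to the pairs (x_i, r_{im})

% Step 5: update with learning rate \nu
F_{m}(x) = F_{m-1}(x) + \nu\, h_{m}(x)

% After M rounds
F_{M}(x) = F_{0}(x) + \nu \sum_{m=1}^{M} h_{m}(x)
```

For other losses only the expression for $r_{im}$ changes; the rest of the procedure stays the same.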
