In [1]:
#Q1. What is Gradient Boosting Regression?
'''Gradient Boosting Regression is an ensemble learning technique for regression tasks, where the goal is to predict a continuous output variable. It builds the model sequentially: each new weak learner (usually a shallow decision tree) is fit to the errors of the current ensemble, and the scaled predictions of all learners are added together to form a strong learner.'''

'Gradient Boosting Regression is an ensemble learning technique for regression tasks, where the goal is to predict a continuous output variable. It builds the model sequentially: each new weak learner (usually a shallow decision tree) is fit to the errors of the current ensemble, and the scaled predictions of all learners are added together to form a strong learner.'

In [2]:
#Q2. Implement a simple gradient boosting algorithm from scratch using Python and NumPy. Use a simple regression problem as an example and 
#train the model on a small dataset. Evaluate the model's performance using metrics such as mean squared error and R-squared.
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.tree import DecisionTreeRegressor
# Generate a simple dataset
np.random.seed(42)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = 2 * X.squeeze() + np.random.normal(0, 1, X.shape[0])
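class GradientBoostingRegressor:
    """Minimal gradient boosting for squared-error regression using decision trees."""

    def __init__(self, n_estimators=100, learning_rate=0.1, max_depth=3):
        self.n_estimators = n_estimators
        self.learning_rate = learning_rate
        self.max_depth = max_depth
        self.models = []
        self.initial_prediction = None

    def fit(self, X, y):
        # Reset so that calling fit() again does not keep trees from an earlier fit
        self.models = []
        # Initialize the model with the mean of y (the best constant under squared error)
        self.initial_prediction = np.mean(y)
        current_prediction = np.full_like(y, self.initial_prediction, dtype=np.float64)

        for _ in range(self.n_estimators):
            # For squared error, the residuals are the negative gradient of the loss
            residuals = y - current_prediction
            # Fit a shallow tree (the weak learner) to the residuals
            model = DecisionTreeRegressor(max_depth=self.max_depth)
            model.fit(X, residuals)
            # Move the current prediction a small step towards the tree's output
            current_prediction += self.learning_rate * model.predict(X)
            self.models.append(model)
        return self

    def predict(self, X):
        # Start from the initial constant and add each tree's scaled contribution
        prediction = np.full(X.shape[0], self.initial_prediction, dtype=np.float64)
        for model in self.models:
            prediction += self.learning_rate * model.predict(X)
        return prediction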
# Create and train the model
gb_model = GradientBoostingRegressor(n_estimators=50, learning_rate=0.1)
gb_model.fit(X, y)
# Make predictions on the training data (the metrics below are therefore likely optimistic)
y_pred = gb_model.predict(X)

# Compute metrics
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)

print(f"Mean Squared Error: {mse:.4f}")
print(f"R-squared: {r2:.4f}")


Mean Squared Error: 0.2797
R-squared: 0.9921
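
The metrics above are computed on the same data the model was trained on, so they probably overstate how well the model generalises. As a sanity check, the sketch below holds out part of the data and fits scikit-learn's own GradientBoostingRegressor (imported under the alias SkGradientBoostingRegressor to avoid clashing with the class defined above) with the same settings. The 80/20 split and the random_state values are assumptions chosen for illustration, not part of the exercise.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor as SkGradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Same synthetic dataset as above
np.random.seed(42)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = 2 * X.squeeze() + np.random.normal(0, 1, X.shape[0])

# Hold out 20% of the points for evaluation (assumed split, for illustration)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Same hyperparameters as the from-scratch model: 50 trees, learning rate 0.1, depth 3
sk_model = SkGradientBoostingRegressor(
    n_estimators=50, learning_rate=0.1, max_depth=3, random_state=0
)
sk_model.fit(X_train, y_train)

train_mse = mean_squared_error(y_train, sk_model.predict(X_train))
test_mse = mean_squared_error(y_test, sk_model.predict(X_test))
test_r2 = r2_score(y_test, sk_model.predict(X_test))

print(f"Train MSE: {train_mse:.4f}")
print(f"Test MSE:  {test_mse:.4f}")
print(f"Test R-squared: {test_r2:.4f}")
```

If the held-out error turns out clearly higher than the training error, that gap is a sign the training metrics were too flattering.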


In [3]:
#Q3. Experiment with different hyperparameters such as learning rate, number of trees, and tree depth to optimise the performance of the 
#model. Use grid search or random search to find the best hyperparameters
'''Considerations
Computational Resources: Grid search can be computationally expensive. Random search can be more efficient, especially with large hyperparameter spaces.
Cross-Validation: Use cross-validation to ensure that your hyperparameter tuning is robust and not overfitting to the training data.
Search Space: The range of hyperparameters should be chosen based on domain knowledge or preliminary experiments.'''

'Considerations\nComputational Resources: Grid search can be computationally expensive. Random search can be more efficient, especially with large hyperparameter spaces.\nCross-Validation: Use cross-validation to ensure that your hyperparameter tuning is robust and not overfitting to the training data.\nSearch Space: The range of hyperparameters should be chosen based on domain knowledge or preliminary experiments.'
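
The considerations above can be turned into an actual search. The following is a minimal sketch using scikit-learn's GridSearchCV with 5-fold cross-validation. It tunes scikit-learn's GradientBoostingRegressor rather than the from-scratch class, because that class does not implement the estimator interface GridSearchCV expects. The grid values, the 80/20 split and the random_state values are assumptions picked to keep the run short, not recommended settings.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor as SkGradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, KFold, train_test_split

# Same synthetic dataset as in Q2
np.random.seed(42)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = 2 * X.squeeze() + np.random.normal(0, 1, X.shape[0])

# Keep a test set that the search never sees
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Illustrative search space: learning rate, number of trees, tree depth
param_grid = {
    "learning_rate": [0.01, 0.1, 0.5],
    "n_estimators": [50, 100],
    "max_depth": [1, 2, 3],
}

# Shuffle the folds so each one covers the whole range of X
cv = KFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    SkGradientBoostingRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_squared_error",  # scikit-learn maximises scores, hence the sign
    cv=cv,
)
search.fit(X_train, y_train)

print("Best hyperparameters:", search.best_params_)
print(f"Best cross-validated MSE: {-search.best_score_:.4f}")

# Evaluate the refitted best model on the held-out test set
y_test_pred = search.best_estimator_.predict(X_test)
test_mse = mean_squared_error(y_test, y_test_pred)
test_r2 = r2_score(y_test, y_test_pred)
print(f"Test MSE: {test_mse:.4f}")
print(f"Test R-squared: {test_r2:.4f}")
```

For a larger search space, RandomizedSearchCV can be swapped in; it takes param_distributions and n_iter in place of the grid and samples a fixed number of combinations.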

In [4]:
#Q4. What is a weak learner in Gradient Boosting?
'''In Gradient Boosting, a weak learner is a model that performs only slightly better than a trivial baseline: random guessing in classification, or always predicting the mean in regression. Weak learners are typically simple models, most often shallow decision trees, that contribute to the overall performance of the boosting algorithm through iterative improvement.'''

'In Gradient Boosting, a weak learner is a model that performs only slightly better than a trivial baseline: random guessing in classification, or always predicting the mean in regression. Weak learners are typically simple models, most often shallow decision trees, that contribute to the overall performance of the boosting algorithm through iterative improvement.'
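
To make the idea of a weak learner concrete, the sketch below compares a single decision stump (a tree of depth 1) with a boosted ensemble of such stumps on the Q2 dataset. The choice of 200 stumps, the learning rate of 0.1 and the 80/20 split are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor as SkGradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Same synthetic dataset as in Q2
np.random.seed(42)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = 2 * X.squeeze() + np.random.normal(0, 1, X.shape[0])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# A single stump: one split, so it can only predict two distinct values
stump = DecisionTreeRegressor(max_depth=1)
stump.fit(X_train, y_train)

# Many stumps combined by gradient boosting
boosted = SkGradientBoostingRegressor(
    n_estimators=200, learning_rate=0.1, max_depth=1, random_state=0
)
boosted.fit(X_train, y_train)

r2_stump = r2_score(y_test, stump.predict(X_test))
r2_boosted = r2_score(y_test, boosted.predict(X_test))

print(f"Single stump   R-squared: {r2_stump:.4f}")
print(f"Boosted stumps R-squared: {r2_boosted:.4f}")
```

On its own the stump is a coarse step function; the ensemble adds up many such steps to approximate the underlying linear trend.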

In [5]:
#Q5. What is the intuition behind the Gradient Boosting algorithm?
'''The intuition behind the Gradient Boosting algorithm is to build a strong predictive model by combining multiple weak learners in an iterative manner, where each new learner corrects the errors of the previous ones.'''

'The intuition behind the Gradient Boosting algorithm is to build a strong predictive model by combining multiple weak learners in an iterative manner, where each new learner corrects the errors of the previous ones.'
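
One way to see each learner correcting the errors of the previous ones is to track the training error as trees are added. The sketch below uses staged_predict from scikit-learn's GradientBoostingRegressor, which yields the ensemble's prediction after each boosting stage. The hyperparameters mirror Q2; the checkpoints printed are an arbitrary choice.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor as SkGradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Same synthetic dataset as in Q2
np.random.seed(42)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = 2 * X.squeeze() + np.random.normal(0, 1, X.shape[0])

model = SkGradientBoostingRegressor(
    n_estimators=50, learning_rate=0.1, max_depth=3, random_state=0
)
model.fit(X, y)

# Training MSE of the ensemble after 1, 2, ..., 50 trees
staged_mse = [
    mean_squared_error(y, y_stage) for y_stage in model.staged_predict(X)
]

for n_trees in (1, 5, 10, 25, 50):
    print(f"{n_trees:>2} trees: training MSE = {staged_mse[n_trees - 1]:.4f}")
```

Note that this tracks training error only; error on held-out data can stop improving, or get worse, well before the training error does.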

In [6]:
#Q6. How does Gradient Boosting algorithm build an ensemble of weak learners?
'''Gradient Boosting builds an ensemble by iteratively adding weak learners, each focused on correcting the errors of the previous model. This process improves the model's performance incrementally, with each weak learner contributing to the final prediction by addressing residual errors. The combination of all weak learners forms a powerful ensemble that captures complex patterns in the data.'''

"Gradient Boosting builds an ensemble by iteratively adding weak learners, each focused on correcting the errors of the previous model. This process improves the model's performance incrementally, with each weak learner contributing to the final prediction by addressing residual errors. The combination of all weak learners forms a powerful ensemble that captures complex patterns in the data."

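The additive construction described above can be written compactly. This is a sketch for the squared-error case only. The symbols are notation introduced here rather than taken from the exercise: F_m is the ensemble after m rounds, h_m is the m-th weak learner, \nu is the learning rate, M is the number of rounds, and r_i^{(m)} is the residual of sample i in round m.

```latex
\begin{align}
F_0(x) &= \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \\
r_i^{(m)} &= y_i - F_{m-1}(x_i)
  = -\left.\frac{\partial}{\partial F}\,\tfrac{1}{2}\bigl(y_i - F\bigr)^2\right|_{F = F_{m-1}(x_i)} \\
F_m(x) &= F_{m-1}(x) + \nu\, h_m(x),
  \qquad h_m \text{ fit to } \bigl\{(x_i, r_i^{(m)})\bigr\}_{i=1}^{n} \\
F_M(x) &= F_0(x) + \nu \sum_{m=1}^{M} h_m(x)
\end{align}
```

The second line is where the name comes from: for squared error, the residuals are exactly the negative gradient of the loss with respect to the current prediction, so fitting a learner to them is a gradient descent step in function space. This matches the Q2 code, where the initial prediction is np.mean(y) and the residuals are y - current_prediction.
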
In [7]:
#Q7. What are the steps involved in constructing the mathematical intuition of Gradient Boosting algorithm?
'''1. Initialize the model with a constant prediction (for squared-error loss, the mean of the targets).
2. Compute the residuals, i.e. the errors of the current model; for squared-error loss these equal the negative gradient of the loss.
3. Fit a weak learner to these residuals.
4. Update the model by adding the weak learner's predictions, scaled by a learning rate.
5. Repeat steps 2-4 for a fixed number of iterations.
6. The final ensemble prediction is the initial prediction plus the scaled predictions of all weak learners.'''

"1. Initialize the model with a constant prediction (for squared-error loss, the mean of the targets).\n2. Compute the residuals, i.e. the errors of the current model; for squared-error loss these equal the negative gradient of the loss.\n3. Fit a weak learner to these residuals.\n4. Update the model by adding the weak learner's predictions, scaled by a learning rate.\n5. Repeat steps 2-4 for a fixed number of iterations.\n6. The final ensemble prediction is the initial prediction plus the scaled predictions of all weak learners."