## **Q1:-**
### **What is Gradient Boosting Regression?**

### **Ans:-**

### **Gradient Boosting is a popular boosting algorithm in machine learning used for classification and regression tasks; Gradient Boosting Regression is its application to predicting continuous targets. Boosting is an ensemble learning method that trains models sequentially, with each new model trying to correct the errors of the previous ones. It combines several weak learners into a strong learner.**

## **Q2:-** 
### **Implement a simple gradient boosting algorithm from scratch using Python and NumPy. Use a simple regression problem as an example and train the model on a small dataset. Evaluate the model's performance using metrics such as mean squared error and R-squared.**

In [5]:
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Generate a small synthetic dataset for regression
np.random.seed(0)
X = np.random.rand(100, 1)
y = 2 * X.squeeze() + np.random.randn(100)

# Define the number of boosting iterations (number of trees)
n_iterations = 100

# Initialize predictions with the mean of the target values
predictions = np.full_like(y, np.mean(y))

# Gradient boosting algorithm
learning_rate = 0.1
for i in range(n_iterations):
    # Calculate the residuals (negative gradient of the squared-error loss) for the current predictions
    residuals = y - predictions

    # Train a decision tree on the residuals
    tree = DecisionTreeRegressor(max_depth=3)
    tree.fit(X, residuals)

    # Make predictions with the current tree and update the ensemble
    tree_predictions = tree.predict(X)
    predictions += learning_rate * tree_predictions

# Calculate MSE and R-squared (computed on the training data, so these values are optimistic)
def mean_squared_error(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def r_squared(y_true, y_pred):
    ss_total = np.sum((y_true - np.mean(y_true))**2)
    ss_residual = np.sum((y_true - y_pred)**2)
    return 1 - (ss_residual / ss_total)

mse = mean_squared_error(y, predictions)
r2 = r_squared(y, predictions)

print(f"Mean Squared Error (MSE): {mse:.4f}")
print(f"R-squared (R^2): {r2:.4f}")

Mean Squared Error (MSE): 0.1752
R-squared (R^2): 0.8657


## **Q3:-** 
### **Experiment with different hyperparameters such as learning rate, number of trees, and tree depth to optimise the performance of the model. Use grid search or random search to find the best hyperparameters.**

In [6]:
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import mean_squared_error, r2_score

# Generate a small synthetic dataset for regression
np.random.seed(0)
X = np.random.rand(100, 1)
y = 2 * X.squeeze() + np.random.randn(100)

# Split the dataset into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define a parameter grid for grid search
param_grid = {
    'n_estimators': [50, 100, 200],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [2, 3, 4]
}

# Create a GradientBoostingRegressor model
gbm = GradientBoostingRegressor()

# Perform grid search with cross-validation
grid_search = GridSearchCV(estimator=gbm, param_grid=param_grid, cv=3, n_jobs=-1, verbose=2)
grid_search.fit(X_train, y_train)

# Get the best hyperparameters
best_params = grid_search.best_params_
print("Best Hyperparameters:")
print(best_params)

# Evaluate the model with the best hyperparameters on the test set
best_model = grid_search.best_estimator_
y_pred = best_model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error (MSE): {mse:.4f}")
print(f"R-squared (R^2): {r2:.4f}")

Fitting 3 folds for each of 27 candidates, totalling 81 fits
Best Hyperparameters:
{'learning_rate': 0.01, 'max_depth': 2, 'n_estimators': 100}
Mean Squared Error (MSE): 1.0387
R-squared (R^2): 0.0216


## **Q4:-**
### **What is a weak learner in Gradient Boosting?**

### **Ans:-**

### **A weak learner is a model that performs only slightly better than random guessing, whereas a strong learner can achieve arbitrarily good accuracy. Both notions come from computational learning theory and provide the basis for the boosting class of ensemble methods. In Gradient Boosting, the weak learners are typically shallow decision trees.**
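
To make the idea of a weak learner concrete, here is a minimal sketch of a decision stump (a tree with a single split) written in NumPy. The helper names `fit_stump` and `predict_stump` and the synthetic data are assumptions made for illustration; they are not part of any library. On its own the stump only beats the constant mean prediction by a limited margin, which is what makes it "weak".

```python
import numpy as np

def fit_stump(x, y):
    """Fit a one-split regression stump on a 1-D feature by minimising squared error.

    Hypothetical helper for illustration; assumes x has at least two distinct values.
    """
    best = None
    # Candidate thresholds: midpoints between consecutive sorted unique values
    values = np.unique(x)
    thresholds = (values[:-1] + values[1:]) / 2
    for t in thresholds:
        left, right = y[x <= t], y[x > t]
        # Sum of squared errors when each side predicts its own mean
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    _, threshold, left_value, right_value = best
    return threshold, left_value, right_value

def predict_stump(stump, x):
    """Predict the left-side mean for x <= threshold, otherwise the right-side mean."""
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

# Small synthetic regression problem (assumed data, similar in spirit to Q2)
rng = np.random.default_rng(0)
x = rng.random(100)
y = 2 * x + rng.normal(scale=0.1, size=100)

stump = fit_stump(x, y)
baseline_mse = np.mean((y - y.mean()) ** 2)              # always predict the mean
stump_mse = np.mean((y - predict_stump(stump, x)) ** 2)  # single weak learner

print(f"Baseline MSE: {baseline_mse:.4f}")
print(f"Stump MSE:    {stump_mse:.4f}")
```

A single stump can only output two distinct values, so it cannot follow the linear trend closely; boosting compensates by combining many such learners.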

## **Q5:-**
### **What is the intuition behind the Gradient Boosting algorithm?**

### **Ans:-**

### **In gradient boosting, each new model is fitted to the negative gradient of the loss with respect to the current predictions, and the predictions are adjusted in that direction. Moving against the gradient reduces the loss, and since a lower loss corresponds to better performance, each step improves the model.**
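
The following sketch illustrates this intuition for the squared-error loss, where the negative gradient with respect to the predictions is simply the residual. It steps the predictions directly along the negative gradient, which is a simplification: real gradient boosting approximates this step with a weak learner so that the model can also predict on unseen inputs. The function names and the tiny dataset are assumptions for illustration.

```python
import numpy as np

def squared_error_loss(y, f):
    """Mean of 0.5 * (y - f)^2 over all samples."""
    return 0.5 * np.mean((y - f) ** 2)

def negative_gradient(y, f):
    """For the per-sample loss 0.5 * (y - f)^2, -dL/df = y - f, i.e. the residual."""
    return y - f

y = np.array([3.0, -1.0, 2.0, 5.0])   # assumed toy targets
f = np.full_like(y, y.mean())         # start from the mean of the targets

learning_rate = 0.1
losses = [squared_error_loss(y, f)]
for _ in range(5):
    # Move the predictions a small step in the negative-gradient direction
    f = f + learning_rate * negative_gradient(y, f)
    losses.append(squared_error_loss(y, f))

print([round(loss, 4) for loss in losses])
```

Each step shrinks every residual by a factor of `1 - learning_rate`, so the recorded loss decreases at every iteration.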

## **Q6:-**
### **How does Gradient Boosting algorithm build an ensemble of weak learners?**

### **Ans:-**

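### **It is a boosting technique in which weak learners are added sequentially during training. Each new weak learner is fitted to the residual errors (the negative gradient of the loss) of the current ensemble, and its output, scaled by a learning rate, is added to the ensemble's prediction. Unlike AdaBoost, Gradient Boosting does not boost performance by assigning higher weights to incorrectly classified samples.**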
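
As a rough sketch of how such an ensemble can be built and then reused on new inputs, the code below stores every fitted weak learner and sums their scaled outputs at prediction time. It assumes squared-error loss and uses one-split stumps as weak learners; `fit_stump`, `predict_stump`, `fit_boosted_stumps`, and `predict_boosted` are hypothetical helpers, not library functions.

```python
import numpy as np

def fit_stump(x, target):
    """One-split regression stump on a 1-D feature (assumes >= 2 distinct x values)."""
    best = None
    values = np.unique(x)
    for t in (values[:-1] + values[1:]) / 2:   # midpoints as candidate thresholds
        left, right = target[x <= t], target[x > t]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]                            # (threshold, left_value, right_value)

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def fit_boosted_stumps(x, y, n_rounds=50, learning_rate=0.1):
    """Sequentially fit stumps to the residuals of the current ensemble."""
    init = y.mean()                            # initial constant prediction
    f = np.full(y.shape, init)
    stumps = []
    for _ in range(n_rounds):
        residuals = y - f                      # negative gradient for squared error
        stump = fit_stump(x, residuals)
        f = f + learning_rate * predict_stump(stump, x)
        stumps.append(stump)
    return init, learning_rate, stumps

def predict_boosted(model, x):
    """Initial prediction plus the scaled sum of all stump outputs."""
    init, learning_rate, stumps = model
    f = np.full(x.shape, init)
    for stump in stumps:
        f = f + learning_rate * predict_stump(stump, x)
    return f

# Assumed synthetic data for illustration
rng = np.random.default_rng(0)
x = rng.random(100)
y = 2 * x + rng.normal(scale=0.1, size=100)

model = fit_boosted_stumps(x, y)
baseline_mse = np.mean((y - y.mean()) ** 2)
boosted_mse = np.mean((y - predict_boosted(model, x)) ** 2)
print(f"Baseline MSE: {baseline_mse:.4f}")
print(f"Boosted MSE:  {boosted_mse:.4f}")
```

Keeping the fitted stumps (rather than only the running training predictions) is what allows `predict_boosted` to be called on inputs that were not seen during training.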

## **Q7:-**
### **What are the steps involved in constructing the mathematical intuition of Gradient Boosting algorithm?**

### **Ans:-**

### **Gradient boosting algorithm Intuition:-**
#### Input requirement for Gradient Boosting:
#### Regression Loss functions:
#### Binary Classification Loss Functions:
#### Step 1: Calculate the average/mean of the target variable.
#### Step 2: Calculate the residuals for each sample.
#### Step 3: Construct a decision tree.
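
The outline above can be written out as equations. The block below is a sketch in commonly used notation, which is an assumption here: $F_m$ is the ensemble after $m$ rounds, $h_m$ the tree fitted in round $m$, $\nu$ the learning rate, and $r_{im}$ the residual of sample $i$ in round $m$. The squared-error and log-loss functions are given as typical examples of regression and binary classification losses.

```latex
% Input: data {(x_i, y_i)}, i = 1..n, a differentiable loss L(y, F(x)),
% a number of rounds M, and a learning rate \nu (assumed notation).
\begin{align*}
% Example regression loss (squared error)
L_{\text{reg}}(y, F) &= \tfrac{1}{2}\,(y - F)^2 \\
% Example binary classification loss (log loss), with p the predicted probability
L_{\text{bin}}(y, F) &= -\bigl[\, y \log p + (1 - y) \log (1 - p) \,\bigr],
  \qquad p = \frac{1}{1 + e^{-F}} \\[6pt]
% Step 1: initial constant prediction; for squared error this is the mean of the targets
F_0(x) &= \arg\min_{\gamma} \sum_{i=1}^{n} L(y_i, \gamma)
        = \bar{y} \quad \text{(squared error)} \\[6pt]
% Step 2: residuals are the negative gradient of the loss at the current predictions
r_{im} &= -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F = F_{m-1}}
        = \begin{cases}
            y_i - F_{m-1}(x_i) & \text{squared error} \\
            y_i - p_{m-1}(x_i) & \text{log loss}
          \end{cases} \\[6pt]
% Step 3: fit a tree h_m to the pairs (x_i, r_{im}), then update the ensemble
F_m(x) &= F_{m-1}(x) + \nu \, h_m(x), \qquad m = 1, \dots, M
\end{align*}
```

For squared error these steps match the Q2 implementation: predictions start at the mean, residuals are `y - predictions`, and each tree's output is added after scaling by `learning_rate`.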