## Q1.
Gradient Boosting Regression is an ensemble learning method that builds weak learners (usually shallow decision trees) sequentially. Each new tree is fit to the residual errors of the current ensemble. For squared-error loss these residuals equal the negative gradient of the loss, so adding trees amounts to gradient descent on the loss function.


## Q2.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Sample dataset (X: input, y: target values)
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1.5, 3.6, 5.1, 7.3, 9.0])

# Hyperparameters
n_estimators = 50  # Number of trees
learning_rate = 0.1  # Shrinkage applied to each tree's contribution
max_depth = 2  # Depth of each weak learner

# Initialize predictions with the mean of y
y_pred = np.full_like(y, np.mean(y), dtype=np.float64)

# Train weak learners sequentially
trees = []
for _ in range(n_estimators):
    residuals = y - y_pred  # Compute residual errors
    tree = DecisionTreeRegressor(max_depth=max_depth)  # Weak learner
    tree.fit(X, residuals)  # Fit to residuals
    y_pred += learning_rate * tree.predict(X)  # Update predictions
    trees.append(tree)

# Evaluate on the training data (no held-out set, so these scores are optimistic)
mse = mean_squared_error(y, y_pred)
r2 = r2_score(y, y_pred)

print(f"Mean Squared Error: {mse:.4f}")
print(f"R-squared Score: {r2:.4f}")
```


```text
Mean Squared Error: 0.0003
R-squared Score: 1.0000
```


## Q3.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Sample dataset
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]])
y = np.array([1.5, 3.6, 5.1, 7.3, 9.0, 10.5, 13.1, 14.8, 17.0, 19.3])

# Define hyperparameter grid
param_grid = {
    'n_estimators': [50, 100, 200],
    'learning_rate': [0.01, 0.1, 0.2],
    'max_depth': [2, 3, 5]
}

# Initialize Gradient Boosting Regressor
gb_regressor = GradientBoostingRegressor()

# 2-fold CV: each candidate trains on 5 points and is scored on the other 5
cv_strategy = KFold(n_splits=2, shuffle=True, random_state=42)

# Perform Grid Search with Cross-Validation
grid_search = GridSearchCV(
    gb_regressor,
    param_grid,
    cv=cv_strategy,
    scoring='r2',
    error_score='raise',  # Fail loudly if any fit raises instead of scoring it as NaN
)
grid_search.fit(X, y)  # Fit the model

# Print best hyperparameters and the mean cross-validated score they achieved
print("Best Parameters:", grid_search.best_params_)
print("Best R² Score:", grid_search.best_score_)
```


```text
Best Parameters: {'learning_rate': 0.1, 'max_depth': 3, 'n_estimators': 50}
Best R² Score: 0.8100951272649303
```


## Q4.
A weak learner is a simple model, typically a shallow decision tree, that performs only slightly better than a trivial baseline (random guessing in classification, or always predicting the mean in regression). Gradient Boosting combines many such weak learners into a single strong model.
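To make the weak-versus-strong distinction concrete, here is a minimal sketch that compares a mean-only baseline, a single depth-1 tree (a "stump"), and 100 boosted stumps. The noisy sine dataset and the variable names are assumptions chosen for illustration; they are not part of the original exercise.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (an assumption for illustration): a noisy sine curve
rng = np.random.default_rng(0)
X = np.linspace(0, 6, 200).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# Trivial baseline: always predict the mean of y
baseline_mse = mean_squared_error(y, np.full_like(y, y.mean()))

# Weak learner on its own: a single decision stump (one split, two leaves)
stump = DecisionTreeRegressor(max_depth=1).fit(X, y)
stump_mse = mean_squared_error(y, stump.predict(X))

# Strong learner: 100 stumps combined by gradient boosting
boosted = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.1, max_depth=1, random_state=0
).fit(X, y)
boosted_mse = mean_squared_error(y, boosted.predict(X))

print(f"Baseline MSE:       {baseline_mse:.4f}")
print(f"Single stump MSE:   {stump_mse:.4f}")
print(f"Boosted stumps MSE: {boosted_mse:.4f}")
```

All three errors are measured on the training data, so they show how well each model can fit the curve, not how well it generalizes.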



## Q5.
Gradient Boosting improves predictions iteratively by fitting each new model to the residual errors of the current ensemble. Each step moves the predictions in the direction of the negative gradient of the loss function, which is gradient descent carried out on the model's predictions rather than on a parameter vector.
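The step-by-step refinement can be observed directly with scikit-learn's `staged_predict`, which yields the ensemble's predictions after each added tree. This is a small sketch that reuses the ten-point toy dataset from Q3; the hyperparameter values are assumptions picked for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

# Same ten-point toy dataset as in Q3
X = np.arange(1, 11).reshape(-1, 1)
y = np.array([1.5, 3.6, 5.1, 7.3, 9.0, 10.5, 13.1, 14.8, 17.0, 19.3])

model = GradientBoostingRegressor(
    n_estimators=50, learning_rate=0.1, max_depth=2, random_state=0
)
model.fit(X, y)

# staged_predict yields predictions after 1, 2, ..., n_estimators trees
staged_mse = [mean_squared_error(y, y_stage) for y_stage in model.staged_predict(X)]

for n_trees in (1, 10, 50):
    print(f"{n_trees:>2} trees: training MSE = {staged_mse[n_trees - 1]:.4f}")
```

With squared-error loss and no subsampling, the training error cannot increase from one stage to the next, which is the gradient-descent behavior described above.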


## Q6.
1. Start with an initial prediction (often the mean of the target values).  
2. Calculate residuals (errors) between actual values and predictions.  
3. Train a weak learner (decision tree) to predict the residuals.  
4. Update the model’s predictions by adding the new learner’s output multiplied by a learning rate.  
5. Repeat this process for multiple iterations, gradually reducing the errors.
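The steps above can be sketched as a pair of functions, one that fits the ensemble and one that predicts on unseen inputs. The function names `fit_gradient_boosting` and `predict_gradient_boosting` are hypothetical helpers introduced here, and the sketch assumes squared-error loss, where the residuals are simply `y - y_pred`.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_gradient_boosting(X, y, n_estimators=50, learning_rate=0.1, max_depth=2):
    """Fit a squared-error boosting ensemble; return (initial prediction, trees)."""
    init = float(np.mean(y))                       # Step 1: initial prediction
    y_pred = np.full(len(y), init)
    trees = []
    for _ in range(n_estimators):                  # Step 5: repeat
        residuals = y - y_pred                     # Step 2: residuals
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residuals)                     # Step 3: fit weak learner
        y_pred += learning_rate * tree.predict(X)  # Step 4: shrunken update
        trees.append(tree)
    return init, trees


def predict_gradient_boosting(X, init, trees, learning_rate=0.1):
    """Predict by adding every tree's shrunken output to the initial value."""
    y_pred = np.full(X.shape[0], init)
    for tree in trees:
        y_pred += learning_rate * tree.predict(X)
    return y_pred


# Five-point toy dataset from Q2
X_train = np.array([[1], [2], [3], [4], [5]])
y_train = np.array([1.5, 3.6, 5.1, 7.3, 9.0])
init, trees = fit_gradient_boosting(X_train, y_train)

# Predict on inputs that were not in the training set
X_new = np.array([[2.5], [4.5]])
print(predict_gradient_boosting(X_new, init, trees))
```

The same `learning_rate` must be used at fit time and predict time, because each tree was trained on residuals that assumed the earlier trees were scaled by that factor.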



## Q7.
1. **Define the loss function** (e.g., Mean Squared Error for regression).
2. **Compute pseudo-residuals** as the negative gradient of the loss with respect to the current predictions.
3. **Fit a weak learner** to predict these pseudo-residuals.
4. **Update predictions** by adding the weak learner's output scaled by the learning rate.
5. **Repeat** steps 2 to 4 for a fixed number of iterations.
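
For squared-error loss, the link between "negative gradient" and "residual" in the steps above can be written out explicitly. The notation is an assumption introduced here, not taken from the original answers: $F_m$ is the ensemble after $m$ trees, $h_m$ is the $m$-th weak learner, and $\nu$ is the learning rate.

```latex
% Loss for one observation (the factor 1/2 only simplifies the derivative)
L\bigl(y_i, F(x_i)\bigr) = \tfrac{1}{2}\,\bigl(y_i - F(x_i)\bigr)^2

% Pseudo-residual: negative gradient evaluated at the current model F_{m-1}
r_{im} = -\left[\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)}\right]_{F = F_{m-1}}
       = y_i - F_{m-1}(x_i)

% Update after fitting the weak learner h_m to the pairs (x_i, r_{im})
F_m(x) = F_{m-1}(x) + \nu\, h_m(x)
```

Without the factor $\tfrac{1}{2}$ the negative gradient is $2\,(y_i - F_{m-1}(x_i))$, which is proportional to the same residual. For other losses the pseudo-residuals are no longer plain residuals, but the update rule keeps the same form.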

