In [1]:
#Q1. What is Gradient Boosting Regression?

Gradient Boosting Regression is a machine learning technique for regression tasks. It produces a prediction model in the form of an ensemble of weak prediction models, which are typically decision trees. Like other regression methods, it models the relationship between a set of input features (X) and a continuous target (y).

Gradient Boosting Regression builds an additive model in a forward stage-wise fashion, which allows it to optimize arbitrary differentiable loss functions. In each stage, a regression tree is fit to the negative gradient of the given loss function, evaluated at the current predictions.
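
To make the "negative gradient" concrete, here is a minimal sketch assuming the loss is half the squared error, 0.5 * (y - F)^2. The targets and predictions are made up for illustration, and `half_squared_error` is a hypothetical helper, not a library function. For this loss, the negative gradient with respect to the prediction is simply the residual y - F.

```python
import numpy as np

def half_squared_error(y, pred):
    """Loss L(y, F) = 0.5 * (y - F)^2, summed over all samples."""
    return 0.5 * np.sum((y - pred) ** 2)

y = np.array([5.0, 6.0, 5.0, 7.0])      # made-up targets
pred = np.array([5.5, 5.5, 5.5, 5.5])   # made-up current predictions

residual = y - pred

# Central-difference estimate of dL/dF for each prediction
eps = 1e-6
grad = np.zeros_like(pred)
for i in range(len(pred)):
    step = np.zeros_like(pred)
    step[i] = eps
    grad[i] = (half_squared_error(y, pred + step)
               - half_squared_error(y, pred - step)) / (2 * eps)

negative_gradient = -grad

print("residual:         ", residual)
print("negative gradient:", negative_gradient)
```

The two printed vectors agree up to floating-point error, which is why gradient boosting with squared error is often described as "fitting the residuals".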

In [1]:
#Q2. Implement a simple gradient boosting algorithm from scratch using Python and NumPy. Use a
#simple regression problem as an example and train the model on a small dataset. Evaluate the model's
#performance using metrics such as mean squared error and R-squared.

In [8]:
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import RandomizedSearchCV

import numpy as np
import pandas as pd
df=pd.read_csv('winequality-red.csv')
df.head()

Unnamed: 0,fixed acidity,volatile acidity,citric acid,residual sugar,chlorides,free sulfur dioxide,total sulfur dioxide,density,pH,sulphates,alcohol,quality
0,7.4,0.7,0.0,1.9,0.076,11.0,34.0,0.9978,3.51,0.56,9.4,5
1,7.8,0.88,0.0,2.6,0.098,25.0,67.0,0.9968,3.2,0.68,9.8,5
2,7.8,0.76,0.04,2.3,0.092,15.0,54.0,0.997,3.26,0.65,9.8,5
3,11.2,0.28,0.56,1.9,0.075,17.0,60.0,0.998,3.16,0.58,9.8,6
4,7.4,0.7,0.0,1.9,0.076,11.0,34.0,0.9978,3.51,0.56,9.4,5


In [9]:
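# Separate the features from the target column.
# Note: X still contains 'Unnamed: 0', which appears to be a saved row index rather than a wine property.
X = df.drop('quality', axis=1)
y = df['quality']

# Hold out 30% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=40)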

In [10]:
model = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1)
model.fit(X_train, y_train)

In [11]:
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")

Mean Squared Error: 0.3637082684317155
R-squared: 0.480851314644443


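The mean squared error (MSE) is a measure of how well a regression model fits the data. It is calculated as the average of the squared differences between the predicted and actual values, so lower is better. In this case, the MSE is 0.3637.

The R-squared value is a measure of how much of the variation in the data the regression model explains. Its maximum is 1, with higher values indicating a better fit; a value of 0 corresponds to always predicting the mean of the target, and it can be negative on held-out data when the model does worse than that. In this case, the R-squared value is 0.4809, so the model explains about 48% of the variance in the test-set quality scores.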
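
As a small illustration of the two definitions above, the sketch below computes MSE and R-squared directly with NumPy. The arrays are made-up values and `mse` and `r2` are hypothetical helper names; `mean_squared_error` and `r2_score` from scikit-learn compute the same quantities.

```python
import numpy as np

def mse(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return np.mean((y_true - y_pred) ** 2)

def r2(y_true, y_pred):
    """1 - (residual sum of squares) / (total sum of squares around the mean)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([5.0, 6.0, 5.0, 7.0, 6.0])               # made-up targets
good = np.array([5.2, 5.8, 5.1, 6.5, 6.1])                 # close to the targets
mean_only = np.full_like(y_true, y_true.mean())            # always predicts the mean
bad = np.array([7.0, 5.0, 7.0, 5.0, 4.0])                  # worse than the mean

for name, pred in [("good", good), ("mean_only", mean_only), ("bad", bad)]:
    print(f"{name}: MSE={mse(y_true, pred):.3f}, R2={r2(y_true, pred):.3f}")
```

The mean-only predictor gets an R-squared of 0 and the deliberately poor predictor gets a negative value, which shows why R-squared is not restricted to the range 0 to 1 on arbitrary predictions.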

In [12]:
#Q3. Experiment with different hyperparameters such as learning rate, number of trees, and tree depth to
#optimise the performance of the model. Use grid search or random search to find the best
#hyperparameters


model = GradientBoostingRegressor(n_estimators=100, criterion='squared_error', max_depth=8, min_samples_split=5, min_samples_leaf=5, max_features=3)

model.fit(X_train, y_train)

y_pred = model.predict(X_test)

print('RMSE:', np.sqrt(mean_squared_error(y_test, y_pred)))

RMSE: 0.5860771210634872


In [13]:


param_grid = {'n_estimators': [10, 100,200, 300,1000], 'criterion': ['squared_error'], 'max_depth': [5,8,15,20], 'min_samples_split':[5,10,15,20], 'min_samples_leaf':[5,10,15,20,25], 'max_features':[3,5,10,15]}

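# Fit the randomized search object to the training data and obtain the best hyperparameters.
# With the defaults it samples 10 combinations from param_grid and scores each with 5-fold cross-validation.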

rad_search = RandomizedSearchCV(model, param_distributions=param_grid)
rad_search.fit(X_train, y_train)

print(f"Best hyperparameters: {rad_search.best_params_}")
print(f"Best Score: {rad_search.best_score_}")

Best hyperparameters: {'n_estimators': 100, 'min_samples_split': 5, 'min_samples_leaf': 25, 'max_features': 3, 'max_depth': 20, 'criterion': 'squared_error'}
Best Score: 0.3933764529190281
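
The search above does not tune the learning rate and is not seeded, so its result can change between runs. The following is a minimal sketch of a reproducible variant, assuming a small synthetic dataset from `make_regression` in place of the wine data. The candidate values, `n_iter=5` and `cv=3` are illustrative choices, not recommendations.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in data (assumption: not the wine dataset)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Candidate values for learning rate, number of trees, and tree depth
param_distributions = {
    "n_estimators": [20, 50, 100],
    "learning_rate": [0.01, 0.1, 0.3],
    "max_depth": [1, 2, 3],
}

search = RandomizedSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,          # number of sampled combinations
    cv=3,              # 3-fold cross-validation
    scoring="r2",      # best_score_ is the mean cross-validated R-squared
    random_state=0,    # makes the sampled combinations repeatable
)
search.fit(X_train, y_train)

test_r2 = search.score(X_test, y_test)  # R-squared of the refit best model
print(f"Best hyperparameters: {search.best_params_}")
print(f"Best CV R-squared: {search.best_score_:.3f}")
print(f"Test R-squared: {test_r2:.3f}")
```

Scoring the refit best model on the held-out test set gives an estimate that the cross-validation folds did not influence.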


In [1]:
#Q4. What is a weak learner in Gradient Boosting?

A weak learner in Gradient Boosting is a simple model that performs only slightly better than random guessing; in practice it is usually a shallow decision tree. On its own a weak learner is inaccurate, but boosting adds weak learners to the ensemble one at a time, each correcting the errors of those before it, and together they form a robust model that makes accurate predictions.
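
One of the simplest weak learners is a decision stump, a tree with a single split. The sketch below is an illustrative NumPy implementation for a single feature. `fit_stump` and `predict_stump` are hypothetical helper names, and the noisy sine data is made up.

```python
import numpy as np

def fit_stump(x, y):
    """Find the threshold on a 1-D feature that minimizes the squared error."""
    best = None
    # Skip the largest value so the right-hand side is never empty
    for threshold in np.unique(x)[:-1]:
        left = y[x <= threshold]
        right = y[x > threshold]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if best is None or sse < best[0]:
            best = (sse, threshold, left.mean(), right.mean())
    _, threshold, left_value, right_value = best
    return threshold, left_value, right_value

def predict_stump(stump, x):
    """Predict the left or right mean depending on the side of the threshold."""
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 6.0, 60)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

stump = fit_stump(x, y)
baseline_mse = np.mean((y - y.mean()) ** 2)               # always predict the mean
stump_mse = np.mean((y - predict_stump(stump, x)) ** 2)   # one split

print(f"MSE predicting the mean: {baseline_mse:.3f}")
print(f"MSE of a single stump:   {stump_mse:.3f}")
```

A single stump improves on the mean prediction but can only output two distinct values, so it remains far from the underlying curve. That gap is what the later learners in the ensemble are meant to close.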

In [2]:
#Q5. What is the intuition behind the Gradient Boosting algorithm?

The intuition behind the Gradient Boosting algorithm is to repeatedly exploit the patterns left in the residuals: each new weak model is fit to the errors of the current ensemble and makes it a little better. Once the residuals no longer contain any pattern that can be modeled, we can stop modeling them.

In [3]:
#Q6. How does Gradient Boosting algorithm build an ensemble of weak learners?

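The Gradient Boosting algorithm builds its ensemble sequentially. It starts from a simple initial prediction, then at each stage fits a new weak learner to the errors (the negative gradient of the loss) of the current ensemble and adds its output, scaled by the learning rate, to the running prediction. The sum of many small, simple models gives a more accurate prediction than any individual model would produce.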
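
The construction can be written compactly as follows. The notation is an assumption introduced here for illustration: $L$ is the loss, $F_m$ the ensemble after $m$ stages, $h_m$ the weak learner fit at stage $m$, $\nu$ the learning rate, and $M$ the number of stages.

```latex
\begin{aligned}
F_0(x) &= \arg\min_{c} \sum_{i=1}^{n} L(y_i, c) \\
r_{im} &= -\left[ \frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} \right]_{F = F_{m-1}}
  && \text{(pseudo-residuals; } h_m \text{ is fit to these)} \\
F_m(x) &= F_{m-1}(x) + \nu \, h_m(x) \\
F_M(x) &= F_0(x) + \nu \sum_{m=1}^{M} h_m(x)
\end{aligned}
```

For the squared-error loss $L = \tfrac{1}{2}(y - F)^2$, $F_0$ is the mean of the targets and $r_{im} = y_i - F_{m-1}(x_i)$, the ordinary residual.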

In [4]:
#Q7. What are the steps involved in constructing the mathematical intuition of Gradient Boosting
#algorithm?

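1. Build a base model to predict the observations in the training dataset.

2. Calculate the residuals of the current model, fit a new model to these residuals, and add its predictions (scaled by the learning rate) to the current model.

3. Repeat step 2 until a stopping criterion is met, such as a fixed number of models or residuals that no longer improve.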
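
The steps above can be sketched from scratch with NumPy. This is an illustrative sketch under stated assumptions, not the scikit-learn implementation: it assumes squared-error loss, a single feature, decision stumps as weak learners, a fixed number of rounds as the stopping criterion, and made-up noisy sine data. All function names are hypothetical.

```python
import numpy as np

def fit_stump(x, y):
    """Weak learner: the single split on x that minimizes the squared error."""
    best = None
    for threshold in np.unique(x)[:-1]:
        left, right = y[x <= threshold], y[x > threshold]
        sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        if best is None or sse < best[0]:
            best = (sse, threshold, left.mean(), right.mean())
    return best[1:]

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def fit_gradient_boosting(x, y, n_estimators=100, learning_rate=0.1):
    base = y.mean()                          # step 1: base model (the mean)
    pred = np.full(y.shape, base)
    stumps = []
    for _ in range(n_estimators):            # step 3: repeat a fixed number of times
        residuals = y - pred                 # step 2: residuals of the current model
        stump = fit_stump(x, residuals)      # fit a new weak learner to them
        pred = pred + learning_rate * predict_stump(stump, x)
        stumps.append(stump)
    return base, learning_rate, stumps

def predict_gradient_boosting(model, x):
    base, learning_rate, stumps = model
    pred = np.full(x.shape, base)
    for stump in stumps:
        pred = pred + learning_rate * predict_stump(stump, x)
    return pred

# Small synthetic regression problem (assumption: illustrative data only)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 6.0, size=200)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)
x_train, y_train, x_test, y_test = x[:140], y[:140], x[140:], y[140:]

model = fit_gradient_boosting(x_train, y_train)
y_pred = predict_gradient_boosting(model, x_test)

# Evaluate with mean squared error and R-squared on the held-out points
test_mse = np.mean((y_test - y_pred) ** 2)
test_r2 = 1.0 - np.sum((y_test - y_pred) ** 2) / np.sum((y_test - y_test.mean()) ** 2)

print(f"Mean Squared Error: {test_mse:.4f}")
print(f"R-squared: {test_r2:.4f}")
```

Lowering `learning_rate` makes each stump's correction smaller, so more rounds are needed to reach the same fit. That trade-off is why the learning rate and the number of estimators are usually tuned together.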