Gradient boosting is an ensembling technique where several weak learners (regression trees) are combined to yield a powerful single model, in an iterative fashion.

Early stopping support in gradient boosting lets us find the smallest number of iterations that is sufficient to build a model that generalizes well to unseen data.

The concept of early stopping is simple. We specify a validation_fraction, which denotes the fraction of the data passed to fit that is set aside from training to assess the validation loss of the model. The gradient boosting model is trained on the remaining data and evaluated on this validation set. Each time an additional stage (a regression tree) is added, the validation set is used to score the model. This continues until the scores of the model in the last n_iter_no_change stages do not improve by at least tol. At that point the model is considered to have converged, and the addition of further stages is "stopped early".
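The stopping rule described above can be sketched in plain Python. This is a simplified patience-style illustration, not scikit-learn's internal code: the helper stages_until_stop and the list of validation losses are hypothetical, and the library's own bookkeeping may differ in detail.

```python
def stages_until_stop(validation_losses, n_iter_no_change=3, tol=1e-4):
    """Return how many stages are kept under a simple patience rule.

    Hypothetical helper for illustration only; it is not part of
    scikit-learn and does not reproduce its exact implementation.
    """
    best_loss = float("inf")
    stages_without_improvement = 0
    for stage, loss in enumerate(validation_losses, start=1):
        if loss < best_loss - tol:
            # The validation loss improved by more than tol: reset the counter.
            best_loss = loss
            stages_without_improvement = 0
        else:
            stages_without_improvement += 1
        if stages_without_improvement >= n_iter_no_change:
            # No sufficient improvement for n_iter_no_change stages: stop early.
            return stage
    # The losses kept improving, so every stage is used.
    return len(validation_losses)


# Made-up validation losses, one per boosting stage.
losses = [0.90, 0.60, 0.45, 0.40, 0.40, 0.41, 0.40, 0.39, 0.38]
print(stages_until_stop(losses, n_iter_no_change=3))  # prints 7
```

In this made-up sequence the loss stops improving after stage 4, so three stages without sufficient improvement end training at stage 7, even though later stages would have improved slightly. A larger n_iter_no_change trades longer training for less risk of stopping too soon.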

The number of stages in the final model is available in the n_estimators_ attribute.
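As a minimal sketch of reading that attribute, the snippet below fits a classifier with early stopping enabled and compares the requested upper bound, n_estimators, with the number of stages actually fitted, n_estimators_. The synthetic dataset from make_classification and the specific parameter values are assumptions chosen only to keep the sketch fast; they are not the settings used in the rest of this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Small synthetic dataset (an assumption, chosen to keep the sketch fast).
X_demo, y_demo = make_classification(n_samples=500, n_features=20, random_state=0)

model = GradientBoostingClassifier(
    n_estimators=500,         # upper bound on the number of stages
    validation_fraction=0.1,  # share of the fit data held out for validation
    n_iter_no_change=5,       # setting this enables early stopping
    tol=1e-4,
    random_state=0,
)
model.fit(X_demo, y_demo)

print(f"Requested at most {model.n_estimators} stages")
print(f"Actually fitted {model.n_estimators_} stages")
```

If early stopping triggers, n_estimators_ is smaller than n_estimators; otherwise the two are equal.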

This example illustrates how early stopping can be used in the GradientBoostingClassifier model (from sklearn.ensemble) to achieve almost the same accuracy as a model built without early stopping, while using many fewer estimators. This can significantly reduce training time, memory usage and prediction latency.

In [33]:
import time

import numpy as np
import matplotlib.pyplot as plt

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

In [34]:
# load_digits returns a Bunch object, not a DataFrame
digits = load_digits()
X, y = digits.data, digits.target

In [35]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [36]:
start_time = time.time()

In [37]:
gbc_no_early = GradientBoostingClassifier(n_estimators=100, random_state=42)
gbc_no_early.fit(X_train, y_train)

In [38]:
time_no_early = time.time() - start_time
score_no_early = gbc_no_early.score(X_test, y_test)

In [39]:
print(f"Time without early stopping: {time_no_early:.2f} seconds")
print(f"Test score without early stopping: {score_no_early:.4f}")

Time without early stopping: 4.24 seconds
Test score without early stopping: 0.9694


In [40]:
start_time = time.time()
gbc_early = GradientBoostingClassifier(
    n_estimators=100,
    random_state=42,
    validation_fraction=0.1,  # Use 10% of training data for validation
    n_iter_no_change=10,      # Stop if no improvement for 10 consecutive iterations
    tol=1e-4,                 # Tolerance for improvement
    subsample=0.8             # Use 80% of samples for each tree (the baseline above keeps the default 1.0)
)
gbc_early.fit(X_train, y_train)

In [41]:
time_early = time.time() - start_time
score_early = gbc_early.score(X_test, y_test)

In [42]:
print(f"Time with early stopping: {time_early:.2f} seconds")
print(f"Test score with early stopping: {score_early:.4f}")

Time with early stopping: 3.17 seconds
Test score with early stopping: 0.9639
