# Week 9 Module 5 Assignment 4
## Francis Yang 12/15/2022
### Early stopping of Gradient Boosting

Gradient boosting is an ensemble technique that iteratively combines many weak learners (regression trees) into a single, more powerful model.

Early stopping support in gradient boosting lets us find the smallest number of iterations that is sufficient to build a model that generalizes well to unseen data.

The concept of early stopping is simple. We specify a `validation_fraction`, the fraction of the data passed to `fit` that is set aside from training and used to measure the validation loss of the model. The gradient boosting model is trained on the remaining data and evaluated on this validation set. Each time another stage (regression tree) is added, the validation set is used to score the model. This continues until the scores over the last `n_iter_no_change` stages do not improve by at least `tol`. At that point the model is considered to have converged, and the addition of further stages is "stopped early". Note that `validation_fraction` only takes effect when `n_iter_no_change` is set; with the default `n_iter_no_change=None`, no early stopping is performed.
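
The stopping rule described above can be sketched in plain Python. This is a simplified illustration, not scikit-learn's actual implementation: the helper `stages_before_stop` and the loss values are hypothetical, and the sketch assumes that a stage counts as an improvement when its validation loss beats at least one of the last `n_iter_no_change` recorded losses by more than `tol`.

```python
def stages_before_stop(val_losses, n_iter_no_change=5, tol=0.01):
    """Return how many stages are kept, given per-stage validation losses.

    Hypothetical helper for illustration only.
    """
    # Ring buffer holding the most recent accepted validation losses.
    history = [float("inf")] * n_iter_no_change
    for i, loss in enumerate(val_losses):
        if any(loss + tol < past for past in history):
            # Improved on at least one recent loss by more than tol: keep going.
            history[i % n_iter_no_change] = loss
        else:
            # No sufficient improvement over the recent window: stop here.
            return i + 1
    # Never triggered: all stages are kept.
    return len(val_losses)


# Made-up losses that plateau after the third stage.
losses = [1.0, 0.5, 0.3, 0.295, 0.294, 0.293, 0.292]
print(stages_before_stop(losses, n_iter_no_change=2, tol=0.01))  # 5
```

A larger `n_iter_no_change` makes the rule more patient, and a larger `tol` demands bigger improvements before training is allowed to continue.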

The number of stages in the final model is available in the `n_estimators_` attribute.
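
As a quick check of this attribute, the sketch below fits a classifier with early stopping on a small synthetic dataset and compares `n_estimators_` with the number of fitted stages stored in `estimators_`. The dataset (`make_classification` with 300 samples), the variable names, and the hyperparameter values are assumptions chosen only to keep the example fast; they are not part of the assignment.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Small synthetic dataset, used here only so the example runs quickly.
X_demo, y_demo = make_classification(n_samples=300, random_state=0)

demo = GradientBoostingClassifier(
    n_estimators=200,         # upper bound on the number of stages
    validation_fraction=0.2,  # share of the fit data held out for validation
    n_iter_no_change=5,       # setting this enables early stopping
    tol=0.01,
    random_state=0,
)
demo.fit(X_demo, y_demo)

# n_estimators_ is the number of stages actually kept;
# estimators_ holds one row of trees per kept stage.
print(demo.n_estimators_, demo.estimators_.shape[0])
```

The requested upper bound stays available as `demo.n_estimators`, without the trailing underscore.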

This example illustrates how early stopping can be used in the `sklearn.ensemble.GradientBoostingClassifier` model to achieve almost the same accuracy as a model built without early stopping, while using far fewer estimators. This can significantly reduce training time, memory usage and prediction latency.

   * Load digits data set using `load_digits()`
   * Train `GradientBoostingClassifier` with and without early stopping
   * Keep a timer for both cases and report the time it takes to train both models
   * Report the scores for both models

In [2]:
import time
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# Load the digits dataset and list the fields it provides
data = load_digits()
data.keys()

dict_keys(['data', 'target', 'frame', 'feature_names', 'target_names', 'images', 'DESCR'])

In [4]:
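X = data.data
y = data.target

# Hold out a test set; the early-stopping validation split is taken from X_train inside fit()
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=5)
n_estimators = 100  # maximum number of boosting stages for both models

# Both models share the same settings, but `early` may stop before n_estimators stages
early = GradientBoostingClassifier(n_estimators=n_estimators, validation_fraction=0.2,
                                   n_iter_no_change=5, tol=0.01, random_state=0)
plain = GradientBoostingClassifier(n_estimators=n_estimators, random_state=0)

# Time training without early stopping
start = time.perf_counter()
plain.fit(X_train, y_train)
time_gb = time.perf_counter() - start

# Time training with early stopping
start = time.perf_counter()
early.fit(X_train, y_train)
time_gbes = time.perf_counter() - start

# Mean accuracy on the held-out test set
plain_score = plain.score(X_test, y_test)
early_score = early.score(X_test, y_test)

# Number of stages each model actually fitted
n_est_plain = plain.n_estimators_
n_est_early = early.n_estimators_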

print(f"Without Early Stop \n Train time: {time_gb} Score: {plain_score} n_estimators: {n_est_plain}")
print(f"With Early Stop \n Train time: {time_gbes} Score: {early_score} n_estimators: {n_est_early}")

Without Early Stop 
 Train time: 5.8581109046936035 Score: 0.9511111111111111 n_estimators: 100
With Early Stop 
 Train time: 2.8082473278045654 Score: 0.94 n_estimators: 58
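
To see how test accuracy develops as stages are added, one option is `staged_predict`, which yields the model's predictions after each boosting stage. The sketch below is a minimal illustration under assumptions of my own: it refits a smaller model (`n_estimators=30`, chosen only to keep the run short) on the same digits split, and the variable names with the `_s` suffix are hypothetical and used to avoid overwriting the variables above. No particular scores are implied.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X_s, y_s = load_digits(return_X_y=True)
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(X_s, y_s, random_state=5)

# Smaller model than in the experiment above, only to keep the example quick.
clf_s = GradientBoostingClassifier(n_estimators=30, random_state=0)
clf_s.fit(X_train_s, y_train_s)

# One accuracy value per boosting stage, computed on the test set.
stage_scores = [
    accuracy_score(y_test_s, y_pred) for y_pred in clf_s.staged_predict(X_test_s)
]

for stage in (1, 10, 20, 30):
    print(f"stages: {stage:2d}  test accuracy: {stage_scores[stage - 1]:.3f}")
```

If the accuracy flattens well before the last stage, that is the region where early stopping would be expected to end training.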
