
Question: is it possible to use early stopping of LightGBM, CatBoost? #251

Open
benitocm opened this issue Oct 10, 2022 · 5 comments
Labels
question Further information is requested

Comments

@benitocm

Hi,

I am using GBT algorithms as the base regressor for the forecaster. I am interested in using the early stopping feature of these kinds of algorithms. Is it possible?

In the case of HistGradientBoostingRegressor, I think it is easier because the early stopping is configured differently.

Thank you in advance

@JoaquinAmatRodrigo
Owner

Hi @benitocm,
Early stopping should not be activated when using a GBT (or any other regressor) inside a Forecaster unless it uses a validation strategy that keeps the time order of the samples. In the case of HistGradientBoostingRegressor and LightGBM, the default validation set seems to be a random sample, so it is not valid for time series.
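
For example (a minimal sketch, not part of the Forecaster API; the data, split size and parameters are only illustrative), a validation strategy that keeps the time order is to take the most recent part of the training data as the validation set and pass it explicitly to LightGBM:

import lightgbm as lgb
import numpy as np
import pandas as pd

# illustrative data: X and y are assumed to be ordered in time (oldest first)
X = pd.DataFrame({'lag_1': np.random.rand(500), 'lag_2': np.random.rand(500)})
y = pd.Series(np.random.rand(500))

# chronological split: the last 20% of the samples is the validation set
n_val = int(len(X) * 0.2)
X_train, X_val = X.iloc[:-n_val], X.iloc[-n_val:]
y_train, y_val = y.iloc[:-n_val], y.iloc[-n_val:]

model = lgb.LGBMRegressor(n_estimators=1000, learning_rate=0.02)
model.fit(
    X_train,
    y_train,
    eval_set=[(X_val, y_val)],
    eval_metric='rmse',
    callbacks=[lgb.early_stopping(stopping_rounds=50)]
)
print(model.best_iteration_)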

@benitocm
Author

benitocm commented Oct 10, 2022

Hi,

When I have done it from scratch, I have used TimeSeriesSplit from scikit-learn to enforce the time constraints. In the case of HistGradientBoostingRegressor, I use cross_val_score with a TimeSeriesSplit CV, expecting that the time constraints hold.

Do you think those approaches are not correct? If so, I would appreciate your input.

In the case of the darts library, a validation series can be provided to the fit method. Would something like this be the only way to use early stopping?

Thanks very much for your time

Something related to this unit8co/darts#1154

@JoaquinAmatRodrigo
Owner

If the time constraints hold, I think there is no problem using it. Could you add a small example so we can double-check whether the approach is correct?

@benitocm
Author

benitocm commented Oct 19, 2022

Hi,

In the case of HistGradientBoostingRegressor, I am using cross_val_score and I am assuming that scikit-learn is not doing any shuffling:

from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

# TimeSeriesSplit keeps the samples in time order; if test_size is set,
# it must be an integer number of samples (a fraction such as .2 is not accepted)
my_kfold = TimeSeriesSplit(n_splits=5)

hgb_params = {
    'loss': 'squared_error',
    'scoring': 'loss',
    'learning_rate': 0.02,
    'verbose': 0,
    'random_state': 42,
    'max_iter': 1000,
    'early_stopping': True,
    'validation_fraction': .1,
    'n_iter_no_change': 10
}

hgb_model = HistGradientBoostingRegressor(**hgb_params)
neg_x_val_score = cross_val_score(hgb_model, x_base_df, y_base_df,
                                  scoring='neg_mean_squared_error',
                                  cv=my_kfold, n_jobs=5)
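
As a quick check (illustrative) that the outer folds keep the time order, the index ranges of each split can be printed; every training block ends right before its validation block starts:

# TimeSeriesSplit never shuffles: train indices always precede validation indices
for fold, (train_idx, valid_idx) in enumerate(my_kfold.split(x_base_df)):
    print(fold, train_idx.min(), train_idx.max(), valid_idx.min(), valid_idx.max())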

In the case of CatBoost, I am using the has_time parameter and doing the cross-validation loop myself:

from collections import defaultdict
from time import perf_counter
from catboost import CatBoostRegressor

# cb_params (including has_time=True), early_stopping_rounds, opt_callbacks,
# scoring and calc_scores are defined elsewhere in my code
cv_results = defaultdict(list)
for fold, (train_index, valid_index) in enumerate(my_kfold.split(x_df)):
    x_train_df, x_valid_df = x_df.iloc[train_index, :], x_df.iloc[valid_index, :]
    y_train_df, y_valid_df = y_df.iloc[train_index, :], y_df.iloc[valid_index, :]
    cb_model = CatBoostRegressor(**cb_params)
    t1 = perf_counter()
    if early_stopping_rounds != 0:
        # the validation fold (the most recent samples of the split) is used for early stopping
        _ = cb_model.fit(
                x_train_df,
                y_train_df,
                eval_set=(x_valid_df, y_valid_df),
                use_best_model=True,
                early_stopping_rounds=early_stopping_rounds,
                callbacks=opt_callbacks
        )
    else:
        _ = cb_model.fit(x_train_df, y_train_df, callbacks=opt_callbacks)

    scores = calc_scores(cb_model, scoring, x_valid_df, y_valid_df)
    for k, v in scores.items():
        cv_results[k].append(v)

    cv_results['lr'].append(cb_model.learning_rate_)
    cv_results['n_trees'].append(cb_model.tree_count_)

@JavierEscobarOrtiz added the question label Oct 20, 2022
@benitocm
Author

Hi again,

Maybe I did not make myself clear. My goal in using early stopping is to take advantage of it to estimate a reasonable number of trees. In the case of CatBoost (according to this), when you set the number of trees, the algorithm itself selects a learning rate that is very close to the optimal one. That makes the tuning easier (which is not the case with the other GBT algorithms).
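
For illustration, a minimal sketch of that idea, assuming the cv_results dict filled in the loop above (the refit parameters are just placeholders):

import numpy as np
from catboost import CatBoostRegressor

# assumption: cv_results['n_trees'] holds the tree counts found by early stopping in each fold
n_trees = int(np.mean(cv_results['n_trees']))

# refit on the full training set with a fixed number of trees and no early stopping;
# when learning_rate is not set, CatBoost picks one close to the optimal for that iteration count
final_model = CatBoostRegressor(iterations=n_trees, has_time=True, random_state=42, verbose=0)
final_model.fit(x_df, y_df)
print(final_model.learning_rate_, final_model.tree_count_)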

Thanks very much
