
XGBoost 2.0 doesn't support Optuna Pruning Callback #9608

Closed
hanhanwu opened this issue Sep 25, 2023 · 15 comments

Comments

@hanhanwu

hanhanwu commented Sep 25, 2023

In older versions of XGBoost, we could use the Optuna pruning callback (see example code here), like the code below:
[screenshot: example code using the Optuna pruning callback]

Now in the XGBoost 2.0 train() function, every callback has to be a TrainingCallback (https://xgboost.readthedocs.io/en/stable/python/python_api.html#callback-api), so we can't use Optuna pruning with XGBoost anymore.
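For context, here is a minimal sketch of the interface XGBoost 2.0 now enforces on every callback (the class name and printed output are just an illustration, not from the XGBoost docs):

import xgboost as xgb

# Every callback passed to xgb.train() must subclass TrainingCallback.
class LoggingCallback(xgb.callback.TrainingCallback):
    # Illustrative callback: print the latest value of each tracked metric.
    def after_iteration(self, model, epoch, evals_log):
        # evals_log maps eval-set name -> metric name -> list of values
        for data_name, metrics in evals_log.items():
            for metric_name, values in metrics.items():
                print(f"round {epoch}: {data_name}-{metric_name} = {values[-1]}")
        return False  # returning True would stop training early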

@trivialfis
Member

You need the latest optuna: https://github.com/optuna/optuna/blob/a920a91e4e5a32ade707f3e9819fff0f96cc7946/optuna/integration/xgboost.py#L16

@hanhanwu
Author

I'm using the latest Optuna (v3.3.0). Its callback isn't the TrainingCallback subclass that XGBoost requires. See https://optuna.readthedocs.io/en/stable/reference/generated/optuna.integration.XGBoostPruningCallback.html

And here is the latest Optuna code for the XGBoost integration: https://github.com/optuna/optuna/blob/master/optuna/integration/xgboost.py

@hcho3
Collaborator

hcho3 commented Sep 25, 2023

optuna.integration.XGBoostPruningCallback is a subclass of xgb.callback.TrainingCallback. See this line here: https://github.com/optuna/optuna/blob/a920a91e4e5a32ade707f3e9819fff0f96cc7946/optuna/integration/xgboost.py#L40

So you can use XGBoostPruningCallback in XGBoost 2.0 as well.
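A quick way to verify this against the Optuna version you actually have installed (a one-off check, assuming both packages import cleanly):

import optuna.integration
import xgboost as xgb

# Prints True only if the installed Optuna's pruning callback satisfies
# XGBoost 2.0's isinstance check on callbacks.
print(issubclass(optuna.integration.XGBoostPruningCallback,
                 xgb.callback.TrainingCallback))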

@hanhanwu
Author

@hcho3 Thank you, but I'm sure the problem is in XGBoost 2.0. Here's my reasoning:

I just tried an older version of XGBoost with Optuna; with the same code, pruning worked. Then I upgraded Optuna to the latest version (v3.3.0), and up to XGBoost v1.7.6 pruning still worked. The problem only appeared with XGBoost 2.0.

For more detail, here's my code that uses Optuna with XGBoost. If you try this with the latest XGBoost and the latest Optuna, you should be able to reproduce the error.

import timeit

import optuna
import xgboost as xgb
from sklearn.metrics import average_precision_score


def get_hpo_params(opt_X_train, opt_X_val, opt_y_train, opt_y_val, n_trials=180):
    """
        To make the optuna output below reproducible, set the hash seed in your OS:
        https://gerrychain.readthedocs.io/en/latest/topics/reproducibility.html#set-pythonhashseed-0
        This is to make pruning behaviour replicable.
    """
    def objective(trial):
        dtrain = xgb.DMatrix(opt_X_train, label=opt_y_train, enable_categorical=True)
        dval = xgb.DMatrix(opt_X_val, label=opt_y_val, enable_categorical=True)

        params = {
            'objective': 'binary:hinge',
            'verbosity': 0,
            'eval_metric': 'auc',
            'seed': 10,
            'nthread': 4,
            'saved_feature_importance_type': 1,  # 1 means "gain", 0 means "split"

            'max_depth': trial.suggest_int('max_depth', 3, 10),
            'learning_rate': trial.suggest_float('learning_rate', 0.001, 0.2),
            'min_child_weight': trial.suggest_int('min_child_weight', 1, 5),
            'subsample': trial.suggest_float('subsample', 0.8, 1),
            'booster': trial.suggest_categorical('booster', ['gbtree', 'dart']),
            'lambda': trial.suggest_float('lambda', 0.001, 0.5),
            'gamma': trial.suggest_float('gamma', 0.1, 1),
            'grow_policy': trial.suggest_categorical('grow_policy', ['depthwise', 'lossguide'])
        }
        
        if params['booster'] == 'dart':
            params['sample_type'] = trial.suggest_categorical('sample_type', ['uniform', 'weighted'])
            params['normalize_type'] = trial.suggest_categorical('normalize_type', ['tree', 'forest'])
            params['rate_drop'] = trial.suggest_float('rate_drop', 0, 0.2)
            params['skip_drop'] = trial.suggest_float('skip_drop', 0, 0.2)
            
        pruning_callback = optuna.integration.XGBoostPruningCallback(trial, 'validation-auc')
            
        clf = xgb.train(
            params, dtrain, evals=[(dval, "validation")], callbacks=[pruning_callback]
        )

        y_val_pred = clf.predict(dval)
        avp = average_precision_score(opt_y_val, y_val_pred)
        return avp


    start = timeit.default_timer()
    study = optuna.create_study(direction="maximize", pruner=optuna.pruners.HyperbandPruner(),
                               sampler=optuna.samplers.TPESampler(seed=10), study_name='xgb_hpo')
    study.optimize(objective, n_trials=n_trials)

    print("Number of finished trials: {}".format(len(study.trials)))
    best_trial = study.best_trial
    print(f"Best trial performance: {best_trial.value}")
    stop = timeit.default_timer()
    print('Time (min): ', (stop - start)/60)
    
    return best_trial.params

The error I got is "TypeError: callback must be an instance of TrainingCallback."

I'm sure the problem is in the latest XGBoost. The reason I want to use XGBoost 2.0 is its enable_categorical support, which makes my life easier. If my work didn't require comparing LGBM and XGBoost, I wouldn't want to use XGBoost after this experience; LGBM has a much better user experience.

Please look into this issue and fix it. Otherwise, who wants to use XGboost?

@hcho3
Collaborator

hcho3 commented Sep 27, 2023

Otherwise, who wants to use XGboost?

This veers into unnecessary personal attack. Please do not do this again. Your post was fine without this sentence. If you keep posting uncivil comments like this, we will have to ban you.

I understand you are frustrated. (Trust me, bugs frustrate us maintainers too.) Still, give us maintainers proper respect.

@hcho3 hcho3 reopened this Sep 27, 2023
@hcho3
Collaborator

hcho3 commented Sep 27, 2023

Reopening, so that we can investigate

@hcho3
Collaborator

hcho3 commented Sep 27, 2023

@hanhanwu Thank you for providing the Python snippet. I was able to reproduce the error.

Diagnosis
The current version of Optuna has a bug that made XGBoostPruningCallback incompatible with XGBoost 2.0. The bug was fixed in this pull request: optuna/optuna#4921. Unfortunately, the fix is not yet part of the latest release of Optuna (3.3.0).

Solution
For now, you can install the development version of Optuna in your Python environment:

pip install git+https://github.com/optuna/optuna.git

This should fix the error.
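After installing, a quick sanity check that your environment picked up the development build (the exact version string will vary):

import optuna
print(optuna.__version__)  # should report a dev build newer than 3.3.0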

@hcho3 hcho3 closed this as completed Sep 27, 2023
@hanhanwu
Author

I see, thank you for the investigation.

@hanhanwu
Author

Otherwise, who wants to use XGboost?

This veers into unnecessary personal attack. Please do not do this again. Your post was fine without this sentence. If you keep posting uncivil comments like this, we will have to ban you.

I understand you are frustrated. (Trust me, bugs frustrate us maintainers too.) Still, give us maintainers proper respect.

Sorry if my words made you feel disrespected. In fact, I felt I was treated without respect here in the first place.

I need to provide feedback here:

  1. Asking users to provide more details would have been better than throwing out an ineffective answer and closing the issue immediately, which is what I experienced with the first response. What irritated me was not the bug, but seeing the issue closed when the problem wasn't solved at all; I believed nobody had really looked into it. That made me feel disrespected.
  2. If you are going to spend time on an issue, why not try to do a good job and help find the root cause? The second response I got was better, and theoretically correct, but if I could actually use that answer, why would I have posted a question here? It still left me in a helpless situation.

Finally, you did find the reason. That's great, and I appreciate it. But since you eventually spent the effort to dig out the cause, why didn't that happen earlier? Is there anything more you should be doing?

I don't care about being banned here. If asking a question can't get a helpful answer and only leaves me feeling more helpless, that's no better than being banned; I can go to Stack Overflow. Also, as a user, if the experience is unpleasant, I won't use the tool, no matter how much effort you spent on the product, especially when there are alternatives and new tools on the horizon. User interaction here is also part of the user experience.

@trivialfis
Member

trivialfis commented Sep 28, 2023

Sure, apologies for the inconvenience. Sometimes I'm too used to these types of issues and assume people will find the answer by looking into some of the code, hence the link to the Optuna source. There's definitely room for improvement.

That's mostly how open source project issues work: this isn't a product, and we are not customer service. I started contributing as a student in school. Given that XGBoost is a library and its users are developers themselves, we expect the community to cooperate and help, at least by trying some quick debugging of the issue at hand; it's not unusual for me to ask others whether they want to open a PR.

Feel free to reopen the issue if the answer is unsatisfying; we can provide further assistance on a best-effort basis.

@hcho3
Collaborator

hcho3 commented Sep 28, 2023

Apologies for closing the issue prematurely. Next time, we won't close issues until they are resolved. Also, I am sorry that I rushed my second response (#9608 (comment)). It was technically correct but not very helpful. I will try to do better next time.

Since finally you still spent the effort to dig out the reason, why it didn't happen earlier?

Mostly because our time is limited. Like other open source projects, we try to provide support on a best-effort basis. (Of course, this does not excuse closing the issue prematurely; I am just providing context here.)

@hanhanwu
Author

Thank you guys!

You guys are still very young, with lots of future potential. I honestly think what you're doing here has more meaning than you realize. I don't plan to change your opinions, but as a gift, I want to share two stories:

  • This is one example of my interactions with FLAML (another open source tool). I asked them many questions and sometimes even got answers during weekends and holidays... They even asked me for feedback (you don't need to work that hard). They always made me believe they wanted to make their open source tool better and better.
  • This is how a ZenML cofounder helped me solve a problem. Who can be busier than a cofounder? In the end he helped me find a great solution.

These people all inspired me a lot in my work. Sometimes it's just this type of small thing: a Q&A makes a difference. I wonder how you will think about it five years from now :)
Anyway, thank you all!

@joshdunnlime

Solution For now, you can install the development version of Optuna in your Python environment:

pip install git+https://github.com/optuna/optuna.git

This should fix the error.

@PenidoRafael

I still can't use it. I tried installing the development version of Optuna and the latest version of XGBoost, and it did not work.
Does anyone have a working solution right now?

@trivialfis
Member

Could you please open a new issue with a reproducible example?
