
XGBoost 2.0 doesn't support Optuna Pruning Callback #9608

Closed
hanhanwu opened this issue Sep 25, 2023 · 15 comments

Comments

@hanhanwu

hanhanwu commented Sep 25, 2023

In older versions of XGBoost, we could use the Optuna pruning callback (see example code here), like the code below:
[screenshot: example code using the Optuna pruning callback]

Now in the XGBoost 2.0 train() function, every callback has to be a TrainingCallback (https://xgboost.readthedocs.io/en/stable/python/python_api.html#callback-api), so we can't use Optuna pruning with XGBoost anymore.
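For context, here is a minimal sketch of the interface XGBoost 2.0 now enforces on every callback (the class name and printed output are just an illustration, not from the XGBoost docs):

import xgboost as xgb

# Every callback passed to xgb.train() must subclass TrainingCallback.
class LoggingCallback(xgb.callback.TrainingCallback):
    # Illustrative callback: print the latest value of each tracked metric.
    def after_iteration(self, model, epoch, evals_log):
        # evals_log maps eval-set name -> metric name -> list of values
        for data_name, metrics in evals_log.items():
            for metric_name, values in metrics.items():
                print(f"round {epoch}: {data_name}-{metric_name} = {values[-1]}")
        return False  # returning True would stop training early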

@trivialfis
Member

You need the latest optuna: https://github.com/optuna/optuna/blob/a920a91e4e5a32ade707f3e9819fff0f96cc7946/optuna/integration/xgboost.py#L16

@hanhanwu
Author

I'm using the latest Optuna (v3.3.0). Its callback isn't the TrainingCallback subclass that XGBoost requires. See https://optuna.readthedocs.io/en/stable/reference/generated/optuna.integration.XGBoostPruningCallback.html

And here is the latest Optuna code for the XGBoost integration: https://github.com/optuna/optuna/blob/master/optuna/integration/xgboost.py

@hcho3
Collaborator

hcho3 commented Sep 25, 2023

optuna.integration.XGBoostPruningCallback is a subclass of xgb.callback.TrainingCallback. See this line here: https://github.com/optuna/optuna/blob/a920a91e4e5a32ade707f3e9819fff0f96cc7946/optuna/integration/xgboost.py#L40

So you can use XGBoostPruningCallback in XGBoost 2.0 as well.
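A quick way to verify this against the Optuna version you actually have installed (a one-off check, assuming both packages import cleanly):

import optuna.integration
import xgboost as xgb

# Prints True only if the installed Optuna's pruning callback satisfies
# XGBoost 2.0's isinstance check on callbacks.
print(issubclass(optuna.integration.XGBoostPruningCallback,
                 xgb.callback.TrainingCallback))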

@hanhanwu
Author

@hcho3 Thank you, but I'm sure the problem is in XGBoost 2.0. Here's my reasoning:

I just tried an older version of XGBoost with Optuna; with the same code, pruning worked. Then I upgraded Optuna to the latest version (v3.3.0), and up to XGBoost v1.7.6 pruning still worked. The problem only appeared with XGBoost 2.0.

For more detail, here's my code that uses Optuna with XGBoost. If you try this with the latest XGBoost and the latest Optuna, you should be able to reproduce the error.

import timeit

import optuna
import xgboost as xgb
from sklearn.metrics import average_precision_score


def get_hpo_params(opt_X_train, opt_X_val, opt_y_train, opt_y_val, n_trials=180):
    """
        To make the optuna output below reproducible, set the hash seed in your OS:
        https://gerrychain.readthedocs.io/en/latest/topics/reproducibility.html#set-pythonhashseed-0
        This is to make pruning behaviour replicable.
    """
    def objective(trial):
        dtrain = xgb.DMatrix(opt_X_train, label=opt_y_train, enable_categorical=True)
        dval = xgb.DMatrix(opt_X_val, label=opt_y_val, enable_categorical=True)

        params = {
            'objective': 'binary:hinge',
            'verbosity': 0,
            'eval_metric': 'auc',
            'seed': 10,
            'nthread': 4,
            'saved_feature_importance_type': 1,  # 1 means "gain", 0 means "split"

            'max_depth': trial.suggest_int('max_depth', 3, 10),
            'learning_rate': trial.suggest_float('learning_rate', 0.001, 0.2),
            'min_child_weight': trial.suggest_int('min_child_weight', 1, 5),
            'subsample': trial.suggest_float('subsample', 0.8, 1),
            'booster': trial.suggest_categorical('booster', ['gbtree', 'dart']),
            'lambda': trial.suggest_float('lambda', 0.001, 0.5),
            'gamma': trial.suggest_float('gamma', 0.1, 1),
            'grow_policy': trial.suggest_categorical('grow_policy', ['depthwise', 'lossguide'])
        }
        
        if params['booster'] == 'dart':
            params['sample_type'] = trial.suggest_categorical('sample_type', ['uniform', 'weighted'])
            params['normalize_type'] = trial.suggest_categorical('normalize_type', ['tree', 'forest'])
            params['rate_drop'] = trial.suggest_float('rate_drop', 0, 0.2)
            params['skip_drop'] = trial.suggest_float('skip_drop', 0, 0.2)
            
        pruning_callback = optuna.integration.XGBoostPruningCallback(trial, 'validation-auc')
            
        clf = xgb.train(
            params, dtrain, evals=[(dval, "validation")], callbacks=[pruning_callback]
        )

        y_val_pred = clf.predict(dval)
        avp = average_precision_score(opt_y_val, y_val_pred)
        return avp


    start = timeit.default_timer()
    study = optuna.create_study(direction="maximize", pruner=optuna.pruners.HyperbandPruner(),
                               sampler=optuna.samplers.TPESampler(seed=10), study_name='xgb_hpo')
    study.optimize(objective, n_trials=n_trials)

    print("Number of finished trials: {}".format(len(study.trials)))
    best_trial = study.best_trial
    print(f"Best trial performance: {best_trial.value}")
    stop = timeit.default_timer()
    print('Time (min): ', (stop - start)/60)
    
    return best_trial.params

The error I got is "TypeError: callback must be an instance of TrainingCallback."

I'm sure the problem is in the latest XGBoost. The reason I want to use XGBoost 2.0 is its enable_categorical support, which makes my life easier. If my work didn't require comparing LGBM and XGBoost, I wouldn't want to use XGBoost after this experience; LGBM has a much better user experience.

Please look into this issue and fix it. Otherwise, who wants to use XGboost?

@hcho3
Collaborator

hcho3 commented Sep 27, 2023

Otherwise, who wants to use XGboost?

This veers into unnecessary personal attack. Please do not do this again. Your post was fine without this sentence. If you keep posting uncivil comments like this, we will have to ban you.

I understand you are frustrated. (Trust me, bugs frustrate us maintainers too.) Still, give us maintainers proper respect.

@hcho3 hcho3 reopened this Sep 27, 2023
@hcho3
Collaborator

hcho3 commented Sep 27, 2023

Reopening, so that we can investigate

@hcho3
Collaborator

hcho3 commented Sep 27, 2023

@hanhanwu Thank you for providing the Python snippet. I was able to reproduce the error.

Diagnosis
The current version of Optuna has a bug that made XGBoostPruningCallback incompatible with XGBoost 2.0. The bug was fixed in this pull request: optuna/optuna#4921. Unfortunately, the fix is not yet part of the latest release of Optuna (3.3.0).

Solution
For now, you can install the development version of Optuna in your Python environment:

pip install git+https://github.com/optuna/optuna.git

This should fix the error.
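After installing, a quick sanity check that your environment picked up the development build (the exact version string will vary):

import optuna
print(optuna.__version__)  # should report a dev build newer than 3.3.0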

@hcho3 hcho3 closed this as completed Sep 27, 2023
@hanhanwu
Author

I see, thank you for the investigation.

@hanhanwu
Author

Otherwise, who wants to use XGboost?

This veers into unnecessary personal attack. Please do not do this again. Your post was fine without this sentence. If you keep posting uncivil comments like this, we will have to ban you.

I understand you are frustrated. (Trust me, bugs frustrate us maintainers too.) Still, give us maintainers proper respect.

Sorry if my words made you feel disrespected. In fact, I felt I was treated without respect here in the first place.

I need to provide feedback here:

  1. Asking users to provide more details would have been better than throwing out an ineffective answer and closing the issue immediately, which is what I experienced with the first response. What irritated me was not the bug, but seeing the issue closed when the problem wasn't solved at all; I believed nobody had really looked into it. That made me feel disrespected.
  2. If you are going to spend time on an issue, why not try to do a good job and help find the root cause? The second response I got was better, and theoretically correct, but if I could actually use that answer, why would I have posted a question here? It still left me in a helpless situation.

Finally, you did find the reason. That's great, and I appreciate it. But since you eventually spent the effort to dig out the cause, why didn't that happen earlier? Is there anything more you should be doing?

I don't care about being banned here. If asking a question can't get a helpful answer and only leaves me feeling more helpless, that's no better than being banned; I can go to Stack Overflow. Also, as a user, if the experience is unpleasant, I won't use the tool, no matter how much effort you spent on the product, especially when there are alternatives and new tools on the horizon. User interaction here is also part of the user experience.

@trivialfis
Member

trivialfis commented Sep 28, 2023

Sure, apologies for the inconvenience. Sometimes I'm too used to these types of issues and assume people will find the answer by looking into some of the code, hence the link to the Optuna source. There's definitely room for improvement.

That's mostly how open source project issues work: this isn't a product, and we are not customer service. I started contributing as a student in school. Given that XGBoost is a library and its users are developers themselves, we expect the community to cooperate and help, at least by trying some quick debugging of the issue at hand; it's not unusual for me to ask others whether they want to open a PR.

Feel free to reopen the issue if the answer is unsatisfying; we can provide further assistance on a best-effort basis.

@hcho3
Collaborator

hcho3 commented Sep 28, 2023

Apologies for closing the issue prematurely. Next time, we won't close issues until they are resolved. Also, I am sorry that I rushed my second response (#9608 (comment)). It was technically correct but not very helpful. I will try to do better next time.

Since finally you still spent the effort to dig out the reason, why it didn't happen earlier?

Mostly because our time is limited. Like other open source projects, we try to provide support on a best-effort basis. (Of course, this does not excuse closing the issue prematurely; I am just providing context here.)

@hanhanwu
Author

Thank you guys!

You guys are still very young, with lots of future potential. I honestly think what you're doing here has more meaning than you realize. I don't plan to change your opinions, but as a gift, I want to share two stories:

  • This is one example of my interactions with FLAML (another open source tool). I asked them many questions and sometimes even got answers during weekends and holidays... They even asked me for feedback (you don't need to work that hard). They always made me believe they wanted to make their open source tool better and better.
  • This is how a ZenML cofounder helped me solve a problem. Who can be busier than a cofounder? In the end he helped me find a great solution.

These people all inspired me a lot in my work. Sometimes it's just this type of small thing: a Q&A makes a difference. I wonder how you will think about it five years from now :)
Anyway, thank you all!

@joshdunnlime

Solution For now, you can install the development version of Optuna in your Python environment:

pip install git+https://github.com/optuna/optuna.git

This should fix the error.

@PenidoRafael

I still can't use it. I tried installing the development version of Optuna and the latest version of XGBoost, and it did not work.
Does anyone have a working solution right now?

@trivialfis
Member

Could you please open a new issue with a reproducible example?
