How to use bst.eval_set() and bst.update() with xgboost_ray #248

Open
Jeffwan opened this issue Dec 6, 2022 · 3 comments


Jeffwan commented Dec 6, 2022

I am trying to adopt xgboost_ray for an xgboost project and have run into a problem. The original code exercises fine-grained control over the training process. For every iteration:

        # Evaluate the current model on both datasets; eval_set returns a tab-separated string
        eval_results = self.bst.eval_set(
            evals=[(self.dmat_train, "train"), (self.dmat_valid, "valid")],
            iteration=self.bst.num_boosted_rounds() - 1,
        )
        self.log_info(fl_ctx, eval_results)
        # Field 0 is the iteration tag, field 1 is "train", field 2 is "valid"
        auc = float(eval_results.split("\t")[2].split(":")[1])
        # Run one boosting update per tree to add in this round
        for i in range(self.trees_per_round):
            self.bst.update(self.dmat_train, self.bst.num_boosted_rounds())

        # extract newly added self.trees_per_round using xgboost slicing api
        bst = self.bst[self.bst.num_boosted_rounds() - self.trees_per_round : self.bst.num_boosted_rounds()]

code source: https://github.com/NVIDIA/NVFlare/blob/dev/nvflare/app_opt/xgboost/tree_based/executor.py#L153-L174
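The `split("\t")[2]` indexing in the snippet above depends on where the validation entry sits in the string returned by eval_set(). A minimal sketch of a name-based parse follows, assuming the string is an iteration tag followed by tab-separated `name:value` pairs. The helper name `parse_eval_string` and the sample string are hypothetical and only for illustration:

```python
def parse_eval_string(eval_results: str) -> dict:
    """Turn an eval string such as "[3]\ttrain-auc:0.9\tvalid-auc:0.8" into a dict."""
    fields = eval_results.strip().split("\t")
    metrics = {}
    # fields[0] is the iteration tag (e.g. "[3]"); the rest are "name:value" pairs
    for field in fields[1:]:
        # Split on the last colon so the value is always the trailing number
        name, _, value = field.rpartition(":")
        metrics[name] = float(value)
    return metrics


# Hypothetical sample string, not output captured from a real run
sample = "[3]\ttrain-auc:0.912\tvalid-auc:0.875"
metrics = parse_eval_string(sample)
auc = metrics["valid-auc"]
```

Looking the metric up by name keeps the parse stable if the order or number of entries in `evals` changes.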

Note: I already obtain the bst object from xgboost_ray.train().

There are two blockers: bst.eval_set() and bst.update(). Because bst comes from the xgboost library, it does not accept a RayDMatrix and raises the following error:

  File "/usr/local/lib/python3.8/site-packages/xgboost/core.py", line 1980, in eval_set
    raise TypeError(f"expected DMatrix, got {type(d[0]).__name__}")
TypeError: expected DMatrix, got RayDMatrix

I looked at the documentation and could not find a replacement for these methods, as there is for predict. How can I make this work?

/cc @Yard1

Member

Yard1 commented Dec 6, 2022

It looks like you are implementing your own training loop. This goes beyond what xgboost-ray provides out of the box.

You'd most likely need to subclass the internal RayXGBoostActor (xgboost_ray/main.py) and replace the logic inside the predict method, which is run on every worker using normal xgboost (which is configured to communicate with other workers through the rabit tracker). We do not provide an API to pass your own Actor class, so you'll have to most likely monkey-patch it.
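The subclass-and-monkey-patch pattern described above can be sketched in plain Python. This is only an illustration of the pattern, not xgboost-ray code: `StandInActor`, `stand_in_module`, `CustomActor`, and the `train(num_rounds)` signature are all hypothetical stand-ins, and the real RayXGBoostActor methods take different arguments.

```python
import types


class StandInActor:
    """Hypothetical stand-in for the internal actor class; not the real API."""

    def train(self, num_rounds):
        # Pretend each boosting round yields one tree
        return [f"tree-{i}" for i in range(num_rounds)]


# Hypothetical stand-in for the module that exposes the actor class by name
stand_in_module = types.SimpleNamespace(RayXGBoostActor=StandInActor)


class CustomActor(stand_in_module.RayXGBoostActor):
    """Subclass that wraps train() with extra per-call bookkeeping."""

    def train(self, num_rounds):
        trees = super().train(num_rounds)
        # Custom logic goes here; this sketch just records the tree count
        self.tree_counts = getattr(self, "tree_counts", []) + [len(trees)]
        return trees


# Monkey-patch: rebind the module-level name to the subclass
stand_in_module.RayXGBoostActor = CustomActor

actor = stand_in_module.RayXGBoostActor()
trees = actor.train(2)
```

Rebinding a name only affects code that looks the class up after the patch. Anything that captured a reference to the original class earlier, or that runs in a separate worker process, would not see the replacement, so this would need to be checked against the actual xgboost_ray source before relying on it.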

I would be happy to look into making this process smoother by providing developer APIs.

Author

Jeffwan commented Dec 7, 2022

> This goes beyond what xgboost-ray provides out of the box.

Thanks. I know this is beyond the scope right now. Does xgboost_ray plan to support it later?

> We do not provide an API to pass your own Actor class, so you'll have to most likely monkey-patch it.

It seems I would need to replicate functions such as train() or predict(), but using a custom RayXGBoostActor. That requires me to fully understand the xgboost_ray code. Do you think there is an easier way to support my use case?

Member

Yard1 commented Dec 7, 2022

I think the train() and predict() methods of RayXGBoostActor are relatively straightforward and do not require knowledge of the entire xgboost-ray codebase. I do not believe there's an easier way.

We can add some extra developer APIs to make modifying the training/prediction behavior easier.

I'd be happy to schedule a chat to talk about this, if you think that'll be helpful! Please email me at antoni [at] anyscale.com
