Add feature weights support #265

mozjay0619 · 2023-02-04T22:46:29Z

This PR adds support for the feature_weights parameter, which has been an xgb.DMatrix parameter since XGBoost version 1.3.0. This PR does not add this support to the RayDeviceQuantileDMatrix class.

Yard1

Thanks for the PR! Can you run format.sh in the root folder to make sure the code is properly formatted?

Yard1 · 2023-02-05T04:56:18Z

setup.py

@@ -3,7 +3,7 @@
 setup(
    name="xgboost_ray",
    packages=find_packages(where=".", include="xgboost_ray*"),
-    version="0.1.14",
+    version="0.1.15",


There is no need to change the version in this PR.

Yeah, thanks for the comment. Will do!

mozjay0619 · 2023-02-06T18:02:58Z

@Yard1 I'm not sure how to proceed with the latest pytest failure. Maybe this is a temporary issue on the Azure repo side? The error message says "Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?", but I am not sure how to do this. Could you please advise? Thanks!

Yard1 · 2023-02-06T18:39:50Z

@mozjay0619 I just reran the job and it passed that point. It looks like a random failure not related to your changes.

Yard1 · 2023-02-06T20:30:49Z

hmm @mozjay0619 we seem to have some actual errors, can you check?

mozjay0619 · 2023-02-06T21:20:36Z

@Yard1 Sorry, let me explain the nature of the error since we've had this multiple times. feature_weights works through the colsample feature, so it is inherently stochastic. The weights are configured to be monotonically increasing, with the last (9th) feature having the biggest contribution. But due to the stochasticity, the second-to-last feature sometimes overpowers the last feature. I increased the boosting rounds for convergence and lowered the colsample rates as well. And this time, I tested it against 100 random seed values. I can take more drastic measures, but I don't want to alter the code too much from the original demo code provided in the official xgboost docs... But it is working properly, since the feature with 0 weight is not being selected at all. If it fails again this time, I will make the weight distribution extremely skewed.
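To make the stochasticity concrete, here is a minimal pure-Python sketch of weighted column sampling without replacement. It is an illustration only, not xgboost's actual implementation: the helper `sample_columns`, the weights 0 through 9, the choice of 3 columns per draw, and the 2000 draws are all assumptions made for the example. It shows the two properties discussed above: a zero-weight feature can never be drawn, while the relative ranking of the heaviest features depends on how many draws are made.

```python
import random
from collections import Counter


def sample_columns(weights, k, rng):
    """Pick up to k distinct column indices, each draw proportional to weight.

    Hypothetical helper for illustration; xgboost's sampler is not reproduced here.
    """
    # Columns with zero weight are never candidates.
    candidates = [i for i, w in enumerate(weights) if w > 0]
    chosen = []
    for _ in range(min(k, len(candidates))):
        pick = rng.choices(
            candidates, weights=[weights[i] for i in candidates], k=1
        )[0]
        chosen.append(pick)
        candidates.remove(pick)  # sample without replacement
    return chosen


rng = random.Random(0)
weights = list(range(10))  # assumed: feature 0 has weight 0, feature 9 the largest

counts = Counter()
for _ in range(2000):  # each iteration stands in for one sampling event
    counts.update(sample_columns(weights, k=3, rng=rng))

# Feature 0 never appears; the gap between features 8 and 9 shrinks or even
# flips when far fewer draws are made, which is the effect described above.
print(sorted(counts.items()))
```

With only a handful of draws, the counts for the two heaviest features can easily swap order, which is consistent with the fix of increasing the number of boosting rounds.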

Yard1 · 2023-02-06T22:22:04Z

@mozjay0619 can we also set the seed as a parameter for xgboost? The numpy seed will not be carried over to the xgboost-ray Actors carrying out the training.

mozjay0619 · 2023-02-06T23:09:30Z

@Yard1 Yeah, I had entertained that exact idea. But I couldn't find an easy way to do it, since the xgboost.train() method does not have a seed parameter and it relies on the numpy random state. I suspect this is why the demo code sets the numpy random state. One hacky way might be to add a seed parameter to the train method in xgboost_ray and use that number to set the local seed for each Actor? But I'm not sure what this would actually do in the context of distributed xgboost training...

The current tests all seem to have passed the feature weight part. I would suggest adding the random state support as a separate PR. But please let me know if you feel very strongly about this, I can look into it more :)

Yard1 · 2023-02-07T03:35:06Z

I believe it has a seed learning parameter but if the test passes here then I am fine with that. Thanks for iterating on this!
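For reference, a minimal sketch of what setting the seed through the parameter dict could look like. `seed` and `colsample_bynode` are documented xgboost parameters; the specific values and the objective below are illustrative assumptions, not the ones used in this PR's test.

```python
# Illustrative parameter dict; all values are assumptions for the example.
params = {
    "objective": "reg:squarederror",
    # Column subsampling must be below 1.0 for feature weights to matter.
    "colsample_bynode": 0.5,
    # xgboost's own RNG seed, set via params rather than via numpy's global state.
    "seed": 42,
}
```

Because the seed lives in the params dict rather than in process-local numpy state, it is passed along with the rest of the training configuration.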

mozjay0619 added 2 commits February 4, 2023 14:17

added support for feature_weights parameter and associated tests

a586857

added tests for the feature_weights parameter

b090be1

mozjay0619 mentioned this pull request Feb 4, 2023

Adding to support to feature_weights #264

Closed

Yard1 reviewed Feb 5, 2023


Yard1 self-assigned this Feb 5, 2023

mozjay0619 added 3 commits February 4, 2023 21:39

ran code formatter

320ad4d

revert version

1e3d721

address lint_test failures

a149954

mozjay0619 requested a review from Yard1 February 5, 2023 06:53

mozjay0619 added 3 commits February 5, 2023 00:01

addressed further test_lint failures

28cdd0b

addressing pytest failure reduce resource request

1b22df5

addressing pytest failure increase boosting rounds for convergence

cfd03a1

addressing pytest failure increase boosting rounds for convergence

2a9b3c0

Yard1 approved these changes Feb 7, 2023


Yard1 merged commit dcdc4b7 into ray-project:master Feb 7, 2023
