[dask] Add DaskXGBRanker #6576
Question to myself: right now the input is required to be sorted by qid. Maybe we need to perform the sorting ourselves?
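If the sorting were done on the Python side, a stable sort by query id would be enough to make each query's rows contiguous while keeping their relative order. A minimal sketch on plain lists follows; `sort_by_qid` is a hypothetical helper, not part of XGBoost, and real inputs would be arrays or dask collections rather than lists:

```python
from typing import List, Sequence, Tuple


def sort_by_qid(
    qid: Sequence[int],
    rows: Sequence[Sequence[float]],
    labels: Sequence[float],
) -> Tuple[List[int], List[Sequence[float]], List[float]]:
    """Reorder rows and labels so that equal query ids are contiguous.

    Hypothetical helper for illustration only; not an XGBoost API.
    """
    if not (len(qid) == len(rows) == len(labels)):
        raise ValueError("qid, rows and labels must have the same length")
    # sorted() is stable, so rows within one query keep their original order.
    order = sorted(range(len(qid)), key=lambda i: qid[i])
    return (
        [qid[i] for i in order],
        [rows[i] for i in order],
        [labels[i] for i in order],
    )
```

In a distributed setting a global sort is more expensive than this suggests, since rows of one query may sit in different partitions; the sketch only covers the single-partition case.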
52348d0 to ed76b8f
Codecov Report
@@            Coverage Diff             @@
##           master    #6576      +/-   ##
==========================================
+ Coverage   80.23%   80.49%   +0.25%
==========================================
  Files          13       13
  Lines        3613     3665      +52
==========================================
+ Hits         2899     2950      +51
- Misses        714      715       +1
I need to disable support for group weights for now. The point of using qid is to avoid having too much data manipulation code in Python, but per-group weights make some of that manipulation hard to avoid.
54d0795 to 0474666
* Support `qid` in libxgboost.
* Refactor `predict` and `n_features_in_`, `best_[score/iteration/ntree_limit]` to avoid duplicated code.
* Define `DaskXGBRanker`.

The dask ranker doesn't support group structure; instead it uses query id and converts it to group ptr internally.
f406a48 to fb50247
Initial support for distributed LTR using dask.
* Support `qid` in libxgboost.
* Refactor `predict` and `n_features_in_`, `best_[score/iteration/ntree_limit]` to avoid duplicated code.
* Define `DaskXGBRanker`.

The dask ranker doesn't support group structure; instead it uses query id and converts it to group ptr internally.
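To illustrate what converting query ids to a group pointer means, here is a rough sketch in plain Python. It assumes the ids are already sorted in non-decreasing order, as the conversation above requires; `qid_to_group_ptr` is a hypothetical name, and the actual conversion in this PR happens inside libxgboost, not in Python:

```python
from typing import List, Sequence


def qid_to_group_ptr(qid: Sequence[int]) -> List[int]:
    """Turn sorted query ids into CSR-style group offsets.

    Group k covers rows group_ptr[k]:group_ptr[k + 1].
    Hypothetical illustration; not the libxgboost implementation.
    """
    group_ptr = [0]
    for i in range(1, len(qid)):
        if qid[i] < qid[i - 1]:
            raise ValueError("qid must be sorted in non-decreasing order")
        if qid[i] != qid[i - 1]:
            group_ptr.append(i)  # a new query starts at row i
    if len(qid) > 0:
        group_ptr.append(len(qid))  # closing offset for the last group
    return group_ptr


# Six rows belonging to three queries of sizes 2, 3 and 1.
print(qid_to_group_ptr([1, 1, 2, 2, 2, 5]))
```

Differences between consecutive offsets give the group sizes, which is the information the non-dask ranker receives through its group argument.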