Add base_margin for evaluation dataset. #6591

Merged: 6 commits, Jan 25, 2021

Conversation

Member

@trivialfis trivialfis commented Jan 11, 2021

Close #6583.
Close #6300.

  • Add base_margin to the evaluation dataset in the scikit-learn interface.
  • Add all available meta info to the DMatrix constructor, and deprecate positional arguments.
  • Apply similar changes to the dask interface.
  • Unify the evaluation metric configuration between the dask and non-dask code paths. This adds validation and saves computation when the training data is also used for evaluation.
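
To illustrate the second bullet, here is a minimal sketch of how positional use of meta info arguments could be deprecated while still being accepted. It is not the code from this PR: the decorator name `deprecate_positional_args` and the `make_matrix` stand-in are hypothetical, and the sketch assumes the meta info parameters are declared keyword-only in the signature.

```python
import functools
import inspect
import warnings


def deprecate_positional_args(func):
    """Warn when arguments declared keyword-only are passed positionally.

    Hypothetical helper for illustration; not taken from the PR.
    """
    sig = inspect.signature(func)
    positional = [
        name for name, p in sig.parameters.items()
        if p.kind == inspect.Parameter.POSITIONAL_OR_KEYWORD
    ]
    keyword_only = [
        name for name, p in sig.parameters.items()
        if p.kind == inspect.Parameter.KEYWORD_ONLY
    ]

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        extra = len(args) - len(positional)
        # Only handle the surplus if it can be mapped onto keyword-only names;
        # otherwise fall through and let Python raise its usual TypeError.
        if 0 < extra <= len(keyword_only):
            names = keyword_only[:extra]
            warnings.warn(
                f"Pass {names} as keyword arguments; "
                "positional use is deprecated.",
                FutureWarning,
            )
            # Re-route the surplus positional values to their keyword names.
            kwargs.update(zip(names, args[len(positional):]))
            args = args[:len(positional)]
        return func(*args, **kwargs)

    return wrapper


@deprecate_positional_args
def make_matrix(data, label=None, *, weight=None, base_margin=None):
    """Hypothetical stand-in for a constructor that accepts meta info."""
    return {
        "data": data,
        "label": label,
        "weight": weight,
        "base_margin": base_margin,
    }
```

With this sketch, `make_matrix(X, y, w)` still works but emits a `FutureWarning`, while `make_matrix(X, y, weight=w)` is silent. Existing callers keep working during the deprecation period.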

I will split up this monolithic PR after getting all the pieces together. With the growing size of meta info, I need to do some refactoring.

Small parts are extracted to:

@trivialfis
Member Author

Note to self: compile a list of the entry points for meta info.

@trivialfis
Member Author

@hcho3 CI seems to be running into trouble. ;-(

@hcho3
Collaborator

hcho3 commented Jan 12, 2021

@trivialfis Can you fix this error?

TypeError: Optional[t] requires a single type. Got (<class 'xgboost.core.Booster'>, 'XGBModel')
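
For context, this error is what `typing.Optional` raises when it is subscripted with more than one type. The sketch below reproduces it and shows one possible way to express "a Booster, an XGBModel, or None". It is not necessarily the fix applied in this PR. The `Booster` class here is a hypothetical stand-in, and `"XGBModel"` is left as a string forward reference, as in the error message.

```python
from typing import Optional, Union


class Booster:
    """Hypothetical stand-in for xgboost.core.Booster."""


def optional_with_two_types_fails() -> bool:
    """Return True if Optional rejects two type parameters."""
    try:
        Optional[Booster, "XGBModel"]  # two parameters: rejected by typing
    except TypeError:
        return True
    return False


# One way to spell the intended annotation: wrap the alternatives in Union.
ModelType = Optional[Union[Booster, "XGBModel"]]
```

`Optional[X]` is shorthand for `Union[X, None]`, so it accepts exactly one parameter. Multiple alternatives have to be combined with `Union` first.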

@trivialfis
Member Author

Yup, I will be working on this today.

Collaborator

@hcho3 hcho3 left a comment


Thanks for consolidating common logic. It looks much nicer.

Resolved review threads:
  • python-package/xgboost/core.py (2, outdated)
  • python-package/xgboost/dask.py (2, outdated)
  • python-package/xgboost/sklearn.py
  • tests/python/test_with_dask.py (outdated)

@trivialfis
Member Author

Since I have split part of this PR out into #6601 and will continue to do so for easier review, I will add links to the related PRs.

@trivialfis
Member Author

After rebasing, I added a test that checks the DMatrix is not duplicated.
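
As a rough illustration of what avoiding a duplicated DMatrix means when the training data also appears in the evaluation set, consider the sketch below. It is not the implementation from this PR: `build_eval_matrices` and the `create` callback are hypothetical, and detecting reuse by object identity is an assumption made for the example.

```python
from typing import Any, Callable, List, Sequence, Tuple


def build_eval_matrices(
    train_data: Tuple[Any, Any],
    train_matrix: Any,
    eval_set: Sequence[Tuple[Any, Any]],
    create: Callable[[Any, Any], Any],
) -> List[Tuple[Any, str]]:
    """Build (matrix, name) pairs, reusing the training matrix when possible.

    Hypothetical helper: `create` stands in for whatever builds a matrix
    from features and labels.
    """
    train_X, train_y = train_data
    evals = []
    for i, (X, y) in enumerate(eval_set):
        if X is train_X and y is train_y:
            # Same objects as the training data: reuse instead of rebuilding.
            matrix = train_matrix
        else:
            matrix = create(X, y)
        evals.append((matrix, f"validation_{i}"))
    return evals
```

A test in this spirit would count how many times `create` is called and assert that the entry built from the training data is the very same object as the training matrix.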

@trivialfis
Member Author

@hcho3 Ready for another round of review.

@codecov-io

codecov-io commented Jan 25, 2021

Codecov Report

Merging #6591 (96c4dc7) into master (8942c98) will increase coverage by 0.03%.
The diff coverage is 88.75%.


@@            Coverage Diff             @@
##           master    #6591      +/-   ##
==========================================
+ Coverage   80.95%   80.98%   +0.03%     
==========================================
  Files          13       13              
  Lines        3696     3676      -20     
==========================================
- Hits         2992     2977      -15     
+ Misses        704      699       -5     
Impacted Files Coverage Δ
python-package/xgboost/sklearn.py 91.09% <86.66%> (+1.48%) ⬆️
python-package/xgboost/dask.py 82.05% <91.42%> (-0.05%) ⬇️
python-package/xgboost/tracker.py 93.98% <0.00%> (-1.13%) ⬇️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@trivialfis trivialfis merged commit 740d042 into dmlc:master Jan 25, 2021
@trivialfis trivialfis deleted the eval_set_base_margin branch January 25, 2021 18:11
@trivialfis trivialfis mentioned this pull request Jan 25, 2021
23 tasks