DMatrix Refactor for Approximate Static Histogram Based Trees #1673

Closed
tqchen opened this issue Oct 17, 2016 · 1 comment

Comments

@tqchen
Member

tqchen commented Oct 17, 2016

There have been interesting improvements in histogram-based trees, specifically from LightGBM and FastBDT.

XGBoost already supports dynamic histograms, with exact greedy as an option for deeper trees. We would like to think about what changes are needed to make DMatrix suitable for histogram aggregation, so that a faster approximation algorithm can be achieved:

  • Being able to get a subset of rows efficiently
  • Storing the quantized bin id of each feature value
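
To make the two requirements concrete, here is a minimal sketch in plain Python. It is not XGBoost code: `make_cuts`, `quantize`, and `build_histogram` are hypothetical names, the data is stored as a dense list of rows, and simple equal-frequency cut points stand in for whatever quantile sketch would actually be used.

```python
from bisect import bisect_right


def make_cuts(column, max_bins):
    """Pick at most max_bins - 1 increasing cut points from one feature column.

    Assumption: equal-frequency cuts as a stand-in for a real quantile sketch.
    """
    s = sorted(column)
    n = len(s)
    cuts = []
    for k in range(1, max_bins):
        c = s[(k * n) // max_bins]
        if not cuts or c > cuts[-1]:  # keep cut points strictly increasing
            cuts.append(c)
    return cuts


def quantize(rows, max_bins):
    """Return per-feature cut points and a row-major matrix of bin ids."""
    n_features = len(rows[0])
    cuts = [make_cuts([r[j] for r in rows], max_bins) for j in range(n_features)]
    # Bin id = number of cut points <= value, so ids lie in [0, len(cuts[j])].
    bins = [[bisect_right(cuts[j], r[j]) for j in range(n_features)] for r in rows]
    return cuts, bins


def build_histogram(bins, grad, row_ids, feature, n_bins):
    """Sum gradients per bin for one feature, over a subset of rows only."""
    hist = [0.0] * n_bins
    for i in row_ids:
        hist[bins[i][feature]] += grad[i]
    return hist
```

With the bin ids stored once up front, building the histogram for a tree node only needs the node's row ids and an integer lookup per row, which is why both a cheap row subset and a stored bin id matter.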

Both seem achievable in the in-memory format. For the external-memory format, getting a subset of rows seems to be a bit more complicated.
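
One way to see why the external-memory case is harder: if rows live in pages on disk, an arbitrary row subset can touch many pages. The sketch below is an illustration only. It assumes fixed-size pages and a hypothetical `load_page` callback, neither of which is specified in this issue, and it groups the requested row ids by page so that each touched page is loaded once.

```python
from collections import defaultdict


def group_rows_by_page(row_ids, page_size):
    """Map page index -> within-page offsets of the requested rows."""
    by_page = defaultdict(list)
    for i in row_ids:
        by_page[i // page_size].append(i % page_size)
    return dict(sorted(by_page.items()))


def gather_rows(load_page, row_ids, page_size):
    """Collect the requested rows, loading each touched page exactly once.

    load_page is a hypothetical callback: page index -> list of rows.
    Rows come back grouped by page, not in the order they were requested.
    """
    out = []
    for page_idx, offsets in group_rows_by_page(row_ids, page_size).items():
        page = load_page(page_idx)
        out.extend(page[o] for o in offsets)
    return out
```

Even with this grouping, a scattered subset may still force a pass over most pages, whereas the in-memory format can index rows directly.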

Let us see what data structure refactoring could be done in DMatrix to support the in-memory format easily, and thus enable implementation of the recent improvements in xgboost.

@tqchen
Member Author

tqchen commented Jan 9, 2017

#1950

@tqchen tqchen reopened this Jan 9, 2017
@tqchen tqchen closed this as completed Jan 9, 2017
@lock lock bot locked as resolved and limited conversation to collaborators Oct 26, 2018