Small refactor to categoricals #7858

Merged: 5 commits, May 5, 2022


RAMitchell
Member

The current implementation blocks any change to the order in which nodes are opened in the tree updater. When a one-hot encoded categorical split occurs, the split information is not stored on the device. For the update position function to work correctly, this information has to be copied to the device beforehand inside the ApplySplit member function. The update position call therefore depends on ApplySplit having been called for the same node directly before it.

This PR changes the one-hot encoding path to store categorical split information on the device in the same way as sort-based splits, so it is already available when the update position function runs.
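As a rough illustration of treating both split types uniformly, the sketch below encodes a one-hot split as a category bitset, the same shape a multi-category split would use, so a single membership test serves both. It is plain host C++ standing in for device memory, and every name in it (`WordsFor`, `OneHotToBits`, `Contains`) is hypothetical rather than taken from the XGBoost sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of 32-bit words needed for n_categories bits.
inline std::size_t WordsFor(std::size_t n_categories) {
  return (n_categories + 31) / 32;
}

// Encode a one-hot split (exactly one matching category) as a bitset.
// A multi-category split would set several bits in the same layout.
inline std::vector<std::uint32_t> OneHotToBits(std::uint32_t category,
                                               std::size_t n_categories) {
  assert(category < n_categories);
  std::vector<std::uint32_t> bits(WordsFor(n_categories), 0u);
  bits[category / 32] |= (1u << (category % 32));
  return bits;
}

// Membership test shared by both split types; out-of-range categories
// are treated as not matching.
inline bool Contains(const std::vector<std::uint32_t>& bits,
                     std::uint32_t category) {
  std::size_t word = category / 32;
  if (word >= bits.size()) return false;
  return ((bits[word] >> (category % 32)) & 1u) != 0u;
}
```

With one representation, the code that routes rows to child nodes only needs the stored bitset for a node and does not care which evaluation path produced it.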

I changed the vectors storing this categorical split information for all nodes to expand when necessary instead of being initialised according to the maximum tree size. This avoids out-of-memory errors when the user specifies very deep trees that end up being sparse.
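A minimal sketch of the grow-on-demand idea, assuming node indices are handed out densely as nodes are created and each node gets a fixed number of bitset words. The class `NodeCategoryStorage` and the helper `MaxNodes` are hypothetical and written as host C++ for clarity; they are not the actual device containers touched by this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Node count of a full binary tree of the given depth: 2^(depth+1) - 1.
// Sizing storage by this value up front is what becomes too large
// for very deep trees.
inline std::uint64_t MaxNodes(unsigned depth) {
  return (std::uint64_t{1} << (depth + 1)) - 1;
}

// Hypothetical per-node bitset storage that grows only when a node
// index beyond the current capacity is written.
class NodeCategoryStorage {
 public:
  explicit NodeCategoryStorage(std::size_t words_per_node)
      : words_per_node_(words_per_node) {}

  // Store the bitset for node nidx, expanding the backing vector if needed.
  // Words not covered by `bits` are zeroed.
  void Set(std::size_t nidx, const std::vector<std::uint32_t>& bits) {
    std::size_t needed = (nidx + 1) * words_per_node_;
    if (storage_.size() < needed) storage_.resize(needed, 0u);
    for (std::size_t i = 0; i < words_per_node_; ++i) {
      storage_[nidx * words_per_node_ + i] = i < bits.size() ? bits[i] : 0u;
    }
  }

  // Return a copy of the words for node nidx (all zeros if never written).
  std::vector<std::uint32_t> Get(std::size_t nidx) const {
    std::vector<std::uint32_t> out(words_per_node_, 0u);
    std::size_t begin = nidx * words_per_node_;
    for (std::size_t i = 0; i < words_per_node_ && begin + i < storage_.size(); ++i) {
      out[i] = storage_[begin + i];
    }
    return out;
  }

  std::size_t SizeInWords() const { return storage_.size(); }

 private:
  std::size_t words_per_node_;
  std::vector<std::uint32_t> storage_;
};
```

Under these assumptions, memory scales with the number of nodes actually created rather than with `MaxNodes(max_depth)`, which is the difference that matters for deep but sparse trees.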

Review threads: src/tree/gpu_hist/evaluator.cu (outdated, resolved), src/tree/updater_gpu_hist.cu (resolved)
@RAMitchell RAMitchell merged commit 7ef54e3 into dmlc:master May 5, 2022

2 participants