[ML] Reduce peak memory usage and improve estimation for boosted tree training #781

Merged: tveasey merged 3 commits into elastic:master on Oct 25, 2019

@tveasey tveasey commented Oct 25, 2019

This makes two changes:

  1. It calculates the best split in the node statistics constructor. Previously the best split was calculated as soon as the nodes were pushed to the priority queue (so effectively immediately). The upshot is we don't need to copy the feature bag.
  2. It fixes an overcounting bug in memory estimation. There are at most as many leaf statistics objects as there are leaves, not as many as there are nodes, which is the count we were using.
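
To give a rough feel for the size of the overcount in the second change, here is a minimal sketch. It is not the ml-cpp code: the function names and the `LEAF_STATISTICS_BYTES` constant are hypothetical, and it assumes every internal node has exactly two children, so a tree with L leaves has 2L - 1 nodes.

```cpp
#include <cstddef>

// Hypothetical footprint of one leaf statistics object, in bytes.
// The value is illustrative only and is not taken from ml-cpp.
constexpr std::size_t LEAF_STATISTICS_BYTES = 1024;

// Assuming every internal node has exactly two children, a tree with
// numberLeaves leaves has 2 * numberLeaves - 1 nodes in total.
constexpr std::size_t numberNodes(std::size_t numberLeaves) {
    return numberLeaves == 0 ? 0 : 2 * numberLeaves - 1;
}

// Overcounting estimate: budgets one statistics object per node.
constexpr std::size_t estimateByNodes(std::size_t numberLeaves) {
    return numberNodes(numberLeaves) * LEAF_STATISTICS_BYTES;
}

// Tighter bound: at most one statistics object per leaf.
constexpr std::size_t estimateByLeaves(std::size_t numberLeaves) {
    return numberLeaves * LEAF_STATISTICS_BYTES;
}

// Compile-time checks for a tree with 8 leaves (15 nodes).
static_assert(numberNodes(8) == 15, "8 leaves imply 15 nodes");
static_assert(estimateByNodes(8) == 15 * LEAF_STATISTICS_BYTES, "per-node estimate");
static_assert(estimateByLeaves(8) == 8 * LEAF_STATISTICS_BYTES, "per-leaf estimate");
```

Under these assumptions the per-node estimate approaches twice the per-leaf bound as the tree grows, which is the kind of gap the fix removes from this term of the estimate.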


@droberts195 droberts195 left a comment

LGTM

@tveasey tveasey merged commit a570218 into elastic:master Oct 25, 2019
@tveasey tveasey deleted the improve-memory-estimation branch October 25, 2019 10:35
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request Oct 25, 2019
tveasey added a commit that referenced this pull request Oct 25, 2019