
[ML] Performance optimisation for classification and regression model training #2024


Merged: 9 commits into elastic:main on Sep 14, 2021

Conversation

@tveasey (Contributor) commented on Sep 13, 2021

Caching example splits proves to be a good runtime optimisation. Testing with our experiment driver on around 250 small data sets, the mean runtime dropped from 380s to 320s, and the runtime improvements were larger on the larger data sets. I'll attach some benchmarks for large data sets as well. I also removed CImmutableRadixSet, which is no longer used since looking up splits is no longer on the hot path.

Note that this does have the potential to change results slightly. The source of the difference is that we now store the candidate splits in float precision rather than double precision.
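For context, here is a minimal sketch of the idea, not the actual ml-cpp code: the class name, member names, and use of 8-bit bucket indices are illustrative assumptions. Candidate splits are computed once per feature and stored in float precision, and each row's bucket index is cached so that tree growing reads a precomputed index instead of searching the split values.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sketch: candidate splits per feature stored once in float
// precision, with each row's bucket index cached so tree training never has
// to search the split values again.
class CCachedSplits {
public:
    // Compute, for every row, which bucket its value falls into given the
    // (sorted) candidate split values for each feature. Done once up front.
    void cache(const std::vector<std::vector<double>>& columns,
               const std::vector<std::vector<double>>& candidateSplits) {
        m_Splits.clear();
        m_Buckets.clear();
        for (std::size_t feature = 0; feature < columns.size(); ++feature) {
            // Store splits in float: this halves the memory and, per the PR,
            // may change results slightly relative to double precision.
            std::vector<float> splits(candidateSplits[feature].begin(),
                                      candidateSplits[feature].end());
            // Assumes at most 256 candidate splits per feature.
            std::vector<std::uint8_t> buckets(columns[feature].size());
            for (std::size_t row = 0; row < columns[feature].size(); ++row) {
                auto value = static_cast<float>(columns[feature][row]);
                buckets[row] = static_cast<std::uint8_t>(
                    std::upper_bound(splits.begin(), splits.end(), value) -
                    splits.begin());
            }
            m_Splits.push_back(std::move(splits));
            m_Buckets.push_back(std::move(buckets));
        }
    }

    // During tree training, bucket lookup is a cached array read rather than
    // a binary search over the split values.
    std::uint8_t bucket(std::size_t feature, std::size_t row) const {
        return m_Buckets[feature][row];
    }

private:
    std::vector<std::vector<float>> m_Splits;
    std::vector<std::vector<std::uint8_t>> m_Buckets;
};
```

With the bucket indices cached, split lookup drops off the hot path, which is also why a structure like CImmutableRadixSet for fast split search is no longer needed.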

@valeriy42 (Contributor) left a comment


Good work and great results! I have just a few comments regarding readability. Other than those, LGTM! 👍

@tveasey (Contributor, Author) commented on Sep 14, 2021

The CI failure was a timing-related one in CStringStoreTest/testStringStore, so I'm going to go ahead and merge this.

@tveasey merged commit a5b812b into elastic:main on Sep 14, 2021
@tveasey deleted the cache-splits-main branch on September 14, 2021 at 14:35
tveasey added a commit that referenced this pull request Sep 15, 2021
… training (#2027)


This ports #2024 to the incremental training feature branch. It collides with a fair amount of code change there, so I've essentially reapplied the change to this branch.
@tveasey (Contributor, Author) commented on Sep 16, 2021

Training on 80% of Higgs 1M: runtime before 9057s, after 7641s.

tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request Oct 11, 2021
… training (elastic#2024)

@tveasey added the v8.0.0 label on Oct 25, 2021