-
-
Notifications
You must be signed in to change notification settings - Fork 25.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EXA speed up ensemble/tests/test_gradient_boosting #21903
EXA speed up ensemble/tests/test_gradient_boosting #21903
Conversation
…g.py::test_gradient_boosting_early_stopping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, however the proposed changes break the intended meaning of the test that is to check that the actual number of trees learned without the early stopping option is significantly larger than the actual number of trees learned with the early stopping option enabled for the same learning rate value.
Please consider the proposed change below that will however increase back the test duration a bit.
It might be possible to further speed-up this test by decreasing the training set size, for instance with:
X, y = make_classification(n_samples=300, random_state=0)
however this might require adjusting the expected number of trees when early stopping is enabled to preserve the intent of the test.
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@ogrisel Thank you so much for the detailed suggestions you have left. They have definitely clarified my understanding of this test. I have applied the changes you suggested. I attempted to decrease the training set size and adjust the number of trees when early stopping but I did not have any luck. With your suggestions the run time for this test on my machine is now 0.86 seconds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I attempted to decrease the training set size and adjust the number of trees when early stopping but I did not have any luck.
Thanks for trying! A negative experiment result can be a positive contribution in itself :)
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@ogrisel I couldn't agree more! |
…g.py::test_gradient_boosting_early_stopping.
Reference Issues/PRs
Meta-issue: accelerate the slowest running tests #21407
What does this implement/fix? Explain your changes.
These changes halve the magnitude of n_estimators used in various functions called within the test.
As a result, the run time of the test has sped up to under 1 second on my machine.
Before changes it took 2.05 seconds and decreased to 0.6 seconds.
Any other comments?