Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EXA speed up ensemble/tests/test_gradient_boosting #21903

Conversation

glennfrutiz
Copy link
Contributor

…g.py::test_gradient_boosting_early_stopping.

Reference Issues/PRs

Meta-issue: accelerate the slowest running tests #21407

What does this implement/fix? Explain your changes.

These changes halve the magnitude of n_estimators used in various functions called within the test.
As a result, the run time of the test has sped up to under 1 second on my machine.
Before changes it took 2.05 seconds and decreased to 0.6 seconds.

Any other comments?

…g.py::test_gradient_boosting_early_stopping.
@glennfrutiz glennfrutiz changed the title Attempt to speed up the call for ensemble/tests/test_gradient_boostin… [MRG]Attempt to speed up the call for ensemble/tests/test_gradient_boostin… Dec 6, 2021
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, however the proposed changes break the intended meaning of the test that is to check that the actual number of trees learned without the early stopping option is significantly larger than the actual number of trees learned with the early stopping option enabled for the same learning rate value.

Please consider the proposed change below that will however increase back the test duration a bit.

It might be possible to further speed-up this test by decreasing the training set size, for instance with:

X, y = make_classification(n_samples=300, random_state=0)

however this might require adjusting the expected number of trees when early stopping is enabled to preserve the intent of the test.

sklearn/ensemble/tests/test_gradient_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/tests/test_gradient_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/tests/test_gradient_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/tests/test_gradient_boosting.py Outdated Show resolved Hide resolved
sklearn/ensemble/tests/test_gradient_boosting.py Outdated Show resolved Hide resolved
glennfrutiz and others added 5 commits December 6, 2021 16:22
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@glennfrutiz
Copy link
Contributor Author

glennfrutiz commented Dec 7, 2021

@ogrisel Thank you so much for the detailed suggestions you have left. They have definitely clarified my understanding of this test. I have applied the changes you suggested. I attempted to decrease the training set size and adjust the number of trees when early stopping but I did not have any luck. With your suggestions the run time for this test on my machine is now 0.86 seconds.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I attempted to decrease the training set size and adjust the number of trees when early stopping but I did not have any luck.

Thanks for trying! A negative experiment result can be a positive contribution in itself :)

LGTM!

Copy link
Member

@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lorentzenchr
Copy link
Member

A negative experiment result can be a positive contribution in itself :)

@ogrisel I couldn't agree more!

@lorentzenchr lorentzenchr changed the title [MRG]Attempt to speed up the call for ensemble/tests/test_gradient_boostin… EXA speed up the call for ensemble/tests/test_gradient_boosting_early_stopping Dec 7, 2021
@lorentzenchr lorentzenchr changed the title EXA speed up the call for ensemble/tests/test_gradient_boosting_early_stopping EXA speed up ensemble/tests/test_gradient_boosting Dec 7, 2021
@lorentzenchr lorentzenchr merged commit b05a568 into scikit-learn:main Dec 7, 2021
@glennfrutiz glennfrutiz deleted the speed_up_test_gradient_boosting_early_stopping branch December 7, 2021 17:27
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Dec 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants