MAINT reduce scope of test_linear_models_cv_fit_for_all_backends to reduce CI usage #21918

ogrisel · 2021-12-08T11:10:12Z

Alternative to #21907.
Towards: #21407.

Do not test the threading backend, which was not involved in the original problem.
Do not test the multitask variants, which are much more computationally intensive. LassoCV and ElasticNetCV should be enough to serve as a non-regression test for the original problem.
Use the smallest dataset, with the fewest features, that still triggers memmapping in joblib.
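
As a rough illustration of the last point, the sketch below computes the smallest number of rows a single-feature float64 array needs in order to exceed a given byte threshold. The helper name `min_rows_to_exceed` and the 1e6-byte threshold are assumptions made for illustration; how joblib actually interprets its `max_nbytes` setting should be checked against the joblib documentation.

```python
def min_rows_to_exceed(threshold_bytes, n_features=1, itemsize=8):
    """Smallest row count whose array size is strictly above threshold_bytes.

    Hypothetical helper, not part of scikit-learn or joblib.
    itemsize=8 assumes float64 data.
    """
    row_bytes = n_features * itemsize
    return threshold_bytes // row_bytes + 1


# Assumed threshold of 1e6 bytes, for illustration only.
threshold = 1_000_000
n_rows = min_rows_to_exceed(threshold)
nbytes = n_rows * 1 * 8  # rows * features * bytes per float64
print(n_rows, nbytes)
```

With a single feature, every extra row adds only 8 bytes, so the array can sit just above the threshold instead of overshooting it by a whole row of many features.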


thomasjpfan

I left a comment about a possible future API for joblib.

Overall this PR LGTM.

thomasjpfan · 2021-12-09T16:57:44Z

sklearn/linear_model/tests/test_coordinate_descent.py

+    # Unfortunately the scikit-learn and joblib APIs do not make it possible to
+    # change the max_nbyte of the inner Parallel call.


Do you think there should be an API for changing max_nbytes in the inner Parallel call? Something like:

with parallel_backend("loky", max_nbytes=1000):
    results = Parallel(n_jobs=4)(delayed(func)(x, y) for x, y in data)

Yes that would be nice but unfortunately this conflicts with the current joblib backend API design and dealing with backward compat is...

If it is hard for joblib API-wise, what do you think about adding a parallel_kwargs parameter to estimators that create a Parallel object?

(I know this is a little counter to how we have been removing pre_dispatch from the estimator's __init__.)

We should probably find a way to change parallel_backend so that it detects Parallel kwargs and treats them specifically instead of passing them as constructor arguments for the backend.
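
A minimal sketch of that idea, under stated assumptions: keyword arguments that belong to Parallel are separated from those meant for the backend constructor. The function `split_backend_params`, the `PARALLEL_KWARGS` set, and the key `some_backend_option` are hypothetical names used for illustration; none of this is existing joblib code.

```python
# Illustrative subset of keyword arguments accepted by joblib.Parallel.
# The set is an assumption for this sketch and is not exhaustive.
PARALLEL_KWARGS = frozenset(
    {"max_nbytes", "mmap_mode", "temp_folder", "pre_dispatch", "batch_size"}
)


def split_backend_params(params):
    """Split params into (Parallel kwargs, backend constructor kwargs)."""
    parallel_kwargs = {k: v for k, v in params.items() if k in PARALLEL_KWARGS}
    backend_kwargs = {k: v for k, v in params.items() if k not in PARALLEL_KWARGS}
    return parallel_kwargs, backend_kwargs


# "some_backend_option" is a made-up backend argument.
parallel_kwargs, backend_kwargs = split_backend_params(
    {"max_nbytes": 1000, "some_backend_option": 2}
)
print(parallel_kwargs, backend_kwargs)
```

The backward-compatibility concern raised above still applies: a backend whose constructor already accepts an argument with one of these names would see its behavior change under such a split.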

ogrisel · 2021-12-09T17:07:36Z

@jeremiedbb you might want to give this PR a second review.

jeremiedbb

This test is a non-regression test for a fix related to the loky backend. By default these estimators use the threading backend, which has always worked. We need to keep testing with the "loky" backend.

sklearn/linear_model/tests/test_coordinate_descent.py

Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>

ogrisel · 2021-12-09T17:46:51Z

This test is a non-regression test for a fix related to the loky backend. By default these estimators use the threading backend, which has always worked. We need to keep testing with the "loky" backend.

Indeed, good catch, I had not realized. Let's see if the CI stays green.

sklearn/linear_model/tests/test_coordinate_descent.py

jeremiedbb

LGTM. Do you know how much time it saves?

ogrisel · 2021-12-10T01:43:04Z

When running sklearn/linear_model/tests/test_coordinate_descent.py with pytest-xdist in an oversubscription context (-n 8 on an 8-core machine):

on main:

================================================================== slowest 10 durations ===================================================================
7.37s call     sklearn/linear_model/tests/test_coordinate_descent.py::test_linear_models_cv_fit_for_all_backends[MultiTaskElasticNetCV-threading]
6.97s call     sklearn/linear_model/tests/test_coordinate_descent.py::test_linear_models_cv_fit_for_all_backends[MultiTaskLassoCV-threading]
5.18s call     sklearn/linear_model/tests/test_coordinate_descent.py::test_linear_models_cv_fit_for_all_backends[MultiTaskElasticNetCV-loky]
4.70s call     sklearn/linear_model/tests/test_coordinate_descent.py::test_linear_models_cv_fit_for_all_backends[MultiTaskLassoCV-loky]
1.10s call     sklearn/linear_model/tests/test_coordinate_descent.py::test_linear_models_cv_fit_for_all_backends[ElasticNetCV-loky]
... other unrelated tests

on this branch:

================================================================== slowest 10 durations ===================================================================
0.89s call     sklearn/linear_model/tests/test_coordinate_descent.py::test_linear_models_cv_fit_with_loky[ElasticNetCV]
0.88s call     sklearn/linear_model/tests/test_coordinate_descent.py::test_linear_models_cv_fit_with_loky[LassoCV]
... other unrelated tests
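
For a rough figure, the durations listed above can be summed as in the sketch below. It only covers the parametrizations that appear in the two listings (the remaining ones on main are not shown), and per-test durations measured with -n 8 are not the same as wall-clock time saved, so the result is an indication and not a measurement.

```python
# Durations (seconds) copied from the "slowest 10 durations" listings above.
main_durations = [7.37, 6.97, 5.18, 4.70, 1.10]
branch_durations = [0.89, 0.88]

main_total = sum(main_durations)
branch_total = sum(branch_durations)
saved = main_total - branch_total

print(f"main: {main_total:.2f}s, branch: {branch_total:.2f}s, saved: {saved:.2f}s")
```

Over the listed entries this comes to roughly 25.3 s on main against about 1.8 s on the branch.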

…educe CI usage (scikit-learn#21918) Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>


MAINT reduce scope of test_linear_models_cv_fit_for_all_backends to r…

c40d173


github-actions bot added module:linear_model Build / CI labels Dec 8, 2021

ogrisel added the No Changelog Needed label Dec 8, 2021

ogrisel mentioned this pull request Dec 8, 2021

Meta-issue: accelerate the slowest running tests #21407

Closed

24 tasks

thomasjpfan approved these changes Dec 9, 2021


jeremiedbb requested changes Dec 9, 2021


thomasjpfan reviewed Dec 9, 2021

