TST Speed up slow test linear_model/tests/test_quantile.py::test_asymmetric_error #21546
Conversation
I have the feeling that this test cannot easily be sped up significantly without altering it too much. I think it's an important test, so we can leave it as is. The fact that it is quite sensitive to the seed of the dataset random generation procedure is bad, but I don't see any easy way to improve this.
Unless @lorentzenchr has a better idea.
I consider
I agree.
Not sure. @simonandras would you be interested in timing each part of the test to know their relative durations?
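One way to get those relative durations is a quick `time.perf_counter` sketch. The two lambdas below are hypothetical stand-ins for the parts of the test (data generation, fitting); the helper name `time_section` is made up for illustration:

```python
import time

def time_section(label, func, *args, **kwargs):
    """Run func once and report its wall-clock duration."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f}s")
    return result, elapsed

# Example: timing two hypothetical parts of a test.
_, t_data = time_section("data generation", lambda: [i ** 2 for i in range(100_000)])
_, t_fit = time_section("fit", lambda: sorted(range(100_000), reverse=True))
```

Wrapping each block of the test this way shows at a glance whether the data generation or the solver dominates the runtime.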
You mean skip if scipy < 1.15, right? I think that's an interesting idea.
Using the
So the
It is much faster now, but unfortunately it fails one assertion on line 169. (All 3 quantile cases behave similarly in runtime and fail the same assertion. The timing above was made on the 0.2 quantile case.)
I think we can relax the rtol.
Maybe we can first improve the fit precision via solver_options (have a look at the scipy docs for highs) and then relax rtol by the missing amount. 1e-3 is not really tight.
A cheap solution, but if we set the random seed to 2, then it works without modification. What do you think?
In case of
I correct myself here: actually there is no other seed that works with the current error tolerance in all 3 quantile cases. I think we have to change the tolerance of that one failing assertion if we want to use the faster solver. We already have an assertion like that in the
I think as a general rule, we should use tolerance levels that work with many arbitrary seeds. If the test relies on seed cherry-picking, it will be too brittle and can randomly fail in the future on different platforms or on upgrades of dependencies such as numpy/scipy/openblas.
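A sketch of what seed-robust tolerances mean in practice: rather than tuning the assertion to one cherry-picked seed, check that it holds for a handful of arbitrary seeds. This pure-numpy illustration checks empirical quantile coverage (it is not the actual test code; the quantile, sample size, and tolerance values are made up):

```python
import numpy as np

quantile = 0.2
n_samples, tol = 1000, 0.05

for seed in range(5):  # several arbitrary seeds, no cherry-picking
    rng = np.random.RandomState(seed)
    y = rng.normal(size=n_samples)
    # Stand-in for a model prediction of the conditional quantile.
    threshold = np.quantile(y, quantile)
    coverage = np.mean(y < threshold)
    # A seed-robust assertion: loose enough to pass for any seed.
    assert abs(coverage - quantile) < tol, (seed, coverage)
```

If a tolerance only passes for one specific seed, the assertion is testing the seed, not the estimator.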
I think the approach to use the highs solver is the right way to tackle the performance issue of this test. Here are a few more pieces of feedback to make this test more stable and less likely to fail arbitrarily in the future.
So we should use
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
Let me push the suggestions of the code review to see if everything is good now.
I pushed a commit with the switch to
LGTM! Thanks for the contrib @simonandras!
LGTM with one nitpick.
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
Reference Issues/PRs
Towards #21407
What does this implement/fix? Explain your changes.
Currently the test with the 3 different settings runs in 6-7 seconds on my computer. The goal is to achieve less than 2-3 sec total runtime (less than 1 sec per quantile).
About the test so far:
The test_asymmetric_error function tests the expected behavior of the QuantileRegressor estimator on generated data where the quantile is linear and known. The data is asymmetric, which means that it consists of 2 parts which are different (see the plot).
Ideas so far:
The basic idea is to reduce the sample size of the test data (currently it is 1000). However, if I do that, the assertions fail with the current error tolerance. I noticed that if I increase the data size to 1100 the tests fail as well. At first this looks weird, and the error bounds seem arbitrary to me so far.
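For context, data with a known linear quantile can be generated along these lines (a sketch of the idea, not the actual test fixture; the coefficients and sample size are made up): shift an asymmetric noise distribution so that its q-quantile is exactly zero, which makes the true conditional q-quantile of y linear in X.

```python
import numpy as np

rng = np.random.RandomState(42)
n_samples, quantile = 1000, 0.8

X = rng.uniform(size=(n_samples, 1))
# Exponential noise is asymmetric; subtracting its q-quantile,
# -log(1 - q) for rate 1, makes the q-quantile of the noise zero.
noise = rng.exponential(size=n_samples) - (-np.log(1 - quantile))
y = 1.0 + 2.0 * X[:, 0] + noise

# The true conditional q-quantile of y is then exactly 1 + 2 * x, so
# roughly a fraction `quantile` of observations lies below that line.
coverage = np.mean(y <= 1.0 + 2.0 * X[:, 0])
print(coverage)
```

With such a construction the expected coverage is known exactly, which is what lets the test assert on it; the smaller the sample, the larger the binomial noise around that expectation, which is likely why shrinking the sample size breaks the current tolerances.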
Any other comments?
The work is in progress.