
BUG: fix convergence tolerance in _lsq.check_termination #13806

Closed

wants to merge 3 commits
Conversation

FFroehlich

Reference issue

What does this implement/fix?

The implemented convergence check is quite unusual and I believe this was not intended. This changes the tolerance check so that xtol acts as a canonical combined absolute + relative tolerance.
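For context, a minimal sketch of the two step-size checks being discussed. The current form mirrors _lsq.check_termination; the proposed form is one reading of the description above, not necessarily the exact diff in this PR:

```python
# Sketch of the xtol termination check discussed here (illustrative only).
# dx_norm is the norm of the last step, x_norm the norm of the current iterate.

def xtol_satisfied_current(dx_norm, x_norm, xtol):
    # Hybrid form currently used in _lsq.check_termination: the xtol**2 term
    # acts as a tiny floor when x_norm is near zero.
    return dx_norm < xtol * (xtol + x_norm)

def xtol_satisfied_proposed(dx_norm, x_norm, xtol):
    # Canonical combined absolute + relative tolerance, with xtol playing both
    # roles (one possible reading of the proposed change, not the exact diff).
    return dx_norm < xtol + xtol * x_norm
```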

Additional information

@rgommers added the maintenance (Items related to regular maintenance tasks) and scipy.optimize labels on Apr 5, 2021
@rgommers changed the title from "BUG: fix x convergence tolerance" to "BUG: fix convergence tolerance in _lsq.check_termination" on Apr 5, 2021
@rgommers
Member

rgommers commented Apr 5, 2021

Thanks @FFroehlich. Did you find this problem when using one of the optimization functions? And can you construct a regression test (i.e. a unit test that fails before this fix, and works after it)?

Also note that tests are failing right now.

@FFroehlich
Author

> Thanks @FFroehlich. Did you find this problem when using one of the optimization functions? And can you construct a regression test (i.e. a unit test that fails before this fix, and works after it)?
>
> Also note that tests are failing right now.

I found this in the context of an optimizer comparison where I looked at convergence criteria of multiple optimizers. So it's not a practical issue I encountered, but more of a conceptual one.

As this effectively results in less stringent (but theoretically more sound) convergence criteria, I find it difficult to provide a good regression test for this, especially since the optimizers do not store the optimization history. The best I could come up with was checking the iteration number. With the previous implementation, optimization for scipy.optimize.tests.test_least_squares required up to 42 iterations, which is now down to 38.
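For illustration, a hedged sketch of what such an iteration-count regression test might look like; the problem, the tolerance, and the evaluation bound are made up for this example (least_squares reports nfev rather than an iteration count), so it is not the actual test referenced above:

```python
# Hypothetical regression-test sketch for the "count the iterations" idea above.
# The problem, xtol value, and bound are illustrative, not from the scipy test suite.
import numpy as np
from scipy.optimize import least_squares

def rosenbrock_residuals(x):
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

def test_xtol_termination_effort():
    # Disable ftol/gtol so that termination is driven by the xtol check alone.
    res = least_squares(rosenbrock_residuals, x0=np.array([2.0, 2.0]),
                        xtol=1e-14, ftol=None, gtol=None)
    assert res.status == 3      # 3 means the xtol condition triggered termination
    assert res.nfev <= 42       # illustrative bound, cf. the 42 -> 38 counts above
```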

@rgommers
Member

rgommers commented Apr 5, 2021

It looks to me like xtol * (xtol + x_norm) was on purpose: the relative tolerance is xtol * x_norm and xtol**2 is a much smaller term to ensure the result is > 0. Having to loosen test tolerances shows that this isn't necessarily a good change.
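To make the relative size of the two terms concrete, a quick illustration with arbitrarily chosen values:

```python
# Relative size of the two terms in xtol * (xtol + x_norm), illustrative values.
xtol = 1e-8

print(xtol * (xtol + 1.0))   # ~1e-08: with |x| ~ 1 the xtol * x_norm term dominates
print(xtol * (xtol + 0.0))   # 1e-16: at x = 0 only the xtol**2 floor keeps it > 0
```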

@nmayorov you wrote this code, could you have a look?

@FFroehlich
Author

> It looks to me like xtol * (xtol + x_norm) was on purpose: the relative tolerance is xtol * x_norm and xtol**2 is a much smaller term to ensure the result is > 0.

But is it theoretically justified? To me this is a rather unconventional tolerance check.

> Having to loosen test tolerances shows that this isn't necessarily a good change.

I am not sure whether I would agree with this line of argument. I could simply multiply the tolerances by 1e-2 or some other constant to ensure that the tests pass, but that wouldn't improve the quality of the change. You could also argue that the optimization now terminates in 4 fewer steps (almost 10%) and just barely misses the already pretty stringent tolerance of 1e-14.

@nmayorov
Contributor

nmayorov commented Apr 7, 2021

> But is it theoretically justified? To me this is a rather unconventional tolerance check.

It's tricky. I believe I definitely saw it implemented like this somewhere, but I agree it looks unconventional.

Here are my thoughts on it. It is common wisdom that for large enough floating point numbers the correct measure of accuracy is relative tolerance, because computational algorithms will have errors proportional to the target value they estimate (not the best explanation, but I think it's more or less true and correct).

The situation changes as the target value goes down to zero. I wish there were some universal guidance and explanation on what to do in this situation, but as far as I know there is none. The most correct advice seems to be to specify a "problem specific" absolute tolerance in this case. But having xtol_rel and xtol_abs would be too much for the already complicated list of arguments. So we have the single xtol, and I came up with this "hybrid tolerance" formula; honestly, I don't remember why and how exactly.

At first glance this formula has the following properties (assuming the final x is very close to the true value):

  1. When |x| >> xtol, xtol works as a standard relative tolerance.
  2. When x is actually zero, xtol**2 will drive the solution closer to zero than xtol in the conventional formula would, i.e. closer to the true value (at the expense of more iterations).
  3. If |x| is small enough (roughly |x| << xtol), then xtol * (xtol + |x|) will also find a more accurate solution (with more iterations); see the sketch after this list.
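A small numerical sketch of these three regimes (the xtol and |x| values below are purely illustrative; "conventional" here means the combined absolute + relative form proposed in this PR):

```python
# Compare the hybrid threshold with a conventional absolute + relative one
# in the three regimes listed above (illustrative values only).
xtol = 1e-8
for x_abs in (1.0, 1e-10, 0.0):            # |x| >> xtol, |x| << xtol, x == 0
    hybrid = xtol * (xtol + x_abs)          # current _lsq.check_termination form
    conventional = xtol + xtol * x_abs      # canonical absolute + relative form
    print(f"|x| = {x_abs:.0e}: hybrid = {hybrid:.2e}, conventional = {conventional:.2e}")
```

Near zero the hybrid threshold shrinks to about xtol**2, several orders of magnitude below the xtol floor of the conventional form, which is where the extra accuracy (and the extra iterations) come from.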

My conclusion is that the implemented formula will basically try to find a more accurate solution if x is near zero. I don't have strong arguments for why that is good; maybe it looked like it somehow "balances" the quality of the solution better for small and large |x| (note that initially the true value is unknown). I don't think it's bad either, because there is no universal measure of accuracy when the solution is near zero. Actually, I think I saw it somewhere and thought it was a good idea.

All in all, maybe it would have been better to go with the more standard formula, but at this point I think it's probably better to keep the algorithm intact, because it has been around for some time. Also, you are not complaining about its performance, only about this conceptual point (important, but in practice least_squares seems to work well).

I have the old least_squares benchmarks here -- https://github.com/nmayorov/lsq-benchmarks. They should most likely work out of the box. If you have time, you can experiment with the current and the proposed formulas; maybe we'll see significant differences and develop intuition. If you are not very enthusiastic about it, I can do it myself, maybe on the weekend.

This pull request was closed.