
BUG: fix convergence tolerance in _lsq.check_termination #13806

Closed

wants to merge 3 commits
Conversation

FFroehlich

Reference issue

What does this implement/fix?

The implemented convergence check is quite unusual and I believe this was not intended. This changes the tolerance check so that xtol acts as a canonical combined absolute + relative tolerance.
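For context, a minimal sketch of the two step-size checks being discussed. The current form mirrors _lsq.check_termination; the proposed form is one reading of the description above, not necessarily the exact diff in this PR:

```python
# Sketch of the xtol termination check discussed here (illustrative only).
# dx_norm is the norm of the last step, x_norm the norm of the current iterate.

def xtol_satisfied_current(dx_norm, x_norm, xtol):
    # Hybrid form currently used in _lsq.check_termination: the xtol**2 term
    # acts as a tiny floor when x_norm is near zero.
    return dx_norm < xtol * (xtol + x_norm)

def xtol_satisfied_proposed(dx_norm, x_norm, xtol):
    # Canonical combined absolute + relative tolerance, with xtol playing both
    # roles (one possible reading of the proposed change, not the exact diff).
    return dx_norm < xtol + xtol * x_norm
```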

Additional information

@rgommers added the maintenance (Items related to regular maintenance tasks) and scipy.optimize labels on Apr 5, 2021
@rgommers changed the title from "BUG: fix x convergence tolerance" to "BUG: fix convergence tolerance in _lsq.check_termination" on Apr 5, 2021
@rgommers
Member

rgommers commented Apr 5, 2021

Thanks @FFroehlich. Did you find this problem when using one of the optimization functions? And can you construct a regression test (i.e. a unit test that fails before this fix, and works after it)?

Also note that tests are failing right now.

@FFroehlich
Author

> Thanks @FFroehlich. Did you find this problem when using one of the optimization functions? And can you construct a regression test (i.e. a unit test that fails before this fix, and works after it)?
>
> Also note that tests are failing right now.

I found this in the context of an optimizer comparison where I looked at convergence criteria of multiple optimizers. So it's not a practical issue I encountered, but more of a conceptual one.

As this effectively results in less stringent (but theoretically more sound) convergence criteria, I find it difficult to provide a good regression test for this, especially since the optimizers do not store the optimization history. The best I could come up with was checking the iteration number. With the previous implementation, optimization for scipy.optimize.tests.test_least_squares required up to 42 iterations, which is now down to 38.
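For illustration, a hedged sketch of what such an iteration-count regression test might look like; the problem, the tolerance, and the evaluation bound are made up for this example (least_squares reports nfev rather than an iteration count), so it is not the actual test referenced above:

```python
# Hypothetical regression-test sketch for the "count the iterations" idea above.
# The problem, xtol value, and bound are illustrative, not from the scipy test suite.
import numpy as np
from scipy.optimize import least_squares

def rosenbrock_residuals(x):
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

def test_xtol_termination_effort():
    # Disable ftol/gtol so that termination is driven by the xtol check alone.
    res = least_squares(rosenbrock_residuals, x0=np.array([2.0, 2.0]),
                        xtol=1e-14, ftol=None, gtol=None)
    assert res.status == 3      # 3 means the xtol condition triggered termination
    assert res.nfev <= 42       # illustrative bound, cf. the 42 -> 38 counts above
```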

@rgommers
Member

rgommers commented Apr 5, 2021

It looks to me like xtol * (xtol + x_norm) was on purpose: the relative tolerance is xtol * x_norm and xtol**2 is a much smaller term to ensure the result is > 0. Having to loosen test tolerances shows that this isn't necessarily a good change.
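To make the relative size of the two terms concrete, a quick illustration with arbitrarily chosen values:

```python
# Relative size of the two terms in xtol * (xtol + x_norm), illustrative values.
xtol = 1e-8

print(xtol * (xtol + 1.0))   # ~1e-08: with |x| ~ 1 the xtol * x_norm term dominates
print(xtol * (xtol + 0.0))   # 1e-16: at x = 0 only the xtol**2 floor keeps it > 0
```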

@nmayorov you wrote this code, could you have a look?

@FFroehlich
Author

> It looks to me like xtol * (xtol + x_norm) was on purpose: the relative tolerance is xtol * x_norm and xtol**2 is a much smaller term to ensure the result is > 0.

But is it theoretically justified? To me this is a rather unconventional tolerance check.

> Having to loosen test tolerances shows that this isn't necessarily a good change.

I am not sure whether I would agree with this line of argument. I could simply multiply the tolerances by 1e-2 or some other constant to ensure that the tests pass, but that wouldn't improve the quality of the change. You could also argue that the optimization now terminates in 4 fewer steps (almost 10%) and just barely misses the already pretty stringent tolerance of 1e-14.

@nmayorov
Contributor

nmayorov commented Apr 7, 2021

> But is it theoretically justified? To me this is a rather unconventional tolerance check.

It's tricky. I believe I definitely saw it implemented like this somewhere, but I agree it looks unconventional.

Here are my thoughts on it. It is common wisdom that for large enough floating point numbers the correct measure of accuracy is relative tolerance, because computational algorithms will have errors proportional to the target value they estimate (not the best explanation, but I think it's more or less true and correct).

The situation changes as the target value goes down to zero. I wish there were some universal guidance and explanation on what to do in this situation, but as far as I know there is none. The most correct advice seems to be to specify a "problem specific" absolute tolerance in this case. But having xtol_rel and xtol_abs would be too much for the already complicated list of arguments. So we have the single xtol, and I came up with this "hybrid tolerance" formula; honestly, I don't remember why and how exactly.

At first glance this formula has the following properties (assuming the final x is very close to the true value):

  1. When |x| >> xtol, xtol works as a standard relative tolerance.
  2. When x is actually zero, xtol**2 will drive the solution closer to zero than xtol in the conventional formula would, i.e. closer to the true value (at the expense of more iterations).
  3. If |x| is small enough (roughly |x| << xtol), then xtol * (xtol + |x|) will also find a more accurate solution (with more iterations); see the sketch after this list.
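A small numerical sketch of these three regimes (the xtol and |x| values below are purely illustrative; "conventional" here means the combined absolute + relative form proposed in this PR):

```python
# Compare the hybrid threshold with a conventional absolute + relative one
# in the three regimes listed above (illustrative values only).
xtol = 1e-8
for x_abs in (1.0, 1e-10, 0.0):            # |x| >> xtol, |x| << xtol, x == 0
    hybrid = xtol * (xtol + x_abs)          # current _lsq.check_termination form
    conventional = xtol + xtol * x_abs      # canonical absolute + relative form
    print(f"|x| = {x_abs:.0e}: hybrid = {hybrid:.2e}, conventional = {conventional:.2e}")
```

Near zero the hybrid threshold shrinks to about xtol**2, several orders of magnitude below the xtol floor of the conventional form, which is where the extra accuracy (and the extra iterations) come from.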

My conclusion is that the implemented formula will basically try to find a more accurate solution if x is near zero. I don't have strong arguments for why that is good; maybe it looked like it somehow "balances" the quality of the solution better for small and large |x| (note that initially the true value is unknown). I don't think it's bad either, because there is no universal measure of accuracy when the solution is near zero. Actually, I think I saw it somewhere and thought it was a good idea.

All in all, maybe it would have been better to go with the more standard formula, but at this point I think it's probably better to keep the algorithm intact, because it has been around for some time. Also, you are not complaining about its performance, only about this conceptual point (important, but in practice least_squares seems to work well).

I have the old least_squares benchmarks here -- https://github.com/nmayorov/lsq-benchmarks. They should most likely work out of the box. If you have time, you can experiment with the current and the proposed formulas; maybe we'll see significant differences and develop intuition. If you are not very enthusiastic about it, I can do it myself, maybe on the weekend.

This pull request was closed.