Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: ENH: use common halley's method instead of parabolic variant #8489

Merged
merged 11 commits into from Mar 10, 2018

Conversation

mikofski
Copy link
Contributor

@mikofski mikofski commented Feb 27, 2018

I would like to propose that we change the form of the Halley's method in scipy.optimize.zeros from the "parabolic" variant currently in use (see issue #5922 ) with the more common form found in several other popular libraries and references (see list at end of email).

I think that there are at least four advantages to using the more common form:

  1. the parabolic variant has an unnecessary branching condition, which IMHO makes the file a bit more difficult to read, understand and maintain,

  2. the proposed changes actually make the zeros.py file smaller by removing about 5 lines,

  3. the common form seems to have more documentation and widespread usage, so it is easier to find, and should increase confidence in it's usage, and

  4. without branching conditions, it is easier to vectorize these methods, so that they can process numerous scalar, univariate cases simulataneously using numpy arrays or in a tight cythonized loop, with exit conditions that are similar to the multivariate case:

    • if derivatives are badly conditioned, then exit or
    • if absoluate max of all of the Newton steps is less than a given tolerance.

Thanks for your consideration!

List of libraries and references using the more common form of Halley's method:

  1. Boost
  2. the English language Wikipedia
  3. Wolfram's
  4. JuliaLang Roots.jl
  5. the R pracma package in newton.R

@mikofski
Copy link
Contributor Author

mikofski commented Feb 27, 2018

As you can see in #5922, the current parabolic variant of Halley's method doesn't have cubic converge , but the OP tried the more common version form, proposed here, as well and did observe cubic conference! Therefore I believe this PR closes #5922

@mikofski mikofski changed the title use common halley's method instead of parabolic variant BUG: ENH: use common halley's method instead of parabolic variant Feb 27, 2018
@person142
Copy link
Member

tl;dr: yeah we should switch to the common Halley's method.

Wanted to get to the bottom of this. It appears that "parabolic Halley's method" is more commonly known as "Halley's irrational formula"; see e.g.

http://mathworld.wolfram.com/HalleysIrrationalFormula.html

(The method commonly known as "Halley's method" today is "Halley's rational formula".)

A good historical summary of Halley's two formulas is in Bateman's paper "Haley's Methods for Solving Equations". AFAICT Halley's irrational formula has cubic convergence but can require going into the complex plane. I can't find any sources indicating that the current implementation's strategy of dropping the square root when it becomes complex is correct.

p = p0 - 2*fval / (fder + sign(fder) * sqrt(discr))
fder2 = fprime2(*myargs)
# Halley's method
p = p0 - 2 * fval * fder / (2 * fder ** 2 - fval * fder2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to use the update

p = p0 - (fval / fder) / (1 - (fvar / fder)*(0.5*fder2 / fder)

to avoid potential overflow in the square.

Copy link
Contributor Author

@mikofski mikofski Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have agreed, but I tried that first in this commit:

# this version has overflow issues
# p = p0 - 1 / (fder / fval - fder2 / fder / 2)

and this was the form that had overflow divide by zero issues. I'll try digging into to it more to figure out why. Thanks for looking this over and getting the Wolfram reference on the irrational Halley's form. Very interesting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@person142 , sorry, I was incorrect in my previous comment. It was a "divide by zero" runtime warning, not an overflow error. And the form is different than yours. I agree, your form is an improvement, it looks cleaner, ie: it's easier to see that fder2 is zero, then you get Newton's method, and it avoids fder**2 and a potential overflow. Thanks!

I made the change in 55017a7 , I had an inline comment, but then I remove it.

* thanks @person142 for the improvement!
* store newton_step = fval / fder to avoid redundant calculations
@mikofski
Copy link
Contributor Author

mikofski commented Mar 1, 2018

Do you think the grouping is very important? I kept the Newton step group, fval / fder , but I regrouped 0.5*fder2/fder - this still keeps fder**2 from being evaluated

@mikofski
Copy link
Contributor Author

mikofski commented Mar 1, 2018

FYI: this job 13883 didn't fail, it timed out. The message from Travis CI was:

The job exceeded the maximum time limit for jobs, and has been terminated.

But all tests did pass:

============================= test session starts ==============================
platform linux -- Python 3.6.3, pytest-3.4.1, py-1.5.2, pluggy-0.6.0
rootdir: /home/travis/build/scipy/scipy, inifile: pytest.ini
plugins: xdist-1.22.2, forked-0.2, faulthandler-1.4.1, cov-2.5.1
gw0 I / gw1 I / gw2 I
Coverage.py warning: --include is ignored because --source is set (include-ignored)
Coverage.py warning: --include is ignored because --source is set (include-ignored)
Coverage.py warning: --include is ignored because --source is set (include-ignored)
gw0 [14790] / gw1 [14790] / gw2 [14790]
scheduling tests via LoadScheduling
..........s.......s..................................................... [  0%]
........................................................................ [  0%]
.........................................s.............................. [  1%]
........................................................................ [  1%]
........................................................................ [  2%]
........................................................................ [  2%]
...................................................................x.... [  3%]
............................s........................................... [  3%]
................................................x........................ [  4%]
........................................................................ [  4%]
.................x......................................X............... [  5%]
..................................................x..................... [  5%]
........................................................................ [  6%]
........................................................................ [  6%]
.............................x.......................................... [  7%]
........................................................................ [  7%]
........................................................................ [  8%]
........................................................................ [  8%]
........................................................................ [  9%]
........................................................................ [  9%]
........................................................................ [ 10%]
........................................................................ [ 10%]
........................................................................ [ 11%]
........................................................................ [ 11%]
........................................................................ [ 12%]
........................................................................ [ 12%]
.................................................................s...... [ 13%]
........................................................................ [ 13%]
...................X.................................................... [ 14%]
........................................................................ [ 14%]
........................................................................ [ 15%]
........................................................................ [ 15%]
........................................................................ [ 16%]
........................................................................ [ 16%]
........................................................................ [ 17%]
........................................................................ [ 17%]
........................................................................ [ 18%]
........................................................................ [ 18%]
........................................................................ [ 18%]
.......................................s................................ [ 19%]
........................................................................ [ 19%]
........................................................................ [ 20%]
..........x......X....x...............................................X. [ 20%]
........................................................................ [ 21%]
........................................................................ [ 21%]
........................................................................ [ 22%]
........................................................................ [ 22%]
x....................................................................... [ 23%]
.........................................................x.............. [ 23%]
......................................................................... [ 24%]
........................................................................ [ 24%]
........................................................................ [ 25%]
........................................................................ [ 25%]
..................................................................X..... [ 26%]
........................................................................ [ 26%]
........................................................................ [ 27%]
........................................................................ [ 27%]
..........................................x............................. [ 28%]
........................................................................ [ 28%]
........................................................................ [ 29%]
.....ss......................................x.......ssssss............. [ 29%]
.......................................................s......sssss.ssss [ 30%]
sssssssssssssssssssssssssssss.....sss.s.x.s............................. [ 30%]
........................................................................ [ 31%]
........................................................................ [ 31%]
....................................................x................... [ 32%]
.................s.....sssss.sssssssssssssssssssssssss.ssssssss......... [ 32%]
.....................s.................................................. [ 33%]
.......................................................s................ [ 33%]
...x.......................x.......x...........................sssss.sss [ 34%]
ssssssssssssssssssssssssssssss........................................... [ 34%]
.........x......sss..x.................................................. [ 35%]
................ss.........................s............................ [ 35%]
..s.......s..................ss.s.sss.s.s.s.s.ssssss.......sss.ss....... [ 36%]
..............sss..x..s..s.x..xx...ss.s................................. [ 36%]
...............s.......ss...........................s.ssss...........x.. [ 37%]
......x.s.....ss.sss.s...s.........................................s.... [ 37%]
.................s.................................ss................s... [ 37%]
................................s.....sss.ssssssss.sssxs.s.ss.s.s........ [ 38%]
ss.ss..s...s...s...s....xss..........s........xx........................ [ 38%]
s....s.s..s.s.................ssss.s....sssss.s...s..................... [ 39%]
.........s.............................................................. [ 39%]
.................................................s...................... [ 40%]
................sssss.ssssssssssssssss.sssssssssssssssss................ [ 40%]
..s..............................x.x.....x.sssx.x......x.ssssss......... [ 41%]
............................s........................................... [ 41%]
.................................s...s.ss.....................ss..s.s... [ 42%]
.................ssss..s..ssss.s..s..s......s.......s..............s.... [ 42%]
......................s......s......s............s..........ss........... [ 43%]
..........s..............s.s.........s..ss..............................s [ 43%]
s.s..............ss......................s...........................s... [ 44%]
.................s......................s.ssssss......................s. [ 44%]
......................ss.............s.....................s.......s.... [ 45%]
...s..s...s...............sss....................ss..................... [ 45%]
..............................s........................s.............s.. [ 46%]
.s............................s....................s.............ss..... [ 46%]
........s....s.............s..................ssssssssssssssss.......... [ 47%]
..sss.ss...s.........s..s.x................s..sss...ssss.....s.......... [ 47%]
..sssssss.........ss........................sssssx..s.xs..sssss.s.sss.ss [ 48%]
.ss.s.ssssssssss..........ssss.ss.s...s..s.........s..xss................ [ 48%]
....ss...........s.....s..sssssssssssssssss....ssss.s..s....s...s....xss [ 49%]
........................ss.ss............................s.sss..s....ss. [ 49%]
sss.s..s.............................s.....s.........................s.. [ 50%]
......................................s.......s...............s......... [ 50%]
...........................s...........................s...............s [ 51%]
.......s....................s.....s..ss.........s...........s..sssssssss [ 51%]
ssssssss....ssss.ss.s...s..s...s.xs.s.............sss.sss.s...........ss [ 52%]
ssss.......s..s...s.s.........................ssss...s...ssss.ss..s..s... [ 52%]
................s..................................s.........s.......... [ 53%]
..............................ss.......................s............s... [ 53%]
....s.s..........s..ss.....................s........s.ssss.sssssssssssss [ 54%]
sss.ssss.ss.s.sssssssssssss...ss........................................ [ 54%]
s....................................s.......s..............s.......s... [ 55%]
.................ss........................s........s........s..s...s... [ 55%]
...................sss..s......ss.........s.............................. [ 56%]
...........s...........s...................s......s..........ss......... [ 56%]
....s........s..ss...................s......s......s.....ssss.ssssssssss [ 57%]
sssssssssssssssss.s........sss.ss.....s..s..sss.ssx........s......s.s..s [ 57%]
s.........s.x...........s.s..sssssssss..sssss..s.s..s.ss............sss. [ 57%]
s.ss..ssss.s.s..s..xs.xs..sssss.s.s.x.s.s..................s........sss. [ 58%]
...........ss...........s.....s..ssssssssssssssss.s.....sssss.ss...ss.s.. [ 58%]
.....s....s........xss.....ssss..s...sss.ss..ss......s........s.ss.sss.. [ 59%]
..........................s.sss..s...ssss.s.ssss.sss..........x.s....... [ 59%]
.........s..x.s...sssss.s.s.....................................s....... [ 60%]
........s..................s......s....................s......ss........ [ 60%]
.......s...ss........................s.....s..ss..ss.ss.ssssssssss.s..... [ 61%]
..ss.ss...s..s...s..s..s.s......s..ss....sxs.s.......................... [ 61%]
.s......................sss..s........ss.ss........................s.sss [ 62%]
.s..s.....sssss..s...sss................................................ [ 62%]
..............s...............s......................s.................. [ 63%]
....s.........................s.................................s....... [ 63%]
.................................s......s.......................s....... [ 64%]
.......ss.......................s.............s..s........s..s...s...... [ 64%]
...ss..............s......sss....s.............sssss..sssssssssss......s [ 65%]
s.......s..s....................s...........s.ss...ss............s...... [ 65%]
.........s...............s..s.x..s...............s...............s...... [ 66%]
..........................s.......ss............s..........s..s.ssss.s.s [ 66%]
s.s.ss.ssss.ssss.....ssss..s.s..s.....s.....s..........ss..........xs.s.. [ 67%]
.......sssssx.xs..sssss.s..s.....s..ss...............s.....s.s.ss........ [ 67%]
..............s....ssss..s..sssss.s.s............ss..............s...s.. [ 68%]
..........s...sssssssssssssssss....sss.s.s..s...s..s.xss................ [ 68%]
.....s.........ss.s..s.s.......................ssss.s..sssss.s.s........ [ 69%]
.......ss...........................s.....s.......s.s......s..ss........ [ 69%]
........................s.sss.............s.....s.s..................... [ 70%]
...................s..................................s................. [ 70%]
......s..s................s............s......s......................... [ 71%]
......ss...............ss...............s..................s.......s..ss [ 71%]
..........s...s.s..s...s.....s..............................ss.s.....sss [ 72%]
.........ss.......................ss.........s.............sss.......... [ 72%]
.............s..........................s..................s............ [ 73%]
........ss....................s.............s.....s.......s..s...s...... [ 73%]
...sss...ss..................................s.......................... [ 74%]
s..................s.......................sssssssssssssssss.......s.... [ 74%]
........................................................................ [ 75%]
........................................................................ [ 75%]
.........s.............................................................. [ 76%]
..............................................s......................... [ 76%]
.......................................................................s [ 77%]
........................................................................ [ 77%]
.....................x.s..ss........s.....s..sssssssssssssssss....ssss.s [ 77%]
s.s...s..s.xss.........sss....ss.ss...............ssss..s..sssss.s.s.... [ 78%]
x........s.......s......s......................................x........ [ 78%]
...xx...........................x....................................... [ 79%]
...........xxxxx..xx..xx................................................ [ 79%]
..............................................................xxxx...... [ 80%]
.....xxxx............................................................... [ 80%]
........................................................................ [ 81%]
..................x.x...................................x..x...x........ [ 81%]
..................................xx.xxxx....x.xx....................... [ 82%]
........................................................................ [ 82%]
...x......x............................................................. [ 83%]
........................................................................ [ 83%]
........................................................................ [ 84%]
....................x...X............................................... [ 84%]
..............................................s.ss............x..xs..x.. [ 85%]
xxxxxsx...x...s...s..ssss............................................... [ 85%]
......................................................................... [ 86%]
.xs......xx..x...x......................................x............... [ 86%]
........................................................................ [ 87%]
........................................................................ [ 87%]
........................................................................ [ 88%]
........ss.............................................................. [ 88%]
........................................................................ [ 89%]
........................................................................ [ 89%]
.............................................x.......................... [ 90%]
........................X............................................... [ 90%]
......................s................................................. [ 91%]
..................................................s..................... [ 91%]
........................................................................ [ 92%]
........................................................................ [ 92%]
............................x..................x........................ [ 93%]
........................................................................ [ 93%]
........................................................................ [ 94%]
........................................................................ [ 94%]
.........xx.xx........................................................... [ 95%]
................................s................s.....xxxx..x.......... [ 95%]
.x.........s...............................x...s..s....x...xx.x.....x.x. [ 96%]
................................................x.x..x.x................ [ 96%]
...................................................x.................... [ 96%]
........................................................................ [ 97%]
........................................................................ [ 97%]
....................................x................................... [ 98%]
........................................................................ [ 98%]
........................................................................ [ 99%]
........................................x..............x................ [ 99%]
.........x...                                                            [100%]
----------- coverage: platform linux, python 3.6.3-final-0 -----------
Coverage HTML written to dir /home/travis/build/scipy/scipy/build/coverage
=========================== short test summary info ============================
XPASS scipy/interpolate/tests/test_gil.py::TestGIL::()::test_rectbivariatespline race conditions, may depend on system load
XPASS scipy/linalg/tests/test_matfuncs.py::TestFractionalMatrixPower::()::test_singular Too unstable across LAPACKs.
XPASS scipy/optimize/tests/test_linprog.py::TestLinprogIPSparsePresolve::()::test_bug_6690 Fails with ATLAS, see gh-7877
XPASS scipy/ndimage/tests/test_datatypes.py::test_uint64_max runs only on darwin
XPASS scipy/optimize/tests/test_linprog.py::TestLinprogIPSparse::()::test_bug_6690 Fails with ATLAS, see gh-7877
XPASS scipy/special/tests/test_basic.py::TestCephes::()::test_fdtri_mysterious_failure Returns nan on i686.
XPASS scipy/stats/tests/test_distributions.py::TestF::()::test_stats_broadcast f stats does not properly broadcast
==== 13383 passed, 1260 skipped, 140 xfailed, 7 xpassed in 2026.27 seconds =====

@person142
Copy link
Member

it timed out

Yeah that happens sometimes. We can restart the build, but the docstring needs to be updated anyway (still mentions parabolic Halley's method).

It would be nice to find a simple test case that clearly demonstrated the previous version not having cubic convergence. I made a script

https://gist.github.com/person142/9789b09c599bda1c2377d496787d4ea0

to test the order of convergence, but haven't had time to hunt for an example with slow convergence yet. I don't think it's a necessity for merging though.

* remove the word "parabolic" from Halley's method
* update example, has always been the case that if fprime (internally
fder) is None, then it will always use secant, whether or not the 2nd
derivative fprime2 (interally fder2) is given or not.
@mikofski
Copy link
Contributor Author

mikofski commented Mar 1, 2018

  • Thanks for the Gist, I will add it as a test. I think it's worth it to properly document that we've addressed Suboptimal convergence of Halley's method? #5922.
  • I've updated the documentation in b796442 to remove the word "parabolic".
  • I also made minor changes to the examples, which seemed to imply that if fprime2 is given, even if fprime is None, then it would use Halley's which has never been the case. In version 1.0.0 it shows that if fprime is None the secant method is always used, regardless of fprime2

* add test by Josh Wilson (@person142) to check for at least cubic
convergence rate from Halley's finding roots of y = sin(x) near x = 3.5
* the test was originally posted in this gist:
https://gist.github.com/person142/9789b09c599bda1c2377d496787d4ea0
* try to import mpmath, and only run halley's convergence test if it is
available
* import the assert_array_less assertion and the skipif decorator from
numpy.testing
* numpy.testing.decorators.skipif requires nose, which isn't in
requirements
* looking through scipy codebase, it looks like pytest is used including
pytest.mark.skipif(condition)
@mikofski
Copy link
Contributor Author

mikofski commented Mar 4, 2018

If there's no more feedback then believe this PR is ready to merge. Thanks!

@mikofski
Copy link
Contributor Author

mikofski commented Mar 8, 2018

@person142 sorry to bother you. Do you have any more recommendations for merging this? I know you're very busy, and I appreciate your time. Thanks!

@person142
Copy link
Member

Do you have any more recommendations for merging this?

Long story short I think the test for order of convergence should be left out. It would be good to add a test for the new method in the complex domain: pick your favorite quadratic polynomial with negative discriminant. The current implementation won't be able to handle it, but the new version will.

Details on why no order of convergence test: I don't think there's going to be a non-pathological example where the current implementation doesn't do well. It will go wrong when [f'(x)]^2 < f(x)*f''(x), which makes it use the branch where you drop the square root term (i.e. you don't really use parabolic Haley's method). But around a "typical" root we will have that f'(x) = O(1), f''(x) = O(1), and f(x) = O(x), so this inequality won't hold. You can choose an atypical root (i.e. a double/triple root) and get poor convergence, e.g. the function x^2 + x^3 will quite nicely show second order convergence for the current implementation. But that's not a good example since the new implementation doesn't even converge (which is reasonable since it's a pathological).

@mikofski
Copy link
Contributor Author

OK, I've removed the cubic convergence test and added a complex test for Halley's.

@person142 person142 merged commit 1da95f6 into scipy:master Mar 10, 2018
@person142
Copy link
Member

Merged, thanks for your patience @mikofski.

@mikofski mikofski deleted the common_halleys branch March 10, 2018 03:16
adbugger pushed a commit to adbugger/scipy that referenced this pull request Mar 11, 2018
@pv pv added this to the 1.1.0 milestone Mar 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants