
more accurate and faster lgamma, and use a more standard branch cut #18330

Merged
merged 10 commits into JuliaLang:master from the lgamma branch on Sep 7, 2016

Conversation

@stevengj (Member) commented Sep 2, 2016

This fixes #18292, replacing @JeffBezanson's 2011 lgamma function from 4675b10 with a rewritten one that should fix several accuracy problems, and is also about 3x faster (and about 40% faster than the one in SciPy).

I am marking this "breaking" because I changed it to use a more standard branch cut for real(z) < 0, the same as SciPy's loggamma function and Mathematica's LogGamma. This is a different branch cut than you would get from log(gamma(z))! (It differs by some multiple of 2π in the imaginary part, so exp(lgamma(z)) is still the same Γ function.) The new branch cut has the advantage of being analytic everywhere in the complex plane except along the negative real axis.
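
As a quick illustration of the distinction (a sketch, not output from the PR; the exact 2π multiple depends on z):

z = -2.5 + 1.0im
lgamma(z)                   # new, standard branch cut (SciPy loggamma / Mathematica LogGamma)
log(gamma(z))               # may differ from the above by a multiple of 2πi
exp(lgamma(z)) ≈ gamma(z)   # holds either way: exp removes the 2πi ambiguity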

To do:

  • Needs more tests.
  • The recurrence relation can be sped up using a trick from Hare (1997), also used in SciPy: take the log of the product of the shifts, keeping track of the sign of the real part so that the correct branch is chosen, rather than summing the logs of the shifts. (A sketch follows this list.)
  • Fix accuracy near the zero of lgamma at z = 2.
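
A minimal sketch of the log(product) idea (my illustration, not the PR's final code; one common formulation watches sign flips of the imaginary part of the running product while its real part is negative, and lgamma_asymptotic is a hypothetical stand-in for the large-|z| series):

# lgamma(z) = lgamma(z+n) - log(z*(z+1)*...*(z+n-1)): take ONE log of the
# accumulated product instead of n separate logs.  The catch: arg() of the
# product is only known mod 2π, so count crossings of log's branch cut
# (the negative real axis) by the partial products and restore the
# missing multiples of 2πi.
function shifted_log_product(z::Complex128, n::Int)
    p = z
    crossings = 0
    sb = signbit(imag(p))             # sign of the imaginary part
    for k = 1:n-1
        p *= z + k
        sb′ = signbit(imag(p))
        if sb′ != sb && real(p) < 0   # imag sign flip on the negative real
            crossings += sb ? -1 : 1  # side ⇒ one branch-cut crossing
        end
        sb = sb′
    end
    return log(p) + crossings*2π*im
end
# usage: lgamma(z) ≈ lgamma_asymptotic(z + n) - shifted_log_product(z, n)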

@stevengj stevengj added kind:breaking This change will break code domain:maths Mathematical functions labels Sep 2, 2016
@stevengj (Member, Author) commented Sep 2, 2016

Okay, I implemented the log(product) trick with the correct branch cut, which seems to give a speed improvement of 30% or more, and added some more tests.

Should be good to merge once tests are green.

@stevengj (Member, Author) commented Sep 2, 2016

Hmm, I need to improve accuracy near the other zero (at z=2) as well, hold on.

@stevengj (Member, Author) commented Sep 2, 2016

Okay, fixed.

@stevengj (Member, Author) commented Sep 3, 2016

@StefanKarpinski or @ViralBShah or @jiahao, okay to merge?

t = zinv*zinv
# coefficients are bernoulli[2:n+1] .// (2*(1:n).*(2*(1:n) - 1))
return (z-0.5)*log(z) - z + 9.1893853320467274178032927e-01 + # <-- log(2pi)/2
zinv*@evalpoly(t, 8.3333333333333333333333368e-02,-2.7777777777777777777777776e-03,7.9365079365079365079365075e-04,-5.9523809523809523809523806e-04,8.4175084175084175084175104e-04,-1.9175269175269175269175262e-03,6.4102564102564102564102561e-03,-2.9550653594771241830065352e-02)
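
For reference, the long @evalpoly line implements the tail of the standard Stirling asymptotic series (a well-known identity, restated here; the coefficients are the rationals from the comment above, written out in double precision):

\log\Gamma(z) \sim \left(z - \tfrac{1}{2}\right)\log z - z + \tfrac{1}{2}\log(2\pi)
    + \sum_{n \ge 1} \frac{B_{2n}}{2n(2n-1)\,z^{2n-1}}
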
Contributor (review comment):

wrap this

Member Author (reply):

I felt like it was actually easier to read the code when these lines were left unwrapped, especially on github or in an editor like Atom that does not line-wrap for you, because then you don't have several lines of indecipherable numbers interrupting the flow of the code.

Contributor (reply):

The code is several lines of numbers. The flow is line continuation / indent.

Member Author (reply):

ok, fine

@stevengj (Member, Author) commented Sep 5, 2016

Bump. Okay to merge?

@StefanKarpinski (Member):

Sure, although I'm not remotely qualified to judge this :)

@tkelman (Contributor) commented Sep 6, 2016

@dlfivefifty is this one of those things where ApproxFun can help us rigorously verify our results against the series expansion?

@dlfivefifty (Contributor):

@tkelman What exactly are you hoping to verify? Are you after something mathematically rigorous, or rather just a second point of comparison?

It's not currently possible to do anything mathematically rigorous in ApproxFun. But I had a chat with David Sanders of https://github.com/dpsanders/ValidatedNumerics.jl, and the following should be implementable: the Fourier/Taylor/Chebyshev/Laurent coefficients can be calculated rigorously as intervals, provided that the input function

  1. can be rigorously evaluated pointwise on intervals
  2. has the first few derivatives implemented via automatic differentiation (also evaluatable on intervals)

As a side effect, evaluating the resulting Fun should also give a rigorous interval.

@dlfivefifty (Contributor):

@dpsanders

@stevengj (Member, Author) commented Sep 6, 2016

In this case, you need to check it in the complex plane, so not just with intervals. And the Taylor series is a deathly slow way to compute lgamma far from z=1, if it can be used at all.

Here is a quick check of the maximum relative error, with 10^7 points against SciPy:

using PyCall
lg = pyimport("scipy.special")["loggamma"]   # SciPy's loggamma (same branch cut)
z = (randn(10^7) + randn(10^7)*im) * 10;     # 10^7 random complex points
err(a,b) = abs(a - b) / abs(b)               # relative error
maximum(err.(lgamma.(z), lg(z)))

gives 1.175661212573326e-13.

@dlfivefifty (Contributor):

If your algorithm is a sequence of analytic operations, then you wouldn't need to check everywhere in the complex plane: accuracy along a contour would give accuracy within the enclosed region (provided you avoid the singularities of lgamma).

@stevengj (Member, Author) commented Sep 6, 2016

@dlfivefifty, that's only if you assume that there are no bugs and your code is really computing an analytic function. This is not a good thing to assume, particularly for code that involves switching between multiple approximations.

@stevengj (Member, Author) commented Sep 6, 2016

Note also that at the point of maximum difference compared to SciPy, i.e. z[indmax(err.(lgamma.(z), lg(z)))] == 1.5000165664503786 - 0.014135858766095162im, my code is actually more accurate. I checked against WolframAlpha at that point, and my actual relative error is 2.6e-15, whereas SciPy's error is 1.1e-13.

@dlfivefifty (Contributor) commented Sep 6, 2016

But you can't "rigorously" verify you don't have any bugs...

What you could do is, for each approximation scheme, instead of running it on a z::Complex128, run the code on

z=Fun(identity,Taylor(Circle(z_0,r)))

Then check the returned Fun is accurate compared to

Fun(z->old_lgamma(z),Taylor(Circle(z_0,r)))

This will give a pretty good guarantee that it is accurate everywhere inside the circle.
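
Putting the two snippets above together, a hedged sketch of the proposed check (it assumes the ApproxFun API exactly as quoted; new_lgamma and old_lgamma are hypothetical stand-ins for the two implementations, and z_0, r pick the region being validated):

using ApproxFun
z = Fun(identity, Taylor(Circle(z_0, r)))
f_new = new_lgamma(z)      # run the approximation branch on the Fun itself
f_old = Fun(w -> old_lgamma(w), Taylor(Circle(z_0, r)))
norm(f_new - f_old)        # small ⇒ the two agree everywhere inside the circle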

@stevengj (Member, Author) commented Sep 6, 2016

In my experience, the most effective ways to test special-function implementations are (1) exhaustive randomized tests against other implementations combined with (2) tests near known problem spots:

  • Crossover points between two different approximations
  • Near zeros (including zeros in just the real or imaginary parts), singularities, and branch cuts.
  • Extreme arguments (very large, very small, Inf, NaN).
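
For concreteness, a hedged sketch of such targeted tests (my examples, not the PR's actual test additions):

using Base.Test  # Julia 0.5-era test framework

# zeros of lgamma on the real axis (Γ(1) = Γ(2) = 1)
@test abs(lgamma(1.0 + 0.0im)) < 1e-15
@test abs(lgamma(2.0 + 0.0im)) < 1e-15

# branch cut along the negative real axis: values just above and just
# below the cut should be complex conjugates of each other
@test lgamma(-2.5 + 1e-12im) ≈ conj(lgamma(-2.5 - 1e-12im))

# extreme arguments
@test real(lgamma(Inf + 0.0im)) == Inf
@test isnan(real(lgamma(NaN + 0.0im)))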

Again, the whole point is to check the accuracy of the different approximations you are using, and especially to check how you combine them. Checking on the boundary of a circle and assuming that things are smooth inside is pretty useless for that; you have to check the crossover points and the extremes.

And no, you can't rigorously prove that you don't have bugs. But you're much more likely to find them if you look for them than if you assume they don't exist, and you have to look for them based on what kinds of bugs are most common in practice.

@dlfivefifty (Contributor):

Running the same algorithm on a special data type that tracks a region is not "assuming" it is smooth on the inside: as long as the functions you call are smooth on the inside, then the result is also smooth on the inside.

@stevengj (Member, Author) commented Sep 6, 2016

@dlfivefifty, the problem is that the functions you call are not smooth. You are discontinuously switching between different approximation techniques in different regions of the complex plane, with the intention that the approximations are discontinuous only at (or near) the machine-precision level. You need to check that (a) each approximation branch is bug-free and (b) that the approximations are each accurate to the desired precision in the whole region where they are used.

@stevengj (Member, Author) commented Sep 6, 2016

Also, you need to check the relative accuracy everywhere in each region, and I think the technique you are discussing only bounds the absolute error.

@dlfivefifty (Contributor):

In that case you are probably better off with the random check.

@dlfivefifty (Contributor):

PS: to check the relative error, all you would have to do is check

norm(u_approx/u_ex - 1)

on the boundary, provided that u_ex has no zeros.

@stevengj (Member, Author) commented Sep 7, 2016

@dlfivefifty, we actually want to check the relative error in both the real and imaginary parts separately. In any case, as I said, checking on circular contours does not help with checking the discontinuities in the approximation techniques, nor does it necessarily exercise all the code paths, nor does it check extreme arguments, nor does it deal with the fact that lgamma has zeros, poles, and branch cuts.

@dlfivefifty (Contributor):

Yes, for lgamma, checking at random points suffices. But that's not possible for functions with many parameters, since the number of random points needed grows exponentially: I don't think you can verify even the hypergeometric 2F1 this way.

However, if you split an algorithm into separate components built out of analytic functions (essentially all functions that are used are piecewise analytic), then each component can be checked individually by checking only on the boundary of the domain where it is used.

@stevengj (Member, Author) commented Sep 7, 2016

As I said above, I don't think checking at random points suffices even for lgamma; it is just a good starting point and sanity check, complemented by checking hand-selected "hard cases" more carefully. (Even then, it is not practical to prove that you don't have a bug.)

If you have a function of multiple arguments, even Bessel or polygamma functions (much less hypergeometric functions), I don't follow you — your technique only applies to contours in the complex plane, not to multidimensional manifolds, no? And even for single-argument functions it doesn't apply for checking the real and imaginary parts separately since these are not generally analytic anyway. Not to mention, as I said, that most realistic special functions have zeros and singularities.

@dlfivefifty (Contributor):

Bessel/hypergeometric are analytic in all arguments, so you need only check at the boundary of a multidimensional manifold in the multivariate complex plane.

The real and imaginary parts of an analytic function are harmonic, and so their error is also maximized at the boundary.
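
The property invoked here is the maximum principle for harmonic functions; stated minimally, and assuming the error e(z) = f_approx(z) - f(z) is itself analytic on the region Ω (which is exactly the assumption under dispute above):

\max_{z \in \overline{\Omega}} |\operatorname{Re} e(z)| = \max_{z \in \partial\Omega} |\operatorname{Re} e(z)|,
\qquad
\max_{z \in \overline{\Omega}} |\operatorname{Im} e(z)| = \max_{z \in \partial\Omega} |\operatorname{Im} e(z)|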

@stevengj (Member, Author) commented Sep 7, 2016

Oh right. But note that the real and imaginary parts typically have many more zeros than the function as a whole, so you probably can't bound the relative error this way.

Ignoring the whole issue of zeros etc., implementing your scheme in practice for a high-dimensional function is going to be challenging because of the large number of odd-shaped high-dimensional manifolds that are probably required to exhaustively check all of the different approximation regions, especially once you factor in things like recurrences (each iteration of which introduces a new manifold). And in the end, you still have a curse of dimensionality, just with one dimension fewer.

@stevengj (Member, Author) commented Sep 7, 2016

Anyway, this argument seems academic. As discussed above, it is not particularly applicable to lgamma, and if you want to go and implement your approach for checking, say, the Bessel function implementations, good luck to you.

Since no one seems to have any objection to the change of branch cuts, and the new implementation is a clear improvement on the old one in every other way, I'm going to go ahead and merge in the next day or two.

@dlfivefifty (Contributor):

I was only answering @tkelman's question ("is this one of those things where ApproxFun can help us rigorously verify our results against the series expansion?"), not suggesting you actually do it.

@stevengj (Member, Author) commented Sep 7, 2016

I think the correct answer to @tkelman's question is simply, "no."

@dlfivefifty (Contributor):

A more precise correct answer is:

  1. Yes for real values, if you break up the real line at the points where the algorithm changes and at roots, and assume the other functions called are correct. You don't need ApproxFun, just interval arithmetic.
  2. No for complex values: it's theoretically possible, but too complicated.

@stevengj stevengj merged commit c67ab28 into JuliaLang:master Sep 7, 2016
@stevengj stevengj deleted the lgamma branch September 7, 2016 16:35
pkofod added a commit to pkofod/julia that referenced this pull request Feb 21, 2018
pkofod added a commit to pkofod/julia that referenced this pull request Feb 21, 2018
StefanKarpinski added a commit that referenced this pull request Feb 21, 2018
Remove unused angle_restrict_symm function left over from #18330.
Labels: domain:maths Mathematical functions; kind:breaking This change will break code

Successfully merging this pull request may close these issues:

  • lgamma() fails for large negative imaginary arguments (#18292)

4 participants