Implemented log(BesselK) for fractional orders #1121
Not all tests pass due to numerical issues
…gs/RELEASE_500/final)
CRAN won't allow those |
I think you should complex-step the derivative with respect to the order when it is a |
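For reference, a minimal sketch of the complex-step idea, not code from this PR. For an analytic function f, f'(x) is approximately Im f(x + ih) / h, and because no subtraction is involved h can be tiny. The names `complex_step_derivative` and `example_fn` are hypothetical; applying this to the order of Bessel K would also need an implementation that accepts a complex order, which is assumed here and not shown.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Complex-step derivative: f'(x) ~= Im(f(x + i*h)) / h.
// f must accept std::complex<double> and be analytic near x.
template <typename F>
double complex_step_derivative(F&& f, double x, double h = 1e-20) {
  assert(h > 0.0);
  const std::complex<double> xc(x, h);
  return std::imag(f(xc)) / h;
}

// Hypothetical test function f(x) = exp(x) * sin(x),
// whose exact derivative is exp(x) * (sin(x) + cos(x)).
inline std::complex<double> example_fn(std::complex<double> x) {
  return std::exp(x) * std::sin(x);
}
```

Calling `complex_step_derivative(example_fn, 0.7)` can then be compared against the closed-form derivative at 0.7.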
The derivative with respect to the primary argument has many formulas. |
Also, these tests against Mathematica are not very strong in my opinion. It is much better to test with identities, such as at http://functions.wolfram.com/Bessel-TypeFunctions/BesselK/17/ , that hold for all input values. Usually you can just divide the left-hand side of an identity by the right-hand side and check that the ratio is near one and |
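A sketch of what such an identity-based test could look like, using the recurrence K_{v+1}(z) = K_{v-1}(z) + (2v/z) K_v(z) and working from log values. The helper `recurrence_ratio` is hypothetical, and `log_k_half_integer` is only a stand-in built from the closed forms at orders 1/2, 3/2 and 5/2; in a real test the function under test would be passed instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Ratio LHS / RHS of K_{v+1}(z) = K_{v-1}(z) + (2v/z) K_v(z),
// computed from log K values. log_k is any callable (v, z) -> log K_v(z).
template <typename LogK>
double recurrence_ratio(LogK&& log_k, double v, double z) {
  assert(v > 0.0 && z > 0.0);
  const double lhs = log_k(v + 1.0, z);
  const double a = log_k(v - 1.0, z);
  const double b = std::log(2.0 * v / z) + log_k(v, z);
  const double m = std::max(a, b);  // log-sum-exp of the two RHS terms
  const double rhs = m + std::log(std::exp(a - m) + std::exp(b - m));
  return std::exp(lhs - rhs);
}

// Stand-in for the function under test: closed forms for v = 1/2, 3/2, 5/2.
inline double log_k_half_integer(double v, double z) {
  const double pi = std::acos(-1.0);
  const double base = 0.5 * std::log(pi / (2.0 * z)) - z;  // log K_{1/2}(z)
  if (v == 0.5) return base;
  if (v == 1.5) return base + std::log1p(1.0 / z);
  if (v == 2.5) return base + std::log1p(3.0 / z + 3.0 / (z * z));
  return std::numeric_limits<double>::quiet_NaN();
}
```

A test would assert that `recurrence_ratio(log_k, v, z)` is within a small tolerance of one over a grid of `v` and `z`.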
Also, it may be worth thinking about whether we need a specialized integration routine for |
I think you can ultimately avoid using |
Thanks for the feedback - I assume that there is no major known reason to not include this function in the library and it is therefore worthwhile for me to spend time working on it. I have tried to further improve the code. In particular, I have added analytical derivative wrt. z, but the derivative wrt. v is still from autodiff. There is a trick I use to combine those in the
Where
I updated to snprintf - is that still an issue? Would you suggest using std::ostringstream instead? Or do you believe the improved messages are not worth the hassle?
That's what I have done. I have also tried getting analytical derivative of the inner integral wrt. However, I use std::vector to be able to use precomputed_gradients which means the linter complains that the function should not be in scal :-(
Currently looking into this |
On Feb 19, 2019, at 10:58 AM, Martin Modrák ***@***.***> wrote:
> Thanks for the feedback - I assume that there is no major known reason to not include this function in the library and it is therefore worthwhile for me to spend time working on it. I have tried to further improve the code.
I have no idea about utility.
But all the analytic derivatives are great for anything we do implement.
...
> I updated to snprintf - is that still an issue? Would you suggest using std::ostringstream instead? Or do you believe the improved messages are not worth the hassle?
You can't write to the standard streams. Writing to a std::stringstream is fine. But the only way our functions should create text output is through exceptions.
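A minimal sketch of that pattern: the message is assembled in a string stream and leaves the function only as an exception. The helper name `check_error_below` and the wording of the message are hypothetical and are not the Stan Math error-checking API.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: throws if the estimated integration error
// exceeds the tolerance. Nothing is written to stdout or stderr.
inline void check_error_below(const char* function, double error_estimate,
                              double tolerance) {
  assert(function != nullptr);
  if (!(error_estimate <= tolerance)) {  // also catches NaN
    std::ostringstream msg;
    msg << function << ": error estimate of integral " << error_estimate
        << " exceeds the given relative tolerance " << tolerance;
    throw std::domain_error(msg.str());
  }
}
```

The caller decides what to do with the text by catching `std::domain_error` and reading `what()`.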
...
> However, I use std::vector to be able to use precomputed_gradients which means the linter complains that the function should not be in scal :-(
There may be a bug in the linter. std::vector is OK to use in the scal directory inside functions. What we don't want is to have functions that apply to std vectors defined in scal. Those should go in arr.
|
…sselk' into feature/issue-1112-continuous-besselk
I'll cosign. If you think it'll help you, then feel free to continue to implement. If it's written well, we can include it in the Math library for your use. ("well" is a very loosely defined term, but essentially, if there's very little maintenance burden for the future, then it's fine.)
@bob-carpenter, that's not correct. The difference in
@martinmodrak, will you put a comment here when it's ready for review? @bgoodri, would you be able to review once it's ready? |
I recall the decision going the other way, but it's no big deal as long as @syclik can lay out an acceptable workaround until the
@syclik: The question is where the definition of |
Thanks very much @SteveBronder for going through the code. Most of the suggestions are very sensible and once I resume working on this, I will implement them and will explain where I differ in opinion. Unfortunately, however, the feature is far from complete, as the function still has excessive errors for some parameters and does not pass tests (see the previous post: #1121 (comment) for some details). I am unsure about the timeline of resolving this, as it seems to require some additional math insights (which I do not have, and I have no more ideas where to look) and also because the project that drove this feature is now on the back burner as we encountered some other problems. I should probably have made the status of the PR more prominent; sorry if the feature does not make it in the end and your effort ends up not improving the Stan codebase. |
No worries man! If this is still a WIP then I'm going to close the PR for now |
@martinmodrak Did you try equation (24) of Rothwell in the remaining problematic region? Rothwell says equation (24) is "difficult to compute when |z| is small", but I think it might be accurate enough when
Here is my Rcpp version for the log of equation (24) in Rothwell:
|
@bgoodri Thanks for the hint. I actually tried the formula given in the question https://math.stackexchange.com/questions/1960778/approximating-the-log-of-the-modified-bessel-function-of-the-second-kind (the answer is mine), which helps in a bit of the parameter space but does not work toward the larger values. The formula 24 in Rothwell is actually a bit less stable than that: I haven't found parameter values for which the Rothwell 24 integral does not blow up while the "gamma integral" from the SE question does blow up. Also using
One thing that might help a lot would be a reimplementation of the exp_sinh integrator using the log_sum_exp logic, which Bob hinted is a possibility, allowing me to stay on the log scale (for the problematic parameters the maximum log of the integrand can crawl over several hundred). |
Here is a version of equation (24) from Rothwell with a log-sum-exp thing, which I would guess works better when
|
EDIT: Sorry, I misread your suggestion. I didn't try moving the maximum out. That sounds reasonable, I will try it! |
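Thanks @bgoodri, moving the maximum out helped a lot. This is the current error plot (and I think I can still tweak it a bit, so maybe I am almost there)
[image: image] <https://user-images.githubusercontent.com/9483603/69826973-bea02180-1215-11ea-9888-c070ddc8ffae.png> |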
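For readers following along, "moving the maximum out" refers to the usual log-sum-exp trick: log(sum_i exp(x_i)) = m + log(sum_i exp(x_i - m)) with m = max_i x_i, so every exponent is at most zero and nothing overflows. Below is a standalone sketch; it is not the Stan Math `log_sum_exp` and not the code from this branch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// log(sum_i exp(x_i)), computed by factoring out the largest term.
inline double log_sum_exp(const std::vector<double>& x) {
  assert(!x.empty());
  const double m = *std::max_element(x.begin(), x.end());
  if (!std::isfinite(m)) return m;  // non-finite maximum: nothing to rescale
  double sum = 0.0;
  for (double xi : x) sum += std::exp(xi - m);  // each term is in (0, 1]
  return m + std::log(sum);
}
```

With log-integrand values of several hundred, as described above, `std::exp(x_i)` alone would overflow a double while the rescaled sum stays of order one.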
Great, although I am slightly confused as to how it would help a lot if it
was similar to what you already had. But I'll take it. Are you still using
doubles? Maybe switching to long double would get us home?
|
Also, I have not tried this yet but
https://www.boost.org/doc/libs/1_71_0/libs/math/doc/html/math_toolkit/trapezoidal.html
which cites Table 17.1 of
https://epubs.siam.org/doi/pdf/10.1137/130932132
claims that the trapezoid rule is good for evaluating complex Bessel
functions.
|
Thanks for the help, I am finally moving forward. Just to report current status: It turns out the trapezoid rule on the cosh formula from the Arxiv paper (on the log scale, via log_sum_exp) works great for v ~= z, but only for the actual value. Autodiffing through the trapezoid rule yields bad values for d/dv.
[image: image] <https://user-images.githubusercontent.com/9483603/69879969-12b80e00-12c9-11ea-9509-9a8d823a76a5.png>
(there are more test points as I added testing for the boundaries where I choose different formulas)
But d/dv was computed OK with the previous algorithms, so my next plan is to combine the value from the trapezoid rule with derivatives from the asymptotic expressions. After that, the derivatives for very small v and z are still bad (error around 1e-3), which I still don't know how to tackle (the other integrals, including the trapezoid, give even worse results than Rothwell there). |
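A rough sketch of that approach, assuming "the cosh formula" is the standard integral representation K_v(z) = ∫_0^∞ exp(-z cosh t) cosh(v t) dt for z > 0. This is not the code from the branch: the function names, the fixed step `h`, and the stopping rule (quit once terms are 50 log-units below the running maximum) are all illustrative choices.

```cpp
#include <cassert>
#include <cmath>

// log(cosh(x)) without overflow for large |x|.
inline double log_cosh(double x) {
  const double ax = std::fabs(x);
  return ax + std::log1p(std::exp(-2.0 * ax)) - std::log(2.0);
}

// log K_v(z) by the trapezoid rule on the log scale. The sum is kept as
// running_max + log(scaled_sum), i.e. a streaming log-sum-exp.
inline double log_bessel_k_trapezoid(double v, double z, double h = 0.05) {
  assert(z > 0.0 && h > 0.0);
  double running_max = -z + std::log(0.5);  // t = 0 term, trapezoid weight 1/2
  double scaled_sum = 1.0;
  for (int k = 1; k < 1000000; ++k) {
    const double t = k * h;
    const double g = -z * std::cosh(t) + log_cosh(v * t);  // log integrand
    if (g > running_max) {
      scaled_sum = scaled_sum * std::exp(running_max - g) + 1.0;
      running_max = g;
    } else {
      scaled_sum += std::exp(g - running_max);
      if (g < running_max - 50.0) break;  // tail is negligible from here on
    }
  }
  return std::log(h) + running_max + std::log(scaled_sum);
}
```

The fixed step is only adequate while the peak of the integrand is wide compared to `h`; for large `v` or `z` the peak narrows and the step would have to shrink accordingly. The sketch returns only the value and says nothing about the derivative problems described above.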
I think the trapezoid rule with a log_sum_exp of equation 1.1 in
https://arxiv.org/pdf/1209.1547.pdf would work really well, except that the
maximum of the integrand apparently cannot be found analytically. So, I
guess we would have to use Boost's root-finding methods to find the maximum
and then subtract it.
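A sketch of the mode-finding step, again assuming the integrand exp(-z cosh t) cosh(v t), so that the log integrand is g(t) = -z cosh t + log cosh(v t) with slope g'(t) = -z sinh t + v tanh(v t). Plain bisection stands in here for Boost's root finders, and the function names are hypothetical. The bracket uses the facts that the slope is positive just right of zero when v² > z and non-positive at t = asinh(v/z).

```cpp
#include <cassert>
#include <cmath>

// Slope of the log integrand g(t) = -z*cosh(t) + log(cosh(v*t)).
inline double log_integrand_slope(double t, double v, double z) {
  return -z * std::sinh(t) + v * std::tanh(v * t);
}

// Location of the maximum of g on t >= 0, found by bisection on the slope.
inline double integrand_mode(double v, double z) {
  assert(v >= 0.0 && z > 0.0);
  if (v * v <= z) return 0.0;      // g is non-increasing: maximum at t = 0
  double lo = 0.0;                 // slope is positive just right of lo
  double hi = std::asinh(v / z);   // z*sinh(hi) = v >= v*tanh(v*hi)
  for (int i = 0; i < 200; ++i) {
    const double mid = 0.5 * (lo + hi);
    if (log_integrand_slope(mid, v, z) > 0.0) {
      lo = mid;
    } else {
      hi = mid;
    }
  }
  return 0.5 * (lo + hi);
}
```

The value of g at the returned point can then be subtracted from every log-integrand evaluation before exponentiating.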
|
Here is a rough version of equation 1.1 with root finding to get the mode and then use it with log-sum-exp logic. Unfortunately, Boost only added Newton root-finding of complex functions in version 1.70 so we can't (easily) use the complex step method to get the sensitivity to
|
Sorry. I was doing all this Newton stuff to get the value of
|
Don't have much to add, just wanted to express my gratitude for you guys working on this problem, as I've been having quite a bit of trouble tackling the asymptotic behavior of the modified Bessel function with nowhere near the knowledge you guys have. Thanks and keep up the great work @martinmodrak @bgoodri ! |
I am not sure finding the maximum is important for offsetting the For large Here's the code I used to create the latest plot:
Also, it seems like we are doubling some work here - I keep almost up-to-date code at https://github.com/martinmodrak/math/blob/feature/issue-1112-continuous-besselk/stan/math/rev/scal/fun/log_modified_bessel_second_kind_frac.hpp for you to check out. Maybe I should open a new PR / reopen this one, so that my commits are shown in context? Also - if you have the time - it might be sensible for you to try out your implementations with the tests I've developed...
Finally, I've had mostly unsatisfying experiences with complex step - on the (two) occasions I used it, it produced almost exactly the same results as Stan's autodiff and was slower to compute. Do you have some specific reason to believe it would help here?
@jacob-hjortlund I don't think I have knowledge. Before I started working on this, I didn't even know what Bessel functions are. I am learning it all as I go, and it hurts :-) Also I wouldn't get anywhere near where I am without the help of @bgoodri. Just wanted to let you know this, before you assume my opinions have some large weight :-) |
Just tried to include your implementation using
Leaving the Rothwell for small |
The trapezoid rule function in Boost https://www.boost.org/doc/libs/1_71_0/libs/math/doc/html/math_toolkit/trapezoidal.html has the adaptive stopping criterion but requires a finite upper bound to integrate to. I suppose we could use the largest floating point number, but maybe there is a way to use something smaller. |
Thanks, I'll look into it. Note that I can't use the boost implementation directly, because I need to keep the |
No worries; it is a holiday here. You can utilize the log-sum-exp logic without a fixed number of max steps, as I did in which in this case would be something like
But it is not super-accurate yet. Now, we need a better procedure to choose |
OK, I have some ideas, but need to eat before I attempt to implement the rest of them.
|
I am still having trouble when the order is https://www.wolframalpha.com/input/?i=Log%5BBesselK%5B37634%2C+4236.0%5D%5D but arb says it is https://dlmf.nist.gov/10.41.E2 which evaluates to |
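For reference, a sketch of the leading large-order term, assuming the linked DLMF 10.41.2 is the fixed-argument form K_v(z) ~ sqrt(pi / (2v)) * (e z / (2v))^(-v) as v grows. The function name is hypothetical. Only the leading term is included, so the result is crude unless v is much larger than z².

```cpp
#include <cassert>
#include <cmath>

// log of the leading large-order term:
// log K_v(z) ~= 0.5*log(pi/(2v)) - v*(1 + log(z) - log(2v)).
inline double log_bessel_k_large_order(double v, double z) {
  assert(v > 0.0 && z > 0.0);
  const double pi = std::acos(-1.0);
  return 0.5 * std::log(pi / (2.0 * v))
         - v * (1.0 + std::log(z) - std::log(2.0 * v));
}
```

Evaluating everything on the log scale avoids forming (e z / (2v))^(-v) directly, which overflows a double for orders like the one in the query above.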
I think we can get a pretty good version of log(besselK) with the new Boost. I just need to pull some things together. |
@martinmodrak, @bgoodri just scrolling through old pulls and found this. Ping! |
@martinmodrak I recently read at https://functions.wolfram.com/Bessel-TypeFunctions/BesselK/17/02/01/ that
so we can avoid the numerically problematic case where
although that might incur too much cancellation if Do you want to re-open this PR or open a similar one? I think numerically stable logic would be like
There might be another case that we have to deal with when |
Upon further review, I think that an approach based on the logarithm of https://functions.wolfram.com/Bessel-TypeFunctions/BesselK/07/01/01/0006/MainEq1.L.gif may work well. All the overflow / underflow stuff is attributable to the x^v in the denominator, which is well-handled by logarithms. The integral is oscillatory, but Boost has a customized function for integrals like that https://www.boost.org/doc/libs/1_76_0/libs/math/doc/html/math_toolkit/fourier_integrals.html that is said to have rapid convergence when the poles of the In other words, the main logic of a Stan Math implementation could look like this simple Rcpp one:
It does not seem to be very accurate for large |
Leaving this here https://arxiv.org/abs/2201.00090 which sounds like a good approach too |
There's this recent implementation https://github.com/tk2lab/logbesselk |
@bgoodri here's my attempt at coding up the algorithm in R following the paper description https://gist.github.com/spinkney/56d8b716248dae8545e5049f4dec33b1 |
Summary
Implemented logarithm of the modified Bessel function of the second kind (Bessel K) for fractional orders, supporting differentiation with respect to both variables.
Log(BesselK) is useful for computing the lpdf of some distributions, such as the Generalized inverse Gaussian or SICHEL.
The computation is based on https://github.com/stan-dev/stan/wiki/Stan-Development-Meeting-Agenda/0ca4e1be9f7fc800658bfbd97331e800a4f50011 (but modified to allow for stan::math::var arguments). Thanks to @bgoodri for providing it, I would not have been able to find this formula on my own.
The code snippet linked above is in turn based on Equation 26 of Rothwell: Computation of the logarithm of Bessel functions of complex argument and fractional order https://scholar.google.com/scholar?cluster=2908870453394922596&hl=en&as_sdt=5,33&sciodt=0,33
Both derivatives are computed by auto-diffing over the 1d integrator. It should be possible to get an explicit formula for the derivative with respect to v similarly to the way it is computed in modified_bessel_second_kind.
The name of the main function (log_modified_bessel_second_kind_frac) is provisional. The function is now in rev/arr/fun. This is IMHO weird, but I include rev/arr/functor/integrate_1d so the linter complained when the function was in rev/scal/
In addition, I tried to improve the error messages from integrate_1d and its documentation, as following the current documentation led me astray. This could be moved to a separate pull request if desired.
Tests
A grid of values of the function and its gradients was computed in Mathematica (code is part of the comments in the test file). The relative error between Mathematica and this implementation is <1e-7 for all values and gradients tested.
Side Effects
The error messages of integrate_1d have been modified to include the error threshold.
Checklist
Math issue BesselK continuous function #1112
Copyright holder: Institute of Microbiology of the Czech Academy of Sciences
The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
- the basic tests are passing
  - unit tests (./runTests.py test/unit)
  - header checks (make test-headers)
  - docs (make doxygen)
  - lint (make cpplint)
- the code is written in idiomatic C++ and changes are documented in the doxygen
- the new changes are tested