
Implemented log(BesselK) for fractional orders #1121

Conversation

martinmodrak
Contributor

@martinmodrak martinmodrak commented Feb 18, 2019

Summary

Implemented logarithm of the modified Bessel function of the second kind (Bessel K) for fractional orders, supporting differentiation with respect to both variables.

Log(BesselK) is useful for computing the lpdf of some distributions, such as the generalized inverse Gaussian or the Sichel distribution.

The computation is based on https://github.com/stan-dev/stan/wiki/Stan-Development-Meeting-Agenda/0ca4e1be9f7fc800658bfbd97331e800a4f50011 (but modified to allow for stan::math::var arguments). Thanks to @bgoodri for providing it; I would not have been able to find this formula on my own.

The code snippet linked above is in turn based on Equation 26 of Rothwell, "Computation of the logarithm of Bessel functions of complex argument and fractional order": https://scholar.google.com/scholar?cluster=2908870453394922596&hl=en&as_sdt=5,33&sciodt=0,33

Both derivatives are computed by auto-diffing over the 1d integrator. It should be possible to get an explicit formula for the derivative with respect to v similarly to the way it is computed in modified_bessel_second_kind.
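For the derivative with respect to z, there is in fact a standard explicit form via the recurrence for Bessel K (this is the textbook identity, not necessarily what the code currently uses):

```latex
\frac{\partial}{\partial z} K_v(z) \;=\; -\tfrac{1}{2}\bigl(K_{v-1}(z) + K_{v+1}(z)\bigr)
\quad\Longrightarrow\quad
\frac{\partial}{\partial z} \log K_v(z) \;=\; -\,\frac{K_{v-1}(z) + K_{v+1}(z)}{2\,K_v(z)} .
```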

The name of the main function (log_modified_bessel_second_kind_frac) is provisional. The function is now in rev/arr/fun. This is IMHO weird, but I include rev/arr/functor/integrate_1d, so the linter complained when the function was in rev/scal/.

In addition, I tried to improve the error messages from integrate_1d and its documentation, as following the current documentation led me astray. This could be moved to a separate pull request if desired.

Tests

A grid of values of the function and its gradients was computed in Mathematica (code is part of the comments in the test file). The relative error between Mathematica and this implementation is <1e-7 for all values and gradients tested.

Side Effects

The error messages of integrate_1d have been modified to include the error threshold.

Checklist

  • Math issue BesselK continuous function #1112

  • Copyright holder: Institute of Microbiology of the Czech Academy of Sciences

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass (make test-headers)
    • docs build (make doxygen)
    • code passes the built-in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@bgoodri
Contributor

bgoodri commented Feb 18, 2019

CRAN won't allow those sprintf things

@bgoodri
Contributor

bgoodri commented Feb 18, 2019

I think you should complex-step the derivative with respect to the order when it is a var, but that is going to require the Boost 1.70 PR to get merged so that you can use complex integrands.

@bgoodri
Contributor

bgoodri commented Feb 18, 2019

The derivative with respect to the primary argument has many formulas.

@bgoodri
Contributor

bgoodri commented Feb 18, 2019

Also, these tests against Mathematica are not very strong in my opinion. It is much better to test with identities, such as at http://functions.wolfram.com/Bessel-TypeFunctions/BesselK/17/ , that hold for all input values. Usually you can just divide the left-hand side of an identity by the right-hand side and check that the ratio is near one and 1 + the derivative is near one.

@bgoodri
Contributor

bgoodri commented Feb 18, 2019

Also, it may be worth thinking about whether we need a specialized integration routine for log_foo functions that utilizes log_sum_exp logic.

@bgoodri
Contributor

bgoodri commented Feb 18, 2019

I think you can ultimately avoid using stan::math::integrate_1d and just use boost::math::quadrature::tanh_sinh, in which case you could move the functions back to the scal directories if they do not all get flattened first.

@martinmodrak
Contributor Author

Thanks for the feedback - I assume that there is no major known reason not to include this function in the library, and that it is therefore worthwhile for me to spend time working on it. I have tried to further improve the code.

In particular, I have added an analytical derivative wrt. z, but the derivative wrt. v still comes from autodiff. There is a trick I use to combine those in the <var,var> case which I am not sure is great, but it works:

  if(is_var<T_v>::value) {
    operands.push_back(value);
    gradients.push_back(1);
  }
  operands.push_back(z);
  gradients.push_back(gradient_dz);

  return precomputed_gradients(value_of(value), operands, gradients);

Here value contains the result of the autodiffed computation when v is a var (and is a double when v is a double).

CRAN won't allow those sprintf things

I updated to snprintf - is that still an issue? Would you suggest using std::ostringstream instead? Or do you believe the improved messages are not worth the hassle?

I think you can ultimately avoid using stan::math::integrate_1d and just use boost::math::quadrature::tanh_sinh, in which case you could move the functions back to the scal directories if they do not all get flattened first.

That's what I have done. I have also tried getting an analytical derivative of the inner integral wrt. v, to avoid using nested autodiff, but I could not get it to compute stably. I now have custom nested autodiff code for this, which is just a simplified gradient_of_f.

However, I use std::vector to be able to use precomputed_gradients, which means the linter complains that the function should not be in scal :-(

It is much better to test with identities

Currently looking into this.

@bob-carpenter
Contributor

bob-carpenter commented Feb 20, 2019 via email

@syclik
Member

syclik commented Feb 21, 2019

I have no idea about utility.

But all the analytic derivatives are great for anything we do implement.

I'll cosign. If you think it'll help you, then feel free to continue to implement. If it's written well, we can include it in the Math library for your use. ("well" is a very loosely defined term, but essentially, if there's very little maintenance burden for the future, then it's fine.)

There may be a bug in the linter. std::vector is OK to use in the scal directory inside functions. What we don't want is to have functions that apply to std vectors defined in scal. Those should go in arr.

@bob-carpenter, that's not correct. The difference in scal vs arr was that scal does not include any std::vector. In retrospect, that's probably not really a useful distinction and we're trying to flatten the structure, but for the time being, it would have to be in arr since it depends on std::vector.

@martinmodrak, will you put a comment here when it's ready for review?

@bgoodri, would you be able to review once it's ready?

@bob-carpenter
Contributor

I recall the decision going the other way, but it's no big deal as long as @syclik can lay out an acceptable workaround until the scal / arr / mat distinction is collapsed.

@syclik: The question is where the definition of real foo(real) should go if there is a std::vector in the implementation? Is it OK to put it in arr?

@martinmodrak
Contributor Author

Thanks very much @SteveBronder for going through the code. Most of the suggestions are very sensible, and once I resume working on this I will implement them and explain where I differ in opinion.

Unfortunately, however, the feature is far from complete, as the function still has excessive errors for some parameters and does not pass tests (see the previous post: #1121 (comment) for some details). I am unsure about the timeline for resolving this, as it seems to require some additional math insights (which I do not have, and I don't have more ideas about where to look), and also because the project that drove this feature is now on the back burner as we encountered some other problems.

I should probably have made the status of the PR more prominent; sorry if the feature does not make it in the end and your effort ends up not improving the Stan codebase.

@SteveBronder
Collaborator

No worries man! If this is still a WIP, then I'm going to close the PR for now.

@bgoodri
Contributor

bgoodri commented Nov 24, 2019

@martinmodrak Did you try equation (24) of Rothwell in the remaining problematic region? Rothwell says equation (24) is "difficult to compute when |z| is small", but I think it might be accurate enough when z is not small but close to v. For z = 11 and v = 50, I get function values of
58.0217473497870557 from the log of equation (24) with long doubles
58.02174734978707484 from WolframAlpha
and complex-step derivatives with respect to v of
2.20960269175438162 from the log of equation (24) with long doubles
2.209602691754382164 from WolframAlpha
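For reference, equation (24) is, I believe, the integral representation in DLMF 10.32.8 (valid for v > -1/2), which is exactly what the code below computes on the log scale:

```latex
K_v(x) \;=\; \sqrt{\frac{\pi}{2x}}\,
\frac{e^{-x}}{\Gamma\!\left(v+\tfrac12\right)}
\int_0^\infty e^{-t}\, t^{\,v-\frac12}
\left(1+\frac{t}{2x}\right)^{v-\frac12} dt .
```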

Here is my Rcpp version for the log of equation (24) in Rothwell:

// [[Rcpp::plugins(cpp14)]]
// [[Rcpp::depends(BH)]]
#include <Rcpp.h>
#include <boost/math/quadrature/exp_sinh.hpp>
#include <boost/math/special_functions/digamma.hpp>
#include <cmath>
#include <complex>
#include <limits>

// [[Rcpp::export]]
std::complex<double> log_besselK(double x, double nu, double h = 1e-9) {
  using std::exp;
  using std::log;
  using std::log1p;
  
  const std::complex<long double> vmhalf(nu - 0.5, h);

  boost::math::quadrature::exp_sinh<long double> integrator;
  auto log_Q = log(integrator.integrate([&](const long double t) {
    return exp(-t + vmhalf * (log(t) + log1p(0.5 * t / x)));
  }, std::sqrt(std::numeric_limits<long double>::epsilon())));
  std::complex<long double> log_K(0.5 * log(0.5 * M_PI / x) 
                                  - x - std::lgamma(nu + 0.5) + log_Q.real(),
                                  - h * boost::math::digamma(nu + 0.5) 
                                  + log_Q.imag());
  return static_cast<std::complex<double> >(log_K);
}

@martinmodrak
Contributor Author

martinmodrak commented Nov 27, 2019

@bgoodri Thanks for the hint. I actually tried the formula given at the question https://math.stackexchange.com/questions/1960778/approximating-the-log-of-the-modified-bessel-function-of-the-second-kind (the answer is mine), which helps in a bit of the parameter space but does not work toward the larger values. The formula 24 in Rothwell is actually a bit less stable than that: I haven't found parameter values for which the Rothwell 24 integral does not blow up while the "gamma integral" from the SE question does blow up.

Also using long double is not of much help as it increases the range where the integrands don't blow up only very slightly (unsurprisingly - due to the exponential behavior).

One thing that might help a lot would be a reimplementation of the exph_sinh integrator using the log_sum_exp logic, which Bob hinted is a possibility, allowing me to stay on the log scale. (for the problematic parameters the maximum log of the integrand can crawl over several hundred)

Here is the current state of affairs.
[image: error plot]

@bgoodri
Contributor

bgoodri commented Nov 27, 2019

Here is a version of equation (24) from Rothwell with a log-sum-exp thing, which I would guess works better when v is large.

// [[Rcpp::plugins(cpp14)]]
// [[Rcpp::depends(BH)]]
#include <Rcpp.h>
#include <boost/math/quadrature/exp_sinh.hpp>
#include <boost/math/special_functions/digamma.hpp>
#include <cmath>
#include <complex>
#include <limits>

// [[Rcpp::export]]
std::complex<double> log_besselK(double x, double nu, double h = 1e-9) {
  using std::exp;
  using std::log;
  using std::log1p;
  
  const std::complex<long double> v(nu, h);
  const std::complex<long double> half(0.5);
  const std::complex<long double> x_(x);
  auto maximum = std::sqrt(v * v - v + x_ * x_ + half * half) + v - x_ - half;

  auto vmhalf = v - half;
  boost::math::quadrature::exp_sinh<long double> integrator;
  auto log_Q = maximum.real() > 0 
             ? maximum + log(integrator.integrate([&](const long double t) {
                 auto t_max = t + maximum;
                 return exp(-t_max + vmhalf * (log(t) + log1p(0.5 * t / x)));
               }, std::sqrt(std::numeric_limits<long double>::epsilon())))
             : log(integrator.integrate([&](const long double t) {
                 return exp(-t + vmhalf * (log(t) + log1p(0.5 * t / x)));
             }, std::sqrt(std::numeric_limits<long double>::epsilon())));
  std::complex<long double> log_K(0.5 * log(0.5 * M_PI / x)
                                  - x - std::lgamma(nu + 0.5) + log_Q.real(),
                                  - h * boost::math::digamma(nu + 0.5)
                                  + log_Q.imag());
  return static_cast<std::complex<double> >(log_K);
}

@martinmodrak
Contributor Author

martinmodrak commented Nov 27, 2019

EDIT: Sorry, I misread your suggestion. I didn't try moving the maximum out. That sounds reasonable; I will try it!

@martinmodrak
Contributor Author

Thanks @bgoodri , moving the maximum out helped a lot. This is the current error plot (and I think I can still tweak it a bit, so maybe I am almost there):

[image: error plot]

@bgoodri
Contributor

bgoodri commented Nov 28, 2019 via email

@bgoodri
Contributor

bgoodri commented Nov 28, 2019 via email

@bgoodri
Contributor

bgoodri commented Nov 28, 2019

@martinmodrak
Contributor Author

martinmodrak commented Nov 29, 2019

Thanks for the help, I am finally moving forward. Just to report the current status: it turns out the trapezoid rule on the cosh formula from the arXiv paper (on the log scale, via log_sum_exp) works great for v ~= z, but only for the actual value and d/dz. Autodiffing through the trapezoid rule yields bad values for d/dv.

[image: error plot]

(There are more test points, as I added testing for the boundaries where I choose different formulas.)

But d/dv was computed OK by the previous algorithms, so my next plan is to combine the value from the trapezoid rule with derivatives from the asymptotic expressions. After that, the derivatives for very small v and z are still bad (error around 1e-3), which I still don't know how to tackle (the other integrals, including the trapezoid, give even worse results than Rothwell there).

@bgoodri
Contributor

bgoodri commented Nov 29, 2019 via email

@bgoodri
Contributor

bgoodri commented Nov 29, 2019

Here is a rough version of equation 1.1, with root finding to get the mode, which is then used with the log-sum-exp logic. Unfortunately, Boost only added Newton root-finding of complex functions in version 1.70, so we can't (easily) use the complex-step method to get the sensitivity to v.

// [[Rcpp::plugins(cpp14)]]
// [[Rcpp::depends(BH)]]
#include <Rcpp.h>
#include <boost/math/quadrature/exp_sinh.hpp>
#include <boost/math/tools/roots.hpp>
#include <cmath>
#include <iostream>
#include <limits>
#include <tuple>  // for std::tuple and std::make_tuple


// [[Rcpp::export]]
double log_besselK(double x, double nu) {
  using std::exp;
  using std::log;
  using std::log1p;
  using std::cosh;
  using std::sinh;
  using std::tanh;
  
  double ub = 1.0;
  
  while ( -x * sinh(ub) + nu * tanh(ub * nu) > 0 ) {
    ub *= 2.0;
  }
  double guess = 0.75 * ub;
  
  boost::uintmax_t maxit = 20;
  using boost::math::tools::newton_raphson_iterate;
  
  auto maximum = newton_raphson_iterate(
    [&](double t){
      auto sinh_t = sinh(t);
      auto v_t = nu * t;
      auto tanh_vt = tanh(v_t);
      auto cosh_vt = cosh(v_t);
      auto x_sinh_t = x * sinh_t;
      auto value = nu * tanh_vt - x_sinh_t;
      auto ratio = nu / cosh_vt;
      auto ratio2 = ratio * ratio;
      auto D1 = ratio2 - x * cosh(t);
      return std::make_tuple(value, D1);
    }, guess, ub > 1.0 ? 0.5 * ub : 0, ub, 8, maxit);
  
  double error_estimate;
  double L1;

  boost::math::quadrature::exp_sinh<double> integrator;
  double termination = sqrt(std::numeric_limits<double>::epsilon());
  
  auto log_K = maximum + log(integrator.integrate(
    [&](double t) {
      auto nut = nu * t;
      // log(cosh(x)) == x + log1p(exp(-2 * x)) - log(2)
      return exp(-maximum + nut + log1p(exp(-2.0 * nut)) 
                 - log(2.0) - x * cosh(t));
    }, termination, &error_estimate, &L1));

  std::cout << "error_estimate = " << error_estimate << std::endl 
            << "L1 = " << L1 << std::endl
            << "maximum = " << maximum << std::endl;
  return log_K;
}

@bgoodri
Contributor

bgoodri commented Nov 29, 2019

Sorry. I was doing all this Newton stuff to get the value of t that maximizes the integrand, but then shifting by that argmax instead of shifting by the maximal log-value. This version should actually be good (for the value; we still have to deal with the autodiff), because when shifted properly there is relatively little area under the curve:

// [[Rcpp::plugins(cpp14)]]
// [[Rcpp::depends(BH)]]
#include <Rcpp.h>
#include <boost/math/quadrature/exp_sinh.hpp>
#include <boost/math/tools/roots.hpp>
#include <cmath>
#include <iostream>
#include <limits>
#include <tuple>  // for std::tuple and std::make_tuple

// [[Rcpp::export]]
double log_besselK(double x, double nu) {
  using std::exp;
  using std::log;
  using std::log1p;
  using std::cosh;
  using std::sinh;
  using std::tanh;
  
  double ub = 1.0;
  
  while ( -x * sinh(ub) + nu * tanh(ub * nu) > 0 ) {
    ub *= 2.0;
  }
  double guess = 0.75 * ub;
  
  boost::uintmax_t maxit = 20;
  using boost::math::tools::newton_raphson_iterate;
  
  auto arg_max = newton_raphson_iterate(
    [&](double t){
      auto sinh_t = sinh(t);
      auto v_t = nu * t;
      auto tanh_vt = tanh(v_t);
      auto cosh_vt = cosh(v_t);
      auto x_sinh_t = x * sinh_t;
      auto value = nu * tanh_vt - x_sinh_t;
      auto ratio = nu / cosh_vt;
      auto ratio2 = ratio * ratio;
      auto D1 = ratio2 - x * cosh(t);
      return std::make_tuple(value, D1);
    }, guess, ub > 1.0 ? 0.5 * ub : 0, ub, 8, maxit);
  
  const double log2 = log(2.0);
  // log(cosh(x)) == x + log1p(exp(-2 * x)) - log(2)
  auto pivot = nu * arg_max + log1p(exp(-2.0 * nu * arg_max)) - log2
             - x * cosh(arg_max);
  
  double error_estimate;
  double L1;

  boost::math::quadrature::exp_sinh<double> integrator;
  double termination = sqrt(std::numeric_limits<double>::epsilon());
  
  auto log_K = pivot + log(integrator.integrate(
    [&](double t) {
      auto nut = nu * t;
      return exp(-pivot + nut + log1p(exp(-2.0 * nut)) - log2 
                 - x * cosh(t));
    }, termination, &error_estimate, &L1));

  std::cout << "error_estimate = " << error_estimate << std::endl 
            << "L1 = " << L1 << std::endl
            << "arg_max = " << arg_max << std::endl
            << "pivot = " << pivot << std::endl;
  return log_K;
}

@jacob-hjortlund

I don't have much to add; I just wanted to express my gratitude to you guys for working on this problem, as I've been having quite a bit of trouble tackling the asymptotic behavior of the modified Bessel function with nowhere near the knowledge you guys have. Thanks and keep up the great work @martinmodrak @bgoodri !

@martinmodrak
Contributor Author

martinmodrak commented Nov 30, 2019

I am not sure finding the maximum is important for offsetting the log_sum_exp; everywhere I look, the maximum is at most 1, usually much lower. However, I use the maximum to find the step size to integrate with. Luckily, for v*x > 17 you get log(cosh(v*x)) ~= abs(v*x) - log(2). With this substitution you can analytically find the arg max of the integrand as asinh(v/z).

For large v, the integrand decays very quickly past the maximum.
The behaviour at low v and z seems to be more tricky, and I have yet to figure out if the method would be useful there.
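To spell out the arg max claim above (a sketch, using the large-argument logcosh approximation just mentioned): the log-integrand is

```latex
g(t) \;=\; \log\cosh(vt) - z\cosh t \;\approx\; vt - \log 2 - z\cosh t ,
\qquad
g'(t) \;=\; v - z\sinh t \;=\; 0
\;\;\Longrightarrow\;\;
t^{*} = \operatorname{asinh}(v/z).
```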

Here's the code I used to create the latest plot:

template <typename T>
T logcosh(const T& x) {
  // Taken from the limma R package; maybe I should check it works as well
  // as I expect it to? Credited in limma to Gordon Smyth, 8 April 2007.
  using std::abs;
  const T x_ = abs(x);
  if (x_ < 1e-4) {
    return 0.5 * x * x;
  } else if (x_ < 17) {
    return log(cosh(x));
  } else {
    return x_ - std::log(2);
  }
}

template <typename T_v>
T_v trapezoid_cosh(const T_v &v, const double &z) {
  const double approximate_maximum = std::asinh(value_of(v) / z);
  const int max_steps = 500;
  const double h = approximate_maximum / (0.5 * max_steps);
  std::vector<T_v> terms;
  terms.reserve(max_steps);
  for (int n = 0; n < max_steps; n++) {
    const double x = n * h;
    terms.push_back(logcosh(v * x) - z * cosh(x));
    // TODO: create some sensible stopping criterion
  }
  return log_sum_exp(terms) + std::log(h);
}

Also, it seems like we are duplicating some work here. I keep almost-up-to-date code at https://github.com/martinmodrak/math/blob/feature/issue-1112-continuous-besselk/stan/math/rev/scal/fun/log_modified_bessel_second_kind_frac.hpp for you to check out. Maybe I should open a new PR / reopen this one, so that my commits are shown in context? Also, if you have the time, it might be sensible for you to try out your implementations with the tests I've developed...

Finally, I've had mostly unsatisfying experiences with the complex step: on the (two) occasions I used it, it produced almost exactly the same results as Stan's autodiff and was slower to compute. Do you have some specific reason to believe it would help here?

@jacob-hjortlund I don't think I have much knowledge. Before I started working on this, I didn't even know what Bessel functions are. I am learning it all as I go, and it hurts :-) Also, I wouldn't have gotten anywhere near where I am without the help of @bgoodri . Just wanted to let you know this before you assume my opinions have some large weight :-)

@martinmodrak
Contributor Author

I just tried to include your implementation using exp_sinh to integrate, and it runs into problems for values like v = nu = 37634, z = x = 4236, where the logarithm of the integrand swings between -4236 and +70535, so no amount of pivoting can help. However, using your implementation of logcosh(x) = x + log1p_exp(-2 * x) - log(2) makes the stupid trapezoid rule also resolve the derivatives wrt. v. And some of the problems can be ameliorated by increasing the resolution. With:

template <typename T>
T logcosh(const T& x) {
  // By bgoodri; stable for x >= 0, which is all we need here
  return x + log1p_exp(-2 * x) - log(2);
}

template <typename T_v>
T_v trapezoid_cosh(const T_v &v, const double &z) {
  const double approximate_maximum = std::asinh(value_of(v) / z);
  const int max_steps = 5000;
  const double h = approximate_maximum / (0.1 * max_steps);
  std::vector<T_v> terms;
  terms.reserve(max_steps);
  for (int n = 0; n < max_steps; n++) {
    const double x = n * h;
    terms.push_back(logcosh(v * x) - z * cosh(x));
    // TODO: create some sensible stopping criterion
  }
  return log_sum_exp(terms) + std::log(h);
}

I get:
[image: error plot]

That leaves Rothwell for small v and z as the biggest problem. However, with max_steps = 5000 this is heavily inefficient, and I need to find good ways to determine h and a stopping criterion.

@bgoodri
Contributor

bgoodri commented Nov 30, 2019

The trapezoid rule function in Boost

https://www.boost.org/doc/libs/1_71_0/libs/math/doc/html/math_toolkit/trapezoidal.html

has an adaptive stopping criterion but requires a finite upper bound to integrate to. I suppose we could use the largest floating-point number, but maybe there is a way to use something smaller.

@martinmodrak
Contributor Author

Thanks, I'll look into it. Note that I can't use the Boost implementation directly, because I need to keep the log_sum_exp logic. But I might be able to reimplement it. Also, sorry for bothering you on the weekend - you are in no way obliged to respond at this time :-)

@bgoodri
Contributor

bgoodri commented Nov 30, 2019

No worries; it is a holiday here. You can utilize the log-sum-exp logic without a fixed number of max steps, as I did in

https://github.com/stan-dev/math/blob/develop/stan/math/prim/scal/fun/log_modified_bessel_first_kind.hpp#L208

which in this case would be something like

template <typename T_v>
T_v trapezoid_cosh(const T_v &v, const double &z) {
  using stan::math::value_of;
  using stan::math::log_sum_exp;

  const double approximate_maximum = std::asinh(value_of(v) / z);
  const double h = approximate_maximum / (0.1 * 5000);
  T_v s = 0.0;
  T_v old_s = 1.0;
  int n = 0;
  while (s > old_s || old_s > s) {
    const double x = n * h;
    const T_v last_term = logcosh(v * x) - z * cosh(x);
    old_s = s;
    s = log_sum_exp(s, last_term);
    n++;
  }
  return s + std::log(h);
}

But it is not super-accurate yet. Now we need a better procedure to choose h.

@bgoodri
Contributor

bgoodri commented Nov 30, 2019

OK, I have some ideas, but need to eat before I attempt to implement the rest of them.

  1. I think it would be best if we find the actual arg_max of the integrand with Newton-Raphson but your approximation is probably a better starting point than the starting point I was using.
  2. The while loop will terminate quicker if we initialize s = logcosh(v * arg_max) - z * cosh(arg_max), then set n = 1, and then evaluate the while loop with both x = arg_max + n * h and x = arg_max - n * h. Although then you have to return s + std::log(0.5 * h).
  3. The h should be set based on the curvature of the integrand at arg_max or maybe we can set h adaptively based on the curvature at the last point we evaluated at.
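One possible way to make point 3 concrete (a sketch under the same large-argument logcosh approximation discussed earlier, not something from the PR): the curvature at the arg max gives a Laplace-style width for the integrand,

```latex
g(t) \approx vt - \log 2 - z\cosh t, \qquad
g''(t^{*}) \;=\; -z\cosh(t^{*}) \;=\; -\sqrt{v^{2}+z^{2}}
\quad \text{at } t^{*}=\operatorname{asinh}(v/z),
```

so the integrand has width roughly (v^2 + z^2)^{-1/4}, and one could try h = c * (v^2 + z^2)^{-1/4} for some small constant c.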

@bgoodri
Contributor

bgoodri commented Dec 3, 2019

I am still having trouble when the order is 37634 and the argument is 4236. WolframAlpha says it is "undefined"

https://www.wolframalpha.com/input/?i=Log%5BBesselK%5B37634%2C+4236.0%5D%5D

but arb says it is 70531.3573385249910109, which is fairly consistent with the asymptotic expansion for large order and fixed argument at

https://dlmf.nist.gov/10.41.E2

which evaluates to 0.5 * log(0.5 * pi / v) - v * (1 + log(z) - log(2 * v)), which is about 70650.37. With some more tricks I can get 70522.99, which I suppose is OK in a relative-error sense, but I think we're going to have to wait another week for Boost 1.72 to be released.

@bgoodri
Contributor

bgoodri commented Feb 18, 2020

I think we can get a pretty good version of log(besselK) with the new Boost. I just need to pull some things together.

@bbbales2
Member

@martinmodrak, @bgoodri just scrolling through old pulls and found this. Ping!

@bgoodri
Contributor

bgoodri commented Jul 7, 2021

@martinmodrak I recently read at https://functions.wolfram.com/Bessel-TypeFunctions/BesselK/17/02/01/ that

besselK(v, z) = z / (2 * v) * [besselK(v + 1, z) - besselK(v - 1, z)]

so we can avoid the numerically problematic case where v < 1/2 by computing

log(z) - log(2 * v) + log_diff_exp(log_besselK(v + 1, z), log_besselK(fabs(v - 1), z))

although that might incur too much cancellation if v is too close to zero. That looks like twice as many calls to achieve more numerical stability, but we have to compute log_besselK(v + 1, z) anyway in order to take the derivative wrt z.

Do you want to re-open this PR or open a similar one? I think numerically stable logic would be like

  1. If v is very close to zero, do an asymptotic approximation
  2. Otherwise, if v < 1/2, utilize the above identity to shift the orders to both be greater than 1/2
  3. If v is very large, log an asymptotic approximation (possibly based on a Laplace approximation)
  4. If z is very large, log an asymptotic approximation (possibly based on a Laplace approximation)
  5. If z is much smaller than sqrt(1 + v), log an asymptotic approximation
  6. Otherwise, log Rothwell's integral using the log_sum_exp logic to pull out the peak, also splitting Rothwell's integral into two parts at the mode in case all the area is in the neighborhood of the mode

There might be another case that we have to deal with when z and v are similar but I never understood what drove the numerical inaccuracy in that situation.

@bgoodri
Contributor

bgoodri commented Jul 13, 2021

Upon further review, I think that an approach based on the logarithm of

https://functions.wolfram.com/Bessel-TypeFunctions/BesselK/07/01/01/0006/MainEq1.L.gif

may work well. All the overflow / underflow stuff is attributable to the x^v in the denominator, which is well-handled by logarithms. The integral is oscillatory, but Boost has a customized function for integrals like that

https://www.boost.org/doc/libs/1_76_0/libs/math/doc/html/math_toolkit/fourier_integrals.html

that is said to have rapid convergence when the poles of the f function are not too close to the real axis. The underlying Ooura and Mori (1999) paper has a fast-converging example in equation 4.13 that is similar to our case in that the poles are at plus or minus the imaginary unit.

In other words, the main logic of a Stan Math implementation could look like this simple Rcpp one:

// [[Rcpp::plugins(cpp17)]]
// [[Rcpp::depends(BH)]]
#include <Rcpp.h>
#define BOOST_MATH_INSTRUMENT_OOURA
#include <boost/math/quadrature/ooura_fourier_integrals.hpp>
#include <cmath>
#include <iostream>

// [[Rcpp::export]]
double log_besselK(const double x, const double nu) {
  
  using std::pow;
  using std::log;
  using std::lgamma;
  
  auto integrator = boost::math::quadrature::ooura_fourier_cos<double>();  
  
  auto f = [&nu](double t) {
    return pow(t * t + 1, -nu - 0.5);
  };

  auto [result, relative_error] = integrator.integrate(f, x);
  std::cout << "Integral = " << result << 
    ", relative error estimate " << relative_error << std::endl;
  
  return nu * (log(2.0) - log(x)) - 0.5 * log(M_PI) + lgamma(nu + 0.5) 
         + log(result);
}

It does not seem to be very accurate for large x and small v, but we could deal with that case separately.

@bgoodri
Contributor

bgoodri commented Mar 6, 2022

Leaving this here

https://arxiv.org/abs/2201.00090

which sounds like a good approach too

@spinkney
Collaborator

There's this recent implementation: https://github.com/tk2lab/logbesselk

@spinkney
Collaborator

spinkney commented Jul 21, 2022

@bgoodri here's my attempt at coding up the algorithm in R, following the paper description:

https://gist.github.com/spinkney/56d8b716248dae8545e5049f4dec33b1
