cyl_bessel_j_zero does not respect policies::promote_double<false> #398

evanmiller · 2020-07-08T17:15:26Z

I'm auditing a code base to eliminate uses of long double on x86 (i.e. eliminate all x87 instructions). After compiling, I dump the disassembly and look for forbidden instructions and function calls.

I found that the following small test program - which I am led to believe by Boost documentation will not promote doubles to long doubles - includes calls to a number of long double functions when compiled.

#include <boost/math/special_functions/bessel.hpp>

int main() {
    typedef boost::math::policies::policy<
        boost::math::policies::promote_double<false>
        > numeric_policy;

    boost::math::cyl_bessel_j_zero((double)0.5, 1, numeric_policy());
}

A sampling of disassembly:

$  otool -t -V a.out | grep 'long double' | head -n 5
00000001000039a0	callq	__ZN5boost4math7lanczos19lanczos_initializerINS1_12lanczos17m64EeE4initC2Ev ## boost::math::lanczos::lanczos_initializer<boost::math::lanczos::lanczos17m64, long double>::init::init()
000000010000db3e	callq	__ZL4fabse ## fabs(long double)
000000010000db67	callq	__ZL3cose ## cos(long double)
000000010000db98	callq	__ZL5floore ## floor(long double)
000000010000dc5d	callq	__ZL3sine ## sin(long double)

Is this expected behavior, or a bug?

I realize that I can disable x87 with a compiler flag - but my compiler is actually crashing when I try that. So I am hoping to resolve the issue within Boost.

The text was updated successfully, but these errors were encountered:

NAThompson · 2020-07-08T17:24:54Z

The Lanczos one I recognize: It's required. There is an ill-conditioned filter computation that really does need that extra precision; see here. Looks like its also responsible for the call to abs.

evanmiller · 2020-07-08T17:35:27Z

I am porting code to a platform without native 80-bit or 128-bit (i.e. 'long double' is equivalent to 'double').

Will the Lanczos code fail to converge on such a platform?

NAThompson · 2020-07-08T17:40:01Z

@evanmiller : It won't fail to converge, the filter coefficients will just not be as accurate. For digital filters, we generally want half-ulp accuracy precomputed once, under the assumption that applying the filter to data will dominate the execution time.

evanmiller · 2020-07-08T17:51:58Z

@NAThompson Is it reasonable to ask that this coefficient computation respect the promote_double policy? It is important for my application that the same result is returned on all platforms, even if that result is not as accurate as it could be.

NAThompson · 2020-07-08T17:55:50Z

There's almost zero hope you'll get the same result on all platforms, even if it does respect promote_double; see (for one of many reasons) here.

The semantics of promote_double are somewhat different than what is being done in the Lanczos filters. The promote_double gives the option for a user to obtain half-ulp accuracy in a computation, rather than being limited to a weaker accuracy governed by the condition number of function evaluation.

In the Lanczos filters, we're using argument promotion just to achieve accuracy, because the filter computation is very ill-conditioned.

evanmiller · 2020-07-08T18:22:27Z

There's almost zero hope you'll get the same result on all platforms, even if it does respect promote_double; see (for one of many reasons) here.

Well, I don't need bit-for-bit compatibility. But if there are indeed significant accuracy issues introduced without 80-bit long double, I would like to have control over it for the sake of general cross-platform consistency.

It seems to me that because promote_double defaults to true, a user who sets false is indicating a desire not to use long doubles internally.

In any event, if the current algorithm only guarantees accuracy when 80-bit registers are present, then it appears I will need to find another algorithm.

NAThompson · 2020-07-08T18:28:27Z

In any event, if the current algorithm only guarantees accuracy when 80-bit registers are present, then it appears I will need to find another algorithm.

I did a very deep lit search for this algorithm, and I couldn't find another one that had the same performance and capabilities. So if you do go find another where you can get the filter coeffs without an ill-conditioned step, I'd be very happy about it.

evanmiller · 2020-07-08T18:40:01Z

I did a very deep lit search for this algorithm, and I couldn't find another one that had the same performance and capabilities. So if you do go find another where you can get the filter coeffs without an ill-conditioned step, I'd be very happy about it.

Well, I'm just looking for a pure 64-bit Bessel algorithm - I'm not trying to reinvent the Lanczos wheel. It could be something as simple as replacing Boost's lgamma with the system's lgamma.

I will need to investigate more to see exactly where the long double dependencies are for the functions that I'm calling.

NAThompson · 2020-07-08T18:41:46Z

@evanmiller : Actually I think we're talking about a different Lanczos algorithm. I'm pretty sure there are no filter coefficients needed there. Probably @jzmaddock will know what the Bessel function is doing.

jzmaddock · 2020-07-08T18:43:22Z

This does look like a bug to me, @NAThompson the code in question is my old lanczos gamma function approximation, and it should be using a double precision approximation in this case IMO. @evanmiller can you get a call stack for where the long double calls occur?

evanmiller · 2020-07-08T19:12:49Z

@jzmaddock I don't have a stack trace handy; I'm just disassembling a static binary right now.

I think I hit some Lanczos code through this code path earlier today though:

https://github.com/boostorg/math/blob/develop/include/boost/math/special_functions/detail/bessel_ik.hpp#L63

I can post suspicious disassembly here as I come across it. It looks like there is a similar issue affecting the beta function.

evanmiller · 2020-07-08T19:32:53Z

A long double of cos_pi is being compiled into the original test program. It looks like there are a number of places where cos_pi is not being passed a policy:

$ grep -rn cos_pi .
./special_functions/detail/bessel_jy.hpp:409:            T cv = cos_pi(mod_v);
./special_functions/detail/bessel_jy_derivatives_series.hpp:196:      prefix = boost::math::tgamma(-v, pol) * boost::math::cos_pi(v) * p / boost::math::constants::pi<T>();
./special_functions/detail/bessel_jy_derivatives_asym.hpp:78:   const T ci = cos_pi(vd2shifted);
./special_functions/detail/bessel_jy_derivatives_asym.hpp:111:   const T ci = cos_pi(vd2shifted);
./special_functions/detail/bessel_jy_asym.hpp:83:   T ci = cos_pi(v / 2 + 0.25f);
./special_functions/detail/bessel_jy_asym.hpp:115:   T ci = cos_pi(v / 2 + 0.25f);
./special_functions/detail/bessel_jy_series.hpp:193:      prefix = boost::math::tgamma(-v, pol) * boost::math::cos_pi(v) * p / constants::pi<T>();

evanmiller · 2020-07-09T12:00:44Z

cbrt is another source of inadvertent double promotion.

I'm making good progress eliminating the long doubles by passing policies around in more places - will open a PR later today.

evanmiller · 2020-07-09T12:22:33Z

The factorial functions are another source of double promotion:

math/include/boost/math/special_functions/detail/unchecked_factorial.hpp

Lines 489 to 500 in 6bb0e8d

    
           template <> 
        
           inline BOOST_MATH_CONSTEXPR_TABLE_FUNCTION double unchecked_factorial<double>(unsigned i BOOST_MATH_APPEND_EXPLICIT_TEMPLATE_TYPE_SPEC(double)) 
        
           { 
        
              return static_cast<double>(boost::math::unchecked_factorial<long double>(i)); 
        
           } 
        
           template <> 
        
           struct max_factorial<double> 
        
           { 
        
              BOOST_STATIC_CONSTANT(unsigned, 
        
                 value = ::boost::math::max_factorial<long double>::value); 
        
           };

evanmiller · 2020-07-09T14:47:47Z

A small demonstration program is actually just:

#include <boost/math/special_functions/beta.hpp>

int main() { }

Based on the disassembly, it looks like an 80-bit Lanczos approximator is getting force-instantiated before any user code is called at all:

___cxx_global_var_init:
0000000100002dd0        pushq   %rbp
0000000100002dd1        movq    %rsp, %rbp
0000000100002dd4        movq    0x1225(%rip), %rax ## literal pool symbol address: guard variable for boost::math::lanczos::lanczos_initializer<boost::math::lanczos::lanczos17m64, long double>::initializer
0000000100002ddb        cmpb    $0x0, (%rax)
0000000100002dde        jne     0x100002dfe
0000000100002de4        movq    0x123d(%rip), %rdi ## literal pool symbol address: boost::math::lanczos::lanczos_initializer<boost::math::lanczos::lanczos17m64, long double>::initializer
0000000100002deb        callq   boost::math::lanczos::lanczos_initializer<boost::math::lanczos::lanczos17m64, long double>::init::init() ## boost::math::lanczos::lanczos_initializer<boost::math::lanczos::lanczos17m64, long double>::init::init()
0000000100002df0        movq    0x1209(%rip), %rax ## literal pool symbol address: guard variable for boost::math::lanczos::lanczos_initializer<boost::math::lanczos::lanczos17m64, long double>::initializer
0000000100002df7        movq    $0x1, (%rax)
0000000100002dfe        popq    %rbp
0000000100002dff        retq

So it's possible that this Lanczos code is never actually hit in my test program.

jzmaddock · 2020-07-09T18:10:48Z

I see this, here's a test program for you (requires current develop):


#include <type_traits>
#include <cmath>

#define BOOST_MATH_ASSERT_UNDEFINED_POLICY false
//
// Poison the long double std math functions so we can find accidental usage of these
// when the user has requested that we do *not* use them.
//
// THIS IS NOT STD CONFORMING!!!  Is there a better way?
//
namespace std {
      long double fabs(long double, void* = 0);
      long double abs(long double, void* = 0);
      long double sin(long double, void* = 0);
      long double cos(long double, void* = 0);
      long double tan(long double, void* = 0);
      long double asin(long double, void* = 0);
      long double acos(long double, void* = 0);
      long double atan(long double, void* = 0);
      long double exp(long double, void* = 0);
      long double log(long double, void* = 0);
      long double pow(long double, long double, void* = 0);
}
#define TEST_GROUP_5
#define TEST_GROUP_6
#include "compile_test/instantiate.hpp"

int main()
{
   //boost::math::foo(0.0L);
   instantiate(0.0);
}

The instantiation call stack gives the source of the error(s), the first of which is caused by:

template <>
inline float binomial_coefficient<float, policies::policy<> >(unsigned n, unsigned k, const policies::policy<>& pol)
{
   return policies::checked_narrowing_cast<float, policies::policy<> >(binomial_coefficient<double>(n, k, pol), "boost::math::binomial_coefficient<%1%>(unsigned,unsigned)");
}

Which should be using promote_double<false> in it's forwarding-policy.

Many thanks for looking into this!!

evanmiller · 2020-07-09T18:34:50Z

Clever solution! Just so you know: frexp is another function popular with the long-double crowd - at least according to the disassembly I was examining.

A complete audit is outside the scope of what I'm doing right now, but I'll send along pull requests for any more of those double-long weeds that I come across. The first batch of proposed changes is in #399.

evanmiller · 2020-07-10T04:45:40Z

Changing binomial_coefficient gets rid of the 80-bit Lanczos, hooray! Here is my altered version... is this what you had in mind?

template <>
inline float binomial_coefficient<float, policies::policy<> >(unsigned n, unsigned k, const policies::policy<>& pol)
{
    typedef policies::normalise<
        policies::policy<>,
        policies::promote_float<true>,
        policies::promote_double<false>,
        policies::discrete_quantile<>,
        policies::assert_undefined<> >::type forwarding_policy;
   return policies::checked_narrowing_cast<float, forwarding_policy>(binomial_coefficient<double>(n, k, forwarding_policy()), "boost::math::binomial_coefficient<%1%>(unsigned,unsigned)");
}

If so, I will add it to my pull request.

jzmaddock · 2020-07-10T08:19:11Z

is this what you had in mind?

Yes exactly so!

Once your PR is in I'll try and do a thorough trawl for other issues.

evanmiller · 2020-07-10T11:21:58Z

Ok! I've committed the binomial_coefficient change. Once Travis passes, the PR will be ready for review.

The last function affecting my own code base is unchecked_factorial, as mentioned above. I tried rewriting it to be more generic (i.e. using a single type-independent array), but my template-fu was found wanting. So if your trawl could start there, I would appreciate it!

evanmiller · 2020-07-13T19:41:55Z

With the PR merged, I am now down to two sources of long double with this cyl_bessel_j_zero function:

unchecked_factorial, as mentioned
make_big_value, as in (lots of these):

double const boost::math::tools::make_big_value<double>(long double, char const*, boost::integral_constant<bool, true> const&, boost::integral_constant<bool, false> const&)

I can't yet tell if I can expect identical behavior on 64-bit and 80-bit platforms. For example, make_big_value looks like it gets pulled in from bessel_j0, e.g.

math/include/boost/math/special_functions/detail/bessel_j0.hpp

Lines 67 to 75 in d4d0297

    
           static const T P1[] = { 
        
                static_cast<T>(BOOST_MATH_BIG_CONSTANT(T, 64, -4.1298668500990866786e+11)), 
        
                static_cast<T>(BOOST_MATH_BIG_CONSTANT(T, 64, 2.7282507878605942706e+10)), 
        
                static_cast<T>(BOOST_MATH_BIG_CONSTANT(T, 64, -6.2140700423540120665e+08)), 
        
                static_cast<T>(BOOST_MATH_BIG_CONSTANT(T, 64, 6.6302997904833794242e+06)), 
        
                static_cast<T>(BOOST_MATH_BIG_CONSTANT(T, 64, -3.6629814655107086448e+04)), 
        
                static_cast<T>(BOOST_MATH_BIG_CONSTANT(T, 64, 1.0344222815443188943e+02)), 
        
                static_cast<T>(BOOST_MATH_BIG_CONSTANT(T, 64, -1.2117036164593528341e-01)) 
        
           };

I think that because a double is being returned on both platforms the behavior will be identical, but I am not 100% certain.

evanmiller · 2020-07-13T19:56:35Z

This seems to eliminate the long doubles in make_big_value when compiling with -mlong-double-64:

diff --git a/include/boost/math/tools/big_constant.hpp b/include/boost/math/tools/big_constant.hpp
index aad7ea133..727f44790 100644
--- a/include/boost/math/tools/big_constant.hpp
+++ b/include/boost/math/tools/big_constant.hpp
@@ -33,9 +33,12 @@ struct numeric_traits<__float128>
    static const int max_exponent = 16384;
    static const bool is_specialized = true;
 };
-#else
+#elif LDBL_DIG > DBL_DIG
 typedef long double largest_float;
 #define BOOST_MATH_LARGEST_FLOAT_C(x) x##L
+#else
+typedef double largest_float;
+#define BOOST_MATH_LARGEST_FLOAT_C(x) x
 #endif

 template <class T>

Before this change, any function calling unchecked_factorial<double> would generate long double code. This created a headache for users attempting to eliminate long doubles from their compiled code. A more elegant solution might use templates to create a single lookup array for all major floating-point types, but the present solution (adding an explicit double array) works just fine. See #398.

evanmiller mentioned this issue Jul 9, 2020

Reduce unwanted double promotions #399

Merged

evanmiller mentioned this issue Jul 15, 2020

Remove more long doubles #402

Merged

jzmaddock mentioned this issue Jul 20, 2020

Accumulated fixes for Policies which forbid long double usage. #405

Merged

jzmaddock closed this as completed in #405 Jul 31, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cyl_bessel_j_zero does not respect policies::promote_double<false> #398

cyl_bessel_j_zero does not respect policies::promote_double<false> #398

evanmiller commented Jul 8, 2020

NAThompson commented Jul 8, 2020 •

edited

Loading

evanmiller commented Jul 8, 2020

NAThompson commented Jul 8, 2020

evanmiller commented Jul 8, 2020

NAThompson commented Jul 8, 2020 •

edited

Loading

evanmiller commented Jul 8, 2020

NAThompson commented Jul 8, 2020 •

edited

Loading

evanmiller commented Jul 8, 2020

NAThompson commented Jul 8, 2020

jzmaddock commented Jul 8, 2020

evanmiller commented Jul 8, 2020

evanmiller commented Jul 8, 2020

evanmiller commented Jul 9, 2020

evanmiller commented Jul 9, 2020

evanmiller commented Jul 9, 2020

jzmaddock commented Jul 9, 2020

evanmiller commented Jul 9, 2020

evanmiller commented Jul 10, 2020

jzmaddock commented Jul 10, 2020

evanmiller commented Jul 10, 2020

evanmiller commented Jul 13, 2020

evanmiller commented Jul 13, 2020

cyl_bessel_j_zero does not respect policies::promote_double<false> #398

cyl_bessel_j_zero does not respect policies::promote_double<false> #398

Comments

evanmiller commented Jul 8, 2020

NAThompson commented Jul 8, 2020 • edited Loading

evanmiller commented Jul 8, 2020

NAThompson commented Jul 8, 2020

evanmiller commented Jul 8, 2020

NAThompson commented Jul 8, 2020 • edited Loading

evanmiller commented Jul 8, 2020

NAThompson commented Jul 8, 2020 • edited Loading

evanmiller commented Jul 8, 2020

NAThompson commented Jul 8, 2020

jzmaddock commented Jul 8, 2020

evanmiller commented Jul 8, 2020

evanmiller commented Jul 8, 2020

evanmiller commented Jul 9, 2020

evanmiller commented Jul 9, 2020

evanmiller commented Jul 9, 2020

jzmaddock commented Jul 9, 2020

evanmiller commented Jul 9, 2020

evanmiller commented Jul 10, 2020

jzmaddock commented Jul 10, 2020

evanmiller commented Jul 10, 2020

evanmiller commented Jul 13, 2020

evanmiller commented Jul 13, 2020

NAThompson commented Jul 8, 2020 •

edited

Loading

NAThompson commented Jul 8, 2020 •

edited

Loading

NAThompson commented Jul 8, 2020 •

edited

Loading