
Feature request: quad-double precision. #184

Open
cosurgi opened this issue Jan 17, 2020 · 58 comments

@cosurgi

cosurgi commented Jan 17, 2020

Hi,

As suggested in math #303, I repost this here:

How feasible would it be to integrate quad-double precision? It has 212 bits in the significand and is faster than MPFR:

I have just recently integrated Boost.Multiprecision with YADE, a fairly large numerical computation package.

Float128 isn't enough for my needs; YADE would greatly benefit from quad-double (as it is much faster than MPFR). YADE can also serve as a really large testing ground. I have implemented CGAL numerical traits and Eigen numerical traits to use multiprecision, and all of them are tested daily in our pipeline; see an example run here, or a more specific example for float128: test (need to scroll up) and check.

@cosurgi
Author

cosurgi commented Jan 17, 2020

@ckormanyos

How feasible would it be to integrate quad-double precision...

This is a good question. I would combine that with software support for float128_t for those platforms lacking quadmath. In fact, I would see full cross-platform availability of a float128_t (be it hardware backend or software pure) as a prerequisite before going for a float256_t.

This definitely would be great. Especially if float128 will use compiler/assembler version when available.

Then there is the topic of design questions.

Extract float128_t/256_t from the same template?
Support other digit splits, etc.
Lots of questions.

Yeah, it definitely depends on the approach. The simplest one is of course to use qd-2.3.22.tar.gz verbatim and only add the necessary multiprecision wrapper. A much more interesting approach is to extract the design used to "quadruple" the precision in there, in such a way that quad-float128 would simply be another template of the same design. 512 bits is exactly what I will need in my calculations.

@ckormanyos
Member

ckormanyos commented Jan 17, 2020

A much more interesting approach is to extract the design used to "quadruple" the precision in there, in such a way that quad-float128 would simply be another template of the same design

I was thinking more of the binary128 and binary256 formats that are mentioned in IEEE754:2008. But these would be (at least in their portable and generic form) software-intensive and thus lose the hardware acceleration that is the very thing you seek. So that is not exactly what the original request is about.

The simplest one is of course to use qd-2.3.22.tar.gz verbatim and only add the necessary multiprecision wrapper...

This might unfortunately have licensing inconsistencies, in the sense that using the code directly with BSL might be problematic.

I think that binary128/256 is an idea we could talk about. But as I mentioned previously, I think there would be several important design considerations. Another idea would be to create a set of popular backends intended to wrap a given type for the "number" template in Boost.Multiprecision. That might be a way to navigate through the licensing differences. I mean, you could even wrap the old FORTRAN77 REAL*16 that way.

I think I'd like to continue this discussion and see what possibilities there are.

Kind regards, Chris

@cosurgi
Author

cosurgi commented Jan 17, 2020

Since binary128/256 would be software-intensive, I would prefer another approach. When hardware support appears in future CPUs, adding such binary support should be fairly easy. As you said, it is not the subject of this thread.

a set of popular backends intended to wrap a given type for the "number" template in Boost.Multiprecision

This solution is much more interesting for me. I think that you have already done that with __float128 and mpfr:: (the file /usr/include/mpreal.h from the debian package libmpfrc++-dev), am I correct?

Also the idea to use the "quadrupling" method described in that paper is a worthwhile direction for me. Even if it means completely reimplementing this idea in boost from scratch. This will take a lot of time though.

So I could summarise the following:

  1. backend intended to wrap quad-double. The user has to install and configure qd-2.3.22.tar.gz by himself, but then boost would immediately "understand" and wrap it.
  2. "quadrupling" precision approach, a pure boost-only solution.

@ckormanyos
Member

ckormanyos commented Jan 18, 2020

a set of popular backends intended to wrap a given type for the "number" template in Boost.Multiprecision

  1. backend intended to wrap quad-double. The user has to install and configure qd-2.3.22.tar.gz by himself, but then boost would immediately "understand" and wrap it.

Correct. There are already Boost.Multiprecision "float" backend wrappers for __float128/quadmath, GMP/MPIR, MPFR and an "int" wrapper for tommath, and these require that the user environment is able to find and link to the required target-specific lib.

I'm not sure what it means for an existing UDT number library to be important/cool/special enough to be wrapped with a supported backend in Boost? I'm also a bit hesitant to directly use DD/QD prototypes when looking at the licensing of the original work. Nonetheless, DD and QD have been around for a while. In fact, I used them heavily about 10-12 years ago. Back then there was a different DD package with 106 binary digits. Maybe it was the predecessor of the work we are mentioning in this post.

Even though I'm not particularly fond of the "FP hack" required for x86 architecture in these packages, and I'm also not sure exactly how portable DD and QD are, these libs are quite useful.

This idea seems feasible to me.

Kind regards, Chris

@pabristow
Contributor

The downside of double-double (and I presume of quad-double) has been shown to be some uncertainty in estimating epsilon, and consequent difficulties in estimating precision and uncertainties.

The Apple/Darwin platforms all show 'different' behaviour in many Boost.Math tests.

So the 106 or 212 significand bits may be over-optimistic. If you value speed more, then ...

@ckormanyos
Member

The downside of double-double (and I presume of quad-double) has been shown to be some uncertainty in estimating epsilon, and consequent difficulties in estimating precision and uncertainties.

Indeed. Interestingly enough, two of the greater challenges in potentially wrapping DD/QD and higher orders would be getting the types to behave like proper C++ types and figuring out how to mix the function name calls within BSL headers.

These are the two challenges I'm facing in another (unpublished) project that wraps the high precision type of a popular computer algebra system for Boost.Multiprecision.

Still, I think the DD/QD topic might be interesting to pursue in a feasibility study. Perhaps some of the epsilon inconsistencies could be addressed if the concepts of unity, comparison, frexp, ldexp are established more clearly in small add-on functionalities beyond the raw original implementation.

Kind regards, Chris

@jzmaddock
Collaborator

This is a good idea, and in principle easy enough.

As mentioned above already, epsilon is likely to be the main sticking point; there are two possible definitions:

  • A strict interpretation of the standard is to pick the smallest number which can be added to 1 and yield a different result - for a double-double this is numeric_limits<double>::min() or even denorm_min(). This is the definition that gcc on Apple used, and it's pretty useless for any real work.
  • A loose definition would be 2^(1-B) if we have B bits of precision (53*4 in this case). This is more useful generally, but can still lead to nasty surprises. (A small sketch contrasting the two definitions follows below.)
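For concreteness, here is a minimal sketch contrasting the two candidate epsilons for a double-double built from two IEEE doubles; the variable names are illustrative only, not from any backend:

#include <cmath>
#include <iostream>
#include <limits>

int main()
{
   // Nominal precision of a double-double: B = 2 * 53 = 106 bits.
   const int B = 2 * std::numeric_limits<double>::digits;

   // Loose definition: 2^(1 - B), by analogy with the built-in types.
   const double loose_eps = std::ldexp(1.0, 1 - B); // about 2.5e-32

   // Strict definition: the smallest value that, added to 1, gives a
   // different double-double. Since the low component can be arbitrarily
   // small relative to the high one, this collapses toward denorm_min().
   const double strict_eps = std::numeric_limits<double>::denorm_min();

   std::cout << loose_eps << '\n' << strict_eps << '\n';
}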

BTW some time back I tried pretty hard to get Boost.Math working with Apple's "double double" and just couldn't get everything working - so much of the reasoning about what happens to a number under operation X breaks for these types that it's a bit of a losing battle. That said, we have since added cpp_dec_float and mpf_float, both of which have hidden guard digits and seem to be working OK. In principle, these types aren't that different, I hope.

@ckormanyos
Member

How feasible would it be to integrate quad-double precision? It has 212 bits in the significand and is faster than MPFR:

This is a good idea, and in principle easy enough.

Thanks John, for your experienced advice. That's what I was seeking in terms of a kind of nod.

If there are no other blocking points, then I will write this up ASAP as a feasibility study for Boost GSoC 2020. The timing of this user-raised issue is, I believe, perfect for that potential venue.

Kind regards, Chris

@NAThompson
Contributor

I'm curious about @cosurgi's use case. I feel like the current Boost.Multiprecision covers the intended use well as it stands, with only maybe a factor of 2 available for performance increase. But if you're using quad double, well, performance can't be a huge issue anyway.

@cosurgi
Author

cosurgi commented Jan 18, 2020

@NAThompson even a two-times performance difference means waiting one month versus two months for a result :)

My use case will be quantum mechanics and quantum chromodynamics. Currently I am preparing the yade source code for all of this. The experimental measurements are in some cases more precise than 20 or even 30 significant digits, so performing high-precision calculations is a big deal here. Since I plan to calculate volumes significantly larger than the Planck length, I have estimated that my target precision is about 150 digits, which is 512 bits. And it has to be as fast as possible. I know it will take years to achieve, but that's the goal.

My current strategy to achieve this is to make sure that yade is 100% compatible with boost multiprecision, so that when new, faster and more precise types appear I will be able to instantly benefit from them. And hope that someday CPUs with a native 512-bit floating-point type will appear :)

@cosurgi
Author

cosurgi commented Jan 18, 2020

BTW: MPFR is about 10 times slower. The difference between waiting one month and ten months for a result is really meaningful.

@ckormanyos
Member

MPFR is about 10 times slower. The difference between waiting one month and ten months for a result...

So let's prefer the 1 month.

@ckormanyos
Member

ckormanyos commented Jan 19, 2020

This is a good idea, and in principle easy enough.
As mentioned above already, epsilon is likely to be the main sticking point, ...

Yes.

BTW some time back I tried pretty hard to get Boost.Math working with apples "double double" and just couldn't get everything working

This comment rang a bell in my mind. So I mounted a drive from 2007. As it turns out, I developed my own version of q_float in March 2005. I had completely forgotten about this. It appears as though my original work used add/sub/mul/div routines originally from a developer known as K. Briggs in doubledouble. But these had subsequently been modified in a file called quad_float, which I had discovered in a package then called WinNTL-5_3_2, and which appears to live on to the present day in newer versions.

I got remarkably far with this code, and am somewhat surprised that I had forgotten about it. If we pursue this topic, I will be sharing my work in a smaller e-mail circle.

In that work, it looks like I simply empirically selected 31 for digits10 and calculated base-2 digits from that. Using this, I selected epsilon to simply be 10^-30. But I had curiously failed to implement any number for max_digits10.

Best regards, Chris

@jzmaddock
Collaborator

Interesting comments and use cases - thanks for this.

There's one other potential blocker that springs to mind: the special functions are kind of tricky to implement for types with large bit counts, but small exponent ranges. quad_float is perhaps the archetype here :(

There's an open bug for tgamma on this: #150, but it wouldn't surprise me if there aren't others lurking once you go hunting.

Question: can double_double be doubled? And doubled again etc? In other words could these be implemented via nested templates?

@ckormanyos
Member

ckormanyos commented Jan 19, 2020

The simplest one is of course to use qd-2.3.22.tar.gz verbatim and only add the necessary multiprecision wrapper.

One thing I mentioned about this package is licensing of the code.

I believe I am getting a handle on this interpretation, though I am not yet fully sure... But I also notice that the x86-FPU-fix encloses, at the top-level call mechanism, all functions using the package. This means, as is confirmed in the DD/QD documentation, that the code will have a hard time being used in multithreading on x86. I would prefer a package that sets and resets the x86-FPU-fix within the individual subroutines that need it, thereby allowing for a synchronization mechanism when attempting multithreading.

So at present, I am not aware of an existing kind of DD or QD that would fully satisfy my personal requirements. That being said, one could focus on a popular one. But there seem to be several popular ones, dating all the way back to the doubledouble that I mentioned above.

Best regards, Chris

@ckormanyos
Member

ckormanyos commented Jan 19, 2020

Question: can double_double be doubled? And doubled again etc? In other words could these be implemented via nested templates?

You read my mind. I think the answers are yes and yes. But I am also wary of such nested templates, having once gotten poor results with an integer type using something like this. But that experience was more than 20 years ago; I was less experienced at programming C++ back then, and compilers have gotten much (massively much) better at handling nested templates since then.
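For illustration, a minimal sketch of the nested-template shape under discussion. This is purely illustrative; note in particular that the QD library's four-component representation carries a renormalization invariant across all four doubles, which a naive pairing of pairs does not reproduce:

// Each level pairs two values of the level below: hi carries the
// leading bits, lo the trailing roundoff.
template <typename T>
struct doubled
{
   T hi { }; // leading component
   T lo { }; // trailing error term
};

using double_double_t = doubled<double>;          // ~106-bit significand
using quad_double_t   = doubled<double_double_t>; // ~212-bit significand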

@jzmaddock
Collaborator

But I also notice that the x86-FPU-fix encloses, at the top-level call mechanism, all functions using the package

Wouldn't that only be required for code executing on the old x87 co-processor, and not x64 code using the SSE registers?

@ckormanyos
Member

But I also notice that the x86-FPU-fix encloses, at the top-level call mechanism, all functions using the package

Wouldn't that only be required for code executing on the old x87 co-processor, and not x64 code using the SSE registers?

Yes. I have also just sent you some work in another thread. In that work, the x86-FPU-fix is needed for that legacy FPU and only for that legacy FPU.

@ckormanyos
Member

write this up ASAP as a feasibility study for Boost GSoC 2020. The timing of this user-raised issue is, I believe, perfect for that potential venue.

This has been done; see near the bottom of this page:
Boost.Multiprecision: Double-Double and Quad-Double Types.

@jzmaddock
Collaborator

@cosurgi : There's a very tentative version in the draft PR referenced above. I would welcome feedback if you'd like to give it a try and see how far you can get.

There are some issues, and some possibly controversial design choices:

  • I've fixed up add/subtract/divide to handle infinities correctly. However, for the ultimate performance, this is perhaps not such a good idea?
  • The one and only very basic test case has a number of failures. String IO is horribly broken: printing out the value 64 does not result in "64", for example. This is a direct consequence of some of the arithmetic operators - for example, 2^10 does not yield exactly 1024 :(
  • I've done no testing of special functions or Boost.Math support generally.
  • Many more tests are required, although there are likely to be so many failures from our existing boilerplated tests that it's hard to know quite how to proceed. Iostream round tripping isn't even worth trying - it will never work correctly IMO.

Anyhow, if you have any real world code you can plug this into, it would be interesting to hear how you get on. It's all a bit too "fast and loose" for me!

@cosurgi
Author

cosurgi commented Feb 3, 2020

@jzmaddock I'm sorry for the late reply. This looks very interesting. Yes, I have real-world code: YADE, with a full-blown testing suite for math functions. I will try to use it this week or next.

Just to be sure - I should take this commit?

The issues:

  • YADE heavily depends on correct handling of NaN and Inf. Without them CGAL does not work at all.
  • Ouch: string IO roundtripping is very important for me.
  • I have tests of all math functions; they are tested against Python mpmath. I picked the tolerances by experimenting with the results. That's why string IO is important - strings are the only way to get high-precision numbers into Python.
  • Ouch again. Maybe we could craft something here? Even using cpp_bin_float as a helper type is acceptable for me. String IO was never supposed to be fast; it just needs to be accurate.

See some example math test results for MPFR 150 - scroll up to see the math functions being tested by the earlier-mentioned script.

@cosurgi
Author

cosurgi commented Feb 3, 2020

Oh, btw if 2^10 is 1023.99999999(9)..... then it's acceptable for me, because the error will be smaller than the tolerance level. We only need working round tripping.

@jzmaddock
Collaborator

Oh, btw if 2^10 is 1023.99999999(9)..... then it's acceptable for me, because the error will be smaller than the tolerance level. We only need working round tripping.

This is actually the reason that round tripping doesn't work - I mean it's approximately correct (to within an epsilon or so), but without a correctly functioning pow() we basically can't implement correct string output. Actually, even the basic arithmetic operations aren't correctly rounded either... As long as string IO is approximately correct (to within an epsilon or so - i.e. drop one digit and it would round to the "correct" result), wouldn't your tests still function OK?
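To make the failure mode concrete, here is a hedged sketch of the round-trip check in question, generic over any numeric type T. A correctly rounded type must survive this test when printed with max_digits10 digits; it is exactly this guarantee that the draft types cannot yet provide:

#include <iomanip>
#include <limits>
#include <sstream>

// Serialize x with max_digits10 significant digits and read it back.
// Correctly rounded types round-trip exactly; the draft double/quad
// types can come back one ulp off, as discussed above.
template <typename T>
bool round_trips(const T& x)
{
   std::stringstream ss;
   ss << std::setprecision(std::numeric_limits<T>::max_digits10) << x;

   T y { };
   ss >> y;

   return y == x;
}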

@cosurgi
Author

cosurgi commented Feb 3, 2020

wouldn't your tests still function OK?

heh. I suppose that all math tests would work. But the round tripping test would fail ;) I suppose I could implement an exception for this one.

@jzmaddock
Collaborator

Meanwhile.... there are a few issues with the current version - mostly order-of-initialization issues in numeric_limits<qd_real>. Give me a day or so and I'll push a fix, as they prevent tgamma and others from working.

@jzmaddock
Collaborator

Never mind, fix pushed. The C99 functions including tgamma etc. are now working. Note that they are not quite standard-conforming, in that they don't adhere to the non-normative Annex F, which deals with the handling of infinities and NaNs.

@cosurgi
Author

cosurgi commented Feb 16, 2020

BTW: after reading the paper and the source code, I think that extracting the "quadding" method from this paper into a template is possible. The requirement is that the underlying type satisfies IEEE correct rounding of the last bit. I will probably need quad-float128 sometime in the future. We could get back to that later.

EDIT: the explanation for this requirement is very simple: if the last bit is wrong (incorrectly rounded), then the next element in the quad - although correct by itself - is completely useless, because the single bit preceding it is wrong.
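For context, the primitive underlying this requirement is Knuth's TwoSum, an error-free transformation whose exactness holds only under IEEE round-to-nearest; a minimal sketch:

// Knuth's TwoSum: produces s and e such that s + e == a + b exactly.
// The identity holds only when each operation below is correctly
// rounded to nearest - hence the IEEE rounding requirement above.
inline void two_sum(double a, double b, double& s, double& e)
{
   s = a + b;

   const double bb = s - a;

   e = (a - (s - bb)) + (b - bb);
}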

@cosurgi
Author

cosurgi commented Mar 31, 2020

Just a quick heads up. I see that you have made a lot of progress here; should I try testing it again with your latest changes?

@jzmaddock
Collaborator

It's way better than it was - especially the library support - there's still one test failing that's not been investigated though....

@cosurgi
Author

cosurgi commented Mar 31, 2020

Which one?

@kshegunov

Question: can double_double be doubled? And doubled again etc? In other words could these be implemented via nested templates?

It should be possible in principle. I've done some preliminary work that I needed for my own purposes here, mostly based on a paper by Knuth, I believe, on error-free transformations, but it's utterly unfinished as I had other things to do. Still, the problems are significant: to have the DD type behave like a real C++ primitive, a lot of high-precision constants are necessary, e.g. coefficients for the elementary functions' approximation polynomials, for normalization of the function arguments, etc.
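(For readers following along: the multiplication analogue of those error-free transformations is worth showing too. With a fused multiply-add, the roundoff of a product is obtained exactly. A generic sketch, not code from the linked work:)

#include <cmath>

// TwoProd via FMA: produces p and e such that p + e == a * b exactly,
// because fma(a, b, -p) evaluates a * b - p with a single rounding.
inline void two_prod(double a, double b, double& p, double& e)
{
   p = a * b;
   e = std::fma(a, b, -p);
}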

@ckormanyos
Member

Restarting (with my offer to make potential progress as mentor on this) for Boost Google Summer of Code 2021

@cosurgi
Author

cosurgi commented Feb 23, 2021

Very nice, thank you! I will be watching closely, ready to integrate into YADE. In principle it shouldn't be much more than just changing these two lines: https://gitlab.com/yade-dev/trunk/-/blob/master/lib/high-precision/RealHP.hpp#L106 and https://gitlab.com/yade-dev/trunk/-/blob/master/lib/high-precision/RealHP.hpp#L119 :)

Then we will have all of the YADE testing at hand, for example this: https://gitlab.com/yade-dev/trunk/-/jobs/1048563190

@tinko92
Contributor

tinko92 commented Mar 4, 2021

I saw this issue recently and took notice because it seems connected to the floating-point expansions that I implemented for applications in geometric predicates (based on the works of Shewchuk, see https://www.cs.cmu.edu/~quake/robust.html ). Maybe a templated implementation that takes an arbitrary number of components could offer the same functionality as double-double or quad-double with more flexibility.

My implementation of expansion arithmetic can currently be found in a PR to Boost.Geometry. It only contains the operations +, -, *, all implemented for exact computations. Because I think the concept might be of interest and most of the implementation that would be required for a demo is there already, I created a quick proof of concept based on the backend-skeleton. It is not polished and the sqrt and divide methods are probably not very robust, but it should be sufficient to illustrate the idea.

The code can be found at https://github.com/tinko92/floating_point_expansion_number_type

Here is an example of the results that this proof of concept produces. Let a = 8, b = 3; then the formula (a/b) * b - a produces, for n components:
n=2: -2.46519e-32
n=3: -1.36846e-48
n=4: -7.59645e-65
...
n=8: -7.21326e-130

which is roughly in line with what would be expected if a double has a precision of ~16 digits.

The choice of algorithms is probably not ideal because my implementations of +, - and * are all for exact calculations; if we want to limit the number of components, one could probably use faster algorithms, maybe the ones presented in http://perso.ens-lyon.fr/jean-michel.muller/07118139.pdf .

The example I created uses a template with a fixed limit on the number of components, but it could also be dynamic for arbitrary precision (limited by the exponent range, of course, so it would never make sense to have a result of more than ~40 components) using a std::vector instead of an array. There are other potential questions, e.g. whether to use zero-elimination for short expansions, whether to use the compress algorithm from Shewchuk's paper for renormalization or the renormalization presented by Joldes et al., and so on.
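To give a flavour of the expansion arithmetic described above, here is a hedged sketch of Shewchuk's GROW-EXPANSION (adding one scalar to an expansion). It is a textbook rendering, not code taken from the linked PR:

#include <vector>

// Knuth's TwoSum: s + e == a + b exactly under round-to-nearest.
inline void two_sum(double a, double b, double& s, double& e)
{
   s = a + b;
   const double bb = s - a;
   e = (a - (s - bb)) + (b - bb);
}

// Shewchuk's GROW-EXPANSION: add the scalar b to the nonoverlapping
// expansion e (components stored in increasing magnitude). The result
// is again exact and nonoverlapping, with one more component.
inline std::vector<double> grow_expansion(const std::vector<double>& e, double b)
{
   std::vector<double> h;
   h.reserve(e.size() + 1);

   double q = b;

   for (const double ei : e)
   {
      double sum;
      double err;

      two_sum(q, ei, sum, err);

      h.push_back(err); // the exact roundoff becomes a low-order component

      q = sum;
   }

   h.push_back(q); // most significant component last

   return h;
}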

@ckormanyos
Member

ckormanyos commented Mar 6, 2021

I saw this issue recently and took notice because it seems connected to the floating-point expansions that I implemented for applications in geometric predicates ...

Many thanks. Will look into this (and probably ask relevant questions) as the work on quad materializes and gets moving.

@ckormanyos
Member

ckormanyos commented Jun 26, 2021

Work on quad has begun within the context of GSoC 2021. Some design choices have already been made; others are still to be made.

Let's review:

  • We decided to start the quad-double BSL backend by stepping back one level and first constructing a so-called double_float template class composed of two built-in floating-point components, low and high. The very preliminary draft of this is here.
  • We feel it is best to implement this class first and move on to the quad-double case later, whereby it is not yet clear whether or how quad_float will be based on/utilize the double_float case.
  • In addition, the preliminary draft of double_float is intended to be almost usable as a standalone class, with all its operators and elementary functions available directly via ADL. As a second step, we intend to place the syntactic backend number requirements on top of this (such as eval_whatever functions). A rough sketch of this layering is shown below.
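Here is a hedged sketch of that two-step layering; the names are hypothetical and the arithmetic is a placeholder, not the draft's actual implementation:

// Step one: a standalone two-component class with ADL operators.
template <typename FloatingPointType>
class double_float
{
 public:
   constexpr double_float(FloatingPointType h = { }, FloatingPointType l = { })
      : hi_(h), lo_(l) { }

   // Placeholder addition only: the real draft uses error-free
   // transformations to obtain a correctly normalized result.
   friend constexpr double_float operator+(const double_float& a, const double_float& b)
   {
      return double_float(a.hi_ + b.hi_, a.lo_ + b.lo_);
   }

 private:
   FloatingPointType hi_; // leading part
   FloatingPointType lo_; // trailing part
};

// Step two: the syntactic backend requirement layered on top.
template <typename FloatingPointType>
constexpr void eval_add(double_float<FloatingPointType>& result,
                        const double_float<FloatingPointType>& x)
{
   result = result + x;
}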

@jzmaddock and @pabristow and @mborland and @NAThompson: Comments, suggestions, ideas?

@ckormanyos
Member

See also this post

@ckormanyos
Member

ckormanyos commented Aug 17, 2021

Work on quad has begun...

If you have been following this older thread, there is a lot of new progress on double/quad-float within the context of the GSoC21 (thanks @sinandredemption and @cosurgi). We now have a really cool draft of quad. We decided to implement a flexible template called

template<typename FloatingPointType>
class cpp_quad_float;

which can be instantiated for float, double, long double or boost::multiprecision::float128. This class basically quadruples the precision of these floating-point types.

If you get the fork here, you can use <boost/multiprecision/cpp_quad_float.hpp> and some of its convenience types such as boost::multiprecision::cpp_quad_dbl. A sample is shown below.

If you like, try quad-float on your favorite function or part of Boost.Math. We would like to hear field reports.

I like using boost::math::tgamma(). The sample below shows this.

#include <iomanip>
#include <iostream>

#include <boost/math/constants/constants.hpp>
#include <boost/math/special_functions/gamma.hpp>
#include <boost/multiprecision/cpp_quad_float.hpp>

int main()
{
   using big_float_type = boost::multiprecision::cpp_quad_dbl;

   // N[Sqrt[Pi], 201]
   // 1.77245385090551602729816748334114518279754945612238712821380778985291128459103218137495065673854466541622682362428257066623615286572442260252509370960278706846203769865310512284992517302895082622893210

   std::cout << std::endl << "represent_tgamma_half" << std::endl;

   const big_float_type g    = boost::math::tgamma(big_float_type(0.5F));
   const big_float_type ctrl = sqrt(boost::math::constants::pi<big_float_type>());

   const big_float_type delta = fabs(1 - (g / ctrl));

   const std::streamsize original_streamsize = std::cout.precision();
   std::cout << std::setprecision(std::numeric_limits<big_float_type>::digits10) << g    << std::endl;
   std::cout << std::setprecision(std::numeric_limits<big_float_type>::digits10) << ctrl << std::endl;
   std::cout << std::scientific << std::setprecision(4) << delta << std::endl;
   std::cout.precision(original_streamsize);
   std::cout.unsetf(std::ios::scientific);

   const bool result_is_ok = (delta < std::numeric_limits<big_float_type>::epsilon() * 40U);

   std::cout << "result_is_ok: " << std::boolalpha << result_is_ok << std::endl;

   return (result_is_ok ? 0 : -1);
}

Cc: @jzmaddock, @mborland, @NAThompson, @pabristow

@pabristow
Contributor

pabristow commented Aug 24, 2021 via email

@ckormanyos
Member

Would the normal math tests – but including type boost::multiprecision::cpp_quad_dbl be a good starting test?

Hi @pabristow, thank you for following up on this. Actually, there are quite a few dedicated Multiprecision tests in the standard Multiprecision CI which exercise both the type's mathematical functions and operations as well as certain special functions from Math.

We are in the process of using/adapting these for the reduced range(s) of double/quad-float. I think we have about 1/3 or more of the tests running... So we are moving forward on that. We made a dedicated double/quad-float CI for this project during the GSoC timeframe, so we have been running tests on pushes, PRs and also a nightly CI.

@mborland
Member

@ckormanyos Seeing as we did not participate in GSoC 2022, what still needs to happen on the GSoC 2021 fork to bring quad-double to MP?

@ckormanyos
Member

ckormanyos commented Aug 15, 2022

what still needs to happen on the GSoC 2021 fork to bring quad-double to MP

Hi Matt (@mborland), we got a good start, but a lot still remains to be done to get this Multiprecision-ready.

Together with Janek (@cosurgi) and our excellent GSoC '21 colleague Fahad (@sinandredemption) we got a good start on the basic algorithms and functionality.

We implemented double-float and quad-float templates, whereby instantiation with float, double, long double or even __float128 is foreseen.

Quad-float (instantiated for float) is tricky since it has around 30 decimal digits of precision but only float's exponent range of about 39 decimal orders. Otherwise, the design was strong.

Off the top of my head, we still need to...

  • Verify the design or choices made in numeric limits.
  • Ensure that algorithms add, sub, mul, div, sqrt are properly and efficiently implemented.
  • Implement more eval_ functions. In this design we decided to first go for standalone implementations, then subsequently hook them into number via eval_ functions. If we (or anyone) started again, one might go directly to the eval_ functions, as standalone is not really needed in the design.
  • Optimize elementary transcendentals for these low digit counts.
  • Integrate the backends into a lot of the existing standard tests. We did some testing hook-ups to standard tests of sine, cosine, exp and the like, but the list of these and their content is somewhat far from complete.

This was (and is) a really cool project and it would be good to make some progress. I remember toward the end we did some really preliminary benchmarking and were a bit disappointed with the performance of quad double.

Perhaps Fahad has some clear recollections of the state of open points.

We also have some open issues and a few branches; they still need the merge from main. I'll see if a good branch can be rebased or merged from main tonight.

Cc: @jzmaddock

@ckormanyos
Member

ckormanyos commented Dec 28, 2022

I'll see if a good branch can be rebased or merged from main

Hi Matt (@mborland), I was able to successfully merge the develop branch from BoostOrg into the GSoC21 double-double/quad-double repository.

At the moment, only a few elementary function tests and (most of) the arithmetic tests run in a special CI process that I wrote for that project. The branches were about 500 commits behind BoostOrg, so I took a bit of time to merge.

At the moment, the three branches develop, gsoc2021_double_float_chris and gsoc2021_double_float have been synchronized with both each other as well as with develop from BoostOrg. Any of these branches can be used to get an overview of what's going on.

If I get a chance in the upcoming year, I'd like to get back to this project and maybe see if double-double or quad-double can be finished (enough) for productive use. We still have to work out the edge cases and get all the elementary function tests running.

If anyone has interest, time, ideas, thoughts, please feel free to add/join-in

Cc: @cosurgi and @mborland and @jzmaddock and @sinandredemption

@ckormanyos
Member

The fork in its updated state can be found here.

Cc: @mborland and @jzmaddock

@jzmaddock
Collaborator

Thanks Chris! As you say, it was really just the corner cases and adequate testing that was holding this back, would be nice to see it finished off...

@ckormanyos
Member

ckormanyos commented Jan 4, 2023

Hi @cosurgi and @sinandredemption

  • I spent a lot of time standardizing double-float, and the new work behaves quite well with respect to the requirements of a Boost.Multiprecision type. Ultimately I decided to handle cpp_double_fp_backend first, and when this is production-ready, then and only then move forward to cpp_quad_fp_backend.
  • I invested a lot of time recently in NaNs, inf, zeros and implementing the needed eval_-whatever-functions. I also added a lot of (fork-local) tests including elementary functions, arithmetic tests, float-I/O, complex adaptation and more.
  • I went for implementation of rudimentary constexpr-ness and targeted C++14.
  • Some mysterious behaviors remain, such as with 10-byte long double, round tripping and spurious tolerance mysteries, but these are limited and the behavior is so far quite good.
  • Below you will find a sample of the cool things you can do with number<cpp_double_double> for instance.
  1. At this time, I could actually use some help implementing the full testing including spec-fun.
  2. I would also like to optimize some more eval_-functions such as trig functions (I did optimize exp() and log()).

Can anyone actually help me with 1?
Is anyone interested in contributing to 2?

Cc: @mborland and @jzmaddock

#include <iomanip>
#include <iostream>

#include <boost/math/constants/constants.hpp>
#include <boost/math/special_functions/gamma.hpp>
#include <boost/multiprecision/cpp_double_fp.hpp>

int main()
{
  using double_double_type = boost::multiprecision::cpp_double_double;

  constexpr double_double_type a = 1.5;
  constexpr double_double_type b = 3.5;
  constexpr double_double_type c = a * b;
  constexpr double_double_type d = b / a;

  // Test rudimentary compile time constexpr.
  static_assert(c == 5.25);
  static_assert(d < 3);

  // Test tgamma(), sqrt() and pi<>().

  double_double_type tg = boost::math::tgamma(double_double_type(0.5));

  const auto flg = std::cout.flags();

  std::cout << std::setprecision(static_cast<std::streamsize>(std::numeric_limits<double_double_type>::digits10))
            << std::fixed
            << tg
            << std::endl;

  std::cout << std::setprecision(static_cast<std::streamsize>(std::numeric_limits<double_double_type>::digits10))
            << std::fixed
            << sqrt(boost::math::constants::pi<double_double_type>())
            << std::endl;

  const auto relative = fabs(1 - (tg / sqrt(boost::math::constants::pi<double_double_type>())));

  const auto result_tg_is_ok = (relative < (std::numeric_limits<double_double_type>::epsilon() * 10));

  const auto result_is_ok = result_tg_is_ok;

  std::cout << "result_is_ok: " << std::boolalpha << result_is_ok << std::endl;

  std::cout.flags(flg);

  return (result_is_ok ? 0 : -1);
}

@sinandredemption

Hello @ckormanyos

I would definitely want to contribute to #1. Please allow me some time to re-familiarize myself with the codebase and set up my development environment.

@ckormanyos
Member

I would definitely want to contribute to #1

Hi @sinandredemption (Fahad) I'm happy to see you on this thread.

At the present time, I am making quite a few small but significant changes regarding the use of NaNs, inf, limits, zeros and the 128-bit floating-point type.

I actually found that Multiprecision standalone is really clean regarding some of these items. I also just finished up using Boost.Multiprecision's near-native float128_type instead of the wrapped number type that we had used during the GSoC. This is a big round of changes. So in a short time, you can use cpp_double_float128 with a lot less overhead. In fact, I even removed the dependency on non-ANSI extensions, so you can use this type with either std=c++14,17,2a,... or std=gnu++14,17,2a.

One of the first significant testing problems I found has been in rounding near the split-point of the division between first/second. So if you'd like to look at the rounding tests, that would be a great place to start.

Cc: @cosurgi and @jzmaddock and @mborland
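(For readers following the thread: the first/second split mentioned above is conventionally produced with Veltkamp's splitting trick. A generic sketch, not the backend's actual code:)

// Veltkamp split: decompose a double into hi + lo, where hi holds the
// upper 26 significand bits and lo the rest, both exactly representable.
// The constant is 2^27 + 1 for IEEE double; overflow for huge |a| is
// ignored in this sketch.
inline void split(double a, double& hi, double& lo)
{
   const double c = 134217729.0 * a; // (2^27 + 1) * a

   hi = c - (c - a);
   lo = a - hi;
}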

@ckormanyos
Member

ckormanyos commented Jan 5, 2023

Hi @sinandredemption, there is also another point regarding even more adaptation to constexpr-ness. There are still some artifacts (like in numeric_limits) that should be compile-time constants yet are non-constexpr. Oftentimes this is because of the use of something like sqrtf(), sqrt(), sqrtl(), sqrtq() or similar for frexp(), ldexp().

This results in a handful of functions that could use some of Matt's (@mborland) constexpr-cmath facilities to go all the way constexpr on numeric_limits<>. This will possibly also apply to functions like eval_sqrt(), probably even eval_log() and eval_exp() too.

This could also be an interesting place to jump on and contribute. I can explain more in offline chat(s) if/when needed.
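As one possible shape for such a contribution, a hedged sketch of a C++14-constexpr square root via Newton iteration; illustrative only - the constexpr-cmath facilities mentioned above would be the real basis:

// Naive compile-time sqrt via Newton iteration (C++14 constexpr).
// No handling of zero, negatives, infinities or NaNs in this sketch.
constexpr double sqrt_cx(double x)
{
   double g = x;

   for (int i = 0; i < 64; ++i)
   {
      g = 0.5 * (g + x / g);
   }

   return g;
}

static_assert(sqrt_cx(4.0) == 2.0, "compile-time sqrt check");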

@ckormanyos
Member

Introduction of the cpp_double_fp_backend<> template has begun in #515, but this will take a few revisions until complete.

If/when this finishes in a while, I'd then suggest we move on with quad-float.

Cc: @sinandredemption and @cosurgi and @jzmaddock and @mborland

@AuroraDysis

If possible, could you support a greater variety of combinations?

For example, MultiFloats.jl is a very fast library that represents extended-precision numbers with 2x, 3x, ..., up to 8x the precision of double, which might be helpful.

@ckormanyos
Member

could you support a greater variety of combinations?

Hi @AuroraDysis that looks interesting and many thanks for your interest in double/quad/multi-float(s).

During the GSoC and after that, I remember we were having trouble figuring out the exact algorithms for add/sub/mul/div and struggling with overflow/underflow/NaNs. For the moment, this project is on the back burner.

Perhaps the Julia code you reference could give insight into this. If I/we ever pick up this issue, we will be sure to have a look, but I fear this might have rather low priority at the moment.

@AuroraDysis

@ckormanyos Thanks for your reply. I found multi-floats to be very useful in my research area and believe they will play an even more important role in the future. If you have the chance to pick up this project again and need some help, I'd like to contribute or collaborate with you.

As for double floats, DoubleFloats.jl is the most mature library I have ever used, although it is not as fast as MultiFloats.jl.

@ckormanyos
Member

ckormanyos commented Jan 13, 2024

found multi-floats to be very useful in my research area and believe they will play an even more important role in the future. If you have the chance to pick up this project again and need some help, I'd like to contribute or collaborate with you.

The plan is to do this: look into multi-floats and ultimately get something adhering to Boost.Multiprecision/Math. If (which actually means when) I/we warm this up again, I'll be sure to look into your references and contact you regarding forward motion.

Thanks @AuroraDysis for your interest and willingness to help. When this gets moving again, we will be in touch.
