std::numeric_limits<half>::digits10 value is wrong. #252

Closed
c-lipka opened this issue Dec 28, 2017 · 0 comments
Labels
Bug A bug in the source code


c-lipka commented Dec 28, 2017

The constant std::numeric_limits<half>::digits10 in halfLimits.h currently evaluates to 2, but should actually evaluate to 3.

The macro HALF_DIG in half.h is also presumably wrong.

Rationale:

std::numeric_limits<half> should follow the std::numeric_limits<T> template semantics as defined in the C++ standard. For the digits10 member, the standard mandates that it shall be "equivalent" to the FLT_DIG, DBL_DIG and LDBL_DIG macros as defined by the C standard.

The C standard, in turn, specifies that the FLT_DIG, DBL_DIG and LDBL_DIG macros shall be equal to p-1 multiplied by the base-10 logarithm of b, rounded down, where p is the number of base-b digits in the significand, and b is the floating-point numerical base. (A different formula applies if b is a power of 10, but is irrelevant for half since it is base-2.)

Note that p includes the implicit MSbit of the significand (*). Also, the C standard clearly mandates that the macros FLT_MANT_DIG, DBL_MANT_DIG and LDBL_MANT_DIG shall be defined as p for the three standard floating-point types. The C++ standard in turn mandates that the std::numeric_limits<T> template's digits member shall be "equivalent" to these macros.

(* While the definition of p in the C standard may not be obvious, the examples should be clear enough, listing the corresponding values for IEEE single and double precision floating-point types as 24 and 53, respectively.)

So std::numeric_limits<half>::digits10 should be equal to std::numeric_limits<half>::digits-1, multiplied by the base-10 logarithm of std::numeric_limits<half>::radix, rounded down. With std::numeric_limits<half>::digits evaluating to 11, and std::numeric_limits<half>::radix evaluating to 2, the value should be approx. 3.01 rounded down, i.e. 3.
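
To illustrate the arithmetic, here is a minimal sketch of the formula in C++. The helper names digits10_from, check_against_builtin_types and half_digits10 are hypothetical and not part of OpenEXR; the value 11 for half's significand width is taken from the reasoning above, and the check assumes float and double use a radix that is not a power of 10 (as on IEEE platforms).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper (not part of OpenEXR): floor((p - 1) * log10(b)),
// the C standard's formula for the *_DIG macros when b is not a power of 10.
inline int digits10_from(int p, int b)
{
    return static_cast<int>(
        std::floor((p - 1) * std::log10(static_cast<double>(b))));
}

// Sanity check: the formula reproduces the standard library's digits10
// for float and double (assumes their radix is not a power of 10).
inline void check_against_builtin_types()
{
    assert(digits10_from(std::numeric_limits<float>::digits,
                         std::numeric_limits<float>::radix)
           == std::numeric_limits<float>::digits10);
    assert(digits10_from(std::numeric_limits<double>::digits,
                         std::numeric_limits<double>::radix)
           == std::numeric_limits<double>::digits10);
}

// half: 11-bit significand (10 stored bits plus the implicit leading bit),
// radix 2, so this is floor(10 * log10(2)).
inline int half_digits10()
{
    return digits10_from(11, 2);
}
```

Calling half_digits10() yields 3, while the same helper gives 6 and 15 for the IEEE single and double precision significand widths of 24 and 53.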

Suggested solution:

Since the HALF_DIG macro in half.h is also presumably wrong, and std::numeric_limits<half>::digits10 is defined in terms of that macro, the issue should be fixed by changing the macro's definition from 2 to 3.

@cary-ilm cary-ilm added the Bug A bug in the source code label Jun 13, 2019
@cary-ilm cary-ilm added this to the Needs Attention milestone Jun 29, 2019
kdt3rd added a commit to kdt3rd/openexr that referenced this issue Jul 21, 2019
…gits

Based on float / double math for base 10 digits, with 1 bit of rounding
error, the equation should be floor( ( mantissa_digits - 1 ) * log10(2) ),
which in the case of half becomes floor( 10 * log10(2) ) or 3

Signed-off-by: Kimball Thurston <kdt3rd@gmail.com>
@kdt3rd kdt3rd closed this as completed in bca0bc0 Jul 22, 2019
DominicJacksonBFX pushed a commit to boris-fx/mocha-openexr that referenced this issue Jun 22, 2022
…gits

Based on float / double math for base 10 digits, with 1 bit of rounding
error, the equation should be floor( ( mantissa_digits - 1 ) * log10(2) ),
which in the case of half becomes floor( 10 * log10(2) ) or 3

Signed-off-by: Kimball Thurston <kdt3rd@gmail.com>