
DOC: What does np.finfo.nmant mean? #22333

Open
oscarbenjamin opened this issue Sep 24, 2022 · 3 comments

Comments

@oscarbenjamin

Issue with current documentation:

This is coming from a SymPy issue. I'm not sure if this should be considered a documentation problem for NumPy or a bug:
sympy/sympy#24071

The docs say that np.finfo.nmant represents:

The number of bits in the mantissa.

https://numpy.org/doc/stable/reference/generated/numpy.finfo.html

So if I try this with double I get:

>>> np.finfo(np.double).nmant
52

Now, in IEEE 754 64-bit floating point there is a 53-bit mantissa, but only 52 of those bits are stored explicitly within the 64-bit data type, with the leading 1 being implicit (for normal nonzero values).
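The implicit leading bit can be seen by unpacking a double's raw bits directly (a quick illustration, assuming the standard IEEE 754 binary64 layout):

```python
import struct

# 1.5 is 1.1 in binary: the leading 1 is implicit, and only the fraction
# bits (.1000...) are stored in the low 52 bits of the 64-bit pattern.
bits = int.from_bytes(struct.pack("<d", 1.5), "little")
stored_fraction = bits & ((1 << 52) - 1)  # the 52 explicitly stored bits
print(bin(stored_fraction))  # top stored bit set: 1.5 = (1 + 0.5) * 2**0
```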

If I try this with longdouble I get:

>>> np.finfo(np.longdouble).nmant
63

My understanding is that on this x86-64 Linux system, long double means 80-bit extended precision (as in x87). In that format there is a 64-bit mantissa, and all 64 bits of it are stored explicitly.

From here I can see that nmant does not represent the number of bits actually stored for the mantissa, but neither does it represent the number of bits of effective precision. Rather, the relationship seems to be:

precision = nmant + 1

That particular assumption is used by SymPy when converting numpy datatypes to SymPy's arbitrary precision Float format.
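That relationship can be sanity-checked directly by counting the significant bits of an exact value (assuming IEEE 754 float64; as_integer_ratio gives the exact rational value of a float):

```python
import numpy as np

# 2/3 has an infinite binary expansion, so its float64 approximation uses
# every available significand bit; the exact numerator then has 53 bits.
x = float(np.float64(2) / np.float64(3))
sig_bits = x.as_integer_ratio()[0].bit_length()
print(np.finfo(np.float64).nmant, sig_bits)  # nmant = 52, actual precision = 53
```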

In the SymPy issue, what is reported is that on PowerPC, long double uses the double-double format, and NumPy's nmant apparently gives 105:

>>> np.finfo(np.longdouble).nmant # on PowerPC
105

Here the number of bits of mantissa explicitly stored would be 2*52 = 104. With both implicit ones it would be 2*53 = 106. However the effective precision is apparently 107 (I'm not sure how that works):
https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format#Double-double_arithmetic
Likewise on that hardware:

>>> (np.longdouble(2)/3).as_integer_ratio()[0].bit_length() # PowerPC
107

Maybe I'm just misunderstanding what nmant is supposed to represent, but at the least the docs should clarify its intended meaning, because I really can't make sense of what nmant = 105 is supposed to indicate here.

Alternatively, is this a bug, and should nmant actually report 106 on that particular PowerPC system?

Idea or request for content:

Clarify what nmant means or possibly change its value on PowerPC systems.

@chatcannon
Contributor

A similar problem occurs for standard IEEE 754 floats in the case of subnormal numbers: np.finfo does not report the lower precision or number of bits:

In [1]: import numpy as np

In [2]: tiny = np.finfo(1.0).smallest_subnormal * 5

In [3]: np.finfo(tiny).precision
Out[3]: 15

In [4]: np.finfo(tiny).nmant
Out[4]: 52

In this case using nmant will overestimate the number of bits of precision available in the value being represented.
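For illustration, the effective precision of that subnormal can be counted exactly with as_integer_ratio (pure-Python floats are IEEE 754 doubles, so this matches the NumPy example above):

```python
# 5 * 2**-1074 is five times the smallest subnormal double: only
# 3 significant bits are actually stored, far fewer than nmant = 52.
tiny = 5 * 2.0 ** -1074
stored_bits = tiny.as_integer_ratio()[0].bit_length()
print(stored_bits)  # 3
```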

@MatteoRaso
Contributor

I think I might have found the problem. nmant comes from the "it" attribute of the MachAr class. The it attribute is computed by incrementing a counter by 1, testing that the mantissa has at least that many digits, and breaking when it doesn't. Since the counter is incremented before the test, nmant is always off by one. This is on line 153 of _machar.py if anybody wants to check whether I got it right.
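For reference, a minimal sketch of this style of detection loop (not NumPy's actual _machar.py code) shows how the counting convention determines whether you get 52 or 53 for float64:

```python
# Count how many times eps can be halved before 1.0 + eps rounds back to
# 1.0.  Counting every successful halving gives 53 for float64 (the full
# precision including the implicit bit); a loop that increments its
# counter before the failing test, as described above, reports one fewer.
def count_mantissa_bits(one=1.0):
    i = 0
    eps = one
    while (one + eps) - one == eps:
        eps /= 2
        i += 1
    return i

print(count_mantissa_bits())  # 53 on IEEE 754 double hardware
```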

@seberg
Member

seberg commented Sep 26, 2022

I think I might have found the problem. nmant comes from the "it" attribute in the machar class.

Should check that this is also always off by one. It probably is, because that behavior probably got copied over. But the actual values come from _register_known_types. All of these are hardcoded in getlimits.py.

If we define it as the minimum guaranteed precision (which I guess is not what sympy wants), then I think it is actually reliably off by one, no? So in that sense it is well defined?
I am not even sure that double-double always normalizes the mantissa to those 106/107 bits.

Note: As the wikipedia reference explains very clearly/nicely, the reason for 107 bits in practically all cases comes from the fact that the second double-precision value can have the opposite sign of the first, effectively adding one bit. Since only very few values are limited to 106 bits, maybe 107 bits is actually the better "lower" choice.
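The opposite-sign effect can be emulated with a pair of ordinary Python floats (a sketch: two IEEE doubles hi + lo standing in for one double-double value, with exact arithmetic via Fraction):

```python
from fractions import Fraction

# hi = 1.0 and lo = -2**-107 are both exact doubles, and since
# |lo| <= ulp(hi)/2, their sum is a valid double-double value.  The exact
# sum is (2**107 - 1) / 2**107, whose mantissa is 107 one-bits: more than
# the naive 2 * 53 = 106 bits.
hi, lo = 1.0, -(2.0 ** -107)
exact = Fraction(hi) + Fraction(lo)
print(exact.numerator.bit_length())  # 107
```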

I think a good solution would be to introduce a new attribute mant_dig, since that is what sys.float_info uses. That way we can eventually deprecate nmant without hard-breaking anyone who might rely on it.
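For comparison, Python's own sys.float_info already uses the mant_dig name and counts the implicit bit:

```python
import sys
import numpy as np

# C's <float.h> DBL_MANT_DIG and Python's sys.float_info.mant_dig both
# count the full 53-bit precision of a double; NumPy's nmant reports 52.
print(sys.float_info.mant_dig)     # 53
print(np.finfo(np.float64).nmant)  # 52
```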

This is tricky stuff unfortunately, but if anyone wants to look into it, that would be great. That way I can concentrate on reviewing any PR to address the issue.

MatteoRaso added a commit to MatteoRaso/numpy that referenced this issue Oct 8, 2022
The usual attribute for getting the mantissa size
is nmant, but it is always off by one. However,
other packages (like sympy) rely on the nmant
attribute, so we would end up breaking code
downstream if we changed it. Instead, I introduced
a new attribute called mant_dig, as member
Sebastian Berg suggested. In the documentation for
finfo, I added a small disclaimer saying that
nmant is inaccurate, is set for deprecation at
some point in the future, and that mant_dig
should be used from here on out.