Numpy float128 overflows on ppc64le but works on x86_64 #10281

Closed
npanpaliya opened this Issue Dec 27, 2017 · 26 comments

npanpaliya commented Dec 27, 2017

While working on a TensorFlow test case, I came across a problem on a ppc64le Linux system that boils down to numpy. I've also written a small test program to showcase the behavior on both Power8 and x86_64. This is with numpy version 1.13.3.
Test Program

import numpy
t = numpy.float128(1.7976931e+308)
res = numpy.sqrt(t**2 + 1)
print("Result of sq root of square of 1.7976931e+308:  ", res)

Output on Power with numpy version 1.13.3:

# python test_numpyfloat128.py
test_numpyfloat128.py:3: RuntimeWarning: overflow encountered in longdouble_scalars
  res = numpy.sqrt(t**2 + 1)
('Result of sq root of square of 1.7976931e+308:  ', inf)

Min/max range of float128 on Power:

>>> numpy.finfo(numpy.float128)
finfo(resolution=1e-31, min=-1.79769313486e+308, max=1.79769313486e+308, dtype=float128)

Output on x86 with numpy version 1.13.3:

$ python test_numpyfloat128.py
('Result of sq root of square of 1.7976931e+308:  ', 1.7976931000000000494e+308)

Min/max range of float128 on x86:

>>> numpy.finfo(numpy.float128)
finfo(resolution=1e-18, min=-1.18973149536e+4932, max=1.18973149536e+4932, dtype=float128)

I've read many documents/links about float128, both on Power and in general. float128 is platform dependent: it is long double on x86 and some other architectures, but on Power it is double-double [2]. On Power, I checked with a small C/C++ program that the long double type is available and is 128 bits wide. Some of the links below say that the __float128 type is also available on Power7 and Power8 when VSX is enabled [3], but unfortunately I'm unable to test it in a C/C++ program.

  1. https://www.ibm.com/support/knowledgecenter/en/ssw_aix_71/com.ibm.aix.genprogc/128bit_long_double_floating-point_datatype.htm
  2. https://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format#Double-double_arithmetic
  3. https://gcc.gnu.org/wiki/Ieee128PowerPC#A1.3_Sizes

I have also come across many numpy issues on Power around this float128. Also, as per npy_common.h, it seems that float128 is not available on systems where long double is only 64 bits (the size of double); float64 is used instead. But I'm still not sure whether this is an unsupported feature on Power or a bug in numpy on PPC.
Kindly provide some pointers on this, or a way to make the above test give the same results as on x86_64.
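For reference, here is the small check I use to see what long double actually is on a given build (just a sketch; the float128 alias may not even exist on some platforms):

import numpy as np
# float128 is only an alias for longdouble on builds where long double is 16 bytes wide
print(getattr(np, "float128", None) is np.longdouble)
print(np.dtype(np.longdouble).itemsize)   # 16 on both x86_64 and ppc64le here
print(np.finfo(np.longdouble))            # the reported range/resolution reveals the actual format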

eric-wieser commented Dec 27, 2017

Min/max range of float128 on Power

I'm sure you noticed this, but that looks like the range of double to me.

What does np.longdouble is np.float128 give?
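For comparison, a quick REPL check of the double range (values here are from a stock x86_64 build, shown for illustration):

>>> import numpy as np
>>> np.finfo(np.double).max
1.7976931348623157e+308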

npanpaliya commented Dec 27, 2017

Yes, that's the range of double. np.longdouble == np.float128 gives True on Power.

eric-wieser commented Dec 27, 2017

Can you try with is as well as ==, just to be sure?

npanpaliya commented Dec 27, 2017

Okay. I tried -

>>> np.longdouble == np.float128
True
>>> np.longdouble is np.float128
True
eric-wieser commented Dec 27, 2017

Your last link explains everything:

The PowerPC GCC compiler for Linux, AIX, and Darwin do not use the IEEE 754R 128-bit floating point type for the long double data type. Instead the PowerPC compiler uses a software type known as IBM extended double type. The IBM extended double type is a pair of double values that give the user more bits of precision, but no additional range for the mantissa.

The only remaining question is why numpy calls this type float128, for which the answer is in the size section:

IBM extended double | 16 bytes

So at best, the bug report here is that float128 is a misleading name that can be confused with the "IEEE 754R" spec. If you want to open a new issue about naming, feel free.
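You can actually see the "more precision, but no additional range" behaviour from Python on such a system. A minimal sketch (the exact numbers depend on the build, but they should line up with the finfo output quoted earlier in this issue):

import numpy as np
ld = np.finfo(np.longdouble)   # on a ppc64le double-double build
print(ld.eps)   # on the order of 1e-32: far finer than double's ~2.2e-16 (the extra precision)
print(ld.max)   # still ~1.797e+308, the same as double (no extra range)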

eric-wieser commented Dec 27, 2017

From the same page, compiling numpy with -mabi=ieeelongdouble will solve your problem.

npanpaliya commented Dec 27, 2017

Thanks @eric-wieser for your prompt replies and conclusive answers. I'll try building numpy with -mabi=ieeelongdouble. I also agree that the naming of float128 should be made appropriate on PPC; maybe an error should be raised when using it.
Thanks again!

npanpaliya commented Dec 27, 2017

I also saw a fork of numpy in a PPC64 repo; it might have addressed this. I will check that too.

charris commented Dec 27, 2017

The float128 name may become problematic at some point. Currently, it is either IEEE extended precision, IBM extended double, or IEEE quad precision (was on SUN). NumPy does recognize the different types and they should show up in finfo for the type. At some point I expect quad precision to make a comeback -- gcc has support -- and then we should probably come up with a better name than float128.

charris commented Dec 27, 2017

Maybe we should also add the "official" name of the type to finfo?

eric-wieser commented Dec 27, 2017

And for iinfo while we're at it.

ahaldane commented Dec 27, 2017

Hmm I didn't specifically account for IBM extended double in the 1.14 Dragon4 float-printing updates. I didn't realize they were used on ppc. That means they probably print incorrectly in 1.14.

I think I have a ppc system somewhere, I'll investigate.

charris commented Dec 27, 2017

Hmm I didn't specifically account for IBM extended double in the 1.14 Dragon4 float-printing updates.

Oops, I missed that. Might not be simple, the precision is a bit odd ...

eric-wieser commented Dec 27, 2017

Doesn't this mean that arrays saved with dtype f16 are inherently non-portable? Do we need a different type code for each type of long double, so that we can at least raise an error when loading pickle files from another machine, rather than doing the wrong thing?

We already solve this for integer types by recording the number of bytes rather than the C type name.

charris commented Dec 27, 2017

Doesn't this mean that arrays saved with dtype f16 are inherently non-portable?

Yes, that is how it is. The long doubles are useful here and there, but should not be used for data that need to be portable across architectures. It is probably not such a bad thing that Windows just makes them doubles.
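For anyone who needs portability in practice, a hedged workaround is to down-cast before saving, assuming float64 precision is acceptable for the stored data:

import numpy as np
a = np.arange(3, dtype=np.longdouble)
# store as IEEE double instead of the platform-dependent long double layout
np.save("data_f8.npy", a.astype(np.float64))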

ahaldane commented Dec 27, 2017

It turns out I don't have easy access to a ppc system after all; it will take more time to find one.

The dragon4 code is in some ways pretty flexible, since it allows arbitrary variations in mantissa and exponent size as long as they fit into a uint64. But the IBM extended type will be trickier since the mantissa is more than 64 bits, meaning I would have to update the function signatures to use a bigger type to pass around the mantissa. I will also need to make the BigInt implementation bigger.

An alternative, quick-fix solution might be to fall back to the system printf for the unusual extended doubles.

eric-wieser commented Dec 27, 2017

Instead of passing a BigInt, can you just use the mantissa in the long double to do integer arithmetic?

eric-wieser commented Dec 27, 2017

Your suggested quick fix sounds good to me for 1.14, though.

charris commented Dec 27, 2017

A uint64 is going to be too short for quad precision if/when it arrives. I note that Julia has a doubledouble library (MIT license?).

charris commented Dec 27, 2017

Maybe, long term, the thing to do is to implement IEEE quad for long double printing and use that as a common portable format for pickles and the like, except when long double is actually double.

ahaldane commented Dec 27, 2017

@eric-wieser the mantissa issue is somewhat separate from the BigInt issue.

For the mantissa issue, it's just a question of passing around a type bigger than uint64 in the function arguments, e.g. we might need a uint128. Or actually, maybe I can just use a BigInt.

For the BigInt, this is because Dragon4 uses bigints internally to store various values. The bigint needs to be up to about 2^(expbits-1) bits long where expbits is the number of exponent bits of the format. But looking at the pages linked above, none of the exponents is longer than 15 bits, and in the current dragon4 implementation I already account for 15 bits (a 2048 byte BigInt). Even ieee-quad only uses 15 bits.

So in conclusion, I think it might actually be easy to adapt the current code for the IBM types and for ieee quad. We can leave the BigInt alone, and just need to pass in the mantissa in a higher-precision format at the beginning.
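The size bound works out like this (just restating the arithmetic above):

expbits = 15                      # exponent bits for the x87 extended and IEEE quad formats
bigint_bits = 2 ** (expbits - 1)  # 16384 bits
print(bigint_bits // 8)           # 2048 bytes, matching the BigInt size already allocated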

If we don't already have macros to distinguish which 128-bit float type the system has, I might need to write that too. At first glance it doesn't seem we do...

charris commented Dec 27, 2017

The float type is determined in two places: during the build configuration (see numpy/core/setup_common.py and numpy/core/setup.py), and in finfo. The exact type should be available in the generated configuration file as HAVE_LDOUBLE_<TYPE>.
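From Python, a rough way to tell the flavours apart at runtime (a sketch only, not the build-time mechanism; the mantissa/exponent pairs below are the commonly quoted values and should be double-checked):

import numpy as np
ld = np.finfo(np.longdouble)
# typical (nmant, nexp) pairs:
#   (52, 11)  -> long double is plain IEEE double
#   (63, 15)  -> x87 80-bit extended precision
#   (105, 11) -> IBM double-double
#   (112, 15) -> IEEE binary128 (quad)
print(ld.nmant, ld.nexp)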

npanpaliya commented Dec 28, 2017

@ahaldane : Open-source access to Power hardware is possible via http://osuosl.org/services/powerdev/request_hosting/

@eric-wieser : As you suggested, I tried building numpy with -mabi=ieeelongdouble, but gcc fails with a segmentation fault. Below are the last few lines of the build log:

compile options: '-Ibuild/src.linux-ppc64le-2.7/numpy/core/src/npymath -Inumpy/core/include -Ibuild/src.linux-ppc64le-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -I/root/anaconda2/include/python2.7 -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/npymath -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/npymath -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/npymath -c'
gcc: numpy/core/src/npymath/npy_math.c
gcc: warning: using IEEE extended precision long double
cc1: warning: using IEEE extended precision long double [enabled by default]
gcc: internal compiler error: Segmentation fault (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
gcc: warning: using IEEE extended precision long double
cc1: warning: using IEEE extended precision long double [enabled by default]
gcc: internal compiler error: Segmentation fault (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
gcc: build/src.linux-ppc64le-2.7/numpy/core/src/npymath/npy_math_complex.c
error: Command "gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -mabi=ieeelongdouble -Wno-psabi -fPIC -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/npymath -Inumpy/core/include -Ibuild/src.linux-ppc64le-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -I/root/anaconda2/include/python2.7 -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/npymath -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/npymath -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/private -Ibuild/src.linux-ppc64le-2.7/numpy/core/src/npymath -c numpy/core/src/npymath/npy_math.c -o build/temp.linux-ppc64le-2.7/numpy/core/src/npymath/npy_math.o -MMD -MF build/temp.linux-ppc64le-2.7/numpy/core/src/npymath/npy_math.o.d" failed with exit status 4

Also, as per your discussion above, will a fix for this issue be available in the 1.14 release?

eric-wieser commented Dec 28, 2017

No, the discussion above refers to how repr(float128) assumes that float128 is the IEEE 128-bit float (#10289), which it is not, and how 1.14 might have a fix to not make this assumption in repr.

Remember, the only "bug" here (#10288) is that np.float128 is a confusing name for np.longdouble if long double is not actually the IEEE 128-bit float. Confusing names are something that has to be fixed by a long deprecation cycle.
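As a practical aside, code that needs to run everywhere can spell the type via the alias rather than the sized name; a minimal sketch (the getattr check is purely illustrative):

import numpy as np
# np.longdouble always exists; np.float128 is only defined when long double is 16 bytes
print(np.longdouble, getattr(np, "float128", None))
t = np.longdouble("1.7976931e+308")   # the original test, spelled portably
print(np.sqrt(t * t + 1))             # still overflows to inf on a double-double build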

Can you try a newer GCC?

npanpaliya commented Dec 28, 2017

I tried with gcc 5.4, but it gives another error:

In file included from numpy/core/src/npymath/npy_math.c:9:0:
numpy/core/src/npymath/npy_math_internal.h.src: In function ‘npy_sqrtl’:
numpy/core/src/npymath/npy_math_internal.h.src:470:1: error: unrecognizable insn:
 }
 ^
(insn 9 8 10 2 (set (reg:CCFP 160)
        (compare:CCFP (reg:TF 159)
            (reg:TF 159))) numpy/core/src/npymath/npy_math_internal.h.src:469 -1
     (nil))
numpy/core/src/npymath/npy_math_internal.h.src:470:1: internal compiler error: in extract_insn, at recog.c:2343
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.
powerpc64le-linux-gnu-gcc: warning: using IEEE extended precision long double
cc1: warning: using IEEE extended precision long double
numpy/core/src/npymath/npy_math_complex.c.src: In function ‘npy_cpowl’:
numpy/core/src/npymath/npy_math_complex.c.src:465:1: note: the ABI of passing aggregates with 16-byte alignment has changed in GCC 5
 npy_cpow@c@ (@ctype@ a, @ctype@ b)
 ^
numpy/core/src/npymath/npy_math_complex.c.src:465:1: internal compiler error: in validate_condition_mode, at config/rs6000.c:17356
Please submit a full bug report,

I read a few gcc bug reports which say this is fixed in gcc 7, though I'm not sure. Do you have any pointers?

ahaldane commented Dec 28, 2017

@npanpaliya Thanks for the OSU link, I was looking for something like that. I'll look into creating an account.

@charris I'll go ahead and try making an account there, putting "numpy" down as the organization. I'll post to the list once things move forward a bit, so others can have access.
