[ppc64le] test suite failing on power9 #1244

kif · 2019-06-19T05:50:41Z

As suggested by @takluyver, here are the results of the test suite run on an IBM POWER9 machine running Ubuntu 18.04. Python is version 3.6; all modules were pip-installed in a venv (and compiled locally, as no binary wheels exist yet for this platform), and HDF5 1.10.5 was compiled from source.

Most of the failing tests are related in one way or another to "float128", which has just been deactivated in #1243. They should probably be marked as "xfailed".

(venv-kieffer_system) test@power9:~/HDF5/h5py/build/lib.linux-ppc64le-3.6$ python
Python 3.6.7 (default, Oct 22 2018, 11:32:17) 
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import h5py
>>> h5py.run_tests()
=================================================================================== test session starts ===================================================================================
platform linux -- Python 3.6.7, pytest-4.4.2, py-1.8.0, pluggy-0.11.0
rootdir: /home/test/HDF5/h5py, inifile: pytest.ini
collected 537 items                                                                                                                                                                       

h5py/tests/hl/test_attribute_create.py ...                                                                                                                                          [  0%]
h5py/tests/hl/test_completions.py ..                                                                                                                                                [  0%]
h5py/tests/hl/test_dataset_getitem.py .................................................x....................................                                                        [ 16%]
h5py/tests/hl/test_dataset_swmr.py ssss..........                                                                                                                                   [ 19%]
h5py/tests/hl/test_datatype.py .........F.....ssss                                                                                                                                  [ 23%]
h5py/tests/hl/test_deprecation.py .                                                                                                                                                 [ 23%]
h5py/tests/hl/test_dims_dimensionproxy.py .                                                                                                                                         [ 23%]
h5py/tests/hl/test_file.py .................                                                                                                                                        [ 26%]
h5py/tests/hl/test_filters.py .s.                                                                                                                                                   [ 27%]
h5py/tests/hl/test_threads.py ..                                                                                                                                                    [ 27%]
h5py/tests/hl/test_vds/test_highlevel_vds.py ......                                                                                                                                 [ 28%]
h5py/tests/hl/test_vds/test_lowlevel_vds.py ...                                                                                                                                     [ 29%]
h5py/tests/hl/test_vds/test_virtual_source.py ................                                                                                                                      [ 32%]
h5py/tests/old/test_attrs.py ...............s                                                                                                                                       [ 35%]
h5py/tests/old/test_attrs_data.py ......F.............                                                                                                                              [ 38%]
h5py/tests/old/test_base.py .....                                                                                                                                                   [ 39%]
h5py/tests/old/test_dataset.py ......F..x...............................s.................................x..........................                                               [ 58%]
h5py/tests/old/test_datatype.py ..                                                                                                                                                  [ 59%]
h5py/tests/old/test_dimension_scales.py ....s................                                                                                                                       [ 63%]
h5py/tests/old/test_file.py .............ss...ssss....................s.........                                                                                                    [ 72%]
h5py/tests/old/test_file_image.py ..                                                                                                                                                [ 73%]
h5py/tests/old/test_group.py .................................ssssss...................................................                                                             [ 89%]
h5py/tests/old/test_h5.py .....                                                                                                                                                     [ 90%]
h5py/tests/old/test_h5d_direct_chunk_write.py .                                                                                                                                     [ 91%]
h5py/tests/old/test_h5f.py .....                                                                                                                                                    [ 91%]
h5py/tests/old/test_h5p.py ........                                                                                                                                                 [ 93%]
h5py/tests/old/test_h5t.py ....                                                                                                                                                     [ 94%]
h5py/tests/old/test_objects.py ...                                                                                                                                                  [ 94%]
h5py/tests/old/test_selections.py ....                                                                                                                                              [ 95%]
h5py/tests/old/test_slicing.py ........................                                                                                                                             [100%]

======================================================================================== FAILURES =========================================================================================
__________________________________________________________________________ TestOffsets.test_float_round_tripping __________________________________________________________________________

self = <h5py.tests.hl.test_datatype.TestOffsets testMethod=test_float_round_tripping>

    @ut.skipIf(
        platform.machine() in x86_32_BIT_SYSTEMS,
        'Test fails on i386, need to sort out long double FIX THIS')
    def test_float_round_tripping(self):
        dtypes = set(f for f in np.typeDict.values()
                     if (np.issubdtype(f, np.floating) or
                         np.issubdtype(f, np.complexfloating))
                     )
    
        dtype_dset_map = {str(j): d
                          for j, d in enumerate(dtypes)}
    
        fname = self.mktemp()
    
        with h5py.File(fname, 'w') as f:
            for n, d in dtype_dset_map.items():
                data = np.arange(10,
                                 dtype=d)
    
                f.create_dataset(n, data=data)
    
        with h5py.File(fname, 'r') as f:
            for n, d in dtype_dset_map.items():
                ldata = f[n][:]
>               self.assertEqual(ldata.dtype, d)
E               AssertionError: dtype('<f8') != <class 'numpy.float128'>

h5py/tests/hl/test_datatype.py:307: AssertionError
__________________________________________________________________________________ TestTypes.test_float ___________________________________________________________________________________

self = <h5py.tests.old.test_attrs_data.TestTypes testMethod=test_float>

    def test_float(self):
        """ Storage of floating point types """
        dtypes = tuple(np.dtype(x) for x in ('<f4','>f4','<f8','>f8'))
    
        for dt in dtypes:
            data = np.ndarray((1,), dtype=dt)
            data[...] = 42.3
            self.f.attrs['x'] = data
            out = self.f.attrs['x']
>           self.assertEqual(out.dtype, dt)
E           AssertionError: dtype('>f16') != dtype('>f8')

h5py/tests/old/test_attrs_data.py:112: AssertionError
____________________________________________________________________________ TestCreateShape.test_long_double _____________________________________________________________________________

self = <h5py.tests.old.test_dataset.TestCreateShape testMethod=test_long_double>

    def test_long_double(self):
        """ Confirm that the default dtype is float """
        dset = self.f.create_dataset('foo', (63,), dtype=np.longdouble)
>       self.assertEqual(dset.dtype, np.longdouble)
E       AssertionError: dtype('<f8') != <class 'numpy.float128'>

h5py/tests/old/test_dataset.py:94: AssertionError
==================================================================================== warnings summary =====================================================================================
build/lib.linux-ppc64le-3.6/h5py/tests/hl/test_dataset_getitem.py::Test1DZeroFloat::test_mask
  /home/test/HDF5/h5py/build/lib.linux-ppc64le-3.6/h5py/_hl/dataset.py:534: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use `array.size > 0` to check that an array is not empty.
    if args == (Ellipsis,) or args == tuple():

-- Docs: https://docs.pytest.org/en/latest/warnings.html
========================================================= 3 failed, 506 passed, 25 skipped, 3 xfailed, 1 warnings in 3.97 seconds =========================================================
1
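
The suggestion above to mark the float128-dependent tests as expected failures could be sketched roughly as follows. This is a minimal illustration using only the standard library and `skipIf` (the decorator already used in the quoted test) rather than a true xfail. The tuple `LONG_DOUBLE_UNRELIABLE` and the helper `long_double_unreliable` are hypothetical names, not part of h5py, and the test body is a placeholder.

```python
import platform
import unittest

# Hypothetical list of architectures on which long double handling is
# assumed to be unreliable; not taken from h5py itself.
LONG_DOUBLE_UNRELIABLE = ("ppc64le",)


def long_double_unreliable(machine=None):
    """Return True when the given (or current) machine is on the list."""
    if machine is None:
        machine = platform.machine()
    return machine in LONG_DOUBLE_UNRELIABLE


class TestLongDouble(unittest.TestCase):
    @unittest.skipIf(long_double_unreliable(),
                     "long double handling is unreliable on this platform")
    def test_long_double(self):
        # Placeholder body: the real test would create a dataset with
        # dtype=np.longdouble and compare the resulting dtype.
        self.assertTrue(True)
```

On a listed machine the test is reported as skipped; elsewhere it runs normally.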


takluyver · 2019-06-19T06:50:19Z

Thanks. That's a bit confusing. It seems that storing data as '>f8' gives a 128-bit float (second failure, on an attribute), but creating a dataset with np.longdouble gives a 64-bit float (third failure, and probably the first).

kif · 2019-06-19T16:31:04Z

I noticed the same... Some tests have an easy workaround and I am preparing a PR, but the float64 attribute that ends up as a long double hides something deeper.

kif · 2019-06-20T12:55:36Z

Hi Thomas,

I confirm that setting an attribute from an array of type "big-endian float64" triggers a bug.

import numpy, h5py
a=numpy.array([43], dtype=">f8")                                                                                                                          

In [5]: a                                                                                                                                                                            
Out[5]: array([43.])

In [6]: a.dtype                                                                                                                                                                      
Out[6]: dtype('>f8')

In [7]: f=h5py.File("test.h5", "a")                                                                                                                                                  

In [8]: f.attrs["test>f8"] = a[0]                                                                                                                                                    

In [9]: f.attrs["test>f8"]                                                                                                                                                           
Out[9]: 43.0

In [10]: f.attrs["test>f8"].dtype                                                                                                                                                    
Out[10]: dtype('float64')

In [11]: f.attrs["test>f8_1"]=a                                                                                                                                                      

In [12]: f.attrs["test>f8_1"]                                                                                                                                                        
Out[12]: array([-1.10439183e+232], dtype=float128)

I can provide a PR for the other two failing tests, but this one deserves further investigation.

aragilar · 2019-06-20T13:18:17Z

One thing I did notice is that the values for ppc64le are different from those for ppc64:

h5py/h5py/h5t.pyx

Lines 308 to 317 in 369d9cf

    
    if ftype_ == np.longdouble and MACHINE == 'ppc64':
        # values reported by hdf5
        nmant = 116
        maxexp = 1024
        minexp = -1022
    elif ftype_ == np.longdouble and MACHINE == 'ppc64le':
        # values reported by hdf5
        nmant = 52
        maxexp = 1024
        minexp = -1022

I'd expect the mantissa size to be the same. Could that be the issue?

@kif What format is np.float128 and np.longdouble (e.g. double-double, ieee quad)?

kif · 2019-06-20T13:56:52Z

Hi James,
I suspect that the bug comes from the difference between IBM POWER8 and POWER9, which are both ppc64le but implement different versions of the OpenPOWER ISA (2 and 3 respectively), so the C long double differs. I have asked IBM for confirmation and hope to get some guidance on the issue. The POWER9 is, together with Arm64, the first processor to actually implement IEEE 754 128-bit floats in hardware.

Unfortunately, numpy treats numpy.float128 as numpy.longdouble (in conformance with numpy's documentation), which reports:

In [1]: import numpy                                                                                                                                                                 

In [2]: numpy.finfo("float128")                                                                                                                                                      
Out[2]: finfo(resolution=1e-31, min=-1.79769313486231580793728971405301e+308, max=1.79769313486231580793728971405301e+308, dtype=float128)

In [3]: numpy.finfo("float128").nmant                                                                                                                                                
Out[3]: 105

In [4]: numpy.finfo("float128").nexp                                                                                                                                                 
Out[4]: 11
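
These values (105 mantissa bits, 11 exponent bits, and the same maximum as float64) are what one would expect from the IBM "double-double" format, in which a long double is stored as the unevaluated sum of two float64 values. Assuming that this is indeed the format numpy reports here, the idea can be sketched in pure Python; the function name `to_double_double` is made up for this illustration.

```python
from fractions import Fraction


def to_double_double(x):
    """Split an exact rational x into two float64 parts (hi, lo)."""
    hi = float(x)                 # nearest float64 to x
    lo = float(x - Fraction(hi))  # nearest float64 to what is left over
    return hi, lo


# 1 + 2**-80 needs 81 significant bits, too many for a single float64.
x = Fraction(1) + Fraction(1, 2**80)
hi, lo = to_double_double(x)
print(hi, lo)
print(Fraction(hi) + Fraction(lo) == x)
```

The pair extends the precision to roughly twice that of a float64, but the exponent range stays that of a float64, which matches the `nexp` of 11 shown above.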

terjeros · 2019-09-12T17:47:53Z

Tried to import 2.10.0 into Fedora rawhide and got:

ppc64le:
https://kojipkgs.fedoraproject.org//work/tasks/2755/37632755/build.log

=================================== FAILURES ===================================
_____________________________ TestTypes.test_float _____________________________
self = <h5py.tests.test_attrs_data.TestTypes testMethod=test_float>
    def test_float(self):
        """ Storage of floating point types """
        dtypes = tuple(np.dtype(x) for x in ('<f4', '>f4', '>f8', '<f8'))
    
        for dt in dtypes:
            data = np.ndarray((1,), dtype=dt)
            data[...] = 42.3
            self.f.attrs['x'] = data
            out = self.f.attrs['x']
            # TODO: Clean up after issue addressed !
            print("dtype: ", out.dtype, dt)
            print("value: ", out, data)
>           self.assertEqual(out.dtype, dt)
E           AssertionError: dtype('>f16') != dtype('>f8')
h5py/tests/test_attrs_data.py:119: AssertionError
----------------------------- Captured stdout call -----------------------------
dtype:  float32 float32
value:  [42.3] [42.3]
dtype:  >f4 >f4
value:  [42.3] [42.3]
dtype:  >f16 >f8
value:  [0.] [42.3]


i686:
https://kojipkgs.fedoraproject.org//work/tasks/2752/37632752/build.log

=================================== FAILURES ===================================
____________________ TestOffsets.test_float_round_tripping _____________________
self = <h5py.tests.test_dtype.TestOffsets testMethod=test_float_round_tripping>
    def test_float_round_tripping(self):
        dtypes = set(f for f in np.typeDict.values()
                     if (np.issubdtype(f, np.floating) or
                         np.issubdtype(f, np.complexfloating)))
    
        if platform.machine() in UNSUPPORTED_LONG_DOUBLE:
>           dtype_dset_map = {str(j): d
                              for j, d in enumerate(dtypes)
                              if d not in (np.float128, np.complex256)}
h5py/tests/test_dtype.py:296: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.0 = <enumerate object at 0xec364608>
    dtype_dset_map = {str(j): d
                      for j, d in enumerate(dtypes)
>                     if d not in (np.float128, np.complex256)}
E   AttributeError: module 'numpy' has no attribute 'float128'
h5py/tests/test_dtype.py:298: AttributeError
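
The i686 failure above comes from naming `np.float128` on a build where that attribute does not exist. One possible way to avoid it, sketched here as an assumption and not as the change actually made in h5py, is to collect only the extended types that the module defines. The helper `extended_float_types` is hypothetical, and a `SimpleNamespace` stands in for numpy so that the snippet runs without it.

```python
from types import SimpleNamespace


def extended_float_types(np_module):
    """Return the extended-precision types that np_module actually defines."""
    names = ("float128", "complex256")
    return tuple(getattr(np_module, n) for n in names if hasattr(np_module, n))


# Stand-in for a numpy build without float128, as in the i686 log.
fake_np = SimpleNamespace(float64=float)
print(extended_float_types(fake_np))
```

With the real module, the filter in the test would then read `if d not in extended_float_types(np)`.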

x86_64, arm32, aarch64 and s390x seem fine:
https://koji.fedoraproject.org/koji/taskinfo?taskID=37632750

tacaswell · 2019-09-12T22:48:05Z

On ppc64le we are upcasting 64 bit big endian floats to 128bit big endian floats and then making the value 0?

It looks like we are seeing similar failures on conda-forge trying to build the 2.9 packages on ppc64le

https://travis-ci.org/conda-forge/h5py-feedstock/builds/577406044?utm_source=github_status&utm_medium=notification

https://travis-ci.org/conda-forge/h5py-feedstock/jobs/577406047

tacaswell · 2019-09-12T23:46:44Z

We can reproduce the ppc64le failures on travis! https://travis-ci.org/h5py/h5py/jobs/584366489#L11992-L12013

It fails a little bit differently:

_____________________________ TestTypes.test_float _____________________________

self = <h5py.tests.test_attrs_data.TestTypes testMethod=test_float>

    def test_float(self):
        """ Storage of floating point types """
        dtypes = tuple(np.dtype(x) for x in ('<f4', '>f4', '>f8', '<f8'))
    
        for dt in dtypes:
            data = np.ndarray((1,), dtype=dt)
            data[...] = 42.3
            self.f.attrs['x'] = data
            out = self.f.attrs['x']
            # TODO: Clean up after issue addressed !
            print("dtype: ", out.dtype, dt)
            print("value: ", out, data)
>           self.assertEqual(out.dtype, dt)
E           AssertionError: dtype('>f16') != dtype('>f8')

py37-test-deps/lib/python3.7/site-packages/h5py/tests/test_attrs_data.py:119: AssertionError
----------------------------- Captured stdout call -----------------------------
dtype:  float32 float32
value:  [42.3] [42.3]
dtype:  >f4 >f4
value:  [42.3] [42.3]
dtype:  >f16 >f8
value:  [0.000000 4] [42.3]

tacaswell · 2019-09-12T23:54:47Z

I am pretty sure the issue is in

h5py/h5py/h5t.pyx

Lines 303 to 322 in fd0753a

    
    cdef (int, int, int) _correct_float_info(ftype_, finfo):
        nmant = finfo.nmant
        maxexp = finfo.maxexp
        minexp = finfo.minexp
        # workaround for numpy's buggy finfo on float128 on ppc64 archs
        if ftype_ == np.longdouble and MACHINE == 'ppc64':
            # values reported by hdf5
            nmant = 116
            maxexp = 1024
            minexp = -1022
        elif ftype_ == np.longdouble and MACHINE == 'ppc64le':
            # values reported by hdf5
            nmant = 52
            maxexp = 1024
            minexp = -1022
        elif nmant == 63 and finfo.nexp == 15:
            # This is an 80-bit float, correct mantissa size
            nmant += 1
        return nmant, maxexp, minexp

h5py/h5py/h5t.pyx

Lines 1369 to 1397 in fd0753a

    
    def _get_float_dtype_to_hdf5():
        float_le = {}
        float_be = {}
        h5_be_list = [IEEE_F16BE, IEEE_F32BE, IEEE_F64BE, IEEE_F128BE,
                      LDOUBLE_BE]
        h5_le_list = [IEEE_F16LE, IEEE_F32LE, IEEE_F64LE, IEEE_F128LE]
        if MACHINE != 'ppc64le':
            h5_le_list.append(LDOUBLE_LE)
        for ftype_, finfo, size in _available_ftypes:
            nmant, maxexp, minexp = _correct_float_info(ftype_, finfo)
            for h5type in h5_be_list:
                spos, epos, esize, mpos, msize = h5type.get_fields()
                ebias = h5type.get_ebias()
                if (finfo.iexp == esize and nmant == msize and
                    (maxexp - 1) == ebias
                ):
                    float_be[ftype_] = h5type
            for h5type in h5_le_list:
                spos, epos, esize, mpos, msize = h5type.get_fields()
                ebias = h5type.get_ebias()
                if (finfo.iexp == esize and nmant == msize and
                    (maxexp - 1) == ebias
                ):
                    float_le[ftype_] = h5type
        if ORDER_NATIVE == H5T_ORDER_LE:
            float_nt = dict(float_le)
        else:
            float_nt = dict(float_be)
        return float_le, float_be, float_nt

or

h5py/h5py/h5t.pyx

Lines 1050 to 1069 in fd0753a

    
    cdef object py_dtype(self):
        # Translation function for floating-point types
        order = _order_map[self.get_order()]    # string with '<' or '>'
        s_offset, e_offset, e_size, m_offset, m_size = self.get_fields()
        e_bias = self.get_ebias()
        # Handle non-standard exponent and mantissa sizes.
        for ftype_, finfo, size in _available_ftypes:
            nmant, maxexp, minexp = _correct_float_info(ftype_, finfo)
            if (size >= self.get_size() and m_size <= nmant and
                (2**e_size - e_bias - 1) <= maxexp and (1 - e_bias) >= minexp):
                new_dtype = np.dtype(ftype_).newbyteorder(order)
                break
        else:
            raise ValueError('Insufficient precision in available types to ' +
                             'represent ' + str(self.get_fields()))
        return new_dtype

We now have CI via Travis to test this, but if someone has a ppc64le machine they can debug on, it would likely go much faster.
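
To make the reading side easier to reason about, the selection loop in `py_dtype` quoted above can be imitated in pure Python. This is only a sketch: the `NumpyFloat` tuple and `pick_numpy_type` are hypothetical names, the candidate list is assumed to be ordered by size, and the field values used for the 16-byte stored type (11 exponent bits, 52 mantissa bits, bias 1023) are an assumption based on the "values reported by hdf5" comment for ppc64le.

```python
from collections import namedtuple

# Hypothetical stand-in for one entry of _available_ftypes after
# _correct_float_info has been applied (longdouble values as for ppc64le).
NumpyFloat = namedtuple("NumpyFloat", "name size nmant maxexp minexp")

AVAILABLE = [
    NumpyFloat("float16", 2, 10, 16, -14),
    NumpyFloat("float32", 4, 23, 128, -126),
    NumpyFloat("float64", 8, 52, 1024, -1022),
    NumpyFloat("longdouble", 16, 52, 1024, -1022),
]


def pick_numpy_type(stored_size, e_size, m_size, e_bias):
    """Return the name of the first candidate able to hold the stored float."""
    for cand in AVAILABLE:
        if (cand.size >= stored_size and m_size <= cand.nmant and
                (2**e_size - e_bias - 1) <= cand.maxexp and
                (1 - e_bias) >= cand.minexp):
            return cand.name
    raise ValueError("Insufficient precision in available types")


print(pick_numpy_type(8, 11, 52, 1023))   # an 8-byte IEEE double on disk
print(pick_numpy_type(16, 11, 52, 1023))  # the same layout stored in 16 bytes
```

Under these assumptions, a value stored with a 16-byte type can only come back as the long double, because float64 fails the size check. That would be consistent with the `>f16` dtype seen in the failing test if the type written to the file was already wrong.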

kif · 2019-09-19T09:58:08Z

I just transferred the buggy file created as described above to other machines and found that the error is more likely in the writing than in the reading.
The faulty attribute cannot be read on amd64 because of the missing numpy.float128. On arm64, which supports float128, the value is read as ">f16" with a value of zero.

Issue: ">f8" was written as ">f16" on ppc64le. The dict mapping numpy types to HDF5 types was wrong. The fix is to stop iterating as soon as the first match is found, so that "f16" no longer replaces "f8".
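
The effect of stopping at the first match can be illustrated with a small pure-Python model of the big-endian mapping loop. It is a sketch and not the actual patch: `H5Float` and `map_float` are hypothetical names, and the `LDOUBLE_BE` field values are an assumption based on the "values reported by hdf5" comment for ppc64le (52-bit mantissa in a 16-byte type).

```python
from collections import namedtuple

H5Float = namedtuple("H5Float", "name size esize msize ebias")

# Illustrative field values; the LDOUBLE_BE entry is an assumption.
H5_BE_LIST = [
    H5Float("IEEE_F16BE", 2, 5, 10, 15),
    H5Float("IEEE_F32BE", 4, 8, 23, 127),
    H5Float("IEEE_F64BE", 8, 11, 52, 1023),
    H5Float("IEEE_F128BE", 16, 15, 112, 16383),
    H5Float("LDOUBLE_BE", 16, 11, 52, 1023),
]


def map_float(iexp, nmant, maxexp, stop_at_first_match):
    """Return the HDF5 type chosen for a numpy float with the given layout."""
    chosen = None
    for h5type in H5_BE_LIST:
        if (iexp == h5type.esize and nmant == h5type.msize and
                (maxexp - 1) == h5type.ebias):
            chosen = h5type
            if stop_at_first_match:
                break
    return chosen


# float64: 11 exponent bits, 52 mantissa bits, maxexp 1024
print(map_float(11, 52, 1024, stop_at_first_match=False).name)  # LDOUBLE_BE
print(map_float(11, 52, 1024, stop_at_first_match=True).name)   # IEEE_F64BE
```

Without the early exit, the later 16-byte entry overwrites the 8-byte one because both have the same exponent and mantissa layout. The little-endian list is not affected in the quoted code because `LDOUBLE_LE` is left out on ppc64le.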

kif mentioned this issue Jun 20, 2019

skip tests that use float128 on ppc64le ... #1246

Merged

tacaswell mentioned this issue Sep 12, 2019

TST: try adding ppc64le #1304

Closed

tacaswell added this to To do in h5py code camp Sep 18, 2019

tacaswell moved this from To do to In progress in h5py code camp Sep 19, 2019

kif pushed a commit to kif/h5py that referenced this issue Sep 19, 2019

close h5py#1244

79104b3


kif mentioned this issue Sep 19, 2019

1244 ppc64le tests #1319

Merged

tacaswell moved this from In progress to Done in h5py code camp Sep 19, 2019

takluyver closed this as completed in #1319 Sep 19, 2019
