Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_cython_fortran_utils.py:test_read fails on s390x (64 bit, big endian) #4569

Closed
olebole opened this issue Jul 7, 2023 · 13 comments · Fixed by #4577
Closed

test_cython_fortran_utils.py:test_read fails on s390x (64 bit, big endian) #4569

olebole opened this issue Jul 7, 2023 · 13 comments · Fixed by #4577
Assignees
Labels
bug tests: running tests Issues with the test setup
Milestone

Comments

@olebole
Copy link
Contributor

olebole commented Jul 7, 2023

Bug report

Bug summary

During the build for the Debian package of version 4.2.1 on our s390x platform, one the yt/utilities/tests/test_cython_fortran_utils.py::test_read test failed. s390x is Debian's only official 64bit/bigendian platform; however the unofficial 64-bit big endian PowerPC port shows the same failure. 64-bit little endian platforms build fine.

Actual outcome

__________________________________ test_read ___________________________________

tmp_path = PosixPath('/tmp/pytest-of-buildd/pytest-0/test_read0')

    def test_read(tmp_path):
        dummy_file = tmp_path / "test.bin"
        # Write a Fortran-formatted file containing one record with 4 doubles
        dummy_file.write_bytes(
            b" \x00\x00\x00\x00\x00\x00\x00\x00\x00\xf0?\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x08@\x00\x00\x00\x00\x00\x00\x10@ \x00\x00\x00"
        )
        with FortranFile(str(dummy_file)) as f:
            np.testing.assert_equal(
>               f.read_vector("d"),
                [1.0, 2.0, 3.0, 4.0],
            )

yt/utilities/tests/test_cython_fortran_utils.py:20: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
yt/utilities/cython_fortran_utils.pyx:101: in yt.utilities.cython_fortran_utils.FortranFile.read_vector
    cpdef np.ndarray read_vector(self, str dtype):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   raise IOError('Sizes do not agree in the header and footer for '
E   OSError: Sizes do not agree in the header and footer for this record - check header dtype

yt/utilities/cython_fortran_utils.pyx:140: OSError

Full build log here

Version Information

  • Operating System: Debian unstable, 64 bit, big endian
  • Python Version: 3.11.4
  • yt version: 3.21.1
  • Cython: 0.29.32

All packages installed from Debian unstable.

@neutrinoceros
Copy link
Member

Hi @olebole, thanks for reporting.
Is this the very first version for which a build targeting 64bit + bigendian platforms is attempted, or is this a regression ?

@olebole
Copy link
Contributor Author

olebole commented Jul 8, 2023

We always build a s390x package for yt, and this is the first version which shows this problem. The last version on Debian is 4.1.4, and this was passing the tests.

@neutrinoceros
Copy link
Member

neutrinoceros commented Jul 8, 2023

Good to know. I had a look at the problematic function but it’s not clear to me why it’s failing now. Any chance you could git bisect this ?

@neutrinoceros
Copy link
Member

I note that the failing test was added in yt 4.2.0 (see #4438). This is probably why it didn't fail before 🙃

@neutrinoceros
Copy link
Member

I think what is happening in that on this line
https://github.com/yt-project/yt/blob/feb3107ca39fc51aab2010cc66f1d376c0361885/yt/utilities/cython_fortran_utils.pyx#L128C1-L128C1

we are reading a little-endian 32 (s1) as a big-endian 67108864, so the rest of the function is reading way beyond the actual file and we end up interpreting a random bitset as s2, which in general will not match our s1.
If my interpretation is correct, the solution would be to explicitly use little-endian int32 variables here, though I don't know how to do that just yet.

@neutrinoceros
Copy link
Member

I may have a patch, can you try this ?

diff --git a/yt/utilities/cython_fortran_utils.pxd b/yt/utilities/cython_fortran_utils.pxd
index ab6c2a0dc..405bb87f3 100644
--- a/yt/utilities/cython_fortran_utils.pxd
+++ b/yt/utilities/cython_fortran_utils.pxd
@@ -5,6 +5,10 @@ ctypedef np.int32_t INT32_t
 ctypedef np.int64_t INT64_t
 ctypedef np.float64_t DOUBLE_t
 
+import sys
+cdef int _IS_BIG_ENDIAN = sys.byteorder == "big"
+
+
 cdef class FortranFile:
     cdef FILE* cfile
     cdef bint _closed
diff --git a/yt/utilities/cython_fortran_utils.pyx b/yt/utilities/cython_fortran_utils.pyx
index aa26add5c..36acaa460 100644
--- a/yt/utilities/cython_fortran_utils.pyx
+++ b/yt/utilities/cython_fortran_utils.pyx
@@ -117,15 +117,17 @@ cdef class FortranFile:
         >>> rv = f.read_vector("d")  # Read a float64 array
         >>> rv = f.read_vector("i")  # Read an int32 array
         """
-        cdef INT32_t s1, s2, size
-        cdef np.ndarray data
+        cdef INT32_t size
+        cdef np.ndarray s1, s2, data
 
         if self._closed:
             raise ValueError("I/O operation on closed file.")
 
         size = self.get_size(dtype)
 
-        fread(&s1, INT32_SIZE, 1, self.cfile)
+        s1 = np.empty(1, dtype="int32")
+        fread(<void *>s1.data, INT32_SIZE, 1, self.cfile)
+        if _IS_BIG_ENDIAN: s1.byteswap(inplace=True)
 
         # Check record is compatible with data type
         if s1 % size != 0:
@@ -134,7 +136,11 @@ cdef class FortranFile:
 
         data = np.empty(s1 // size, dtype=dtype)
         fread(<void *>data.data, size, s1 // size, self.cfile)
-        fread(&s2, INT32_SIZE, 1, self.cfile)
+        if _IS_BIG_ENDIAN: data.byteswap(inplace=True)
+
+        s2 = np.empty(1, dtype="int32")
+        fread(<void *>s2.data, INT32_SIZE, 1, self.cfile)
+        if _IS_BIG_ENDIAN: s2.byteswap(inplace=True)
 
         if s1 != s2:
             raise IOError('Sizes do not agree in the header and footer for '

This is still within the assumption that I actually understood the problem...
I further note that there are a couple other places in the module where byte swapping would be needed to actually support big-endian architectures. It also may not be the most optimised way to do it (I'm allocating full numpy arrays instead of native scalars, just for the sake of not implementing byte-swapping myself), but I'm hoping that it works at least as a proof-of-concept :)

@matthewturk
Copy link
Member

In general I think as long as we're using a python-readable object as a destination, we can specify the destination as being big/little endian explicitly. So I believe that if you were to read it into the numpy array if the numpy array is set explicitly as <i4 or >i4 it should work? And I know that the Python struct module also accepts endianness arguments.

@neutrinoceros
Copy link
Member

True, but isn’t struct.read infamously slow ?

@matthewturk
Copy link
Member

It may be -- but I think we'd need to compare that against spot-allocating numpy arrays. Although I will say, perhaps what would be a more workable solution would be a single allocated numpy buffer of the largest single-type we expect, reading into it as a destination, then accessing a .view on it to manage endianness?

@cphyc cphyc self-assigned this Jul 9, 2023
@cphyc
Copy link
Member

cphyc commented Jul 9, 2023

I think the correct bugfix is in the test itself. Namely, the test writes in binary format some content

dummy_file.write_bytes(
b" \x00\x00\x00\x00\x00\x00\x00\x00\x00\xf0?\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x08@\x00\x00\x00\x00\x00\x00\x10@ \x00\x00\x00"
)
, but the bytestring was generated on a little-endian computer and is thus incorrectly written to the file that's being tested.

I think the best solution is the following patch

diff --git a/yt/utilities/tests/test_cython_fortran_utils.py b/yt/utilities/tests/test_cython_fortran_utils.py
index 023bf3652..add79c46b 100644
--- a/yt/utilities/tests/test_cython_fortran_utils.py
+++ b/yt/utilities/tests/test_cython_fortran_utils.py
@@ -1,3 +1,4 @@
+import struct
 import numpy as np
 import pytest
 
@@ -12,9 +13,12 @@ def test_raise_error_when_file_does_not_exist():
 def test_read(tmp_path):
     dummy_file = tmp_path / "test.bin"
     # Write a Fortran-formatted file containing one record with 4 doubles
-    dummy_file.write_bytes(
-        b" \x00\x00\x00\x00\x00\x00\x00\x00\x00\xf0?\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x00\x08@\x00\x00\x00\x00\x00\x00\x10@ \x00\x00\x00"
-    )
+    # The format is a 32bit integer with value 4*sizeof(double)=32
+    # followed by 4 doubles and another 32bit integer with value 32
+    # Note that there is no memory alignment, hence the "=" below
+    buff = struct.pack("=i 4d i", 32, 1., 2., 3., 4., 32)
+    dummy_file.write_bytes(buff)
+
     with FortranFile(str(dummy_file)) as f:
         np.testing.assert_equal(
             f.read_vector("d"),

@neutrinoceros
Copy link
Member

Ah ! I worked under the assumption that Fortran files were always written in little endian, but if not, I approve @cphyc's patch !

@neutrinoceros neutrinoceros added this to the 4.2.2 milestone Jul 15, 2023
@neutrinoceros
Copy link
Member

@olebole are you able to test @cphyc's patch ?

@olebole
Copy link
Contributor Author

olebole commented Jul 15, 2023

Yes; I uploaded this as 4.2.1-3, and the test succeeds on s390 (full build log).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug tests: running tests Issues with the test setup
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants