test_compound_vlen_bool intermittently fails on armhf architecture #1927

Open
drew-parsons opened this issue Jul 8, 2021 · 8 comments


@drew-parsons
Contributor

A recent Debian build of h5py 3.3.0 on armhf failed test_compound_vlen_bool at build time with a fatal Python bus error. The full build log is available here. The same test failure happened previously with h5py 3.2.1, but a later build succeeded, so the problem seems to be intermittent. It possibly can't be solved in h5py, but I'm reporting it in case something can be done. I'll restart the build and we'll see whether it succeeds on armhf next time. Historical build logs on armhf are listed here.

build/h5py/_debian_h5py_serial/tests/test_dtype.py::TestVlen::test_compound PASSED [ 53%]
build/h5py/_debian_h5py_serial/tests/test_dtype.py::TestVlen::test_compound_vlen_bool Fatal Python error: Bus error

Current thread 0xf7cb6310 (most recent call first):
  File "/<<PKGBUILDDIR>>/.pybuild/cpython3_3.9_dbg_h5py_serial/build/h5py/_debian_h5py_serial/_hl/dataset.py", line 920 in __setitem__
  File "/<<PKGBUILDDIR>>/.pybuild/cpython3_3.9_dbg_h5py_serial/build/h5py/_debian_h5py_serial/tests/test_dtype.py", line 59 in test_compound_vlen_bool
  File "/usr/lib/python3.9/unittest/case.py", line 550 in _callTestMethod
  File "/usr/lib/python3.9/unittest/case.py", line 593 in run
  File "/usr/lib/python3.9/unittest/case.py", line 653 in __call__
  File "/usr/lib/python3/dist-packages/_pytest/unittest.py", line 278 in runtest
  File "/usr/lib/python3/dist-packages/_pytest/runner.py", line 153 in pytest_runtest_call
  File "/usr/lib/python3/dist-packages/pluggy/callers.py", line 187 in _multicall
  File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 83 in <lambda>
  File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 92 in _hookexec
  File "/usr/lib/python3/dist-packages/pluggy/hooks.py", line 286 in __call__
  File "/usr/lib/python3/dist-packages/_pytest/runner.py", line 247 in <lambda>
  File "/usr/lib/python3/dist-packages/_pytest/runner.py", line 294 in from_call
  File "/usr/lib/python3/dist-packages/_pytest/runner.py", line 246 in call_runtest_hook
  File "/usr/lib/python3/dist-packages/_pytest/runner.py", line 207 in call_and_report
  File "/usr/lib/python3/dist-packages/_pytest/runner.py", line 117 in runtestprotocol
  File "/usr/lib/python3/dist-packages/_pytest/runner.py", line 100 in pytest_runtest_protocol
  File "/usr/lib/python3/dist-packages/pluggy/callers.py", line 187 in _multicall
  File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 83 in <lambda>
  File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 92 in _hookexec
  File "/usr/lib/python3/dist-packages/pluggy/hooks.py", line 286 in __call__
  File "/usr/lib/python3/dist-packages/_pytest/main.py", line 321 in pytest_runtestloop
  File "/usr/lib/python3/dist-packages/pluggy/callers.py", line 187 in _multicall
  File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 83 in <lambda>
  File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 92 in _hookexec
  File "/usr/lib/python3/dist-packages/pluggy/hooks.py", line 286 in __call__
  File "/usr/lib/python3/dist-packages/_pytest/main.py", line 296 in _main
  File "/usr/lib/python3/dist-packages/_pytest/main.py", line 240 in wrap_session
  File "/usr/lib/python3/dist-packages/_pytest/main.py", line 289 in pytest_cmdline_main
  File "/usr/lib/python3/dist-packages/pluggy/callers.py", line 187 in _multicall
  File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 83 in <lambda>
  File "/usr/lib/python3/dist-packages/pluggy/manager.py", line 92 in _hookexec
  File "/usr/lib/python3/dist-packages/pluggy/hooks.py", line 286 in __call__
  File "/usr/lib/python3/dist-packages/_pytest/config/__init__.py", line 157 in main
  File "/usr/lib/python3/dist-packages/_pytest/config/__init__.py", line 180 in console_main
  File "/usr/lib/python3/dist-packages/pytest/__main__.py", line 7 in <module>
  File "/usr/lib/python3.9/runpy.py", line 87 in _run_code
  File "/usr/lib/python3.9/runpy.py", line 197 in _run_module_as_main
make[1]: *** [debian/rules:82: override_dh_auto_test-arch] Error 249

Tests pass on other architectures, see https://buildd.debian.org/status/package.php?p=h5py&suite=experimental

  • Operating System: Debian GNU/Linux (debian unstable)
  • Python version: 3.9 (3.9.2-1)
  • Where Python was acquired: apt-get install python3.9-dev
  • h5py version: 3.3.0
  • HDF5 version: 1.10.6
@drew-parsons drew-parsons changed the title test_compound_vlen_bool intermittently fails on armhf architectire test_compound_vlen_bool intermittently fails on armhf architecture Jul 8, 2021
@drew-parsons
Contributor Author

armhf did pass the test and completed the build successfully the second time round (build log).

@mwhudson

mwhudson commented Nov 8, 2021

This will be an unaligned access somewhere. The build passes on hoiby, which is a genuinely 32-bit ARM machine (a Marvell MV78460), and fails on arm-arm-01, which has a 64-bit CPU (an AMD Seattle); unaligned accesses raise a bus error when a 32-bit binary runs on a 64-bit kernel. Ubuntu only has the latter sort of builder, so I have to fix it :-)
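
To illustrate how a compound record can place a variable-length descriptor at an unaligned address, here is a minimal sketch using Python's ctypes. It assumes, purely for illustration, that each record is packed as a descriptor (a length plus a pointer) followed by a one-byte bool with no padding; this layout is an assumption, not taken from the h5py source. The names `HvlLike`, `RECORD_SIZE`, `descriptor_offset` and `is_aligned` are hypothetical.

```python
import ctypes


class HvlLike(ctypes.Structure):
    # Hypothetical stand-in for a vlen descriptor: a length and a data pointer.
    _fields_ = [("len", ctypes.c_size_t), ("ptr", ctypes.c_void_p)]


# Assumption: records are packed, i.e. the descriptor is followed directly
# by a 1-byte bool and the next record starts immediately after it.
RECORD_SIZE = ctypes.sizeof(HvlLike) + ctypes.sizeof(ctypes.c_bool)

# Alignment the platform's C ABI expects for the descriptor.
REQUIRED_ALIGNMENT = ctypes.alignment(HvlLike)


def descriptor_offset(index):
    """Byte offset of the descriptor of record `index` in a packed array."""
    return index * RECORD_SIZE


def is_aligned(offset):
    """True if `offset` is a multiple of the descriptor's required alignment."""
    return offset % REQUIRED_ALIGNMENT == 0


offsets = [descriptor_offset(i) for i in range(4)]
aligned = [is_aligned(o) for o in offsets]
```

Under this assumed layout only the first record's descriptor is aligned relative to the start of the buffer, so a direct field load or store on any later record would be an unaligned access.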

@drew-parsons
Contributor Author

drew-parsons commented Nov 8, 2021

We had some analysis of alignment issues in mpi4py. It might not help here (it affected s390x, not armhf), but the discussion was at mpi4py/mpi4py#91

@mwhudson

mwhudson commented Nov 8, 2021

The accesses causing the immediate failure are here: https://github.com/h5py/h5py/blob/master/h5py/_conv.pyx#L844-L845. I'm testing to see whether the test suite turns up any more now...

@mwhudson

mwhudson commented Nov 9, 2021

I uploaded this patch to Ubuntu fwiw:

--- a/h5py/_conv.pyx
+++ b/h5py/_conv.pyx
@@ -841,8 +841,8 @@
 
     H5Tconvert(intype.id, outtype.id, len, data, NULL, H5P_DEFAULT)
 
-    in_vlen[0].len = len
-    in_vlen[0].ptr = data
+    memcpy(&in_vlen[0].len, &len, sizeof(size_t))
+    memcpy(&in_vlen[0].ptr, &data, sizeof(void*))
 
     return 0
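
As a rough Python analogue of what the patch does (the real fix is in Cython/C), the sketch below copies a length/pointer pair to and from an odd byte offset using `ctypes.memmove` rather than accessing the fields in place. `HvlLike`, `write_unaligned` and `read_unaligned` are hypothetical names introduced only for this illustration, and the pointer value is an arbitrary number that is never dereferenced.

```python
import ctypes


class HvlLike(ctypes.Structure):
    # Hypothetical stand-in for a vlen descriptor: a length and a data pointer.
    _fields_ = [("len", ctypes.c_size_t), ("ptr", ctypes.c_void_p)]


STRUCT_SIZE = ctypes.sizeof(HvlLike)


def write_unaligned(buf, offset, length, pointer):
    """Store a descriptor at buf[offset:] by copying bytes, not by field access."""
    src = HvlLike(length, pointer)  # local, properly aligned copy
    dest = (ctypes.c_char * STRUCT_SIZE).from_buffer(buf, offset)
    ctypes.memmove(dest, ctypes.byref(src), STRUCT_SIZE)


def read_unaligned(buf, offset):
    """Load a descriptor from buf[offset:] into an aligned local struct."""
    dst = HvlLike()
    src = (ctypes.c_char * STRUCT_SIZE).from_buffer(buf, offset)
    ctypes.memmove(ctypes.byref(dst), src, STRUCT_SIZE)
    return dst.len, dst.ptr


# One leading byte forces the descriptor onto an odd offset within the buffer.
buf = bytearray(1 + STRUCT_SIZE)
write_unaligned(buf, 1, 3, 0x1000)
result = read_unaligned(buf, 1)
```

The point of the byte-wise copy is that it makes no assumption about the destination's alignment, whereas a direct field store lets the compiler emit instructions that require it.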
 

@drew-parsons
Contributor Author

drew-parsons commented Jan 19, 2024

There is more discussion of the problem at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1061063

In some cases there is a guard against alignment failure; we can't reproduce the reported error in Debian armhf environments.

We've already applied the patch, https://salsa.debian.org/science-team/h5py/-/blob/master/debian/patches/fix-unaligned-access.patch?ref_type=heads
Does it need to be extended to other parts of the code for h5py 3.10?

Here we fixed it for __setitem__. The new report references __getitem__:

build/h5py/_debian_h5py_serial/tests/test_dtype.py::TestVlen::test_compound_vlen_bool 
Fatal Python error: Bus error

Current thread 0xf7959020 (most recent call first):
  File "/<<PKGBUILDDIR>>/.pybuild/cpython3_3.11_h5py_serial/build/h5py/_debian_h5py_serial/_hl/dataset.py", line 841 in __getitem__
  File "/<<PKGBUILDDIR>>/.pybuild/cpython3_3.11_h5py_serial/build/h5py/_debian_h5py_serial/tests/test_dtype.py", line 60 in test_compound_vlen_bool
  File "/usr/lib/python3.11/unittest/case.py", line 579 in _callTestMethod

@drew-parsons
Contributor Author

Line 920 from v3.3.0, self.id.write(mspace, fspace, val, mtype, dxpl=self._dxpl) (fixed by the patch), would now be line 999:

self.id.write(mspace, fspace, val, mtype, dxpl=self._dxpl)

Line 841 in the new report is:

self.id.read(mspace, fspace, arr, mtype, dxpl=self._dxpl)

@drew-parsons
Contributor Author

drew-parsons commented Jan 19, 2024

Would this be the problem here?

size = in_vlen0.len = in_vlen[0].len

Commits 64edf49 and dc9f942 from PR #1406 removed all uses of memcpy. Should they all be reinstated? Or at least the ones in _conv.pyx at lines 709 and 710?
