mpi4py returns a BlockingIOError/OSError: Unable to create file #36

Open
jpbreuer opened this issue May 2, 2022 · 2 comments

jpbreuer commented May 2, 2022

  • BXA version: 4.0.5
  • UltraNest version: 3.4.4
  • Python version: 3.9
  • Xspec or Sherpa and version: Xspec 12.11.1
  • Operating System: Debian GNU/Linux 11 (bullseye)

Description

While attempting to parallelize BXA with MPI, the HDF5 file is created but locked. After following the recommendation in a previous (closed) BXA issue thread here and reinstalling all dependencies, the problem persists, but with a new error.

I read many forum threads about these errors, and they recommend reinstalling the dependencies. It seems as though the HDF5 file is corrupted while it is being created.

What I Did

Old error:

Traceback (most recent call last):
  File "/home/jpbreuer/Scripts/bxa_test.py", line 373, in <module>
    results = solver.run(resume=True)
  File "/home/jpbreuer/.local/lib/python3.9/site-packages/bxa/xspec/solver.py", line 188, in run
    self.results = solve(
  File "/home/jpbreuer/.local/lib/python3.9/site-packages/ultranest/solvecompat.py", line 55, in pymultinest_solve_compat
    sampler = ReactiveNestedSampler(
  File "/home/jpbreuer/.local/lib/python3.9/site-packages/ultranest/integrator.py", line 1077, in __init__
    self.pointstore = HDF5PointStore(storage_filename, storage_num_cols, mode='a' if resume else 'w')
  File "/home/jpbreuer/.local/lib/python3.9/site-packages/ultranest/store.py", line 187, in __init__
    self.fileobj = h5py.File(filepath, **h5_file_args)
  File "/home/jpbreuer/.local/lib/python3.9/site-packages/h5py/_hl/files.py", line 507, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
  File "/home/jpbreuer/.local/lib/python3.9/site-packages/h5py/_hl/files.py", line 232, in make_fid
    fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 106, in h5py.h5f.open
BlockingIOError: [Errno 11] Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')

Updated error:

Traceback (most recent call last):
  File "/home/jpbreuer/Scripts/bxa_test.py", line 128, in <module>
    results = solver.run(resume=True)
  File "/usr/local/lib/python3.9/dist-packages/bxa/xspec/solver.py", line 188, in run
    self.results = solve(
  File "/usr/local/lib/python3.9/dist-packages/ultranest/solvecompat.py", line 55, in pymultinest_solve_compat
    sampler = ReactiveNestedSampler(
  File "/usr/local/lib/python3.9/dist-packages/ultranest/integrator.py", line 1077, in __init__
    self.pointstore = HDF5PointStore(storage_filename, storage_num_cols, mode='a' if resume else 'w')
  File "/usr/local/lib/python3.9/dist-packages/ultranest/store.py", line 187, in __init__
    self.fileobj = h5py.File(filepath, **h5_file_args)
  File "/usr/lib/python3/dist-packages/h5py/_debian_h5py_serial/_hl/files.py", line 387, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "/usr/lib/python3/dist-packages/h5py/_debian_h5py_serial/_hl/files.py", line 187, in make_fid
    fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
  File "h5py/_debian_h5py_serial/_objects.pyx", line 54, in h5py._debian_h5py_serial._objects.with_phil.wrapper
  File "h5py/_debian_h5py_serial/_objects.pyx", line 55, in h5py._debian_h5py_serial._objects.with_phil.wrapper
  File "h5py/_debian_h5py_serial/h5f.pyx", line 108, in h5py._debian_h5py_serial.h5f.create
OSError: Unable to create file (unable to open file: name = 'bxatest/results/points.hdf5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/h5py/_debian_h5py_serial/_hl/files.py", line 185, in make_fid
    fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
  File "h5py/_debian_h5py_serial/_objects.pyx", line 54, in h5py._debian_h5py_serial._objects.with_phil.wrapper
  File "h5py/_debian_h5py_serial/_objects.pyx", line 55, in h5py._debian_h5py_serial._objects.with_phil.wrapper
  File "h5py/_debian_h5py_serial/h5f.pyx", line 88, in h5py._debian_h5py_serial.h5f.open
OSError: Unable to open file (truncated file: eof = 96, sblock->base_addr = 0, stored_eof = 2048)

JohannesBuchner (Owner) commented

Double-check that you can import mpi4py in your python/sherpa script.

https://johannesbuchner.github.io/UltraNest/debugging.html#Parallelisation-issues
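
The import check suggested above can be sketched as a small standalone script. This is only an illustration, not part of BXA or UltraNest: the function name `mpi_rank_and_size` and the file name `check_mpi.py` are made up for this example, and it assumes the same Python interpreter is used as for the BXA script.

```python
import importlib


def mpi_rank_and_size(module_name="mpi4py.MPI"):
    """Return (rank, size) of this process, or None if the MPI module cannot be imported."""
    try:
        mpi = importlib.import_module(module_name)
    except ImportError:
        # mpi4py is missing or cannot load its MPI library in this interpreter
        return None
    comm = mpi.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


if __name__ == "__main__":
    status = mpi_rank_and_size()
    if status is None:
        print("mpi4py could not be imported in this interpreter")
    else:
        print("rank %d of %d" % status)
```

Run under the same launcher as the real job, for example `mpiexec -n 4 python3 check_mpi.py`. With a working setup each process should report a different rank out of 4. If the import fails, or every process reports rank 0 of 1, the processes are probably running as independent serial jobs, which would be consistent with several of them contending for the same points.hdf5.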

JohannesBuchner (Owner) commented

and delete bxatest/results/points.hdf5
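
Deleting the file can also be done from Python before the MPI job is launched, so that only one process touches it. A minimal sketch, assuming the path from the traceback above; the helper name `remove_stale_pointstore` is hypothetical, and any sampling progress stored in the file is lost.

```python
from pathlib import Path


def remove_stale_pointstore(path):
    """Delete the file at `path` if present; return True if something was removed."""
    target = Path(path)
    if target.is_file():
        target.unlink()
        return True
    return False


if __name__ == "__main__":
    # Path taken from the error message in this thread
    removed = remove_stale_pointstore("bxatest/results/points.hdf5")
    print("removed" if removed else "nothing to remove")
```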
