Interrupting an h5netcdf process "corrupts" netCDF file #98

Closed
groutr opened this issue Apr 20, 2021 · 9 comments

@groutr
Contributor

groutr commented Apr 20, 2021

If a process using h5netcdf is interrupted, the netCDF file becomes "corrupted". Subsequent reads by h5netcdf produce the following exception when the file is closed.

Traceback (most recent call last):
  File "waterlevel.py", line 139, in <module>
    read_waterlevel(args.fort63, args.pli, args.output)
  File "waterlevel.py", line 125, in read_waterlevel
    print("ehllo world")
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5netcdf\core.py", line 888, in __exit__
    self.close()
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5netcdf\core.py", line 878, in close
    self.flush()
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5netcdf\core.py", line 864, in flush
    self._create_dim_scales()
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5netcdf\core.py", line 567, in _create_dim_scales
    h5ds.attrs["_Netcdf4Dimid"] = np.int32(dim_order[dim])
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5py\_hl\attrs.py", line 103, in __setitem__
    self.create(name, data=value)
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5py\_hl\attrs.py", line 212, in create
    h5a.delete(self._id, self._e(name))
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5a.pyx", line 145, in h5py.h5a.delete
KeyError: 'Unable to delete attribute (record is not in B-tree)'
Exception ignored in: <function File.close at 0x0000028A60199318>
Traceback (most recent call last):
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5netcdf\core.py", line 878, in close
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5netcdf\core.py", line 864, in flush
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5netcdf\core.py", line 557, in _create_dim_scales
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5netcdf\core.py", line 391, in _h5group
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "C:\Users\xxxxxxx\Miniconda3\envs\wrf2\lib\site-packages\h5py\_hl\group.py", line 288, in __getitem__
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5o.pyx", line 190, in h5py.h5o.open
  File "h5py\h5i.pyx", line 46, in h5py.h5i.wrap_identifier
ImportError: sys.meta_path is None, Python is likely shutting down

h5netcdf 0.11.0
h5py 3.1.0
hdf5 1.10.6

netCDF4 appears to be able to still open and close the file without issue.

@groutr
Contributor Author

groutr commented Apr 20, 2021

The essence of the code that produces this issue:

import h5netcdf.legacyapi as netCDF4
import numpy as np

with netCDF4.Dataset(fort63) as ds:  # fort63 is the output of an ADCIRC model
    zeta = ds.variables['zeta']
    for col in range(zeta.shape[1]):
        k = np.ma.getdata(zeta[:, col])

zeta.shape[1] is roughly 11,800 columns. Sending a keyboard interrupt the first time this code runs triggers the error. Every subsequent run then raises the exception while the file is being closed.

@groutr
Contributor Author

groutr commented Apr 20, 2021

This might be somewhat related (to the second part of this error).
#50

Note that in this issue I am using h5netcdf directly, not through xarray.
The file I am reading is the output of an ADCIRC model.

@groutr
Contributor Author

groutr commented Apr 20, 2021

The reason I'm using h5netcdf instead of netCDF4 is that, with this particular ADCIRC output, netCDF4 is very slow at accessing a column. Switching to h5netcdf cut my script's runtime from about 1.5 hours (estimated) to about 1 minute. It's a nice performance story for h5netcdf if I can figure out how to resolve this issue.

@kmuehlbauer
Collaborator

@groutr Thanks for raising this. I've run into this behaviour occasionally. Do you have a reproducible example (or a file to download)? Are you running this from a console via a Python script, or by other means (Jupyter)?

@groutr
Contributor Author

groutr commented Apr 22, 2021

@kmuehlbauer, unfortunately, I don't think I'm able to share the sample file that I have. I'm running this via a Python script executed from the console. The script is designed to run as a directly executed utility. Its essential function is to pull data out of a designated set of columns (of a variable) and write that data to a CSV file. Nothing fancy is happening here.
I can try testing this method on some publicly available files and see if I can reproduce.
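
For readers without access to the original file, the workflow described above can be sketched roughly as follows. This is only an illustration, not the actual script: the helper `write_columns_to_csv`, the `col_N` header names, and the small nested list standing in for the `zeta` variable are all hypothetical, and only the standard library is used.

```python
import csv
import io


def write_columns_to_csv(data, columns, out):
    """Write the selected columns of a 2-D, row-major table to `out` as CSV.

    `data` is any sequence of rows; `columns` lists the column indices to keep.
    (Hypothetical helper, not part of h5netcdf or the original script.)
    """
    writer = csv.writer(out, lineterminator="\n")
    # Header row: one label per requested column.
    writer.writerow([f"col_{c}" for c in columns])
    for row in data:
        writer.writerow([row[c] for c in columns])


# Hypothetical stand-in for a (time, node) variable such as `zeta`.
zeta = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
]

buffer = io.StringIO()
write_columns_to_csv(zeta, [0, 2], buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With a real dataset, the nested list would be replaced by column reads from the opened variable, and `buffer` by a file opened for writing.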

@shoyer
Collaborator

shoyer commented Apr 22, 2021

Does this occur even if you explicitly set mode='r' when opening the file? If so, something is very wrong here...

@groutr
Contributor Author

groutr commented Apr 23, 2021

@shoyer I only observed the behavior with mode='a' and that is what prompted #99.
There is no flush when mode='r' (https://github.com/h5netcdf/h5netcdf/blob/master/h5netcdf/core.py#L862)
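
To make the mode distinction concrete, here is a toy model of the close path seen in the traceback (`__exit__` calls `close`, which calls `flush`). It is a simplified sketch, not h5netcdf's actual code: the class `ToyFile`, the `flush_calls` counter, and the exact placement of the mode check are assumptions for illustration. It shows that a `with` block still runs `close()` when a `KeyboardInterrupt` arrives, and that metadata is only written on close when the file was not opened read-only.

```python
class ToyFile:
    """Toy stand-in for a file object; NOT h5netcdf's real implementation."""

    def __init__(self, mode="a"):
        self.mode = mode
        self.flush_calls = 0  # counts how often metadata would be written
        self.closed = False

    def flush(self):
        # Assumption: metadata is only written for writable modes,
        # so a read-only file is never modified on close.
        if self.mode != "r":
            self.flush_calls += 1

    def close(self):
        if not self.closed:
            self.flush()
            self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs even when the block is left via KeyboardInterrupt.
        self.close()
        return False  # do not swallow the exception


def read_until_interrupted(mode):
    """Open a ToyFile, simulate Ctrl-C inside the block, return the file."""
    f = ToyFile(mode)
    try:
        with f:
            raise KeyboardInterrupt  # simulated interrupt during the read loop
    except KeyboardInterrupt:
        pass
    return f


appended = read_until_interrupted("a")
read_only = read_until_interrupted("r")
print(appended.flush_calls, read_only.flush_calls)
```

In the script from this issue, the equivalent change would be to pass the mode explicitly when only reading, for example `netCDF4.Dataset(fort63, mode="r")`.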

@kmuehlbauer
Collaborator

kmuehlbauer commented Apr 23, 2021

This might be resolved by #101. Could you test that PR locally @groutr, and comment over there? I'm testing it myself, but can't find a workflow which breaks.

kmuehlbauer mentioned this issue Jun 11, 2021
kmuehlbauer added this to the h5netcdf 1.0.0 milestone Jun 11, 2021
@kmuehlbauer
Collaborator

@groutr That should be resolved by now. We can reopen otherwise.
