infinite loop when closing file #556
Comments
Could you post the error message? Btw, this may be an HDF5 problem... all the libver keyword does is set a flag on the file access property list. |
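For reference, a sketch of roughly what that amounts to through h5py's low-level wrappers (the file name is a placeholder, and this mirrors rather than reproduces h5py's internal code):

```python
import h5py
from h5py import h5p, h5f

# libver='latest' boils down to setting the libver bounds on the file
# access property list before the file is created/opened.
fapl = h5p.create(h5p.FILE_ACCESS)
fapl.set_libver_bounds(h5f.LIBVER_LATEST, h5f.LIBVER_LATEST)

# Create the file with that property list and wrap it in a high-level File.
fid = h5f.create(b"lowlevel.h5", h5f.ACC_TRUNC, fapl=fapl)
f = h5py.File(fid)
f.close()
```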
Here's the error message:
Strangely, I don't see the error on my Mac, only on Linux. The equivalent C program doesn't produce an error:
Is it possible that libver is exposing some issue on the h5py side? |
Maybe, but I can't think where. Once that value goes into the property list we don't touch it any more. I'll leave this issue open but I'm stumped. |
Is there some way that h5py can generate a trace of the HDF5 lib calls? If we had that the hdf5 library development team could take a look. |
Not that I know of. You could try running Python in verbose mode, though (python -v). |
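A concrete way to capture that verbose output (the script name here is a stand-in for the failing program):

```shell
# Create a stand-in for the failing script.
echo 'print("hello")' > repro.py
# python -v prints one line per import (and per module cleanup at interpreter
# shutdown) to stderr; redirecting it keeps the tail of the trace even if the
# process dies while tearing down.
python3 -v repro.py 2> import_trace.log
tail -n 5 import_trace.log
```

Note that `python -v` traces Python imports and module cleanup only, not calls into the HDF5 C library, so it may not show the hdf5-internal activity being asked about.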
The example below is sufficient to reproduce the problem with h5py-2.6.0. The difference between triggering the problem and not (apart from the aforementioned libver='latest') is the number of attributes: 544 will trigger it, 543 will not. The threshold appears to correspond to the switch to "dense" attribute storage, supported in HDF5 1.8 and above. The example was tested with HDF5 1.10.0-patch1.
|
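The example itself did not survive in this copy of the thread; a minimal sketch consistent with the description (file name is a placeholder, 544 matches the count reported above):

```python
import h5py

ATTR_COUNT = 544  # reported threshold; 543 did not trigger the error

# libver='latest' enables the newer file format, including "dense"
# attribute storage once the attribute count passes a threshold.
with h5py.File("attrs.h5", "w", libver='latest') as f:
    for i in range(ATTR_COUNT):
        f.attrs['a{:04d}'.format(i)] = 'this is attribute: %d' % i

# When the bug fires, "HDF5: infinite loop closing library" is printed
# at interpreter shutdown rather than at close time.
```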
Apparently the magic number shifted after a reboot; it does not appear to be a hard threshold. In my post-reboot checks, 549 is now the number needed to trigger the problem on my OS X Yosemite MacBook Pro; YMMV. |
I see this failure with numbers as low as 50 and it appears that the failure is random but becomes more likely as the number of attributes increases. |
I am seeing the same bug appear randomly, have not found a pattern. This is the error I get:
Has anyone looked into this? |
I wonder if this is fixed by the gc fencing in #903 |
I also cannot reproduce this with py3.6, h5py 2.7, hdf5 1.8.18 (even with 100000 attributes). |
I get this error periodically when h5serv (https://github.com/HDFGroup/h5serv) shuts down. I've talked about this with the hdf5 library team, but we haven't been able to isolate the issue. At least it doesn't seem to have any adverse effects. |
Hi all, I've written a simple Python module to manage reads/writes from/to HDF5 files and I'm experiencing the same issue. I'm on Ubuntu 16.04.4 LTS with HDF5 1.10.2. It seems to be more frequent when attributes were added to a dataset, or when the dataset wasn't already present before launching. Here's the repo: https://github.com/project-tuva/h5bug It seems that a solution was found for this bug. How can I get rid of this error? Thanks in advance |
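Judging by other comments in this thread, the two reported workarounds are dropping libver='latest' or upgrading to HDF5 1.10.5+; a sketch of the first (file name and attribute count are placeholders):

```python
import h5py

ATTR_COUNT = 600  # placeholder count

# Opening the file without libver='latest' keeps the older attribute
# storage and sidesteps the shutdown error, at the cost of slower
# attribute writes.
with h5py.File("data.h5", "w") as f:
    for i in range(ATTR_COUNT):
        f.attrs['a{:04d}'.format(i)] = 'value %d' % i
```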
I can reproduce this with h5py 2.9 and hdf5 1.10.4 (from arch packages), but with a different error pattern. I do not think this happens while closing the file, but while tearing down the library. Using a slightly simpler script:

```python
import h5py
import time

ATTR_COUNT = 600

f = h5py.File("attr1k.h5", "w", libver='latest')
create_start = time.time()
print("creating attributes", create_start)
# create attributes
for i in range(ATTR_COUNT):
    name = 'a{:04d}'.format(i)
    f.attrs[name] = "this is attribute: " + str(i)
f.close()
print('closed file?!')
print('about to tear things down')
```

which, when it fails, prints:

```
creating attributes 1552357400.43879
closed file?!
about to tear things down
HDF5: infinite loop closing library
L,T_top,P,P,Z,FD,E,SL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL,FL
```

However it does not fail every time, which suggests to me there is a race condition in de-allocating the guts of hdf5 and the python side. With smaller numbers of attributes you still see this, just at lower rates (from eye-balling it), as reported above by @derobins and @chissg.

This may be fixed by / related to https://bitbucket.hdfgroup.org/projects/HDFFV/repos/hdf5/commits/f808c108ed0315f115a7c69cbd8ee95032a64b34 which looks like it got merged to the 1.10 branch in https://bitbucket.hdfgroup.org/projects/HDFFV/repos/hdf5/commits/489f6fb69711ef7f26f4c13ad863438779f654b8, which is post 1.10.4 and pre 1.10.5 (I think? I am a bit confused by the tagging scheme). Unfortunately I'm out of bandwidth to build 1.10.5 locally to test this tonight. |
Thomas, I have tried your script with h5py 2.9, hdf5 1.10.5, with the same result. The problem is gone when I remove libver='latest'. I also noticed that the size of the created file is smaller when HDF5 complains. Has someone already tested this calling the hdf5 library directly? Greetings, Richard
|
That does suggest that it is related to the 'dense' attribute storage, but unfortunate that just upgrading won't fix it :( |
Hey, it's my honor to answer this question: this problem means your h5py file was not saved completely. You can try again, or use another model to train your data.
Please check your program with the latest HDF5 develop and 1_10 branches. |
I cannot reproduce this with h5py 2.10 and hdf5 1.10.5. Closing as fixed upstream, thanks @epourmal !
I'm getting an "infinite loop closing library" error when using libver='latest'.
See the example below. Without libver='latest' the file closes fine, but the entire script takes some time to run (~8 seconds on my system). With libver='latest' it's much faster (~0.2 seconds), but I get the infinite loop message.
This is with h5py 2.4 and hdf5 lib 1.8.14.
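The original example did not survive in this copy of the thread; a sketch that times both modes, consistent with the description (file names and attribute count are placeholders):

```python
import time
import h5py

ATTR_COUNT = 600  # placeholder; the original example is not preserved here

def write_attrs(path, **file_kwargs):
    """Write ATTR_COUNT string attributes to a fresh file; return seconds elapsed."""
    start = time.perf_counter()
    with h5py.File(path, "w", **file_kwargs) as f:
        for i in range(ATTR_COUNT):
            f.attrs['a{:04d}'.format(i)] = 'this is attribute: %d' % i
    return time.perf_counter() - start

t_default = write_attrs("attrs_default.h5")
t_latest = write_attrs("attrs_latest.h5", libver='latest')
print("default: %.3fs, libver='latest': %.3fs" % (t_default, t_latest))
```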