Distributed HDF5 access #157

Open
cbyrohl opened this issue Feb 29, 2024 · 0 comments

cbyrohl commented Feb 29, 2024

We need to rework distributed HDF5 access once more; it appears to have stopped working.

TypeError: h5py objects cannot be pickled
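
For reference, a minimal sketch that should trigger the same kind of error without dask, assuming it comes from serializing an open h5py handle when tasks are shipped to workers. The file name `repro.hdf5` and dataset name `data` are made up for illustration.

```python
import os
import pickle
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file, created only for this reproduction.
path = os.path.join(tempfile.mkdtemp(), "repro.hdf5")
with h5py.File(path, "w") as f:
    f.create_dataset("data", data=np.arange(4))

with h5py.File(path, "r") as f:
    try:
        # An open dataset handle refers to an open file on this machine,
        # so h5py refuses to serialize it.
        pickle.dumps(f["data"])
        pickled = True
    except TypeError as err:
        pickled = False
        message = str(err)
```

Anything that ends up holding such a handle inside a dask task graph would hit this as soon as the graph is serialized for a distributed scheduler.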

There exists h5pickle, but it does not currently work as a drop-in replacement. The issue appears to be related to DaanVanVugt/h5pickle#14.

A naive strategy would be to pass only the file path and the HDF5 path of the dataset to each task, opening, reading, and closing the file for every dask chunk. The repeated opening and closing introduces a performance penalty. See discussion here
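
A rough sketch of what this naive strategy could look like, so the trade-off is concrete. The helpers `read_chunk` and `lazy_dataset`, the dataset name `PartType0/Masses`, and the chunk size are all hypothetical and not existing code in this project; chunking is along the first axis only, and the dataset is assumed to be non-empty and at least one-dimensional.

```python
import os
import tempfile

import dask
import dask.array as da
import h5py
import numpy as np


def read_chunk(filepath, dataset_path, start, stop):
    """Open the file, read rows [start, stop) of one dataset, close the file."""
    with h5py.File(filepath, "r") as f:
        return f[dataset_path][start:stop]


def lazy_dataset(filepath, dataset_path, chunk_rows):
    """Build a dask array whose tasks carry only strings and integers."""
    # Open once up front, only to learn shape and dtype.
    with h5py.File(filepath, "r") as f:
        shape = f[dataset_path].shape
        dtype = f[dataset_path].dtype
    blocks = []
    for start in range(0, shape[0], chunk_rows):
        stop = min(start + chunk_rows, shape[0])
        # Each task reopens the file itself; no h5py handle enters the graph.
        task = dask.delayed(read_chunk)(filepath, dataset_path, start, stop)
        block_shape = (stop - start,) + shape[1:]
        blocks.append(da.from_delayed(task, shape=block_shape, dtype=dtype))
    return da.concatenate(blocks, axis=0)


# Usage with a small scratch file (hypothetical names throughout).
path = os.path.join(tempfile.mkdtemp(), "example.hdf5")
with h5py.File(path, "w") as f:
    f.create_dataset("PartType0/Masses", data=np.arange(10.0))

arr = lazy_dataset(path, "PartType0/Masses", chunk_rows=4)
total = arr.sum().compute()
```

Since every chunk pays for one open and one close, larger chunks would reduce the overhead at the cost of coarser parallelism; how much that matters in practice would need measuring.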
