Convert dataset attributes to Blosc2 vlmeta in HDF5 publisher root #30
Otherwise an empty dictionary would trigger automated argument computation.
To avoid frequent reading and conversion.
Qodana for Python: It seems all right 👌 No new problems were found according to the checks applied. 💡 Qodana analysis was run in the pull request mode: only the changed files were checked. Contact the Qodana team at qodana-support@jetbrains.com
Suggested by Qodana.
FrancescAlted
left a comment
Looks pretty good. I have some comments. In particular, can you try with https://github.com/lebedov/msgpack-numpy and see if that would support attributes like in h5py? Thanks!
# This small workaround avoids Blosc2's strict type packing,
# so we can handle value subclasses like `numpy.bytes_`
# (e.g. for Fortran-style string attributes added by PyTables).
pvalue = msgpack.packb(avalue, default=blosc2_ext.encode_tuple)
This can handle most Python values, but NumPy objects are not supported. Maybe using https://github.com/lebedov/msgpack-numpy could be a solution here?
Can you do a test with https://github.com/lebedov/msgpack-numpy and see if that would work for Caterva2?
Better to merge this first and do that in another PR.
FrancescAlted
left a comment
Better merge this and tackle further development later on.
This extends the work in #28 with on-the-fly translation of attributes in HDF5 datasets. The msgpacked attributes are cached for efficiency. Preliminary tests have been performed with basic attributes like strings, but the machinery may need extra work for other attribute types (as is the case with `cat2import`, which now shares code with this).
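
For illustration only, the caching idea described above could look roughly like the sketch below. This is not the code in this PR: `PackedAttrsCache`, `fake_read_attrs` and the dataset path are hypothetical names, the attribute reader is a stand-in for whatever reads the HDF5 attributes, and the sketch falls back to JSON when `msgpack` is not installed so that it runs anywhere. It only handles basic values such as strings and integers.

```python
try:
    import msgpack  # third-party serializer mentioned in this PR

    def pack(value):
        """Pack a basic Python value into bytes with msgpack."""
        return msgpack.packb(value)
except ImportError:
    import json

    def pack(value):
        """Fallback for this sketch only: JSON bytes instead of msgpack."""
        return json.dumps(value).encode("utf-8")


class PackedAttrsCache:
    """Hypothetical cache of packed attributes, keyed by dataset path."""

    def __init__(self, read_attrs):
        # read_attrs: callable taking a dataset path and returning a dict
        # of attribute names to values (a stand-in for the HDF5 read).
        self._read_attrs = read_attrs
        self._cache = {}

    def get(self, dataset_path):
        # Read and convert only on the first request for a dataset;
        # later requests reuse the already packed values.
        if dataset_path not in self._cache:
            attrs = self._read_attrs(dataset_path)
            self._cache[dataset_path] = {
                name: pack(value) for name, value in attrs.items()
            }
        return self._cache[dataset_path]


# Usage with a fake reader that records how often it is called.
reads = []


def fake_read_attrs(path):
    reads.append(path)
    return {"title": "demo", "version": 1}


cache = PackedAttrsCache(fake_read_attrs)
first = cache.get("/group/ds")
second = cache.get("/group/ds")
print(len(reads), sorted(first))
```

The second `get` call does not touch the reader again, which is the point of caching: it avoids frequent reading and conversion. A real publisher would also need to decide when cached entries become stale, which this sketch leaves out.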