Honor padding in compound types in native HDF5 files #720
For reference, here is the message Ken Walker sent to pytables-users on 2018-10-30 about the issue addressed in this PR:
|
With this, PyTables can create tables with padding as long as they come from NumPy arrays with padding (i.e. padding in NumPy structured arrays is respected), and the original padding is preserved during copies too. To do: |
The heavy test suite passes on Linux:
|
The tests LGTM, so if they pass, go ahead and merge. If you'd like an actual review, I can maybe do that tomorrow. Let me know. Nice work! |
Hi @tomkooij. Yes, since this change affects the format of dataset copies (again, only when padding is present), a review would be greatly appreciated. Thanks! |
Let's merge. |
This PR adds support for general compound types with padding. Until now, PyTables always removed any padding (i.e. 'holes') in compound datatypes, so the padding was lost whenever PyTables was used to manipulate or copy HDF5 files created with other tools.
With this PR, the HDF5 types are used internally with their padding, which is therefore preserved during output operations, most notably copies (e.g. via ptrepack). This should allow smoother interaction with the rest of the HDF5 ecosystem (see the messages Ken Walker sent to the mailing list on 2018-10-30).
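
To make the 'holes' mentioned above concrete, here is a minimal sketch of a padded versus a packed NumPy structured dtype. It uses only NumPy; the field names (`flag`, `value`), the offsets, and the helper `padding_bytes` are illustrative assumptions and are not taken from this PR, and the PyTables side is deliberately not shown.

```python
import numpy as np

# Hypothetical record layout: a 1-byte field followed by an 8-byte field
# placed at offset 8, which leaves a 7-byte hole (padding) between them.
padded = np.dtype({
    "names": ["flag", "value"],
    "formats": ["i1", "f8"],
    "offsets": [0, 8],
    "itemsize": 16,
})

# The same two fields laid out back to back, with no padding.
packed = np.dtype([("flag", "i1"), ("value", "f8")])


def padding_bytes(dtype):
    """Return how many bytes of each record are not covered by any field.

    Assumes the fields do not overlap (true for both layouts above).
    """
    used = sum(dtype.fields[name][0].itemsize for name in dtype.names)
    return dtype.itemsize - used


print(padded.itemsize, padding_bytes(padded))
print(packed.itemsize, padding_bytes(packed))

# An array built on the padded dtype keeps the 16-byte record stride.
arr = np.zeros(3, dtype=padded)
print(arr.strides)
```

An array such as `arr` is the kind of padded input the conversation above refers to; per the description, the point of the PR is that this layout is kept rather than collapsed to the packed one.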