Undocumented behaviour of bitshuffle when blocksize is not a multiple of 8*type_size. #312

Open
Sopel97 opened this issue Aug 27, 2020 · 2 comments

Comments

@Sopel97

Sopel97 commented Aug 27, 2020

Bitshuffle falls back to memcpy when size % 8 != 0. This is fixed in version 3, so I presume it's not fixed here for valid reasons. I would propose at least visibly documenting this behaviour, as it is completely invisible to users and results in much worse compression. It took me hours to narrow this down.

@Sopel97 Sopel97 changed the title Bitshuffle not applied when blocksize is not a multiple of 8*type_size. Undocumented behaviour of bitshuffle when blocksize is not a multiple of 8*type_size. Aug 27, 2020
@FrancescAlted
Member

Yes, that could be the case. The issue here is that we are always improving C-Blosc2, and sometimes it is difficult to track down which improvements can be backported to C-Blosc (although I can assure you that we are proactively doing so in general). If this is important for you, we would be happy to add a backport to C-Blosc (of course, a PR would help in getting this included as soon as possible).

@Sopel97
Author

Sopel97 commented Aug 27, 2020

I'm not sure this can be fixed by just backporting bitshuffle and bitunshuffle from blosc2, because it's a breaking change and, as far as I understand, BLOSC_VERSION_FORMAT cannot be bumped up to 3 here (blosc2 has BLOSC_VERSION_FORMAT==3, which implies supporting filter pipelines in the chunk API). So I'm not sure how this could be approached other than by documenting said defect. Note that in most cases users can work around it by choosing a blocksize that is a multiple of 8*type_size.
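A minimal sketch of that user-side workaround, assuming the condition from the issue title (bitshuffle is only applied when blocksize is a multiple of 8*type_size). Both helpers are hypothetical and are not part of the C-Blosc API; they only do the arithmetic. The rounded value could then be handed to `blosc_set_blocksize()` before compressing, which is not exercised here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of C-Blosc): nonzero when `blocksize` is a
 * multiple of 8 * typesize, i.e. when, per the condition in the issue title,
 * bitshuffle would be applied instead of falling back to memcpy. */
static int bitshuffle_friendly(size_t blocksize, size_t typesize) {
  assert(typesize > 0);
  return blocksize % (8 * typesize) == 0;
}

/* Hypothetical helper (not part of C-Blosc): round `blocksize` down to the
 * nearest multiple of 8 * typesize, never returning less than one such unit. */
static size_t round_down_blocksize(size_t blocksize, size_t typesize) {
  assert(typesize > 0);
  size_t unit = 8 * typesize;
  size_t rounded = blocksize - blocksize % unit;
  return rounded > 0 ? rounded : unit;
}
```

For example, with 4-byte elements a requested blocksize of 1000 bytes would be rounded down to 992. The last block of a buffer may presumably still be shorter than the chosen blocksize, which fits the "in most cases" caveat above.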

I can make a PR for a note in the documentation if there's no other way.
