Possible regression in dictionary training in 1.4.0 #1599
Comments
I dug further here, and apparently
As a side note, I am not sure why it is necessary to pass the
Related to dictionary training, I have another, different use case for passing a discontiguous buffer to the training function. The scenario is Blosc having a data chunk split into blocks that are compressed independently; passing the initial bytes of every block would be a great way to accumulate the redundancy among the blocks in the chunk dictionary. Supporting a discontiguous
If you find this to be feasible, I can open a different ticket with the suggestion.
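Until something like that is supported, one possible workaround is to copy the initial bytes of every block into a contiguous scratch buffer and train on that. The following is only a minimal sketch of the gathering step: `gather_block_prefixes` and all of its parameters are hypothetical names, not part of zstd or c-blosc2, and it assumes fixed-size blocks with a possibly shorter last block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of zstd or c-blosc2): copy the first
 * `prefix_len` bytes of every block of `chunk` into `samples`, back to back,
 * and record the size of each copied sample in `sample_sizes`.
 * Stops early if `samples` or `sample_sizes` runs out of room.
 * Returns the number of samples written. */
static size_t gather_block_prefixes(const unsigned char *chunk, size_t chunk_size,
                                    size_t block_size, size_t prefix_len,
                                    unsigned char *samples, size_t samples_capacity,
                                    size_t *sample_sizes, size_t max_samples)
{
    size_t nsamples = 0;
    size_t written = 0;

    assert(block_size > 0);
    for (size_t off = 0; off < chunk_size && nsamples < max_samples; off += block_size) {
        size_t remaining = chunk_size - off;
        /* The last block may be shorter than block_size. */
        size_t block_len = remaining < block_size ? remaining : block_size;
        size_t n = block_len < prefix_len ? block_len : prefix_len;

        if (n > samples_capacity - written) {
            break;  /* no room left in the scratch buffer */
        }
        memcpy(samples + written, chunk + off, n);
        sample_sizes[nsamples++] = n;
        written += n;
    }
    return nsamples;
}
```

The resulting `samples` and `sample_sizes` arrays have the layout that `ZDICT_trainFromBuffer(dictBuffer, dictBufferCapacity, samplesBuffer, samplesSizes, nbSamples)` takes as its last three arguments, so they could be passed there directly, at the cost of one extra copy per chunk.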
The dictionary error codes could certainly be improved. I'll leave this issue open until I get a chance to improve the error codes, and document the minimum number of samples in the header.
Thanks for the hints. I still don't see why you need
After upgrading zstd in c-blosc2 to 1.4.0 (from 1.3.4), I've got this regression:
The fact that the ZDICT_trainFromBuffer error message is just 'generic' does not help much in figuring out what's going on. When going back to commit Blosc/c-blosc2@7ee5507 (i.e. previous to the 1.4.0 update), the output of the test above is ok:
You can have a look at how I do the dict training here.
For reproducing the issue, just clone the c-blosc2 repo, cd into it and do the typical:
Thanks!