Just an idea:
shuffling is most effective when it yields long runs of identical bytes.
For non-normally distributed data (e.g. text, images, ...), computing the differences between consecutive values before shuffling tends to produce smaller values, thus increasing the chance of long runs of zeros afterwards.
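A minimal sketch of the idea, in Python for illustration (the function names `delta_encode` and `byte_shuffle` are hypothetical; Blosc's actual filter machinery works on raw buffers in C). For slowly varying integer data, the deltas fit in the low byte, so after a classic byte shuffle the upper-byte planes become long runs of zeros:

```python
import struct
import zlib

def delta_encode(values):
    # Hypothetical pre-conditioner: keep the first value, then store
    # each value as its difference from the previous one.
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def byte_shuffle(values, width=4):
    # Classic shuffle: pack each int32 little-endian, then group
    # byte 0 of every element, then byte 1, and so on.
    raw = b"".join(struct.pack("<i", v) for v in values)
    return b"".join(raw[i::width] for i in range(width))

# Slowly increasing series: raw values are large, deltas are tiny.
data = [1000 + 3 * i for i in range(64)]

plain = byte_shuffle(data)
pre = byte_shuffle(delta_encode(data))

# The delta stream's upper bytes are all zero, so the shuffled
# pre-conditioned buffer has far more zero bytes and compresses better.
print(pre.count(0), plain.count(0))
print(len(zlib.compress(pre)), len(zlib.compress(plain)))
```

Here `zlib` only stands in for Blosc's internal codec to show the effect on a generic byte-oriented compressor.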
Yes, that is a nice idea. It will probably only work with integers, as it can change the precision of floating-point data, but it is worth exploring. There is still room for at least four different pre-conditioners in Blosc, and what you are suggesting could be a good candidate. Would you like to create a PR?
In fact, in many cases diff'ing a series of IEEE 754 floats as integers is nearly as effective as computing the exact floating-point differences. This works when the typical delta is smaller than the typical magnitude (so the exponent field rarely changes), e.g. for measurements of temperature in kelvin, masses, and similar non-negative quantities.
In my experiments, this combined with bitshuffle gave better compression than plain bitshuffle, and was quite fast.
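To illustrate why the integer diff works on floats, here is a small sketch (the helpers `float_bits` and `int_delta` are hypothetical, not Blosc API). While the raw bit patterns of a slowly varying positive series are huge 63-bit numbers, consecutive differences stay small because the sign and exponent fields never change between neighbors:

```python
import struct

def float_bits(xs):
    # Reinterpret each IEEE 754 double as a signed 64-bit integer
    # (same bytes, no numeric conversion).
    return [struct.unpack("<q", struct.pack("<d", x))[0] for x in xs]

def int_delta(bits):
    # Keep the first bit pattern, then store integer differences.
    return [bits[0]] + [b - a for a, b in zip(bits, bits[1:])]

# Slowly varying, strictly positive series, e.g. temperatures in kelvin.
temps = [293.0 + 0.001 * i for i in range(16)]

bits = float_bits(temps)
deltas = int_delta(bits)

# The first bit pattern is enormous, but every subsequent delta fits
# comfortably in the low bytes, which shuffle/bitshuffle exploits.
print(bits[0], max(abs(d) for d in deltas[1:]))
```

Note that this integer diff is exactly reversible (integer subtraction loses nothing), unlike diffing the float values themselves, which can round.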