test-h5repack fails on big endian architectures #100
Indeed, this command on s390x gives a 0 exit status, no output, and a result larger than the input:

I wonder if those parameters are computed in a way that may be dependent on endianness?
You nailed it. I tried it, and once we swap the last two numbers, it actually works for my simplified array. It is not entirely clear how to properly fix this unaligned memory access that parses the values in an architecture-dependent way; any kind of fix looks like it would probably break some user.
After some research, here's an analysis of the current endianness situation in general context:
Apologies, everyone, for the delays in responding. I hope to get to this and a few other issues here next week.
It is somewhat unfortunate that H5Z-ZFP uses a different encoding of compression parameters than the zfp library does. All compression modes zfp recognizes are given by four integer parameters.

Going forward, I wonder if new versions of H5Z-ZFP could use this alternate encoding in a backwards compatible manner, i.e., files written with newer versions of H5Z-ZFP would use zfp's internal encoding, while the filter would still be able to read files using the old encoding. In fact, H5Z-ZFP already supports the 4-parameter "expert mode," and so we could enforce this mode for all writes. zfp also provides a function for setting these parameters directly. Starting with zfp 1.0.0, there is also a new struct that captures them.

This approach to encoding compression parameters does not address the endianness issue, but it does avoid type punning for fixed-rate and fixed-accuracy modes.
I am not sure I understand... we DO use ZFP's stream header to capture compression params. We don't do any "special encoding"; it's just that we write ZFP's stream header to the dataset's object header.
True, this header is embedded, but the point I was making is how zfp's four integer compression parameters are encoded differently.
Hmmm... that is only happening in the case where an HDF5 caller wants to use the generic interface, and then only when passing params from the HDF5 caller to the H5Z-ZFP plugin. And there is just NO WAY around this given HDF5's generic filter interface. That said, this applies only once the parameters are passed in.

If a user is suspicious of this, or otherwise doesn't want to use the generic interface, they should instead use the plugin as a library and then use the properties interface... which winds up not having to go through this.
This issue was specifically created about the test-h5repack test case.
With regards to a proposal for a solution for this issue: #100 (comment) makes a compelling case for what to encode, and #100 (comment) makes the point that the current situation is needed from a user interface point of view, to avoid making h5repack usable only for users who can juggle expert values. I wondered whether there's a way to extract/compute the expert-level values from the simpler mode parameters. A patch proposal would thus be to:

This would:
I think there are possibly three independent issues at hand here. I've followed this discussion only from a distance and may not be up to speed on all the gory details, so please accept my apologies if I'm off here or am repeating what's already been said.

One issue is that... The second issue I see has to do with type punning between... The potential third issue is that zfp writes a header as a sequence of bytes to...

Finally, I think my proposal to use expert-mode coding for all compression modes gets around the second issue. As the first issue appears to be a non-issue and the third issue has a potential solution, we ought to be fine.
@lindstro, I will take a deeper look at yours and @spanezz's comments (above) by early next week. That said, it was taking me a really long time to understand why I wasn't getting information from the test.

What is happening is that the logic in the filter to explicitly disallow endian targeting (that is, writing an endianness that is different from the machine's native endianness) is getting triggered in the current design of the test.

Why? Because the data file it uses is little-endian. The test would work fine if we had the same data file in big-endian format and then used that file when testing on big-endian systems. I'll follow up on the other issues raised here by next week.
I've confirmed this. When the appropriate changes are in place, everything works.
Regarding the type punning @lindstro and @spanezz mention above: this is a result of the set of goals of...
As I have studied this more, the punning is highly constrained (constrained enough, IMHO, to be manageable): it occurs only within the passing of HDF5 property list content from the HDF5 caller to the plugin in a single executable. Long story short, the punned data lives only long enough for the filter settings to make it from the caller's property list into the plugin.
This is in fact how the filter has been designed from the start. However, when I introduced logic to tease out just the ZFP library magic and version numbers, I neglected to include it in the byte-swapping logic. That was fixed in #101, and more is explained here.
We recently reported test failures on s390x, see #95. While @spanezz initially reported two failing test cases, the whole bug discussion focused on only one of them. I also figured that these issues are technically independent, so I am now forking the issue.

The second issue (not discussed in #95) is that test-h5repack fails with a size-ratio message: it expects a size ratio >= 200 and gets 99. Together with @spanezz, I tried to get ahold of this, but our progress was limited. Thus we report what we know here:

- The failure occurs on big endian architectures (powerpc 32bit, ppc64, sparc64, ...).
- The resulting mesh_repack.h5 file is slightly larger than the original, while it should be significantly smaller.
- If you run h5dump on mesh_repack.h5 on a little endian machine without the h5z-zfp plugin, it fails (expected). If you do the same on big endian, it succeeds. This indicates that the repacking step did not actually end up using h5z-zfp.
- Changing the user-defined filter parameters (UD=...) on the big endian machine to some nonsense does not change its behavior, while we would have expected it to error out.

Given these symptoms, do you already have a guess as to where the cause may be located, or how we could narrow down the cause?