Fedora Rawhide aarch64 - Test 4 fails #876
Comments
I had a glance at the log, but it has now disappeared. I believe it may have been That particular test requires reading an existing file and comparing it to a temporary file. The assert message should give both paths. You might confirm that the files are present and readable by the device running the test, and if so, whether they are identical. If they are different, perhaps you could attach them. Guessing possible causes: the reference images exist in the source tree, in the IlmImfTest folder, so when cross-compiling the files may not be readable by the machine running the test. ILM_IMF_TEST_IMAGEDIR can be set at compile time to specify the path that the executing machine should use to read the test images. Also, these images have moved between 2.5.3 and the master branch (to src/test/OpenEXRTest), and the test has been renamed OpenEXRTest |
It's also failing for s390x. I'm trying to get access to the aarch64 test instance. Is it possible to add any flags/options that provide more verbose output from a package build to get the same information? |
In this case we're not cross-compiling, it's all built on native hardware (or in a VM worst case). I've kicked off a scratch build if you want to look at the logs once they complete or fail. https://koji.fedoraproject.org/koji/taskinfo?taskID=57650850 |
Thanks for that - this is the error message from the build.log
It appears that specific message would only appear if both files are readable and differ. If possible, it would be helpful to get a copy of the contents of I will expand the test's error message to indicate what the difference between the two files is, which may help to debug such issues. Also, I notice the test doesn't check that the files are the same size. |
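As a rough illustration of the richer error message described above, here is a minimal Python sketch of a byte-for-byte file comparison that reports both file sizes and the offset of the first differing byte. It is not the OpenEXR test code (which is C++); the function name `describe_difference` and the chunk size are assumptions made for this example.

```python
from pathlib import Path


def describe_difference(reference: Path, candidate: Path, chunk_size: int = 65536) -> str:
    """Summarise how two files differ, or return 'identical'.

    Hypothetical helper for illustration only; not part of OpenEXR.
    """
    ref_size = reference.stat().st_size
    cand_size = candidate.stat().st_size
    offset = 0  # number of bytes already confirmed equal
    with reference.open("rb") as ref, candidate.open("rb") as cand:
        while True:
            a = ref.read(chunk_size)
            b = cand.read(chunk_size)
            if a != b:
                common = min(len(a), len(b))
                # Look for the first differing byte within the shared prefix.
                for i in range(common):
                    if a[i] != b[i]:
                        return (
                            f"first difference at byte {offset + i} "
                            f"(sizes {ref_size} vs {cand_size})"
                        )
                # Shared prefix matches, so one file is simply shorter.
                return (
                    f"files match for {offset + common} bytes, "
                    f"then sizes differ ({ref_size} vs {cand_size})"
                )
            if not a:
                # Both reads hit end-of-file with no mismatch.
                return "identical"
            offset += len(a)
```

Reporting the sizes alongside the offset separates a truncated or unwritten file from one whose contents differ partway through.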
Ok, I don't have an immediate answer then. I have someone shipping me a Jetson Nano to install Fedora aarch64 on but it will likely be at least next week before I get it or have time to play with it. |
I found a way to build the package using qemu and aarch64 virtualization, so after over 5 hours here's the result : |
@hobbes1069 thanks for that file. This appears to be a file from a different test than the one that reported failure in the build logs. I wonder if there was a completely different failure case this time. If so, we'd need the test log messages to help understand the issue. Is it possible that 5 hours was just too long and the test simply aborted before it finished? |
Ok, that was the only file in the directory... So if it doesn't write the file it will be different from the original, no? :) I just got an Nvidia Jetson Nano gifted by the Fedora Project and just recently got it up and going. Building openexr now. Hopefully faster than emulating. |
Successful tests clean up files, so any files in the temporary directory are related to failed tests. My reading of the issue is that imf_test_copy.exr was left because |
Well, so far the 4GB of memory in the Jetson Nano has not been enough to build OpenEXR. I'm trying to either reduce the number of parallel jobs in make or extend the swap space available. |
Ahh... Looks like we're hitting an assertion.
|
Can you provide your /var/tmp/IlmImfTest_HILEPYMX/v1.7.test.planar.exr? I tried a Raspberry Pi 4 Model B running 64 bit Ubuntu 20.10 and all tests passed. (Running through I will try with Fedora 33 |
Thanks! It seems like the zlib compressed data differs between aarch64 and x86_64 on fedora (though apparently not Ubuntu). I wonder if the compression level of Z_DEFAULT_COMPRESSION differs for some reason. If I can get fedora aarch64 to boot I can run more detailed tests. |
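One quick way to probe the compression-level hypothesis is to ask the locally linked zlib which explicit level produces the same bytes as `Z_DEFAULT_COMPRESSION`. The Python sketch below does that through the standard `zlib` module. It assumes the Python interpreter links against the same system zlib as the build under test, which may not hold, and the helper name `levels_matching_default` is made up for this example.

```python
import zlib


def levels_matching_default(data: bytes) -> list[int]:
    """Return the explicit levels (0-9) whose output equals the default-level output.

    Hypothetical diagnostic helper; it inspects only the zlib that Python is linked to.
    """
    default_stream = zlib.compress(data, zlib.Z_DEFAULT_COMPRESSION)
    return [
        level
        for level in range(0, 10)
        if zlib.compress(data, level) == default_stream
    ]


if __name__ == "__main__":
    sample = bytes(range(256)) * 64 + b"OpenEXR scanline " * 512
    print("zlib runtime version:", zlib.ZLIB_RUNTIME_VERSION)
    print("levels matching default:", levels_matching_default(sample))
```

If two machines report the same matching level but still produce different compressed bytes for the same input, the difference lies in the library's match-finding rather than in the level.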
PM me if you'd like me to setup a login on my Jetson Nano. I can have all the build deps installed ahead of time. |
I've been holding up building official packages for Fedora until this got straightened out, but it doesn't sound like it's a big deal. Do you concur? |
I'm not sure whether this is an issue. The file produced by I won't have time to look into this personally for a couple of weeks. One useful experiment would be to see if earlier versions of OpenEXR pass testBackwardCompatibility. Looking at the Fedora zlib-devel rpm file, it does have arm-specific patches, and also options to override the compression level when the zlib library is compiled. They could be responsible for the difference, rather than anything in the OpenEXR library itself. |
It looks like earlier versions did not have this issue, but the last version built for Fedora is 2.3.0, due to the inclusion of ilmbase et al. https://kojipkgs.fedoraproject.org//packages/OpenEXR/2.3.0/7.fc34/data/logs/aarch64/build.log The maintainer didn't have time to deal with the change so I stepped in. I will be obsoleting both the OpenEXR and ilmbase packages in Fedora and providing "openexr" in its place. |
How hard would it be to come up with a patch to test an uncompressed comparison? |
I've made some progress, getting Fedora Minimal working on a Raspberry Pi 4B, and reproduced the issue. The problem seems to be the way that the zlib library is built for Fedora on aarch64. I downloaded zlib-1.2.11, compiled it from source, installed it in a different location, then updated LD_LIBRARY_PATH to pick up my new zlib.so. It would be good to hear from someone who understands those patches: is data compressed by Fedora aarch64 guaranteed to be readable by any other architecture, and vice versa? Is there a way (optionally at least) to guarantee identical binary-compressed data with these patches in place? If the differences are unavoidable then making A temporary fix might be for fedora aarch64 to patch |
Thanks for digging in. It looks like this has unfortunately been a known issue for some time... https://bugzilla.redhat.com/show_bug.cgi?id=1665221 For now I've disabled testing on aarch64 and s390x (which I have no ability to test on). |
Yes, I'm the guy partially responsible for the zlib acceleration patches. To answer the general question: yes, the output data will vary slightly on aarch64, due to the use of neon/vector registers, but it still conforms to the zlib format. This means, as you have discovered, the files can be decompressed on other architectures/etc. This is the expected behavior of most compression libraries (the stream is defined, but the actual match vs literal tokens in the compressed stream may vary from release to release or based on size vs compression speed options). And as noted, unit tests wishing to verify the compressor should decompress the data and verify it matches the original rather than comparing the compressed data with another source. It's only because zlib has remained unchanged for most of a couple of decades that this works at all. This is zlib's strong point, but also, when compared with more recent compression libraries, its weak point, given that it simultaneously gets beaten on speed+compression ratio tests by more modern implementations that use 32-bit+ comparison functions/etc. Also, with respect to testing, qemu can run arm32/64 containers or random binaries with the linux binfmt_misc options, which can emulate a target arch for a single process. This allows one to, say, write a file with the x86 version of a program and then CI/test it with a program compiled for another arch. |
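The testing advice above can be shown with a small Python sketch using the standard `zlib` module: two valid streams for the same input can differ byte for byte, so a robust test compares the decompressed payloads. Two compression levels stand in here for the aarch64 vs x86_64 difference, which is an assumption made for illustration; the helper name `same_payload` is also made up.

```python
import zlib


def same_payload(stream_a: bytes, stream_b: bytes) -> bool:
    """True when two zlib streams decompress to identical data.

    Hypothetical helper; the real OpenEXR tests are written in C++.
    """
    return zlib.decompress(stream_a) == zlib.decompress(stream_b)


original = b"scanline pixel data " * 4096

# Two valid encodings of the same input. Different levels are used here only
# to imitate two zlib builds that choose different match/literal tokens.
fast = zlib.compress(original, 1)
best = zlib.compress(original, 9)

if __name__ == "__main__":
    print("compressed bytes identical:", fast == best)
    print("decompressed payloads identical:", same_payload(fast, best))
```

A comparison like `same_payload` passes whenever both streams are valid encodings of the same data, which is the property a backward-compatibility test actually needs.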
Failing tests on i686 and ppc64le. |
@limburgher ~ thanks for raising it for attention. To get proper visibility outside of this specific triple of Fedora/Rawhide/aarch64, it might be a good idea to open a new issue about the OS/release/architectures you're encountering a failure on, and also post some logs to help with diagnosis. |
I think this can be closed, but still have issues with ppc64le. s390x builders are down for maintenance but I'm trying a qemu emulation build to see if #1175 is still an issue. |
I'm working on upgrading the OpenEXR stack on Fedora but ran into a strange arch-specific issue. The x86_64 build passes all tests.
https://download.copr.fedorainfracloud.org/results/hobbes1069/openexr/fedora-rawhide-aarch64/01823533-openexr/builder-live.log.gz