New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SSE4.1 causes crashes on older CPU #76
Comments
|
Thanks for reporting the issue but i'm honestly quite puzzled by it. Version 1.2.0 certainly did not forcibly enable SSE4.1, that commit you mentioned was misinterpreted. Some source files are indeed always compiled with SSE4.1 but not all of them, and this is intentional. SSE4.1 is still autodetected and the SSE4.1 code should never be executed if the CPU doesn't support it, as it's behind an instruction set support check. However, the library has always required SSE4.1 to actually run (as stated in the documentation), so there's no point creating multiple package variants for it because a simple compile flag change will not make it work with older instruction sets. Issue #43 is still a mystery: we can't reproduce the issue and we didn't receive additional information so we closed it. So the question is why does Blender crash when OIDN is being initialized despite the SSE4.1 check. I just tried to reproduce the issue outside Blender with the official OIDN binaries but it works as expected: OIDN reports an error that SSE4.1 is not supported and no crash happens. Could you please confirm this on your machine by running the I'm wondering whether this crash is specific to the Arch packages of Blender and OIDN. Do you know whether OIDN was modified in any way for Arch (e.g. build flags)? Also, could you please describe steps for reproducing this issue from scratch? Thanks! |
|
Bafflingly, although I managed to build OIDN once using Arch's build script and chroot, I cannot do so again. (I have absolutely no clue whatsoever why it worked once.) I tried using Arch's current ninja-based build script, its previous more traditionalish cmake build script, and just downloading the official source tarball and following the instructions on the main github page. In every instance, compilation eventually failed in the same place: None of that makes any sense to me, but one way or the other, OIDN doesn't seem to build on my box. I have to admit, my understanding is that Arch's chroot environment should prevent any weird build flags from being introduced, but my confidence in that understanding is not that great. However, more immediately relevantly, I built a script to install the official binaries as a "openimagedenoise-bin" package. And with that installed... Blender works perfectly. And as a test, I tried installing Arch's official package just to see if anything changed, and nope, Blender crashes immediately. So... I will update my bug report over there. Thank you! |
|
My initial assessment when I was searching for the error was that SSE 4.1 was being forced. Which I think it's fair since my lack of context about the commit. When I was updating the package I noticed that this might not be the case, so I didn't do it. I was meaning to look into this properly but other stuff came and I hadn't been able to come back to the issue. The package should be reproducible. It seems Our build flags are the following CPPFLAGS='-D_FORTIFY_SOURCE=2'
CFLAGS='-march=x86-64 -mtune=generic -O2 -pipe -fno-plt'
CXXFLAGS='-march=x86-64 -mtune=generic -O2 -pipe -fno-plt'
LDFLAGS='-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now'Here's the fixed # Maintainer: Filipe Laíns (FFY00) <lains@archlinux.org>
# Maintainer: Sven-Hendrik Haase <svenstaro@gmail.com>
pkgname=openimagedenoise
_pkgname=oidn
pkgver=1.2.0
pkgrel=2
pkgdesc='Intel(R) Open Image Denoise library'
arch=('x86_64')
url='https://openimagedenoise.github.io'
license=('Apache')
depends=('intel-tbb' 'python')
makedepends=('git' 'cmake' 'ninja' 'ispc')
source=("git+https://github.com/OpenImageDenoise/oidn#tag=v$pkgver"
'git+https://github.com/OpenImageDenoise/oidn-weights'
"oidn-mkl-dnn::git+https://github.com/OpenImageDenoise/mkl-dnn"
'fix-ispc.patch::https://github.com/OpenImageDenoise/oidn/commit/e321d7c90a2c706a521a3afd8913af36b121dc9e.patch')
sha512sums=('SKIP'
'SKIP'
'SKIP'
'b03916098e771fb0467d32d60aa687c804da1d184956b392489e5943c8d71e439c05d05d809fc8e616a45a1cdcc9425cd29e73bd2222f4b6a2093c3eb09fadb2')
prepare() {
cd $_pkgname
git submodule init
git config submodule.weights.url "$srcdir"/oidn-weights
git config submodule.mkl-dnn.url "$srcdir"/oidn-mkl-dnn
git submodule update
patch -p1 -i ../fix-ispc.patch
}
build() {
cd $_pkgname
cmake \
-B build \
-G Ninja \
-DCMAKE_INSTALL_PREFIX=/usr \
-DCMAKE_BUILD_TYPE=Release
ninja -C build
}
package() {
cd $_pkgname
DESTDIR="$pkgdir" ninja -C build install
rm "$pkgdir/usr/bin/tests"
}The package should be built with @atafra could you see if you can reproduce using the same flags as us? cc @svenstaro |
|
I'm sorry, I did not mean to imply any negligence on anyone's part! I had previously been attempting to compile this using I tested the updated PKGBUILD posted above. It compiled successfully using I don't know if this is at all helpful, but I |
|
|
|
Thanks @frankspace and @FFY00 for all the details! If Blender doesn't crash with the official binaries it seems there's an issue with the Arch packaging but I don't have a clue what could that be yet. I don't think it's TBB related. @FFY00, is it standard practice to override the default build flags? The flags you copied shouldn't cause a crash but I don't recommend it as it will at least cause lower performance. We use different flags (and instruction sets) for different files for optimal performance without breaking compatibility with older architectures. Since the flags are different from the official build, this might be an issue. I would like to reproduce the crash but I've never used Arch before so could you please provide some brief steps for building everything? We're going to release |
Yes, the most relevant reason is to ensure we we have PIE and relro.
I assume you are talking about using
That shouldn't be a problem, we don't overwrite anything, we just provide a default.
Basically you just install |
No, I meant primarily the SSE4.1 optimization flag which does have a performance impact. In the future we might add AVX2 and AVX-512 flags to some files, so removing these flags will cause an even bigger slowdown.
Oh, so if the project adds ISA flags, those will be preserved? In that case, this should be perfectly fine.
How do I build specifically Open Image Denoise and Blender this way? Thanks! |
|
Sorry, for the delayed response, I got my bug reports mixed up! I'm probably not the best person to ask about how the compilation works under the hood. However, the build process basically works like this: Arch (and derivatives) provide a build recipe called a "PKGBUILD." Fortunately, Arch's implementation of OIDN doesn't depend on any other provided files (patches, etc), so you can just download that file from here. Ordinarily, you can just plonk that file into a directory wherever you want to build your package, type " Alas, although I can more or less describe what happens, I really do not have any meaningful understanding of how it works under the hood. Complicating things, and not very clear from the wiki, is that PKGBUILD recipes also have some options, like whether to strip debugging symbols, leave static libraries, etc. The OIDN PKGBUILD doesn't appear to deviate from the defaults, which are listed about halfway down the page here and actually explained about halfway down the page here. Again, I don't fully understand all of that, nor do I understand why Arch isn't downloading the source tarball or is using ninja, but I trust they have good reasons and the explanation would almost certainly be beyond my comprehension as a non-programmer. As noted, I couldn't get it to compile using the official instructions anyway. Sorry I can't help further. I am probably going to create an "openimagedenoise-bin" package in the Arch User Repository in the meantime, which would just repackage the official binaries (which work fine as a drop-in replacement). |
|
Note that blender officially is still on OIDN 1.0. The upgrade to 1.2 has not occurred for official builds yet. |
|
I'm still working on setting up an Arch environment to reproduce the bug, but while doing so I discovered another (probably unrelated) major issue with the Arch package for OIDN 1.2.0: it's corrupted as we've switched to using Git LFS for the large binary files since 1.2.0 but the package is built without using LFS when cloning the repository, so those binary files are just LFS references. The library builds fine this way but at runtime it won't work. |
|
Oh, I will look into it. Maybe it is worth adding a check in the build system. |
|
Thanks @FFY00 ! We’ll add a check in the build system in the next release, printing a message that Git LFS is required to correctly clone the repository. Meanwhile I managed to reproduce the crash, OIDN indeed executes an illegal instruction (pinsrq) on old architectures but not only when executing Blender but also when running the |
|
Fixed in a0060e2 in the Regarding the PKGBUILD: apart from the incorrect clone without Git LFS (maybe pulling the source package instead of using git would be easier?), just a minor note: |
|
The SSE issue is fixed and the build in our package is also fixed. Thanks for the hint, @atafra. I think this issue can be closed. |
|
Can you make a release containing a0060e2? |
|
still having the problem |
Could you please provide some details? Did you try the latest source code in the |
Yes, we've just released |
Thanks, we're updated to it and it seems to work splendidly! |
|
I believe this can be closed then. |
In version 1.2.0, it appears that SSE4.1 is forcibly enabled and therefore causes a crash on CPUs that don't support it. In Issue #43 it was indicated that this should be auto-detected, but that doesn't seem to be the case. Please see this Arch Linux bug report for more details. Apparently, however, commit #7d4f9d8 results in SSE4.1 being manually set. Is there a way to go back to SSE4.1 being auto-detected instead? (Full disclosure: I'm not a programmer so I don't fully understand any of this.) Thank you!
The text was updated successfully, but these errors were encountered: