Coredump in Alpine arm64 platform #24875
Comments
Does any other version of Gradle work in this scenario? |
No. I tried Gradle 7.3 to 8.1.1, all versions crashed. |
Any updates? |
Managed to reproduce, looks like this is another one caused by musl :/ Stack:
|
We've been disabling file-system watching as that just did not work with musl at all: gradle/native-platform#283 I wonder why we end up loading it anyway... |
Ah, so what we do in that PR is that we still load the native library for file system watching, but if What I don't understand is why passing |
Let's at least try to figure out why |
Update on this issue: Using Java 17 or later results in a more informative stack trace. It seems that the _init from the library fails. hs_err_pid: https://tpaste.us/xyRo Ref: https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/50073 |
@bratkartoffel mind trying with gradle-8.3 and jdk-21? |
Issue still persists: https://tpaste.us/ePYV |
So how can I completely disable file-system watching? |
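For reference, Gradle documents two switches for file-system watching: the `--no-watch-fs` command-line option and the `org.gradle.vfs.watch=false` property. The sketch below sets the property in a project's `gradle.properties`; the function name `disable_fs_watching` is made up for illustration. Note that, per the comments above, the native library was apparently still loaded with watching disabled, so this alone may not avoid the crash.

```shell
#!/bin/sh
# Hypothetical helper: set org.gradle.vfs.watch=false in a project's
# gradle.properties, replacing any existing value of that property.
disable_fs_watching() {
    props="${1:-.}/gradle.properties"
    tmp="$props.tmp"
    # Create the file if it does not exist yet.
    [ -f "$props" ] || : > "$props"
    # Keep every line except an existing vfs.watch setting
    # (grep -v exits non-zero when nothing is left, hence "|| true").
    grep -v '^org\.gradle\.vfs\.watch=' "$props" > "$tmp" || true
    echo 'org.gradle.vfs.watch=false' >> "$tmp"
    mv "$tmp" "$props"
}
```

Calling `disable_fs_watching .` in the project root and then running `./gradlew clean` would apply it; for a single invocation, `./gradlew --no-watch-fs clean` should be equivalent.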
@bratkartoffel @lptr I forwarded your info to the JDK developers list, where @theRealAph would love to know the easiest way to run it:
What is the easiest way to reproduce it in the cloud, something like: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-arm64.html ? |
I haven't had any luck reproducing it, either locally (using qemu-binfmt and podman) or on AWS using ami-0bbd3a96d646dbcd4. I'll try to get an example ready within the next days. Currently it seems that the crash is only reliably reproducible on the Alpine builders, which run on LXC afaik. |
On 10/16/23 09:15, Simon F wrote:
> I haven't had any luck in reproducing it neither locally (using qemu-binfmt and podman) nor on aws using |ami-0bbd3a96d646dbcd4|. I'll try to get an example ready within the next days.
> Currently it seems that the crash is only stable reproducible on the alpine builders, which run on lxc afaik.
It seems obvious from the stack trace that this is a crash in ld.so, so there's
no need for either OpenJDK or Gradle to investigate, at least for now.
--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
|
it crashes in ld.so because the thing being dlopen()'d was built against glibc and there is an ABI mismatch, since the host is not glibc. there is no safe way to load this plugin when the libc abi does not match what it was built against. it should just be skipped entirely. |
(that was already the intention with gradle/native-platform#283 , but for some reason it gets loaded even when disabled, so this issue is that the disable doesn't work) |
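To make the mismatch concrete: a quick way to see which libc a host uses is to look at what `ldd --version` prints. The following is a minimal sketch, not what Gradle or native-platform actually does; the function names `classify_libc` and `host_libc` are made up here, and the matching relies on musl's `ldd` printing a "musl libc" banner while glibc's prints "GNU libc" or "GLIBC".

```shell
#!/bin/sh
# classify_libc TEXT
# Print "musl", "glibc" or "unknown" based on `ldd --version` style output.
classify_libc() {
    case "$1" in
        *musl*) echo musl ;;
        *"GNU libc"*|*GLIBC*|*glibc*) echo glibc ;;
        *) echo unknown ;;
    esac
}

# host_libc
# Classify the running system. musl's ldd writes its banner to stderr and
# exits non-zero, so stderr is captured and the exit status is ignored.
host_libc() {
    classify_libc "$(ldd --version 2>&1 || true)"
}
```

On an Alpine container `host_libc` would be expected to print `musl`. To see what a given shared object was linked against, `readelf -d path/to/library.so | grep NEEDED` lists its required libraries (the path is a placeholder).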
would you mind to be a little more verbose for the less intelligent, or less experienced? how can it be that something on alpine is built against glibc, and what is "the thing" ? |
I have to ask on IRC how the builders are set up.
# work in tempdir
cd $(mktemp -d)
# get example project
wget https://docs.gradle.org/current/samples/zips/sample_building_java_applications-groovy-dsl.zip
unzip sample_building_java_applications-groovy-dsl.zip
# create mock APKBUILD file
cat >APKBUILD <<"EOF"
pkgname=test
pkgver=1
pkgrel=0
pkgdesc="none"
url="https://gradle.org"
arch="noarch"
license="Public Domain"
EOF
# fix permissions
chown -R 1000:1000 .
# run the container
docker run --rm -it -v $(pwd):/opt/work -e DABUILD_ARCH=aarch64 --workdir /opt/work --platform linux/arm64/v8 registry.alpinelinux.org/alpine/docker-abuild:edge ash
##### inside the container #####
# install java
abuild-apk add openjdk17-jdk
# reproduce crash
./gradlew clean
Interesting though, when running a simple
# same script as above until the "docker run" command
docker run --rm -it -v $(pwd):/opt/work --workdir /opt/work --platform linux/arm64/v8 alpine:edge ash
##### inside the container #####
# install java
apk add openjdk17-jdk
# all good
./gradlew clean
I'm not really a C expert, so please correct me if I understood something wrong.
When a C library depends on another library, it is compiled against the ABI of the latter. When the application is started, ld (the loader) sees all referenced libraries and tries to load them when needed. The main C library in this case is different: glibc (the most widely deployed one) versus musl libc (in Alpine). The main libc is responsible for the base functionality provided by the C programming language itself. As glibc is pretty heavy, overloaded and buggy (due to legacy / backwards compatibility), other implementations (like musl and dietlibc) arose, aiming for a stricter implementation of the C standard. For musl libc this means that "undefined behaviour" almost always results in a crash of the application, for security reasons. One example is that glibc often ignores use-after-free issues, but musl libc almost certainly crashes the application.
So when a library is built against a specific libc implementation, the compiler may use features specific to that libc, which makes the application fail to load when run with the other. Common issues which may be seen are "symbol not found" errors. In this case, it seems that the "_init" function of the
So the questions are:
// Edit: The other difference I found (when comparing the two containers I used for testing, see above):
ok:
not ok:
After some testing, I found out:
|
@soloturn
# install packages
doas apk add openjdk17-jdk libstdc++
# work in tempdir
cd $(mktemp -d)
# get example project
wget https://docs.gradle.org/current/samples/zips/sample_building_java_applications-groovy-dsl.zip
unzip sample_building_java_applications-groovy-dsl.zip
# reproduce crash
./gradlew clean
After removing the After some searching, I stumbled across cloudius-systems/osv#1129 and punitagrawal/osv@63fc92b. Can someone please rebuild the native library with the |
@jbartok Shouldn't this be tagged as a bug instead of a feature? (tagging you since you changed the issue from bug to feature, sorry if I wasn't supposed to do that) |
@lptr Although aarch64 is not my primary platform, this issue is still pretty nasty and blocks some stuff on my side. Is there anything I can do to help you fix this? |
As requested on the gradle slack I've created a minimal reproduction project here: |
Hey guys, I hope all is well! |
I took a look at the PR and I was able to build it for all platforms with a few modifications. But I need to test if that solution works. I also looked at fixing the |
A fix (#28021) to not load native services with I tested the flag with a reproducer from #24875 (comment) and the build runs successfully with There is more work required to fix the problem without manually setting a flag though. |
Here's an idea about a more permanent fix:
or:
I don't know when we can do either option of 3), and I'd much rather invest in using Rust than keep maintaining the C/C++ code we currently have. |
The workaround for that issue at least in my case is to export that ENV in my container.
|
FWIW my minimal reproduction above (https://github.com/Mahoney-bug-examples/gradle-alpine-arm-bug) builds with gradle 8.8-20240228002527+0000 without the |
Thanks for checking it @Mahoney. @tkcontiant which issue does this variable fix? I might be missing something, but I don't see anything among those options that would affect native library loading. |
Expected Behavior
Gradle should work on Alpine arm64 platform.
Current Behavior
Context (optional)
No response
Steps to Reproduce
Setup QEMU for docker and run:
Then:
Then you can see the output as above. And here is the error log
/root/.gradle/daemon/8.1.1/hs_err_pid227.log
:hs_err_pid227.log
Gradle version
8.1.1
Build scan URL (optional)
No response
Your Environment (optional)
No response