SIGSEGV when installing gems on aarch64 #5447

Closed
cpuguy83 opened this Issue Nov 13, 2018 · 8 comments

@cpuguy83
commented Nov 13, 2018

Environment

Provide at least:

  • JRuby version (jruby -v) and command line (flags, JRUBY_OPTS, etc)
    9.2.x, 9.1.x
  • Operating system and platform (e.g. uname -a)
    Unknown at the moment (the build server is not under my control)
  • Using openjdk-8

Expected Behavior

gem install should not crash

Actual Behavior

When building JRuby images for Docker, we hit a SIGSEGV while installing some base gems.
This happens only on ARMv8 (aarch64) on Alpine Linux (which uses musl-libc).

Step 8/13 : RUN gem install bundler rake net-telnet xmlrpc
 ---> Running in ac6b889e4615
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000005e30, pid=1, tid=0x0000ffff86683a78
#
# JRE version: OpenJDK Runtime Environment (8.0_171-b11) (build 1.8.0_171-b11)
# Java VM: OpenJDK 64-Bit Server VM (25.171-b11 mixed mode linux-aarch64 compressed oops)
# Derivative: IcedTea 3.8.0
# Distribution: Custom build (Wed Jun 13 18:28:34 UTC 2018)
# Problematic frame:
# C  0x0000000000005e30
#
# Core dump written. Default location: //core or core.1
#
# An error report file with more information is saved as:
# //hs_err_pid1.log
#
# If you would like to submit a bug report, please include
# instructions on how to reproduce the bug and visit:
#   http://icedtea.classpath.org/bugzilla
#
@headius

Member

commented Nov 13, 2018

The JVM should have written a longer crash report, the hs_err_pid1.log file mentioned at the end of the output above. Can you post that and link it here?

@headius

Member

commented Nov 17, 2018

So this is failing on 9.2.x and 9.1.x for you? I am unable to reproduce with JRuby master (9.2.5) on Ubuntu AArch64, so I suspect this is a musl-libc incompatibility.

@headius

Member

commented Nov 18, 2018

Well I do get an error on Alpine but it's not the segv. I'll keep poking around.

@headius

Member

commented Nov 18, 2018

Note: that's Alpine on x86_64.

@headius

Member

commented Nov 18, 2018

Ok, so the problem here is that musl-libc based Linux distributions don't ship a separate libcrypt, since that functionality is part of musl-libc itself. When we attempt to load our native support, the binding of crypt fails, so we back off from native support entirely.

I'm thinking we fix this and then have you test again, since it may solve the segv as well.

headius added a commit to headius/jnr-posix that referenced this issue Nov 18, 2018

Isolate the crypt function into its own library.
See jruby/jruby#5447

musl-libc based Linux distributions do not ship the crypt library,
and since jnr-ffi is set up to error if any requested libraries
fail to load, this prevents native access from loading properly.
For purposes of jruby/jruby#5447, it appears the loading process
succeeds but the crypt function is not bound correctly and
segfaults. This logic tries the various possible names for the
crypt library in turn, eventually trying libc and letting it hard
error if the crypt function cannot be found and bound.
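
The lookup order this commit message describes can be sketched roughly as follows. This is a minimal Python/ctypes illustration of the idea, not the jnr-posix code (which is Java on top of jnr-ffi). The candidate names in CRYPT_CANDIDATES and the helper load_crypt are assumptions made for the example.

```python
import ctypes

# Hypothetical candidate names, tried in order. None asks the dynamic
# loader for the symbols already linked into the process, which includes
# libc; on musl that is where crypt lives.
CRYPT_CANDIDATES = ("libcrypt.so.1", "libcrypt.so", None)


def load_crypt(candidates=CRYPT_CANDIDATES, opener=ctypes.CDLL):
    """Return (name, function) for the first candidate that exports crypt.

    Raises OSError if none does, so a missing binding fails at load time
    instead of surfacing later as a call through an unbound function.
    """
    for name in candidates:
        try:
            lib = opener(name)
            return name, getattr(lib, "crypt")
        except (OSError, AttributeError):
            continue  # library missing, or present without the symbol
    raise OSError("crypt could not be found in any candidate library")
```

Which entry matches depends on the system. The opener parameter exists only so the order can be exercised without touching real libraries.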

@headius headius added this to the JRuby 9.2.5.0 milestone Nov 19, 2018

headius added a commit to headius/jnr-posix that referenced this issue Nov 19, 2018

@headius headius closed this in 780f76e Nov 21, 2018

@headius

Member

commented Nov 22, 2018

You should be able to test this out in snapshots by now. Cheers! 🍻

@headius

Member

commented Nov 22, 2018

I suspect the problem here was related to the failure to load libcrypt that I fixed in #123. My theory is that the library load failure was somehow masked on aarch64, so a partially bound jnr-posix was returned to the caller. With crypt bound to nothing, invoking it triggered the segv.

Anyway...here's hoping 😁
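
The suspected failure mode above (a masked load failure leaving one function unbound) can be modelled with a toy example. This is pure Python and has no relation to jnr-ffi internals. The names bind_functions and fake_resolve are hypothetical, and the resolver simply pretends crypt cannot be found.

```python
def bind_functions(resolve, names, strict=True):
    """Bind each name via resolve(name), which returns a callable or None.

    With strict=False an unresolved name is silently kept as None, which
    models a masked load failure. With strict=True binding fails up front.
    """
    bound = {name: resolve(name) for name in names}
    missing = sorted(name for name, fn in bound.items() if fn is None)
    if missing and strict:
        raise LookupError("unbound native functions: " + ", ".join(missing))
    return bound


# Hypothetical resolver standing in for a library set without crypt.
def fake_resolve(name):
    table = {"getpid": lambda: 1}
    return table.get(name)


partial = bind_functions(fake_resolve, ["getpid", "crypt"], strict=False)
# partial["crypt"] is None here. Calling it in Python raises TypeError;
# a native call through an unbound address can instead crash the process.
```

With strict=True the same call raises LookupError at bind time, which corresponds to the "hard error" behaviour the commit message aims for.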
