jhsdb does not work with coredump #2579

Closed
YaSuenag opened this issue Jun 18, 2020 · 9 comments

@YaSuenag
Contributor

I was experimenting with Truffle NFI on GraalVM, but I cannot get Java stack traces from a core dump via jhsdb.
A DebuggerException is printed to the console:

Error attaching to core file: Can't attach to the core file
sun.jvm.hotspot.debugger.DebuggerException: Can't attach to the core file
        at jdk.hotspot.agent/sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.attach0(Native Method)
        at jdk.hotspot.agent/sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal.attach(LinuxDebuggerLocal.java:282)
  :

Steps to reproduce the issue

  1. Clone my repository
$ git clone https://github.com/YaSuenag/garakuta.git
  2. Build and install the custom language into GraalVM
$ cd truffle-nfi
$ mvn package
$ gu install -L target/truffle-nfi-wrapper-0.1.0-component.jar
  3. Build and run the crasher
$ cd truffle-nfi/examples
$ javac MemSetCrash.java
$ java MemSetCrash
  4. Attempt to get Java call stacks via jhsdb
$ jhsdb jstack --exe $GRAALVM_HOME/bin/java --core <coredump>

Describe GraalVM and your environment:

  • GraalVM version: CE 20.1.0
  • JDK major version: 11
  • OS: Fedora 32
  • Architecture: AMD64

More details

I set LIBSAPROC_DEBUG=1 when running jhsdb and got the debug messages below.
As far as I can see, 0x7f1bba69f000 is in the .svm_heap section.

libsaproc DEBUG: reading library /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so @ 0x7f1bb989c000 [ 0x7f1bb989c000 ]
libsaproc DEBUG: overwrote with new address mapping (memsz 757760 -> 757760)
libsaproc DEBUG: overwrote with new address mapping (memsz 13934592 -> 13934592)
libsaproc DEBUG: address conflict @ 0x7f1bba69f000 (existing map size = 10530816, size = 12942072, flags = 4)
libsaproc DEBUG: can't read shared object's segments
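
A minimal sketch for reading the "address conflict" line above. It assumes that the flags field is the ELF p_flags value of the library segment and that the page size is 4096 bytes; neither is confirmed in this thread. The helper names are made up for illustration.

```python
# ELF program-header permission bits (values from the ELF specification).
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4

PAGE_SIZE = 4096  # assumption: typical x86-64 Linux page size


def decode_p_flags(flags: int) -> str:
    """Render ELF p_flags as an 'rwx'-style string."""
    return "".join(
        ch if flags & bit else "-"
        for bit, ch in ((PF_R, "r"), (PF_W, "w"), (PF_X, "x"))
    )


def round_up(value: int, align: int) -> int:
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


# Values copied from the "address conflict" debug line above.
existing_map_size = 10530816
lib_segment_size = 12942072
flags = 4

print(decode_p_flags(flags))
print(hex(round_up(existing_map_size, PAGE_SIZE)))
print(hex(round_up(lib_segment_size, PAGE_SIZE)))
```

Under these assumptions the library segment is read-only, and the two sizes still differ after page rounding, so the existing mapping at that address is smaller than the segment the library describes.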
@YaSuenag added the bug label Jun 18, 2020
@YaSuenag
Contributor Author

I checked the core file. The memory ranges for .rodata and .svm_heap look valid (the section sizes are valid):

  [13] .rodata           PROGBITS         0000000000e13000  00e13000
       0000000000022b1d  0000000000000000   A       0     0     4096
  [14] .svm_heap         PROGBITS         0000000000e36000  00e36000
       0000000000c38000  0000000000000000   A       0     0     4096
        0x00007fecb3399000 - 0x00007fecb33bbb1d is .rodata in /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so
        0x00007fecb33bc000 - 0x00007fecb3ff4000 is .svm_heap in /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so

However, there are other memory segments that start at the same address as .rodata (0x7fecb3399000). I don't know why these memory segments exist. (Is this related to the mprotect() call in LinuxImageHeapProvider::initialize?)

        0x00007fecb3399000 - 0x00007fecb3da9000 is load257
        0x00007fecb3da9000 - 0x00007fecb3ff4000 is load258
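
The relationship between these two core segments and the two ELF sections can be checked with a little arithmetic. The sketch below only uses the address ranges printed above; the variable and function names are illustrative.

```python
# Address ranges copied from the GDB output above, as (start, end) pairs.
rodata = (0x00007FECB3399000, 0x00007FECB33BBB1D)
svm_heap = (0x00007FECB33BC000, 0x00007FECB3FF4000)
load257 = (0x00007FECB3399000, 0x00007FECB3DA9000)
load258 = (0x00007FECB3DA9000, 0x00007FECB3FF4000)


def contains(rng, addr):
    """True if addr lies in the half-open range [start, end)."""
    return rng[0] <= addr < rng[1]


# The two core segments are adjacent and together span .rodata through .svm_heap.
adjacent = load257[1] == load258[0]
same_span = (load257[0], load258[1]) == (rodata[0], svm_heap[1])

# The boundary between them falls strictly inside .svm_heap.
split = load257[1]
split_inside_heap = contains(svm_heap, split) and split != svm_heap[0]

print(adjacent, same_span, split_inside_heap)
print(hex(split - svm_heap[0]))  # offset of the split within .svm_heap
```

So the core describes the same address range as one region cut in two at a point in the middle of .svm_heap, which is what one would expect if part of the heap had its protection changed after loading.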

@YaSuenag
Contributor Author

On Linux, Native Image calls mprotect() in LinuxImageHeapProvider.java to set access flags on the image heap. It relies on the __svm_heap_* global symbols in the ELF binary.

In my case, I can see the following symbols in libjvmcicompiler.so built from upstream:

$ nm sdk/latest_graalvm_home/lib/libjvmcicompiler.so | grep __svm_heap
0000000000e3e000 R __svm_heap_begin
0000000001af3000 R __svm_heap_end
0000000001687000 R __svm_heap_relocatable_begin
00000000017c1000 R __svm_heap_relocatable_end
00000000017c1000 R __svm_heap_writable_begin
0000000001af3000 R __svm_heap_writable_end

__svm_heap_writable_begin (0x17c1000) lies inside the .svm_heap section (0xe3e000 - 0x1af3000). Perhaps this causes the issue.
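
As a rough illustration, the nm values above can be turned into the three parts of the image heap. The names "read-only", "relocatable" and "writable" below are my reading of the symbol names, not something the thread confirms.

```python
# Symbol values copied from the nm output above.
symbols = {
    "__svm_heap_begin": 0x0E3E000,
    "__svm_heap_end": 0x1AF3000,
    "__svm_heap_relocatable_begin": 0x1687000,
    "__svm_heap_relocatable_end": 0x17C1000,
    "__svm_heap_writable_begin": 0x17C1000,
    "__svm_heap_writable_end": 0x1AF3000,
}

# Assumed layout: [begin, relocatable_begin) is read-only, then the
# relocatable part, then the writable part up to the end of the heap.
parts = {
    "read-only": (symbols["__svm_heap_begin"],
                  symbols["__svm_heap_relocatable_begin"]),
    "relocatable": (symbols["__svm_heap_relocatable_begin"],
                    symbols["__svm_heap_relocatable_end"]),
    "writable": (symbols["__svm_heap_writable_begin"],
                 symbols["__svm_heap_writable_end"]),
}

for name, (start, end) in parts.items():
    print(f"{name:12s} {start:#09x} - {end:#09x} size {end - start:#x}")

# The writable part starts strictly inside the heap and runs to its end.
writable_inside = (symbols["__svm_heap_begin"]
                   < symbols["__svm_heap_writable_begin"]
                   < symbols["__svm_heap_end"])
print(writable_inside)
```

The three parts tile the range from __svm_heap_begin to __svm_heap_end without gaps, so a single section ends up holding regions that need different protections at run time.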

According to the hs_err log, libjvmcicompiler.so was loaded at the following addresses:

7fa464506000-7fa4645c1000 r--p 00000000 fd:00 139901189                  /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so
7fa4645c1000-7fa465324000 r-xp 000bb000 fd:00 139901189                  /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so
7fa465324000-7fa465cc7000 r--p 00e1e000 fd:00 139901189                  /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so
7fa465cc7000-7fa465ff9000 rw-p 017c1000 fd:00 139901189                  /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so
7fa465ff9000-7fa465ffc000 r--p 01af3000 fd:00 139901189                  /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so
7fa465ffc000-7fa465ffd000 r--p 01af5000 fd:00 139901189                  /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so
7fa465ffd000-7fa465fff000 rw-p 01af6000 fd:00 139901189                  /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so

From the core file (i target in GDB), I can see the following addresses:

        0x00007fa465324000 - 0x00007fa465343ee5 is .rodata in /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so
        0x00007fa465344000 - 0x00007fa465ff9000 is .svm_heap in /home/yasuenag/github/graal/sdk/mxbuild/linux-amd64/GRAALVM_CE_JAVA11/graalvm-ce-java11-20.2.0-dev/lib/libjvmcicompiler.so

__svm_heap_writable_begin points to 0x7fa465cc7000 (load base of libjvmcicompiler.so: 0x7fa464506000 + offset: 0x17c1000). We can see this address in the hs_err log (the equivalent of /proc/<PID>/maps), and it also lies inside .svm_heap!
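
The address arithmetic can be reproduced from the numbers already shown. This is only a sketch: the tuples are copied by hand from the hs_err mappings, and find_mapping is an illustrative helper, not part of any tool.

```python
# (start, end, perms, file offset) copied from the hs_err mappings above.
mappings = [
    (0x7FA464506000, 0x7FA4645C1000, "r--p", 0x00000000),
    (0x7FA4645C1000, 0x7FA465324000, "r-xp", 0x000BB000),
    (0x7FA465324000, 0x7FA465CC7000, "r--p", 0x00E1E000),
    (0x7FA465CC7000, 0x7FA465FF9000, "rw-p", 0x017C1000),
    (0x7FA465FF9000, 0x7FA465FFC000, "r--p", 0x01AF3000),
    (0x7FA465FFC000, 0x7FA465FFD000, "r--p", 0x01AF5000),
    (0x7FA465FFD000, 0x7FA465FFF000, "rw-p", 0x01AF6000),
]

# .svm_heap range as reported by GDB for the same process.
svm_heap = (0x7FA465344000, 0x7FA465FF9000)

load_base = mappings[0][0]
writable_begin = load_base + 0x17C1000  # value of __svm_heap_writable_begin


def find_mapping(addr):
    """Return the mapping whose [start, end) range contains addr, or None."""
    for m in mappings:
        if m[0] <= addr < m[1]:
            return m
    return None


# Permissions of every mapping that overlaps the .svm_heap section.
heap_perms = [m[2] for m in mappings
              if m[0] < svm_heap[1] and svm_heap[0] < m[1]]

print(hex(writable_begin))
print(find_mapping(writable_begin)[2])
print(heap_perms)
```

The single .svm_heap section is covered by one read-only mapping and one read-write mapping, with the boundary exactly at __svm_heap_writable_begin.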

I guess ImageHeapLayoutInfo.java has an incorrect value for the writable part, because NativeBootImage::build uses it to set __svm_heap_writable_begin.

@YaSuenag reopened this Jul 10, 2020
@YaSuenag
Contributor Author

YaSuenag commented Jul 10, 2020

I found out the cause.

Substrate VM expects the image heap to contain both read-only (RO) and read-write (RW) parts, so .svm_heap is split by the mprotect() call.
When I forced writable to true, jhsdb worked fine!

diff --git a/substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/image/NativeBootImage.java b/substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/image/NativeBootImage.java
index 423966c2b63..8d5472d04a0 100644
--- a/substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/image/NativeBootImage.java
+++ b/substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/image/NativeBootImage.java
@@ -462,7 +462,8 @@ public abstract class NativeBootImage extends AbstractBootImage {
                 objectFile.installDebugInfo(provider);
             }
             // - Write the heap to its own section.
-            boolean writable = SubstrateOptions.UseOnlyWritableBootImageHeap.getValue();
+            //boolean writable = SubstrateOptions.UseOnlyWritableBootImageHeap.getValue();
+            boolean writable = true;
             long heapSize = heapLayout.getImageHeapSize();
             RelocatableBuffer heapSectionBuffer = RelocatableBuffer.factory("heap", heapSize, objectFile.getByteOrder());
             ProgbitsSectionImpl heapSectionImpl = new BasicProgbitsSectionImpl(heapSectionBuffer.getBytes());

Of course we can control this via -H:+UseOnlyWritableBootImageHeap, but it cannot be controlled for binaries that are shipped in the GraalVM distribution (e.g. libjvmcicompiler, libpolyglot).

Is UseOnlyWritableBootImageHeap actually needed? If so, I would like to propose adding this option in the mx script. Is that acceptable?

@christianhaeubl
Member

From the Substrate VM side, the mapping of the image heap looks fine. It is true that .svm_heap may get separated by an mprotect() call, but I don't understand why that would necessarily break jhsdb. Isn't that a bug on the jhsdb side?
For production use, we don't want to make the whole image heap writable.

@YaSuenag
Contributor Author

YaSuenag commented Jul 10, 2020

I don't think it is a bug in jhsdb.

jhsdb attempts to parse every PT_LOAD segment in the core file and in the binaries.
In ELF, each segment carries its own access flags, so a segment should not be split into parts with different access flags.

The Linux kernel loads the PT_LOAD segments of shared libraries (ELF).
A PT_LOAD segment can contain multiple sections, but they should all have the same access flags.

If you don't want to make the whole image heap writable, the writable part should be placed in a separate segment.

@YaSuenag
Contributor Author

Does .svm_heap need to be a contiguous region? If so, it is difficult to fix this issue on the Substrate VM side.

It is not a bug in jhsdb, but I think we can work around it by applying the following patch to LabsJDK:

diff --git a/src/jdk.hotspot.agent/linux/native/libsaproc/ps_core.c b/src/jdk.hotspot.agent/linux/native/libsaproc/ps_core.c
--- a/src/jdk.hotspot.agent/linux/native/libsaproc/ps_core.c
+++ b/src/jdk.hotspot.agent/linux/native/libsaproc/ps_core.c
@@ -399,6 +399,7 @@

         if ((existing_map->memsz != page_size) &&
             (existing_map->fd != lib_fd) &&
+            (existing_map->fd != ph->core->core_fd) &&
             (ROUNDUP(existing_map->memsz, page_size) != ROUNDUP(lib_php->p_memsz, page_size))) {

           print_debug("address conflict @ 0x%lx (existing map size = %ld, size = %ld, flags = %d)\n",

I will send a PR to LabsJDK if that is OK.
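
A small Python model of the condition touched by this patch, to show its effect. It mirrors only the four comparisons visible in the diff; the function name, the file descriptor numbers, and the idea that the existing mapping came from the core file are assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumption: typical x86-64 Linux page size


def round_up(value, align):
    """Round value up to the next multiple of align (like ROUNDUP in C)."""
    return (value + align - 1) // align * align


def is_address_conflict(existing_memsz, existing_fd, lib_fd, core_fd,
                        lib_memsz, patched):
    """Model of the 'address conflict' test shown in the diff above."""
    conflict = (existing_memsz != PAGE_SIZE
                and existing_fd != lib_fd
                and round_up(existing_memsz, PAGE_SIZE)
                != round_up(lib_memsz, PAGE_SIZE))
    if patched:
        # Added line: a mapping that comes from the core file itself
        # is no longer reported as a conflict.
        conflict = conflict and existing_fd != core_fd
    return conflict


# Sizes from the debug log earlier in this issue; fd numbers are hypothetical.
CORE_FD, LIB_FD = 3, 4
args = dict(existing_memsz=10530816, existing_fd=CORE_FD,
            lib_fd=LIB_FD, core_fd=CORE_FD, lib_memsz=12942072)

print(is_address_conflict(**args, patched=False))
print(is_address_conflict(**args, patched=True))
```

With the extra comparison, a size mismatch against a mapping that was read from the core no longer aborts reading the library's segments, while mismatches against mappings from other files are still reported.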

@YaSuenag
Contributor Author

I've sent a PR for the SA-side fix in graalvm/labs-openjdk-11#9. I will close this issue once it is merged.

@dougxc
Member

dougxc commented Jul 22, 2020

@zakkak do you have an opinion on graalvm/labs-openjdk-11#9 (comment)?

@YaSuenag
Contributor Author

I fixed this issue in both jdk/jdk and jdk-updates/jdk11u-dev.

I believe the 11u change will be backported to Labs JDK, so I am closing this issue.
