Segmentation Fault with -XX:NativeMemoryTracking=detail for 8u342 #411

Closed
eicki opened this issue Jul 27, 2022 · 7 comments
Labels
bug Something isn't working

Comments

@eicki

eicki commented Jul 27, 2022

Describe the bug

The Java option -XX:NativeMemoryTracking=detail causes a segmentation fault in releases 8.342.07.1 and 8.342.07.3.
It works with 8.332.08.1, with -XX:NativeMemoryTracking=summary, and with 8u342 builds from other JDK providers, so the problem is specific to Corretto.

To Reproduce

Download the release and run:
amazon-corretto-8.342.07.3-linux-x64/bin/java -XX:NativeMemoryTracking=detail -version
The JVM crashes immediately with a segmentation fault.

Expected behavior

No segmentation fault

Platform information

All Linux platforms, including the amazoncorretto:8u342 Docker image

@eicki eicki added the bug Something isn't working label Jul 27, 2022
@olivergillespie
Contributor

Thanks for reporting, sorry about the issue. I can confirm I also see the same issue on my machine. We'll investigate this.

@simonis
Contributor

simonis commented Jul 27, 2022

The crash happens during initialization here:

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff6cc1051 in os::current_frame() () from /share/software/Java/amazon-corretto-8.342.07.3-linux-x64/jre/lib/amd64/server/libjvm.so
(gdb) where
#0  0x00007ffff6cc1051 in os::current_frame() () from /share/software/Java/amazon-corretto-8.342.07.3-linux-x64/jre/lib/amd64/server/libjvm.so
#1  0x00007ffff6cc5d2d in os::get_native_stack(unsigned char**, int, int) () from /share/software/Java/amazon-corretto-8.342.07.3-linux-x64/jre/lib/amd64/server/libjvm.so
#2  0x00007ffff6612da8 in ResourceObj::operator new(unsigned long, ResourceObj::allocation_type, MemoryType) () from /share/software/Java/amazon-corretto-8.342.07.3-linux-x64/jre/lib/amd64/server/libjvm.so
#3  0x00007ffff654e748 in _GLOBAL__sub_I_c1_LinearScan.cpp () from /share/software/Java/amazon-corretto-8.342.07.3-linux-x64/jre/lib/amd64/server/libjvm.so
#4  0x00007ffff7de38d3 in call_init (env=0x555555756560, argv=0x7fffffffdc18, argc=2, l=<optimized out>) at dl-init.c:72
#5  _dl_init (main_map=main_map@entry=0x555555756880, argc=2, argv=0x7fffffffdc18, env=0x555555756560) at dl-init.c:119
#6  0x00007ffff7de839f in dl_open_worker (a=a@entry=0x7fffffff9740) at dl-open.c:522
#7  0x00007ffff750e16f in __GI__dl_catch_exception (exception=0x7fffffff9720, operate=0x7ffff7de7f60 <dl_open_worker>, args=0x7fffffff9740) at dl-error-skeleton.c:196
#8  0x00007ffff7de796a in _dl_open (file=0x7fffffff9a90 "/share/software/Java/amazon-corretto-8.342.07.3-linux-x64/jre/lib/amd64/server/libjvm.so", mode=-2147483390, caller_dlopen=0x7ffff79a8cd1 <LoadJavaVM+49>, nsid=<optimized out>, argc=2, argv=<optimized out>, env=0x555555756560) at dl-open.c:605
#9  0x00007ffff7798f96 in dlopen_doit (a=a@entry=0x7fffffff9970) at dlopen.c:66
#10 0x00007ffff750e16f in __GI__dl_catch_exception (exception=exception@entry=0x7fffffff9910, operate=0x7ffff7798f40 <dlopen_doit>, args=0x7fffffff9970) at dl-error-skeleton.c:196
#11 0x00007ffff750e1ff in __GI__dl_catch_error (objname=0x5555557567f0, errstring=0x5555557567f8, mallocedp=0x5555557567e8, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:215
#12 0x00007ffff7799745 in _dlerror_run (operate=operate@entry=0x7ffff7798f40 <dlopen_doit>, args=args@entry=0x7fffffff9970) at dlerror.c:162
#13 0x00007ffff7799051 in __dlopen (file=file@entry=0x7fffffff9a90 "/share/software/Java/amazon-corretto-8.342.07.3-linux-x64/jre/lib/amd64/server/libjvm.so", mode=mode@entry=258) at dlopen.c:87
#14 0x00007ffff79a8cd1 in LoadJavaVM (jvmpath=jvmpath@entry=0x7fffffff9a90 "/share/software/Java/amazon-corretto-8.342.07.3-linux-x64/jre/lib/amd64/server/libjvm.so", ifn=ifn@entry=0x7fffffff9a60) at /home/jenkins/node/workspace/Corretto8/generic_linux/x64/build/Corretto8Src/installers/linux/universal/tar/corretto-build/buildRoot/jdk/src/solaris/bin/java_md_solinux.c:855
#15 0x00007ffff79a6016 in JLI_Launch (argc=<optimized out>, argv=<optimized out>, jargc=1, jargv=0x0, appclassc=1, appclassv=0x0, fullversion=0x5555555548ad "1.8.0_342-b07", dotversion=0x5555555548a9 "1.8", pname=0x5555555548a4 "java", lname=0x55555555489c "openjdk", javaargs=0 '\000', cpwildcard=1 '\001', javaw=0 '\000', ergo=0) at /home/jenkins/node/workspace/Corretto8/generic_linux/x64/build/Corretto8Src/installers/linux/universal/tar/corretto-build/buildRoot/jdk/src/share/bin/java.c:253
#16 0x00005555555546c1 in main ()

It seems to be caused by #398 ("Updates for AL2022"), which is specific to Corretto and not in OpenJDK 8u upstream. Specifically, #398 integrated JDK-8207011 ("Remove uses of the register storage class specifier", see e704345), which removes the register specifier from some variables. According to JDK-8207011, the change was intended for newer compilers like GCC 11 because the "C++11 standard deprecates 'register', and starting with C++17 it is a reserved keyword". However, Corretto 8 is still compiled with GCC 7 targeting C++98 (see also the discussion on the PR that attempted to backport JDK-8207011 to 8u).

Without the register specifier, GCC 7 seems to generate bad code for _get_previous_fp() (which gets inlined into os::current_frame()):

intptr_t* _get_previous_fp() {
#ifdef SPARC_WORKS
  ...
#else
  intptr_t **ebp __asm__ (SPELL_REG_FP);
#endif
  return (intptr_t*) *ebp;   // we want what it points to.
}

which leads to loading from address 0 and the crash:

(gdb) x /16i 'os::current_frame()' 
   0x7ffff6cc1030 <_ZN2os13current_frameEv>:	push   %rbp
   0x7ffff6cc1031 <_ZN2os13current_frameEv+1>:	mov    %rsp,%rbp
   0x7ffff6cc1034 <_ZN2os13current_frameEv+4>:	push   %r12
   0x7ffff6cc1036 <_ZN2os13current_frameEv+6>:	push   %rbx
   0x7ffff6cc1037 <_ZN2os13current_frameEv+7>:	lea    -0x40(%rbp),%r12
   0x7ffff6cc103b <_ZN2os13current_frameEv+11>:	mov    %rdi,%rbx
   0x7ffff6cc103e <_ZN2os13current_frameEv+14>:	lea    -0x15(%rip),%rdi        # 0x7ffff6cc1030 <_ZN2os13current_frameEv>
   0x7ffff6cc1045 <_ZN2os13current_frameEv+21>:	sub    $0x40,%rsp
   0x7ffff6cc1049 <_ZN2os13current_frameEv+25>:	mov    %rbp,-0x48(%rbp)
   0x7ffff6cc104d <_ZN2os13current_frameEv+29>:	mov    %rbp,-0x40(%rbp)
=> 0x7ffff6cc1051 <_ZN2os13current_frameEv+33>:	movq   0x0,%xmm0

Before JDK-8207011, _get_previous_fp() looked as follows:

intptr_t* _get_previous_fp() {
#ifdef SPARC_WORKS
  ...
#else
  register intptr_t **ebp __asm__ (SPELL_REG_FP);
#endif
  return (intptr_t*) *ebp;   // we want what it points to.
}

And the generated code was:
(gdb) x /16i 'os::current_frame()' 
   0x7ffff6cc17e0 <_ZN2os13current_frameEv>:	push   %rbp
   0x7ffff6cc17e1 <_ZN2os13current_frameEv+1>:	mov    %rsp,%rbp
   0x7ffff6cc17e4 <_ZN2os13current_frameEv+4>:	push   %r12
   0x7ffff6cc17e6 <_ZN2os13current_frameEv+6>:	push   %rbx
   0x7ffff6cc17e7 <_ZN2os13current_frameEv+7>:	lea    -0x40(%rbp),%r12
   0x7ffff6cc17eb <_ZN2os13current_frameEv+11>:	mov    %rdi,%rbx
   0x7ffff6cc17ee <_ZN2os13current_frameEv+14>:	lea    -0x15(%rip),%rdi        # 0x7ffff6cc17e0 <_ZN2os13current_frameEv>
   0x7ffff6cc17f5 <_ZN2os13current_frameEv+21>:	sub    $0x40,%rsp
   0x7ffff6cc17f9 <_ZN2os13current_frameEv+25>:	movq   0x0(%rbp),%xmm0
=> 0x7ffff6cc17fe <_ZN2os13current_frameEv+30>:	mov    %rsp,-0x48(%rbp)
   0x7ffff6cc1802 <_ZN2os13current_frameEv+34>:	movhps -0x48(%rbp),%xmm0
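
The standards background above can be checked at build time. The following is a minimal sketch, not code from the JDK sources; cxx_standard_name and build_cxx_standard are hypothetical names. It maps a __cplusplus value to the three ranges that matter here: before C++11, C++11/14 (register deprecated), and C++17 or later (register no longer a storage class specifier). Independently of the language level, GCC's documentation for -Wregister lists the GNU explicit register variable extension as an exception, so register ... __asm__(...) remains the spelling GCC expects for a variable bound to a specific register.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: names the language level that a given __cplusplus
// value corresponds to, with respect to the status of the 'register' keyword.
inline std::string cxx_standard_name(long cplusplus_value) {
    if (cplusplus_value >= 201703L) return "C++17 or later: 'register' removed";
    if (cplusplus_value >= 201103L) return "C++11/14: 'register' deprecated";
    return "pre-C++11: 'register' allowed";
}

// Hypothetical helper: language level of the translation unit that
// contains this sketch, taken from the predefined __cplusplus macro.
inline std::string build_cxx_standard() {
    assert(__cplusplus >= 199711L);  // sanity check on the predefined macro
    return cxx_standard_name(__cplusplus);
}
```

A build targeting C++98, as described above for Corretto 8, falls into the first range, where removing the keyword has no benefit for standards conformance.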

@eicki
Author

eicki commented Jul 27, 2022

Could this bug also occur in situations other than NativeMemoryTracking?

@navyxliu
Contributor

navyxliu commented Jul 27, 2022

Here is a simpler reproducer:

#include <cstdint>
#define SPELL_REG_FP "rbp"

intptr_t* _get_previous_fp() {
   intptr_t **ebp __asm__ (SPELL_REG_FP);
   return (intptr_t*) *ebp;   // we want what it points to.
}

intptr_t* _get_previous_fp_register() {
   register intptr_t **ebp __asm__ (SPELL_REG_FP);
   return (intptr_t*) *ebp;   // we want what it points to.
}

$ gcc -c test.cc
$ objdump -d test.o

0000000000000000 <_Z16_get_previous_fpv>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 8b 45 f8             mov    -0x8(%rbp),%rax <== ???
   8:   48 8b 00                mov    (%rax),%rax
   b:   5d                      pop    %rbp
   c:   c3                      retq

000000000000000d <_Z25_get_previous_fp_registerv>:
   d:   55                      push   %rbp
   e:   48 89 e5                mov    %rsp,%rbp
  11:   48 89 e8                mov    %rbp,%rax         <== register takes effect: rax is initialized from rbp
  14:   48 8b 00                mov    (%rax),%rax
  17:   5d                      pop    %rbp
  18:   c3                      retq
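
For comparison, a variant that does not depend on the register keyword at all can be sketched with the __builtin_frame_address builtin provided by GCC and Clang. This is only an illustration and assumes a GCC-compatible compiler; it is not the change that shipped in a Corretto release, and both function names are hypothetical. The second function mirrors what _get_previous_fp() does by loading the word stored at the frame address. That word is the caller's frame pointer only if the caller was itself compiled with frame pointers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not HotSpot code): returns the address of this
// function's own frame as an integer. The builtin makes the compiler set
// up a frame for this function, so the value does not depend on a
// register-bound local variable.
__attribute__((noinline)) uintptr_t current_frame_address() {
    void* fp = __builtin_frame_address(0);
    assert(fp != nullptr);
    return reinterpret_cast<uintptr_t>(fp);
}

// Hypothetical counterpart of _get_previous_fp(): loads the word stored at
// this function's frame address. On x86-64 and AArch64 that slot holds the
// frame pointer register value saved by the prologue.
__attribute__((noinline)) intptr_t* previous_fp_via_builtin() {
    intptr_t** fp = static_cast<intptr_t**>(__builtin_frame_address(0));
    return *fp;
}
```

Because the compiler materializes the frame address itself, there is no local variable left whose initialization depends on a storage class specifier.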

@jwalsh2me

We are seeing the same issue with 8u342.

@Rudometov
Contributor

The issue has been fixed in 8.342.07.4, which was released today.

@eicki
Author

eicki commented Jul 28, 2022

Can confirm that the issue is solved, thanks for the quick response.
