SIGSEGV on ObjectSynchronizer::FastHashCode #831

Closed
svenrienstra opened this issue Jun 29, 2023 · 6 comments
Labels
bug (Something isn't working), stale (Waiting on OP)

Comments

svenrienstra commented Jun 29, 2023

Please provide a brief summary of the bug

We have now experienced a JVM crash in our cluster multiple times. We haven't been able to establish a pattern for when it happens or what triggers it. The crash report is:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f269de9b474, pid=1, tid=600
#
# JRE version: OpenJDK Runtime Environment Temurin-17.0.7+7 (17.0.7+7) (build 17.0.7+7)
# Java VM: OpenJDK 64-Bit Server VM Temurin-17.0.7+7 (17.0.7+7, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xe34474]  ObjectSynchronizer::FastHashCode(Thread*, oopDesc*)+0x184

hs_err_pid1.log

Please provide steps to reproduce where possible

I've not been able to reproduce this manually.

Expected Results

A graceful exception rather than a crash of the whole JVM.

Actual Results

JVM crash

What Java Version are you using?

eclipse-temurin:17-alpine docker image

What is your operating system and platform?

eclipse-temurin:17-alpine running on Kubernetes cluster

How did you install Java?

No response

Did it work before?

No response

Did you test with the latest update version?

No response

Did you test with other Java versions?

No response

Relevant log output

Current thread (0x00007f2659a640a0):  JavaThread "http-nio-8880-exec-1" daemon [_thread_in_vm, id=600, stack(0x00007f263dfe8000,0x00007f263e0e8aa8)]

Stack: [0x00007f263dfe8000,0x00007f263e0e8aa8],  sp=0x00007f263e0e4e10,  free space=1011k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0xe34474]  ObjectSynchronizer::FastHashCode(Thread*, oopDesc*)+0x184
V  [libjvm.so+0x90432e]  JVM_IHashCode+0x9e
J 32297  java.lang.System.identityHashCode(Ljava/lang/Object;)I java.base@17.0.7 (0 bytes) @ 0x00007f268f78859a [0x00007f268f7884a0+0x00000000000000fa]
J 63004 c2 com.blogspot.mydailyjava.weaklockfree.AbstractWeakConcurrentMap.containsKey(Ljava/lang/Object;)Z (46 bytes) @ 0x00007f268f203b14 [0x00007f268f203520+0x00000000000005f4]
J 117851 c2 co.elastic.apm.agent.loginstr.reformatting.AbstractEcsReformattingHelper.onAppendEnter(Ljava/lang/Object;)Z (124 bytes) @ 0x00007f26945aa698 [0x00007f26945aa580+0x0000000000000118]
j  co.elastic.apm.agent.jul.reformatting.JulConsoleHandlerPublishAdvice.initializeReformatting(Ljava/util/logging/ConsoleHandler;)Z+4
J 121708 c2 java.util.logging.Logger.log(Ljava/util/logging/LogRecord;)V java.logging@17.0.7 (153 bytes) @ 0x00007f2694acacdc [0x00007f2694acab40+0x000000000000019c]
J 96765 c1 java.util.logging.Logger.doLog(Ljava/util/logging/LogRecord;)V java.logging@17.0.7 (50 bytes) @ 0x00007f268a2ce814 [0x00007f268a2ce560+0x00000000000002b4]
J 64547 c2 java.util.logging.Logger.log(Ljava/util/logging/Level;Ljava/lang/String;)V java.logging@17.0.7 (25 bytes) @ 0x00007f268fce398c [0x00007f268fce3660+0x000000000000032c]

Register to memory mapping:

RAX=0x00000000069da536 is an unknown value
RBX=0x000000069da53680 is pointing into object: [Ljava.util.logging.Handler; 
{0x000000069da53670} - klass: 'java/util/logging/Handler'[]
 - length: 1
RCX=3313425028 is a compressed pointer to object: java.lang.ThreadLocal$ThreadLocalMap 
{0x000000062bf6d420} - klass: 'java/lang/ThreadLocal$ThreadLocalMap'
 - ---- fields (total size 3 words):
 - private 'size' 'I' @12  76 (4c)
 - private 'threshold' 'I' @16  170 (aa)
 - private 'table' '[Ljava/lang/ThreadLocal$ThreadLocalMap$Entry;' @20  a 'java/lang/ThreadLocal$ThreadLocalMap$Entry'[256] {0x00000006333e3e70} (c667c7ce)
RDX=0x0000000000000006 is an unknown value
RSP=0x00007f263e0e4e10 is pointing into the stack for thread: 0x00007f2659a640a0
RBP=0x00007f263e0e4e60 is pointing into the stack for thread: 0x00007f2659a640a0
RSI=0x63697461c1c33484 is an unknown value
RDI=0x00007f2659a640a0 is a thread
R8 =0x00007f2683eeeb6a points into unknown readable memory: 00 ff ff ff ff 00
R9 =3269675602 is a compressed pointer to object: co.elastic.apm.agent.weakconcurrent.CachedLookupKey$1 
{0x00000006171a5290} - klass: 'co/elastic/apm/agent/weakconcurrent/CachedLookupKey$1'
 - ---- fields (total size 2 words):
 - private final 'threadLocalHashCode' 'I' @12  -1996036213 (8906e78b)
R10=0x00007f268f788527 is at entry_point+135 in (nmethod*)0x00007f268f788310
R11=0x0000000000000006 is an unknown value
R12=0x00000000d52903e6 is an unknown value
R13=0xffffff80000000ff is an unknown value
R14=0x000000069da53680 is pointing into object: [Ljava.util.logging.Handler; 
{0x000000069da53670} - klass: 'java/util/logging/Handler'[]
 - length: 1
R15=0x00007f2659a640a0 is a thread
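
For readers following the native frames above: the path `containsKey` → `System.identityHashCode` → `JVM_IHashCode` is what an identity-keyed weak map does on every lookup. The following is a minimal sketch of that pattern only. It is not the code of the `weaklockfree` library or of the Elastic agent, and the class and method names (`IdentityWeakMapSketch`, `WeakKey`, `add`) are hypothetical. It does not reproduce the crash and makes no claim about its cause.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a set-like map that keys on object identity
// and holds its keys weakly.
public class IdentityWeakMapSketch {

    // Wrapper that hashes by identity and compares referents by reference.
    static final class WeakKey extends WeakReference<Object> {
        private final int hash;

        WeakKey(Object referent) {
            super(referent);
            // This call is the one that appears in the trace as
            // java.lang.System.identityHashCode.
            this.hash = System.identityHashCode(referent);
        }

        @Override
        public int hashCode() {
            return hash;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof WeakKey)) {
                return false;
            }
            Object mine = get();
            // A cleared reference never matches another key.
            return mine != null && mine == ((WeakKey) other).get();
        }
    }

    private final ConcurrentHashMap<WeakKey, Boolean> map = new ConcurrentHashMap<>();

    void add(Object key) {
        map.put(new WeakKey(key), Boolean.TRUE);
    }

    // Every lookup builds a temporary key, so every lookup computes
    // the identity hash of the object being looked up.
    boolean containsKey(Object key) {
        return map.containsKey(new WeakKey(key));
    }

    public static void main(String[] args) {
        IdentityWeakMapSketch seen = new IdentityWeakMapSketch();
        Object first = new String("handler");
        Object second = new String("handler"); // equal content, different identity
        seen.add(first);
        System.out.println(seen.containsKey(first));
        System.out.println(seen.containsKey(second));
    }
}
```

Because the lookup depends on identity rather than `equals`, two objects with the same content are treated as different keys, which is why such maps go through `System.identityHashCode` instead of `Object.hashCode`.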
svenrienstra added the bug label Jun 29, 2023
@karianna
Contributor

@svenrienstra This is coming from the elastic APM agent. Are you able to remove that and see if the crash still occurs?

@svenrienstra
Author

@karianna we've already disabled the specific instrumentation that caused it (log-reformatting) for now. We'd prefer not to remove the whole agent, because we don't want to run production without APM information. Due to the unpredictable nature of when this happens (twice this week, after a week without any crash), it'll take a while to see whether this has any effect.

@karianna
Contributor

OK. Those agents tend to weave bytecode or call into lower-level stacks, so compiler crashes can occur. I'd also report it to them.

@karianna
Contributor

Also see #775

@svenrienstra
Author

Thanks @karianna, I'll follow that as well, although we're using G1, not ZGC. I've also opened a ticket with Elastic: https://discuss.elastic.co/t/jvm-crash-originating-from-apm-agent/337265

@github-actions

We are marking this issue as stale because it has not been updated for a while. This is just a way to keep the support issues queue manageable.
It will be closed soon unless the stale label is removed by a committer, or a new comment is made.


2 participants