Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Liberty Server Hang randomly #24651

Closed
mwangcaibmcom opened this issue Mar 10, 2023 · 4 comments · Fixed by #24660
Closed

Liberty Server Hang randomly #24651

mwangcaibmcom opened this issue Mar 10, 2023 · 4 comments · Fixed by #24660
Labels
Needs member attention release bug This bug is present in a released version of Open Liberty release:23003

Comments

@mwangcaibmcom
Copy link

Describe the bug
Liberty Server hang happened randomly within docker containers which are deployed in k8s cluster in an Integration Pipeline. The javacore dumps were collected to analyze the issue.
In one javacore, one deadlock was identified and all operational threads are blocked due to the deadlock

2LKMONINUSE      sys_mon_t:0x00007FCE3C16F258 infl_mon_t: 0x00007FCE3C16F2D8:
3LKMONOBJECT       org/apache/cxf/jaxrs/interceptor/CachedTime@0x00000000FBF1D0C8: Flat locked by "Default Executor-thread-8" (J9VMThread:0x0000000001B4BF00), entry count 1
3LKWAITERQ            Waiting to enter:
3LKWAITER                "Default Executor-thread-1" (J9VMThread:0x00000000006EB200)
3LKWAITER                "Default Executor-thread-3" (J9VMThread:0x0000000000718D00)
3LKWAITER                "Default Executor-thread-17" (J9VMThread:0x0000000002644B00)
3LKWAITER                "Default Executor-thread-19" (J9VMThread:0x0000000000346F00)
3LKWAITER                "Default Executor-thread-20" (J9VMThread:0x0000000000618300)
3LKWAITER                "Default Executor-thread-29" (J9VMThread:0x0000000002645700)
3LKWAITER                "Default Executor-thread-30" (J9VMThread:0x0000000002643F00)
3LKWAITER                "Default Executor-thread-39" (J9VMThread:0x00000000022FF900)
3LKWAITER                "Default Executor-thread-40" (J9VMThread:0x00000000022DAA00)
3LKWAITER                "Default Executor-thread-49" (J9VMThread:0x000000000216DE00)
3LKWAITER                "Default Executor-thread-50" (J9VMThread:0x00000000022FED00)
3LKWAITER                "Default Executor-thread-59" (J9VMThread:0x0000000001B74900)
3LKWAITER                "Default Executor-thread-60" (J9VMThread:0x0000000002178F00)
3LKWAITER                "Default Executor-thread-62" (J9VMThread:0x0000000001B72300)

"Default Executor-thread-8" J9VMThread:0x0000000001B4BF00, omrthread_t:0x00007FCE18012DF0, java/lang/Thread:0x00000000FBF99E78, state:B, prio=5
       (java/lang/Thread getId:0x4C, isDaemon:true)
       com/ibm/ws/classloading/internal/ThreadContextClassLoader(0x0000000086272FF8)
       (native thread ID:0x1A4, native priority:0x5, native policy:UNKNOWN, vmstate:B, vm thread flags:0x00000281)
       (native stack address range from:0x00007FCEA0FF6000, to:0x00007FCEA1036000, size:0x40000)
      CPU usage total: 3.272702139 secs, current category="Application"
Blocked on: java/lang/StringBuffer@0x00000000FBF99F10 Owned by: "Default Executor-thread-1" (J9VMThread:0x00000000006EB200, java/lang/Thread:0x00000000804DA638)
      Heap bytes allocated since last GC cycle=0 (0x0)
      Java callstack:
          at java/lang/StringBuffer.setLength(Bytecode PC:0(Compiled Code))
             (entered lock: java/lang/StringBuffer@0x00000000FBF99F10, entry count: 1)
          at org/apache/cxf/jaxrs/interceptor/CachedTime.updateTime(CachedTime.java:86)
          at org/apache/cxf/jaxrs/interceptor/CachedTime.getTimeAsString(CachedTime.java:134)
             (entered lock: org/apache/cxf/jaxrs/interceptor/CachedTime@0x00000000FBF1D0C8, entry count: 1)
          at org/apache/cxf/jaxrs/interceptor/JAXRSOutInterceptor.setResponseDate(JAXRSOutInterceptor.java:451)
          at org/apache/cxf/jaxrs/interceptor/JAXRSOutInterceptor.prepareResponseHeaders(JAXRSOutInterceptor.java:304)
          at org/apache/cxf/jaxrs/interceptor/JAXRSOutInterceptor.serializeMessage(JAXRSOutInterceptor.java:171)
          at org/apache/cxf/jaxrs/interceptor/JAXRSOutInterceptor.processResponse(JAXRSOutInterceptor.java:124)
          at org/apache/cxf/jaxrs/interceptor/JAXRSOutInterceptor.handleMessage(JAXRSOutInterceptor.java:85)
          at org/apache/cxf/phase/PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
             (entered lock: org/apache/cxf/phase/PhaseInterceptorChain@0x00000000FBF9A0D0, entry count: 1)
          at org/apache/cxf/interceptor/OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90)
          at org/apache/cxf/phase/PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
             (entered lock: org/apache/cxf/phase/PhaseInterceptorChain@0x00000000FBF9A148, entry count: 1)
          at org/apache/cxf/transport/ChainInitiationObserver.onMessage(ChainInitiationObserver.java:123)
          at org/apache/cxf/transport/http/AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:277)
          at com/ibm/ws/jaxrs20/endpoint/AbstractJaxRsWebEndpoint.invoke(AbstractJaxRsWebEndpoint.java:137)
          at com/ibm/websphere/jaxrs/server/IBMRestServlet.handleRequest(IBMRestServlet.java:146)
          at com/ibm/websphere/jaxrs/server/IBMRestServlet.doDelete(IBMRestServlet.java:131)
          at javax/servlet/http/HttpServlet.service(HttpServlet.java:712)
          at com/ibm/websphere/jaxrs/server/IBMRestServlet.service(IBMRestServlet.java:96)
          at com/ibm/ws/webcontainer/servlet/ServletWrapper.service(ServletWrapper.java:1258)
          at com/ibm/ws/webcontainer/servlet/ServletWrapper.handleRequest(ServletWrapper.java:746)
          at com/ibm/ws/webcontainer/servlet/ServletWrapper.handleRequest(ServletWrapper.java:443)
          at com/ibm/ws/webcontainer/filter/WebAppFilterChain.invokeTarget(WebAppFilterChain.java:193)
          at com/ibm/ws/webcontainer/filter/WebAppFilterChain.doFilter(WebAppFilterChain.java:98)
          at com/ibm/entity/analytics/api/rest/filter/LoggingFilterBase.doFilter(LoggingFilterBase.java:83)
          at com/ibm/ws/webcontainer/filter/FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:201)
          at com/ibm/ws/webcontainer/filter/WebAppFilterChain.doFilter(WebAppFilterChain.java:91)
          at com/ibm/ws/security/jaspi/JaspiServletFilter.doFilter(JaspiServletFilter.java:56)
          at com/ibm/ws/webcontainer/filter/FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:201)
          at com/ibm/ws/webcontainer/filter/WebAppFilterChain.doFilter(WebAppFilterChain.java:91)
          at com/ibm/ws/webcontainer/filter/WebAppFilterManager.doFilter(WebAppFilterManager.java:1004)
          at com/ibm/ws/webcontainer/filter/WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1142)
          at com/ibm/ws/webcontainer/webapp/WebApp.handleRequest(WebApp.java:5078)
          at com/ibm/ws/webcontainer/osgi/DynamicVirtualHost$2.handleRequest(DynamicVirtualHost.java:316)
          at com/ibm/ws/webcontainer/WebContainer.handleRequest(WebContainer.java:1007)
          at com/ibm/ws/webcontainer/osgi/DynamicVirtualHost$2.run(DynamicVirtualHost.java:281)
          at com/ibm/ws/http/dispatcher/internal/channel/HttpDispatcherLink$TaskWrapper.run(HttpDispatcherLink.java:1246)
          at com/ibm/ws/http/dispatcher/internal/channel/HttpDispatcherLink.wrapHandlerAndExecute(HttpDispatcherLink.java:468)
          at com/ibm/ws/http/dispatcher/internal/channel/HttpDispatcherLink.ready(HttpDispatcherLink.java:427)
          at com/ibm/ws/http/channel/internal/inbound/HttpInboundLink.handleDiscrimination(HttpInboundLink.java:567)
          at com/ibm/ws/http/channel/internal/inbound/HttpInboundLink.handleNewRequest(HttpInboundLink.java:501)
          at com/ibm/ws/http/channel/internal/inbound/HttpInboundLink.processRequest(HttpInboundLink.java:361)
          at com/ibm/ws/http/channel/internal/inbound/HttpInboundLink.ready(HttpInboundLink.java:328)
          at com/ibm/ws/channel/ssl/internal/SSLConnectionLink.determineNextChannel(SSLConnectionLink.java:1129)
          at com/ibm/ws/channel/ssl/internal/SSLConnectionLink$MyReadCompletedCallback.complete(SSLConnectionLink.java:678)
          at com/ibm/ws/channel/ssl/internal/SSLReadServiceContext$SSLReadCompletedCallback.complete(SSLReadServiceContext.java:1824)
          at com/ibm/ws/tcpchannel/internal/WorkQueueManager.requestComplete(WorkQueueManager.java:514)
          at com/ibm/ws/tcpchannel/internal/WorkQueueManager.attemptIO(WorkQueueManager.java:584(Compiled Code))
          at com/ibm/ws/tcpchannel/internal/WorkQueueManager.workerRun(WorkQueueManager.java:968)
          at com/ibm/ws/tcpchannel/internal/WorkQueueManager$Worker.run(WorkQueueManager.java:1057)
          at com/ibm/ws/threading/internal/ExecutorServiceImpl$RunnableWrapper.run(ExecutorServiceImpl.java:245(Compiled Code))
          at java/util/concurrent/ThreadPoolExecutor.runWorker(Bytecode PC:94(Compiled Code))
          at java/util/concurrent/ThreadPoolExecutor$Worker.run(Bytecode PC:5)
          at java/lang/Thread.run(Bytecode PC:13)

In another javacore generated in different time when server hang, there were thousands locks of JIT-QueueSlotMonitor-<number> lock like below, which is very abnormal

2LKREGMON          JIT-QueueSlotMonitor-3735 lock (0x00007F28A82383F8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3736 lock (0x00007F28A82384A8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3737 lock (0x00007F28A8238558): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3738 lock (0x00007F28A8238608): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3739 lock (0x00007F28A82386B8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3740 lock (0x00007F28A8238768): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3741 lock (0x00007F28A8238818): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3742 lock (0x00007F28A82388C8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3743 lock (0x00007F28A8238978): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3744 lock (0x00007F28A8238A28): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3745 lock (0x00007F28A8238AD8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3746 lock (0x00007F28A8238B88): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3747 lock (0x00007F28A8238C38): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3748 lock (0x00007F28A8238CE8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3749 lock (0x00007F28A8238D98): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3750 lock (0x00007F28A8238E48): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3751 lock (0x00007F28A8238EF8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3752 lock (0x00007F28A8238FA8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3753 lock (0x00007F28A8239058): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3754 lock (0x00007F28A8239108): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3755 lock (0x00007F28A82391B8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3756 lock (0x00007F28A8239268): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3757 lock (0x00007F28A8239318): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3758 lock (0x00007F28A82393C8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3759 lock (0x00007F28A8239478): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3760 lock (0x00007F28A8239528): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3761 lock (0x00007F28A82395D8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3762 lock (0x00007F28A8239688): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3763 lock (0x00007F28A8239738): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3764 lock (0x00007F28A82397E8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3765 lock (0x00007F28A8239898): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3766 lock (0x00007F28A8239948): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3767 lock (0x00007F28A82399F8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3768 lock (0x00007F28A8239AA8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3769 lock (0x00007F28A8239B58): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3770 lock (0x00007F28A8239C08): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3771 lock (0x00007F28A8239CB8): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3772 lock (0x00007F28A8239D68): <unowned>
2LKREGMON          JIT-QueueSlotMonitor-3773 lock (0x00007F28A8239E18): <unowned>

Steps to Reproduce
The issue happened randomly in a secured environments and no steps to reproduce it

Expected behavior
A clear and concise description of what you expected to happen.

Diagnostic information:

  • OpenLiberty Version: 21.0.0.10
  • Affected feature(s)
  • Java Version: IBM Semeru Runtime Certified Edition 11.0.17.0 (build 11.0.17+8)
  • server.xml configuration (WITHOUT sensitive information like passwords)
  • If it would be useful, upload the messages.log file found in $WLP_OUTPUT_DIR/messages.log

Additional context

@mwangcaibmcom mwangcaibmcom added the release bug This bug is present in a released version of Open Liberty label Mar 10, 2023
@jhanders34
Copy link
Member

For the first deadlock reported in the description it relates to Open J9 issue: eclipse-openj9/openj9#14037. In Liberty I think we can work around this and I have started a PR #24660 to provide a workaround.

@jhanders34
Copy link
Member

This issue was originally reported with issue #12704.

@mwangcaibmcom
Copy link
Author

@jhanders34 can you please let us know at what versions the issue has been fixed. we are currently at version 21.0.0.10. So, we know to what version we need to upgrade

@jhanders34
Copy link
Member

@jhanders34 can you please let us know at what versions the issue has been fixed. we are currently at version 21.0.0.10. So, we know to what version we need to upgrade

Presently Liberty 23.0.0.3 is the next release that will be made available. There isn't a guarantee that the pull request that I merged in will be included in 23.0.0.3. When the build is chosen for 23.0.0.3, and if the pull request is in that build, this issue will be marked with a release:23003 label if it is fixed in that release. I anticipate that to be what happens, but cannot guarantee or promise it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs member attention release bug This bug is present in a released version of Open Liberty release:23003
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants