Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenJDK java/nio/Buffer/Chew OutOfMemoryError: Direct buffer memory #17723

Open
pshipton opened this issue Jun 30, 2023 · 29 comments
Open

OpenJDK java/nio/Buffer/Chew OutOfMemoryError: Direct buffer memory #17723

pshipton opened this issue Jun 30, 2023 · 29 comments

Comments

@pshipton
Copy link
Member

pshipton commented Jun 30, 2023

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/2/
jdk11_tier1_buffer_0
java/nio/Buffer/Chew.java

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/2/openjdk_test_output.tar.gz

01:24:23  java.lang.OutOfMemoryError: Direct buffer memory
01:24:23  	at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
01:24:23  	at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
01:24:23  	at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
01:24:23  	at Chew.main(Chew.java:39)
01:24:23  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
01:24:23  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
01:24:23  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
01:24:23  	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
01:24:23  	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
01:24:23  	at java.base/java.lang.Thread.run(Thread.java:839)
@pshipton
Copy link
Member Author

@dmitripivkine fyi

@dmitripivkine
Copy link
Contributor

There is no stored result unfortunately.

@pshipton
Copy link
Member Author

pshipton commented Jul 20, 2023

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.openjdk_s390x_linux_Release_testList_0/4 - ub20-390-6
jdk11_tier1_buffer_1 -XX:-UseCompressedOops

No diagnostics. We'll need to make a change to cause diagnostics to occur. There is no -Xdump option specified when running the test, we are using the default options. The default options create diagnostics on systhrow OOM, however the OOM is thrown from Java code in this case.

@pshipton
Copy link
Member Author

pshipton commented Aug 3, 2023

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/24/ - rh7-390-3
jdk11_tier1_buffer_1
java/nio/Buffer/Chew.java

06:00:00  java.lang.OutOfMemoryError: Direct buffer memory
06:00:00  	at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
06:00:00  	at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
06:00:00  	at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
06:00:00  	at Chew.main(Chew.java:39)

@pshipton
Copy link
Member Author

pshipton commented Aug 3, 2023

Added -Xdump option to the test.
ibmruntimes/openj9-openjdk-jdk11@openj9...pshipton:chew

10x grinder passed
5 x 100x grinder https://openj9-jenkins.osuosl.org/job/Grinder/2643/

@pshipton
Copy link
Member Author

pshipton commented Aug 3, 2023

Reproduction in https://openj9-jenkins.osuosl.org/job/Grinder_iteration_3/245/ - rh7-390-2
https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder_iteration_3/245/openjdk_test_output.tar.gz
There are 2 core files in aqa-tests\TKG\output_16910741528697\jdk_custom_0_ITER_40\work\java\nio\Buffer\Chew\

@dmitripivkine pls take a look.

@dmitripivkine
Copy link
Contributor

JRE: 858314328 bytes / 5273 allocations
|
+--VM: 567679336 bytes / 4995 allocations
|  |
|  +--Classes: 328093072 bytes / 267 allocations
|  |  |
|  |  +--Shared Class Cache: 314572896 bytes / 2 allocations
|  |  |
|  |  +--Other: 13520176 bytes / 265 allocations
|  |
|  +--Modules: 553176 bytes / 2208 allocations
|  |
|  +--Memory Manager (GC): 17780856 bytes / 495 allocations
|  |  |
|  |  +--Java Heap: 16838656 bytes / 1 allocations
|  |  |
|  |  +--Other: 942200 bytes / 494 allocations
|  |
|  +--Threads: 11785248 bytes / 172 allocations
|  |  |
|  |  +--Java Stack: 136408 bytes / 19 allocations
|  |  |
|  |  +--Native Stack: 11403264 bytes / 20 allocations
|  |  |
|  |  +--Other: 245576 bytes / 133 allocations
|  |
|  +--Trace: 199056 bytes / 271 allocations
|  |
|  +--JVMTI: 17776 bytes / 13 allocations
|  |
|  +--JNI: 40448 bytes / 65 allocations
|  |
|  +--Port Library: 208041888 bytes / 79 allocations
|  |  |
|  |  +--Unused <32bit allocation regions: 208030072 bytes / 1 allocations
|  |  |
|  |  +--Other: 11816 bytes / 78 allocations
|  |
|  +--Other: 1167816 bytes / 1425 allocations
|
+--JIT: 274892208 bytes / 244 allocations
|  |
|  +--JIT Code Cache: 268435456 bytes / 1 allocations
|  |
|  +--JIT Data Cache: 4194432 bytes / 2 allocations
|  |
|  +--Other: 2262320 bytes / 241 allocations
|
+--Class Libraries: 15742784 bytes / 34 allocations
   |
   +--VM Class Libraries: 15742784 bytes / 34 allocations
      |
      +--sun.misc.Unsafe: 15732496 bytes / 22 allocations
         |
         +--Direct Byte Buffers: 15730368 bytes / 21 allocations
         |
         +--Other: 2128 bytes / 1 allocations
      |
      +--Other: 10288 bytes / 12 allocations

@dmitripivkine
Copy link
Contributor

@pshipton Am I reading this correctly that DBB allocation itself is not a problem (15730368 bytes) comparing with other native memory takers (SC, JIT Code Cache)? And, of course, is native memory under 1G is enough to run test properly (container I guess)?
There is nothing wrong in the core at the first look. I can dig deeper if you think there is a reason.

@dmitripivkine
Copy link
Contributor

There are 15 jdk/internal/ref/Cleaner objects in Cleared State (each responsible for 1M DBB) waiting for Finalization in the queue.

@pshipton
Copy link
Member Author

pshipton commented Aug 3, 2023

@dmitripivkine I believe the test is confirming that unreferenced DBB allocations are freed. The test is running with -Xmx16m which also affects the artificial limit for DBB allocations https://eclipse.dev/openj9/docs/xxmaxdirectmemorysize/ . The test is allocating 64MB in a loop which is well over the limit so the unreferenced allocations need to freed.

I'm not sure why we are getting these intermittent failures on zlinux when it doesn't occur on other platforms, and it just recently started showing up. It may be worth a deeper look. Perhaps it's related to the JIT on this platform keeping older references alive. I can look for the JIT option that more aggressively releases references and try a grinder with the option.

@pshipton
Copy link
Member Author

pshipton commented Aug 3, 2023

Start with a grinder with -Xint
6 x 150x https://openj9-jenkins.osuosl.org/job/Grinder/2644/

@dmitripivkine
Copy link
Contributor

dmitripivkine commented Aug 3, 2023

Perhaps it's related to the JIT on this platform keeping older references alive.

I don't think so. There are 15 (keeping 1M DBB each) objects in the Finalization queue (so, seems all of them). There are in Cleared state, so have been processed by GC. There is a question why Finalization is not running? May be there were no recourses to run Finalization thread? There were two SystemGC() back to back.

@dmitripivkine
Copy link
Contributor

dmitripivkine commented Aug 3, 2023

There is a lot of compilation activity after second SystemGC. May be it is where computational resources are going to?

@pshipton
Copy link
Member Author

pshipton commented Aug 3, 2023

Could be. The code is running this loop https://github.ibm.com/runtimes/openj9-openjdk-jdk11/blob/ibm_sdk/src/java.base/share/classes/java/nio/Bits.java#L156-L172 which is calling Reference.waitForReferenceProcessingImpl() and Thread.sleep() if that returns false.

If there doesn't appear to be any problem then I'll just start ignoring the failures.

@pshipton
Copy link
Member Author

pshipton commented Aug 3, 2023

There was a failure with -Xint which rules out compilation activity being the problem. Could waitForReferenceProcessingImpl() be returning the wrong value?

https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/328/ on https://openj9-jenkins.osuosl.org/computer/rh7-390-2

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder_iteration_1/328/openjdk_test_output.tar.gz

@dmitripivkine
Copy link
Contributor

Could waitForReferenceProcessingImpl() be returning the wrong value?

There is an implementation:

UDATA j9gc_wait_for_reference_processing(J9JavaVM *vm)
{
	UDATA ret = 0;
	if (NULL != vm->processReferenceMonitor) {
		omrthread_monitor_enter(vm->processReferenceMonitor);
		if (0 != vm->processReferenceActive) {
			omrthread_monitor_wait(vm->processReferenceMonitor);
			ret = 1;
		}
		omrthread_monitor_exit(vm->processReferenceMonitor);
	}
	return ret;
}

@dmitripivkine
Copy link
Contributor

Hm, processReferenceActive is 0.

> !j9javavm 0x000003FF940191D0 | grep processReference
	0x18f8: struct J9ThreadMonitor* processReferenceMonitor = !j9threadmonitor 0x000003FF94087178
	0x1900: UDATA processReferenceActive = 0x0000000000000000 (0)

@pshipton
Copy link
Member Author

pshipton commented Aug 3, 2023

Maybe the finalization thread missed a notification so it doesn't run?

@pshipton
Copy link
Member Author

pshipton commented Aug 3, 2023

@tobi fyi

@tajila
Copy link
Contributor

tajila commented Sep 5, 2023

If we dont see this issue again in the next couple of days we can close it now that the System.gc() changes are in.

@pshipton
Copy link
Member Author

pshipton commented Sep 5, 2023

The System.gc() changes are #18044

@pshipton
Copy link
Member Author

pshipton commented Sep 5, 2023

Not seen lately, closing for now.

@pshipton pshipton closed this as completed Sep 5, 2023
@pshipton
Copy link
Member Author

pshipton commented Dec 4, 2023

https://openj9-jenkins.osuosl.org/job/Test_openjdk21_j9_sanity.openjdk_s390x_linux_Nightly_testList_0/109/

21:48:33  java.lang.OutOfMemoryError: Cannot reserve 1048576 bytes of direct buffer memory (allocated: 15728688, limit: 16777216)
21:48:33  	at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
21:48:33  	at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:132)
21:48:33  	at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:360)
21:48:33  	at Chew.main(Chew.java:39)

@pshipton
Copy link
Member Author

https://openj9-jenkins.osuosl.org/job/Test_openjdk23_j9_sanity.openjdk_s390x_linux_Nightly_testList_1/75

05:18:16  java.lang.OutOfMemoryError: Cannot reserve 1048576 bytes of direct buffer memory (allocated: 15728688, limit: 16777216)
05:18:16  	at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
05:18:16  	at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:116)
05:18:16  	at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:363)
05:18:16  	at Chew.main(Chew.java:39)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants