Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DaaLoadTest BigInteger.toString OutOfMemoryError #19831

Open
pshipton opened this issue Jul 9, 2024 · 4 comments
Open

DaaLoadTest BigInteger.toString OutOfMemoryError #19831

pshipton opened this issue Jul 9, 2024 · 4 comments

Comments

@pshipton
Copy link
Member

pshipton commented Jul 9, 2024

https://hyc-runtimes-jenkins.swg-devops.com/job/Test_openjdk21_j9_special.system_ppc64le_linux_testList_0/188
DaaLoadTest_all_special_5m_17
-Xcompressedrefs -Xgcpolicy:balanced -Xquickstart -Xjit

3x x 3 grinder passed

https://na.artifactory.swg-devops.com/artifactory/sys-rt-generic-local/hyc-runtimes-jenkins.swg-devops.com/Test_openjdk21_j9_special.system_ppc64le_linux_testList_0/188/system_test_output.tar.gz

16:33:07  DLT 16:33:06.663 - Completed 100.0%. Number of tests started=910 (+0)
16:33:13  DLT stderr JVMDUMP055I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError", exception "Java heap space" at 2024/07/08 16:33:12 - please wait.

16:33:24  DLT testFailure: testIntrinsicPerformance(net.openj9.test.arithmetics.TestPerformance): Java heap space
16:33:24  DLT java.lang.OutOfMemoryError: Java heap space
16:33:24  DLT 	at java.base/java.lang.Long.toString(Long.java:494)
16:33:24  DLT 	at java.base/java.lang.Long.toString(Long.java:145)
16:33:24  DLT 	at java.base/java.math.BigInteger.smallToString(BigInteger.java:4173)
16:33:24  DLT 	at java.base/java.math.BigInteger.toString(BigInteger.java:4217)
16:33:24  DLT 	at java.base/java.math.BigInteger.toString(BigInteger.java:4236)
16:33:24  DLT 	at java.base/java.math.BigInteger.toString(BigInteger.java:4112)
16:33:24  DLT 	at java.base/java.math.BigInteger.toString(BigInteger.java:4286)
16:33:24  DLT 	at net.openj9.test.Utils.toPrecision(Utils.java:489)
16:33:24  DLT 	at net.openj9.test.arithmetics.TestPerformance.getComparisionReferenceValue(TestPerformance.java:375)
16:33:24  DLT 	at net.openj9.test.arithmetics.TestPerformance.testGE(TestPerformance.java:315)
16:33:24  DLT 	at net.openj9.test.arithmetics.TestPerformance.testIntrinsicPerformance(TestPerformance.java:85)
@pshipton
Copy link
Member Author

pshipton commented Jul 9, 2024

@dmitripivkine pls take a look

@dmitripivkine
Copy link
Contributor

Adding this item to GC queue.
OOM is triggered by Excessive GC condition. However there are 673 free regions (of 2022 total), 1m each.
This does not look right, we need to look closer.

@dmitripivkine
Copy link
Contributor

There is the problem with Balanced GC - calculation of current Eden size converged to minimum value of two regions and got stuck there. This causes PGCs triggered by Allocation Failure almost back to back up to 95% time GC utilization and Excessive GC condition latch eventually. We are investigating.

@amicic
Copy link
Contributor

amicic commented Jul 9, 2024

Some initial observations.

We seem to be constantly doing PGCs, and that's because Eden is only 2MBs (2 regions of 1MB). We should be expanding the Eden since the heap is 2GB and fully expanded. But for some reason we don't expand.
The core logic for Eden expansion (for the case of fully expanded heap) is in calculateRecommendedEdenChangeForExpandedHeap, which has some non-trivai calculatinos. Unfortuantelly a couple of trace point in that method did not show up in the dump (even though one of them is Level 1!?)

However, there are quite a few relevant parameters in SchedulingDelegate, and most of them look reasonable, except maybe _estimatedFreeTenure of about 8MB, which seems rather low (trace points report around 700MB)

> !mm_schedulingdelegate 0x00007FFF90069338 | grep _estimatedFreeTenure
	0x100: U64 _estimatedFreeTenure = 0x000000000083F133 (8646963) 	
	
20:33:12.362818623  0x0 j9mm.346             Event       PGC end: workstackoverflow=0 overflowcount=0 heapFree=715128832/2120220672
20:33:12.365062888  0x0 j9mm.345             Event       PGC start: globalcount=4335 nextGMPIncrement=0
20:33:12.442877284  0x0 j9mm.346             Event       PGC end: workstackoverflow=0 overflowcount=0 heapFree=714080256/2120220672
20:33:12.444926051  0x0 j9mm.345             Event       PGC start: globalcount=4336 nextGMPIncrement=0

GC does observe high PGC over head (more like 84-89%, but still very high and we should expand Eden)

> !mm_schedulingdelegate 0x00007FFF90069338 | grep _historicalPartialGCTime
	0x128: U64 _historicalPartialGCTime = 0x0000000000000043 (67)
> !mm_schedulingdelegate 0x00007FFF90069338 | grep _averagePgcInterval
	0x180: U64 _averagePgcInterval = 0x00000000000129ED (76269)
	
(lldb) p 67 * 1000.0 / 76269
(double) 0.87846962724042532
          
> !mm_schedulingdelegate 0x00007FFF90069338 | grep _partialGcOverhead
	0x120: double _partialGcOverhead = 0.8466813906537427

Evidence that Eden is stuck at 2 regions:

> !mm_schedulingdelegate 0x00007FFF90069338 | grep -i edenregion
	0x30: U64 _idealEdenRegionCount = 0x0000000000000002 (2)
	0x38: U64 _minimumEdenRegionCount = 0x0000000000000001 (1)
	0x40: U64 _edenRegionCount = 0x0000000000000002 (2)
	0x108: U64 _maxEdenRegionCount = 0x00000000000005EC (1516)
	0x110: U64 _minEdenRegionCount = 0x0000000000000002 (2)

Evidence that heap is fully expanded:

> !MM_GCExtensions 0x00007FFF90059500 |grep memoryMax
	0x5fd8: U64 memoryMax = 0x000000007E600000 (2120220672)

> !mm_schedulingdelegate 0x00007FFF90069338 | grep -i _numberOfHeapRegions
	0x58: U64 _numberOfHeapRegions = 0x00000000000007E6 (2022)

(lldb) p 2022 * 1024 * 1024
(int) 2120220672

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants