Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible memory leak #11251

Open
weetyre opened this issue Feb 20, 2024 · 8 comments
Open

Possible memory leak #11251

weetyre opened this issue Feb 20, 2024 · 8 comments

Comments

@weetyre
Copy link

weetyre commented Feb 20, 2024

I deployed ignite with k8s and allocated 8G memory to it, but the memory usage is up to 90%. I suspect there are two possibilities: one is that the business really needs more direct memory, and the other is that there is really a memory leak. If it's a memory leak, is it an existing problem?

@weetyre
Copy link
Author

weetyre commented Feb 20, 2024

the error msg is here:

2024-02-03T07:00:43,787Z+0000|ERROR|checkpoint-runner-cpu-#91|Log4J2Logger#error:533|Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Unable to write]]
org.apache.ignite.internal.processors.cache.persistence.StorageException: Unable to write
at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle.flush(FsyncFileWriteHandle.java:475) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle.addRecord(FsyncFileWriteHandle.java:267) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:968) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:921) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.beforeReleaseWrite(PageMemoryImpl.java:1867) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1698) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:523) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:515) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeUnlock(PageHandler.java:416) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.DataStructure.writeUnlock(DataStructure.java:276) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.DataStructure.writeUnlock(DataStructure.java:247) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:933) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList$PutBucket.run(PagesList.java:199) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList$PutBucket.run(PagesList.java:173) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:313) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:304) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:415) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:356) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:322) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.lambda$syncMetadata$1(GridCacheOffheapManager.java:299) ~[ignite-core-2.16.0.jar:2.16.0]
at org.apache.ignite.internal.util.IgniteUtils.lambda$wrapIgniteFuture$5(IgniteUtils.java:11930) ~[ignite-core-2.16.0.jar:2.16.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
at java.lang.Thread.run(Unknown Source) ~[?:?]
Caused by: java.io.IOException: java.lang.OutOfMemoryError**: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800)**
... 24 more
Caused by: java.lang.OutOfMemoryError: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800)
at java.nio.Bits.reserveMemory(Unknown Source) ~[?:?]
at java.nio.DirectByteBuffer.(Unknown Source) ~[?:?]
at java.nio.ByteBuffer.allocateDirect(Unknown Source) ~[?:?]
at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle$1.initialValue(FsyncFileWriteHandle.java:133) ~[ignite-core-2.16.0.jar:2.16.0]

@ptupitsyn
Copy link
Contributor

k8s pod may have 8G memory, but the JVM is limited to 2.4G (limit: 2411724800).

Try adjusting Xmx/Xms JVM settings: https://ignite.apache.org/docs/latest/perf-and-troubleshooting/memory-tuning

@shuangquansq
Copy link

shuangquansq commented Mar 6, 2024

Hi, ptupitsyn:
our ignite run with k8s pod. pod have 8G memory and guest host memory is 64G. Our use defaultDataRegionConfiguration, but haven't set the maxsize memory for defaultDataRegionConfiguration. Native persistence.
I see the sorce code:"
/** Fraction of available memory to allocate for default DataRegion. */
private static final double DFLT_DATA_REGION_FRACTION = 0.2;

/** Default data region's size is 20% of physical memory available on current machine. */
public static final long DFLT_DATA_REGION_MAX_SIZE = Math.max(
    (long)(DFLT_DATA_REGION_FRACTION * U.getTotalMemoryAvailable()),
    DFLT_DATA_REGION_INITIAL_SIZE);
"

the interface U.getTotalMemoryAvailable()-->sunOs.getTotalPhysicalMemorySize(), will get the guest host memory 64G 64G*0.2=12.8G.
Although JVM limited the -XX:MaxDirectMemorySize=2300m. But ignite doesn't know it, only know maxsize is 12.8G. So ignite will continue request the memory, until occur OutOfMemoryError.
is it right?

@ptupitsyn
Copy link
Contributor

It is not about data regions or persistence, the error indicates that you hit the JVM heap limit. Try changing Xmx to a higher value.

Also, 8G RAM is too small for an Ignite node in production. Please check the document I've linked above. A good starting point is -Xms10g -Xmx10g, and you need more RAM for data regions, I'd say at least 32G in total per pod.

@shuangquansq
Copy link

hi, ptupitsyn
the error log:
Caused by: java.io.IOException: java.lang.OutOfMemoryError**: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800)**,
is direct memory for off-heap, not he heap memory. Our jvm paramter:
-XX:MaxRAMPercentage=35.0
-Xms1g
-XX:MetaspaceSize=128m
-XX:MaxMetaspaceSize=512m
-XX:MaxDirectMemorySize=2300m
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
-XX:G1ReservePercent=15
-XX:InitiatingHeapOccupancyPercent=45
-XX:+ExplicitGCInvokesConcurrent
-XX:+AlwaysPreTouch
-XX:AutoBoxCacheMax=20000
-XX:+ScavengeBeforeFullGC
-XX:+UseStringDeduplication
-XX:MaxHeapFreeRatio=40

-XX:MaxDirectMemorySize=2300m is equal 2411724800.
The memory usage usual over 90%. 80.9=7.2G。
jvm heap:8G
35%=2.8G
jvm meta:128+512=640M=0.64G
jvm directMemory=2.3G
Total is 5.74. The memory 7.2-5.74=1.46G, we don't know where it used. Maybe happen memory leak.
Thanks.

@ptupitsyn
Copy link
Contributor

There are no known memory leaks in Ignite. Please adjust JVM settings as explained above.

@qybf
Copy link

qybf commented Mar 29, 2024

the error msg is here:

2024-02-03T07:00:43,787Z+0000|ERROR|checkpoint-runner-cpu-#91|Log4J2Logger#error:533|Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Unable to write]] org.apache.ignite.internal.processors.cache.persistence.StorageException: Unable to write at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle.flush(FsyncFileWriteHandle.java:475) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle.addRecord(FsyncFileWriteHandle.java:267) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:968) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:921) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.beforeReleaseWrite(PageMemoryImpl.java:1867) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1698) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:523) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:515) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeUnlock(PageHandler.java:416) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.DataStructure.writeUnlock(DataStructure.java:276) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.DataStructure.writeUnlock(DataStructure.java:247) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:933) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList$PutBucket.run(PagesList.java:199) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList$PutBucket.run(PagesList.java:173) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:313) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:304) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:415) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:356) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:322) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.lambda$syncMetadata$1(GridCacheOffheapManager.java:299) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.util.IgniteUtils.lambda$wrapIgniteFuture$5(IgniteUtils.java:11930) ~[ignite-core-2.16.0.jar:2.16.0] at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?] at java.lang.Thread.run(Unknown Source) ~[?:?] Caused by: java.io.IOException: java.lang.OutOfMemoryError**: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800)** ... 24 more Caused by: java.lang.OutOfMemoryError: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800) at java.nio.Bits.reserveMemory(Unknown Source) ~[?:?] at java.nio.DirectByteBuffer.(Unknown Source) ~[?:?] at java.nio.ByteBuffer.allocateDirect(Unknown Source) ~[?:?] at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle$1.initialValue(FsyncFileWriteHandle.java:133) ~[ignite-core-2.16.0.jar:2.16.0]

I also encountered a similar problem with you, I was deployed on a Linux host single node ignite, also appeared out of mem. But I feel that my heap memory is useless, all with the heap memory, according to the error with the obvious heap memory is not enough, I am very confused.

@qybf
Copy link

qybf commented Mar 29, 2024

the error msg is here:
2024-02-03T07:00:43,787Z+0000|ERROR|checkpoint-runner-cpu-#91|Log4J2Logger#error:533|Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=CRITICAL_ERROR, err=class o.a.i.i.processors.cache.persistence.StorageException: Unable to write]] org.apache.ignite.internal.processors.cache.persistence.StorageException: Unable to write at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle.flush(FsyncFileWriteHandle.java:475) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle.addRecord(FsyncFileWriteHandle.java:267) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:968) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.log(FileWriteAheadLogManager.java:921) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.beforeReleaseWrite(PageMemoryImpl.java:1867) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlockPage(PageMemoryImpl.java:1698) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:523) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.writeUnlock(PageMemoryImpl.java:515) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writeUnlock(PageHandler.java:416) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.DataStructure.writeUnlock(DataStructure.java:276) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.DataStructure.writeUnlock(DataStructure.java:247) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.put(PagesList.java:933) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList$PutBucket.run(PagesList.java:199) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList$PutBucket.run(PagesList.java:173) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.writePage(PageHandler.java:313) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.DataStructure.write(DataStructure.java:304) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.flushBucketsCache(PagesList.java:415) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:356) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:322) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.lambda$syncMetadata$1(GridCacheOffheapManager.java:299) ~[ignite-core-2.16.0.jar:2.16.0] at org.apache.ignite.internal.util.IgniteUtils.lambda$wrapIgniteFuture$5(IgniteUtils.java:11930) ~[ignite-core-2.16.0.jar:2.16.0] at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?] at java.lang.Thread.run(Unknown Source) ~[?:?] Caused by: java.io.IOException: java.lang.OutOfMemoryError**: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800)** ... 24 more Caused by: java.lang.OutOfMemoryError: Cannot reserve 131072 bytes of direct buffer memory (allocated: 2411594182, limit: 2411724800) at java.nio.Bits.reserveMemory(Unknown Source) ~[?:?] at java.nio.DirectByteBuffer.(Unknown Source) ~[?:?] at java.nio.ByteBuffer.allocateDirect(Unknown Source) ~[?:?] at org.apache.ignite.internal.processors.cache.persistence.wal.filehandle.FsyncFileWriteHandle$1.initialValue(FsyncFileWriteHandle.java:133) ~[ignite-core-2.16.0.jar:2.16.0]

I also encountered a similar problem with you, I was deployed on a Linux host single node ignite, also appeared out of mem. But I feel that my heap memory is useless, all with the heap memory, according to the error with the obvious heap memory is not enough, I am very confused.

So I feel that xms setting is so large that it is useless, mainly to set-XX:MaxDirectMemorySize very large. -Xms2G -Xmx4G -XX:MaxDirectMemorySize=6G is much better

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants