Skip to content

Bugfix for NPE when lazyLoading in findEach due to Garbage Collection#3046

Merged
rbygrave merged 2 commits intoebean-orm:masterfrom
FOCONIS:bugfix/lazy_loading_findEach_garbage_collection
Jun 16, 2023
Merged

Bugfix for NPE when lazyLoading in findEach due to Garbage Collection#3046
rbygrave merged 2 commits intoebean-orm:masterfrom
FOCONIS:bugfix/lazy_loading_findEach_garbage_collection

Conversation

@jonasPoehler
Copy link
Copy Markdown
Contributor

Hello Rob,

I have to admit, that for this issue to occur, some special cases have to come together, but the provided test somewhat reproduces it and I will do my best to try and explain what seems to be going on here.

We started seeing exceptions, like the one at the very end (taken from the failing test), in our application but didn't really know where they came from and how to reproduce them, since restarting that part of code did not reproduce the error. However one idea came to mind, which seems to be the case here:

  • Run a query using findEach, but we are only interested in some of the delivered results.
  • The beans are then put into the persistenceContext where they are stored in BeanRef-Objects, which are weakly referenced, so the garbage collection can clean up non-referenced beans from the findEach
  • Now we trigger a lazy load on some of the fetched beans (in this case just the size of a M2M relation)
  • Since we are running in a findEach, ebean fetches these relations in a batch of 10 and tries to populate all of these 10 beans with the fetched information
  • If however the garbage collection runs between the construction of the query (determination of which ids to fetch) and the population of the beans and clears at least one of these unreferenced beans, the lookup in the persistenceContext returns null, resulting in ebean trying to fill a property on a non existing bean.

The provided test sometimes shows this behaviour, although it does not reliably reproduce it. I sometimes had to manually start it multiple times until it fails (doesn't seem to fail when using IntelliJ's "Run until failure" mode).
Our suggested fix is to just ignore these cases, since the beans are not hard referenced anywhere and thus are assumed to not be required. The alternative solution would be hold a hard reference to these beans when fetching information on them (we did this in FOCONIS@8c77aef), however this seems like unnecessarily keeping things in heap which should be cleaned up anyway.

What are your thoughts on this matter?

All the best
Jonas

java.lang.RuntimeException: get manybs on [org.tests.m2m.softdelete.MsManyA] type[null] threw error.

	at io.ebeaninternal.server.deploy.BeanProperty.getValue(BeanProperty.java:771)
	at io.ebeaninternal.server.deploy.BeanPropertyAssocMany.beanCollection(BeanPropertyAssocMany.java:244)
	at io.ebeaninternal.server.deploy.BeanPropertyAssocMany.addBeanToCollectionWithCreate(BeanPropertyAssocMany.java:234)
	at io.ebeaninternal.server.query.CQuery.setLazyLoadedChildBean(CQuery.java:388)
	at io.ebeaninternal.server.query.SqlTreeLoadBean$Load.complete(SqlTreeLoadBean.java:339)
	at io.ebeaninternal.server.query.SqlTreeLoadBean$Load.perform(SqlTreeLoadBean.java:365)
	at io.ebeaninternal.server.query.SqlTreeLoadBean.load(SqlTreeLoadBean.java:382)
	at io.ebeaninternal.server.query.SqlTreeLoadRoot.load(SqlTreeLoadRoot.java:26)
	at io.ebeaninternal.server.query.CQuery.readNextBean(CQuery.java:415)
	at io.ebeaninternal.server.query.CQuery.hasNext(CQuery.java:495)
	at io.ebeaninternal.server.query.CQuery.readCollection(CQuery.java:524)
	at io.ebeaninternal.server.query.CQueryEngine.findMany(CQueryEngine.java:350)
	at io.ebeaninternal.server.query.DefaultOrmQueryEngine.findMany(DefaultOrmQueryEngine.java:122)
	at io.ebeaninternal.server.core.OrmQueryRequest.findList(OrmQueryRequest.java:407)
	at io.ebeaninternal.server.core.DefaultServer.findList(DefaultServer.java:1404)
	at io.ebeaninternal.server.core.DefaultServer.findList(DefaultServer.java:1383)
	at io.ebeaninternal.server.core.DefaultBeanLoader.executeQuery(DefaultBeanLoader.java:168)
	at io.ebeaninternal.server.core.DefaultBeanLoader.loadMany(DefaultBeanLoader.java:40)
	at io.ebeaninternal.server.core.DefaultServer.loadMany(DefaultServer.java:460)
	at io.ebeaninternal.server.loadcontext.DLoadManyContext$LoadBuffer.loadMany(DLoadManyContext.java:211)
	at io.ebean.common.AbstractBeanCollection.lazyLoadCollection(AbstractBeanCollection.java:90)
	at io.ebean.common.BeanList.init(BeanList.java:136)
	at io.ebean.common.BeanList.size(BeanList.java:449)
	at org.tests.query.TestQueryFindEachHeapPressure.lambda$test$0(TestQueryFindEachHeapPressure.java:60)
	at io.ebeaninternal.server.core.OrmQueryRequest.findEach(OrmQueryRequest.java:365)
	at io.ebeaninternal.server.core.DefaultServer.findEach(DefaultServer.java:1342)
	at io.ebeaninternal.server.querydefn.DefaultOrmQuery.findEach(DefaultOrmQuery.java:1454)
	at org.tests.query.TestQueryFindEachHeapPressure.test(TestQueryFindEachHeapPressure.java:56)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:725)
	at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
	at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:149)
	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:140)
	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:84)
	at org.junit.jupiter.engine.execution.ExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(ExecutableInvoker.java:115)
	at org.junit.jupiter.engine.execution.ExecutableInvoker.lambda$invoke$0(ExecutableInvoker.java:105)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
	at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:104)
	at org.junit.jupiter.engine.execution.ExecutableInvoker.invoke(ExecutableInvoker.java:98)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:214)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:210)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:135)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:66)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35)
	at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
	at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:107)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:88)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:54)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:67)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:52)
	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
	at org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
	at org.junit.platform.launcher.core.SessionPerRequestLauncher.execute(SessionPerRequestLauncher.java:53)
	at com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:57)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
	at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:30)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:232)
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:55)
Caused by: java.lang.NullPointerException: Cannot invoke "io.ebean.bean.EntityBean._ebean_getField(int)" because "bean" is null
	at io.ebeaninternal.server.properties.EnhanceBeanPropertyAccess$Getter.get(EnhanceBeanPropertyAccess.java:57)
	at io.ebeaninternal.server.deploy.BeanProperty.getValue(BeanProperty.java:767)
	... 96 more

@rob-bygrave
Copy link
Copy Markdown
Contributor

I think comes back to the design question around use of weak refs here for ToMany due to @Inheritance and dealing with batch loading boundaries for that. @rPraml have you looked at this?

@rPraml
Copy link
Copy Markdown
Contributor

rPraml commented May 10, 2023

@rob-bygrave Yes, I developed the fix with Jonas. It took some time until we understand what happens here. In short words

  • you need to use findEach (so that weakRefs for beans in persistence-context and and for beancollections in LoadManyBuffer are used)
  • A batch load for collections must occur (LoadManyRequest.parentIdList extracts the IDs from the weak referenced ownerBeans and weak referenced BeanCollections)
  • execute the query for the required IDs
  • The garbage collection must run NOW - it might throw out beanCollections, respective beans from the persistence context
  • when we try to distribute the result, we might not get the bean in the PC

As the Workaround showed us, it helps, when we hold the beans, so that they cannot be GC'ed. But we find it better, if we just discard the query result.

@rbygrave
Copy link
Copy Markdown
Member

I think we need to go back to the design around using weakRefs for beancollections in LoadManyBuffer and see if there is another design choice available there.

@rPraml
Copy link
Copy Markdown
Contributor

rPraml commented May 17, 2023

We introduced weak refs with #2936 - Here I discovered, that the buffers could build long "chains", that keeps everything in memory and causing an OOM. This happens, if beans are skipped, due inheritance.

I think we need to go back to the design around using weakRefs for beancollections in LoadManyBuffer and see if there is another design choice available there.

Yes, as mentioned in #2936, the OOM will not occor, if the buffers are aligned, so that we can keep hard references to BeanCollection.
Maybe a LoadBuffer can do dummy reads or some padding, if inheritance is involved.

Fortunately, we have a (manual) tests to verify a new implementation ;)

@rob-bygrave do you have an estimated timeline and a plan, how to proceed here in short and long term?
I am also thinking of others which might run in this problem:

@rbygrave
Copy link
Copy Markdown
Member

Ok, I think the plan will be to merge this in first, and then ponder the longer term options.

In long term: Remove weak refs and use padding/dummy reads/whatever in LoadBuffers?

Yes, revisit the original problem and see if padding is an option. A more brutal option is to actually move away from supporting inheritance in Ebean (and instead leave that up to the application side).

@rbygrave rbygrave added this to the 13.19.1 milestone Jun 16, 2023
@rbygrave rbygrave added the bug label Jun 16, 2023
@rbygrave rbygrave merged commit 9fc1bec into ebean-orm:master Jun 16, 2023
@rPraml rPraml deleted the bugfix/lazy_loading_findEach_garbage_collection branch August 10, 2023 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants