Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change the PersistenceContext default resetLimit for iteration queries to 100 (from 1000) #2407

Closed
wants to merge 1 commit into from

Conversation

rbygrave
Copy link
Member

@rbygrave rbygrave commented Oct 7, 2021

FYI @rPraml

Changes the resetLimit from 1000 to 100 for iteration queries (findEach, findIterate, findStream).

Then with TestPersistenceContext.findWithGcTest() we see

Total instances: 500, instances left in memory: 99
Total instances: 1000, instances left in memory: 99
Total instances: 1500, instances left in memory: 99
Total instances: 2000, instances left in memory: 99
Total instances: 2500, instances left in memory: 99
Total instances: 3000, instances left in memory: 99
Total instances: 3500, instances left in memory: 99
Total instances: 4000, instances left in memory: 99
Total instances: 4500, instances left in memory: 99
Total instances: 5000, instances left in memory: 99
Total instances: 5500, instances left in memory: 99
Total instances: 6000, instances left in memory: 99
Total instances: 6500, instances left in memory: 99
Total instances: 7000, instances left in memory: 99
Total instances: 7500, instances left in memory: 99
Total instances: 8000, instances left in memory: 99
Total instances: 8500, instances left in memory: 99
Total instances: 9000, instances left in memory: 99
Total instances: 9500, instances left in memory: 99
Total instances: 10000, instances left in memory: 99

@rbygrave rbygrave self-assigned this Oct 7, 2021
@rPraml
Copy link
Contributor

rPraml commented Oct 8, 2021

Hello Rob, thanks for the fix. The memory leak fix seems to be easier than I thought.
Changing the value to 100 suits us.

But why is this done at all? As far as I understand, you put a lot of effort in findIterate (#2021 and so on) so that it periodically starts a new persistence context.

What do you generally think if we have only one persistence context, that uses weak references, so that it has not the ability to hold beans in memory.

IMHO that would make sense, that the PC holds only the beans in memory, that are strong referenced somewhere else. It would also free up memory if you clear lists (e.g. customer.getOrders().clear() will free all orders in the PC)

I made a proof of concept in this branch https://github.com/FOCONIS/ebean/commits/experimental/use-weak-ref

I do not know every detail (e.g. if PC is used for prefetch/preload and so on) - I also know, that using references may add additional costs (runtime/memory) - but on the other hand, the change will remove complexity and the PC will cache as many beans as the garbage collection allows.

If you think, WeakReferences are the wrong way here, you may take a look at that change FOCONIS@060c385 it will remove code used only in one test.

Cheers
Roland

@rbygrave
Copy link
Member Author

But why is this done at all?

Largely because it means people can then use findEach(), findStream() etc even for small results - choose it for the programming model/style rather than "large resultSet" reasons. In doing so they can get the same behaviour wrt persistence context as if they used findList.

That is, people can use findEach(), findStream() without thinking or understanding much and the previous persistence context per top level bean isn't really appropriate/good for that case. I felt we need the default behaviour for small results via findEach/findStream to be just as good as findList.

@rbygrave
Copy link
Member Author

persistence context, that uses weak references

Or a specialised persistence context for the iterating queries case that uses weak references. We only get the extra costs of weak references for the iterating queries.

@rPraml
Copy link
Contributor

rPraml commented Oct 11, 2021

Or a specialised persistence context for the iterating queries case that uses weak references. We only get the extra costs of weak references for the iterating queries.

First, I thought, that can be a good idea, to use the special WeakRef Persistence Context in findIterate/findStream. But there may be the following use cases:

try (Transaction txn = DB.beginTransaction) {
   list = DB.find(AnyClass.class).query....findList(); // here PC1 is active
   resultMap = new HashMap();
   DB.find(AnyClass.class).query....findEach( bean -> {
       // Here, PC2 (which uses weakReferences) is active, but we must know PC1 as parent
       // because 'bean' could be already in 'list'
       if (addToResult(bean)) {
          resultMap.put(bean.getId(), bean); // Here we pass a bean from the findEach loop to the outer context
       }
   });
   list = DB.find(AnyClass.class).query....findList();  // PC1 is active again, but bean could be in "resultMap" (or should PC2 be active now)
} 
  • If we start a new WeakRefPersistenceContext on findEach()/findStream() etc. and there is already a context present, we probably have to copy all elements to the new context. (as it is currently done in DefaultPersistenceContext)
  • Passing beans outside the findEach loop may lead to problems, as all following queries may use the wrong persistence context.
  • Starting with a WeakRefPersistenceContext will solve all problems, but will add additional overhead.

I made some quick & dirty performance measurements here:

FOCONIS@3acdd3c

and I think the WeakRefPersistenceContext is about 3 times slower. The measurements are for 100K beans. So DefaultPersistenceContext will ad ~ 130ns per bean get/put and WeakRefPersistenceContext ~ 400ns per bean.

CreateTime: 272 ms, Default: 283 ms (Overhead: 10 ms), Weak: 309 ms (Overhead: 37 ms)
CreateTime: 266 ms, Default: 284 ms (Overhead: 17 ms), Weak: 305 ms (Overhead: 38 ms)
CreateTime: 271 ms, Default: 283 ms (Overhead: 12 ms), Weak: 318 ms (Overhead: 47 ms)
CreateTime: 282 ms, Default: 280 ms (Overhead: -1 ms), Weak: 309 ms (Overhead: 27 ms)
CreateTime: 269 ms, Default: 279 ms (Overhead: 10 ms), Weak: 314 ms (Overhead: 45 ms)
CreateTime: 269 ms, Default: 281 ms (Overhead: 11 ms), Weak: 307 ms (Overhead: 37 ms)
CreateTime: 276 ms, Default: 287 ms (Overhead: 10 ms), Weak: 326 ms (Overhead: 50 ms)
CreateTime: 271 ms, Default: 315 ms (Overhead: 43 ms), Weak: 336 ms (Overhead: 65 ms)
CreateTime: 269 ms, Default: 290 ms (Overhead: 20 ms), Weak: 309 ms (Overhead: 40 ms)

To remove complexity, we could implement a PersistenceContextDelegate:

class PersistenceContextDelegate implements PersistenceContext {
    private PersistenceContext delegate = new DefaultPersistenceContext();
// delegate methods..

    void switchToWeakRef() {
        delegate = new WeakRefPersistenceContext(delegate); // copy all content
    } 
}

This would make it much easier to switch from DefaultPersistenceContext to WeakRefPersistenceContext, if a 'forEach' loop will start. (Especially in ReadJson etc)

Cheers
Roland

@rbygrave
Copy link
Member Author

Discard this now.

@rbygrave rbygrave closed this Oct 12, 2021
@rbygrave rbygrave deleted the feature/findEach-resetLimit-100 branch November 30, 2021 23:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants