
Process invalidations immediately for closed connections #185

Closed

Conversation

jmuchemb
Member

This fixes a memory leak when `pool-timeout` is not set.
@icemac
Member

icemac commented Dec 19, 2017

@jimfulton What do you think about this PR?

@jimfulton
Member

Thanks for the ping.

This PR doesn't make any sense to me. :)

I set it aside when it was submitted and failed to come back to it. I'll dig in this weekend.

@jimfulton
Member

This implementation seems rather hacky to me. The assignment of _invalidate on a connection's storage is especially unsavory.

A more detailed description of the problem would have made review easier.

Let me see if I understand:

Currently for the MVCCAdapterInstance case, invalidations are stored in the adapter and remain unprocessed until a closed connection is opened. This means that when a connection is closed, we store these invalidations longer than we want and keep data in the connection's cache longer than we want. I agree that this is wasteful. It would be nice to do these invalidations immediately, both to save memory and to do work while connections are idle.
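To make the pattern concrete, here is a minimal sketch of the accumulation being described. This is not the actual ZODB source; the names (MVCCAdapterInstance, _invalidations, poll_invalidations) simply follow this discussion.

```python
# Simplified sketch of the accumulation under discussion; not ZODB's
# real code, but the names follow this conversation.

class MVCCAdapterInstance:
    """Per-connection storage wrapper; one per pooled Connection."""

    def __init__(self):
        self._invalidations = set()

    def _invalidate(self, oids):
        # Called (via the adapter) for every committed transaction,
        # whether or not the owning connection is currently open.
        self._invalidations.update(oids)

    def poll_invalidations(self):
        # Only runs when the connection is (re)opened; until then the
        # accumulated oids -- and the cached objects they refer to --
        # stay in memory if pool-timeout is unset.
        oids, self._invalidations = self._invalidations, set()
        return oids
```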

Have I got this right?

Here's an alternative:

Change the MVCCAdapter constructor to take a database as a second argument. When an adapter gets invalidations, it will update its instances as it does now. It will then call a method on the database (invalidations_received?) that causes all closed connections to poll for invalidations. That will flush pending invalidations and invalidate cached objects sooner.
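A hedged sketch of what that alternative might look like; invalidations_received is only the tentative name floated above, and the pool bookkeeping and _flush_invalidations helper are hypothetical stand-ins, not ZODB's actual API:

```python
# Sketch of the proposal, not ZODB's real API. invalidations_received
# is the tentative name from the comment above; pool and
# _flush_invalidations are simplified, hypothetical stand-ins.

class MVCCAdapter:
    def __init__(self, storage, database):  # database: the proposed 2nd argument
        self._storage = storage
        self._database = database
        self._instances = set()

    def invalidate(self, tid, oids):
        for instance in self._instances:
            instance._invalidate(oids)       # unchanged behaviour
        # New: nudge the database so closed connections catch up now,
        # instead of when they are next opened.
        self._database.invalidations_received()

class DB:
    def __init__(self):
        self.pool = []                       # closed-but-cached connections

    def invalidations_received(self):
        for connection in self.pool:
            # Poll pending invalidations and evict stale cache entries.
            connection._flush_invalidations()
```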

Thoughts?

Also, @jamadden would likely be interested in this discussion. :)

@jamadden
Member

Change the MVCCAdapter constructor to take a database as a second argument. When an adapter gets invalidations, it will update its instances as it does now. It will then call a method on the database (invalidations_received?) that causes all closed connections to poll for invalidations. That will flush pending invalidations and invalidate cached objects sooner.

At first glance, that seems like it's going to (up to) double the number of invalidation polls that RelStorage makes, assuming any write activity at all, if you're using multiple connections. But RelStorage natively implements IMVCCStorage, so it wouldn't actually be involved in this, would it? So any potential memory savings wouldn't apply to it.

In general I'm not sure I like the idea, if I understand what it's trying to accomplish. It seems like this would only be an issue if you had a lot of connections in the pool that were active, and then all (or all but one) sat idle. How often does that happen? Maybe if that's a problem there could be a way to manually ask closed connections to process invalidations (or just clear their caches), and affected applications could do that at their leisure (presumably they'd know when they were "done" or "idle"). Or maybe that's what pool-timeout is for 😄

@jimfulton
Member

@jamadden You're right that this wouldn't affect RelStorage. Perhaps some other idea would, if we even want that.

An issue that the MVCCAdapter has that RelStorage doesn't is that invalidations are pushed to it and can pile up.

This is a potential issue whenever clients are idle for an extended time. I can think of lots of cases like this, such as special clients (e.g. SQS workers) that are idle much of the time, or web clients in generously provisioned applications. For many of these, you might want to keep the caches around but not retain invalidated objects, and not let invalidations pile up. (Again, not an issue for RelStorage.)

When I first looked at the PR a while ago, my reaction was that pool timeout probably shouldn't be disabled by default. But even if it were enabled, I'd think you would want to be conservative: you might want to keep at least one connection, and you might still have connections sitting idle for some time.

@jmuchemb
Member Author

Have I got this right?

yes

Change the MVCCAdapter constructor to take a database as a second argument. When an adapter gets invalidations, it will update its instances as it does now. It will then call a method on the database (invalidations_received?) that causes all closed connections to poll for invalidations. That will flush pending invalidations and invalidate cached objects sooner.

But the instances of an adapter include those that are used by closed connections. Even if invalidations are processed with what you suggest, I don't see how it prevents the memory leak (MVCCAdapterInstance._invalidations).

This implementation seems rather hacky to me. The assignment of _invalidate on a connection's storage is especially unsavory.

This reminds me of my comment about the addition of adapters. On ZODB 4, this PR would look much better.

@jimfulton
Member

jimfulton commented Jan 13, 2018 via email

@jmuchemb
Member Author

@jimfulton Now I understand what you suggest, but from a performance point of view, it's ugly. The oids would be copied to a set, only to be removed immediately. The flow would become quite labyrinthine:

  1. MVCCAdapter.invalidate -> MVCCAdapterInstance._invalidate
  2. MVCCAdapter.invalidate -> DB.invalidations_received? -> Connection.invalidations_received? -> MVCCAdapterInstance.poll_invalidations

poll_invalidations would also do useless work (updating _start).

Then to avoid code duplication, newTransaction would be split, which means yet another function call.

I admit that setting MVCCAdapterInstance._invalidate is not nice, mainly because it's not a defined interface and MVCC storages like RelStorage should not be modified this way.

About pool-timeout: I always prefer software that is smart enough to adapt automatically and doesn't bother users with obscure settings.

@jimfulton
Member

I don't see significant overhead in what I'm proposing. It has the advantage of having closed connections behave much like open connections. It also gives behavior similar to what we had in ZODB 4 for closed connections.

But whatever.

@jimfulton jimfulton closed this Jan 13, 2018
@jmuchemb
Member Author

This isn't even an issue for ZODB 4, IIRC, because it processes invalidations against closed connections. (If that's not true, then perhaps that should be fixed.)

ZODB 4 also has this memory leak. That's the version I was using when I got an OOM kill.

Closed connections also collected oids (in Connection._invalidated), and invalidations were processed when they were reopened.

What is simpler with ZODB 4 is that oids are collected on the Connection itself, whereas with ZODB 5 we have to link the MVCCAdapterInstance to the Connection.
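Roughly, the two shapes being contrasted look like this (simplified sketches only; attribute names follow the discussion, and the class names are made up for illustration):

```python
# ZODB 4 shape (simplified): oids accumulate directly on the connection.
class Connection4:
    def __init__(self):
        self._invalidated = set()   # flushed when the connection reopens

# ZODB 5 shape (simplified): oids accumulate on a per-connection storage
# adapter instance, which must be linked back to the Connection if we
# want to process invalidations while it is closed.
class MVCCAdapterInstance:
    def __init__(self):
        self._invalidations = set()

class Connection5:
    def __init__(self, instance):
        self._storage = instance    # the MVCCAdapterInstance
```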

@jimfulton
Member

jimfulton commented Jan 13, 2018 via email

@jmuchemb
Member Author

But whatever.

I do want ZODB to be smart enough that configuring pool-timeout is not required for the usual case. The current situation is bad enough that even your suggestion would be an improvement.

@jimfulton
Member

So I see 2 issues:

  • Pool timeout should be enabled by default.
  • Better handling of MVCCAdapter invalidations for closed connections.
    I'm open to other ideas besides my proposal.

Perhaps 2 new PRs, if you're up to it? (I sadly don't have time to work on this now. :()

@jmuchemb
Member Author

  • Pool timeout should be enabled by default.

If invalidations are processed immediately for closed connections, I don't see why.

new PRs, if you're up to it? (I sadly don't have time to work on this now. :()

Sure, maybe at the end of this month.

@jmuchemb
Member Author

jmuchemb commented Oct 2, 2019

@jamadden I am still looking for a solution to this issue, and I wonder whether RelStorage is affected or not. I don't see why it wouldn't be.

@jamadden
Member

jamadden commented Oct 2, 2019

I don't believe RelStorage is affected, and that's because invalidations are collected synchronously, when the connection is in use. A RelStorage that's not being used isn't receiving notifications or anything like that; it polls for them at particular points in the transaction lifecycle. (But see below.) That's what Jim meant by:

An issue that the MVCCAdapter has that RelStorage doesn't is that invalidations are pushed to it and can pile up.

On large, busy sites, we were seeing that sometimes poll queries could be very expensive if these older closed connections got used again.

Using the transaction manager in explicit mode cuts down on the number of poll queries that the Connection makes, which had some benefit, but not enough.

So RelStorage 3 introduces a sort of unified invalidation cache that cuts down on the total size of any given invalidation poll query and allows the information collected to be shared across connections, making each individual poll query much cheaper. Storing that information could be seen as a semi-asynchronous collection of invalidations. But there's not a tremendous amount of memory associated with storing the invalidation information because we move collected OIDs off-heap into native integers.
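The "native integers" point can be illustrated with the standard array module: packing 64-bit OIDs contiguously instead of keeping a set of Python int objects. This is only an illustration of the memory idea, not RelStorage's actual implementation:

```python
from array import array
import sys

# A set of Python ints costs tens of bytes per element (object headers
# plus hash-table slots); a typed array stores each OID as a raw
# 8-byte machine value, laid out contiguously.
oids_set = {2 ** 40 + i for i in range(100_000)}
oids_arr = array('Q', sorted(oids_set))  # unsigned 64-bit integers

print(sys.getsizeof(oids_arr))  # ~800 KB: 8 bytes per OID plus overhead
# sys.getsizeof(oids_set) counts only the hash table, not the int
# objects it references, so the real gap is larger than a direct
# comparison of the two numbers suggests.
```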

In the event that it does add up because the Connection pool is poorly configured, old RelStorage connections simply become ineligible to be part of the unified invalidation cache after some threshold, and the older invalidations get discarded. If they get used again, they'll invalidate all of the object cache and start fresh; at that point most of the object cache is likely no good anymore anyway, given how many writes there have been. These old connections tend to only get used during bursts of activity when we're doing a substantial number of writes, which take longer than reads, so we start scraping the bottom of the barrel of the Connection pool.

The approach here of assigning an _invalidate attribute wouldn't work on RelStorage because there's no such attribute.

@jmuchemb
Member Author

jmuchemb commented Oct 2, 2019

Thanks.

In the event that it does add up because the Connection pool is poorly configured, old RelStorage connections simply become ineligible to be part of the unified invalidation cache after some threshold and the older invalidations get discarded.

It's as if MVCCAdapterInstance._invalidate would stop adding oids to _invalidations once there are too many elements, and instead do self._invalidations = None. But the drawback is again a magical configuration parameter, which I don't like. Just to be sure I have understood: what is this parameter in RelStorage?
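For concreteness, a sketch of the capping idea just described; the threshold name is a hypothetical knob (not an existing ZODB or RelStorage setting), and _invalidations being None stands for "too far behind, flush the whole cache":

```python
# Sketch of the capping idea; INVALIDATION_LIMIT is a hypothetical knob.
INVALIDATION_LIMIT = 100_000

class MVCCAdapterInstance:
    def __init__(self):
        self._invalidations = set()

    def _invalidate(self, oids):
        if self._invalidations is None:
            return                       # already past the threshold
        self._invalidations.update(oids)
        if len(self._invalidations) > INVALIDATION_LIMIT:
            self._invalidations = None   # stop tracking individual oids

    def poll_invalidations(self):
        oids, self._invalidations = self._invalidations, set()
        return oids  # None tells the connection to flush its entire cache
```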

I started something, including a micro-optimization that causes invalidations to be processed when a connection is closed, but you may not like this in the case of RelStorage.

@jamadden
Member

jamadden commented Oct 2, 2019

But the drawback is again to have a magical configuration parameter, which I don't like. Just to be sure I have understood, what is this parameter in RelStorage ?

It's ultimately intended to be an adaptive heuristic, taking into account things like the number of old connections, the time between their polls, and the transactions per second in the environment. The goal is that it doesn't need to be configured: zodb/relstorage#338

But it's currently just a number specifying how many transactions (not invalidations) a connection is allowed to fall behind, while we watch how it behaves in our busy environments, just in case it doesn't actually need to be that complicated. Behaviour has actually been quite sane with that cutoff so far: the number of loads (which would indicate connections invalidating otherwise-valid objects) hasn't gone up or shown spikes.

I started something, including a micro-optimization that causes invalidations to be processed when a connection is closed but you may not like this in the case of RelStorage.

Yeah, that doesn't sound useful. It actually sounds like an anti-optimization if it involves extra polling.

@icemac
Member

icemac commented Oct 11, 2019

@jmuchemb Do you think it is okay to delete the branch of this PR?

@jmuchemb
Member Author

I didn't forget to delete it. It's just that the issue is still open. It's important for us and I'll fix it sooner or later; no idea when I'll work on it again.

If you insist, it's true that I have it locally.

@icemac
Member

icemac commented Oct 11, 2019

@jmuchemb I was only asking because the PR is closed. We can keep it if you wish, no problem.
