
ValueObservation won't start until concurrent write transaction has ended #601

Closed

zdnk opened this issue Aug 23, 2019 · 25 comments
Labels: enhancement, help wanted

zdnk commented Aug 23, 2019

Hello, I encountered this issue today and I'm not sure if there is any setting to fix it.

Basically, I have this observation:

return ProductCategory.children(of: id).rx.observeAll(in: writer, startImmediately: true, observeOn: ConcurrentDispatchQueueScheduler(qos: .userInitiated))

And then there is a write transaction that takes several seconds to finish (syncing database from server dump).

It seems that the observation does not emit any events as long as the transaction is running, which seems odd since I am using DatabasePool:

var config = Configuration()
config.foreignKeysEnabled = true
config.label = "products-db"
config.maximumReaderCount = 100
let pool = try! DatabasePool(path: path, configuration: config)
pool.setupMemoryManagement(in: UIApplication.shared)

I am also using the regular read and write methods on DatabasePool.

Would you be able to provide any guidance on this issue, please?


zdnk commented Aug 23, 2019

I just double-checked with breakpoints and it really is not running my read statements while there is a write transaction :(


zdnk commented Aug 23, 2019

One note: the observable creation is actually wrapped in pool.read {}, but since reads are not serialized according to the documentation, that should not be the issue, right?

EDIT: I don't actually subscribe to the RxSwift.Observable inside the read closure, though.


groue commented Aug 23, 2019

Hello @zdnk,

It seems that the observation does not emit any events as long as the transaction is running.

So you say that you'd like the observation to be notified of partial changes that happen during the transaction, even before the transaction has been completed.

Indeed this is not how ValueObservation works. Its documentation states:

Changes are only notified after they have been committed in the database.

And this won't change. ValueObservation applies one of the most important design decisions in GRDB, which is that most applications want to use the database as a single source of truth. And what is the database truth? Data which is saved on disk. This is why only changes which are saved to disk after the transaction has been successfully committed are notified by ValueObservation.

This principle greatly simplifies the architecture of most applications. For example, clients of a ValueObservation only care about the content of the database. They don't have to worry about transactions that fail. Or uncommitted transactions that temporarily put the database in an inconsistent state. Your server synchronizer only has to care about its transaction, and nothing else. All components are independent and only have to care about their own business, around a database which is always reputed correct. This is quite a relief!

[this] seems odd since I am using DatabasePool

OK, let's clear up a misinterpretation. The behavior of a ValueObservation is indeed different in a DatabasePool and in a DatabaseQueue. But not in the way you expect.

After the successful commit of a transaction that has performed changes in the observed region, ValueObservation has to fetch fresh values in order to notify them. It is crucial that those fresh values come from the exact database state that has just been committed. Other possible concurrent writes should never, never, never be allowed to interfere.

DatabasePool can do that without blocking concurrent writes. DatabaseQueue cannot. Imagine that fetching fresh values is very, very slow. With DatabaseQueue, the observation would block all concurrent writes until those fresh values are fetched. Only then could other writes be performed. With DatabasePool, the observation does not block concurrent writes, and yet is able to notify the correct fresh values. DatabasePool reduces write contention. But it does not break the principle of the single source of truth.

Database pools and queues generally have the same observable behavior (you don't see differences in outputs). But pools can optimize some scenarios. This also is an important design decision, because it makes it possible, for example, to use a database pool in the app, and a fast in-memory database queue in tests.
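
For instance, here is a minimal sketch of that design decision (the factory function names are hypothetical; the in-memory DatabaseQueue initializer shown is the GRDB 4 one):

import GRDB

// In the app: a DatabasePool for concurrent reads and writes.
func makeAppDatabase(at path: String) throws -> DatabaseWriter {
    return try DatabasePool(path: path)
}

// In tests: a fast in-memory DatabaseQueue with the same observable behavior.
func makeTestDatabase() -> DatabaseWriter {
    return DatabaseQueue()
}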

Would you be able to provide any guidance on this issue, please?

As always, with great pleasure :-) I've said "most applications" a couple of times above, because I know some applications want to play wild games.

The wild game you want to play is to handle changes before they are committed. You can, with the low-level TransactionObserver protocol. It is the mother of all database observation features. I think this is what you are after. Now, since it is a very rare need, things will be less convenient.
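
For illustration, here is a minimal sketch of such an observer ("productCategory" is an assumed table name):

import GRDB

// A TransactionObserver is notified of individual changes as they happen,
// before the transaction is committed.
class ProductCategoryObserver: TransactionObserver {
    func observes(eventsOfKind eventKind: DatabaseEventKind) -> Bool {
        // Only track changes that impact the productCategory table.
        return eventKind.tableName == "productCategory"
    }
    
    func databaseDidChange(with event: DatabaseEvent) {
        // Called for each insert, update, or delete, before the commit.
        print("\(event.kind) in \(event.tableName), rowID \(event.rowID)")
    }
    
    func databaseDidCommit(_ db: Database) {
        // The changes are now saved on disk.
    }
    
    func databaseDidRollback(_ db: Database) {
        // The changes were discarded.
    }
}

// Registration. GRDB does not retain the observer: keep a strong reference.
// pool.add(transactionObserver: observer)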

groue added the question label Aug 23, 2019

zdnk commented Aug 23, 2019

Sorry, it seems I did not state all the needed facts.
So basically, when I start observing the database, there is already a write transaction in progress, and I won't get its data before it's committed - that's totally fine and correct. However, before the transaction started, there was data already (as the sync is a diff), but the observer does not emit it before the write transaction commits. That is my issue :) I do understand all of the above you wrote.

Meaning: the write transaction starts and lives on its own in a completely different part of the application, but meanwhile I am browsing the user interface, and at some point I start an observation on the database, but it won't emit anything before there's an "open slot" for reading, for some reason. Even if there was data before the write transaction started.


groue commented Aug 23, 2019

All right @zdnk, I understand now that you expect the initial values to be fetched and notified as soon as the observation starts, even if there is a long-running transaction in the background.

And I also now better understand why you expected DatabasePool to allow this.

OK now we are synchronized 😅

As you have witnessed, this is not how it works today. The observation requires an initial write access before it performs its initial fetch, and this is why it is blocked. Please let me gather my thoughts, and see whether this limitation could eventually be lifted, or not.


zdnk commented Aug 23, 2019

Sorry for not being clear.

Sure, thanks for your time :)

groue changed the title from "Observation blocked when write transaction in progress" to "ValueObservation won't start until concurrent write transaction has ended" Aug 23, 2019

groue commented Aug 23, 2019

Yes, you have tickled my curiosity. If you yourself have an idea about how we could deal with the various dispatch queues involved, and guarantee that no change which happens "after" (whatever that means) the initial fetch is missed, I'm all ears.


groue commented Aug 23, 2019

I mean that your request seems legit. But it looks like it requires serious concurrency voodoo.


zdnk commented Aug 23, 2019

So fetching the current state, registering the observation, and sending the current state immediately is not enough, for some reason, I guess?


zdnk commented Aug 23, 2019

What if you would insert a lock into the write queue, fetch current state, start observing, send current state, release the lock?


groue commented Aug 24, 2019

What if you would insert a lock into the write queue, fetch current state, start observing, send current state, release the lock?

Dispatch queues can only enqueue work items. And when you start the observation, any number of work items may already be enqueued in the writer dispatch queue, the one that should be observed. So I'm not sure what "inserting a lock" could really mean.

When you start the observation, what you know, as a user, is that you will get notified with initial values fetched from the "current state" of the database, and that all further modifications performed from this "current state" will be notified.

"Current state" is easily said, but less easily defined. It is quite fuzzy as soon as you consider that there may be any number of concurrent writes running at the moment you start the observation.

In the current implementation, the "current state" from which the observation starts is the state of the database at the first available slot in the writer dispatch queue, because the observation start is enqueued in the writer dispatch queue. Notified changes are changes that are enqueued after the start of the observation.

In the graph below, <write C> is already enqueued at the moment the observation starts (x), but not <write D>:

Time -------------------------------------------------------------------------->
Writes:             <write: A><write: B><write: C>             <write D>
Observation starts:              x               |                     |
Observation reads:               |               <read: C>             <read D>
                                 |               |       |                    |
Notified values (CD):            |               |       C                    D
The delay we are talking about:  <--------------->

What you ask for is something different: the "current state" from which the observation starts is the state of the database left by the last executed write. This gives a different picture:

Time -------------------------------------------------------------------------->
Writes:             <write: A><write: B><write: C>             <write D>
Observation starts:              x     |         |                     |
                                 |     |         |                     |
Observation reads:               <read: A>       <read C>              <read D>
                                       | |              |                     |
                                       <read: B>        |                     |
                                         |     |        |                     |
Notified values (ABCD):                  A     B        C                     D
No more delay

Looks easy, huh? But it is not quite that easy at all.

The writes B and C, which have already been enqueued at the moment the observation starts, must now be observed, in order to check whether they should trigger the observation.

But adding a transaction observer is currently like all regular write operations: it is enqueued on the writer dispatch queue. We currently have no way to "retroactively" inject a transaction observer. It would require quite a refactoring. Besides, retroactive observation is an important change that may have other impacts that we haven't analysed yet.

On top of that, the write B is not observed at the moment the observation starts: maybe some impactful changes have already been performed, without being observed. This means that the write B must always trigger the observation, just in case some unnoticed changes have been performed. I'm not sure the consequences of this are well understood yet.


All in all, I understand and support your improvement request.

I discover with quite an embarrassment that observations are blocked by long-running writes, a behavior that was right under my nose but that I missed until this issue. Several pieces of documentation should be updated. And application components are not as independent as I like to claim 😅.

But it won't be an easy task, and nobody should expect a quick solution. I may look at it eventually, or help anyone who wants to jump on the task. The topic needs maturing.

I'll leave the issue open for a few days, just in case someone has something useful to say. After that, I'll close it and move this improvement to the TODO list and the CONTRIBUTING suggestions, ready for anyone to pick it up.


So, what are your options, in the current state of the library?

Since you are using RxGRDB, you can prepend your database observation with an initial fetch, with DatabaseReader.rx.read(observeOn:value:).

You will get an observable which won't have to wait for the end of the long-running write. You will lose a guarantee, which is that all changes are notified: changes that happen between the initial fetch and the beginning of the observation will be missed. But the guarantee that the observable eventually emits the latest database value will remain. Depending on the needs of your application, this may be enough:

Time -------------------------------------------------------------------------->
Writes:             <write: A><write: B><write: C>             <write D>
Observation starts:              x               |                     |
                                                 |                     |
Observation reads:                               <read C>              <read D>
                                                        |                     |
Initial fetch:                   <read: A>              |                     |
                                         |              |                     |
                                         |              |                     |
                                         |              |                     |
Notified values (ACD):                   A              C                     D
No more delay, but B is missed

But you'll get a duplicate initial value if there is no concurrent write when the observation starts:

Time -------------------------------------------------------------------------->
Writes:             <write: A>                                 <write D>
Observation starts:              x                                     |
                                                                       |
Observation reads:               <read: A>                             <read D>
                                         |                                    |
Initial fetch:                   <read: A>                                    |
                                         |                                    |
                                         |                                    |
                                         |                                    |
Notified values (AAD):                   AA                                   D
No more delay, but A is duplicated

And I also suggest you spend a few minutes thinking to make sure that the ordering of the notified values is correct. You don't want the first value of the observation, which is notified after the initial fetch, to come from an earlier state of the database.
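
Here is a minimal sketch of that suggestion, reusing the request from the top of the thread (writer and id as in the question; the exact RxGRDB signatures may differ between versions):

import GRDB
import RxGRDB
import RxSwift

// 1. Initial fetch: a plain read which does not wait for the writer queue.
let initial = writer.rx
    .read { db in try ProductCategory.children(of: id).fetchAll(db) }
    .asObservable()

// 2. The observation, started without its own initial fetch.
let changes = ProductCategory.children(of: id).rx
    .observeAll(in: writer, startImmediately: false)

// 3. Merge both. Mind the ordering caveat above: make sure a value notified
// by the observation can not come from an earlier database state than the
// initial fetch.
let values = Observable.merge(initial, changes)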

groue added the enhancement and help wanted labels and removed the question label Aug 24, 2019

groue commented Aug 24, 2019

The topic needs maturing

It sure does.

Maybe we could build on top of the suggested solution, and just define yet another "observation mode" where an immediate fetch is guaranteed as long as you use a DatabasePool, but where the guarantee that all changes are notified is lost. This would avoid both the heavy refactoring and the difficult questions described above.

This new mode would be much easier to achieve, and yet satisfy a lot of users, me included.

But the cost is more options, more documentation, more choices for the user, on a topic which, unfortunately, is very difficult to understand 😶


zdnk commented Aug 24, 2019

Well, I see it is a complicated issue to ensure the guarantee.
I am OK with a workaround of fetching the initial data myself for now and emitting it into the observable sequence. I am a little worried about some inconsistency, but it should not be anything serious.

I have an extra question about the performance of observations, as I am observing each element in a large UITableView as they become visible, and disposing of those observations as they become hidden. You can imagine the scenario: with the user scrolling pretty fast, on larger screens, and with multiple controllers in the navigation stack, there are possibly hundreds of active observations at the same time. Should I be worried about the performance? I know literally nothing about the mechanics of this feature, so here I am asking.


groue commented Aug 24, 2019

Well, I see it is a complicated issue to ensure the guarantee.

There is no quick and obvious solution, yes.

I am OK with a workaround of fetching the initial data myself for now and emitting it into the observable sequence. I am a little worried about some inconsistency, but it should not be anything serious.

I hope it will work as expected.

I have an extra question about the performance of observations, as I am observing each element in a large UITableView as they become visible, and disposing of those observations as they become hidden. You can imagine the scenario: with the user scrolling pretty fast, on larger screens, and with multiple controllers in the navigation stack, there are possibly hundreds of active observations at the same time. Should I be worried about the performance? I know literally nothing about the mechanics of this feature, so here I am asking.

Well, each and every active observation slows down writes a little, because it checks all writes in case they could impact its observed region. And if a write may have impacted the observed region, then the observation performs a fetch. The number of concurrent fetches is limited by the maximumReaderCount configuration. So if you have many observations that are triggered by a single write, you may face read contention. After an observation has fetched its fresh values, it asynchronously dispatches them on your target dispatch queue (often the main queue).

In your case, I wonder if a single observation that refreshes your full table view content wouldn't be a better idea.

Not only because it may perform faster.

But mainly because it would make sure your table view content always comes as a whole, from a single database read. This is one of the most important GRDB concurrency rules that one should always try to apply. It makes application code simple, without any risk of inconsistencies, stale values, or anything like that.

Many, many observations, because of the risks of read contention they create, may scatter their change notifications across several runloop steps. You may end up with table view cells which are not refreshed at the same time, and a potentially very weird data source. Maybe this is OK. But maybe it is not. And it surely is not OK in all situations.
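
For instance, a hedged sketch of such a single observation (pool, categories, and tableView are assumed members of your view controller; RxGRDB signatures as used earlier in this thread):

import RxGRDB
import RxSwift

// One observation feeds the whole table view: every refresh then comes from
// a single, consistent database read.
let disposable = ProductCategory.all().rx
    .observeAll(in: pool, observeOn: MainScheduler.instance)
    .subscribe(onNext: { [weak self] categories in
        self?.categories = categories
        self?.tableView.reloadData()
    })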


zdnk commented Aug 24, 2019

Well, funny thing. I did implement the workaround, but after a while, when a long write transaction is running, it stops working, in the same way as if I were using the default observation with the guarantee :(
Here is the workaround:

import GRDB
import RxGRDB
import RxSwift

extension Reactive where Base: FetchRequest, Base.RowDecoder: FetchableRecord {
    
    public func unguaranteedObserveAll(
        in reader: DatabaseReader,
        observeOn scheduler: ImmediateSchedulerType = ConcurrentDispatchQueueScheduler(qos: .userInitiated))
        -> Observable<[Base.RowDecoder]>
    {
        // Initial fetch: a plain read which does not wait for the writer queue.
        return Observable.create { (observer) -> Disposable in
            do {
                let value = try reader.read { db in
                    return try self.base.fetchAll(db)
                }
                observer.onNext(value)
                observer.onCompleted()
            } catch {
                observer.onError(error)
            }
            
            return Disposables.create()
        }
        // Then merge the initial value with the regular observation, started
        // without its own initial fetch. Changes committed between the initial
        // fetch and the observation start are missed: hence "unguaranteed".
        .flatMap { value in
            return Observable.merge(
                .just(value),
                self.observeAll(in: reader, startImmediately: false, observeOn: scheduler)
            )
        }
    }
    
}


zdnk commented Aug 24, 2019

I tried increasing the reader count to 100,000, but it still behaves the same. At some point it stops executing the read closures.
A note: I'm actually doing a read inside a read, but that should not be an issue, right?


groue commented Aug 25, 2019

Here is the workaround

It looks sensible 👍

I tried increasing the reader count to 100,000

Hmmm, your device doesn't have 100,000 CPU cores: this number is completely meaningless and cannot bypass the physical limits!

I suggest you step back a little bit, and consider more seriously my suggestion to drastically reduce your number of active observations. As I tried to explain above, you are erring on the side of misuse. If you think you have found a bug, then please give me a way to reproduce it.


groue commented Aug 25, 2019

Let me attempt to describe the scheduling of observations in more detail. It may help you understand the behavior of your app.

There are constraints you must keep in mind:

In a database pool, there is a unique writer serial queue, which protects the access to the unique writer SQLite connection. Writes happen there. Observations start there too, because it is the writer database connection which is observed.

In a database pool, there are at most N reader serial queues, which protect the access to N read-only SQLite connections. Reads that happen there can be performed concurrently with writes.

There is a special kind of read, which observations rely on a lot: the "isolated concurrent read from a given state of the database". The goal is to block the writer queue as little as possible, until a reader connection has acquired the snapshot isolation state. Acquiring snapshot isolation is fast. After the reader is isolated, we have the guarantee that it will see the database in the exact desired state. Only then can we release the writer queue and let other writes happen. And the reader can start reading concurrently.

Here is how observations use those isolated concurrent reads: after a write has ended, observations spawn concurrent reads in order to fetch their fresh values. The writer serial queue remains locked until all observations have acquired snapshot isolation, so that they can fetch their fresh values. If the writer queue were unlocked before that, some observations would not see the database in the state they expect.
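
From application code, the same mechanism is exposed as DatabasePool.concurrentRead. A hedged sketch, assuming the GRDB 4 API and a hypothetical syncChanges helper:

// The future is returned from the writer block; the fetch runs concurrently.
let future: DatabaseFuture<Int> = try pool.writeWithoutTransaction { db in
    try db.inTransaction {
        try syncChanges(db) // hypothetical write
        return .commit
    }
    // concurrentRead returns only after snapshot isolation is acquired: the
    // block below is guaranteed to see the database exactly as the
    // transaction above left it, while later writes proceed concurrently.
    return pool.concurrentRead { db in
        try ProductCategory.fetchCount(db)
    }
}
// Later: wait() blocks until the concurrent fetch completes, and rethrows.
let count = try future.wait()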

Now let's apply the resource constraints to this scenario. Let's imagine that a write has just ended. The writer queue is still locked. We'll start refreshing the observations.

The first observation wants to perform its isolated concurrent read. Is there an idle reader in the N available ones? If not, it has to wait until one is released. The writer queue is still locked.

An idle reader is found, and turns busy: it tries to acquire snapshot isolation. It asynchronously runs the SQLite statements that trigger this state. Is there an idle thread that libDispatch can use in order to execute this code? If not, the reader has to wait until a thread is free. The writer queue is still locked.

Eventually snapshot isolation is acquired. The reader can fetch the fresh values. Later, when the fetch has ended, the reader will turn idle again, ready for another read.

And we perform the same task for all the active observations: wait for an idle reader, wait for snapshot isolation, repeat.

Only after all observations have acquired snapshot isolation is the writer queue released, ready for other writes.

To sum up: database pools help reduce writer contention, until reader contention kicks in. Creating many observations increases reader contention.

I have already given a lot of information. It may be too much, I'm sorry. I only hope it will help you understand the behavior of your app.

Also, I'm wondering whether the Instruments app could help you in this task.


zdnk commented Aug 25, 2019

I see, thanks for explaining. I suppose the observation works only on the columns I am actually selecting in the query and not the whole tables, is that correct?

I am asking because I could observe the whole list but select IDs only, and then, when a cell is about to be displayed, fetch the rest from the database. But I am not sure whether, when I am selecting only the ID in the observation and something else changes, it would trigger an event in the observation.

EDIT: actually, I see this is where DatabaseRegionObservation comes in, I guess. I can have the full request with all the columns I need selected there, and when it gets notified, fetch IDs only to save memory, and then use those IDs to lazily load and dispose of data for the cells/views as needed.
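
That could look like the following sketch (GRDB 4 API; table and column names are assumptions):

import GRDB

// Track the region touched by the full request, but fetch only IDs on change.
let observation = DatabaseRegionObservation(tracking: ProductCategory.all())

// The returned observer must be retained for the observation to last.
let observer = try observation.start(in: pool) { db in
    // Called on the writer queue, after each commit that modified the region.
    let ids = try Int64.fetchAll(db, sql: "SELECT id FROM productCategory")
    // ... lazily load and dispose of full rows for the visible cells.
}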


groue commented Aug 25, 2019

I suppose the observation works only on the columns I am actually selecting in the query and not the whole tables, is that correct?

Yes.

I am asking because I could observe the whole list but select IDs only, and then, when a cell is about to be displayed, fetch the rest from the database. But I am not sure whether, when I am selecting only the ID in the observation and something else changes, it would trigger an event in the observation.

Observations can observe individual rows like Player.filter(key: 1) or Country.filter(key: "FR"), or User.filter(Column("email") == "...").

But only rows identified by their numeric primary key can be efficiently observed, without performing a refetch of fresh values whenever a change happens on other rows.

You can deduce this from the API of the mother TransactionObserver protocol, which is only given a ROWID when it is notified of a row deletion, insertion, or update in its databaseDidChange(with:) method. It has no way to know, given a change that happens in the country table, whether this change concerns the country FR or not. ValueObservation thus has to refetch, just in case. This is not a limitation of GRDB, but a limitation of SQLite.

To sum up: observing individual rows that are not identified by their numeric primary key triggers reads even for changes that happen to other rows.
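
For example, reusing the requests above (the email value is a placeholder):

// Efficiently observed: the row is identified by its numeric primary key, so
// changes to other rows of the player table can be filtered out.
Player.filter(key: 1).rx.observeAll(in: pool)

// Not efficiently observed: SQLite only reports ROWIDs, so any change in the
// user table triggers a refetch, just in case.
User.filter(Column("email") == "arthur@example.com").rx.observeAll(in: pool)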


zdnk commented Aug 25, 2019

I'm also seeing one more issue with observations. Whenever they are created and cancelled, even before getting a chance to start (on the writer queue), they still keep the reader thread, and it is not released to be reused until the observation gets its turn on the writer queue, which wastes a lot of resources if the user is browsing application screens where observations are used.

I hope my observation is correct and that you understand what I mean, if you weren't aware of this behavior before.

To fix this, I would really appreciate some unsafe/unguaranteed way of observing, because I don't think I'll be able to get a workaround without modifying GRDB.

Another option in my case is to emit events for the synchronization state changes, as it is the only source of inserts and updates; however, I was hoping to avoid that with database observation :(

groue added a commit to RxSwiftCommunity/RxGRDB that referenced this issue Aug 26, 2019

groue commented Aug 26, 2019

I'm also seeing one more issue with observations. Whenever they are created and cancelled, even before getting a chance to start (on the writer queue), they still keep the reader thread, and it is not released to be reused until the observation gets its turn on the writer queue, which wastes a lot of resources if the user is browsing application screens where observations are used.

I hope my observation is correct and that you understand what I mean, if you weren't aware of this behavior before.

ValueObservation doesn't do anything until it is started. And its RxGRDB wrapper doesn't start it until the Observable is subscribed to.

But once started, the initial database access cannot be cancelled; this is true.

Thanks for spotting this. Your suggestion has been added in the TODO: RxSwiftCommunity/RxGRDB@1bdebc6

Now, it must be said that even before your suggested optimization gets implemented, the cost is not very high until reader contention kicks in.

To fix this, I would really appreciate some unsafe/unguaranteed way of observing, because I don't think I'll be able to get a workaround without modifying GRDB.

True. All contributions for improvements are welcome, including pull requests.

I do appreciate your exploration because you suggest many good ideas, improvement paths, and clarifications.

But it is reader contention that is your main issue. You face all the nasty consequences of over-flooding the database. I suggest, for the last time, that you reduce the number of active observations. Fighting GRDB is ill-advised: reader contention is not a GRDB issue. It is a misuse of SQLite on a physically-constrained device.


groue commented Aug 28, 2019

This conversation has been turned into TODO items added to GRDB itself: a6fa76a. Contributions are welcome!

groue closed this as completed Sep 2, 2019

groue commented Mar 22, 2020

This issue is fixed in #736.


groue commented May 2, 2020

Fixed in v5.0.0-beta
