Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
Query subscriptions emit data in an incorrect order #793
When using a Query subscription, I think there is an implicit assumption that the last call to
It seems like this ordering was trying to be enforced at the BoxStore level. In the ObjectClassPublisher a concentrated effort is made to ensure that entity id changes are processed in a synchronized fashion by locking it to one thread at a time. In other words, it ensures that changes are emitted to data observers in the order that they were published.
The problem appears within the QueryPublisher because it takes those changes and publishes them to an unbounded thread pool. Once there, it executes the query to get the data, and emits them to all child observers. This is problematic, because there is nothing preventing multiple threads in the pool from querying and emitting data to the child observer. An example best illustrates the problem.
Lets say we have a Box with a list of car brands. For simplicity, imagine only one thread runs at a time.
Now, the query data observer incorrectly believes that the box contains only Toyota instead of Toyota and Lexus.
The way we are currently working around this problem is only using the entity class observers. Then in the observer we have a
Thanks for the detailed report!
I guess the order can be guaranteed by queuing
However, a transformer also runs on the thread pool without guaranteeing order breaking the above fix. (Providing a thread-pool scheduler might break order as well, but here it is an explicit decision by the user.)
If you are talking about synchronously posting to a queue, and then processing out that queue on a thread-pool thread (similar to how its done in the ObjectClassPublisher) then I think that would work fine for all cases except the transformer case you mentioned.
Do you think for the ObjectClassPublisher we should modify the
As for transformations there are a few solutions I could think of:
When I first read this I thought "an asynchronous thread from what?". The DB Change? Currently it means from the change emission thread. Part of the confusion I think stems from
My thinking is to just run the transformation on the already asynchronous thread the change notification is on. In other words, don't use a scheduler unless one is given to you. This would make the comparison to
The downside to this is that the change thread is then held up by however long it takes to do the transform. I'm not sure this is much of a disadvantage, though, because the only situation where this wouldn't apply to all changes anyway (since all query results are transformed) are situations where users are sharing the same query object to create different subscriptions.
All requests to publishers are now queued and processed on a single thread per publisher. This thread is still scheduled on the internal thread pool. Transformers will run on that same publisher thread. So results will be delivered from the same publisher thread, regardless if transformers are used or not.
If the frequency of publish requests is high (e.g. an entity is updated frequently) and is higher than the transform frequency, this may lead to publish requests filling up the queue. To mitigate that
I uploaded the Java sources for
We welcome your feedback.
This looks good to me. The only point that I'd make is this maybe a good time to also address #695.
As a TL;DR the behavior seen in that ticket is due to observers being obtained at the time of notification instead of the time the data changes (AKA when publish is called).