RFC on implementation of _changes feed in FoundationDB #401
We could choose to have per-DB transaction ID subspaces (or not), but the cleanup process is global, and can clear txn IDs for multiple DBs in a single transaction.
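To make that layout choice concrete, here is a small sketch using the FoundationDB Python tuple layer (the project itself is Erlang, but the key shapes are language-independent). The subspace and database names are illustrative assumptions of mine, not the RFC's actual layout:

```python
import fdb

fdb.api_version(620)

# Per-DB layout (hypothetical): txn IDs live under each database's own
# subspace, so a cleanup pass needs one clear per database.
per_db_key = fdb.tuple.pack(('dbs', 'db_a', 'txn_ids', b'<16-byte uuid>'))

# Global layout (hypothetical): one shared subspace for all databases, so a
# single clear_range inside a single transaction can drop txn IDs belonging
# to many databases at once.
global_key = fdb.tuple.pack(('txn_ids', b'<16-byte uuid>'))
```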
Another topic to keep in mind here is that in having a single HTTP response include all the updates across an entire database, we're limiting the top-end write throughput for which this endpoint would be usable. Looking at some simple benchmarks, I would guess that somewhere between 10k and 50k writes/sec we'll find the point at which a single consumer of the feed can no longer keep up. We've had other discussions about parallel access to the feed. Net: I've focused this RFC on reimplementing the existing API in a FoundationDB world, but there's good reason to try to evolve the API going forward.
So I managed to fall down the rabbit hole of "Handling Unknown Commit Results". I'm gonna include some thoughts here, but I actually think this belongs in the RFC for revision metadata handling, as I'll explain in a bit.

The first time I read this I kinda glossed over the section, thinking it was actually going to end up not being an issue when coupled with the revision handling, since we'd end up with conflicts elsewhere that would prevent the duplicate entry issues. However, that's not the whole story of what's actually going on at the fdb level. To help clarify the situation, there are generally speaking three different scenarios related to the application of a transaction:

1. The transaction is committed, and we receive confirmation that it was committed.
2. The transaction is not committed, and we receive an error saying so.
3. The transaction may or may not have been committed, and we receive a commit_unknown_result error.
I was originally focused on the third situation. That case is mostly (though not entirely, it turns out!) handled by the way that we read a document's existing revision information during an update, so a retry of an already-committed edit surfaces as a conflict. However, another aspect of this is distinguishing between situations 1 and 2 in the face of an unknown commit status. At that point we could just as well throw an error to the client, but that's unlikely to lead to happy fun times for our users, so attempting to resolve that issue is part of the motivation here.

The third option in the RFC about creating transaction ids makes this a lot easier to understand. Basically, with every transaction we write a randomized key to the database. At the start of the transaction we check for that key, and if it exists we just return successfully to the client. Options 1 and 2 are variations on where exactly this transaction id is stored.
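For illustration, a minimal sketch of how that third option could look, written against the FoundationDB Python bindings rather than CouchDB's Erlang; the TXN_IDS subspace, the key shape, and the apply_edit callback are my assumptions, not anything defined in the RFC:

```python
import os
import fdb

fdb.api_version(620)
db = fdb.open()

TXN_IDS = fdb.Subspace(('txn_ids',))  # hypothetical subspace for txn markers

@fdb.transactional
def idempotent_update(tr, txn_id, apply_edit):
    # If the marker exists, a previous attempt already committed; a retry
    # (e.g., after commit_unknown_result) just reports success to the client.
    if tr[TXN_IDS.pack((txn_id,))].present():
        return 'already_committed'
    apply_edit(tr)                     # perform the actual document edit
    tr[TXN_IDS.pack((txn_id,))] = b''  # marker: this logical update committed
    return 'committed'

# One txn_id per logical operation, reused across retries; the retry loop in
# fdb.transactional re-runs this function after retryable commit errors.
result = idempotent_update(db, os.urandom(16), lambda tr: None)
```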
So that said, while the duplicate entries in the changes feed that initially motivated this discussion should not be an issue given those reads, document recreation is a remaining edge case. This is because a recreation does not actually go through normal MVCC, since that would require clients to have looked up a possibly deleted document on every document creation; we just grab whatever is there. (Also related: initial doc creation does not have this issue, because the absence of any revisions is treated as the precondition.)

Of the three options in the RFC, I think that Option 3 is probably the best path forward, as it's relatively straightforward both conceptually and implementation-wise. The one thing I'd tweak is the cleanup aspect. Rather than having Erlang nodes periodically sweeping ets tables, I'd prefer something a bit more reliable to ensure we're not slowly accumulating garbage in the transaction id key space. I'm not 100% sure on the best route forward, but I've had two thoughts. The first is to pair a UUID with a timestamp and then have each request probabilistically add in a clear for that keyspace prior to some earlier time (i.e., a 1 in 1,000 chance that any given request clears out transaction ids older than an hour); a rough sketch follows below. Obviously time in a distributed system is not a reliable thing, so that makes me a bit concerned about how this could go wrong. The second is to include the hostname of the Erlang node and then clean based on the local node's time, which is less terrible. Or maybe some sort of monitoring of pids in transactions would be enough to know what can be cleared? Though that sounds scarily like it'd turn into our new couch_server.
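As a thought experiment, that first idea might look something like the following (again the Python fdb bindings; the probability and the one-hour horizon are just the numbers from above, and leading the key with a coarse timestamp, unlike the earlier sketch, is my assumption so that "older than the cutoff" becomes a single contiguous range):

```python
import os
import time
import random
import fdb

fdb.api_version(620)
db = fdb.open()

TXN_IDS = fdb.Subspace(('txn_ids',))
CLEANUP_PROBABILITY = 1.0 / 1000   # ~1 in 1,000 requests runs a sweep
MAX_AGE_SEC = 3600                 # clear txn ids older than an hour

def new_txn_key():
    # The leading timestamp makes expiry a contiguous key range; the caller
    # must remember its (timestamp, uuid) pair to find the marker on retry.
    return TXN_IDS.pack((int(time.time()), os.urandom(16)))

@fdb.transactional
def clear_expired(tr, cutoff_ts):
    # A single range clear removes expired txn ids for every database at
    # once, because this subspace is global rather than per-DB.
    tr.clear_range(TXN_IDS.range().start, TXN_IDS.pack((cutoff_ts,)))

def maybe_cleanup():
    if random.random() < CLEANUP_PROBABILITY:
        clear_expired(db, int(time.time()) - MAX_AGE_SEC)
```

The clock-skew worry above applies directly here: a node with a fast clock could sweep away a marker that another node's in-flight retry still needs.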
It seems like we really need to sit down and think hard about the recreated document scenario and whether the current approach of extending a branch is even correct. It has so many weird edge cases, like the "validation function bypass on replication" one.
I've cleaned up the RFC to reflect our conclusion on the handling of unknown commit results. I think this is ready for a final merge unless anyone has any objections.
Opening a PR now to get comments. There were three open questions on the mailing list, among them the behavior of feed=continuous and, as the third, how to handle commit_unknown_result from FoundationDB. I think we're close to consensus on the first two and have selected options in this RFC. I have an opinion about the third one as well, but have left all the options I could think of in the RFC since it's quite fresh.