Consider global deterministic ordering of commits across streams #170
Just adding more precision to the CommitStamp in the db significantly improves the odds of not having commits with the same stamp. However, this is also restricted by the resolution of the DateTime.UtcNow property in .NET, which is only updated every 10-15 milliseconds or so. You can get around this by using the Stopwatch class in .NET, which gives you precision practically down to the tick level. In my fork, which can be seen here: erothFEI@aab703f, I have added a unit test that demonstrates what I am thinking. It is obviously not a real unit test; it was just a quick way to demonstrate the idea. Doing it like that makes it virtually impossible to have a duplicate CommitStamp on the same machine, and duplicates would happen only very rarely across different machines. And honestly, if it were to happen, then the two commits really did happen at exactly the same time, and does it matter which one was first if you are dealing with different streams? You would still probably want to sort by CommitSequence after sorting by CommitStamp to guarantee the correct order within a stream. Introducing some kind of global sequence across machines seems like way too much of a hassle and makes it much more difficult to scale freely. What do you guys think? |
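A minimal sketch of the idea in the comment above, written in Python rather than .NET and not taken from the fork: anchor the wall clock once, then add elapsed time from a high-resolution monotonic counter (playing the role of Stopwatch). The class name `HighResolutionClock` is hypothetical, and the "never repeat a value" tie-breaker is an added assumption, not something the comment describes.

```python
import threading
import time


class HighResolutionClock:
    """Wall-clock anchor plus a high-resolution monotonic counter (hypothetical helper)."""

    def __init__(self):
        # Read the coarse wall clock once, then measure elapsed time precisely.
        self._anchor_wall_ns = time.time_ns()
        self._anchor_counter_ns = time.perf_counter_ns()
        self._last_ns = 0
        self._lock = threading.Lock()

    def now_ns(self):
        elapsed = time.perf_counter_ns() - self._anchor_counter_ns
        candidate = self._anchor_wall_ns + elapsed
        with self._lock:
            # Assumption: bump by 1 ns if the counter did not advance,
            # so stamps from this process are strictly increasing.
            self._last_ns = max(candidate, self._last_ns + 1)
            return self._last_ns


clock = HighResolutionClock()
stamps = [clock.now_ns() for _ in range(10_000)]
```

This only addresses duplicates within one process; stamps produced on different machines can still collide or disagree because their clocks are not synchronized.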
I don't think having commits with the same commit stamp is an actual problem. The library is designed with multiple writers (same or different machines) in mind, so it's perfectly viable to have commits with the same commit stamp. The first problem is that the queries were sorting by commit stamp, resulting in an incorrect order within a stream. This has been fixed in PR #169, where we now sort by CommitSequence and thus guarantee ordering within a stream. The second problem is that people are using CommitStamp to infer global ordering. This is bad. It looks like it works... until it doesn't. Also, the client is responsible for setting the CommitStamp value, so there is no central authority ensuring that the values are correct. Really, CommitStamp should at this point be treated as purely informational and not as guaranteeing anything.
I agree, but it seems others don't and I'd like to understand their use cases more. |
http://stackoverflow.com/questions/17483273/neventstore-issue-with-replaying-events Currently, when replaying from the store, events can come in a really messy order. For example, events recorded today come before events recorded 1 month ago. This is huge. It means we have to design the read model without relations, because everything is blowing up right now. :( After all, this is history; events (stream revisions) should be replayed in the correct order somehow. We do not use shards or multiple writers, which is why I cannot help much here. We were forced to remove all hard relations (NHibernate read model) and replace them with simple Guids. The joins are done in code. If we were going to shard that would be OK, but we are not. |
There is a geteventstore adapter coming for persistence. This is already On Monday, July 8, 2013, mynkow wrote:
Le doute n'est pas une condition agréable, mais la certitude est absurde. |
I've responded to the SO question. Feel free to continue discussion here Yes, there is a GetEventStore adapter in the works, and while it will If you need to stay with SQL Server, and you must *have "global
Regards, Damian On 8 July 2013 11:47, mynkow notifications@github.com wrote:
|
None actually global ordering is quite simple in a single partition. In Most can run in a single partition thus perfect ordering (let's say 10k tps) On Monday, July 8, 2013, Damian Hickey wrote:
|
When using multiple partitions (aka shards), do you have a single sequence number somewhere that guarantees ordering across all the partitions? |
Vector clocks. On Monday, July 8, 2013, Damian Hickey wrote:
|
I do not get it. I cannot imagine a model without any relations. When replaying events, I get an event for a user before the event that creates that user. The two events were persisted 5 months apart and were raised by different aggregates. |
You should consider not doing DDD/CQRS/ES so. It's not a silver bullet. If On 8 July 2013 15:24, mynkow notifications@github.com wrote:
|
Sorry but what?! I don't normally claim authority but here is a place I What is wrong with expecting ordering in an event stream outside aggregate Greg On Monday, July 8, 2013, Damian Hickey wrote:
|
Firstly, I suspect @mynkow has some domain modeling probs. I don't have
Em... NES does guarantee ordering in an event stream. The typical NES |
Ok so streams per aggregate work very well in the domain. However many (most) projections cross aggregate boundaries in which case Greg On Monday, July 8, 2013, Damian Hickey wrote:
|
Yep, and the aggregates may be in different domains and may not even be in the same store / partition / node / locality / connectivity. |
And this is what you think about when deciding that. Ordering between On Monday, July 8, 2013, Damian Hickey wrote:
|
You mean ordering between partitions in GES, and not in a "persistence On 8 July 2013 16:48, Greg Young notifications@github.com wrote:
|
In any case. The normal way of doing it is to provide a 100% deterministic On Monday, July 8, 2013, Damian Hickey wrote:
|
Yeah, I think NES can be improved in this regard. To clarify, when you say 100% deterministic do you mean:
Replay 1: A1, A2, A3, A4, B1, B2, B3, B4 At the moment, NES supports scenario 1. In scenario 2 for replay 2, NES |
2 On Monday, July 8, 2013, Damian Hickey wrote:
|
I think NES (at least in the SQL engine) is just basing ordering on CommitSequence. So with this example:
C1 = A1, A2, A3
So the events happened like this: A1, A2, A3, B1, B2, B3, A4, A5, B4
So NES saves in commits: C1, D1, C2, C3, D2
But when getting all events it would order by CommitSequence: C1, D1, C2, D2, C3
Which would give you events in this order: A1, A2, A3, B1, B2, B3, A4, B4, A5
So if one aggregate grows faster in number of commits compared to other aggregates, you start getting more and more out-of-order events across aggregates.
Eric Roth |
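The example above can be reproduced with a short Python sketch. Only C1's contents are given in the comment; the contents of D1, C2, C3 and D2 below are assumptions inferred from the event and commit orders it lists, and the tuple layout is purely illustrative, not NEventStore's data model.

```python
# Commits in the order they were written: (stream, commit_sequence, events).
# Everything except C1 is an inferred assumption.
commits = [
    ("A", 1, ["A1", "A2", "A3"]),  # C1
    ("B", 1, ["B1", "B2", "B3"]),  # D1
    ("A", 2, ["A4"]),              # C2
    ("A", 3, ["A5"]),              # C3
    ("B", 2, ["B4"]),              # D2
]


def flatten(ordered_commits):
    """Concatenate the events of each commit in the given commit order."""
    return [event for _, _, events in ordered_commits for event in events]


written_order = flatten(commits)

# Ordering by CommitSequence alone, ignoring which stream a commit belongs to.
# Python's sort is stable, so ties keep their written order here; a database
# gives no such promise for rows with equal sort keys.
by_sequence = sorted(commits, key=lambda commit: commit[1])
replayed_order = flatten(by_sequence)
```

With these assumed commits, `replayed_order` puts B4 ahead of A5 even though A5 was written first, which is the cross-aggregate drift the comment describes.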
Eric is correct But my main point is that it is still deterministic if there haven't been Hmm, I wonder if that's why JO use CommitStamp there in the first place? In On 8 July 2013 17:23, erothFEI notifications@github.com wrote:
|
Yea, that is why I tried creating that fork, which just significantly reduces the chance of duplicate CommitStamps by increasing the precision. You could practically eliminate the chance of duplicate CommitStamps on the same machine and significantly reduce the chance across machines, because CommitStamps would be stored at the nanosecond level (depending, of course, on whether the DB can store that much precision). Using vector clocks would also work, but it has issues of its own. In GES, what are you using as the actor, the actual client or the servers? Do you do vector pruning? Eric Roth |
In single partition we use two longs (prepare and commit position logically On Monday, July 8, 2013, erothFEI wrote:
|
@erothFEI's solution is far better than the current one. As already said: "So if one aggregate grows faster in number of commits compared to other aggregates you start getting more and more out of order events across aggregates." Of course you cannot get a 100% correct order, but at least you do not get events that are 5-6 months old replayed after the newest ones. On Monday I will try @erothFEI's code and ping back. |
Just take a subscription and introduce a delay of, say, 5 seconds to reorder. On Monday, July 8, 2013, mynkow wrote:
|
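One way to read the "delay to reorder" suggestion is as a small reorder buffer: hold incoming events briefly and release them in stamp order once they are older than the delay. The sketch below is an assumption about what was meant, not code from GES or NEventStore; `ReorderBuffer` and its methods are hypothetical names, and stamps are plain numbers in seconds.

```python
import heapq


class ReorderBuffer:
    """Hold events for `delay_seconds`, then release them sorted by stamp."""

    def __init__(self, delay_seconds):
        self._delay = delay_seconds
        self._heap = []
        self._counter = 0  # tie-breaker: equal stamps keep arrival order

    def push(self, stamp, event):
        heapq.heappush(self._heap, (stamp, self._counter, event))
        self._counter += 1

    def pop_ready(self, now):
        """Return, in stamp order, every event stamped at or before now - delay."""
        ready = []
        while self._heap and self._heap[0][0] <= now - self._delay:
            ready.append(heapq.heappop(self._heap)[2])
        return ready


buffer = ReorderBuffer(delay_seconds=5)
buffer.push(12.0, "B1")  # arrives first but carries the latest stamp
buffer.push(10.0, "A1")
buffer.push(11.0, "A2")
first = buffer.pop_ready(now=16.0)   # releases stamps <= 11.0
second = buffer.pop_ready(now=20.0)  # releases stamps <= 15.0
```

An event that arrives more than the delay late still comes out of order, so the delay is a trade-off between latency and how much reordering can be absorbed.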
You can always change the ORDER BY to sort by commit stamp and then commit sequence. You will just want to make sure all your indexes are correct so it doesn't get slow. |
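A rough illustration of that ordering, using an in-memory SQLite database from Python. The table, column, and index names are hypothetical stand-ins, not NEventStore's actual schema, and the stamps are ISO-8601 strings so that text comparison matches time order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified commits table.
conn.execute(
    "CREATE TABLE Commits ("
    " StreamId TEXT NOT NULL,"
    " CommitSequence INTEGER NOT NULL,"
    " CommitStamp TEXT NOT NULL,"
    " PRIMARY KEY (StreamId, CommitSequence))"
)
# Index matching the ORDER BY so a replay does not need a full sort.
conn.execute(
    "CREATE INDEX IX_Commits_Stamp ON Commits (CommitStamp, CommitSequence)"
)
rows = [
    ("A", 1, "2013-02-01T09:00:00.000"),
    ("A", 2, "2013-07-08T10:00:00.000"),
    ("A", 3, "2013-07-08T10:00:00.000"),  # same stamp as the previous commit
    ("B", 1, "2013-07-01T12:00:00.000"),
]
conn.executemany("INSERT INTO Commits VALUES (?, ?, ?)", rows)

replay = conn.execute(
    "SELECT StreamId, CommitSequence FROM Commits"
    " ORDER BY CommitStamp, CommitSequence"
).fetchall()
```

Here the two commits of stream A that share a stamp are still returned in sequence order, while B's older commit lands between A's first and second. Commits from different streams with identical stamps and sequences would still have no defined relative order.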
@damianh I am thinking of loading the events by CommitStamp only when doing a replay. That way, loading aggregates will work normally. In theory this should work, because no ARs are involved at all when replaying the events. What do you think? |
@mynkow Doesn't guarantee ordering. There could still be commit messages within the same stream that have the same stamp. Depending on your system, this may be very unlikely to occur, but when it does, it could be nasty. We are currently considering tenant/partition level ordering though. |
Fixed. See forthcoming announcement on mailing list. |
I assume you are referring to the v5 CI packages announcement (including |
@serra Yep. |
Originally discussed in #159