Ordering by CommitStamp is not reliable #159
The SQL statement issued to Oracle is shared by all SQL persistence engines:

```sql
SELECT StreamId, StreamRevision, CommitId, CommitSequence, CommitStamp, Headers, Payload
FROM Commits
WHERE Dispatched = 0
ORDER BY CommitStamp
LIMIT @Limit OFFSET @Skip;
```

so I assume this issue is a problem across all of them. There is already a CommitSequence that is incremented per stream. Would something like this work (just off the top of my head)?

```sql
SELECT StreamId, StreamRevision, CommitId, CommitSequence, CommitStamp, Headers, Payload
FROM Commits
WHERE Dispatched = 0
GROUP BY StreamId
ORDER BY CommitSequence
LIMIT @Limit OFFSET @Skip;
```

Any chance of a failing test? (Doesn't have to be mspec based, xunit would do.) |
It looks like for the MySQL dialect any DateTime is saved as Ticks; look at the CoalesceParameterValue override. This is a common problem even in MongoPersistence (so I would guess also in Raven). There is an issue for Mongo with a pull request that changes Mongo to save Ticks; however, it is a breaking change for current users… |
Breaking changes of this nature would be a huge problem without some kind […]
|
I have done a little more research into this. For the SQL engine, Oracle is particularly bad because it stores CommitStamp as a DATE type, which only goes down to second precision. It should probably be changed to TIMESTAMP, which brings the precision down to the nanosecond level, and it would be easy enough to write a SQL script to apply that update to any database. For standard SQL Server 2008+ we should probably use datetime2 instead of datetime, which gives us 100-nanosecond precision. Here is a good link comparing datetime2 to equivalent types in other dialects: http://wiki.ispirer.com/sqlways/sql-server/data-types/datetime2. Alternatively, we can move all dialects to use Ticks and write a batch of SQL scripts to update each of the SQL dialects. For Mongo we currently only get millisecond precision, so saving Ticks in some form is the only way to get more precise, because Mongo does not support any other built-in datetime types; we would have to write a database update script for that one as well. For Raven, it turns out the default gives the same precision as datetime2, so if that is precise enough we would not have to touch Raven. |
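A quick sketch of the precision point being made here (the dates, values, and the `to_ticks` helper are illustrative, not NEventStore code): truncating to whole seconds, as Oracle's DATE type does, collapses nearby stamps onto the same value, whereas .NET-style ticks keep them distinct and correctly ordered.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1, 1, 1)

def to_ticks(d):
    # .NET ticks: 100ns intervals since 0001-01-01 (1 microsecond = 10 ticks).
    return (d - EPOCH) // timedelta(microseconds=1) * 10

# Two hypothetical commits created 500 microseconds apart.
a = datetime(2013, 5, 13, 15, 26, 0, 100)
b = datetime(2013, 5, 13, 15, 26, 0, 600)

# Truncated to whole seconds (Oracle DATE), the two stamps collide...
collides = a.replace(microsecond=0) == b.replace(microsecond=0)
# ...while stored as ticks they stay distinct and correctly ordered.
ordered = to_ticks(a) < to_ticks(b)
print(collides, ordered)  # True True
```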
While more precise datetime values are definitely more desirable across all storage engines, breaking schema changes are not going to make 3.2, but will be deferred to 4.0. Yes, we will need to supply a migration script. Nonetheless, and unless I'm missing something, increasing the precision of CommitStamp won't solve this particular issue, but merely reduce the chance of it occurring? |
Yes, it will just reduce the chance. The only way I can think of to completely solve it would be some sort of global CommitSequence (the current one is only per stream, I believe), but managing that would be a mess. We could order by CommitStamp and then by CommitSequence, but that would only guarantee that commits within a stream are in the correct order; if they are in different streams there is still a chance they are out of order. I would assume that is less likely to be a problem? |
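A minimal sketch of the tiebreaker idea above, using SQLite and a cut-down Commits table (a simplification, not NEventStore's actual schema): when two commits on the same stream share a CommitStamp, ordering by the stamp alone leaves their relative order undefined, while adding CommitSequence as a secondary key makes the per-stream order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Commits (
    StreamId TEXT, CommitSequence INTEGER,
    CommitStamp TEXT, Dispatched INTEGER)""")

# Two commits on stream s1 share a second-precision CommitStamp and are
# inserted out of sequence order to simulate arbitrary physical order.
conn.executemany("INSERT INTO Commits VALUES (?, ?, ?, ?)",
    [("s1", 2, "2013-05-13 15:26:00", 0),
     ("s1", 1, "2013-05-13 15:26:00", 0)])

# Ordering by CommitStamp alone leaves the tie unresolved; CommitSequence
# as a tiebreaker makes the per-stream order deterministic.
fixed = conn.execute("""SELECT StreamId, CommitSequence FROM Commits
    WHERE Dispatched = 0
    ORDER BY CommitStamp, CommitSequence""").fetchall()
print(fixed)  # [('s1', 1), ('s1', 2)]
```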
We assure this in a single partition but not between many. E.g. if on the same […]
Doubt is not a pleasant condition, but certainty is an absurd one. |
In a previous lifetime I handled this as well, but using Timestamp (ticks) […]
|
A stream represents the transactional boundary, and they may even be distributed in the future, so yeah, global commit sequence wouldn't work.
I think that is all we care about. Guaranteeing order across streams when reading / replaying doesn't make sense to me. Are there any use cases where it does? |
There are tons of use cases for assuring ordering across streams. Either perfectly […]
|
@gregoryyoung What if you re-partition, i.e. move streams between partitions? (Is that even a thing?) Does a stream care which partition it lives on? Do streams themselves get partitioned, i.e. really long lived stream where older events are stored remote, and newer events are 'local'? |
@Jonne Can you confirm whether the ordering of events is important to you per-stream, or globally? |
We support the ability to provide partition locality to two streams. For your situation, sure, we can do that, but it will be as multiple streams.
|
For us it's important, because we synchronize events between systems and have dependencies between denormalizers. |
@Jonne ordering per-stream, or ordering globally? |
OK, there are two issues here. The first is guaranteed ordering per stream, which is a bug that could be fixed for 3.2. (If anyone would like to comment on my proposed SQL change, I'd appreciate that.) The second is global ordering across streams, which would probably mean a breaking schema change, i.e. we'd need to have a counter somewhere. I, personally, am not particularly fond of this due to the sharding implications as well as the contention/concurrency issues with multiple writers in different locations. I concede that it's worth discussing further for a future version, though. Ideally such functionality would be provided as an extension. |
@dennisdoomen If you have a moment, could you expand on your distributed scenario? Are you replicating the store? Do you allow distributed writes or are the peers readonly? Cheers. |
For your SQL, as written it doesn't work, right? The grouping breaks the select, because all of the selected columns other than StreamId are missing from the GROUP BY clause, so it will fail. How about this: […] We would probably need to update indexes so this stays fast for large databases, but it should guarantee ordering per stream. Now, if it turns out that two commits on different streams have the same CommitStamp, it will always give you the stream with the fewest commits first, which may or may not be a problem. |
@damianh We are essentially replicating the store to different peers (which we call 'slaves') bidirectionally. We use the concept of ownership at the aggregate level to prevent multiple peers from changing the same stream. This prevents any functional conflicts (for now). We also extended the Commits table with information about the peer that caused the commits, so that we can optimize how the data is transferred between the peers (which can be over LAN, WiFi or 3G). |
This fix guarantees ordering per stream. It will produce a different sequence across streams than previously, but there have never been any guarantees in this respect in the first place. |
Global ordering was broken by the "fix" in pull request #169. Now we cannot put version 3.2 into production: our projections rely on the correct order of events, so projections cannot be rebuilt. |
@gaevoy please see #170 and specifically damianh's last comment. |
@gaevoy in version 3.2 and 4.x, ordering is only guaranteed per stream. There was never any 'global' ordering guarantee. In 5.x, ordering across streams is supported (but at the cost of sharding). The primary purpose of this is to support replay / catchup. If you require global ordering across streams for projection logic, and you are doing DDD, you should revisit that design decision. |
As of 4.1.0.7 (available on the CI feed), the ordering is now by CommitStamp followed by StreamId, CommitSequence. This is sub-optimal, but it matches the 3.2 behaviour and fixes the reliability problem. This modification is in 4.1 only and will not be ported to 5.x; the recommended way there is to use the checkpoint feature. |
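A minimal sketch of what that ordering clause yields when stamps collide (SQLite, simplified schema, illustrative data, not NEventStore's real DDL): commits sharing a stamp come back grouped by stream, in per-stream sequence order, which is deterministic but can differ from true arrival order, hence "sub-optimal".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Commits (
    StreamId TEXT, CommitSequence INTEGER, CommitStamp TEXT)""")

# Three commits with an identical second-precision CommitStamp,
# inserted in interleaved stream order.
conn.executemany("INSERT INTO Commits VALUES (?, ?, ?)",
    [("s2", 1, "2013-05-13 15:26:00"),
     ("s1", 2, "2013-05-13 15:26:00"),
     ("s1", 1, "2013-05-13 15:26:00")])

# The 4.1.0.7 ordering: stamp first, then stream, then per-stream sequence.
rows = conn.execute("""SELECT StreamId, CommitSequence FROM Commits
    ORDER BY CommitStamp, StreamId, CommitSequence""").fetchall()
print(rows)  # [('s1', 1), ('s1', 2), ('s2', 1)]
```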
@damianh, I have tested 4.1.0.8 from NuGet on our production event stream; now everything is okay, and all projections were rebuilt. Thank you. |
Thx for the feedback.
|
Note the problem reported by @gaevoy also affects the IStoreEvents.Advanced.GetFrom and GetFromTo APIs (not only GetUndispatchedCommits). For those, ordering by CommitStamp (first) is absolutely necessary to implement catch up. Without that ordering by CommitStamp, the whole (global) commit stream would need to be read into memory to get a meaningful checkpoint. Therefore, catch up cannot actually be implemented between 3.2 and 4.1.0.7 AFAICS. |
The GetUndispatchedCommits query orders by CommitStamp. However, these can be the same for multiple commits. On Oracle this can result in events being dispatched in the wrong order. Maybe we should add a commit sequence (Identity or sequence) and order commits based on this?
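The identity-column idea suggested here (essentially what the checkpoint feature in 5.x later provided) can be sketched with SQLite, where an `INTEGER PRIMARY KEY AUTOINCREMENT` column plays the role of the commit sequence; the `CheckpointNumber` name and the schema are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# CheckpointNumber is a monotonically increasing identity column; it gives
# a total order over commits even when CommitStamp values collide.
conn.execute("""CREATE TABLE Commits (
    CheckpointNumber INTEGER PRIMARY KEY AUTOINCREMENT,
    StreamId TEXT, CommitSequence INTEGER, CommitStamp TEXT)""")

for stream, seq in [("s1", 1), ("s2", 1), ("s1", 2)]:
    conn.execute(
        "INSERT INTO Commits (StreamId, CommitSequence, CommitStamp) "
        "VALUES (?, ?, '2013-05-13 15:26:00')", (stream, seq))

# Dispatch/replay in insertion order, independent of stamp precision.
order = [r[0] for r in conn.execute(
    "SELECT StreamId FROM Commits ORDER BY CheckpointNumber")]
print(order)  # ['s1', 's2', 's1']
```

As discussed above, maintaining such a counter has contention and sharding implications in a distributed store; this only sketches the single-node mechanics.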