
Ordering by CommitStamp is not reliable #159

Closed
Jonne opened this issue Apr 18, 2013 · 28 comments

@Jonne
Contributor

Jonne commented Apr 18, 2013

The GetUndispatchedCommits query orders by CommitStamp. However, these can be the same for multiple commits. On Oracle this can result in events being dispatched in the wrong order. Maybe we should add a commit sequence (Identity or sequence) and order commits based on this?
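
A minimal sketch of the failure mode and the proposed fix, using Python and sqlite purely for illustration (table and column names follow the schema discussed in this thread; the real store is Oracle/SQL Server):

```python
import sqlite3

# Two commits share a CommitStamp recorded at second precision, so
# ORDER BY CommitStamp alone cannot tell them apart and the engine is
# free to return them in either order.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Commits (
    StreamId TEXT, CommitSequence INTEGER,
    CommitStamp TEXT, Dispatched INTEGER)""")
conn.executemany(
    "INSERT INTO Commits VALUES (?, ?, ?, 0)",
    [("stream-a", 2, "2013-04-18 10:00:00"),   # inserted out of order
     ("stream-a", 1, "2013-04-18 10:00:00")])  # on purpose

# Adding a per-stream sequence as a tiebreaker makes the order deterministic.
rows = conn.execute(
    "SELECT CommitSequence FROM Commits WHERE Dispatched = 0 "
    "ORDER BY CommitStamp, CommitSequence").fetchall()
print(rows)  # [(1,), (2,)]
```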

@ghost ghost assigned damianh May 9, 2013
@damianh
Contributor

damianh commented May 11, 2013

The SQL statement issued to Oracle is shared by all SQL persistence engines:

SELECT StreamId, StreamRevision, CommitId, CommitSequence, CommitStamp, Headers, Payload
FROM Commits
WHERE Dispatched = 0
ORDER BY CommitStamp
LIMIT @Limit OFFSET @Skip;

so I assume this issue is a problem across all of them.

There is already a CommitSequence that is incremented per stream. Would something like this work (just off the top of my head)?

SELECT StreamId, StreamRevision, CommitId, CommitSequence, CommitStamp, Headers, Payload
FROM Commits
WHERE Dispatched = 0
GROUP BY StreamId
ORDER BY CommitSequence
LIMIT @Limit OFFSET @Skip;

Any chance of a failing test? (Doesn't have to be mspec based, xunit would do)

@erothFEI

It looks like the MySQL dialect saves any DateTime as Ticks. Look at the CoalesceParameterValue override. This is a common problem even in MongoPersistence (so I would guess also in Raven). There is an issue for Mongo with a pull request that changes Mongo to save Ticks, however it is a breaking change for current users....

@joliver
Contributor

joliver commented May 13, 2013

Breaking changes of this nature would be a huge problem without some kind of transition script.


@erothFEI

I have done a little more research into this. For the SQL engine, Oracle is particularly bad because it stores CommitStamp as a DATE type, which only has second precision. It should probably be changed to TIMESTAMP, which brings the precision down to the nanosecond level. It would be easy enough to write a SQL script to apply that change to an existing database.

For SQL Server 2008+ we should probably use datetime2 instead of datetime, which brings us down to 100-nanosecond precision. Here is a good link comparing datetime2 to equivalent types in other dialects: http://wiki.ispirer.com/sqlways/sql-server/data-types/datetime2.

Or we can move all dialects to use Ticks and write a set of SQL migration scripts, one per dialect.

For Mongo we currently only get millisecond precision, so switching to some form of saved ticks is the only way to get more precision, because Mongo does not support any other built-in datetime types. We would have to write some kind of db update script for this one as well.

For Raven, it turns out the default already gives the same precision as datetime2, so if that is precise enough we would not have to touch Raven.
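
For reference, a sketch of the "ticks" representation being discussed, assuming the .NET convention of 100-nanosecond intervals since 0001-01-01 (Python used for illustration):

```python
from datetime import datetime, timezone

# .NET DateTime.Ticks counts 100-nanosecond intervals since
# 0001-01-01T00:00:00 (UTC assumed here for simplicity).
DOTNET_EPOCH = datetime(1, 1, 1, tzinfo=timezone.utc)

def to_ticks(dt: datetime) -> int:
    delta = dt - DOTNET_EPOCH
    # timedelta resolves to microseconds, i.e. 10 ticks per microsecond
    return (delta.days * 864_000_000_000
            + delta.seconds * 10_000_000
            + delta.microseconds * 10)

# Two stamps one microsecond apart remain distinct as ticks, whereas a
# second- or millisecond-precision column would collapse them into a tie.
a = to_ticks(datetime(2013, 5, 14, 12, 0, 0, 1, tzinfo=timezone.utc))
b = to_ticks(datetime(2013, 5, 14, 12, 0, 0, 2, tzinfo=timezone.utc))
print(b - a)  # 10 ticks = 1 microsecond
```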

@damianh
Contributor

damianh commented May 14, 2013

While more precise datetime values are definitely more desirable across all storage engines, breaking schema changes are not going to make 3.2, but will be deferred to 4.0. Yes, we will need to supply a migration script.

Nonetheless, and unless I'm missing something, increasing the precision of CommitStamp won't solve this particular issue, but merely reduce the chance of it occurring?

@erothFEI

Yes, it will just reduce the chance. The only way I can think of to completely solve it would be some sort of global commit sequence (the current one is only per stream, I believe), but managing that would be a mess.

We could update the query to order by CommitStamp and then by CommitSequence. That would only guarantee that commits within a stream are in the correct order; commits in different streams could still come back out of order, but I would assume that is less likely to be a problem?

@gregoryyoung

We assure this in a single partition but not between many, e.g. if on the same physical instance. Something similar could probably work well.


@gregoryyoung

In a previous lifetime I handled this as well, using a timestamp (ticks) and a random number. The likelihood of both colliding is pretty ridiculous.

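
A minimal sketch of the "timestamp plus random number" idea, under the assumption that ties on the clock are broken by a random component (names here are hypothetical):

```python
import random
import time

# Build an ordering key from a 100 ns tick count plus a random 32-bit
# tiebreaker: two keys collide only if both the tick AND the random
# draw match, which is astronomically unlikely in practice.
def ordering_key() -> tuple:
    ticks = time.time_ns() // 100           # 100 ns "ticks"
    return (ticks, random.getrandbits(32))  # random tiebreaker

keys = [ordering_key() for _ in range(1000)]
print(len(set(keys)) == len(keys))  # keys are (almost surely) all distinct
```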

@damianh
Contributor

damianh commented May 14, 2013

A stream represents the transactional boundary, and streams may even be distributed in the future, so yeah, a global commit sequence wouldn't work.

only guarantee commits within a stream are in the correct order

I think that is all we care about. Guaranteeing order across streams when reading / replaying doesn't make sense to me. Are there any use cases where it does?

@gregoryyoung

There are tons of use cases for assuring order across streams, either perfectly or not. History can always be replayed in perfect order, but live ordering cannot be guaranteed in a distributed environment.


@damianh
Contributor

damianh commented May 14, 2013

@gregoryyoung What if you re-partition, i.e. move streams between partitions? (Is that even a thing?) Does a stream care which partition it lives on? Do streams themselves get partitioned, i.e. a really long-lived stream where older events are stored remotely and newer events are 'local'?

@damianh
Contributor

damianh commented May 14, 2013

@Jonne Can you confirm whether the ordering of events is important to you per-stream, or globally?

@gregoryyoung

We support the ability to provide partition locality to two streams.

For your situation, sure, we can do that, but it will be as multiple streams.


@Jonne
Contributor Author

Jonne commented May 14, 2013

For us it's important, because we synchronize events between systems and have dependencies between denormalizers.

@damianh
Contributor

damianh commented May 14, 2013

@Jonne ordering per-stream, or ordering globally?

@dennisdoomen

@damianh Ordering globally. We (me and @Jonne) assign a unique sequence number to every commit and use that to keep track of which commits a particular peer in an (occasionally disconnected) distributed scenario is missing. This has proved to be a very robust solution.
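
A sketch of the gap-tracking idea just described (the helper name is hypothetical): each commit carries a unique, monotonically increasing sequence number, so a peer can compute exactly which commits it is missing from the set it has already seen.

```python
# Given the set of sequence numbers a peer has seen and the latest
# sequence number known to exist, return the gaps to request.
def missing_sequences(seen: set, latest: int) -> list:
    return [n for n in range(1, latest + 1) if n not in seen]

print(missing_sequences({1, 2, 5}, 5))  # [3, 4]
```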

@damianh
Contributor

damianh commented May 15, 2013

Ok, there are two issues here. The first is guaranteed ordering per-stream, which is a bug that could be fixed for 3.2. (If anyone would like to comment on my proposed SQL change, I'd appreciate that.)

The second is global ordering across streams, which would probably mean a breaking schema change, i.e. we'd need to have a counter somewhere. I, personally, am not particularly fond of this due to the sharding implications as well as the contention/concurrency issues with multiple writers from different locations. I concede that it's worth discussing further for a future version though. Ideally such functionality would be provided as an extension.

@damianh
Contributor

damianh commented May 15, 2013

@dennisdoomen If you have a moment, could you expand on your distributed scenario? Are you replicating the store? Do you allow distributed writes or are the peers readonly? Cheers.

@erothFEI

Regarding your SQL: as written it doesn't work, right? The GROUP BY breaks the SELECT, because all of the selected columns other than StreamId are missing from the GROUP BY clause, so it will fail.

How about this:
SELECT StreamId, StreamRevision, CommitId, CommitSequence, CommitStamp, Headers, Payload
FROM Commits
WHERE Dispatched = 0
ORDER BY CommitStamp, CommitSequence

We would probably need to update the indexes so this stays fast for large databases, but it should guarantee ordering per stream.

Now if it turns out that two commits on different streams have the same CommitStamp, it will always give you the stream with the fewest commits first, which may or may not be a problem.
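
A sketch of that cross-stream tie behaviour (sqlite and Python used purely for illustration, schema reduced to the relevant columns): with ORDER BY CommitStamp, CommitSequence, two commits on different streams that share a stamp come back with the lower per-stream CommitSequence first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Commits (
    StreamId TEXT, CommitSequence INTEGER,
    CommitStamp TEXT, Dispatched INTEGER)""")
conn.executemany(
    "INSERT INTO Commits VALUES (?, ?, ?, 0)",
    [("busy-stream", 7, "2013-05-15 09:00:00"),   # 7th commit on its stream
     ("new-stream",  1, "2013-05-15 09:00:00")])  # same stamp, 1st commit

# The stream with the fewer commits so far wins the tie.
rows = conn.execute(
    "SELECT StreamId FROM Commits WHERE Dispatched = 0 "
    "ORDER BY CommitStamp, CommitSequence").fetchall()
print(rows)  # [('new-stream',), ('busy-stream',)]
```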

@dennisdoomen

@damianh We are essentially replicating the store to different peers (which we call 'slaves') bidirectionally. We use the concept of ownership at the aggregate level to prevent multiple peers from changing the same stream. This prevents any functional conflicts (for now). We also extended the Commits table with information about the peer that caused the commits, so that we can optimize how data is transferred between the peers (can be LAN, Wifi or 3G).

@damianh
Contributor

damianh commented May 16, 2013

This fix guarantees ordering per-stream. It will produce a different sequence across streams than previously, but there have never been any guarantees in that respect in the first place.

@damianh damianh closed this as completed May 16, 2013
damianh added a commit that referenced this issue May 17, 2013
@gaevoy

gaevoy commented Sep 13, 2013

Global ordering was broken by the "fix" in pull request #169. Now we cannot put version 3.2 into production. Our projections rely on the correct order of events, so they cannot be rebuilt.
Can you revert this change?

@serra
Contributor

serra commented Sep 13, 2013

@gaevoy please see #170 and specifically damianh's last comment.

@damianh
Contributor

damianh commented Sep 13, 2013

@gaevoy in version 3.2 and 4.x, ordering is only guaranteed per stream. There was never any 'global' ordering guarantee. In 5.x, ordering across streams is supported (but at the cost of sharding). The primary purpose of this is to support replay / catchup. If you require global ordering across streams for projection logic, and you are doing DDD, you should revisit that design decision.

@damianh
Contributor

damianh commented Sep 16, 2013

As of 4.1.0.7 (available on CI feed) the ordering is now by CommitStamp followed by StreamId, CommitSequence. This is sub-optimal but it matches 3.2 behaviour and fixes the reliability problem.

This modification is in 4.1 only and will not be ported to 5.x - the recommended way is to use the checkpoint feature.

@gaevoy

gaevoy commented Sep 17, 2013

@damianh, I have tested 4.1.0.8 from NuGet on our production event stream, and now everything is okay; all projections were rebuilt. Thank you.

@damianh
Contributor

damianh commented Sep 17, 2013

Thx for the feedback.

@ulrichb

ulrichb commented Dec 13, 2013

Note that the problem reported by @gaevoy also affects the IStoreEvents.Advanced.GetFrom and GetFromTo APIs (not only GetUndispatchedCommits). For those, ordering by CommitStamp (first) is absolutely necessary to implement catch-up. Without that ordering, the whole (global) commit stream would need to be read into memory to get a meaningful checkpoint. Therefore, catch-up cannot actually be implemented between 3.2 and 4.1.0.7 AFAICS.
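
A sketch of the checkpoint-based catch-up pattern being described (function and field names here are hypothetical, not the NEventStore API): if commits come back ordered by CommitStamp, a reader only needs to persist the last stamp it handled and can resume from there, instead of scanning the whole global commit stream to find its place.

```python
# get_from: callable returning commits after a given checkpoint,
#           e.g. a wrapper over IStoreEvents.Advanced.GetFrom.
# handle:   callable applied to each commit (a projection, say).
def catch_up(get_from, checkpoint, handle):
    for commit in get_from(checkpoint):
        handle(commit)
        checkpoint = commit["CommitStamp"]  # valid only if stamp-ordered
    return checkpoint                       # persist as the new checkpoint

# Toy in-memory store: stamps double as payload identifiers.
commits = [{"CommitStamp": s, "Payload": f"event-{s}"} for s in (1, 2, 3)]
seen = []
new_cp = catch_up(lambda cp: [c for c in commits if c["CommitStamp"] > cp],
                  0, seen.append)
print(new_cp)  # 3
```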

@damianh damianh removed their assignment May 4, 2015