changefeedccl: document high-water-mark semantics in REGIONAL BY ROW tables during network partition #93203
Comments
cc @cockroachdb/cdc |
@kathancox I think this is a docs issue; we should possibly add a section on network partitions and how changefeeds respond. TL;DR: all the nodes will keep emitting and changefeed guarantees still hold. |
@amruss so in a |
@amruss I have created a docs issue to update the docs around this: https://cockroachlabs.atlassian.net/browse/DOC-6492 |
@stevekuznetsov there is some complexity here depending on where the coordinator node is. The coordinator will not be able to see some of the aggregator nodes and vice versa. I don't think we've tested this exact scenario, so I'm not 100% sure whether we will fail the changefeed or not. I think this issue should be to test this scenario and document it. |
@amruss sounds good - it would be awesome as well if we could differentiate in the doc between design constraints for the system and emergent behavior from e.g. the query planner or your comment about the coordinator. The broader use-case is, in a |
Highwater mark is tracked per range -- so, I assume under a partitioning scenario (and assuming changefeed [...]). It does sound to me that in such a case, running a geo-filtered changefeed should work. |
@miretskiy I'm not sure if I understood your reply correctly, please confirm. I think you are saying that in a multi-region table (possibly with many ranges) there will only be a single coordinator and CRDB doesn't allow for specifying in which region the coordinator runs. |
In addition to that, does it mean that running a changefeed over a multi-region table and losing a region stops propagating the high-water mark altogether? I think this is true, at least based on some info I have found in the docs: I'd appreciate if somebody could confirm. Thanks! |
@p0lyn0mial the caveat I understood is that your change-feed can be filtered, and if the filter pins a specific geo, you get resolved timestamps in that filtered view. |
@miretskiy's comment sounds like this is not implemented today in a sensible way (but could be) because there is no changefeed aggregator geo pinning. |
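To make the question in the comments above concrete, here is a toy model (not CockroachDB's implementation) of a high-water mark computed as the minimum of per-span progress. The function name `high_water_mark`, the region names, and the integer timestamps are all hypothetical. Keying progress by region rather than by range is a simplification, and whether a geo-filtered changefeed really excludes other regions' spans from its frontier is exactly the open question in this thread.

```python
from typing import Dict, Optional, Set


def high_water_mark(
    span_resolved: Dict[str, int],
    watched_regions: Optional[Set[str]] = None,
) -> int:
    """Return the minimum resolved timestamp across the watched spans.

    ASSUMPTION: progress is keyed by region name for illustration only;
    a real system would track many spans per region.
    """
    relevant = [
        ts
        for region, ts in span_resolved.items()
        if watched_regions is None or region in watched_regions
    ]
    if not relevant:
        raise ValueError("no spans match the requested regions")
    return min(relevant)


# Hypothetical progress: europe-west1 is partitioned away and stuck at 40.
progress = {"us-east1": 105, "us-west1": 104, "europe-west1": 40}

table_global = high_water_mark(progress)              # held back by the stalled region
local_only = high_water_mark(progress, {"us-east1"})  # advances with local spans only
print(table_global, local_only)
```

Under this model a table-global mark stalls at the partitioned region's last progress, while a mark restricted to the local region keeps advancing. That is the behavior the geo-filtered proposal hopes for, not something confirmed here.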
Yep, that was the gist of my request for:
|
A client uses change-feeds to keep an up-to-date client-side cache of a table. This client requires observing events in a global order, so they wait for high-water-marks to re-order previously-received events client-side before consuming them. When this client applies this approach to a REGIONAL BY ROW table, what is the expected semantic of high-water-marks delivered in the change-feed during a network partition between regions that the table has rows for?
If the client wants to keep at the very least a strongly consistent view of local data in the table, would it suffice to issue a geo-filtered change-feed, and would the high-water marks for those filtered change-feeds be delivered when the change-feed in question has seen all relevant events up to the mark, or are the high-water-marks still table-global?
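For readers unfamiliar with the client-side pattern described above, a minimal sketch follows. It buffers row events and releases them in timestamp order only once a high-water mark covers them. The class `ReorderBuffer`, its method names, and the integer timestamps are hypothetical, and it assumes a mark at time T means no further events with timestamp at or below T will arrive (change `<=` to `<` if the guarantee is strict).

```python
import heapq
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class ReorderBuffer:
    """Holds row events until a high-water mark says they are safe to consume."""

    _heap: List[Tuple[int, int, Any]] = field(default_factory=list)
    _seq: int = 0  # tie-breaker so rows themselves are never compared

    def add_row(self, ts: int, row: Any) -> None:
        """Buffer a row event that may have arrived out of order."""
        heapq.heappush(self._heap, (ts, self._seq, row))
        self._seq += 1

    def on_resolved(self, resolved_ts: int) -> List[Tuple[int, Any]]:
        """Release buffered events with ts <= resolved_ts, oldest first.

        ASSUMPTION: the mark covers timestamps at or below resolved_ts.
        """
        released = []
        while self._heap and self._heap[0][0] <= resolved_ts:
            ts, _, row = heapq.heappop(self._heap)
            released.append((ts, row))
        return released


buf = ReorderBuffer()
buf.add_row(30, "c")
buf.add_row(10, "a")
buf.add_row(20, "b")
first_batch = buf.on_resolved(20)   # events at or below the mark, in order
second_batch = buf.on_resolved(30)  # the remaining event
print(first_batch, second_batch)
```

With this pattern, a client stops consuming entirely if the mark stops advancing, which is why the semantics of the mark during a partition matter.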
Jira issue: CRDB-22219
Epic CRDB-21737