Introduce MutationJournal for coordinator logs #3914
iamaleksey merged 1 commit into apache:cep-45-mutation-tracking
I thought we were going to stick with a time-based HLC for this?
Most recently we've discussed going back to sequential ids, or possibly a 4-byte sequential id plus a 4-byte timestamp component to aid debuggability.
Sequential ids have a couple of added benefits:
- Replicas can close whole ranges even with a coordinator down, if they have the entire sequence among them: with sequential ids you can tell whether anything is missing in between.
- You can tell precisely how many mutations a replica is behind with respect to a particular coordinator log.
Either way, this is something that can be easily switched back/forth at any point in the future.
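To make the two benefits above concrete, here is a minimal sketch, not the code in this patch. It assumes a hypothetical 8-byte id with the sequence number in the high 4 bytes and a seconds timestamp in the low 4 bytes, and non-negative sequence numbers. The class and method names (`SequentialIds`, `pack`, `coversRange`, `lag`) are made up for illustration.

```java
import java.util.TreeSet;

final class SequentialIds {
    // Hypothetical layout: high 32 bits = sequence number, low 32 bits = timestamp (seconds).
    static long pack(int sequence, int timestampSeconds) {
        return ((long) sequence << 32) | (timestampSeconds & 0xFFFFFFFFL);
    }

    static int sequence(long id) {
        return (int) (id >>> 32);
    }

    static int timestampSeconds(long id) {
        return (int) id;
    }

    // True if the replicas, taken together, hold every sequence number in [first, last].
    // With sequential ids a hole is detectable by counting, without asking the coordinator.
    static boolean coversRange(int first, int last, int[]... perReplica) {
        TreeSet<Integer> union = new TreeSet<>();
        for (int[] sequences : perReplica)
            for (int s : sequences)
                if (s >= first && s <= last)
                    union.add(s);
        return union.size() == (long) last - first + 1;
    }

    // Number of mutations a replica is behind for one coordinator log.
    static long lag(int coordinatorLastSequence, int replicaLastSequence) {
        return (long) coordinatorLastSequence - replicaLastSequence;
    }
}
```

For example, `coversRange(1, 5, new int[]{1, 2, 4}, new int[]{3, 5})` holds because the two replicas have the full sequence among them, while `{1, 2}` and `{4, 5}` leave a hole at 3.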
Ah OK, I didn’t realize. Yeah, it should also be easier to encode ranges of them.
So the benefits of using an HLC would be:
- Tombstone purging is straightforward. I’m not sure if it’s 100% necessary though. I think OPRT works with just `isRepaired` and no timestamp.
- Repair barriers. I’m not sure how these would work with sequential ids.
- Debuggability. If we ever have to debug an incident 1-2 months after it occurred, having a timestamp component will make our lives a lot easier.
I think all of these can be addressed with the 4-byte per-id timestamp component (in seconds) though, so long as we guarantee it doesn’t move backwards. If the logId were a timestamp (seconds), we could get just under 72/3 days of millisecond/microsecond information in the per-id timestamp if the log id is used as an epoch. Not sure if that would be worth the trouble though.
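One way the "doesn’t move backwards" guarantee could be sketched is to clamp each wall-clock reading to the largest value handed out so far. This is an illustration under assumptions, not what the patch does: `MonotonicSeconds` is a hypothetical name, and the clock is injected as a `LongSupplier` so the behaviour can be exercised with a fake clock.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

final class MonotonicSeconds {
    private final LongSupplier wallClockSeconds;
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

    MonotonicSeconds(LongSupplier wallClockSeconds) {
        this.wallClockSeconds = wallClockSeconds;
    }

    // Returns the current time in seconds, but never a value smaller than one returned earlier.
    // If the wall clock steps back, the previous value is repeated until the clock catches up.
    long next() {
        long now = wallClockSeconds.getAsLong();
        return last.updateAndGet(prev -> Math.max(prev, now));
    }
}
```

In production-style use this would be constructed with something like `new MonotonicSeconds(() -> System.currentTimeMillis() / 1000)`. The guarantee only holds within one process lifetime; surviving a restart would need the last value to be persisted, which this sketch does not attempt.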
Force-pushed the added component.
patch by Aleksey Yeschenko; reviewed by Blake Eggleston for CASSANDRA-20353