Apparently TagPidSequenceNr is reset to 1 when the application is restarted causing duplicates #312
Comments
Hi @glammers1, this looks broken. I just ran your scenario but it worked, and we have a test that covers this. Is this happening deterministically for you? If so, can you send me the logs at DEBUG level?
The logs I'd be interested in are (we left pretty verbose debug logging in for this initial version): the recovery of the persistent actor, showing which sequence number it recovers from:
Then any of the missing tag writes being sent:
And any tag progress updates:
I think this happens when you have a snapshot for the latest event when you shut down. The journal currently uses the replayed messages to restore tag pid sequence numbers and to write any tag_writes missed because of a hard/crash shutdown. I would suggest not using 0.80 if you use snapshots until we find a solution for this.
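The interaction described above can be illustrated with a minimal sketch. This is not the plugin's actual code; `Tagged` and `rebuild` are simplified stand-ins for illustration. The point is that if per-tag sequence numbers are rebuilt by counting replayed tagged events, a recovery that starts from a snapshot covering the latest event replays nothing, so the rebuilt counter restarts at 1:

```scala
// Hypothetical sketch: rebuilding per-tag sequence numbers from replay.
// If recovery starts from a snapshot, events at or before the snapshot
// are never replayed, so the counters start too low and the next tagged
// write reuses tagPidSequenceNr = 1.
object TagSequenceRecovery {
  final case class Tagged(payload: String, tags: Set[String])

  // Count how many replayed events carried each tag.
  def rebuild(replayed: Seq[Tagged]): Map[String, Long] =
    replayed.foldLeft(Map.empty[String, Long]) { (acc, ev) =>
      ev.tags.foldLeft(acc)((m, t) => m.updated(t, m.getOrElse(t, 0L) + 1))
    }

  def main(args: Array[String]): Unit = {
    val all = Seq(
      Tagged("Incremented-1", Set("all")),
      Tagged("Incremented-2", Set("all")),
      Tagged("Incremented-3", Set("all")),
      Tagged("Incremented-4", Set("all")))

    // Full replay: the next tagPidSequenceNr for "all" would be 5.
    assert(rebuild(all).getOrElse("all", 0L) + 1 == 5)

    // Snapshot covers all 4 events: nothing is replayed, counter resets to 1.
    val replayedAfterSnapshot = all.drop(4)
    assert(rebuild(replayedAfterSnapshot).getOrElse("all", 0L) + 1 == 1)
    println("ok")
  }
}
```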
Hi, as you have said, the issue is related to snapshotting. You can use this PoC to reproduce the issue (I think it is easier to run the PoC at DEBUG log level than to copy the logs here, in order to have full control over the problem, but I don't mind copying them here if you need them). The project has the following:
Note: if I move the saveSnapshot call out of the persistAll handler below, the issue does not occur:

```scala
persistAll(
  List(
    Tagged(Evt(data + state.counter), Set("all", "myTag")),
    Tagged(MessageProcessed, Set("all", "myTag")))) {
  case Tagged(evt @ Evt(_), _) =>
    updateState(evt)
    saveSnapshot(state)
  case Tagged(_, _) =>
    context.system.log.info("message processed")
}
```
Steps to reproduce:
Thanks for the comprehensive reproducer. The reason moving the saveSnapshot fixes it is that the bug only happens if the last event you persist before shutting down is covered by the latest snapshot. I've tested your project with the PR I raised (#316) and it fixes the issue.
Fixed by #316
I'm doing some tests with the default configuration and, in my humble opinion, there is some unexpected behavior.
I have an actor that receives Increment messages to increase a counter. When the application is run from scratch everything works fine: the Incremented events are persisted in the messages and tag_views tables, and tag_write_progress also looks pretty good.
(I've simplified the example for the sake of simplicity, just sending 4 Increment messages.)
messages:
tag_views:
tag_write_progress:
The next steps are: stop and re-run the application, then send a new Increment message. A TagPidSequenceNr of 1 is assigned to the new Incremented event. The tables change to:
messages:
tag_views:
tag_write_progress:
Now, if we read the journal with NoOffset, the counter value is 4 (not 5) because the last event is dropped with a log message:
The state of the actor is not correct and, for example, the following 3 Increment messages will also be dropped until tag_pid_sequence_nr reaches 5.
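The dropping behavior described above can be sketched as follows. This is a hypothetical simplification, not the plugin's actual query code; `Row` and `filterDuplicates` are illustrative names. The idea is that a tag stream tracks the next expected tagPidSequenceNr per persistence id and discards anything else as a duplicate, which is why the restarted actor's events (re-numbered from 1) never reach the application until the counter catches up to the expected value of 5:

```scala
// Hypothetical sketch of reader-side deduplication in an eventsByTag stream.
object TagStreamDedup {
  final case class Row(persistenceId: String, tagPidSequenceNr: Long, payload: String)

  // Keep only rows carrying exactly the expected sequence number for their
  // persistence id; advance the expectation when a row is accepted.
  def filterDuplicates(
      expected: Map[String, Long],
      rows: Seq[Row]): (Map[String, Long], Seq[Row]) =
    rows.foldLeft((expected, Vector.empty[Row])) { case ((exp, kept), row) =>
      val want = exp.getOrElse(row.persistenceId, 1L)
      if (row.tagPidSequenceNr == want)
        (exp.updated(row.persistenceId, want + 1), kept :+ row)
      else
        (exp, kept) // dropped: looks like a duplicate or out-of-order row
    }

  def main(args: Array[String]): Unit = {
    // The reader has already seen tagPidSequenceNr 1..4, so it expects 5 next.
    val expected = Map("counter-1" -> 5L)
    val (_, kept) =
      filterDuplicates(expected, Seq(Row("counter-1", 1L, "Incremented")))
    assert(kept.isEmpty) // the restarted actor's event is silently dropped
    println("ok")
  }
}
```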