ssb/consistency error: message sequence missmatch for feed #81

Open
cryptix opened this issue Mar 1, 2021 · 8 comments
Labels
bug Something isn't working documentation

Comments

@cryptix
Member

cryptix commented Mar 1, 2021

This might happen because of a local duplication bug (some messages are appended twice) in the verify logic.

It is a fatal error because it cripples replication: the system thinks a feed has more messages than it actually does.

The workaround is to start go-sbot with -fsck sequences -repair, which checks all the messages in the log and deletes the affected feeds. On the next sync/connection it will reload them.

AFAICT this only happens for external messages, not the ones signed by the bot itself.

@cryptix cryptix added the bug Something isn't working label Mar 1, 2021
@cryptix
Member Author

cryptix commented Mar 1, 2021

This is related to #40

@mhfowler

mhfowler commented Mar 3, 2021

@cryptix I just ran into this error too, and confirm your workaround fixed the issue, thanks!

@ahdinosaur

have been running into this error again and again, not sure if i'm doing something wrong but seems to happen occasionally if i stop and restart the service running go-sbot.

@mycognosist
Member

mycognosist commented Nov 24, 2022

Just ran into this with a build from latest code on master; running on Pi3 B+. Issue occurred after killing the go-sbot process and then running it again.

The offending feed was that of the go-sbot itself (ie. the local feed).

go-sbot: fsck returned: ssb/consistency error: message sequence missmatch for feed @Fq0a4uYHoEpYOV29LjQVK8Lt3fDK0epfjMxCa9/8NK4=.ed25519 Stored:6 Logical:4

After running go-sbot -fsck sequences -repair I was then able to execute go-sbot without any errors. However, I am no longer seeing replication with my local Patchwork instance.

CC: @decentral1se

@decentral1se
Member

@mycognosist thanks for the report! Would be great to fix this one.

Are you seeing anything in the go-sbot logs re: failure to replicate? Patchwork should be sending incoming replication requests.

Do you see any duplicated messages in the output of sbotcli log? Maybe I could try writing a script which looks for duplicates in the log and that would make rooting out the core of this issue easier? Unsure how hard it is to compare and probably slow, but might be handy. It'd be nice to at least confirm the assumption in #81 (comment) e.g. local duplication.

As for connecting this with #40, I'm a bit lost. That seems like the dragons part. Any further info would be great. Maybe we can still break this up into small parts for debugging / troubleshooting? Figuring out how to reproduce it seems difficult. Even if we could only do it manually, it'd be great. Perhaps some small win in the graceful shutdown code can be had...

For the time being, the work-around is probably the one mentioned in https://github.com/ssbc/go-ssb#startup-error--illegal-json-value but that is obviously not ideal. But just in case you need to keep moving for whatever reasons.

@mycognosist
Member

Hmm so what I notice is that the local log was seemingly wiped after the go-sbot -fsck sequences -repair command was executed.

If I call sbotcli hist now I only see one message: the post I made after starting go-sbot post-repair. That would explain why that message didn't make it across to Patchwork (sequence mismatch).

Let me take another run at this with a fresh start and see if I encounter the error again. I'll try to find duplicate messages if I hit a snag again.

One other detail to mention is that I was running legacy replication when this happened.

> Maybe I could try writing a script which looks for duplicates in the log and that would make rooting out the core of this issue easier?

This could be handy. I guess you could create a list of each message key you find and shout if there's a duplicate.

> As for connecting this with #40, I'm a bit lost. That seems like the dragons part. Any further info would be great.

Agreed. Could be a good issue to ask cryptix about on a call (if we can).

@decentral1se
Member

Potentially also related: #201

@decentral1se decentral1se pinned this issue Nov 24, 2022
@decentral1se
Member

> > Maybe I could try writing a script which looks for duplicates in the log and that would make rooting out the core of this issue easier?
>
> This could be handy. I guess you could create a list of each message key you find and shout if there's a duplicate.

An approach for checking for duplicates has emerged in #201 (comment) thanks to @mplorentz: scanning for duplicated signatures seems like a simple way to do it.


5 participants