Should we support fast batch validation #1
Comments
Isn't this just an internal optimization of validateBatch? I mean, if the API doesn't change, and there are no side effects (are there no side effects?) other than faster perf, why not do it?
The biggest problem I can think of is that someone writes a new client and messes up the signatures so that they in rare cases don't work. If we sync with this new client, we could end up accepting and replicating those broken messages.

To expand on that a bit: in both cases you have a portion of the network with one version of a feed and another portion with a different version. And the only way to allow that feed to continue (if that is even a goal?) would be to agree on what the common root chain was and then delete everything after that. This is what people do in the forking situation (since they are in the minority :-)).
Could the new validateBatch be:
Ah, I see what you mean. Replication happens before validation, and the peer who uploaded the messages will now mark the downloading peer as being at message N. Worse, the downloader could forward the invalid messages to another EBT peer, possibly before validation. We can pretty easily handle the case of an incorrect SSB app by applying validateSingle on the first ~10 messages of the feed, and then validateBatch on all the remaining ones. If signature generation is coded incorrectly, it's probably very rare that the resulting signatures are sometimes correct and sometimes incorrect.
Well, if the last one fails, then we are screwed anyway; no need to try and salvage any of the other messages, I think. The nice thing about the throttle is that we basically get this random sampling of points on the chain.
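If one wanted that sampling to be explicit rather than a side effect of throttling, a sketch could spot-check a few randomly chosen signatures in addition to the last one. The function names below are hypothetical, and `verifyOne` stands in for whatever per-message signature check the caller supplies:

```javascript
// Pick up to k distinct random indices in [0, n) -- the messages whose
// signatures we will spot-check in addition to the final one.
function pickSampleIndices(n, k) {
  const chosen = new Set();
  while (chosen.size < Math.min(k, n)) {
    chosen.add(Math.floor(Math.random() * n));
  }
  return [...chosen].sort((a, b) => a - b);
}

// Verify the last signature plus a random sample of earlier ones.
// verifyOne is assumed to be a per-message signature check.
function spotCheck(msgs, verifyOne, k = 3) {
  if (msgs.length === 0) return true;
  const indices = new Set(pickSampleIndices(msgs.length, k));
  indices.add(msgs.length - 1); // always include the last message
  return [...indices].every((i) => verifyOne(msgs[i]));
}
```

A buggy client that signs incorrectly only some of the time would then be caught with probability growing in the sample size, while keeping the cost well below full validation.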
Well, they would always have to be validated locally first in any case. But yes, as for replication: you assume that if you send seq 100 for feed A to another peer, that peer will have it, so you don't send it again. But the next time you connect with that peer, it will tell you that it never got the messages (this is similar to a crash before saving), so you send them again. Again, this is not really a big problem I think; the same could happen if you forked your feed.
I'm pretty sure we only send messages that we have in our db (meaning validated).
Sure, we could do some validation of the first messages in the batch as well, but no matter what we do there is still the chance that we would accept messages that don't validate, unless we validate everything. I'm just arguing that the risk of this is rather low, especially for a format like classic, and that there are already other, bigger problems like forking, where some way of signalling the other end that a feed is borked would be nice.
Yeah, I got a bunch of details wrong, sorry. About forking: I remember Aljoscha saying that fork recovery shouldn't be a thing, because then you gain the ability to freely rewrite the past in whatever way you want. But I'm thinking that this is a very theoretical possibility, and maybe we should after all just implement fork recovery with a simple strategy like longest-fork-wins.
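A longest-fork-wins rule could be as simple as the following sketch. The message shape (an array of messages with a `key` field) is an assumption for illustration, not an existing API:

```javascript
// Length of the common prefix of two branches, comparing message keys.
function commonPrefixLength(a, b) {
  let i = 0;
  while (i < a.length && i < b.length && a[i].key === b[i].key) i++;
  return i;
}

// Longest-fork-wins: keep whichever branch extends further beyond the
// common root (ties go to the branch we already have, `a`).
function resolveFork(a, b) {
  const root = commonPrefixLength(a, b);
  return a.length - root >= b.length - root ? a : b;
}
```

This is the "find the common root chain and discard one side" idea from earlier in the thread, with length past the root as the tie-breaker; everything after the root on the losing branch would be deleted.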
As discussed here ssb:message/sha256/Xf2JIAYPmJJ_0SRVoVQn3NwlC636pQbXmJW1HjLIKQ4=, it could be interesting to add support for only validating the signature of the last message in a batch. This is similar to what buttwoo does. The question is how we go about this. Should it be a config option?
It is significantly faster to sync with this trick; see the numbers here compared with this: