Crash resistance for excludeMembers #123
Comments
Gonna write some more notes to try to figure out exactly how I should go about this. Goals:
Questions:
Tough question, but I'm also leaning towards anyone helping proceed with the exclusion, simply because that dangling exclude-member msg may be confusing.
Being eager about it shouldn't be a problem, because of the "same membership" forked epoch resolution. So if admin A tried to exclude Oscar but stopped in between, then admins B and C can proceed to do it, and they will create two forked epochs, but they'll have the same membership set, and then the tie-breaking rule applies. In terms of code, I don't know how to organize it.
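To make the "same membership" argument concrete, here is a minimal sketch of how two forked epochs created by B and C could be resolved. The epoch shape (`{ id, members }`) and the function names are hypothetical, and the tie-break used here (smallest `id` wins) is only an assumption for illustration; the actual rule in the spec may differ.

```javascript
// Hypothetical epoch shape: { id: string, members: string[] }

// True if both epochs contain exactly the same set of members.
function sameMembership(a, b) {
  const setA = new Set(a.members);
  const setB = new Set(b.members);
  if (setA.size !== setB.size) return false;
  for (const member of setB) {
    if (!setA.has(member)) return false;
  }
  return true;
}

// Returns the epoch every peer should converge on, or null if the forks
// do not all share the same membership (that case needs other handling).
function pickPreferredEpoch(forks) {
  if (forks.length === 0) return null;
  const [first, ...rest] = forks;
  if (!rest.every((epoch) => sameMembership(first, epoch))) return null;
  // ASSUMPTION: deterministic tie-break on the smallest id, so every
  // peer picks the same winner no matter which fork it saw first.
  return forks.reduce((best, epoch) => (epoch.id < best.id ? epoch : best));
}
```

Because the winner depends only on the forks themselves, it shouldn't matter whether B or C finished the exclusion first.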
Yeah I think I basically ended up going with being agnostic towards who made the breaking state.
I think I was about to try the eager solution as well but ended up deciding against it, since most/all the recovery logic uses long-ish timeouts in it, which would make regular function usage way too slow.
For exclusion we post 3 different messages:

1. `group/exclude-member` message. We don't need to recover from this: if we crash on this step, the user can tell and they just have to try again.
2. `group/init` to init the new epoch. Hopefully it's enough to look for a 1. msg. But hmm when should we search for that? If we call `excludeMembers` again with the exact same args? Should `excludeMembers` maybe just post exclude-member, and msgs 2. and 3. should be left to listeners?
3. `group/add-member` messages. The lib/epoch function `getMissingMembers` is probably very helpful here.

Todos:

- Test that we recover on restart of our client
- Maybe also run the recovery functions at the start of all functions that get from/operate on members tangles, e.g. if the user notices that exclusion failed, so they click Exclude again
  - hmm the time delays are core in the recovery (not doing it when not needed), and we can't have such delays before running most functions
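A rough sketch of how recovery could decide where an interrupted exclusion stopped, based on the three messages above. Everything here is hypothetical: the `state` object is an in-memory summary I made up for illustration, and `missingMembers` is a local stand-in, not the real lib/epoch `getMissingMembers` (whose signature isn't shown in this issue).

```javascript
// Hypothetical summary of what has already been published for one exclusion:
// {
//   excludePosted: boolean,   // 1. group/exclude-member exists
//   newEpochInited: boolean,  // 2. group/init for the new epoch exists
//   remaining: string[],      // members who should end up in the new epoch
//   added: string[],          // members already covered by 3. group/add-member
// }

// Local stand-in for the idea behind getMissingMembers (assumed behavior).
function missingMembers(remaining, added) {
  const addedSet = new Set(added);
  return remaining.filter((member) => !addedSet.has(member));
}

// Returns the next step needed to finish the exclusion.
function nextExclusionStep(state) {
  // Crashing before msg 1 needs no recovery; the user just retries.
  if (!state.excludePosted) return { step: 'exclude-member' };
  if (!state.newEpochInited) return { step: 'init' };
  const missing = missingMembers(state.remaining, state.added);
  if (missing.length > 0) return { step: 'add-member', missing };
  return { step: 'done' };
}
```

Keeping this check as a pure function of the observed messages means it can be run by whoever notices the dangling state, without caring who started the exclusion.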