-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recover by creating a new epoch if someone only posted an exclude message #127
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'On exclusion, recover if we fail to re-add anyone at all' seems to be flaking a bit on 'There was only 1 re-addition message from alice', it gets 2 instead of 1. Seems to happen more locally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wonder if alice's listener runs so fast that it accidentally creates a new epoch while exclusion is happening, ending up with 2 epochs and therefore 2 re-addition msgs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also happened in ci now. but not because of the latest commit, it happened when i stashed that one too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm no both messages seem to belong to the same epoch. could be more race conditions? maybe just a race condition in the reAdd function?
ok 10 alice excludes bob but crashes before re-adding herself and carol
readdmember res [
{
"previous": "%TC5NW1rAOkiQ/W4YTLhFrh4ASAsI7EcQ8SjOWSZQIRw=.sha256",
"sequence": 3,
"author": "@M3OAID01925gO7/l3ZjKbtbmUkXw0gvqRwr5n7hwwPk=.ed25519",
"timestamp": 1685619076372,
"hash": "sha256",
"content": {
"type": "group/add-member",
"version": "v2",
"secret": "47qTvz1zDX5aUBXZPfYx45ORSNXculWk8pUeirK+Hyg=",
"root": "ssb:message/classic/8ZNj1MuaTVEwULaFtrlzlIKqFLcM6NxmiVODovwoY-0=",
"creator": "ssb:feed/bendybutt-v1/R7oVP6uSBqSmyC-iQxX67S9Zl2498yxJk4NarqMaBtI=",
"recps": [
"ssb:identity/group/98leh3cs6xQmqOVAG90zdMI6WhO04KsaR4kkyds25oU=",
"ssb:feed/bendybutt-v1/R7oVP6uSBqSmyC-iQxX67S9Zl2498yxJk4NarqMaBtI=",
"ssb:feed/bendybutt-v1/vNKLsZwOzBfYvqaOeaBQqQ6dPy7HE9ogb3DnzlTaonM="
],
"tangles": {
"group": {
"root": "ssb:message/classic/8ZNj1MuaTVEwULaFtrlzlIKqFLcM6NxmiVODovwoY-0=",
"previous": [
"ssb:message/classic/mFD-lXW4dmVUoYOL0ybqjI5G8fVAORgNHrvvN1JOuMY="
]
},
"members": {
"root": "ssb:message/classic/mFD-lXW4dmVUoYOL0ybqjI5G8fVAORgNHrvvN1JOuMY=",
"previous": [
"ssb:message/classic/mFD-lXW4dmVUoYOL0ybqjI5G8fVAORgNHrvvN1JOuMY="
]
}
}
},
"signature": "ybDL5FGpyvTCtTedu0NVe0EiDVXuesmLaS3rQsSZi6WOmWObIBAx6Qtq+o+QndvGMV65s2DDgTd3CcmWraw6Bw==.sig.ed25519"
}
]
readdmember res [
{
"previous": "%6lgUNyUwrYgtu9PPbrmTnlyrbJq0z9paj2oBkcjXSog=.sha256",
"sequence": 4,
"author": "@M3OAID01925gO7/l3ZjKbtbmUkXw0gvqRwr5n7hwwPk=.ed25519",
"timestamp": 1685619076457,
"hash": "sha256",
"content": {
"type": "group/add-member",
"version": "v2",
"secret": "47qTvz1zDX5aUBXZPfYx45ORSNXculWk8pUeirK+Hyg=",
"root": "ssb:message/classic/8ZNj1MuaTVEwULaFtrlzlIKqFLcM6NxmiVODovwoY-0=",
"creator": "ssb:feed/bendybutt-v1/R7oVP6uSBqSmyC-iQxX67S9Zl2498yxJk4NarqMaBtI=",
"recps": [
"ssb:identity/group/98leh3cs6xQmqOVAG90zdMI6WhO04KsaR4kkyds25oU=",
"ssb:feed/bendybutt-v1/R7oVP6uSBqSmyC-iQxX67S9Zl2498yxJk4NarqMaBtI=",
"ssb:feed/bendybutt-v1/vNKLsZwOzBfYvqaOeaBQqQ6dPy7HE9ogb3DnzlTaonM="
],
"tangles": {
"group": {
"root": "ssb:message/classic/8ZNj1MuaTVEwULaFtrlzlIKqFLcM6NxmiVODovwoY-0=",
"previous": [
"ssb:message/classic/6lgUNyUwrYgtu9PPbrmTnlyrbJq0z9paj2oBkcjXSog="
]
},
"members": {
"root": "ssb:message/classic/mFD-lXW4dmVUoYOL0ybqjI5G8fVAORgNHrvvN1JOuMY=",
"previous": [
"ssb:message/classic/6lgUNyUwrYgtu9PPbrmTnlyrbJq0z9paj2oBkcjXSog="
]
}
}
},
"signature": "JPGP2ppjawHR4TV+s4pVyyfudYNK449ua3G2mhu6Vc9F/yB8G9L6XJBgxIJuTaRuMsGGXv+MztmpDjpzun3ZBg==.sig.ed25519"
}
]
ok 11 got num of additions after exclude
not ok 12 There was only 1 re-addition message from alice
---
operator: equal
expected: 1
actual: 2
at: Test.<anonymous> (/home/me/prj/ssb/ssb-tribes2/test/exclude-members.test.js:857:5)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the timeout system a bit now so the default in tests is 50ms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm now there's a new one
# Can exclude a person in a group with a lot of members
ok 1 tribes2 started
ok 2 created a group
ok 3 invited 10 peers
ok 4 invited 10 peers
ok 5 peers accept invites
ok 6 alice was able to exclude bob in a group with many members
key ssb:message/classic/0Im6p8Re4bMMTmgaRod2E85YCvMpz49WBPHFeE6a8FI= found, but is disconnected from main graph
[...]
key ssb:message/classic/0Im6p8Re4bMMTmgaRod2E85YCvMpz49WBPHFeE6a8FI= found, but is disconnected from main graph
not ok 7 bob is excluded from group
---
operator: deepEqual
expected: |-
{ id: 'ssb:identity/group/MpdBB51KjaHZ_sFCqIcB1nNE-b9jMkvr6gROxeNflj4=', excluded: true }
actual: |-
{ id: 'ssb:identity/group/MpdBB51KjaHZ_sFCqIcB1nNE-b9jMkvr6gROxeNflj4=', excluded: undefined }
at: Test.<anonymous> (/home/me/prj/ssb/ssb-tribes2/test/exclude-members.test.js:600:5)
stack: |-
Error: bob is excluded from group
at Test.assert [as _assert] (/home/me/prj/ssb/ssb-tribes2/node_modules/.pnpm/tape@5.6.3/node_modules/tape/lib/test.js:312:48)
at Test.bound [as _assert] (/home/me/prj/ssb/ssb-tribes2/node_modules/.pnpm/tape@5.6.3/node_modules/tape/lib/test.js:95:17)
at Test.tapeDeepEqual (/home/me/prj/ssb/ssb-tribes2/node_modules/.pnpm/tape@5.6.3/node_modules/tape/lib/test.js:553:7)
at Test.bound [as deepEquals] (/home/me/prj/ssb/ssb-tribes2/node_modules/.pnpm/tape@5.6.3/node_modules/tape/lib/test.js:95:17)
at Test.<anonymous> (/home/me/prj/ssb/ssb-tribes2/test/exclude-members.test.js:600:5)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
...
not ok 8 Others got excluded from the group
---
operator: fail
at: <anonymous> (/home/me/prj/ssb/ssb-tribes2/test/exclude-members.test.js:619:20)
stack: |-
Error: Others got excluded from the group
at Test.assert [as _assert] (/home/me/prj/ssb/ssb-tribes2/node_modules/.pnpm/tape@5.6.3/node_modules/tape/lib/test.js:312:48)
at Test.bound [as _assert] (/home/me/prj/ssb/ssb-tribes2/node_modules/.pnpm/tape@5.6.3/node_modules/tape/lib/test.js:95:17)
at Test.fail (/home/me/prj/ssb/ssb-tribes2/node_modules/.pnpm/tape@5.6.3/node_modules/tape/lib/test.js:406:7)
at Test.bound [as fail] (/home/me/prj/ssb/ssb-tribes2/node_modules/.pnpm/tape@5.6.3/node_modules/tape/lib/test.js:95:17)
at /home/me/prj/ssb/ssb-tribes2/test/exclude-members.test.js:619:20
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async Test.<anonymous> (/home/me/prj/ssb/ssb-tribes2/test/exclude-members.test.js:606:3)
...
1..8
# tests 8
# pass 6
# fail 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So bob thinks he's not excluded, and the other peers all seem to be creating new epochs, ending up with 11 sometimes...
Could definitely do more logging here to dig deeper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mixmix feel free to take a look here. i can't decide if it's an antipattern to have the timeout be 0 in tests or having it be non-zero. and do these errors that pop up point to a design error in the recovery or is there a simple fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm when we ask people to create a new epoch we don't also tell them to re-add members, should do that at least
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-adding people in the listener now too but didn't seem to affect this test
OK, have split this into two PRs to simplify debugging:
|
test/list-members.test.js
Outdated
timeoutLow: 0.5, | ||
timeoutHigh: 0.7, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just wrote this in chat about this test:
hmm quick hunch in one test:
alice adds bob and carol
alice removes bob
carol creates a new epoch too for some reason
should be fine in theory but alice doesn't get that epoch, and when alice later adds david, it's only to her own epoch
then when alice gets carol's epoch, there's a conflict
alice picks carol's epoch as preferred since it has fewer members
-> wrong member list, missing david
easiest "fix" for all this is just to disable recovery in most tests (set timeout high (ie default)). then selectively enable in a few cases that we actually want to test
i might just also switch from this to doing the demo app. would be fun to show in berlin. and would show us what bugs actually exist IRL
tests passing now (just linter failing). mostly since we disabled recovery in most tests, so, limited knowledge gained. wondering if we should just merge and see how this fares in the demo app. the situations we're testing just get more and more complex 😓 🤷 |
demo app mostly working now, gonna try this branch out |
Doesn't seem to break stuff a lot but there's this bug that pops up (i think unrelated to this PR) that probably stops the code in this PR from running at all ssbc/ssb-tribes2-demo#7 |
Fresh review (in progress) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love it.
@@ -780,10 +788,10 @@ test('On exclusion, if we fail to re-add all people, someone else does that inst | |||
|
|||
test('On exclusion, recover if we fail to re-add anyone at all', async (t) => { | |||
const run = Run(t) | |||
const alice = Testbot({ name: 'alice', timeoutScale: 0 }) | |||
const alice = Testbot({ name: 'alice', ...lowTimeouts }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🌶️ - IMO use the API that ssb-tribes uses, so people reading test code don't get confused by API
const alice = Testbot({ name: 'alice', ...lowTimeouts }) | |
const alice = Testbot({ name: 'alice', tribes2: lowTimeouts }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to skip this change. It's just a recommended pattern
p(bob.close)(true), | ||
p(carol.close)(true), | ||
]) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A beautiful test 🍑
Co-authored-by: mix irving <mix@protozoa.nz>
Based on #126
Progresses #123
Fixes #30