Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Spontaneous leaving/rejoining rooms ["state resets"] #1953

Closed
richvdh opened this issue Feb 27, 2017 · 30 comments
Closed

Spontaneous leaving/rejoining rooms ["state resets"] #1953

richvdh opened this issue Feb 27, 2017 · 30 comments

Comments

@richvdh
Copy link
Member

richvdh commented Feb 27, 2017

This might be closely related to #1940, but in this case it's definitely not due to rejected events - and it appears the user can interact with the room as if they had never left

@richvdh
Copy link
Member Author

richvdh commented Feb 27, 2017

Example: @enckse:epiphyte.network left HQ on 2017-02-24 in event id $1487956741180WOzcD:epiphyte.network, but was rejoined on 2017-02-25 by event id $148803062826PXWcQ:potatofrom.space (and on a number of occasions thereafter).

This appears to be an artifact of the state resolution algorithm.

@richvdh
Copy link
Member Author

richvdh commented Feb 27, 2017

More details about the example above:

@enckse:epiphyte.network had previously done the following in HQ:

  • A: left ($1486594266447oNWWS:epiphyte.network)
  • B: joined ($1487868651107pBFZY:epiphyte.network)
  • C: left ($1487956741180WOzcD:epiphyte.network)

His state after these was, of course, C (left). $1487985638318SHxfm:57north.org.uk rewound this to A via its prev_event to $1487677365376MdkVb:slxh.eu (state A), because (A, C) resolves to A. That was ok, because he was still left; however $148803062826PXWcQ:potatofrom.space then reset this to state B via its prev_event to $1487897356168153YJjYh:matrix.org and others (state B), because (A, B) resolves to B.

@seanenck
Copy link
Contributor

FYI - this just happened again to me on 0.20.0 (recognizing this issue isn't closed, just still an issue in 0.20.0)

@richvdh
Copy link
Member Author

richvdh commented Apr 23, 2017

Indeed, 0.20 will have started to make this slightly rarer, thanks #2025, but it's still going to happen when servers turn up and serve new events which refer to old points in the event graph.

Using the example above, the event graph looks like this. (Time flows downwards. ASCII-art arrows represent 'prev_event' links, omitting irrelevant events. A/B/C represent enkse's state after the event).

$1486594266447oNWWS:epiphyte.network   (A)
  ^
  |
$1487677365376MdkVb:slxh.eu  (A)
  ^                      ^
  |                      +-----------------+
  |                                        |
$1487868651107pBFZY:epiphyte.network (B)   |
  ^                                        |
  |                                        |
$1487897356168153YJjYh:matrix.org (B)      |
  ^                      ^                 |
  |                      +---------------- | ---+
  |                                        |    |
$1487956741180WOzcD:epiphyte.network (C)   |    |
  ^                                        |    |
  +---------------+   +--------------------+    |
                  |   |                         |
   $1487985638318SHxfm:57north.org.uk (A)       |         
                  ^                             |
                  |   +-------------------------+
                  |   |
   $148803062826PXWcQ:potatofrom.space (B)

I asked @NegativeMjark for thoughts on how we could fix this; I think what he said amounts to this:

We need to change the state resolution algorithm so that two leaves (as we see at $1487985638318SHxfm:57north.org.uk) resolve to the second leave (C) rather than the first leave (A). In other words, in this example, (A, C) would resolve to C rather than A.

The problem this introduces is that different servers in the network could reach different conclusions about the state of the room. This in turn increases the likelihood of events being rejected. However:

  • hopefully the effects of this would be small: servers might disagree on whether a particular user was a member or not, but unless that user changes the state of the room, that doesn't really matter. (The user's server might get excluded from federation, but this is only an issue in large rooms with lots of servers in the federation, so he'll probably get the events from other servers eventually).
  • Rejected events reset the room state #1935 already means that different servers have different ideas about room state, and we've got this far without the sky falling in. It seems unlikely this would be worse.

An alternative approach would be for a "fixed" server to notice that this sort of state reversion would have happened, and to make sure that, in the next event it sends, it includes prev_event edges to allow "unfixed" servers to sort themselves out. (In the example above, this would be done by including prev_event edges to events with both B and C states).

@turt2live
Copy link
Member

We just had this on Synapse 0.21 (updated today). Please PM me for logs (@travis:t2l.io).

Background: We have a board of directors chat (fully intended to be private/secret), and a director recently left the board, and was naturally removed from the room about a month ago. They suddenly started receiving notifications from the room today, without causing any kind of event or without any action on our part (besides the upgrade?). We subsequently banned the person in an attempt to stop their notifications.

@JaniM
Copy link

JaniM commented May 19, 2017

Ftr, this has been seen in #offtopic.

Mijris kicked: https://matrix.to/#/!UcYsUzyxTGDxLBEvLz:matrix.org/$1495057823766vaFre:krtdex.com

Mijris talked: https://matrix.to/#/!UcYsUzyxTGDxLBEvLz:matrix.org/%241495116380125811LpGEh:chat.weho.st

No join inbetween. This also happened alongside a permisaions revert, where PL limits and user PLs were rolled back (quite common in #offtopic).

@ara4n
Copy link
Member

ara4n commented May 21, 2017

Reports of this happening in HQ for users on the matrix.org HS: https://matrix.to/#/!XqBunHwQIXUiqCaoxq:matrix.org/$14953870101384360LjBep:matrix.org

@richvdh richvdh changed the title Spontaneous leaving/rejoining rooms Spontaneous leaving/rejoining rooms ["state resets"] Aug 17, 2017
@richvdh
Copy link
Member Author

richvdh commented Aug 17, 2017

https://docs.google.com/document/d/1UqcfS72qh8V7iXJ4oEkMIQ_E63xybRkpLxMEWw-EHBo contains some notes on what we might do about this

@non-Jedi
Copy link

In #offtopic:matrix.org today we had the worst state reset we've yet seen (and
they've been a persistent problem in that room for a long time). Power levels in
the room were reset to a state they last held at least a year ago and possibly
as long ago as two years.

This resulted in only one person having an admin account. Luckily, he's still
around. Unluckily, he's lost the password to that account. If he is unable to
recover that account's password, a very active room with over 700 members will
be without any admins.

Furthermore, if that user wasn't around anymore, #offtopic would be
unrecoverable.

@richvdh
Copy link
Member Author

richvdh commented Oct 4, 2017

I had a look at what happened in #offtopic recently. #offtopic is particularly problematic, because the powerlevels have been changed a lot over time, by a number of users, some of whom have only been moderators in the past (so would not have had permission to set the powerlevel).

It appears that the reset in this case happened on 5th September, when event ID $150462325215JivgW:kamp.site was sent into the room referencing a couple of old events.

Some relevant power-level events:

$146634168435QVJBF:m.skaverat.net (A): skaverat gives Half-shot moderator powers.
$148044164010nTaIb:m.skaverat.net (B): skaverat gives Half-shot admin powers.
$1484704212476QrWha:half-shot.uk (C): Half-shot changes the power levels.

Here's a subset of the event graph for #offtopic. (Time flows downwards. ASCII-art arrows represent 'prev_event' links, omitting irrelevant events. [square brackets] represent graph depth. A/B/C represent the power_level state after the event):

$146634168435QVJBF:m.skaverat.net [90868] (A)
 ^
 |
[lots of state resets along the same lines, ending up back at A]
 ^
 |
$1480026000136LiHJq:krtdex.com [210694] (A)
 ^                    ^
 |                    +---------------------------+
 |                                                |
$148044164010nTaIb:m.skaverat.net [214271] (B)    |
 ^                                                |
 |                                                |
$1484704212476QrWha:half-shot.uk [246382] (C)     |
 ^                                                |
 |                                                |
$1484704352501ntrBL:half-shot.uk [246388] (C)     |
                  ^                               |
                  |   +---------------------------+
                  |   |
  $150462325215JivgW:kamp.site [402716] (A)

Note how the new event references old events which are in state A, and state C, but not any events which are in state B which is a necessary intermediate step to get from A to C. The state resolution algorithm therefore rejects C, and ends up at state A.

In technical terms, this is a lack of associativity in the state resolution algorithm. Treating state resolution as a function R, we have:

R(A, R(B,C)) = R(A, C) = A  

R(R(A,B), C) = R(B, C) = C

  ∴  R(A, R(B,C)) != R(R(A,B), C)

@richvdh
Copy link
Member Author

richvdh commented Oct 5, 2017

Power levels in the room were reset to a state they last held at least a year ago and possibly
as long ago as two years.

It was November 2016, just for the record.

@richvdh
Copy link
Member Author

richvdh commented Oct 5, 2017

Something else maybe worth recording: in general, it is not the case that R(A, B, C) is equivalent to either R(R(A,B), C) or R(A, R(B,C)).

@maxidorius
Copy link
Contributor

maxidorius commented Feb 24, 2018

Following further into this discussion:

  1. Wouldn't the PL case outlined above be a fringe case? How can a HS end up referencing both $1480026000136LiHJq:krtdex.com and $1484704352501ntrBL:half-shot.uk if the krtdex.com is a parent of the half-shot event?
  2. Shouldn't the internal resolution of the offending HS properly clear the extremities on that one, so prev_events does not reference parent(s) of a reference?
  3. Shouldn't a invalid HS protect itself against such "invalid" events?

@richvdh
Copy link
Member Author

richvdh commented Feb 25, 2018

How can a HS end up referencing both $1480026000136LiHJq:krtdex.com and $1484704352501ntrBL:half-shot.uk if the krtdex.com is a parent of the half-shot event?

If an HS missed the event after the krtdex.com event (and wasn't able to backfill it later), then it would end up with a gap in the DAG, and the krtdex.com event would be a forward extremity. Then, next time it sent an event, it would try to heal the DAG by sending an event which referenced all of its known forward extremities as prev_events.

Shouldn't the internal resolution of the offending HS properly clear the extremities on that one, so prev_events does not reference parent(s) of a reference?

It would do if it knew that one was a parent of another, but - as above - in this case kamp.site did not know that..

Shouldn't a invalid HS protect itself against such "invalid" events?

I don't really know what you mean here.

@maxidorius
Copy link
Contributor

Fair points for the missing DAG, even tho that seems implementation dependent: one might choose to not let events be created until the missing DAG has been fetched - with unforseen consequences to me at this point, so putting that one aside for now.

Shouldn't a invalid HS protect itself against such "invalid" events?

I don't really know what you mean here.

I would expect the state conflict resolution algo to only be applied to different branches of the DAG, but in this case it's the same one, and I don't see how that will not cause issues if they are intermediary states. But the real question is why would you apply this to the same DAG branch? Can't a HS ignore that parent events in the resolution?
This would obviously mean that a HS should only reference events in prev_events if they have the full DAG between them, so they can know if those are related.

@maxidorius
Copy link
Contributor

maxidorius commented Feb 25, 2018

Maybe I'm not wording this correctly, but what I mean to say is: it seems to me like this is unsolvable if you allow state conflict resolution algo to be applied to events in the same branch, since I don't see how you can guarantee avoiding a missing intermediary state in the missing part, so fetching the DAG between the events you try to apply the state to is also needed, whatever is done.

@richvdh
Copy link
Member Author

richvdh commented Feb 26, 2018

it seems to me like this is unsolvable if you allow state conflict resolution algo to be applied to events in the same branch

I'm not sure what you mean by "events in the same branch", but I guess you mean events where one is an ancestor of another? I don't think it gets any easier if you restrict it so that state resolution can only be performed between events which are not ancestors of one another.

since I don't see how you can guarantee avoiding a missing intermediary state in the missing part, so fetching the DAG between the events you try to apply the state to is also needed, whatever is done.

It should be noted that you will receive any state updates one way or another, even if you don't have the complete DAG. The problem is that you don't (necessarily) have the complete history of how the state got that way, which is (one of) the reasons the state resolution algorithm can't require you to walk the entire DAG.

@maxidorius
Copy link
Contributor

So, let's get back to basics. The state resolution algorithm is used when you need to find out the room state for an event which has 2 or more references in prev_events, because you need to find out what the combined state of those two events will be before applying the authorization rules on it.

From the discussion so far, you have at least two possibilities:

  1. None of the referenced events are ancestors of another. If searching the DAG, you will find a common ancestor for some of them, and ultimately a common ancestor to all.
  2. Some of the referenced events are ancestors of another(s). This means that the complete DAG has a "shortcut path" (there might be a technical term for this, if so please do tell).

For 1), I don't see an issue at first glance with the current algorithm, since you effectively need to merge together two views of the rooms that diverged and have not interacted with each other.

Now, for 2), taking back your example in offtopic, State (A) is an ancestor of (C), which means (A) is already "included" in (C). If you try to merge them (or replay them on top of each other), you will of course end up with a reset. It doesn't even make sense to want to merge A and C as there is no conflict to start with!

At this point, prev_events is being used for two distinct purposes:

  1. Homeservers trying to help each other fix missing pieces of their DAG by informing about all events they have no childs for. That is very sound approach
  2. Allow homeservers to define a parent/child relationship which will affect state.

Use 1) is informative, and does not force anything on the other homeservers, and does not rely on a homeserver having the full DAG subset to make an informed decision about state. So you can put anything you want in there. The more the better at this point.

On the other hand, 2) is authoritative and since events are immutable once stored (putting redact aside for this), you need to get the state bit right.
In the offtopic example, The kamp.site has an "isolated" piece of DAG potentially, meaning it doesn't know yet how that part fits with the rest and shouldn't make an authoritative claim about an event until that missing gap is filled up.

At best, the event $150462325215JivgW:kamp.site [402716] should have referenced only $1480026000136LiHJq:krtdex.com [210694] and the isolated DAG should have been in a locked state, being in progress of backfilling.

Therefore, I think the whole aglo is fundamentally flawed with:

  • Having one piece of information for both purposes. There should be two keys in a PDU: one for filling up the DAG, one for authoritative relationships.
    And unless I am mistaken, that was already split in the past as there is a legacy prev_state key.
  • Homeservers allowed to reference events they haven't managed to connect to the "main" DAG, starting either from room creation, or first user join. They should only reference their latest event connected to the main DAG, not the ones which are "pending" for backfill.

So, assuming the kamp.site HS had indeed a hole in their DAG because they just came back online after a long time, and putting the above recommendations in practice, you would have had the following information in the event:

{
   ...
   "prev_events": [
      "$1480026000136LiHJq:krtdex.com",
      "$1484704352501ntrBL:half-shot.uk"
   ],
   "prev_state": [
     "$1480026000136LiHJq:krtdex.com"
   ]
   ...
}

Hopefully this makes sense.

@richvdh
Copy link
Member Author

richvdh commented Feb 26, 2018

For 1), I don't see an issue at first glance with the current algorithm, since you effectively need to merge together two views of the rooms that diverged and have not interacted with each other.

Not necessarily. Consider if the graph was like this:

$1480026000136LiHJq:krtdex.com [210694] (A)
 ^                    ^
 |                    +--------------------------------+
 |                                                     |
$148044164010nTaIb:m.skaverat.net [214271] (B)         |
 ^                                                 $message:event (A)
 |                                                     |
$1484704212476QrWha:half-shot.uk [246382] (C)          |
 ^                                                     |
 |                                                     |
$1484704352501ntrBL:half-shot.uk [246388] (C)          |
                  ^                                    |
                  |        +---------------------------+
                  |        |
  $150462325215JivgW:kamp.site [402716] (A)

kamp.site has two events in its prev_events, neither of which is an ancestor of another, yet we still have a problem when resolving the state between the two.


At this point, prev_events is being used for two distinct purposes:

  1. Homeservers trying to help each other fix missing pieces of their DAG by informing about all events they have no childs for. That is very sound approach
  2. Allow homeservers to define a parent/child relationship which will affect state.

It's almost all about case 2. 1 is just a side-effect.

At best, the event $150462325215JivgW:kamp.site [402716] should have referenced only $1480026000136LiHJq:krtdex.com [210694] and the isolated DAG should have been in a locked state, being in progress of backfilling.

well, that would be one approach to solving this problem; however Matrix has to date operated under the premise that you can continue to see events and be affected by state changes even if there is a hole in the DAG, and the effects of changing that premise aren't entirely clear to me right now.

Furthermore, it's not clear to me that it's the only approach to solving this problem: for instance, if we could ensure that the state res algorithm was associative, then it would also solve the problem. Conflict-free Replicated Data Types might offer one way of ensuring that - but it might also be hard to map the power of our power_levels onto a CRDT. Anyway the point is that there may be more than one way to do this.

@maxidorius
Copy link
Contributor

kamp.site has two events in its prev_events, neither of which is an ancestor of another, yet we still have a problem when resolving the state between the two.

I don't see an issue. you do have potential conflicts, of course, but the algorithm would propose one way to solve it without going back to a previous state. Instead you would ignore an invalid state.

My understanding is:
reset state = consequence of an invalid algorithm that doesn't consider all available information.
ignore state = arbitrary choice between two states that cannot co-exist together even with all available information.

Am I missing something?

@richvdh
Copy link
Member Author

richvdh commented Feb 26, 2018

you said:

For 1), I don't see an issue at first glance with the current algorithm.

My diagram above is (I think) an example of 1). If we follow the current algorithm, then the power_levels are still unexpectedly reset to A on state resolution. So I am saying that the current algorithm is not satisfactory even in this case.

but the algorithm would propose one way to solve it without going back to a previous state.

Well, that sounds like a different algorithm.

Instead you would ignore an invalid state.

We do ignore an invalid state. The question is over how you define "invalid".

reset state = consequence of an invalid algorithm that doesn't consider all available information.

It's an algorithm which produces surprising, and often unsatisfactory, results. There may exist additional information which could be made available to the state resolution algorithm, and a different algorithm might produce better results. Calling the current algorithm "invalid" is maybe a bit strong, but it's certainly not optimal.

ignore state = arbitrary choice between two states that cannot co-exist together even with all available information.

I don't really understand what you mean here.

@maxidorius
Copy link
Contributor

maxidorius commented Feb 26, 2018

My diagram above is (I think) an example of 1). If we follow the current algorithm, then the power_levels are still unexpectedly reset to A on state resolution. So I am saying that the current algorithm is not satisfactory even in this case.

Indeed :( so the current algo is just doomed

I don't really understand what you mean here.

If you have a netsplit, and that occuring at the same time (same depth, same prev_event):

  • User A on Server Y ban user B from server Z
  • User B on Server Z invites user C from server X

Then one state is no better than the other, in terms of intention from the users, but they can't both be true. One event needs to be ignored.

But in the offtopic, it's not that any event is invalid - they are all valid and possible - and the state shouldn't change, it's that state A is evaluated compared to C, when they shouldn't be in the first place.

This is what I mean: you have state change, which is due to a bug in an algorithm but should not have happened, and you have state choice, where no algorithm can give you an outcome closer to a regular person expectation.

I'm not trying to tell how to solve it at this point, I am only trying to understand what is really going on and why so I can actually implement it and then maybe help fix it (even tho algorithms are not my forte)

@maxidorius
Copy link
Contributor

maxidorius commented Apr 8, 2018

Meanwhile I've put together a test case for mxhsd after I've implemented the current state resolution algo which should include state resets. Hopefully it will provide more clues for people to grasp the issue at hand.

@Half-Shot
Copy link
Collaborator

Got this on MatrixHQ recently. Yay :|

@aaronraimist
Copy link
Contributor

Is this considered fixed in v3 (and v2)?

@non-Jedi
Copy link

non-Jedi commented Apr 1, 2019

I'm in at least one v3 room that has inconsistent state between homeservers if not true “state resets".

@ara4n
Copy link
Member

ara4n commented Apr 1, 2019

@non-Jedi this is news to us; are these rooms something we can debug?

@richvdh
Copy link
Member Author

richvdh commented Apr 1, 2019

Note that "state resets" per se do not cause inconsistent state between homeservers. All servers agree on what the state is; it just doesn't match what a human expects it to be. So if you're seeing inconsistency between servers please consider it a separate issue.

@aaronraimist it's believed to be fixed, or at least significantly improved, but since the problem is due to emergent behaviour when the basic algorithms are applied to complex systems, it's hard to be certain at this stage.

@non-Jedi
Copy link

non-Jedi commented Apr 1, 2019

@ara4n see #4980

@richvdh
Copy link
Member Author

richvdh commented Aug 15, 2019

We believe this to be fixed (or at least massively improved) as of more recent room versions.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants