Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC3902: Faster remote room joins over federation (overview) #3902

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
171 changes: 171 additions & 0 deletions proposals/3902-faster-remote-joins.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,171 @@
# MSC3902: Faster remote room joins over federation (overview)

## Background

It is well known that joining large rooms over federation can be very slow (see,
for example, [synapse#1211](https://github.com/matrix-org/synapse/issues/1211)).

Much of the reason for this is the large number of events which are returned by
the [`/send_join`](https://spec.matrix.org/v1.4/server-server-api/#put_matrixfederationv2send_joinroomideventid)
API. (As of August 2022, a `/send_join` request for Matrix HQ returns 206479
events.) These events are necessary to correctly validate the state of the room
at the point of the join, but the list is expensive for the "resident" server to
generate, and even more so for the joining server to validate and store.

This proposal therefore sets out the changes needed so that most of the room
state can be popuated lazily, in the background, *after* the user has joined
the room.

This proposal supersedes [MSC2775](https://github.com/matrix-org/matrix-spec-proposals/pull/2775).

## Proposal

Firstly, we change `/send_join` to return, on request, a much reduced list of
room state. The details of the changes to the API are set out in
[MSC3706](https://github.com/matrix-org/matrix-spec-proposals/pull/3706), but
in summary: `m.room.member` events are omitted from the response.

This gives the joining server enough information to start handling some
interactions with the room. Conceptually, processing then splits into two
threads: one, a modified mechanism for handling incoming events and requests in
the "partial-state" room; and second, a background process which concurrently
"resynchronises" the room state.

### Handling requests and events in the partial-state room

A number of changes must be made to handle the "partial-state" scenario. (As of
this writing, these changes are limited to homeserver implementations, but the
list may be extended to include changes to client implementations before this
MSC is concluded.)

* Processing incoming events received over federation:

* Currently, the
[spec](https://spec.matrix.org/v1.4/server-server-api/#checks-performed-on-receipt-of-a-pdu)
requires that an incoming event "Passes authorization rules based on the
state before the event, otherwise it is rejected". Since we do not know
the (full) state before the event, we can no longer apply this
check. Instead, we perform a state-resolution between the limited state
that we do have, and the event's auth events; we then check that the

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This state resolution will be extremely expensive because nearly all events are conflicting

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not very familiar with state resolution, but if we take the expected case where the auth events match those in the partial state, we'd be calculating state_res( partial_state={ create, power levels, name, topic, ... }, auth_events={ create, power levels, sender membership }).

You're right that the conflicted state set would include most state events in the room: {name, topic, ..., sender membership}
but wouldn't this set be small, since most memberships aren't in the room's partial state? I'd also expect the auth difference to be small because both sets of state contain the same create and power levels events in the common case.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember the rationale for doing state res here. I think a way to combine the partial state with the auth events was needed and state resolution was the usual way we combine state. Something simpler, like filling in the gaps in partial state with the auth events, might be appropriate too, but I may be overlooking some considerations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember the rationale for doing state res here. I think a way to combine the partial state with the auth events was needed and state resolution was the usual way we combine state.

Yes, basically this. It was an attempt to enforce bans, as far as possible.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since most memberships aren't in the room's partial state

Wouldn't this fill up with time and when the room is almost fully synced include a lot of events?

incoming event passes the authorization rules based on that resolved
state.

This process means that we are largely trusting remote servers not to send
invalid events (hence the need for a revalidation during the
resynchronisation process); however it does mean that if we have a ban for
a particular user, then their events will be rejected.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we have a ban for
a particular user, then their events will be rejected.

This seems to be the goal of soft failing and not the "state at event" check. Before fast remote joins, these banned user's events would still be accepted as outliers. To get the same behavior, we could remove the "state at event" auth check (unless there's a good reason to keep it?).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The case we had in mind was where the "true" state of the room after event P has a ban for user @bad:some-homeserver.com and @bad:some-homeserver.com sends an event E, with {P} as its prev_events and citing its membership prior to the ban as an auth event. The faster joining homeserver may or may not have the ban for that user in its idea of the state after P. If it does, we want event E to be correctly rejected rather than trusting its auth events. To reject E, we have to run the "Passes authorization rules based on the (partial) state before the event" check. But the check would also reject most normal messages because their sender's memberships won't be in the partial state, so we combine the partial state with the auth events using state resolution.

In the soft fail case, where @bad:some-homeserver.com is banned in the current state of the room, but picks very old prev_events to send events with, the accept/reject behavior is unchanged.


* Additionally, no attempt is made to perform a "soft fail" check on incoming events.

* Handling other federation requests: most federation requests require
knowledge of the room state for authorisation (we should reject requests
from servers which do not have users in the room). However, we can no longer
correctly determine that
state. [MSC3895](https://github.com/matrix-org/matrix-spec-proposals/pull/3895)
specifies a new error code to indicate that we were unable to authorise a
request.

* Handling client-server requests: depending on the request in question, the
server may or may not be able to accurately answer it. For example, a
request for the topic of the room via
[`/rooms/{roomId}/state/m.room.topic`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixclientv3roomsroomidstateeventtypestatekey)
can reliably be answered (since we assume we have all non-membership state
in the room), whereas a request for the [list of joined
members](https://spec.matrix.org/v1.4/client-server-api/#get_matrixclientv3roomsroomidjoined_members) cannot be answered.

In the current implementation, requests that require knowledge of
`m.room.member` events for remote users will *block* until the
resynchronisation completes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative would be to send the currently known members and send updates to this list over /sync's state. Then it also makes sense to include partial rooms in non-lazyloaded syncs. What do you think about that? I think that's better than blocking, especially when we don't know when or if the room ever fully loads.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current implementation in Synapse, non-lazy_load_members syncs don't block but instead pretend the room has not been joined until the homeserver has the full state of the room available. Other endpoints, like /members, still block.

I'm not sure whether it's spec-compliant to initially send the known members over /sync (we can propose a change to the spec if it isn't) or how clients would interpret it. It might be that all clients are okay with it. We just haven't tested it.


(Note that we can reliably answer requests that require knowledge only of
the membership state for local users.)
Comment on lines +80 to +81
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? Suppose @alice:alice.com partially joins a room via bob.com. Before the resync completes, she is banned from that room by @chris:chris.com. Call the ban event B.

Is there some way that B can be an invalid event whilst appearing valid enough to alice.com? If so, that means we can't reliably answer requests about local users' membership(?): during resync we'd think Alice was banned, but post-resync we'd realise she wasn't.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that can certainly happen. "Reliably" here is doing a bit of heavy lifting: we actually don't know any of the state for certain during the resync (events may be accepted that should not have been, and they can also be rejected when they should not have been. I seem to remember writing complement tests for both cases). We may therefore tell clients about state that turns out to be wrong. In some ways this is an extension of matrix-org/matrix-spec#1209.

But still, membership events for local users are in the same bracket as, for example, m.room.topic events: we assume that our view of the state is good enough, and return the events to clients without waiting for the resync to complete.


* [`/sync`](https://spec.matrix.org/v1.4/client-server-api/#get_matrixclientv3sync)
requires specific changes:

* If [lazy-loading](https://spec.matrix.org/v1.4/client-server-api/#lazy-loading-room-members)
of memberships is enabled, then any "partial state" room is included in
the response. Even when lazy-loading is enabled, the server is expected to
"include membership events for the `sender` of events being returned in
the response". Since we do not have the full state of the room, we may be
missing membership events for some senders. We resolve this by checking
the `auth_events` for affected events, which must include a reference to a
membership event.

* If lazy-loading is *not* enabled, partial-state rooms are omitted from the
response (until the state synchronisation completes).

(This is [pending implementation](https://github.com/matrix-org/synapse/issues/12989) in Synapse.)
Comment on lines +97 to +98
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this line is redundant, now that the Synapse issue is closed (by this PR).

Suggested change
(This is [pending implementation](https://github.com/matrix-org/synapse/issues/12989) in Synapse.)


* Outgoing events: This is [pending
implementation](https://github.com/matrix-org/synapse/issues/12997), but is
likely to require some changes to ensure we do not get into a situation of
being unable to safely answer a
[`/get_missing_events`](https://spec.matrix.org/v1.4/server-server-api/#post_matrixfederationv1get_missing_eventsroomid)
or
[`/state_ids`](https://spec.matrix.org/v1.4/server-server-api/#get_matrixfederationv1state_idsroomid)
request for an event we have generated.

* [Device management](https://spec.matrix.org/v1.4/server-server-api/#device-management):
homeserver implementations are expected to maintain a cache of the device
list for all remote users that share a room with a local user, via
`m.device_list_update` EDUs. To handle incomplete membership lists, we need to make the following changes:

* Fixes to outgoing device list updates: we keep a record of any *local*
device list changes that take place during the resynchronisation, and,
once resync completes, we send them out to any homeservers that were in
the room at any point since we started joining. ([Synapse
implementation](https://github.com/matrix-org/synapse/pull/13934))

* Fixes to incoming device list updates: normally we ignore device-list
updates from users who we don't think we share a room with. To ensure we
do not discard incoming device list updates, we keep a record of any
*remote* device list updates we receive, and replay them once resync
completes. ([Synapse
implementation](https://github.com/matrix-org/synapse/pull/13913))

### Resynchronisation

Once a server receives a "partial state" response to `/send_join`, it must then
call [`/state/{room_id}`](https://spec.matrix.org/v1.4/server-server-api/#get_matrixfederationv1stateroomid),
setting `event_id` to the ID of the join event returned by `/send_join`, to
obtain a full snapshot of the state at that event. It can then update its database
accordingly.

However, this process may take some time, and it is likely that other events
have arrived in the meantime. These new events will also have been stored with
"partial state", and will not have been subject to the full event authorisation
process. The server must therefore work forward through the event DAG,
richvdh marked this conversation as resolved.
Show resolved Hide resolved
recalculating the state at each event, and rechecking event authorisation,
until it has caught up with "real time" and new events are being created with
"full state".

## Potential issues

TBD

## Alternatives

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatives: #3883, or "interactive peeking"

Less complicated state res => easier to verify correctness of spec and implementation


TBD

## Security considerations

It's important to note that, during the resynchronisation process, events are
accepted without running the full checks process; this is an inevitable
consequence of having partial state, but does mean that we might accept abusive
events that would otherwise be rejected.

This is mitigated by (a) the process of re-running the event authorisation
process once we have full state, and (b) the fact that "partial state" is a
transient state: in other words, the window for sending abusive content is
limited, and only users who happen to be in the room during the
"resynchronisation" process will observe the abusive content.

## Unstable prefix

n/a

## Dependencies

This MSC builds on [MSC3706](https://github.com/matrix-org/matrix-spec-proposals/pull/3706) and [MSC3895](https://github.com/matrix-org/matrix-spec-proposals/pull/3895)
(which at the time of writing have not yet been accepted into the spec).