From 8c8d5e3c21aec65d146d6d90d018e36da240d493 Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Tue, 4 Aug 2020 02:52:00 +0100 Subject: [PATCH 01/68] MSC2716: Incrementally importing history into existing rooms A proposal for letting ASes specify event parents and timestamps when submitting events, letting them much more effectively insert past conversation history. cc @tulir for feedback, as the main consumer of the ?ts= API today... --- ...6-importing-history-into-existing-rooms.md | 115 ++++++++++++++++++ 1 file changed, 115 insertions(+) create mode 100644 proposals/2716-importing-history-into-existing-rooms.md diff --git a/proposals/2716-importing-history-into-existing-rooms.md b/proposals/2716-importing-history-into-existing-rooms.md new file mode 100644 index 0000000000..60c0e6bc46 --- /dev/null +++ b/proposals/2716-importing-history-into-existing-rooms.md @@ -0,0 +1,115 @@ +# MSC2716: Incrementally importing history into existing rooms + +## Problem + +Matrix has historically been unable to easily import existing history into a +room that already exists. This is a major problem when bridging existing +conversations into Matrix, particularly if the scrollback is being +incrementally or lazily imported. + +For instance, an NNTP bridge might work by letting a user join a room that +maps to a given newsgroup, first showing an empty room, and then importing the +most recent 1000 newsgroup posts for that room to flesh out some history. The +bridge might then choose to slowly import additional posts for that newsgroup +in the background, until however many decades of backfill were complete. +Finally, as more archives surface, they might also need to be manually +gradually added into the history of the room - slowly building up the complete +history of the conversations over time. + +This is currently not supported because: + * There is no way to set historical room state in a room via the CS or AS API - + you can only edit current room state. 
+ * There is no way to create messages in the context of historical room state in + a room via CS or AS API - you can only create events relative to current room + state. + * There is currently no way to override the timestamp on an event via the AS API. + (We used to have the concept of [timestamp + massaging](https://matrix.org/docs/spec/application_service/r0.1.2#timestamp-massaging), + but it never got properly specified) + +## Proposal + + 1. We let the AS API override the parent(s) of an event when injecting it into + the room, thus letting bridges consciously specify the topological ordering of + the room DAG. We do this by adding a `parent` querystring parameter on the + `PUT /_matrix/client/r0/rooms/{roomId}/send/{eventType}/{txnId}` and + `PUT /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}` endpoints. + The `parent` parameter can be repeated multiple times to specify multiple parent + event IDs of the event being submitted. An event must not have more than 20 parents. + If a `parent` parameter is not presented, the server assumes the event is being + appended to the current timeline and calculates the parents as normal. If an + unrecognised event ID is specified as a `parent`, the request fails with a 404. + + 2. We also let the AS API override ('massage') the `origin_server_ts` timestamp applied + to sent events. We do this by adding a `ts` querystring parameter on the + `PUT /_matrix/client/r0/rooms/{roomId}/send/{eventType}/{txnId}` and + `PUT /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}` endpoints, specifying + the value to apply to `origin_server_ts` on the event (UNIX epoch milliseconds). + + 3. Finally, we can add an optional `"m.historical": true` field to events to + indicate that they are historical at the point of being added to a room, and + as such servers should not serve them to clients via the CS `/sync` API - + instead preferring clients to discover them by paginating scrollback via + `/messages`. 
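To make points 1 and 2 concrete, here is a minimal sketch of how a bridge might build the request URL. Only the endpoint path, the repeatable `parent` parameter, the `ts` parameter and the 20-parent limit come from the proposal text; `build_send_url`, its arguments and the example homeserver URL are hypothetical names used for illustration. Note that a later commit in this series renames `parent` to `prev_event`, so the parameter name is kept in one constant.

```python
from urllib.parse import quote, urlencode

PARENT_PARAM = "parent"  # renamed to "prev_event" by a later commit in this series
MAX_PARENTS = 20         # "An event must not have more than 20 parents."


def build_send_url(base_url, room_id, event_type, txn_id, parents=(), ts_ms=None):
    """Build the PUT .../send/{eventType}/{txnId} URL (hypothetical helper).

    `parents` is a sequence of event IDs; `ts_ms` is UNIX epoch milliseconds.
    """
    if len(parents) > MAX_PARENTS:
        raise ValueError("an event must not have more than 20 parents")
    path = "/_matrix/client/r0/rooms/{}/send/{}/{}".format(
        quote(room_id, safe=""),
        quote(event_type, safe=""),
        quote(txn_id, safe=""),
    )
    # The parameter is repeated once per parent event ID.
    params = [(PARENT_PARAM, event_id) for event_id in parents]
    if ts_ms is not None:
        params.append(("ts", str(ts_ms)))
    query = urlencode(params)
    return base_url + path + ("?" + query if query else "")


url = build_send_url(
    "https://hs.example.org",  # hypothetical homeserver
    "!room:example.org",
    "m.room.message",
    "txn1",
    parents=["$a", "$b"],
    ts_ms=1596505920000,
)
```

Omitting both `parents` and `ts_ms` yields the plain endpoint URL, which matches the fallback where the server calculates the parents itself.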
+ +This lets history be injected at the right place topologically in the room. For instance, different eras of the room could +end up as branches off the original `m.room.create` event, each first setting up the contextual room state for that era before +the block of imported history. So, you could end up with something like this: + +``` +m.room.create + |\ + | \___________________________________ + | \ \ + | \ \ +live timeline previous 1000 messages another block of ancient history +``` + +We consciously don't support the new `parent` and `ts` parameters on the +various helper syntactic-sugar APIs like `/kick` and `/ban`. If a bridge/bot is +smart enough to be faking history, it is already in the business of dealing +with raw events, and should not be using the syntactic sugar APIs. + +## Potential issues + +There are a bunch of security considerations here - see below. + +## Alternatives + +We could insist that we use the SS API to import history in this manner rather than +extending the AS API. However, it seems unnecessarily burdensome to make bridge authors +understand the SS API, especially when we already have so many AS API bridges. Hence these +minor extensions to the existing AS API. + +Another way of doing this might be to store the different eras of the room as +different versions of the room, using `m.room.tombstone` events to form a +linked list of the eras. This has the advantage of isolating room state +between different eras of the room, simplifying state resolution calculations +and avoiding risk of any cross-talk. It's also easier to reason about, and +avoids exposing the DAG to bridge developers. However, it would require +better presentation of room versions in clients, and it would require support +for retrospectively specifying the `predecessor` of the current room when you +retrospectively import history. 
Currently `predecessor` is in the immutable +`m.room.create` event of a room, so cannot be changed retrospectively - and +doing so in a safe and race-free manner sounds Hard. + +## Security considerations + +This allows an AS to tie the room DAG in knots by specifying inappropriate +event IDs as parents, potentially DoSing the state resolution algorithm, or +triggering undesired state resolution results. This is already possible by the +SS API today however, and given AS API requires the homeserver admin to +explicitly authorise the AS in question, this doesn't feel too bad. + +This also makes it much easier for an AS to maliciously spoof history. This +is a bit unavoidable given the nature of the feature, and is also possible +today via SS API. + +If the state changes from under us due to importing history, we have no way to +tell the client about it. This is an [existing +bug](https://github.com/matrix-org/synapse/issues/4508) that can be triggered +today by SS API traffic, so is orthogonal to this proposal. + +## Unstable prefix + +Feels unnecessary. \ No newline at end of file From 3a03172476b583ffb6e968886e95ce940bfb56e8 Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Tue, 4 Aug 2020 02:59:06 +0100 Subject: [PATCH 02/68] note that we don't solve lazyloading history from ASes --- proposals/2716-importing-history-into-existing-rooms.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/proposals/2716-importing-history-into-existing-rooms.md b/proposals/2716-importing-history-into-existing-rooms.md index 60c0e6bc46..cfc76fd615 100644 --- a/proposals/2716-importing-history-into-existing-rooms.md +++ b/proposals/2716-importing-history-into-existing-rooms.md @@ -74,6 +74,11 @@ with raw events, and should not be using the syntactic sugar APIs. There are a bunch of security considerations here - see below. 
+This doesn't provide a way for an HS to tell an AS that a client has tried to call +`/messages` beyond the beginning of a room, and that the AS should try to +lazy-insert some more messages (as per https://github.com/matrix-org/matrix-doc/issues/698). +For this MSC to be properly useful, we might want to flesh that out. + ## Alternatives We could insist that we use the SS API to import history in this manner rather than From 5e6b7b9a93bbc670794037269687dbc7e91d9731 Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Tue, 4 Aug 2020 23:16:50 +0100 Subject: [PATCH 03/68] add another alternative --- .../2716-importing-history-into-existing-rooms.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/proposals/2716-importing-history-into-existing-rooms.md b/proposals/2716-importing-history-into-existing-rooms.md index cfc76fd615..959614e01f 100644 --- a/proposals/2716-importing-history-into-existing-rooms.md +++ b/proposals/2716-importing-history-into-existing-rooms.md @@ -98,6 +98,21 @@ retrospectively import history. Currently `predecessor` is in the immutable `m.room.create` event of a room, so cannot be changed retrospectively - and doing so in a safe and race-free manner sounds Hard. +Another way could be to let the server that issued the `m.room.create` also go +and retrospectively insert events into the room outside the context of the DAG +(i.e. without parent prev_events or signatures). To quote the original +[bug](https://github.com/matrix-org/matrix-doc/issues/698#issuecomment-259478116): + +> You could just create synthetic events which look like normal DAG events but + exist before the m.room.create event. Their signatures and prev-events would + all be missing, but they would be blindly trusted based on the HS who is + allowed to serve them (based on metadata in the m.room.create event). 
Thus + you'd have a perimeter in the DAG beyond which events are no longer + decentralised or signed, but are blindly trusted to let HSes insert ancient + history provided by ASes. + +However, this feels needlessly complicated if the DAG approach is sufficient. + ## Security considerations This allows an AS to tie the room DAG in knots by specifying inappropriate From 94514392b118dfae8ee6840b13b83d2f8ce8fcfc Mon Sep 17 00:00:00 2001 From: Matthew Hodgson Date: Tue, 11 Aug 2020 01:23:12 +0100 Subject: [PATCH 04/68] s/parent/prev_event/ for consistency with SS API --- ...2716-importing-history-into-existing-rooms.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/proposals/2716-importing-history-into-existing-rooms.md b/proposals/2716-importing-history-into-existing-rooms.md index 959614e01f..fdd3b85bdf 100644 --- a/proposals/2716-importing-history-into-existing-rooms.md +++ b/proposals/2716-importing-history-into-existing-rooms.md @@ -29,16 +29,16 @@ This is currently not supported because: ## Proposal - 1. We let the AS API override the parent(s) of an event when injecting it into + 1. We let the AS API override the prev_event(s) of an event when injecting it into the room, thus letting bridges consciously specify the topological ordering of - the room DAG. We do this by adding a `parent` querystring parameter on the + the room DAG. We do this by adding a `prev_event` querystring parameter on the `PUT /_matrix/client/r0/rooms/{roomId}/send/{eventType}/{txnId}` and `PUT /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}` endpoints. - The `parent` parameter can be repeated multiple times to specify multiple parent - event IDs of the event being submitted. An event must not have more than 20 parents. - If a `parent` parameter is not presented, the server assumes the event is being - appended to the current timeline and calculates the parents as normal. 
If an - unrecognised event ID is specified as a `parent`, the request fails with a 404. + The `prev_event` parameter can be repeated multiple times to specify multiple parent + event IDs of the event being submitted. An event must not have more than 20 prev_events. + If a `prev_event` parameter is not presented, the server assumes the event is being + appended to the current timeline and calculates the prev_events as normal. If an + unrecognised event ID is specified as a `prev_event`, the request fails with a 404. 2. We also let the AS API override ('massage') the `origin_server_ts` timestamp applied to sent events. We do this by adding a `ts` querystring parameter on the @@ -132,4 +132,4 @@ today by SS API traffic, so is orthogonal to this proposal. ## Unstable prefix -Feels unnecessary. \ No newline at end of file +Feels unnecessary. From 8668e3a438efacc46ea6f16fa912f58d6c8d0987 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 21 Jul 2021 19:12:18 -0500 Subject: [PATCH 05/68] Add initial draft of alternative batch sending historical messages Copied from https://github.com/matrix-org/matrix-doc/pull/2716#discussion_r655859091 --- .../2716-batch-send-historical-messages.md | 154 ++++++++++++++++++ 1 file changed, 154 insertions(+) create mode 100644 proposals/2716-batch-send-historical-messages.md diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md new file mode 100644 index 0000000000..14ba8c4125 --- /dev/null +++ b/proposals/2716-batch-send-historical-messages.md @@ -0,0 +1,154 @@ +# MSC2716: Batch send historical messages + +For the full problem statement, considerations, see the other `proposals/2716-importing-history-into-existing-rooms.md` document. Happy to merge the two, once we get more feedback on it. + +## Alternative batch send proposal + + +### Expectation + +Historical messages that we insert should appear in the timeline just like they would if they were sent back at that time. 
+ +Here is what scrollback is expected to look like in Element: + +![](https://user-images.githubusercontent.com/558581/119064795-cae7e380-b9a1-11eb-9366-5e1f5e6370a8.png) + + + +### New batch send approach + +Add a new endpoint, `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event=&chunk_id=`, which can insert a chunk of events historically back in time next to the given `prev_event`. `chunk_id` comes from `next_chunk_id` in the response of the batch send endpoint and is derived from the "insertion" events added to each chunk. It's not required for the first batch send. +``` +# Body +{ + "events": [ ... ], + "state_events_at_start": [ ... ] +} + +# Response +{ + "state_events": [...list of state event IDs we inserted], + "events": [...list of historical event IDs we inserted], + "next_chunk_id": "random-unique-string", +} +``` + +`state_events_at_start` is used to define the historical state events needed to auth the `events`, such as join events. These events can float outside of the normal DAG. In Synapse, these are called `outlier`s and won't be visible in the chat history, which also allows us to insert multiple chunks without having a bunch of `@mxid joined the room` noise between each chunk. + +`events` is a chronological chunk/list of events you want to insert. For Synapse, there is a reverse-chronological constraint on chunks, so once you insert one chunk of messages, you can only insert an older chunk after that. tl;dr: insert from your most recent chunk of history -> oldest history. + +The `state_events`/`events` payload is in **chronological** order (`[0, 1, 2]`) and is processed in that order, so each event's `prev_events` point to its older-in-time previous message, which is more sane in the DAG. **Depth discussion:** For Synapse, when persisting, we **reverse the list (to make it reverse-chronological)** so we can still get the correct `(topological_ordering, stream_ordering)` and the chunk sorts between A and B as we expect. Why? 
`depth` is not re-calculated when historical messages are inserted into the DAG. This means we have to take care to insert in the right order. Events are sorted by `(topological_ordering, stream_ordering)` where `topological_ordering` is just `depth`. Normally, `stream_ordering` is an auto incrementing integer but for `backfilled=true` events, it decrements. Historical messages are inserted all at the same `depth`, and marked as backfilled so the `stream_ordering` decrements and each event is sorted behind the next. (from https://github.com/matrix-org/synapse/pull/9247#discussion_r588479201) + +With the new process, the DAG will look like: + +![](https://user-images.githubusercontent.com/558581/119065959-4f3b6600-b9a4-11eb-9e23-2e3769e74679.png) + + +Next we add "insertion" and "marker" events into the mix so that federated remote servers can also navigate and to know where/how to fetch historical messages correctly. + +To lay out the different types of servers consuming these historical messages (more context on why we need "marker"/"insertion" events): + + 1. Local server + - This can pretty much work out of the box. Just add the events to the database and they're available. The new endpoint is just a mechanism to insert the events. + 1. Federated remote server that already has all scrollback history and then new history is inserted + - The big problem is how does a HS know it needs to go fetch more history if they already fetched all of the history in the room? We're solving this with "marker" events which are sent on the "live" timeline and point back to the event where we inserted history next to. The HS can then go and backfill the "insertion" event and continue navigating the chunks from there. + 1. Federated remote server that joins a new room with historical messages + - We need to update the `/backfill` response to include historical messages from the chunks + 1. 
Federated remote server already in the room when history is inserted + - Depends on whether the HS has the scrollback history for where the history was inserted at. If already has all history, see scenario 2, if doesn't, see scenario 3. + 1. For federated servers already in the room that haven't implemented MSC2716 + - Those homeservers won't have historical messages available because they're unable to navigate the marker/insertion events. But the historical messages would be available once the HS implements MSC2716 and processes the "marker" events that point to the history. + + +### New approach with "insertion" events + + - With "insertion" events, we just add them to the start of each chronological chunk (where the oldest message in the chunk is). The next older in time chunk can connect to that "insertion" point from the previous chunk. + - The initial "insertion" event could be from the main DAG or we can create it ad-hoc in the first chunk so the homeserver can start traversing up the chunk from there after a "marker" event points to it. In subsequent chunks, we can already traverse from the insertion event it points to. + - Consideration: the "insertion" events add a new way for an application service to tie the chunk reconciliation in knots(similar to the DAG knots that can happen). 
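The depth discussion earlier in this document (same `depth` for every historical event, decrementing `stream_ordering` for backfilled events, list reversed before persisting) can be illustrated with a toy model. This is only a sketch under those stated assumptions, not Synapse code: `persist_historical`, `timeline_order` and the starting value `-1` are hypothetical.

```python
def persist_historical(events_chronological, depth, next_stream_ordering=-1):
    """Assign ordering keys to a chunk the way the text describes.

    All events share the same depth. The list is walked newest-first and each
    event gets the next (more negative) backfill stream_ordering.
    """
    rows = []
    stream_ordering = next_stream_ordering
    for event_id in reversed(events_chronological):
        rows.append(
            {
                "event_id": event_id,
                "topological_ordering": depth,
                "stream_ordering": stream_ordering,
            }
        )
        stream_ordering -= 1  # backfilled events decrement
    return rows


def timeline_order(rows):
    """Sort by (topological_ordering, stream_ordering), oldest first."""
    ordered = sorted(
        rows, key=lambda r: (r["topological_ordering"], r["stream_ordering"])
    )
    return [r["event_id"] for r in ordered]


rows = persist_historical(["msg0", "msg1", "msg2"], depth=5)
recovered = timeline_order(rows)
```

In this model the oldest message receives the most negative `stream_ordering`, so sorting on the pair recovers the chronological order even though every event sits at the same depth.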
+ + +[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLT4gQVxuICAgIGVuZFxuICAgIFxuICAgIHN1YmdyYXBoIGNodW5rMFxuICAgICAgICBjaHVuazAtMigoXCIyXCIpKSAtLT4gY2h1bmswLTEoKDEpKSAtLT4gY2h1bmswLTAoKDApKSAtLT4gY2h1bmswLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG5cbiAgICBzdWJncmFwaCBjaHVuazFcbiAgICAgICAgY2h1bmsxLTIoKFwiMlwiKSkgLS0-IGNodW5rMS0xKCgxKSkgLS0-IGNodW5rMS0wKCgwKSkgLS0-IGNodW5rMS1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuICAgIFxuICAgIHN1YmdyYXBoIGNodW5rMlxuICAgICAgICBjaHVuazItMigoXCIyXCIpKSAtLT4gY2h1bmsyLTEoKDEpKSAtLT4gY2h1bmsyLTAoKDApKSAtLT4gY2h1bmsyLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG5cbiAgICBcbiAgICBjaHVuazAtaW5zZXJ0aW9uQmFzZVsvaW5zZXJ0aW9uXFxdIC0tLS0tLS0-IEFcbiAgICBjaHVuazAtMigoXCIyXCIpKSAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF1cbiAgICBjaHVuazAtaW5zZXJ0aW9uIC0tLS0tLS0-IEFcbiAgICBjaHVuazEtaW5zZXJ0aW9uIC0tPiBBXG4gICAgY2h1bmsxLTIgLS4tPiBjaHVuazAtaW5zZXJ0aW9uXG4gICAgY2h1bmsyLWluc2VydGlvbiAtLT4gQVxuICAgIGNodW5rMi0yIC0uLT4gY2h1bmsxLWluc2VydGlvbiIsIm1lcm1haWQiOiJ7fSIsInVwZGF0ZUVkaXRvciI6dHJ1ZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOnRydWV9) +
+mermaid graph syntax + +```mermaid +flowchart BT + subgraph live + B ------------> A + end + + subgraph chunk0 + chunk0-2(("2")) --> chunk0-1((1)) --> chunk0-0((0)) --> chunk0-insertion[/insertion\] + end + + subgraph chunk1 + chunk1-2(("2")) --> chunk1-1((1)) --> chunk1-0((0)) --> chunk1-insertion[/insertion\] + end + + subgraph chunk2 + chunk2-2(("2")) --> chunk2-1((1)) --> chunk2-0((0)) --> chunk2-insertion[/insertion\] + end + + + chunk0-insertionBase[/insertion\] -------> A + chunk0-2(("2")) -.-> chunk0-insertionBase[/insertion\] + chunk0-insertion -------> A + chunk1-insertion --> A + chunk1-2 -.-> chunk0-insertion + chunk2-insertion --> A + chunk2-2 -.-> chunk1-insertion +``` + +
+ +![](https://user-images.githubusercontent.com/558581/125011503-34917f00-e02e-11eb-9c9e-f4f2253e0c56.png) + + + + + +The structure of the insertion event would look like: +```js +{ + "type": "m.room.insertion", + "sender": "@example:example.org", + "content": { + "m.next_chunk_id": next_chunk_id, + "m.historical": true, + }, + // Since the insertion event is put at the start of the chunk, + // where the oldest event is, copy the origin_server_ts from + // the first event we're inserting + "origin_server_ts": events_to_create[0]["origin_server_ts"], +} +``` + + + +### "Marker" events + + - A "marker" event simply points back to an "insertion" event. + - The "marker" event solves the problem of how a federated homeserver learns about historical events that won't come down incremental sync. It also covers the scenario where the federated HS already has all the history in the room, so it won't do a full sync of the room again. + - Unlike the historical events, the "marker" event is sent as a normal event on the "live" timeline so that it comes down incremental sync and is available to all homeservers regardless of how much scrollback history they already have. + - A "marker" event is not needed for every chunk/batch of historical messages. Multiple chunks can be inserted and then, once we're done importing everything, we can add one "marker" event pointing at the root "insertion" point. + - If more history is added later, another "marker" can be sent to let the homeservers know again. + - When a remote federated homeserver receives a "marker" event, it can mark the "insertion" prev events as needing to backfill from that point again and can fetch the historical messages when the user scrolls back to that area in the future. For Synapse, we plan to add the details to the `insertion_event_lookups` table. 
+ - In Synapse, we discussed not wanting to fetch the "insertion" event when the "marker" comes down the pipe but I've realized that in order to store `insertion_prev_event_id` in the table, we either need to a) add it as part of the "marker" event which works to not fetch anything additional or b) backfill just the "insertion" event to get it. I think I am going to opt for option A though. The plan is to add a new `insertion_event_lookups` table to store which events are marked as insertion points. It stores, `insertion_event_id` and `insertion_prev_event_id` and when we scrollback over `insertion_prev_event_id` again, we trigger some backfill logic to go fetch it. Similar to the `event_backward_extremities` already implemented. + - We could remove the need for "marker" events if we decided to only allow sending "insertion" events on the "live" timeline at any point where you would later want to add history. But this isn't compatible with our dynamic insertion use cases like Gitter where the rooms are already created (no "insertion" event at the start of the room), and the examples from this MSC like NNTP (newsgroup) and email which can potentially want to branch off of everything. 
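The bookkeeping described in the last bullets could look roughly like this. It is a sketch only, assuming option A (the marker carries the insertion event's prev events) and using the `m.insertion_id` and `m.insertion_prev_events` content fields from the marker event structure given in this proposal. The table and column names come from the text; the SQLite schema, `handle_marker` and `insertions_to_backfill` are hypothetical and not Synapse's actual implementation.

```python
import sqlite3


def create_lookup_table(conn):
    # Table and column names are from the proposal; the schema is an assumption.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS insertion_event_lookups ("
        " insertion_event_id TEXT NOT NULL,"
        " insertion_prev_event_id TEXT NOT NULL,"
        " UNIQUE (insertion_event_id, insertion_prev_event_id))"
    )


def handle_marker(conn, marker_event):
    """Record where history was inserted, without fetching the insertion event."""
    content = marker_event["content"]
    insertion_id = content["m.insertion_id"]
    for prev_event_id in content["m.insertion_prev_events"]:
        conn.execute(
            "INSERT OR IGNORE INTO insertion_event_lookups VALUES (?, ?)",
            (insertion_id, prev_event_id),
        )


def insertions_to_backfill(conn, scrolled_over_event_id):
    """Insertion events to backfill when scrollback passes over this event."""
    cursor = conn.execute(
        "SELECT insertion_event_id FROM insertion_event_lookups"
        " WHERE insertion_prev_event_id = ?",
        (scrolled_over_event_id,),
    )
    return [row[0] for row in cursor]


conn = sqlite3.connect(":memory:")
create_lookup_table(conn)
marker = {
    "type": "m.room.marker",
    "content": {"m.insertion_id": "$insertion", "m.insertion_prev_events": ["$A"]},
}
handle_marker(conn, marker)
handle_marker(conn, marker)  # receiving the same marker twice is harmless
```

Because the marker already names the prev events, nothing extra has to be fetched when it arrives; the backfill is deferred until a user scrolls back over `$A`.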
+ +So the structure of the "marker" event would look like: +```js +{ + "type": "m.room.marker", + "sender": "@example:example.org", + "content": { + "m.insertion_id": insertion_event.event_id, + "m.insertion_prev_events": insertion_event.prev_events, + }, + "room_id": "!jEsUZKDJdhlrceRyVU:example.org" +} +``` + + + + From d40f3b9c7e9f2290b44e5996a1b9dbbd5cc08a70 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 21 Jul 2021 20:36:38 -0500 Subject: [PATCH 06/68] Update with chunk events --- .../2716-batch-send-historical-messages.md | 199 ++++++++++++++---- 1 file changed, 156 insertions(+), 43 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 14ba8c4125..4b14f507fd 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -17,7 +17,7 @@ Here is what scrollback is expected to look like in Element: ### New batch send approach -Add a new endpoint, `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event=&chunk_id=`, which can insert a chunk of events historically back in time next to the given `prev_event`. `chunk_id` comes from `next_chunk_id` in the response of the batch send endpoint and is derived from the "insertion" events added to each chunk. It's not required for the first batch send. +Add a new endpoint, `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event=&chunk_id=`, which can insert a chunk of events historically back in time next to the given `prev_event`. This endpoint can only be used by application services. `chunk_id` comes from `next_chunk_id` in the response of the batch send endpoint and is derived from the "insertion" events added to each chunk. It's not required for the first batch send. ``` # Body { @@ -33,7 +33,7 @@ Add a new endpoint, `POST /_matrix/client/unstable/org.matrix.msc2716/rooms/ oldest history. 
@@ -41,67 +41,98 @@ The `state_events`/`events` payload is in **chronological** order (`[0, 1, 2]`) With the new process, the DAG will look like: -![](https://user-images.githubusercontent.com/558581/119065959-4f3b6600-b9a4-11eb-9e23-2e3769e74679.png) +![](https://user-images.githubusercontent.com/558581/126577416-68f1a5b0-2818-48c1-b046-21e504a0fe83.png) -Next we add "insertion" and "marker" events into the mix so that federated remote servers can also navigate and to know where/how to fetch historical messages correctly. +[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazBcbiAgICAgICAgY2h1bmswLTIoKFwiMlwiKSkgLS0-IGNodW5rMC0xKCgxKSkgLS0-IGNodW5rMC0wKCgwKSlcbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGNodW5rMVxuICAgICAgICBjaHVuazEtMigoXCIyXCIpKSAtLT4gY2h1bmsxLTEoKDEpKSAtLT4gY2h1bmsxLTAoKDApKVxuICAgIGVuZFxuICAgIFxuICAgIHN1YmdyYXBoIGNodW5rMlxuICAgICAgICBjaHVuazItMigoXCIyXCIpKSAtLT4gY2h1bmsyLTEoKDEpKSAtLT4gY2h1bmsyLTAoKDApKVxuICAgIGVuZFxuXG4gICAgXG4gICAgY2h1bmswLTAgLS0tLS0tLT4gQVxuICAgIGNodW5rMS0wIC0tPiBBXG4gICAgY2h1bmsyLTAgLS0-IEFcbiAgICBcbiAgICAlJSBhbGlnbm1lbnQgbGlua3MgXG4gICAgY2h1bmswLTAgLS0tIGNodW5rMS0yXG4gICAgY2h1bmsxLTAgLS0tIGNodW5rMi0yXG4gICAgJSUgbWFrZSB0aGUgbGlua3MgaW52aXNpYmxlIFxuICAgIGxpbmtTdHlsZSAxMCBzdHJva2Utd2lkdGg6MnB4LGZpbGw6bm9uZSxzdHJva2U6bm9uZTtcbiAgICBsaW5rU3R5bGUgMTEgc3Ryb2tlLXdpZHRoOjJweCxmaWxsOm5vbmUsc3Ryb2tlOm5vbmU7IiwibWVybWFpZCI6Int9IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ) -To lay out the different types of servers consuming these historical messages (more context on why we need "marker"/"insertion" events): - 1. Local server - - This can pretty much work out of the box. Just add the events to the database and they're available. The new endpoint is just a mechanism to insert the events. - 1. 
Federated remote server that already has all scrollback history and then new history is inserted - - The big problem is how does a HS know it needs to go fetch more history if they already fetched all of the history in the room? We're solving this with "marker" events which are sent on the "live" timeline and point back to the event where we inserted history next to. The HS can then go and backfill the "insertion" event and continue navigating the chunks from there. - 1. Federated remote server that joins a new room with historical messages - - We need to update the `/backfill` response to include historical messages from the chunks - 1. Federated remote server already in the room when history is inserted - - Depends on whether the HS has the scrollback history for where the history was inserted at. If already has all history, see scenario 2, if doesn't, see scenario 3. - 1. For federated servers already in the room that haven't implemented MSC2716 - - Those homeservers won't have historical messages available because they're unable to navigate the marker/insertion events. But the historical messages would be available once the HS implements MSC2716 and processes the "marker" events that point to the history. +
+mermaid graph syntax + +```mermaid +flowchart BT + subgraph live + B -------------> A + end + + subgraph chunk0 + chunk0-2(("2")) --> chunk0-1((1)) --> chunk0-0((0)) + end + + subgraph chunk1 + chunk1-2(("2")) --> chunk1-1((1)) --> chunk1-0((0)) + end + + subgraph chunk2 + chunk2-2(("2")) --> chunk2-1((1)) --> chunk2-0((0)) + end + + + chunk0-0 -------> A + chunk1-0 --> A + chunk2-0 --> A + + %% alignment links + chunk0-0 --- chunk1-2 + chunk1-0 --- chunk2-2 + %% make the links invisible + linkStyle 10 stroke-width:2px,fill:none,stroke:none; + linkStyle 11 stroke-width:2px,fill:none,stroke:none; +``` + +
+ ### New approach with "insertion" events - - With "insertion" events, we just add them to the start of each chronological chunk (where the oldest message in the chunk is). The next older in time chunk can connect to that "insertion" point from the previous chunk. - - The initial "insertion" event could be from the main DAG or we can create it ad-hoc in the first chunk so the homeserver can start traversing up the chunk from there after a "marker" event points to it. In subsequent chunks, we can already traverse from the insertion event it points to. - - Consideration: the "insertion" events add a new way for an application service to tie the chunk reconciliation in knots(similar to the DAG knots that can happen). +Next we add "insertion" and "chunk" events to be more prescriptive about how historical chunks should connect to each other and how the homeserver can navigate the DAG. + + - With "insertion" events, we just add them to the start of each chronological chunk (where the oldest message in the chunk is). The next older-in-time chunk can connect to that "insertion" point from the previous chunk. + - The initial "insertion" event could be from the main DAG or we can create it ad-hoc in the first chunk so the homeserver can start traversing up the chunk from there after a "marker" event points to it. + - We use "chunk" events to point to the "insertion" event by referencing the "next_chunk_id" from the "insertion" event. + - Consideration: the "insertion"/"chunk" events add a new way for an application service to tie the chunk reconciliation in knots (similar to the DAG knots that can happen). 
+ + +![](https://user-images.githubusercontent.com/558581/126261578-a6f6c3ec-bf5c-4c82-a128-66ba4a9ac079.png) + +[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0tPiBBXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggY2h1bmswXG4gICAgICAgIGNodW5rMC1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMC0yKChcIjJcIikpIC0tPiBjaHVuazAtMSgoMSkpIC0tPiBjaHVuazAtMCgoMCkpIC0tPiBjaHVuazAtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGNodW5rMVxuICAgICAgICBjaHVuazEtY2h1bmtbW1wiY2h1bmtcIl1dIC0tPiBjaHVuazEtMigoXCIyXCIpKSAtLT4gY2h1bmsxLTEoKDEpKSAtLT4gY2h1bmsxLTAoKDApKSAtLT4gY2h1bmsxLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggY2h1bmsyXG4gICAgICAgIGNodW5rMi1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMi0yKChcIjJcIikpIC0tPiBjaHVuazItMSgoMSkpIC0tPiBjaHVuazItMCgoMCkpIC0tPiBjaHVuazItaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIFxuICAgIGNodW5rMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF0gLS0tLS0tLT4gQVxuICAgIGNodW5rMC1jaHVuayAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlXG4gICAgY2h1bmswLWluc2VydGlvbiAtLS0tLS0tPiBBXG4gICAgY2h1bmsxLWluc2VydGlvbiAtLT4gQVxuICAgIGNodW5rMS1jaHVuayAtLi0-IGNodW5rMC1pbnNlcnRpb25cbiAgICBjaHVuazItaW5zZXJ0aW9uIC0tPiBBXG4gICAgY2h1bmsyLWNodW5rIC0uLT4gY2h1bmsxLWluc2VydGlvbiIsIm1lcm1haWQiOiJ7fSIsInVwZGF0ZUVkaXRvciI6ZmFsc2UsImF1dG9TeW5jIjp0cnVlLCJ1cGRhdGVEaWFncmFtIjpmYWxzZX0) -[Mermaid live editor playground 
link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLT4gQVxuICAgIGVuZFxuICAgIFxuICAgIHN1YmdyYXBoIGNodW5rMFxuICAgICAgICBjaHVuazAtMigoXCIyXCIpKSAtLT4gY2h1bmswLTEoKDEpKSAtLT4gY2h1bmswLTAoKDApKSAtLT4gY2h1bmswLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG5cbiAgICBzdWJncmFwaCBjaHVuazFcbiAgICAgICAgY2h1bmsxLTIoKFwiMlwiKSkgLS0-IGNodW5rMS0xKCgxKSkgLS0-IGNodW5rMS0wKCgwKSkgLS0-IGNodW5rMS1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuICAgIFxuICAgIHN1YmdyYXBoIGNodW5rMlxuICAgICAgICBjaHVuazItMigoXCIyXCIpKSAtLT4gY2h1bmsyLTEoKDEpKSAtLT4gY2h1bmsyLTAoKDApKSAtLT4gY2h1bmsyLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG5cbiAgICBcbiAgICBjaHVuazAtaW5zZXJ0aW9uQmFzZVsvaW5zZXJ0aW9uXFxdIC0tLS0tLS0-IEFcbiAgICBjaHVuazAtMigoXCIyXCIpKSAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF1cbiAgICBjaHVuazAtaW5zZXJ0aW9uIC0tLS0tLS0-IEFcbiAgICBjaHVuazEtaW5zZXJ0aW9uIC0tPiBBXG4gICAgY2h1bmsxLTIgLS4tPiBjaHVuazAtaW5zZXJ0aW9uXG4gICAgY2h1bmsyLWluc2VydGlvbiAtLT4gQVxuICAgIGNodW5rMi0yIC0uLT4gY2h1bmsxLWluc2VydGlvbiIsIm1lcm1haWQiOiJ7fSIsInVwZGF0ZUVkaXRvciI6dHJ1ZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOnRydWV9)
mermaid graph syntax ```mermaid flowchart BT subgraph live - B ------------> A + B --------------> A end subgraph chunk0 - chunk0-2(("2")) --> chunk0-1((1)) --> chunk0-0((0)) --> chunk0-insertion[/insertion\] + chunk0-chunk[["chunk"]] --> chunk0-2(("2")) --> chunk0-1((1)) --> chunk0-0((0)) --> chunk0-insertion[/insertion\] end subgraph chunk1 - chunk1-2(("2")) --> chunk1-1((1)) --> chunk1-0((0)) --> chunk1-insertion[/insertion\] + chunk1-chunk[["chunk"]] --> chunk1-2(("2")) --> chunk1-1((1)) --> chunk1-0((0)) --> chunk1-insertion[/insertion\] end subgraph chunk2 - chunk2-2(("2")) --> chunk2-1((1)) --> chunk2-0((0)) --> chunk2-insertion[/insertion\] + chunk2-chunk[["chunk"]] --> chunk2-2(("2")) --> chunk2-1((1)) --> chunk2-0((0)) --> chunk2-insertion[/insertion\] end chunk0-insertionBase[/insertion\] -------> A - chunk0-2(("2")) -.-> chunk0-insertionBase[/insertion\] + chunk0-chunk -.-> chunk0-insertionBase chunk0-insertion -------> A chunk1-insertion --> A - chunk1-2 -.-> chunk0-insertion + chunk1-chunk -.-> chunk0-insertion chunk2-insertion --> A - chunk2-2 -.-> chunk1-insertion + chunk2-chunk -.-> chunk1-insertion ```
-![](https://user-images.githubusercontent.com/558581/125011503-34917f00-e02e-11eb-9c9e-f4f2253e0c56.png) @@ -111,15 +142,29 @@ The structure of the insertion event would look like: ```js { "type": "m.room.insertion", - "sender": "@example:example.org", + "sender": "@appservice:example.org", "content": { "m.next_chunk_id": next_chunk_id, - "m.historical": True, + "m.historical": true }, - # Since the insertion event is put at the start of the chunk, - # where the oldest event is, copy the origin_server_ts from - # the first event we're inserting - "origin_server_ts": events_to_create[0]["origin_server_ts"], + "room_id": "!jEsUZKDJdhlrceRyVU:example.org", + // Doesn't affect much but good to use the same time as the closest event + "origin_server_ts": 1626914158639 +} +``` + + +The structure of the chunk event would look like: +```js +{ + "type": "m.room.chunk", + "sender": "@appservice:example.org", + "content": { + "m.chunk_id": chunk_id, + }, + "room_id": "!jEsUZKDJdhlrceRyVU:example.org", + // Doesn't affect much but good to use the same time as the closest event + "origin_server_ts": 1626914158639 } ``` @@ -127,28 +172,96 @@ The structure of the insertion event would look like: ### "Marker" events - - A "marker" event points simply back to an "insertion" event. +Finally, we add "marker" events into the mix so that federated remote servers can also navigate and to know where/how to fetch historical messages correctly. + +To lay out the different types of servers consuming these historical messages (more context on why we need "marker" events): + + 1. Local server + - This can pretty much work out of the box. Just add the events to the database and they're available. The new endpoint is just a mechanism to insert the events. + 1. Federated remote server that already has all scrollback history and then new history is inserted + - The big problem is how does a HS know it needs to go fetch more history if they already fetched all of the history in the room? 
We're solving this with "marker" events which are sent on the "live" timeline and point back to the "insertion" event next to which we inserted history. The HS can then go and backfill the "insertion" event and continue navigating the chunks from there. + 1. Federated remote server that joins a new room with historical messages + - We need to update the `/backfill` response to include historical messages from the chunks + 1. Federated remote server already in the room when history is inserted + - Depends on whether the HS has the scrollback history. If the HS already has all history, see scenario 2; if it doesn't, see scenario 3. + 1. For federated servers already in the room that haven't implemented MSC2716 + - Those homeservers won't have historical messages available because they're unable to navigate the "marker"/"insertion" events. But the historical messages would be available once the HS implements MSC2716 and processes the "marker" events that point to the history. + + +--- + + - A "marker" event simply points back to an "insertion" event. - The "marker" event solves the problem of, how does a federated homeserver know about the historical events which won't come down incremental sync? And the scenario where the federated HS already has all the history in the room, so it won't do a full sync of the room again. - Unlike the historical events, the "marker" event is sent as a normal event on the "live" timeline so that comes down incremental sync and is available to all homeservers regardless of how much scrollback history they already have. - - A "marker" event is not needed for every chunk/batch of historical messages. Multiple chunks can be inserted then once we're done importing everything, we can add one "marker" event pointing at the root "insertion" point + - Note: If a server joins after a "marker" event is sent, it could be lost in the middle of the timeline and they could jump back in time past the "marker" and never pick it up. 
But the `/backfill` response should have historical messages included. It gets a bit hairy if the server has the room backfilled, the user leaves, a "marker" event is sent, more messages push it back in the timeline, the user rejoins, jumps back in the timeline, misses the "marker", and expects to see the historical messages. They will be missing the historical messages until they can backfill the gap where they left. + - A "marker" event is not needed for every chunk/batch of historical messages. Multiple chunks can be inserted and then, once we're done importing everything, we can add one "marker" event pointing at the root "insertion" event - If more history is decided to be added later, another "marker" can be sent to let the homeservers know again. - - When a remote federated homeserver, receives a "marker" event, it can mark the "insertion" prev events as needing to backfill from that point again and can fetch the historical messages when the user scrolls back to that area in the future. For Synapse, we plan to add the details to the `insertion_event_lookups` table. - - In Synapse, we discussed not wanting to fetch the "insertion" event when the "marker" comes down the pipe but I've realized that in order to store `insertion_prev_event_id` in the table, we either need to a) add it as part of the "marker" event which works to not fetch anything additional or b) backfill just the "insertion" event to get it. I think I am going to opt for option A though. The plan is to add a new `insertion_event_lookups` table to store which events are marked as insertion points. It stores, `insertion_event_id` and `insertion_prev_event_id` and when we scrollback over `insertion_prev_event_id` again, we trigger some backfill logic to go fetch it. Similar to the `event_backward_extremities` already implemented. 
- - We could remove the need for "marker" events if we decided to only allow sending "insertion" events on the "live" timeline at any point where you would later want to add history. But this isn't compatible with our dynamic insertion use cases like Gitter where the rooms are already created (no "insertion" event at the start of the room), and the examples from this MSC like NNTP (newsgroup) and email which can potentially want to branch off of everything. + - When a remote federated homeserver receives a "marker" event, it can mark the "insertion" prev events as needing to backfill from that point again and can fetch the historical messages when the user scrolls back to that area in the future. + - We could remove the need for "marker" events if we decided to only allow sending "insertion" events on the "live" timeline at any point where you would later want to add history. But this isn't compatible with our dynamic insertion use cases like Gitter where the rooms already exist with no "insertion" events at the start of the room, and the examples from this MSC like NNTP (newsgroup) and email which can potentially want to branch off of everything. 
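As a rough sketch of the receiving side, a remote homeserver could remember which insertion events a "marker" has pointed at and consult that record when a user scrolls back. Everything here is hypothetical: the `PendingBackfill` class, its method names, and the in-memory set stand in for whatever storage a real homeserver would use, and the `m.insertion_id` content field follows the "marker" event structure shown in this section.

```python
class PendingBackfill:
    """Tracks insertion events that a "marker" event has flagged for backfill."""

    def __init__(self) -> None:
        self._insertion_ids = set()

    def on_event(self, event: dict) -> None:
        """Record the insertion event a "marker" points at; ignore other events."""
        if event.get("type") != "m.room.marker":
            return
        insertion_id = event.get("content", {}).get("m.insertion_id")
        if isinstance(insertion_id, str):
            self._insertion_ids.add(insertion_id)

    def take(self, insertion_id: str) -> bool:
        """Return True exactly once if this insertion event still needs backfilling."""
        if insertion_id in self._insertion_ids:
            self._insertion_ids.remove(insertion_id)
            return True
        return False


tracker = PendingBackfill()
# A "marker" arriving on the live timeline (event IDs are made up).
tracker.on_event({
    "type": "m.room.marker",
    "sender": "@appservice:example.org",
    "content": {"m.insertion_id": "$base-insertion"},
})
# Ordinary messages do not change anything.
tracker.on_event({"type": "m.room.message", "content": {"body": "hello"}})

first_check = tracker.take("$base-insertion")
second_check = tracker.take("$base-insertion")
```

The first check reports that a backfill is due and the second does not, which mirrors the idea that one "marker" triggers one round of fetching. Sending another "marker" later would flag the insertion event again.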
-So the structure of the "marker" event would look like: +The structure of the "marker" event would look like: ```js { "type": "m.room.marker", - "sender": "@example:example.org", + "sender": "@appservice:example.org", "content": { - "m.insertion_id": insertion_event.event_id, - "m.insertion_prev_events": insertion_event.prev_events, + "m.insertion_id": insertion_event.event_id }, - "room_id": "!jEsUZKDJdhlrceRyVU:example.org" + "room_id": "!jEsUZKDJdhlrceRyVU:example.org", + "origin_server_ts": 1626914158639, } ``` +![](https://user-images.githubusercontent.com/558581/126578757-7171c714-ee6b-49ea-a998-fe73bb3f4450.png) + +[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBtYXJrZXIxPlwibWFya2VyXCJdIC0tLS0-IEIgLS0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazBcbiAgICAgICAgY2h1bmswLWNodW5rW1tcImNodW5rXCJdXSAtLT4gY2h1bmswLTIoKFwiMlwiKSkgLS0-IGNodW5rMC0xKCgxKSkgLS0-IGNodW5rMC0wKCgwKSkgLS0-IGNodW5rMC1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgc3ViZ3JhcGggY2h1bmsxXG4gICAgICAgIGNodW5rMS1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMS0yKChcIjJcIikpIC0tPiBjaHVuazEtMSgoMSkpIC0tPiBjaHVuazEtMCgoMCkpIC0tPiBjaHVuazEtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazJcbiAgICAgICAgY2h1bmsyLWNodW5rW1tcImNodW5rXCJdXSAtLT4gY2h1bmsyLTIoKFwiMlwiKSkgLS0-IGNodW5rMi0xKCgxKSkgLS0-IGNodW5rMi0wKCgwKSkgLS0-IGNodW5rMi1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgbWFya2VyMSAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlXG4gICAgY2h1bmswLWluc2VydGlvbkJhc2VbL2luc2VydGlvblxcXSAtLS0tLS0tPiBBXG4gICAgY2h1bmswLWNodW5rIC0uLT4gY2h1bmswLWluc2VydGlvbkJhc2VcbiAgICBjaHVuazAtaW5zZXJ0aW9uIC0tLS0tLS0-IEFcbiAgICBjaHVuazEtaW5zZXJ0aW9uIC0tPiBBXG4gICAgY2h1bmsxLWNodW5rIC0uLT4gY2h1bmswLWluc2VydGlvblxuICAgIGNodW5rMi1pbnNlcnRpb24gLS0-IEFcbiAgICBjaHVuazItY2h1bmsgLS4tPiBjaHVuazEtaW5zZXJ0aW9uIiwibWVybWFpZCI6Int9IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0
ZURpYWdyYW0iOmZhbHNlfQ) + +
+mermaid graph syntax + +```mermaid +flowchart BT + subgraph live + marker1>"marker"] ----> B --------------> A + end + + subgraph chunk0 + chunk0-chunk[["chunk"]] --> chunk0-2(("2")) --> chunk0-1((1)) --> chunk0-0((0)) --> chunk0-insertion[/insertion\] + end + + subgraph chunk1 + chunk1-chunk[["chunk"]] --> chunk1-2(("2")) --> chunk1-1((1)) --> chunk1-0((0)) --> chunk1-insertion[/insertion\] + end + + subgraph chunk2 + chunk2-chunk[["chunk"]] --> chunk2-2(("2")) --> chunk2-1((1)) --> chunk2-0((0)) --> chunk2-insertion[/insertion\] + end + + marker1 -.-> chunk0-insertionBase + chunk0-insertionBase[/insertion\] -------> A + chunk0-chunk -.-> chunk0-insertionBase + chunk0-insertion -------> A + chunk1-insertion --> A + chunk1-chunk -.-> chunk0-insertion + chunk2-insertion --> A + chunk2-chunk -.-> chunk1-insertion +``` + +
+ + + + + + +### Limit who can send historical messages + +Since events being silently sent in the past is hard to moderate, it will probably be good to limit who can add historical messages to the timeline. The batch send endpoint is already limited to application services but we also need to limit who can send "insertion", "chunk", and "marker" events since someone can attempt to send them via the normal `/send` API (we don't want any nasty weird knots to reconcile either) + +We can protect against this by introducing a new `historical` power level which controls who can send "insertion", "chunk", and "marker" events. Since we're changing the power levels and the `event_auth.py` logic in Synapse, this probably requires a new room version. For experimenting, we can use an experimental room version, `org.matrix.msc2716`. + +Alternatively, we can use the existing `events` power level. For the default and existing rooms, if the "insertion", "chunk", and "marker" event power levels are unset, we can completely disallow sending of those events in the room. This lets people opt in and set a power level when they want an application service to start inserting history. 
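A minimal sketch of the first option, a dedicated `historical` power level, might look like the following. It assumes the usual `m.room.power_levels` content layout (`users` and `users_default`, with a default of 0); the function names are hypothetical, and treating a missing `historical` key as "nobody may send" is an assumption borrowed from the opt-in idea above rather than something this proposal specifies.

```python
HISTORICAL_EVENT_TYPES = {"m.room.insertion", "m.room.chunk", "m.room.marker"}


def user_power_level(power_levels: dict, user_id: str) -> int:
    """Look up a user's level, falling back to users_default (0 if unset)."""
    default = power_levels.get("users_default", 0)
    return power_levels.get("users", {}).get(user_id, default)


def may_send_historical(power_levels: dict, sender: str, event_type: str) -> bool:
    """Check whether sender may send one of the historical event types."""
    if event_type not in HISTORICAL_EVENT_TYPES:
        raise ValueError(f"{event_type} is not a historical event type")
    required = power_levels.get("historical")
    if required is None:
        # Assumption: rooms that never set the level do not allow these events.
        return False
    return user_power_level(power_levels, sender) >= required


# Hypothetical m.room.power_levels content for a room that opted in.
levels = {
    "users": {"@appservice:example.org": 100},
    "users_default": 0,
    "historical": 100,
}

bridge_ok = may_send_historical(levels, "@appservice:example.org", "m.room.marker")
alice_ok = may_send_historical(levels, "@alice:example.org", "m.room.insertion")
```

With these levels the application service passes the check and an ordinary user does not. The alternative based on the existing `events` map would differ only in where the required level is read from.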
+ From 5854ca250f507006c323c023ca2b3506e70128f7 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 21 Jul 2021 21:10:37 -0500 Subject: [PATCH 07/68] Add note about adding m.historical --- proposals/2716-batch-send-historical-messages.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 4b14f507fd..bd3011ab86 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -39,6 +39,8 @@ Add a new endpoint, `POST /_matrix/client/unstable/org.matrix.msc2716/rooms/ Date: Wed, 28 Jul 2021 23:11:31 -0500 Subject: [PATCH 08/68] Only connect the base insertion event to the specified prev_event --- .../2716-batch-send-historical-messages.md | 30 +++++++------------ 1 file changed, 11 insertions(+), 19 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index bd3011ab86..b4d56c4dfb 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -97,10 +97,9 @@ Next we add "insertion" and "chunk" events so it's more presriptive on how each - Consideration: the "insertion"/"chunk" events add a new way for an application service to tie the chunk reconciliation in knots(similar to the DAG knots that can happen). 
-![](https://user-images.githubusercontent.com/558581/126261578-a6f6c3ec-bf5c-4c82-a128-66ba4a9ac079.png) +![](https://user-images.githubusercontent.com/558581/127040602-e95ac36a-5e64-4176-904d-6abae2c95ae9.png) - -[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0tPiBBXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggY2h1bmswXG4gICAgICAgIGNodW5rMC1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMC0yKChcIjJcIikpIC0tPiBjaHVuazAtMSgoMSkpIC0tPiBjaHVuazAtMCgoMCkpIC0tPiBjaHVuazAtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGNodW5rMVxuICAgICAgICBjaHVuazEtY2h1bmtbW1wiY2h1bmtcIl1dIC0tPiBjaHVuazEtMigoXCIyXCIpKSAtLT4gY2h1bmsxLTEoKDEpKSAtLT4gY2h1bmsxLTAoKDApKSAtLT4gY2h1bmsxLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggY2h1bmsyXG4gICAgICAgIGNodW5rMi1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMi0yKChcIjJcIikpIC0tPiBjaHVuazItMSgoMSkpIC0tPiBjaHVuazItMCgoMCkpIC0tPiBjaHVuazItaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIFxuICAgIGNodW5rMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF0gLS0tLS0tLT4gQVxuICAgIGNodW5rMC1jaHVuayAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlXG4gICAgY2h1bmswLWluc2VydGlvbiAtLS0tLS0tPiBBXG4gICAgY2h1bmsxLWluc2VydGlvbiAtLT4gQVxuICAgIGNodW5rMS1jaHVuayAtLi0-IGNodW5rMC1pbnNlcnRpb25cbiAgICBjaHVuazItaW5zZXJ0aW9uIC0tPiBBXG4gICAgY2h1bmsyLWNodW5rIC0uLT4gY2h1bmsxLWluc2VydGlvbiIsIm1lcm1haWQiOiJ7fSIsInVwZGF0ZUVkaXRvciI6ZmFsc2UsImF1dG9TeW5jIjp0cnVlLCJ1cGRhdGVEaWFncmFtIjpmYWxzZX0) +[Mermaid live editor playground 
link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0tLS0tPiBBXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggY2h1bmswXG4gICAgICAgIGNodW5rMC1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMC0yKChcIjJcIikpIC0tPiBjaHVuazAtMSgoMSkpIC0tPiBjaHVuazAtMCgoMCkpIC0tPiBjaHVuazAtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGNodW5rMVxuICAgICAgICBjaHVuazEtY2h1bmtbW1wiY2h1bmtcIl1dIC0tPiBjaHVuazEtMigoXCIyXCIpKSAtLT4gY2h1bmsxLTEoKDEpKSAtLT4gY2h1bmsxLTAoKDApKSAtLT4gY2h1bmsxLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggY2h1bmsyXG4gICAgICAgIGNodW5rMi1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMi0yKChcIjJcIikpIC0tPiBjaHVuazItMSgoMSkpIC0tPiBjaHVuazItMCgoMCkpIC0tPiBjaHVuazItaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIFxuICAgIGNodW5rMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF0gLS0tLS0tLS0tLS0tLT4gQVxuICAgIGNodW5rMC1jaHVuayAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF1cbiAgICBjaHVuazEtY2h1bmsgLS4tPiBjaHVuazAtaW5zZXJ0aW9uXG4gICAgY2h1bmsyLWNodW5rIC0uLT4gY2h1bmsxLWluc2VydGlvblxuIiwibWVybWFpZCI6Int9IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ)
mermaid graph syntax @@ -108,7 +107,7 @@ Next we add "insertion" and "chunk" events so it's more presriptive on how each ```mermaid flowchart BT subgraph live - B --------------> A + B -----------------> A end subgraph chunk0 @@ -124,12 +123,9 @@ flowchart BT end - chunk0-insertionBase[/insertion\] -------> A - chunk0-chunk -.-> chunk0-insertionBase - chunk0-insertion -------> A - chunk1-insertion --> A + chunk0-insertionBase[/insertion\] -------------> A + chunk0-chunk -.-> chunk0-insertionBase[/insertion\] chunk1-chunk -.-> chunk0-insertion - chunk2-insertion --> A chunk2-chunk -.-> chunk1-insertion ``` @@ -139,7 +135,6 @@ flowchart BT - The structure of the insertion event would look like: ```js { @@ -216,9 +211,9 @@ The structure of the "marker" event would look like: } ``` -![](https://user-images.githubusercontent.com/558581/126578757-7171c714-ee6b-49ea-a998-fe73bb3f4450.png) +![](https://user-images.githubusercontent.com/558581/127429607-d67b6785-050f-4944-bd11-f31870ed43a0.png) -[Mermaid live editor playground 
link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBtYXJrZXIxPlwibWFya2VyXCJdIC0tLS0-IEIgLS0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazBcbiAgICAgICAgY2h1bmswLWNodW5rW1tcImNodW5rXCJdXSAtLT4gY2h1bmswLTIoKFwiMlwiKSkgLS0-IGNodW5rMC0xKCgxKSkgLS0-IGNodW5rMC0wKCgwKSkgLS0-IGNodW5rMC1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgc3ViZ3JhcGggY2h1bmsxXG4gICAgICAgIGNodW5rMS1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMS0yKChcIjJcIikpIC0tPiBjaHVuazEtMSgoMSkpIC0tPiBjaHVuazEtMCgoMCkpIC0tPiBjaHVuazEtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazJcbiAgICAgICAgY2h1bmsyLWNodW5rW1tcImNodW5rXCJdXSAtLT4gY2h1bmsyLTIoKFwiMlwiKSkgLS0-IGNodW5rMi0xKCgxKSkgLS0-IGNodW5rMi0wKCgwKSkgLS0-IGNodW5rMi1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgbWFya2VyMSAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlXG4gICAgY2h1bmswLWluc2VydGlvbkJhc2VbL2luc2VydGlvblxcXSAtLS0tLS0tPiBBXG4gICAgY2h1bmswLWNodW5rIC0uLT4gY2h1bmswLWluc2VydGlvbkJhc2VcbiAgICBjaHVuazAtaW5zZXJ0aW9uIC0tLS0tLS0-IEFcbiAgICBjaHVuazEtaW5zZXJ0aW9uIC0tPiBBXG4gICAgY2h1bmsxLWNodW5rIC0uLT4gY2h1bmswLWluc2VydGlvblxuICAgIGNodW5rMi1pbnNlcnRpb24gLS0-IEFcbiAgICBjaHVuazItY2h1bmsgLS4tPiBjaHVuazEtaW5zZXJ0aW9uIiwibWVybWFpZCI6Int9IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ) +[Mermaid live editor playground 
link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBtYXJrZXIxPlwibWFya2VyXCJdIC0tLS0-IEIgLS0tLS0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazBcbiAgICAgICAgY2h1bmswLWNodW5rW1tcImNodW5rXCJdXSAtLT4gY2h1bmswLTIoKFwiMlwiKSkgLS0-IGNodW5rMC0xKCgxKSkgLS0-IGNodW5rMC0wKCgwKSkgLS0-IGNodW5rMC1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgc3ViZ3JhcGggY2h1bmsxXG4gICAgICAgIGNodW5rMS1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMS0yKChcIjJcIikpIC0tPiBjaHVuazEtMSgoMSkpIC0tPiBjaHVuazEtMCgoMCkpIC0tPiBjaHVuazEtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazJcbiAgICAgICAgY2h1bmsyLWNodW5rW1tcImNodW5rXCJdXSAtLT4gY2h1bmsyLTIoKFwiMlwiKSkgLS0-IGNodW5rMi0xKCgxKSkgLS0-IGNodW5rMi0wKCgwKSkgLS0-IGNodW5rMi1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgXG4gICAgbWFya2VyMSAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlXG4gICAgY2h1bmswLWluc2VydGlvbkJhc2VbL2luc2VydGlvblxcXSAtLS0tLS0tLS0tLS0tPiBBXG4gICAgY2h1bmswLWNodW5rIC0uLT4gY2h1bmswLWluc2VydGlvbkJhc2VbL2luc2VydGlvblxcXVxuICAgIGNodW5rMS1jaHVuayAtLi0-IGNodW5rMC1pbnNlcnRpb25cbiAgICBjaHVuazItY2h1bmsgLS4tPiBjaHVuazEtaW5zZXJ0aW9uXG4iLCJtZXJtYWlkIjoie30iLCJ1cGRhdGVFZGl0b3IiOmZhbHNlLCJhdXRvU3luYyI6dHJ1ZSwidXBkYXRlRGlhZ3JhbSI6ZmFsc2V9)
mermaid graph syntax @@ -226,7 +221,7 @@ The structure of the "marker" event would look like: ```mermaid flowchart BT subgraph live - marker1>"marker"] ----> B --------------> A + marker1>"marker"] ----> B -----------------> A end subgraph chunk0 @@ -241,13 +236,11 @@ flowchart BT chunk2-chunk[["chunk"]] --> chunk2-2(("2")) --> chunk2-1((1)) --> chunk2-0((0)) --> chunk2-insertion[/insertion\] end + marker1 -.-> chunk0-insertionBase - chunk0-insertionBase[/insertion\] -------> A - chunk0-chunk -.-> chunk0-insertionBase - chunk0-insertion -------> A - chunk1-insertion --> A + chunk0-insertionBase[/insertion\] -------------> A + chunk0-chunk -.-> chunk0-insertionBase[/insertion\] chunk1-chunk -.-> chunk0-insertion - chunk2-insertion --> A chunk2-chunk -.-> chunk1-insertion ``` @@ -257,7 +250,6 @@ flowchart BT - ### Limit who can send historical messages Since events being silently sent in the past is hard to moderate, it will probably be good to limit who can add historical messages to the timeline. 
The batch send endpoint is already limited to application services but we also need to limit who can send "insertion", "chunk", and "marker" events since someone can attempt to send them via the normal `/send` API (we don't want any nasty weird knots to reconcile either) From b448452ab71f7a4570a2a564ed9d83eecc23f1b8 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Fri, 6 Aug 2021 21:20:35 -0500 Subject: [PATCH 09/68] Start of consolidation and adding more clear information --- .../2716-batch-send-historical-messages.md | 203 +++++++++++++++--- 1 file changed, 173 insertions(+), 30 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index b4d56c4dfb..ef933d64ce 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -1,9 +1,35 @@ -# MSC2716: Batch send historical messages +# MSC2716: Incrementally importing history into existing rooms -For the full problem statement, considerations, see the other `proposals/2716-importing-history-into-existing-rooms.md` document. Happy to merge the two, once we get more feedback on it. +## Problem -## Alternative batch send proposal +Matrix has historically been unable to easily import existing history into a +room that already exists. This is a major problem when bridging existing +conversations into Matrix, particularly if the scrollback is being +incrementally or lazily imported. +For instance, an NNTP bridge might work by letting a user join a room that +maps to a given newsgroup, first showing an empty room, and then importing the +most recent 1000 newsgroup posts for that room to flesh out some history. The +bridge might then choose to slowly import additional posts for that newsgroup +in the background, until however many decades of backfill were complete. 
+Finally, as more archives surface, they might also need to be manually +gradually added into the history of the room - slowly building up the complete +history of the conversations over time. + +This is currently not supported because: + * There is no way to set historical room state in a room via the CS or AS API - + you can only edit current room state. + * There is no way to create messages in the context of historical room state in + a room via CS or AS API - you can only create events relative to current room + state. + * There is currently no way to override the timestamp on an event via the AS API. + (We used to have the concept of [timestamp + massaging](https://matrix.org/docs/spec/application_service/r0.1.2#timestamp-massaging), + but it never got properly specified) + + + +## Proposal ### Expectation @@ -14,41 +40,143 @@ Here is what scrollback is expected to look like in Element: ![](https://user-images.githubusercontent.com/558581/119064795-cae7e380-b9a1-11eb-9366-5e1f5e6370a8.png) +### Overview -### New batch send approach +**Endpoint:** -Add a new endpoint, `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event=&chunk_id=`, which can insert a chunk of events historically back in time next to the given `prev_event`. This endpoint can only be used by application services. `chunk_id` comes from `next_chunk_id` in the response of the batch send endpoint and is derived from the "insertion" events added to each chunk. It's not required for the first batch send. -``` -# Body + - `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event=&chunk_id=` + +**Event types:** + + - `m.room.insertion`: Events that mark points in time where you can insert historical messages + - `m.room.chunk`: This is what connects one historical chunk to the other. 
 In the DAG, we navigate from an insertion event to the chunk event that points at it, up the historical messages to the insertion event, then repeat the process + - `m.room.marker`: Used to hint to homeservers (and potentially to cache bust on clients) that there is new history back in time that you should go fetch next time someone scrolls back around the specified insertion event. + +**Content fields:** + + - `m.historical` (`[true|false]`): Used on any event to indicate that it was imported historically after the fact + - `m.next_chunk_id` (`string`): This is a random unique string for an `m.room.insertion` event to indicate what ID the next "chunk" event should specify in order to connect to it + - `m.chunk_id` (`string`): Used on `m.room.chunk` events to indicate which `m.room.insertion` event it connects to by its `m.next_chunk_id` field + - `m.marker.insertion` (`string`, an `event_id`): For `m.room.marker` events to point at an `m.room.insertion` event by `event_id` + +**Power level:** + +Since events being silently sent in the past is hard to moderate, it will probably be good to limit who can add historical messages to the timeline. The batch send endpoint is already limited to application services but we also need to limit who can send "insertion", "chunk", and "marker" events since someone can attempt to send them via the normal `/send` API (we don't want any nasty weird knots to reconcile either). + + - `historical`: This controls who can send `m.room.insertion`, `m.room.chunk`, and `m.room.marker` in the room. + +**Room version:** + +The redaction algorithm changes are the only hard requirement for a new room version because we need to make sure when redacting, we only strip out fields without affecting anything at the protocol level. This means that we need to keep all of the structural fields that allow us to navigate the chunks of history in the DAG. We also only want to auth events against fields that wouldn't be removed during redaction. 
 In practice, this means: + + - When redacting `m.room.insertion` events, keep the `m.next_chunk_id` content field around + - When redacting `m.room.chunk` events, keep the `m.chunk_id` content field around + - When redacting `m.room.marker` events, keep the `m.marker.insertion` content field around + - When redacting `m.room.power_levels` events, keep the `historical` content field around + + +#### Backwards compatibility + +However, this MSC is mostly backwards compatible and can be used with the current room version, with the caveat that redactions aren't supported for `m.room.insertion`, `m.room.chunk`, and `m.room.marker` events. We can protect people from this limitation by throwing an error when they try to use `PUT /_matrix/client/r0/rooms/{roomId}/redact/{eventId}/{txnId}` to redact one of those events. We would have to accept the redaction if it came over federation to avoid split-brained rooms. + +Because we also can't use the `historical` power level for controlling who can send these events in the existing room version, we instead only allow the room `creator` to send `m.room.insertion`, `m.room.chunk`, and `m.room.marker` events. + + + +### New historical batch send endpoint + +Add a new endpoint, `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event=&chunk_id=`, which can insert a chunk of events historically back in time next to the given `prev_event`. This endpoint can only be used by application services. + +This endpoint will handle the complexity of creating "insertion" and "chunk" events. All the application service has to do is use `?chunk_id` which comes from `next_chunk_id` in the response of the batch send endpoint. `next_chunk_id` is derived from the insertion events added to each chunk; `?chunk_id` is not required for the first batch send. + +Request body: +```json { - "events": [ ... ], - "state_events_at_start": [ ... 
] + "state_events_at_start": [{ + "type": "m.room.member", + "sender": "@someone:matrix.org", + "origin_server_ts": 1628277690333, + "content": { + "membership": "join" + }, + "state_key": "@someone:matrix.org" + }], + "events": [ + { + "type": "m.room.message", + "sender": "@someone:matrix.org", + "origin_server_ts": 1628277690333, + "content": { + "msgtype": "m.text", + "body": "Historical message1" + } + }, + { + "type": "m.room.message", + "sender": "@someone:matrix.org", + "origin_server_ts": 1628277690334, + "content": { + "msgtype": "m.text", + "body": "Historical message2" + } + } + ] } +``` -# Response +Response: +```json { - "state_events": [...list of state event ID's we inserted], - "events": [...list of historical event ID's we inserted], + "state_events": [list of state event IDs we inserted...], + # List of historical event IDs we inserted which includes the + # auto-generated insertion and chunk events... + "events": [ + { "insertion event for chunk" }, + { "historical message 1" }, + { "historical message 2" }, + { "chunk event" }, + { "base insertion event" } + ], "next_chunk_id": "random-unique-string", } ``` -`state_events_at_start` is used to define the historical state events needed to auth the `events` like join events. These events can float outside of the normal DAG. In Synapse, these are called floating `outlier`'s and won't be visible in the chat history which also allows us to insert multiple chunks without having a bunch of `@mxid joined the room` noise between each chunk. The state will not be resolved into the current state of the room. -`events` is chronological chunk/list of events you want to insert. For Synapse, there is a reverse-chronological constraint on chunks so once you insert one chunk of messages, you can only insert older an older chunk after that. tldr; Insert from your most recent chunk of history -> oldest history. 
+`state_events_at_start` is used to define the historical state events needed to auth the `events`, such as invite and join events. These events can float outside of the normal DAG. In Synapse, these are called `outlier`s and won't be visible in the chat history, which also allows us to insert multiple chunks without having a bunch of `@mxid joined the room` noise between each chunk. **The state will not be resolved into the current state of the room.** -The `state_events`/`events` payload is in **chronological** order (`[0, 1, 2]`) and is processed it in that order so the `prev_events` point to it's older-in-time previous message which is more sane in the DAG. **Depth discussion:** For Synapse, when persisting, we **reverse the list (to make it reverse-chronological)** so we can still get the correct `(topological_ordering, stream_ordering)` so it sorts between A and B as we expect. Why? `depth` is not re-calculated when historical messages are inserted into the DAG. This means we have to take care to insert in the right order. Events are sorted by `(topological_ordering, stream_ordering)` where `topological_ordering` is just `depth`. Normally, `stream_ordering` is an auto incrementing integer but for `backfilled=true` events, it decrements. Historical messages are inserted all at the same `depth`, and marked as backfilled so the `stream_ordering` decrements and each event is sorted behind the next. (from https://github.com/matrix-org/synapse/pull/9247#discussion_r588479201) +`events` is the chronological chunk/list of events you want to insert. For Synapse, there is a reverse-chronological constraint on chunks so once you insert one chunk of messages, you can only insert an older chunk after that. **tl;dr: Insert from your most recent chunk of history -> oldest history.** -All of the events in the chunk get a content field, `"m.historical": true`, to indicate that they are historical at the point of being added to a room. 
-With the new process, the DAG will look like: +#### What does the batch send endpoint do behind the scenes? -![](https://user-images.githubusercontent.com/558581/126577416-68f1a5b0-2818-48c1-b046-21e504a0fe83.png) +This section explains the homeserver magic that happens when someone uses the `batch_send` endpoint. If you're just trying to understand how the "insertion", "chunk", and "marker" events work, you might want to just skip down to the room DAG breakdown which incrementally explains how everything fits together. + 1. An "insertion" event for the "chunk" is added to the start of the chunk. This is the starting point of the next chunk and holds the `next_chunk_id` that we return in the batch send response. The application service passes this as `?chunk_id` when sending the next chunk. + 1. A "chunk" event is added to the end of the chunk. This is the event that connects to an insertion event by `?chunk_id`. + 1. If `?chunk_id` is not specified (usually for the first chunk), create a base "insertion" event as a jumping off point from `?prev_event`. + 1. All of the events in the historical chunk get a content field, `"m.historical": true`, to indicate that they are historical at the point of being added to a room. + 1. The `state_events_at_start`/`events` payload is in **chronological** order (`[0, 1, 2]`) and is processed in that order so each event's `prev_events` point to its older-in-time previous message, which gives us a nice straight line in the DAG. + - **Depth discussion:** For Synapse, when persisting, we **reverse the list (to make it reverse-chronological)** so we can still get the correct `(topological_ordering, stream_ordering)` so it sorts between A and B as we expect. Why? `depth` is not re-calculated when historical messages are inserted into the DAG. This means we have to take care to insert in the right order. Events are sorted by `(topological_ordering, stream_ordering)` where `topological_ordering` is just `depth`. 
Normally, `stream_ordering` is an auto incrementing integer but for `backfilled=true` events, it decrements. Historical messages are inserted all at the same `depth`, and marked as backfilled so the `stream_ordering` decrements and each event is sorted behind the next. (from https://github.com/matrix-org/synapse/pull/9247#discussion_r588479201) -[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazBcbiAgICAgICAgY2h1bmswLTIoKFwiMlwiKSkgLS0-IGNodW5rMC0xKCgxKSkgLS0-IGNodW5rMC0wKCgwKSlcbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGNodW5rMVxuICAgICAgICBjaHVuazEtMigoXCIyXCIpKSAtLT4gY2h1bmsxLTEoKDEpKSAtLT4gY2h1bmsxLTAoKDApKVxuICAgIGVuZFxuICAgIFxuICAgIHN1YmdyYXBoIGNodW5rMlxuICAgICAgICBjaHVuazItMigoXCIyXCIpKSAtLT4gY2h1bmsyLTEoKDEpKSAtLT4gY2h1bmsyLTAoKDApKVxuICAgIGVuZFxuXG4gICAgXG4gICAgY2h1bmswLTAgLS0tLS0tLT4gQVxuICAgIGNodW5rMS0wIC0tPiBBXG4gICAgY2h1bmsyLTAgLS0-IEFcbiAgICBcbiAgICAlJSBhbGlnbm1lbnQgbGlua3MgXG4gICAgY2h1bmswLTAgLS0tIGNodW5rMS0yXG4gICAgY2h1bmsxLTAgLS0tIGNodW5rMi0yXG4gICAgJSUgbWFrZSB0aGUgbGlua3MgaW52aXNpYmxlIFxuICAgIGxpbmtTdHlsZSAxMCBzdHJva2Utd2lkdGg6MnB4LGZpbGw6bm9uZSxzdHJva2U6bm9uZTtcbiAgICBsaW5rU3R5bGUgMTEgc3Ryb2tlLXdpZHRoOjJweCxmaWxsOm5vbmUsc3Ryb2tlOm5vbmU7IiwibWVybWFpZCI6Int9IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ) +### Room DAG breakdown + +#### Basic chunk structure + +Here is the starting point how the historical chunk concept look like in the DAG. +We're going to build from this in the next sections. 
+ + - `A` is the oldest-in-time message + - `B` is the newest-in-time message + - `chunk0` is the first chunk we try to import + - Each chunk of messages is older-in-time than the last (`chunk1` is older than `chunk0`, etc) + + +![](https://user-images.githubusercontent.com/558581/126577416-68f1a5b0-2818-48c1-b046-21e504a0fe83.png) + +[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazBcbiAgICAgICAgY2h1bmswLTIoKFwiMlwiKSkgLS0-IGNodW5rMC0xKCgxKSkgLS0-IGNodW5rMC0wKCgwKSlcbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGNodW5rMVxuICAgICAgICBjaHVuazEtMigoXCIyXCIpKSAtLT4gY2h1bmsxLTEoKDEpKSAtLT4gY2h1bmsxLTAoKDApKVxuICAgIGVuZFxuICAgIFxuICAgIHN1YmdyYXBoIGNodW5rMlxuICAgICAgICBjaHVuazItMigoXCIyXCIpKSAtLT4gY2h1bmsyLTEoKDEpKSAtLT4gY2h1bmsyLTAoKDApKVxuICAgIGVuZFxuXG4gICAgXG4gICAgY2h1bmswLTAgLS0tLS0tLT4gQVxuICAgIGNodW5rMS0wIC0tPiBBXG4gICAgY2h1bmsyLTAgLS0-IEFcbiAgICBcbiAgICAlJSBhbGlnbm1lbnQgbGlua3MgXG4gICAgY2h1bmswLTAgLS0tIGNodW5rMS0yXG4gICAgY2h1bmsxLTAgLS0tIGNodW5rMi0yXG4gICAgJSUgbWFrZSB0aGUgbGlua3MgaW52aXNpYmxlIFxuICAgIGxpbmtTdHlsZSAxMCBzdHJva2Utd2lkdGg6MnB4LGZpbGw6bm9uZSxzdHJva2U6bm9uZTtcbiAgICBsaW5rU3R5bGUgMTEgc3Ryb2tlLXdpZHRoOjJweCxmaWxsOm5vbmUsc3Ryb2tlOm5vbmU7IiwibWVybWFpZCI6Int9IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ) +
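To make the depth discussion concrete for the `A`/`B`/chunk layout above, here is a small Python sketch of the `(topological_ordering, stream_ordering)` sort. It is a toy model rather than Synapse code: the `StoredEvent` class and function names are hypothetical, and giving the historical events the same `depth` as `B` is an assumption made only so that they sort between `A` and `B`.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StoredEvent:
    name: str
    depth: int  # used directly as topological_ordering
    stream_ordering: int  # positive for live events, negative for backfilled ones


def persist_historical(chunk: List[str], depth: int, first_stream: int = -1) -> List[StoredEvent]:
    """Persist a chronological chunk in reverse, with a decrementing stream_ordering."""
    stored = []
    stream = first_stream
    for name in reversed(chunk):  # newest historical event is persisted first
        stored.append(StoredEvent(name, depth, stream))
        stream -= 1  # backfilled events count downwards
    return stored


def timeline(events: List[StoredEvent]) -> List[str]:
    """Order events the way scrollback would: by (depth, stream_ordering)."""
    ordered = sorted(events, key=lambda e: (e.depth, e.stream_ordering))
    return [e.name for e in ordered]


live = [StoredEvent("A", depth=1, stream_ordering=1), StoredEvent("B", depth=2, stream_ordering=2)]
historical = persist_historical(["0", "1", "2"], depth=2)
print(timeline(live + historical))  # ['A', '0', '1', '2', 'B']
```

Because every historical event shares one `depth`, only the decrementing `stream_ordering` separates them, which is why the list has to be reversed before persisting.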
mermaid graph syntax @@ -87,7 +215,7 @@ flowchart BT -### New approach with "insertion" events +#### Adding "insertion" and "chunk" events Next we add "insertion" and "chunk" events so it's more presriptive on how each historical chunk should connect to each other and how the homeserver can navigate the DAG. @@ -96,7 +224,6 @@ Next we add "insertion" and "chunk" events so it's more presriptive on how each - We use "chunk" events to point to the "insertion" event by referencing the "next_chunk_id" from the "insertion" event. - Consideration: the "insertion"/"chunk" events add a new way for an application service to tie the chunk reconciliation in knots(similar to the DAG knots that can happen). - ![](https://user-images.githubusercontent.com/558581/127040602-e95ac36a-5e64-4176-904d-6abae2c95ae9.png) [Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0tLS0tPiBBXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggY2h1bmswXG4gICAgICAgIGNodW5rMC1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMC0yKChcIjJcIikpIC0tPiBjaHVuazAtMSgoMSkpIC0tPiBjaHVuazAtMCgoMCkpIC0tPiBjaHVuazAtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGNodW5rMVxuICAgICAgICBjaHVuazEtY2h1bmtbW1wiY2h1bmtcIl1dIC0tPiBjaHVuazEtMigoXCIyXCIpKSAtLT4gY2h1bmsxLTEoKDEpKSAtLT4gY2h1bmsxLTAoKDApKSAtLT4gY2h1bmsxLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggY2h1bmsyXG4gICAgICAgIGNodW5rMi1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMi0yKChcIjJcIikpIC0tPiBjaHVuazItMSgoMSkpIC0tPiBjaHVuazItMCgoMCkpIC0tPiBjaHVuazItaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIFxuICAgIGNodW5rMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF0gLS0tLS0tLS0tLS0tLT4gQVxuICAgIGNodW5rMC1jaHVuayAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF1cbiAgICBjaHVuazEtY2h1bmsgLS4tPiBjaHVuazAtaW5zZXJ0aW9uXG4gICAgY2h1bmsyLWNodW5rIC0uLT4gY2h1bmsxLWluc2VydGlvblxuIiwibWVybWFpZCI6Int9IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiO
nRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ) @@ -132,9 +259,6 @@ flowchart BT
- - - The structure of the insertion event would look like: ```js { @@ -168,7 +292,7 @@ The structure of the chunk event would look like: -### "Marker" events +#### Adding marker events Finally, we add "marker" events into the mix so that federated remote servers can also navigate and to know where/how to fetch historical messages correctly. @@ -190,7 +314,7 @@ To lay out the different types of servers consuming these historical messages (m - A "marker" event simply points back to an "insertion" event. - The "marker" event solves the problem of, how does a federated homeserver know about the historical events which won't come down incremental sync? And the scenario where the federated HS already has all the history in the room, so it won't do a full sync of the room again. - - Unlike the historical events, the "marker" event is sent as a normal event on the "live" timeline so that comes down incremental sync and is available to all homeservers regardless of how much scrollback history they already have. + - Unlike the historical events sent via `/batch_send`, **the "marker" event is sent separately as a normal event on the "live" timeline** so that comes down incremental sync and is available to all homeservers regardless of how much scrollback history they already have. - Note: If a server joins after a "marker" event is sent, it could be lost in the middle of the timeline and they could jump back in time past the "marker" and never pick it up. But `backfill/` response should have historical messages included. It gets a bit hairy if the server has the room backfilled, the user leaves, a "marker" event is sent, more messages put it back in the timeline, the user joins back, jumps back in the timeline and misses the "marker" and expects to see the historical messages. They will be missing the historical messages until they can backfill the gap where they left. - A "marker" event is not needed for every chunk/batch of historical messages. 
Multiple chunks can be inserted then once we're done importing everything, we can add one "marker" event pointing at the root "insertion" event - If more history is decided to be added later, another "marker" can be sent to let the homeservers know again. @@ -250,14 +374,33 @@ flowchart BT -### Limit who can send historical messages -Since events being silently sent in the past is hard to moderate, it will probably be good to limit who can add historical messages to the timeline. The batch send endpoint is already limited to application services but we also need to limit who can send "insertion", "chunk", and "marker" events since someone can attempt to send them via the normal `/send` API (we don't want any nasty weird knots to reconcile either) -We can limit and protect from this by introducing a new `historical` power level which controls who can send "insertion", "chunk", and "marker" events. Since we're changing the power levels and `event_auth.py` stuff in Synapse, this probably requires a new room version. For experimenting, we can use an experimental room version, `org.matrix.msc2716`. +## Unstable prefix + +Event types, event content fields, and the API endpoint are all using the unstable prefix `org.matrix.msc2716`: + +**Endpoints:** + + - `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send` + +**Event types:** + + - `org.matrix.msc2716.insertion` + - `org.matrix.msc2716.chunk` + - `org.matrix.msc2716.marker` + +**Content fields:** -Alternatively, we can use the existing `events` power level. For the default and existing rooms, if the "insertion", "chunk", and "marker" event PL levels are unset, we can completely disallow sending of those events in the room. This lets people opt-in and set a power level when they want an application service to start inserting history. 
+ - `org.matrix.msc2716.historical` + - `org.matrix.msc2716.next_chunk_id` + - `org.matrix.msc2716.chunk_id` + - `org.matrix.msc2716.marker.insertion` +**Room version:** + - `org.matrix.msc2716` and `org.matrix.msc2716v2`, etc as we develop and iterate along the way +**Power level:** + - `historical` (does not need prefixing because it's already under an experimental room version) From 8a4d136d0c9f48d4ae869a5d907fa5f11983922c Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Fri, 6 Aug 2021 21:50:10 -0500 Subject: [PATCH 10/68] Wrap lines --- .../2716-batch-send-historical-messages.md | 288 ++++++++++++++---- 1 file changed, 229 insertions(+), 59 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index ef933d64ce..31f0ad0936 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -33,7 +33,8 @@ This is currently not supported because: ### Expectation -Historical messages that we insert should appear in the timeline just like they would if they were sent back at that time. +Historical messages that we insert should appear in the timeline just like they +would if they were sent back at that time. Here is what scrollback is expected to look like in Element: @@ -48,26 +49,48 @@ Here is what scrollback is expected to look like in Element: **Event types:** - - `m.room.insertion`: Events that mark points in time where you can insert historical messages - - `m.room.chunk`: This is what connects one historical chunk to the other. In the DAG, we navigate from an insertion event to the chunk event that points at it, up the historical messages to the insertion event, then repeat the process - - `m.room.marker`: Used to hint to homeservers (and potentially to cache bust on clients) that there is new history back time that you should go fetch next time someone scrolls back around the specified insertion event. 
+ - `m.room.insertion`: Events that mark points in time where you can insert + historical messages + - `m.room.chunk`: This is what connects one historical chunk to the other. In + the DAG, we navigate from an insertion event to the chunk event that points + at it, up the historical messages to the insertion event, then repeat the + process + - `m.room.marker`: Used to hint to homeservers (and potentially to cache bust + on clients) that there is new history back time that you should go fetch next + time someone scrolls back around the specified insertion event. **Content fields:** - - `m.historical` (`[true|false]`): Used on any event to indicate they were historically imported after the fact - - `m.next_chunk_id` (`string`): This is a random unique string for a `m.room.insertion` event to indicate what ID the next "chunk" event should specify in order to connect to it - - `m.chunk_id` (`string`): Used on `m.room.chunk` events to indicate which `m.room.insertion` event it connects to by its `m.next_chunk_id` field - - `m.marker.insertion` (another `event_id` string): For `m.room.marker` events to point at an `m.room.insertion` event by `event_id` + - `m.historical` (`[true|false]`): Used on any event to indicate they were + historically imported after the fact + - `m.next_chunk_id` (`string`): This is a random unique string for a + `m.room.insertion` event to indicate what ID the next "chunk" event should + specify in order to connect to it + - `m.chunk_id` (`string`): Used on `m.room.chunk` events to indicate which + `m.room.insertion` event it connects to by its `m.next_chunk_id` field + - `m.marker.insertion` (another `event_id` string): For `m.room.marker` events + to point at an `m.room.insertion` event by `event_id` **Power level:** -Since events being silently sent in the past is hard to moderate, it will probably be good to limit who can add historical messages to the timeline. 
The batch send endpoint is already limited to application services but we also need to limit who can send "insertion", "chunk", and "marker" events since someone can attempt to send them via the normal `/send` API (we don't want any nasty weird knots to reconcile either). +Since events being silently sent in the past is hard to moderate, it will +probably be good to limit who can add historical messages to the timeline. The +batch send endpoint is already limited to application services but we also need +to limit who can send "insertion", "chunk", and "marker" events since someone +can attempt to send them via the normal `/send` API (we don't want any nasty +weird knots to reconcile either). - - `historical`: This controls who can send `m.room.insertion`, `m.room.chunk`, and `m.room.marker` in the room. + - `historical`: This controls who can send `m.room.insertion`, `m.room.chunk`, + and `m.room.marker` in the room. **Room version:** -The redaction algorithm changes are the only hard requirement for a new room version because we need to make sure when redacting, we only strip out fields without affecting anything at the protocol level. This means that we need to keep all of the structural fields that allow us to navigate the chunks of history in the DAG. We also only want to auth events against fields that wouldn't be removed during redaction. In practice, this means: +The redaction algorithm changes are the only hard requirement for a new room +version because we need to make sure when redacting, we only strip out fields +without affecting anything at the protocol level. This means that we need to +keep all of the structural fields that allow us to navigate the chunks of +history in the DAG. We also only want to auth events against fields that +wouldn't be removed during redaction. 
In practice, this means: - When redacting `m.room.insertion` events, keep the `m.next_chunk_id` content field around - When redacting `m.room.chunk` events, keep the `m.chunk_id` content field around @@ -77,17 +100,32 @@ The redaction algorithm changes are the only hard requirement for a new room ver #### Backwards compatibility -However, this MSC is mostly backwards compatible and can be used with the current room version with the caveat that redactions aren't supported for `m.room.insertion`, `m.room.chunk`, `m.room.marker` events. We can protect people from this limitation by throwing an error when they try to use `PUT /_matrix/client/r0/rooms/{roomId}/redact/{eventId}/{txnId}` to redact one of those events. We would have to accept the redaction if it came over federation to avoid split-brained rooms. +However, this MSC is mostly backwards compatible and can be used with the +current room version with the caveat that redactions aren't supported for +`m.room.insertion`, `m.room.chunk`, `m.room.marker` events. We can protect +people from this limitation by throwing an error when they try to use `PUT +/_matrix/client/r0/rooms/{roomId}/redact/{eventId}/{txnId}` to redact one of +those events. We would have to accept the redaction if it came over federation +to avoid split-brained rooms. -Because we also can't use the `historical` power level for controlling who can send these events in the existing room version, we instead only allow the room `creator` to send `m.room.insertion`, `m.room.chunk`, and `m.room.marker` events. +Because we also can't use the `historical` power level for controlling who can +send these events in the existing room version, we instead only allow the room +`creator` to send `m.room.insertion`, `m.room.chunk`, and `m.room.marker` +events. 
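As a rough illustration of the backwards-compatibility rules above, the following Python sketch shows the two checks a homeserver could apply in an existing room version: rejecting client-API redactions of the structural events, and only letting the room creator send them. The function names and signatures are hypothetical and are not taken from this MSC or from Synapse; redactions arriving over federation are deliberately not modelled, since the text says those would still have to be accepted.

```python
# Event types named in the proposal; everything else here is illustrative.
HISTORICAL_EVENT_TYPES = frozenset(
    {"m.room.insertion", "m.room.chunk", "m.room.marker"}
)


def client_may_redact(target_event_type: str, room_supports_msc2716: bool) -> bool:
    """Return False when a client-API redaction request should fail with an error."""
    if room_supports_msc2716:
        # The new room version keeps the structural content fields on redaction.
        return True
    return target_event_type not in HISTORICAL_EVENT_TYPES


def may_send_in_existing_room(event_type: str, sender: str, room_creator: str) -> bool:
    """Existing room versions: only the room creator may send the structural events."""
    if event_type not in HISTORICAL_EVENT_TYPES:
        return True  # ordinary events follow the usual auth rules (not modelled here)
    return sender == room_creator
```

Keeping both checks keyed off one set of event types means the redaction guard and the sender guard cannot drift apart if the list of structural events changes.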
### New historical batch send endpoint -Add a new endpoint, `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event=&chunk_id=`, which can insert a chunk of events historically back in time next to the given `prev_event`. This endpoint can only be used by application services. +Add a new endpoint, `POST +/_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event=&chunk_id=`, +which can insert a chunk of events historically back in time next to the given +`prev_event`. This endpoint can only be used by application services. -This endpoint will handle the complexity of creating "insertion" and "chunk" events. All the application service has to do is use `?chunk_id` which comes from `next_chunk_id` in the response of the batch send endpoint. `next_chunk_id` is derived from the insertion events added to each chunk and is not required for the first batch send. +This endpoint will handle the complexity of creating "insertion" and "chunk" +events. All the application service has to do is use `?chunk_id` which comes +from `next_chunk_id` in the response of the batch send endpoint. `next_chunk_id` +is derived from the insertion events added to each chunk and is not required for +the first batch send. Request body: ```json @@ -125,38 +163,72 @@ Request body: ``` Request response: -```json +```jsonc { - "state_events": [list of state event ID's we inserted...], - # List of historical event ID's we inserted which includes the - # auto-generated insertion and chunk events... + "state_events": [ + // list of state event ID's we inserted... + ], + // List of historical event ID's we inserted which includes the + // auto-generated insertion and chunk events... 
"events": [ - { "insertion event for chunk" }, - { "historical message 1" }, - { "historical message 2" }, - { "chunk event" }, - { "base insertion event" } + // insertion event ID for chunk + // historical message1 event ID + // historical message2 event ID + // chunk event ID + // base insertion event ID ], "next_chunk_id": "random-unique-string", } ``` -`state_events_at_start` is used to define the historical state events needed to auth the `events` like invite and join events. These events can float outside of the normal DAG. In Synapse, these are called `outlier`'s and won't be visible in the chat history which also allows us to insert multiple chunks without having a bunch of `@mxid joined the room` noise between each chunk. **The state will not be resolved into the current state of the room.** +`state_events_at_start` is used to define the historical state events needed to +auth the `events` like invite and join events. These events can float outside of +the normal DAG. In Synapse, these are called `outlier`'s and won't be visible in +the chat history which also allows us to insert multiple chunks without having a +bunch of `@mxid joined the room` noise between each chunk. **The state will not +be resolved into the current state of the room.** -`events` is chronological chunk/list of events you want to insert. For Synapse, there is a reverse-chronological constraint on chunks so once you insert one chunk of messages, you can only insert older an older chunk after that. **tldr; Insert from your most recent chunk of history -> oldest history.** +`events` is chronological chunk/list of events you want to insert. For Synapse, +there is a reverse-chronological constraint on chunks so once you insert one +chunk of messages, you can only insert older an older chunk after that. **tldr; +Insert from your most recent chunk of history -> oldest history.** #### What does the batch send endpoint do behind the scenes? 
-This section explains the homeserver magic that happens when someone uses the `batch_send` endpoint. If you're just trying to understand how the "insertion", "chunk", and "marker" events work, you might want to just skip down to the room DAG breakdown which incrementally explains how everything fits together. - - 1. An "insertion" event for the "chunk" is added to the start of the chunk. This is the starting point of the next chunk and holds the `next_chunk_id` that we return in the batch send response. The application service passes this as `?chunk_id` when sending the next chunk. - 1. A "chunk" event is added to the end of the chunk. This is the event that connects to an insertion event by `?chunk_id`. - 1. If `?chunk_id` is not specified (usually for the first chunk), create a base "insertion" event as a jumping off point from `?prev_event`. - 1. All of the events in the historical chunk get a content field, `"m.historical": true`, to indicate that they are historical at the point of being added to a room. - 1. The `state_events_at_start`/`events` payload is in **chronological** order (`[0, 1, 2]`) and is processed in that order so each event's `prev_events` point to its older-in-time previous message, which gives us a nice straight line in the DAG. - - **Depth discussion:** For Synapse, when persisting, we **reverse the list (to make it reverse-chronological)** so we can still get the correct `(topological_ordering, stream_ordering)` so it sorts between A and B as we expect. Why? `depth` is not re-calculated when historical messages are inserted into the DAG. This means we have to take care to insert in the right order. Events are sorted by `(topological_ordering, stream_ordering)` where `topological_ordering` is just `depth`. Normally, `stream_ordering` is an auto incrementing integer but for `backfilled=true` events, it decrements. 
Historical messages are inserted all at the same `depth`, and marked as backfilled so the `stream_ordering` decrements and each event is sorted behind the next. 
(from https://github.com/matrix-org/synapse/pull/9247#discussion_r588479201) +This section explains the homeserver magic that happens when someone uses the +`batch_send` endpoint. If you're just trying to understand how the "insertion", +"chunk", and "marker" events work, you might want to just skip down to the room +DAG breakdown which incrementally explains how everything fits together. + + 1. An "insertion" event for the "chunk" is added to the start of the chunk. + This is the starting point of the next chunk and holds the `next_chunk_id` + that we return in the batch send response. The application service passes + this as `?chunk_id` when sending the next chunk. + 1. A "chunk" event is added to the end of the chunk. This is the event that + connects to an insertion event by `?chunk_id`. + 1. If `?chunk_id` is not specified (usually for the first chunk), create a + base "insertion" event as a jumping off point from `?prev_event`. + 1. All of the events in the historical chunk get a content field, + `"m.historical": true`, to indicate that they are historical at the point of + being added to a room. + 1. The `state_events_at_start`/`events` payload is in **chronological** order + (`[0, 1, 2]`) and is processed in that order so each event's `prev_events` + point to its older-in-time previous message, which gives us a nice straight + line in the DAG. + - **Depth discussion:** For Synapse, when persisting, we **reverse the list + (to make it reverse-chronological)** so we can still get the correct + `(topological_ordering, stream_ordering)` so it sorts between A and B as + we expect. Why? `depth` is not re-calculated when historical messages are + inserted into the DAG. This means we have to take care to insert in the + right order. Events are sorted by `(topological_ordering, + stream_ordering)` where `topological_ordering` is just `depth`. Normally, + `stream_ordering` is an auto incrementing integer but for + `backfilled=true` events, it decrements. 
Historical messages are inserted + all at the same `depth`, and marked as backfilled so the `stream_ordering` + decrements and each event is sorted behind the next. (from + https://github.com/matrix-org/synapse/pull/9247#discussion_r588479201) @@ -164,13 +236,14 @@ This section explains the homeserver magic that happens when someone uses the `b #### Basic chunk structure -Here is the starting point for how the historical chunk concept looks in the DAG. -We're going to build from this in the next sections. +Here is the starting point for how the historical chunk concept looks in the +DAG. We're going to build from this in the next sections. - `A` is the oldest-in-time message - `B` is the newest-in-time message - `chunk0` is the first chunk we try to import - - Each chunk of messages is older-in-time than the last (`chunk1` is older than `chunk0`, etc) + - Each chunk of messages is older-in-time than the last (`chunk1` is older than + `chunk0`, etc) ![](https://user-images.githubusercontent.com/558581/126577416-68f1a5b0-2818-48c1-b046-21e504a0fe83.png) @@ -217,12 +290,18 @@ flowchart BT #### Adding "insertion" and "chunk" events -Next we add "insertion" and "chunk" events so it's more prescriptive on how historical chunks should connect to each other and how the homeserver can navigate the DAG. +Next we add "insertion" and "chunk" events so it's more prescriptive on how +historical chunks should connect to each other and how the homeserver can +navigate the DAG. - - With "insertion" events, we just add them to the start of each chronological chunk (where the oldest message in the chunk is). The next older-in-time chunk can connect to that "insertion" point from the previous chunk. - - The initial "insertion" event could be from the main DAG or we can create it ad-hoc in the first chunk so the homeserver can start traversing up the chunk from there after a "marker" event points to it. 
- - We use "chunk" events to point to the "insertion" event by referencing the "next_chunk_id" from the "insertion" event. - - Consideration: the "insertion"/"chunk" events add a new way for an application service to tie the chunk reconciliation in knots(similar to the DAG knots that can happen). + - With "insertion" events, we just add them to the start of each chronological + chunk (where the oldest message in the chunk is). The next older-in-time + chunk can connect to that "insertion" point from the previous chunk. + - The initial "insertion" event could be from the main DAG or we can create it + ad-hoc in the first chunk so the homeserver can start traversing up the chunk + from there after a "marker" event points to it. + - We use `m.room.chunk` events to indicate which `m.room.insertion` event it + connects to by its `m.next_chunk_id` field ![](https://user-images.githubusercontent.com/558581/127040602-e95ac36a-5e64-4176-904d-6abae2c95ae9.png) @@ -294,32 +373,72 @@ The structure of the chunk event would look like: #### Adding marker events -Finally, we add "marker" events into the mix so that federated remote servers can also navigate and to know where/how to fetch historical messages correctly. +Finally, we add "marker" events into the mix so that federated remote servers +can also navigate and to know where/how to fetch historical messages correctly. -To lay out the different types of servers consuming these historical messages (more context on why we need "marker" events): +To lay out the different types of servers consuming these historical messages +(more context on why we need "marker" events): 1. Local server - - This can pretty much work out of the box. Just add the events to the database and they're available. The new endpoint is just a mechanism to insert the events. - 1. 
Federated remote server that already has all scrollback history and then new history is inserted - The big problem is how does a HS know it needs to go fetch more history if they already fetched all of the history in the room? We're solving this with "marker" events which are sent on the "live" timeline and point back to the "insertion" event next to which we inserted history. The HS can then go and backfill the "insertion" event and continue navigating the chunks from there. + - This can pretty much work out of the box. Just add the events to the + database and they're available. The new endpoint is just a mechanism to + insert the events. + 1. Federated remote server that already has all scrollback history and then new + history is inserted + - The big problem is how does a HS know it needs to go fetch more history if + they already fetched all of the history in the room? We're solving this + with "marker" events which are sent on the "live" timeline and point back + to the "insertion" event next to which we inserted history. The HS can + then go and backfill the "insertion" event and continue navigating the + chunks from there. 1. Federated remote server that joins a new room with historical messages - - We need to update the `/backfill` response to include historical messages from the chunks + - We need to update the `/backfill` response to include historical messages + from the chunks 1. Federated remote server already in the room when history is inserted - - Depends on whether the HS has the scrollback history. If the HS already has all history, see scenario 2; if it doesn't, see scenario 3. + - Depends on whether the HS has the scrollback history. If the HS already + has all history, see scenario 2; if it doesn't, see scenario 3. 1. For federated servers already in the room that haven't implemented MSC2716 - - Those homeservers won't have historical messages available because they're unable to navigate the "marker"/"insertion" events. 
But the historical messages would be available once the HS implements MSC2716 and processes the "marker" events that point to the history. + - Those homeservers won't have historical messages available because they're + unable to navigate the "marker"/"insertion" events. But the historical + messages would be available once the HS implements MSC2716 and processes + the "marker" events that point to the history. --- - A "marker" event simply points back to an "insertion" event. - - The "marker" event solves the problem of, how does a federated homeserver know about the historical events which won't come down incremental sync? And the scenario where the federated HS already has all the history in the room, so it won't do a full sync of the room again. - - Unlike the historical events sent via `/batch_send`, **the "marker" event is sent separately as a normal event on the "live" timeline** so that comes down incremental sync and is available to all homeservers regardless of how much scrollback history they already have. - - Note: If a server joins after a "marker" event is sent, it could be lost in the middle of the timeline and they could jump back in time past the "marker" and never pick it up. But `backfill/` response should have historical messages included. It gets a bit hairy if the server has the room backfilled, the user leaves, a "marker" event is sent, more messages put it back in the timeline, the user joins back, jumps back in the timeline and misses the "marker" and expects to see the historical messages. They will be missing the historical messages until they can backfill the gap where they left. - - A "marker" event is not needed for every chunk/batch of historical messages. 
Multiple chunks can be inserted then once we're done importing everything, we can add one "marker" event pointing at the root "insertion" event + - The "marker" event solves the problem of, how does a federated homeserver + know about the historical events which won't come down incremental sync? And + the scenario where the federated HS already has all the history in the room, + so it won't do a full sync of the room again. + - Unlike the historical events sent via `/batch_send`, **the "marker" event is + sent separately as a normal event on the "live" timeline** so that comes down + incremental sync and is available to all homeservers regardless of how much + scrollback history they already have. + - Note: If a server joins after a "marker" event is sent, it could be lost + in the middle of the timeline and they could jump back in time past the + "marker" and never pick it up. But `backfill/` response should have + historical messages included. It gets a bit hairy if the server has the + room backfilled, the user leaves, a "marker" event is sent, more messages + put it back in the timeline, the user joins back, jumps back in the + timeline and misses the "marker" and expects to see the historical + messages. They will be missing the historical messages until they can + backfill the gap where they left. + - A "marker" event is not needed for every chunk/batch of historical messages. + Multiple chunks can be inserted then once we're done importing everything, we + can add one "marker" event pointing at the root "insertion" event - If more history is decided to be added later, another "marker" can be sent to let the homeservers know again. - - When a remote federated homeserver, receives a "marker" event, it can mark the "insertion" prev events as needing to backfill from that point again and can fetch the historical messages when the user scrolls back to that area in the future. 
- - We could remove the need for "marker" events if we decided to only allow sending "insertion" events on the "live" timeline at any point where you would later want to add history. But this isn't compatible with our dynamic insertion use cases like Gitter where the rooms already exist with no "insertion" events at the start of the room, and the examples from this MSC like NNTP (newsgroup) and email which can potentially want to branch off of everything. + - When a remote federated homeserver, receives a "marker" event, it can mark + the "insertion" prev events as needing to backfill from that point again and + can fetch the historical messages when the user scrolls back to that area in + the future. + - We could remove the need for "marker" events if we decided to only allow + sending "insertion" events on the "live" timeline at any point where you + would later want to add history. But this isn't compatible with our dynamic + insertion use cases like Gitter where the rooms already exist with no + "insertion" events at the start of the room, and the examples from this MSC + like NNTP (newsgroup) and email which can potentially want to branch off of + everything. The structure of the "marker" event would look like: ```js @@ -371,14 +490,63 @@ flowchart BT
+## Potential issues + +Also see the security considerations section below. + +This doesn't provide a way for a HS to tell an AS that a client has tried to +call `/messages` beyond the beginning of a room, and that the AS should try to +lazy-insert some more messages (as per +https://github.com/matrix-org/matrix-doc/issues/698). For this MSC to be +properly useful, we might want to flesh that out. Another related problem with +the existing appservice query APIs is that they don't include who is querying, +so they're hard to use in bridges that require logging in. If a similar query +API is added here, it should include the ID of the user who's asking for +history. + + +## Alternatives + +We could insist that we use the SS API to import history history in this manner +rather than extending the AS API. However, it seems unnecessarily burdensome to +make bridge authors understand the SS API, especially when we already have so +many AS API bridges. Hence these minor extensions to the existing AS API. + +Another way of doing this might be to store the different eras of the room as +different versions of the room, using `m.room.tombstone` events to form a linked +list of the eras. This has the advantage of isolating room state between +different eras of the room, simplifying state resolution calculations and +avoiding risk of any cross-talk. It's also easier to reason about, and avoids +exposing the DAG to bridge developers. However, it would require better +presentation of room versions in clients, and it would require support for +retrospectively specifying the `predecessor` of the current room when you +retrospectively import history. Currently `predecessor` is in the immutable +`m.room.create` event of a room, so cannot be changed retrospectively - and +doing so in a safe and race-free manner sounds Hard. 
A big problem with this +approach is if you just want to inject a few old lost messages - eg if you're +importing a mail or newsgroup archive and you stumble across a lost mbox with a +few msgs in retrospect, you wouldn't want or be able to splice a whole new room +in with tombstones. + + + + +## Security considerations +The "insertion" and "chunk" events add a new way for an application service to +tie the chunk reconciliation in knots(similar to the DAG knots that can happen) +which can potentially DoS message and backfill navigation on the server. +This also makes it much easier for an AS to maliciously spoof history. This is +a bit unavoidable given the nature of the feature, and is also possible today +via SS API. ## Unstable prefix -Event types, event content fields, and the API endpoint are all using the unstable prefix `org.matrix.msc2716`: +Event types, event content fields, and the API endpoint are all using the +unstable prefix `org.matrix.msc2716`: **Endpoints:** @@ -399,8 +567,10 @@ Event types, event content fields, and the API endpoint are all using the unstab **Room version:** - - `org.matrix.msc2716` and `org.matrix.msc2716v2`, etc as we develop and iterate along the way + - `org.matrix.msc2716` and `org.matrix.msc2716v2`, etc as we develop and + iterate along the way **Power level:** - - `historical` (does not need prefixing because it's already under an experimental room version) + - `historical` (does not need prefixing because it's already under an + experimental room version) From 92f87edbbba6c8c83055f2253c5d5106858d8725 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Fri, 6 Aug 2021 21:58:39 -0500 Subject: [PATCH 11/68] Add remaining alternatives --- .../2716-batch-send-historical-messages.md | 20 +++ ...6-importing-history-into-existing-rooms.md | 135 ------------------ 2 files changed, 20 insertions(+), 135 deletions(-) delete mode 100644 proposals/2716-importing-history-into-existing-rooms.md diff --git 
a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 31f0ad0936..21b8987a4c 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -512,6 +512,13 @@ rather than extending the AS API. However, it seems unnecessarily burdensome to make bridge authors understand the SS API, especially when we already have so many AS API bridges. Hence these minor extensions to the existing AS API. +Another way of doing this is using the existing single send state and event API +endpoints. We could use `PUT /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}` +with `?historical=true` which would create the floating outlier state events. +Then we could use `PUT /_matrix/client/r0/rooms/{roomId}/send/{eventType}/{txnId}`, +with `?prev_event` pointing at that floating state to auth the event and where we +want to insert the event. + Another way of doing this might be to store the different eras of the room as different versions of the room, using `m.room.tombstone` events to form a linked list of the eras. This has the advantage of isolating room state between @@ -528,7 +535,20 @@ importing a mail or newsgroup archive and you stumble across a lost mbox with a few msgs in retrospect, you wouldn't want or be able to splice a whole new room in with tombstones. +Another way could be to let the server who issued the `m.room.create` also go +and retrospectively insert events into the room outside the context of the DAG +(i.e. without parent prev_events or signatures). To quote the original +[bug](https://github.com/matrix-org/matrix-doc/issues/698#issuecomment-259478116): + +> You could just create synthetic events which look like normal DAG events but + exist before the m.room.create event. 
Their signatures and prev-events would + all be missing, but they would be blindly trusted based on the HS who is + allowed to serve them (based on metadata in the m.room.create event). Thus + you'd have a perimeter in the DAG beyond which events are no longer + decentralised or signed, but are blindly trusted to let HSes insert ancient + history provided by ASes. +However, this feels needlessly complicated if the DAG approach is sufficient. ## Security considerations diff --git a/proposals/2716-importing-history-into-existing-rooms.md b/proposals/2716-importing-history-into-existing-rooms.md deleted file mode 100644 index fdd3b85bdf..0000000000 --- a/proposals/2716-importing-history-into-existing-rooms.md +++ /dev/null @@ -1,135 +0,0 @@ -# MSC2716: Incrementally importing history into existing rooms - -## Problem - -Matrix has historically been unable to easily import existing history into a -room that already exists. This is a major problem when bridging existing -conversations into Matrix, particularly if the scrollback is being -incrementally or lazily imported. - -For instance, an NNTP bridge might work by letting a user join a room that -maps to a given newsgroup, first showing an empty room, and then importing the -most recent 1000 newsgroup posts for that room to flesh out some history. The -bridge might then choose to slowly import additional posts for that newsgroup -in the background, until however many decades of backfill were complete. -Finally, as more archives surface, they might also need to be manually -gradually added into the history of the room - slowly building up the complete -history of the conversations over time. - -This is currently not supported because: - * There is no way to set historical room state in a room via the CS or AS API - - you can only edit current room state. 
- * There is no way to create messages in the context of historical room state in - a room via CS or AS API - you can only create events relative to current room - state. - * There is currently no way to override the timestamp on an event via the AS API. - (We used to have the concept of [timestamp - massaging](https://matrix.org/docs/spec/application_service/r0.1.2#timestamp-massaging), - but it never got properly specified) - -## Proposal - - 1. We let the AS API override the prev_event(s) of an event when injecting it into - the room, thus letting bridges consciously specify the topological ordering of - the room DAG. We do this by adding a `prev_event` querystring parameter on the - `PUT /_matrix/client/r0/rooms/{roomId}/send/{eventType}/{txnId}` and - `PUT /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}` endpoints. - The `prev_event` parameter can be repeated multiple times to specify multiple parent - event IDs of the event being submitted. An event must not have more than 20 prev_events. - If a `prev_event` parameter is not presented, the server assumes the event is being - appended to the current timeline and calculates the prev_events as normal. If an - unrecognised event ID is specified as a `prev_event`, the request fails with a 404. - - 2. We also let the AS API override ('massage') the `origin_server_ts` timestamp applied - to sent events. We do this by adding a `ts` querystring parameter on the - `PUT /_matrix/client/r0/rooms/{roomId}/send/{eventType}/{txnId}` and - `PUT /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}`endpoints, specifying - the value to apply to `origin_server_ts` on the event (UNIX epoch milliseconds). - - 3. 
Finally, we can add a optional `"m.historical": true` field to events to - indicate that they are historical at the point of being added to a room, and - as such servers should not serve them to clients via the CS `/sync` API - - instead preferring clients to discover them by paginating scrollback via - `/messages`. - -This lets history be injected at the right place topologically in the room. For instance, different eras of the room could -end up as branches off the original `m.room.create` event, each first setting up the contextual room state for that era before -the block of imported history. So, you could end up with something like this: - -``` -m.room.create - |\ - | \___________________________________ - | \ \ - | \ \ -live timeline previous 1000 messages another block of ancient history -``` - -We consciously don't support the new `parent` and `ts` parameters on the -various helper syntactic-sugar APIs like `/kick` and `/ban`. If a bridge/bot is -smart enough to be faking history, it is already in the business of dealing -with raw events, and should not be using the syntactic sugar APIs. - -## Potential issues - -There are a bunch of security considerations here - see below. - -This doesn't provide a way for a HS to tell an AS that a client has tried to call -/messages beyond the beginning of a room, and that the AS should try to -lazy-insert some more messages (as per https://github.com/matrix-org/matrix-doc/issues/698). -For this MSC to be properly useful, we might want to flesh that out. - -## Alternatives - -We could insist that we use the SS API to import history history in this manner rather than -extending the AS API. However, it seems unnecessarily burdensome to make bridge authors -understand the SS API, especially when we already have so many AS API bridges. Hence these -minor extensions to the existing AS API. 
- -Another way of doing this might be to store the different eras of the room as -different versions of the room, using `m.room.tombstone` events to form a -linked list of the eras. This has the advantage of isolating room state -between different eras of the room, simplifying state resolution calculations -and avoiding risk of any cross-talk. It's also easier to reason about, and -avoids exposing the DAG to bridge developers. However, it would require -better presentation of room versions in clients, and it would require support -for retrospectively specifying the `predecessor` of the current room when you -retrospectively import history. Currently `predecessor` is in the immutable -`m.room.create` event of a room, so cannot be changed retrospectively - and -doing so in a safe and race-free manner sounds Hard. - -Another way could be to let the server who issued the m.room.create also go -and retrospectively insert events into the room outside the context of the DAG -(i.e. without parent prev_events or signatures). To quote the original -[bug](https://github.com/matrix-org/matrix-doc/issues/698#issuecomment-259478116): - -> You could just create synthetic events which look like normal DAG events but - exist before the m.room.create event. Their signatures and prev-events would - all be missing, but they would be blindly trusted based on the HS who is - allowed to serve them (based on metadata in the m.room.create event). Thus - you'd have a perimeter in the DAG beyond which events are no longer - decentralised or signed, but are blindly trusted to let HSes insert ancient - history provided by ASes. - -However, this feels needlessly complicated if the DAG approach is sufficient. - -## Security considerations - -This allows an AS to tie the room DAG in knots by specifying inappropriate -event IDs as parents, potentially DoSing the state resolution algorithm, or -triggering undesired state resolution results. 
This is already possible by the
-SS API today however, and given AS API requires the homeserver admin to
-explicitly authorise the AS in question, this doesn't feel too bad.
-
-This also makes it much easier for an AS to maliciously spoof history. This
-is a bit unavoidable given the nature of the feature, and is also possible
-today via SS API.
-
-If the state changes from under us due to importing history, we have no way to
-tell the client about it. This is an [existing
-bug](https://github.com/matrix-org/synapse/issues/4508) that can be triggered
-today by SS API traffic, so is orthogonal to this proposal.
-
-## Unstable prefix
-
-Feels unnecessary.
From bb44a638fa2c17d9afc674f84323a30971f3b765 Mon Sep 17 00:00:00 2001
From: Eric Eastwood
Date: Fri, 6 Aug 2021 22:04:16 -0500
Subject: [PATCH 12/68] Correct stable endpoint location

---
 proposals/2716-batch-send-historical-messages.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md
index 21b8987a4c..a94627162a 100644
--- a/proposals/2716-batch-send-historical-messages.md
+++ b/proposals/2716-batch-send-historical-messages.md
@@ -45,7 +45,7 @@ Here is what scrollback is expected to look like in Element:
 
 **Endpoint:**
 
- - `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event=&chunk_id=`
+ - `POST /_matrix/client/r0/rooms//batch_send?prev_event=&chunk_id=`
 
 **Event types:**
 
From 3367f56fabaecf992038f72910e1b742b7aa2207 Mon Sep 17 00:00:00 2001
From: Eric Eastwood
Date: Fri, 6 Aug 2021 22:52:09 -0500
Subject: [PATCH 13/68] Reading pass

---
 .../2716-batch-send-historical-messages.md | 74 +++++++++----------
 1 file changed, 33 insertions(+), 41 deletions(-)

diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md
index a94627162a..a39c02e22d 100644
--- a/proposals/2716-batch-send-historical-messages.md
+++
b/proposals/2716-batch-send-historical-messages.md @@ -61,7 +61,7 @@ Here is what scrollback is expected to look like in Element: **Content fields:** - - `m.historical` (`[true|false]`): Used on any event to indicate they were + - `m.historical` (`[true|false]`): Used on any event to indicate that it was historically imported after the fact - `m.next_chunk_id` (`string`): This is a random unique string for a `m.room.insertion` event to indicate what ID the next "chunk" event should @@ -208,8 +208,9 @@ breakdown which incrementally explains how everything fits together. this as `?chunk_id` 1. A "chunk" event is added to the end of the chunk. This is the event that connects to an insertion event by `?chunk_id`. - 1. If `?chunk_id` is not specified (usually for the first chunk), create base - "insertion" event as a jumping off point from `?prev_event`. + 1. If `?chunk_id` is not specified (usually only for the first chunk), create a + base "insertion" event as a jumping off point from `?prev_event` which can + be added to the end of the `events` list in the response. 1. All of the events in the historical chunk get a content field, `"m.historical": true`, to indicate that they are historical at the point of being added to a room. @@ -236,14 +237,14 @@ breakdown which incrementally explains how everything fits together. #### Basic chunk structure -Here is the starting point how the historical chunk concept look like in the -DAG. We're going to build from this in the next sections. +Here is the starting point for how the historical chunk concept looks like in +the DAG. We're going to build from this in the next sections. 
- `A` is the oldest-in-time message - `B` is the newest-in-time message - `chunk0` is the first chunk we try to import - - Each chunk of messages is older-in-time than the last (`chunk1` is older than - `chunk0`, etc) + - Each chunk of messages is older-in-time than the last (`chunk1` is + older-in-time than `chunk0`, etc) ![](https://user-images.githubusercontent.com/558581/126577416-68f1a5b0-2818-48c1-b046-21e504a0fe83.png) @@ -290,18 +291,18 @@ flowchart BT #### Adding "insertion" and "chunk" events -Next we add "insertion" and "chunk" events so it's more presriptive on how each +Next we add "insertion" and "chunk" events so it's more prescriptive on how each historical chunk should connect to each other and how the homeserver can navigate the DAG. - With "insertion" events, we just add them to the start of each chronological chunk (where the oldest message in the chunk is). The next older-in-time chunk can connect to that "insertion" point from the previous chunk. - - The initial "insertion" event could be from the main DAG or we can create it - ad-hoc in the first chunk so the homeserver can start traversing up the chunk - from there after a "marker" event points to it. + - The initial base "insertion" event could be from the main DAG or we can + create it ad-hoc in the first chunk so the homeserver can start traversing up + the chunk from there after a "marker" event points to it. - We use `m.room.chunk` events to indicate which `m.room.insertion` event it - connects to by its `m.next_chunk_id` field + connects to by its `m.next_chunk_id` field. ![](https://user-images.githubusercontent.com/558581/127040602-e95ac36a-5e64-4176-904d-6abae2c95ae9.png) @@ -338,7 +339,7 @@ flowchart BT
-The structure of the insertion event would look like: +The structure of the insertion event looks like: ```js { "type": "m.room.insertion", @@ -354,7 +355,7 @@ The structure of the insertion event would look like: ``` -The structure of the chunk event would look like: +The structure of the chunk event looks like: ```js { "type": "m.room.chunk", @@ -374,15 +375,15 @@ The structure of the chunk event would look like: #### Adding marker events Finally, we add "marker" events into the mix so that federated remote servers -can also navigate and to know where/how to fetch historical messages correctly. +also know where in the DAG they should look for historical messages. To lay out the different types of servers consuming these historical messages (more context on why we need "marker" events): 1. Local server - - This can pretty much work out of the box. Just add the events to the - database and they're available. The new endpoint is just a mechanism to - insert the events. + - This pretty much works out of the box. It's possible to just add the + historical events to the database and they're available. The new endpoint + is just a mechanism to insert the events. 1. Federated remote server that already has all scrollback history and then new history is inserted - The big problem is how does a HS know it needs to go fetch more history if @@ -390,18 +391,18 @@ To lay out the different types of servers consuming these historical messages with "marker" events which are sent on the "live" timeline and point back to the "insertion" event where we inserted history next to. The HS can then go and backfill the "insertion" event and continue navigating the - chunks from there. + historical chunks from there. 1. 
Federated remote server that joins a new room with historical messages - - We need to update the `/backfill` response to include historical messages - from the chunks + - The originating homeserver just needs to update the `/backfill` response + to include historical messages from the chunks. 1. Federated remote server already in the room when history is inserted - Depends on whether the HS has the scrollback history. If the HS already has all history, see scenario 2, if doesn't, see scenario 3. 1. For federated servers already in the room that haven't implemented MSC2716 - Those homeservers won't have historical messages available because they're - unable to navigate the "marker"/"insertion" events. But the historical - messages would be available once the HS implements MSC2716 and processes - the "marker" events that point to the history. + unable to navigate the "marker"/"insertion"/"chunk" events. But the + historical messages would be available once the HS implements MSC2716 and + processes the "marker" events that point to the history. --- @@ -424,23 +425,17 @@ To lay out the different types of servers consuming these historical messages timeline and misses the "marker" and expects to see the historical messages. They will be missing the historical messages until they can backfill the gap where they left. - - A "marker" event is not needed for every chunk/batch of historical messages. - Multiple chunks can be inserted then once we're done importing everything, we - can add one "marker" event pointing at the root "insertion" event + - A "marker" event is not needed for every chunk of historical messages added + via `/batch_send`. Multiple chunks can be inserted then once we're done + importing everything, we can add one "marker" event pointing at the root + "insertion" event - If more history is decided to be added later, another "marker" can be sent to let the homeservers know again. 
- When a remote federated homeserver, receives a "marker" event, it can mark the "insertion" prev events as needing to backfill from that point again and can fetch the historical messages when the user scrolls back to that area in the future. - - We could remove the need for "marker" events if we decided to only allow - sending "insertion" events on the "live" timeline at any point where you - would later want to add history. But this isn't compatible with our dynamic - insertion use cases like Gitter where the rooms already exist with no - "insertion" events at the start of the room, and the examples from this MSC - like NNTP (newsgroup) and email which can potentially want to branch off of - everything. - -The structure of the "marker" event would look like: + +The structure of the "marker" event looks like: ```js { "type": "m.room.marker", @@ -498,8 +493,8 @@ This doesn't provide a way for a HS to tell an AS that a client has tried to call `/messages` beyond the beginning of a room, and that the AS should try to lazy-insert some more messages (as per https://github.com/matrix-org/matrix-doc/issues/698). For this MSC to be -properly useful, we might want to flesh that out. Another related problem with -the existing appservice query APIs is that they don't include who is querying, +extra useful, we might want to flesh that out. Another related problem with +the existing AS query APIs is that they don't include who is querying, so they're hard to use in bridges that require logging in. If a similar query API is added here, it should include the ID of the user who's asking for history. @@ -565,9 +560,6 @@ via SS API. 
 ## Unstable prefix
 
-Event types, event content fields, and the API endpoint are all using the
-unstable prefix `org.matrix.msc2716`:
-
 **Endpoints:**
 
  - `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send`
From a4c474eb221d40b2d50d0df65b593d4abd3bbb58 Mon Sep 17 00:00:00 2001
From: Eric Eastwood
Date: Tue, 7 Sep 2021 09:18:18 -0500
Subject: [PATCH 14/68] Fix casing typo

Co-authored-by: witchent
---
 proposals/2716-batch-send-historical-messages.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md
index a39c02e22d..c94bf7966c 100644
--- a/proposals/2716-batch-send-historical-messages.md
+++ b/proposals/2716-batch-send-historical-messages.md
@@ -524,7 +524,7 @@ presentation of room versions in clients, and it would require support for
 retrospectively specifying the `predecessor` of the current room when you
 retrospectively import history. Currently `predecessor` is in the immutable
 `m.room.create` event of a room, so cannot be changed retrospectively - and
-doing so in a safe and race-free manner sounds Hard. A big problem with this
+doing so in a safe and race-free manner sounds hard.
A big problem with this approach is if you just want to inject a few old lost messages - eg if you're importing a mail or newsgroup archive and you stumble across a lost mbox with a few msgs in retrospect, you wouldn't want or be able to splice a whole new room From 9df8b6e4b0634c765b7c30d760de881059307a6b Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 16 Sep 2021 19:37:11 -0500 Subject: [PATCH 15/68] Split out meta MSC2716 events into their own fields Incorporate feedback from: - https://github.com/matrix-org/matrix-doc/pull/2716#discussion_r684575957 - https://github.com/matrix-org/matrix-doc/pull/2716#discussion_r685508994 --- .../2716-batch-send-historical-messages.md | 22 ++++++++++++------- 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index c94bf7966c..6eed78dd2f 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -165,19 +165,25 @@ Request body: Request response: ```jsonc { - "state_events": [ - // list of state event ID's we inserted... + // List of state event ID's we inserted + "state_event_ids": [ + // member state event ID ], - // List of historical event ID's we inserted which includes the - // auto-generated insertion and chunk events... - "events": [ - // insertion event ID for chunk + // List of historical event ID's we inserted + "event_ids": [ // historical message1 event ID // historical message2 event ID - // chunk event ID - // base insertion event ID ], "next_chunk_id": "random-unique-string", + "insertion_event_id": "$X9RSsCPKu5gTVIJCoDe6HeCmsrp6kD31zXjMRfBCADE", + "chunk_event_id": "$kHspK8a5kQN2xkTJMDWL-BbmeYVYAloQAA9QSLOsOZ4", + // When `?chunk_id` isn't provided, the homeserver automatically creates an + // insertion event as a starting place to hang the history off of. This automatic + // insertion event ID is returned in this field. 
+ // + // When `?chunk_id` is provided, this field is not present because we can hang + // the history off the insertione event specified associated by the chunk ID. + "base_insertion_event_id": "$pmmaTamxhcyLrrOKSrJf3c1zNmfvsE5SGpFpgE_UvN0" } ``` From b3b7903db168cbb26b989c05e583e123430903dc Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 16 Sep 2021 19:41:32 -0500 Subject: [PATCH 16/68] Remove outdated comment on current iteration of spec See https://github.com/matrix-org/matrix-doc/pull/2716#discussion_r684576542 --- proposals/2716-batch-send-historical-messages.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 6eed78dd2f..1d53bbf63a 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -17,8 +17,6 @@ gradually added into the history of the room - slowly building up the complete history of the conversations over time. This is currently not supported because: - * There is no way to set historical room state in a room via the CS or AS API - - you can only edit current room state. * There is no way to create messages in the context of historical room state in a room via CS or AS API - you can only create events relative to current room state. 
From 38bebb16174be7ed7084ac97fdf09c449d61f056 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 16 Sep 2021 19:43:06 -0500 Subject: [PATCH 17/68] Use more obvious query param name See https://github.com/matrix-org/matrix-doc/pull/2716#discussion_r705872887 --- proposals/2716-batch-send-historical-messages.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 1d53bbf63a..f77d7bc534 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -43,7 +43,7 @@ Here is what scrollback is expected to look like in Element: **Endpoint:** - - `POST /_matrix/client/r0/rooms//batch_send?prev_event=&chunk_id=` + - `POST /_matrix/client/r0/rooms//batch_send?prev_event_id=&chunk_id=` **Event types:** @@ -115,9 +115,9 @@ send these events in the existing room version, we instead only allow the room ### New historical batch send endpoint Add a new endpoint, `POST -/_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event=&chunk_id=`, +/_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event_id=&chunk_id=`, which can insert a chunk of events historically back in time next to the given -`prev_event`. This endpoint can only be used by application services. +`prev_event_id`. This endpoint can only be used by application services. This endpoint will handle the complexity of creating "insertion" and "chunk" events. All the application service has to do is use `?chunk_id` which comes @@ -213,7 +213,7 @@ breakdown which incrementally explains how everything fits together. 1. A "chunk" event is added to the end of the chunk. This is the event that connects to an insertion event by `?chunk_id`. 1. 
If `?chunk_id` is not specified (usually only for the first chunk), create a - base "insertion" event as a jumping off point from `?prev_event` which can + base "insertion" event as a jumping off point from `?prev_event_id` which can be added to the end of the `events` list in the response. 1. All of the events in the historical chunk get a content field, `"m.historical": true`, to indicate that they are historical at the point of @@ -515,7 +515,7 @@ Another way of doing this is using the existing single send state and event API endpoints. We could use `PUT /_matrix/client/r0/rooms/{roomId}/state/{eventType}/{stateKey}` with `?historical=true` which would create the floating outlier state events. Then we could use `PUT /_matrix/client/r0/rooms/{roomId}/send/{eventType}/{txnId}`, -with `?prev_event` pointing at that floating state to auth the event and where we +with `?prev_event_id` pointing at that floating state to auth the event and where we want to insert the event. Another way of doing this might be to store the different eras of the room as From 7df80fe59c7d5b79079b186f98ff397de2f665fa Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 13 Oct 2021 14:52:33 -0500 Subject: [PATCH 18/68] Rename from chunks to batches See https://github.com/matrix-org/matrix-doc/pull/2716#discussion_r684574497 --- .../2716-batch-send-historical-messages.md | 222 +++++++++--------- 1 file changed, 116 insertions(+), 106 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index f77d7bc534..040318fe04 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -43,14 +43,14 @@ Here is what scrollback is expected to look like in Element: **Endpoint:** - - `POST /_matrix/client/r0/rooms//batch_send?prev_event_id=&chunk_id=` + - `POST /_matrix/client/r0/rooms//batch_send?prev_event_id=&batch_id=` **Event types:** - `m.room.insertion`: Events that 
mark points in time where you can insert historical messages - - `m.room.chunk`: This is what connects one historical chunk to the other. In - the DAG, we navigate from an insertion event to the chunk event that points + - `m.room.batch`: This is what connects one historical batch to the other. In + the DAG, we navigate from an insertion event to the batch event that points at it, up the historical messages to the insertion event, then repeat the process - `m.room.marker`: Used to hint to homeservers (and potentially to cache bust @@ -61,11 +61,11 @@ Here is what scrollback is expected to look like in Element: - `m.historical` (`[true|false]`): Used on any event to indicate that it was historically imported after the fact - - `m.next_chunk_id` (`string`): This is a random unique string for a - `m.room.insertion` event to indicate what ID the next "chunk" event should + - `m.next_batch_id` (`string`): This is a random unique string for a + `m.room.insertion` event to indicate what ID the next "batch" event should specify in order to connect to it - - `m.chunk_id` (`string`): Used on `m.room.chunk` events to indicate which - `m.room.insertion` event it connects to by its `m.next_chunk_id` field + - `m.batch_id` (`string`): Used on `m.room.batch` events to indicate which + `m.room.insertion` event it connects to by its `m.next_batch_id` field - `m.marker.insertion` (another `event_id` string): For `m.room.marker` events to point at an `m.room.insertion` event by `event_id` @@ -74,11 +74,11 @@ Here is what scrollback is expected to look like in Element: Since events being silently sent in the past is hard to moderate, it will probably be good to limit who can add historical messages to the timeline. 
The batch send endpoint is already limited to application services but we also need -to limit who can send "insertion", "chunk", and "marker" events since someone +to limit who can send "insertion", "batch", and "marker" events since someone can attempt to send them via the normal `/send` API (we don't want any nasty weird knots to reconcile either). - - `historical`: This controls who can send `m.room.insertion`, `m.room.chunk`, + - `historical`: This controls who can send `m.room.insertion`, `m.room.batch`, and `m.room.marker` in the room. **Room version:** @@ -86,12 +86,12 @@ weird knots to reconcile either). The redaction algorithm changes are the only hard requirement for a new room version because we need to make sure when redacting, we only strip out fields without affecting anything at the protocol level. This means that we need to -keep all of the structural fields that allow us to navigate the chunks of +keep all of the structural fields that allow us to navigate the batches of history in the DAG. We also only want to auth events against fields that wouldn't be removed during redaction. In practice, this means: - - When redacting `m.room.insertion` events, keep the `m.next_chunk_id` content field around - - When redacting `m.room.chunk` events, keep the `m.chunk_id` content field around + - When redacting `m.room.insertion` events, keep the `m.next_batch_id` content field around + - When redacting `m.room.batch` events, keep the `m.batch_id` content field around - When redacting `m.room.marker` events, keep the `m.marker.insertion` content field around - When redacting `m.room.power_levels` events, keep the `historical` content field around @@ -100,7 +100,7 @@ wouldn't be removed during redaction. In practice, this means: However, this MSC is mostly backwards compatible and can be used with the current room version with the fact that redactions aren't supported for -`m.room.insertion`, `m.room.chunk`, `m.room.marker` events. 
We can protect +`m.room.insertion`, `m.room.batch`, `m.room.marker` events. We can protect people from this limitation by throwing an error when they try to use `PUT /_matrix/client/r0/rooms/{roomId}/redact/{eventId}/{txnId}` to redact one of those events. We would have to accept the redaction if it came over federation @@ -108,21 +108,21 @@ to avoid split-brained rooms. Because we also can't use the `historical` power level for controlling who can send these events in the existing room version, we instead only allow the room -`creator` to send `m.room.insertion`, `m.room.chunk`, and `m.room.chunk` events. +`creator` to send `m.room.insertion`, `m.room.batch`, and `m.room.marker` events. ### New historical batch send endpoint Add a new endpoint, `POST -/_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event_id=&chunk_id=`, -which can insert a chunk of events historically back in time next to the given +/_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event_id=&batch_id=`, +which can insert a batch of events historically back in time next to the given `prev_event_id`. This endpoint can only be used by application services. -This endpoint will handle the complexity of creating "insertion" and "chunk" -events. All the application service has to do is use `?chunk_id` which comes -from `next_chunk_id` in the response of the batch send endpoint. `next_chunk_id` -is derived from the insertion events added to each chunk and is not required for +This endpoint will handle the complexity of creating "insertion" and "batch" +events. All the application service has to do is use `?batch_id` which comes +from `next_batch_id` in the response of the batch send endpoint. `next_batch_id` +is derived from the insertion events added to each batch and is not required for the first batch send. 
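The hand-off of `next_batch_id` into `?batch_id` can be sketched roughly as follows. This is an illustration under stated assumptions, not part of the proposal: `buildBatchSendUrl`, `importHistory`, the `batches` shape and the injected `send` function are hypothetical names, and the sketch assumes `prev_event_id` is passed on every request and that batches are supplied from most recent to oldest.

```js
// Hypothetical helper (not defined by the MSC): builds the unstable
// batch send URL. `batchId` is left out for the first batch.
function buildBatchSendUrl(homeserverUrl, roomId, prevEventId, batchId) {
  const url = new URL(
    "/_matrix/client/unstable/org.matrix.msc2716/rooms/" +
      encodeURIComponent(roomId) +
      "/batch_send",
    homeserverUrl
  );
  url.searchParams.set("prev_event_id", prevEventId);
  if (batchId !== undefined) {
    url.searchParams.set("batch_id", batchId);
  }
  return url.toString();
}

// `send(url, body)` is an injected, hypothetical function that performs the
// authenticated POST as the application service and resolves to the parsed
// JSON response of the batch send endpoint.
async function importHistory(homeserverUrl, roomId, prevEventId, batches, send) {
  let batchId; // stays undefined until the first response arrives
  const requestedUrls = [];
  for (const batch of batches) {
    const url = buildBatchSendUrl(homeserverUrl, roomId, prevEventId, batchId);
    requestedUrls.push(url);
    const response = await send(url, {
      state_events_at_start: batch.stateEventsAtStart,
      events: batch.events,
    });
    // The next (older) batch connects via the ID returned here.
    batchId = response.next_batch_id;
  }
  return requestedUrls;
}
```

Injecting `send` keeps the sketch independent of any particular HTTP client; a real bridge would also need to attach its application service credentials.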
Request body: @@ -172,15 +172,15 @@ Request response: // historical message1 event ID // historical message2 event ID ], - "next_chunk_id": "random-unique-string", + "next_batch_id": "random-unique-string", "insertion_event_id": "$X9RSsCPKu5gTVIJCoDe6HeCmsrp6kD31zXjMRfBCADE", - "chunk_event_id": "$kHspK8a5kQN2xkTJMDWL-BbmeYVYAloQAA9QSLOsOZ4", - // When `?chunk_id` isn't provided, the homeserver automatically creates an + "batch_event_id": "$kHspK8a5kQN2xkTJMDWL-BbmeYVYAloQAA9QSLOsOZ4", + // When `?batch_id` isn't provided, the homeserver automatically creates an // insertion event as a starting place to hang the history off of. This automatic // insertion event ID is returned in this field. // - // When `?chunk_id` is provided, this field is not present because we can hang - // the history off the insertione event specified associated by the chunk ID. + // When `?batch_id` is provided, this field is not present because we can hang + // the history off the insertione event specified associated by the batch ID. "base_insertion_event_id": "$pmmaTamxhcyLrrOKSrJf3c1zNmfvsE5SGpFpgE_UvN0" } ``` @@ -189,33 +189,33 @@ Request response: `state_events_at_start` is used to define the historical state events needed to auth the `events` like invite and join events. These events can float outside of the normal DAG. In Synapse, these are called `outlier`'s and won't be visible in -the chat history which also allows us to insert multiple chunks without having a -bunch of `@mxid joined the room` noise between each chunk. **The state will not +the chat history which also allows us to insert multiple batches without having a +bunch of `@mxid joined the room` noise between each batch. **The state will not be resolved into the current state of the room.** -`events` is chronological chunk/list of events you want to insert. For Synapse, -there is a reverse-chronological constraint on chunks so once you insert one -chunk of messages, you can only insert older an older chunk after that. 
**tldr; -Insert from your most recent chunk of history -> oldest history.** +`events` is chronological list of events you want to insert. For Synapse, +there is a reverse-chronological constraint on batches so once you insert one +batch of messages, you can only insert older an older batch after that. **tldr; +Insert from your most recent batch of history -> oldest history.** #### What does the batch send endpoint do behind the scenes? This section explains the homeserver magic that happens when someone uses the `batch_send` endpoint. If you're just trying to understand how the "insertion", -"chunk", "marker" events work, you might want to just skip down to the room DAG +"batch", "marker" events work, you might want to just skip down to the room DAG breakdown which incrementally explains how everything fits together. - 1. An "insertion" event for the "chunk" is added to the start of the chunk. - This is the starting point of the next chunk and holds the `next_chunk_id` + 1. An "insertion" event for the batch is added to the start of the batch. + This is the starting point of the next batch and holds the `next_batch_id` that we return in the batch send response. The application service passes - this as `?chunk_id` - 1. A "chunk" event is added to the end of the chunk. This is the event that - connects to an insertion event by `?chunk_id`. - 1. If `?chunk_id` is not specified (usually only for the first chunk), create a + this as `?batch_id` + 1. A "batch" event is added to the end of the batch. This is the event that + connects to an insertion event by `?batch_id`. + 1. If `?batch_id` is not specified (usually only for the first batch), create a base "insertion" event as a jumping off point from `?prev_event_id` which can be added to the end of the `events` list in the response. - 1. All of the events in the historical chunk get a content field, + 1. 
All of the events in the historical batch get a content field, `"m.historical": true`, to indicate that they are historical at the point of being added to a room. 1. The `state_events_at_start`/`events` payload is in **chronological** order @@ -239,21 +239,21 @@ breakdown which incrementally explains how everything fits together. ### Room DAG breakdown -#### Basic chunk structure +#### Basic batch structure -Here is the starting point for how the historical chunk concept looks like in +Here is the starting point for how the historical batch concept looks like in the DAG. We're going to build from this in the next sections. - `A` is the oldest-in-time message - `B` is the newest-in-time message - - `chunk0` is the first chunk we try to import - - Each chunk of messages is older-in-time than the last (`chunk1` is - older-in-time than `chunk0`, etc) + - `batch0` is the first batch we try to import + - Each batch of messages is older-in-time than the last (`batch1` is + older-in-time than `batch0`, etc) -![](https://user-images.githubusercontent.com/558581/126577416-68f1a5b0-2818-48c1-b046-21e504a0fe83.png) +![](https://user-images.githubusercontent.com/558581/137199056-f7e17437-0c98-4a06-9af1-eec8f026229c.png) -[Mermaid live editor playground 
link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazBcbiAgICAgICAgY2h1bmswLTIoKFwiMlwiKSkgLS0-IGNodW5rMC0xKCgxKSkgLS0-IGNodW5rMC0wKCgwKSlcbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGNodW5rMVxuICAgICAgICBjaHVuazEtMigoXCIyXCIpKSAtLT4gY2h1bmsxLTEoKDEpKSAtLT4gY2h1bmsxLTAoKDApKVxuICAgIGVuZFxuICAgIFxuICAgIHN1YmdyYXBoIGNodW5rMlxuICAgICAgICBjaHVuazItMigoXCIyXCIpKSAtLT4gY2h1bmsyLTEoKDEpKSAtLT4gY2h1bmsyLTAoKDApKVxuICAgIGVuZFxuXG4gICAgXG4gICAgY2h1bmswLTAgLS0tLS0tLT4gQVxuICAgIGNodW5rMS0wIC0tPiBBXG4gICAgY2h1bmsyLTAgLS0-IEFcbiAgICBcbiAgICAlJSBhbGlnbm1lbnQgbGlua3MgXG4gICAgY2h1bmswLTAgLS0tIGNodW5rMS0yXG4gICAgY2h1bmsxLTAgLS0tIGNodW5rMi0yXG4gICAgJSUgbWFrZSB0aGUgbGlua3MgaW52aXNpYmxlIFxuICAgIGxpbmtTdHlsZSAxMCBzdHJva2Utd2lkdGg6MnB4LGZpbGw6bm9uZSxzdHJva2U6bm9uZTtcbiAgICBsaW5rU3R5bGUgMTEgc3Ryb2tlLXdpZHRoOjJweCxmaWxsOm5vbmUsc3Ryb2tlOm5vbmU7IiwibWVybWFpZCI6Int9IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ) +[Mermaid live editor playground 
link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBiYXRjaDBcbiAgICAgICAgYmF0Y2gwLTIoKFwiMlwiKSkgLS0-IGJhdGNoMC0xKCgxKSkgLS0-IGJhdGNoMC0wKCgwKSlcbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGJhdGNoMVxuICAgICAgICBiYXRjaDEtMigoXCIyXCIpKSAtLT4gYmF0Y2gxLTEoKDEpKSAtLT4gYmF0Y2gxLTAoKDApKVxuICAgIGVuZFxuICAgIFxuICAgIHN1YmdyYXBoIGJhdGNoMlxuICAgICAgICBiYXRjaDItMigoXCIyXCIpKSAtLT4gYmF0Y2gyLTEoKDEpKSAtLT4gYmF0Y2gyLTAoKDApKVxuICAgIGVuZFxuXG4gICAgXG4gICAgYmF0Y2gwLTAgLS0tLS0tLT4gQVxuICAgIGJhdGNoMS0wIC0tPiBBXG4gICAgYmF0Y2gyLTAgLS0-IEFcbiAgICBcbiAgICAlJSBhbGlnbm1lbnQgbGlua3MgXG4gICAgYmF0Y2gwLTAgLS0tIGJhdGNoMS0yXG4gICAgYmF0Y2gxLTAgLS0tIGJhdGNoMi0yXG4gICAgJSUgbWFrZSB0aGUgbGlua3MgaW52aXNpYmxlIFxuICAgIGxpbmtTdHlsZSAxMCBzdHJva2Utd2lkdGg6MnB4LGZpbGw6bm9uZSxzdHJva2U6bm9uZTtcbiAgICBsaW5rU3R5bGUgMTEgc3Ryb2tlLXdpZHRoOjJweCxmaWxsOm5vbmUsc3Ryb2tlOm5vbmU7IiwibWVybWFpZCI6IntcbiAgXCJ0aGVtZVwiOiBcImRlZmF1bHRcIlxufSIsInVwZGF0ZUVkaXRvciI6ZmFsc2UsImF1dG9TeW5jIjp0cnVlLCJ1cGRhdGVEaWFncmFtIjpmYWxzZX0)
mermaid graph syntax @@ -264,26 +264,26 @@ flowchart BT B -------------> A end - subgraph chunk0 - chunk0-2(("2")) --> chunk0-1((1)) --> chunk0-0((0)) + subgraph batch0 + batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) end - subgraph chunk1 - chunk1-2(("2")) --> chunk1-1((1)) --> chunk1-0((0)) + subgraph batch1 + batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) end - subgraph chunk2 - chunk2-2(("2")) --> chunk2-1((1)) --> chunk2-0((0)) + subgraph batch2 + batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) end - chunk0-0 -------> A - chunk1-0 --> A - chunk2-0 --> A + batch0-0 -------> A + batch1-0 --> A + batch2-0 --> A %% alignment links - chunk0-0 --- chunk1-2 - chunk1-0 --- chunk2-2 + batch0-0 --- batch1-2 + batch1-0 --- batch2-2 %% make the links invisible linkStyle 10 stroke-width:2px,fill:none,stroke:none; linkStyle 11 stroke-width:2px,fill:none,stroke:none; @@ -293,24 +293,24 @@ flowchart BT -#### Adding "insertion" and "chunk" events +#### Adding "insertion" and "batch" events -Next we add "insertion" and "chunk" events so it's more prescriptive on how each -historical chunk should connect to each other and how the homeserver can +Next we add "insertion" and "batch" events so it's more prescriptive on how each +historical batch should connect to each other and how the homeserver can navigate the DAG. - With "insertion" events, we just add them to the start of each chronological - chunk (where the oldest message in the chunk is). The next older-in-time - chunk can connect to that "insertion" point from the previous chunk. + batch (where the oldest message in the batch is). The next older-in-time + batch can connect to that "insertion" point from the previous batch. - The initial base "insertion" event could be from the main DAG or we can - create it ad-hoc in the first chunk so the homeserver can start traversing up - the chunk from there after a "marker" event points to it. 
- - We use `m.room.chunk` events to indicate which `m.room.insertion` event it - connects to by its `m.next_chunk_id` field. + create it ad-hoc in the first batch so the homeserver can start traversing up + the batch from there after a "marker" event points to it. + - We use `m.room.batch` events to indicate which `m.room.insertion` event it + connects to by its `m.next_batch_id` field. -![](https://user-images.githubusercontent.com/558581/127040602-e95ac36a-5e64-4176-904d-6abae2c95ae9.png) +![](https://user-images.githubusercontent.com/558581/137203204-fc630b1e-9ceb-41bb-b074-52a60514cd44.png) -[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0tLS0tPiBBXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggY2h1bmswXG4gICAgICAgIGNodW5rMC1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMC0yKChcIjJcIikpIC0tPiBjaHVuazAtMSgoMSkpIC0tPiBjaHVuazAtMCgoMCkpIC0tPiBjaHVuazAtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGNodW5rMVxuICAgICAgICBjaHVuazEtY2h1bmtbW1wiY2h1bmtcIl1dIC0tPiBjaHVuazEtMigoXCIyXCIpKSAtLT4gY2h1bmsxLTEoKDEpKSAtLT4gY2h1bmsxLTAoKDApKSAtLT4gY2h1bmsxLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggY2h1bmsyXG4gICAgICAgIGNodW5rMi1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMi0yKChcIjJcIikpIC0tPiBjaHVuazItMSgoMSkpIC0tPiBjaHVuazItMCgoMCkpIC0tPiBjaHVuazItaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIFxuICAgIGNodW5rMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF0gLS0tLS0tLS0tLS0tLT4gQVxuICAgIGNodW5rMC1jaHVuayAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF1cbiAgICBjaHVuazEtY2h1bmsgLS4tPiBjaHVuazAtaW5zZXJ0aW9uXG4gICAgY2h1bmsyLWNodW5rIC0uLT4gY2h1bmsxLWluc2VydGlvblxuIiwibWVybWFpZCI6Int9IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ) +[Mermaid live editor playground 
link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0tLS0tLS0tPiBBXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggYmF0Y2gwXG4gICAgICAgIGJhdGNoMC1iYXRjaFtbXCJiYXRjaFwiXV0gLS0-IGJhdGNoMC0yKChcIjJcIikpIC0tPiBiYXRjaDAtMSgoMSkpIC0tPiBiYXRjaDAtMCgoMCkpIC0tPiBiYXRjaDAtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGJhdGNoMVxuICAgICAgICBiYXRjaDEtYmF0Y2hbW1wiYmF0Y2hcIl1dIC0tPiBiYXRjaDEtMigoXCIyXCIpKSAtLT4gYmF0Y2gxLTEoKDEpKSAtLT4gYmF0Y2gxLTAoKDApKSAtLT4gYmF0Y2gxLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggYmF0Y2gyXG4gICAgICAgIGJhdGNoMi1iYXRjaFtbXCJiYXRjaFwiXV0gLS0-IGJhdGNoMi0yKChcIjJcIikpIC0tPiBiYXRjaDItMSgoMSkpIC0tPiBiYXRjaDItMCgoMCkpIC0tPiBiYXRjaDItaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuXG4gICAgYmF0Y2gwLWluc2VydGlvbiAtLT4gZmFrZVByZXZFdmVudDB7e2Zha2VfcHJldl9ldmVudH19XG4gICAgYmF0Y2gxLWluc2VydGlvbiAtLT4gZmFrZVByZXZFdmVudDF7e2Zha2VfcHJldl9ldmVudH19XG4gICAgYmF0Y2gyLWluc2VydGlvbiAtLT4gZmFrZVByZXZFdmVudDJ7e2Zha2VfcHJldl9ldmVudH19XG5cbiAgICBcbiAgICBiYXRjaDAtaW5zZXJ0aW9uQmFzZVsvaW5zZXJ0aW9uXFxdIC0tLS0tLS0tLS0tLS0tLT4gQVxuICAgIGJhdGNoMC1iYXRjaCAtLi0-IGJhdGNoMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF1cbiAgICBiYXRjaDEtYmF0Y2ggLS4tPiBiYXRjaDAtaW5zZXJ0aW9uXG4gICAgYmF0Y2gyLWJhdGNoIC0uLT4gYmF0Y2gxLWluc2VydGlvblxuIiwibWVybWFpZCI6IntcbiAgXCJ0aGVtZVwiOiBcImRlZmF1bHRcIlxufSIsInVwZGF0ZUVkaXRvciI6ZmFsc2UsImF1dG9TeW5jIjp0cnVlLCJ1cGRhdGVEaWFncmFtIjpmYWxzZX0)
mermaid graph syntax @@ -318,26 +318,31 @@ navigate the DAG. ```mermaid flowchart BT subgraph live - B -----------------> A + B --------------------> A end - subgraph chunk0 - chunk0-chunk[["chunk"]] --> chunk0-2(("2")) --> chunk0-1((1)) --> chunk0-0((0)) --> chunk0-insertion[/insertion\] + subgraph batch0 + batch0-batch[["batch"]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] end - subgraph chunk1 - chunk1-chunk[["chunk"]] --> chunk1-2(("2")) --> chunk1-1((1)) --> chunk1-0((0)) --> chunk1-insertion[/insertion\] + subgraph batch1 + batch1-batch[["batch"]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] end - subgraph chunk2 - chunk2-chunk[["chunk"]] --> chunk2-2(("2")) --> chunk2-1((1)) --> chunk2-0((0)) --> chunk2-insertion[/insertion\] + subgraph batch2 + batch2-batch[["batch"]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] end + + batch0-insertion --> fakePrevEvent0{{fake_prev_event}} + batch1-insertion --> fakePrevEvent1{{fake_prev_event}} + batch2-insertion --> fakePrevEvent2{{fake_prev_event}} + - chunk0-insertionBase[/insertion\] -------------> A - chunk0-chunk -.-> chunk0-insertionBase[/insertion\] - chunk1-chunk -.-> chunk0-insertion - chunk2-chunk -.-> chunk1-insertion + batch0-insertionBase[/insertion\] ---------------> A + batch0-batch -.-> batch0-insertionBase[/insertion\] + batch1-batch -.-> batch0-insertion + batch2-batch -.-> batch1-insertion ```
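As a rough illustration of the linkage drawn above, the sketch below pairs each `m.room.batch` event with the `m.room.insertion` event whose `m.next_batch_id` matches its `m.batch_id`. The function name `linkBatchesToInsertions` and the event IDs used with it are hypothetical; only the event types and content field names are taken from this proposal, and a real homeserver would do this lookup against its event store rather than an in-memory list.

```js
// Maps each batch event ID to the insertion event ID it connects to,
// or to null when no matching insertion event is known yet.
function linkBatchesToInsertions(events) {
  // Index insertion events by the ID they advertise for the next batch.
  const insertionByNextBatchId = new Map();
  for (const event of events) {
    if (event.type === "m.room.insertion") {
      insertionByNextBatchId.set(event.content["m.next_batch_id"], event.event_id);
    }
  }

  const links = new Map();
  for (const event of events) {
    if (event.type === "m.room.batch") {
      const target = insertionByNextBatchId.get(event.content["m.batch_id"]);
      links.set(event.event_id, target === undefined ? null : target);
    }
  }
  return links;
}
```

A batch event that maps to `null` here corresponds to history the server cannot yet navigate to, for example because the referenced insertion event has not been backfilled.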
@@ -349,7 +354,7 @@ The structure of the insertion event looks like: "type": "m.room.insertion", "sender": "@appservice:example.org", "content": { - "m.next_chunk_id": next_chunk_id, + "m.next_batch_id": next_batch_id, "m.historical": true }, "room_id": "!jEsUZKDJdhlrceRyVU:example.org", @@ -359,13 +364,13 @@ The structure of the insertion event looks like: ``` -The structure of the chunk event looks like: +The structure of the batch event looks like: ```js { - "type": "m.room.chunk", + "type": "m.room.batch", "sender": "@appservice:example.org", "content": { - "m.chunk_id": chunk_id, + "m.batch_id": batch_id, "m.historical": true }, "room_id": "!jEsUZKDJdhlrceRyVU:example.org", @@ -395,16 +400,16 @@ To lay out the different types of servers consuming these historical messages with "marker" events which are sent on the "live" timeline and point back to the "insertion" event where we inserted history next to. The HS can then go and backfill the "insertion" event and continue navigating the - historical chunks from there. + historical batches from there. 1. Federated remote server that joins a new room with historical messages - The originating homeserver just needs to update the `/backfill` response - to include historical messages from the chunks. + to include historical messages from the batches. 1. Federated remote server already in the room when history is inserted - Depends on whether the HS has the scrollback history. If the HS already has all history, see scenario 2, if doesn't, see scenario 3. 1. For federated servers already in the room that haven't implemented MSC2716 - Those homeservers won't have historical messages available because they're - unable to navigate the "marker"/"insertion"/"chunk" events. But the + unable to navigate the "marker"/"insertion"/"batch" events. But the historical messages would be available once the HS implements MSC2716 and processes the "marker" events that point to the history. 
@@ -429,8 +434,8 @@ To lay out the different types of servers consuming these historical messages timeline and misses the "marker" and expects to see the historical messages. They will be missing the historical messages until they can backfill the gap where they left. - - A "marker" event is not needed for every chunk of historical messages added - via `/batch_send`. Multiple chunks can be inserted then once we're done + - A "marker" event is not needed for every batch of historical messages added + via `/batch_send`. Multiple batches can be inserted then once we're done importing everything, we can add one "marker" event pointing at the root "insertion" event - If more history is decided to be added later, another "marker" can be sent to let the homeservers know again. @@ -453,9 +458,9 @@ The structure of the "marker" event looks like: } ``` -![](https://user-images.githubusercontent.com/558581/127429607-d67b6785-050f-4944-bd11-f31870ed43a0.png) +![](https://user-images.githubusercontent.com/558581/137203021-d5f5dcfe-3e47-4ee2-9041-232c13090218.png) -[Mermaid live editor playground 
link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBtYXJrZXIxPlwibWFya2VyXCJdIC0tLS0-IEIgLS0tLS0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazBcbiAgICAgICAgY2h1bmswLWNodW5rW1tcImNodW5rXCJdXSAtLT4gY2h1bmswLTIoKFwiMlwiKSkgLS0-IGNodW5rMC0xKCgxKSkgLS0-IGNodW5rMC0wKCgwKSkgLS0-IGNodW5rMC1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgc3ViZ3JhcGggY2h1bmsxXG4gICAgICAgIGNodW5rMS1jaHVua1tbXCJjaHVua1wiXV0gLS0-IGNodW5rMS0yKChcIjJcIikpIC0tPiBjaHVuazEtMSgoMSkpIC0tPiBjaHVuazEtMCgoMCkpIC0tPiBjaHVuazEtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBjaHVuazJcbiAgICAgICAgY2h1bmsyLWNodW5rW1tcImNodW5rXCJdXSAtLT4gY2h1bmsyLTIoKFwiMlwiKSkgLS0-IGNodW5rMi0xKCgxKSkgLS0-IGNodW5rMi0wKCgwKSkgLS0-IGNodW5rMi1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgXG4gICAgbWFya2VyMSAtLi0-IGNodW5rMC1pbnNlcnRpb25CYXNlXG4gICAgY2h1bmswLWluc2VydGlvbkJhc2VbL2luc2VydGlvblxcXSAtLS0tLS0tLS0tLS0tPiBBXG4gICAgY2h1bmswLWNodW5rIC0uLT4gY2h1bmswLWluc2VydGlvbkJhc2VbL2luc2VydGlvblxcXVxuICAgIGNodW5rMS1jaHVuayAtLi0-IGNodW5rMC1pbnNlcnRpb25cbiAgICBjaHVuazItY2h1bmsgLS4tPiBjaHVuazEtaW5zZXJ0aW9uXG4iLCJtZXJtYWlkIjoie30iLCJ1cGRhdGVFZGl0b3IiOmZhbHNlLCJhdXRvU3luYyI6dHJ1ZSwidXBkYXRlRGlhZ3JhbSI6ZmFsc2V9) +[Mermaid live editor playground 
link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBtYXJrZXIxPlwibWFya2VyXCJdIC0tLS0-IEIgLS0tLS0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBiYXRjaDBcbiAgICAgICAgYmF0Y2gwLWJhdGNoW1tcImJhdGNoXCJdXSAtLT4gYmF0Y2gwLTIoKFwiMlwiKSkgLS0-IGJhdGNoMC0xKCgxKSkgLS0-IGJhdGNoMC0wKCgwKSkgLS0-IGJhdGNoMC1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgc3ViZ3JhcGggYmF0Y2gxXG4gICAgICAgIGJhdGNoMS1iYXRjaFtbXCJiYXRjaFwiXV0gLS0-IGJhdGNoMS0yKChcIjJcIikpIC0tPiBiYXRjaDEtMSgoMSkpIC0tPiBiYXRjaDEtMCgoMCkpIC0tPiBiYXRjaDEtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBiYXRjaDJcbiAgICAgICAgYmF0Y2gyLWJhdGNoW1tcImJhdGNoXCJdXSAtLT4gYmF0Y2gyLTIoKFwiMlwiKSkgLS0-IGJhdGNoMi0xKCgxKSkgLS0-IGJhdGNoMi0wKCgwKSkgLS0-IGJhdGNoMi1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG5cbiAgICBiYXRjaDAtaW5zZXJ0aW9uIC0tPiBmYWtlUHJldkV2ZW50MHt7ZmFrZV9wcmV2X2V2ZW50fX1cbiAgICBiYXRjaDEtaW5zZXJ0aW9uIC0tPiBmYWtlUHJldkV2ZW50MXt7ZmFrZV9wcmV2X2V2ZW50fX1cbiAgICBiYXRjaDItaW5zZXJ0aW9uIC0tPiBmYWtlUHJldkV2ZW50Mnt7ZmFrZV9wcmV2X2V2ZW50fX1cblxuICAgIFxuICAgIG1hcmtlcjEgLS4tPiBiYXRjaDAtaW5zZXJ0aW9uQmFzZVxuICAgIGJhdGNoMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF0gLS0tLS0tLS0tLS0tLS0tPiBBXG4gICAgYmF0Y2gwLWJhdGNoIC0uLT4gYmF0Y2gwLWluc2VydGlvbkJhc2VbL2luc2VydGlvblxcXVxuICAgIGJhdGNoMS1iYXRjaCAtLi0-IGJhdGNoMC1pbnNlcnRpb25cbiAgICBiYXRjaDItYmF0Y2ggLS4tPiBiYXRjaDEtaW5zZXJ0aW9uXG4iLCJtZXJtYWlkIjoie1xuICBcInRoZW1lXCI6IFwiZGVmYXVsdFwiXG59IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ)
mermaid graph syntax @@ -466,24 +471,29 @@ flowchart BT marker1>"marker"] ----> B -----------------> A end - subgraph chunk0 - chunk0-chunk[["chunk"]] --> chunk0-2(("2")) --> chunk0-1((1)) --> chunk0-0((0)) --> chunk0-insertion[/insertion\] + subgraph batch0 + batch0-batch[["batch"]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] end - subgraph chunk1 - chunk1-chunk[["chunk"]] --> chunk1-2(("2")) --> chunk1-1((1)) --> chunk1-0((0)) --> chunk1-insertion[/insertion\] + subgraph batch1 + batch1-batch[["batch"]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] end - subgraph chunk2 - chunk2-chunk[["chunk"]] --> chunk2-2(("2")) --> chunk2-1((1)) --> chunk2-0((0)) --> chunk2-insertion[/insertion\] + subgraph batch2 + batch2-batch[["batch"]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] end + + batch0-insertion --> fakePrevEvent0{{fake_prev_event}} + batch1-insertion --> fakePrevEvent1{{fake_prev_event}} + batch2-insertion --> fakePrevEvent2{{fake_prev_event}} + - marker1 -.-> chunk0-insertionBase - chunk0-insertionBase[/insertion\] -------------> A - chunk0-chunk -.-> chunk0-insertionBase[/insertion\] - chunk1-chunk -.-> chunk0-insertion - chunk2-chunk -.-> chunk1-insertion + marker1 -.-> batch0-insertionBase + batch0-insertionBase[/insertion\] ---------------> A + batch0-batch -.-> batch0-insertionBase[/insertion\] + batch1-batch -.-> batch0-insertion + batch2-batch -.-> batch1-insertion ```
@@ -552,8 +562,8 @@ However, this feels needlessly complicated if the DAG approach is sufficient. ## Security considerations -The "insertion" and "chunk" events add a new way for an application service to -tie the chunk reconciliation in knots(similar to the DAG knots that can happen) +The "insertion" and "batch" events add a new way for an application service to +tie the batch reconciliation in knots(similar to the DAG knots that can happen) which can potentially DoS message and backfill navigation on the server. This also makes it much easier for an AS to maliciously spoof history. This is @@ -571,14 +581,14 @@ via SS API. **Event types:** - `org.matrix.msc2716.insertion` - - `org.matrix.msc2716.chunk` + - `org.matrix.msc2716.batch` - `org.matrix.msc2716.marker` **Content fields:** - `org.matrix.msc2716.historical` - - `org.matrix.msc2716.next_chunk_id` - - `org.matrix.msc2716.chunk_id` + - `org.matrix.msc2716.next_batch_id` + - `org.matrix.msc2716.batch_id` - `org.matrix.msc2716.marker.insertion` **Room version:** From 3f28588dec6736d35ca615f6b87484d7c4a8aa4f Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 13 Oct 2021 22:50:22 -0500 Subject: [PATCH 19/68] Add graph to show how historical state plays into the DAG --- .../2716-batch-send-historical-messages.md | 60 +++++++++++++++++++ 1 file changed, 60 insertions(+) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 040318fe04..73dbc3148c 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -499,6 +499,66 @@ flowchart BT
+ +#### Add in the historical state + +In order to show the display name and avatar for the historical messages, +the state provided by `state_events_at_start` needs to resolve when one of +the historical messages is fetched. + +It's probably most semantic to have the state float outside of the normal DAG +in a chain by referencing a fake `prev_event`. Then the insertion event +reference the last piece in the floating state chain. + +In Synapse, the fake `prev_event` causes it to look like an "outlier" because +the homeserver can't fully fetch and resolve the state at the point. As a +result, the state will not be resolved into the current state of the room, +and it won't be visible in the chat history. This allows us to insert multiple +batches without having a bunch of `@mxid joined the room` noise between each +batch. + +![](https://user-images.githubusercontent.com/558581/137247868-b7d5b996-02ac-49f8-bbfa-1417ddc60bae.png) + +[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBtYXJrZXIxPlwibWFya2VyXCJdIC0tLS0-IEIgLS0tLS0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBiYXRjaDBcbiAgICAgICAgYmF0Y2gwLWJhdGNoW1tcImJhdGNoXCJdXSAtLT4gYmF0Y2gwLTIoKFwiMlwiKSkgLS0-IGJhdGNoMC0xKCgxKSkgLS0-IGJhdGNoMC0wKCgwKSkgLS0-IGJhdGNoMC1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgc3ViZ3JhcGggYmF0Y2gxXG4gICAgICAgIGJhdGNoMS1iYXRjaFtbXCJiYXRjaFwiXV0gLS0-IGJhdGNoMS0yKChcIjJcIikpIC0tPiBiYXRjaDEtMSgoMSkpIC0tPiBiYXRjaDEtMCgoMCkpIC0tPiBiYXRjaDEtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBiYXRjaDJcbiAgICAgICAgYmF0Y2gyLWJhdGNoW1tcImJhdGNoXCJdXSAtLT4gYmF0Y2gyLTIoKFwiMlwiKSkgLS0-IGJhdGNoMi0xKCgxKSkgLS0-IGJhdGNoMi0wKCgwKSkgLS0-IGJhdGNoMi1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG5cbiAgICBiYXRjaDAtaW5zZXJ0aW9uIC0tPiBtZW1iZXJCb2IwKFtcIm0ucm9vbS5tZW1iZXIgKGJvYilcIl0pIC0tPiBtZW1iZXJBbGljZTAoW1wibS5yb29tLm1lbWJlciAoYWxpY2UpXCJdKSAtLT4gZmFrZVByZXZ
FdmVudDB7e2Zha2VfcHJldl9ldmVudH19XG4gICAgYmF0Y2gxLWluc2VydGlvbiAtLT4gbWVtYmVyQm9iMShbXCJtLnJvb20ubWVtYmVyIChib2IpXCJdKSAtLT4gbWVtYmVyQWxpY2UxKFtcIm0ucm9vbS5tZW1iZXIgKGFsaWNlKVwiXSkgLS0-IGZha2VQcmV2RXZlbnQxe3tmYWtlX3ByZXZfZXZlbnR9fVxuICAgIGJhdGNoMi1pbnNlcnRpb24gLS0-IG1lbWJlckJvYjIoW1wibS5yb29tLm1lbWJlciAoYm9iKVwiXSkgLS0-IG1lbWJlckFsaWNlMihbXCJtLnJvb20ubWVtYmVyIChhbGljZSlcIl0pIC0tPiBmYWtlUHJldkV2ZW50Mnt7ZmFrZV9wcmV2X2V2ZW50fX1cblxuICAgIG1hcmtlcjEgLS4tPiBiYXRjaDAtaW5zZXJ0aW9uQmFzZVxuICAgIGJhdGNoMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF0gLS0tLS0tLS0tLS0tLS0tPiBBXG4gICAgYmF0Y2gwLWJhdGNoIC0uLT4gYmF0Y2gwLWluc2VydGlvbkJhc2VbL2luc2VydGlvblxcXVxuICAgIGJhdGNoMS1iYXRjaCAtLi0-IGJhdGNoMC1pbnNlcnRpb25cbiAgICBiYXRjaDItYmF0Y2ggLS4tPiBiYXRjaDEtaW5zZXJ0aW9uXG4iLCJtZXJtYWlkIjoie1xuICBcInRoZW1lXCI6IFwiZGVmYXVsdFwiXG59IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ) + +
+mermaid graph syntax + +```mermaid +flowchart BT + subgraph live + marker1>"marker"] ----> B -----------------> A + end + + subgraph batch0 + batch0-batch[["batch"]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] + end + + subgraph batch1 + batch1-batch[["batch"]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] + end + + subgraph batch2 + batch2-batch[["batch"]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] + end + + + batch0-insertion --> memberBob0(["m.room.member (bob)"]) --> memberAlice0(["m.room.member (alice)"]) --> fakePrevEvent0{{fake_prev_event}} + batch1-insertion --> memberBob1(["m.room.member (bob)"]) --> memberAlice1(["m.room.member (alice)"]) --> fakePrevEvent1{{fake_prev_event}} + batch2-insertion --> memberBob2(["m.room.member (bob)"]) --> memberAlice2(["m.room.member (alice)"]) --> fakePrevEvent2{{fake_prev_event}} + + marker1 -.-> batch0-insertionBase + batch0-insertionBase[/insertion\] ---------------> A + batch0-batch -.-> batch0-insertionBase[/insertion\] + batch1-batch -.-> batch0-insertion + batch2-batch -.-> batch1-insertion +``` + +
+ + + + ## Potential issues Also see the security considerations section below. From 80e68bc41f2afd204cc01a6db2fd14d6d1f02684 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 15 Dec 2021 05:23:57 -0600 Subject: [PATCH 20/68] Add server detection support --- proposals/2716-batch-send-historical-messages.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 73dbc3148c..5322e65ea0 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -634,6 +634,10 @@ via SS API. ## Unstable prefix +Servers will indicate support for the new endpoint via a non-empty value for feature flag +`org.matrix.msc2716` in `unstable_features` in the response to `GET +/_matrix/client/versions`. + **Endpoints:** - `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send` From 16782826fca00eecf52f153642cb1864249dd4a6 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Mon, 7 Feb 2022 16:48:25 -0600 Subject: [PATCH 21/68] Prefer empty prev_events=[] over fake prev_events See https://github.com/matrix-org/synapse/pull/11114 --- .../2716-batch-send-historical-messages.md | 62 ++++++++++--------- 1 file changed, 33 insertions(+), 29 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 5322e65ea0..ce5cc6da69 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -202,7 +202,7 @@ Insert from your most recent batch of history -> oldest history.** #### What does the batch send endpoint do behind the scenes? This section explains the homeserver magic that happens when someone uses the -`batch_send` endpoint. If you're just trying to understand how the "insertion", +`/batch_send` endpoint. 
If you're just trying to understand how the "insertion", "batch", "marker" events work, you might want to just skip down to the room DAG breakdown which incrementally explains how everything fits together. @@ -308,9 +308,9 @@ navigate the DAG. - We use `m.room.batch` events to indicate which `m.room.insertion` event it connects to by its `m.next_batch_id` field. -![](https://user-images.githubusercontent.com/558581/137203204-fc630b1e-9ceb-41bb-b074-52a60514cd44.png) +![](https://user-images.githubusercontent.com/558581/152883498-7acf0750-5742-47b3-8644-008f24f9396f.png) -[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0tLS0tLS0tPiBBXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggYmF0Y2gwXG4gICAgICAgIGJhdGNoMC1iYXRjaFtbXCJiYXRjaFwiXV0gLS0-IGJhdGNoMC0yKChcIjJcIikpIC0tPiBiYXRjaDAtMSgoMSkpIC0tPiBiYXRjaDAtMCgoMCkpIC0tPiBiYXRjaDAtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGJhdGNoMVxuICAgICAgICBiYXRjaDEtYmF0Y2hbW1wiYmF0Y2hcIl1dIC0tPiBiYXRjaDEtMigoXCIyXCIpKSAtLT4gYmF0Y2gxLTEoKDEpKSAtLT4gYmF0Y2gxLTAoKDApKSAtLT4gYmF0Y2gxLWluc2VydGlvblsvaW5zZXJ0aW9uXFxdXG4gICAgZW5kXG4gICAgXG4gICAgc3ViZ3JhcGggYmF0Y2gyXG4gICAgICAgIGJhdGNoMi1iYXRjaFtbXCJiYXRjaFwiXV0gLS0-IGJhdGNoMi0yKChcIjJcIikpIC0tPiBiYXRjaDItMSgoMSkpIC0tPiBiYXRjaDItMCgoMCkpIC0tPiBiYXRjaDItaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcblxuXG4gICAgYmF0Y2gwLWluc2VydGlvbiAtLT4gZmFrZVByZXZFdmVudDB7e2Zha2VfcHJldl9ldmVudH19XG4gICAgYmF0Y2gxLWluc2VydGlvbiAtLT4gZmFrZVByZXZFdmVudDF7e2Zha2VfcHJldl9ldmVudH19XG4gICAgYmF0Y2gyLWluc2VydGlvbiAtLT4gZmFrZVByZXZFdmVudDJ7e2Zha2VfcHJldl9ldmVudH19XG5cbiAgICBcbiAgICBiYXRjaDAtaW5zZXJ0aW9uQmFzZVsvaW5zZXJ0aW9uXFxdIC0tLS0tLS0tLS0tLS0tLT4gQVxuICAgIGJhdGNoMC1iYXRjaCAtLi0-IGJhdGNoMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF1cbiAgICBiYXRjaDEtYmF0Y2ggLS4tPiBiYXRjaDAtaW5zZXJ0aW9uXG4gICAgYmF0Y2gyLWJhdGNoIC0uLT4gYmF0Y2gxLWluc2VydGlvblxuIiwibWVybWFpZCI6IntcbiAgXCJ0aGVtZVwiOiBcImRlZ
mF1bHRcIlxufSIsInVwZGF0ZUVkaXRvciI6ZmFsc2UsImF1dG9TeW5jIjp0cnVlLCJ1cGRhdGVEaWFncmFtIjpmYWxzZX0) +[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqNk92KwjAQhV8lBIQKdTW5WejCwsruE7h3xovYjDaYJpKkuiK--6atP7WtYi7a6eTM4Us6c8SpEYATvFJmn2bcejT9ZRqF5Yrl2vJthpTcQZ0q1xSNetYn-qoloEUdtEyW3KfZ5GZTf4-q13zOcBUwvFig0uy8S6OIYcrwcNjMkigi95lJFE3uM1I7sF4aPR9fQ8YWDcY-PtLiI0_5SC8f6fCRDh95ge_RHdIWI33KSHsZaYeRdhjpy3dYP9sXP-UO7gvRo55p9gIavfX8xa5Zo5I8q2zoaEdHmrrLcQYDxJVc6xy0D72vNw7dmVxLyvPcpORanfMNIJ9B22asw6g5JPVOOrlU55ly_qAaUoKqXamkPyQok0KAjlOjjE285dptuQ2qj7q2tJ1V9eQ9GFmzgdFeCp8ldPsXr6RSiTYa4nqrij9wjHOwOZciTP2x9GE4wObAcBJCASteKM8w06cgLbaCe_gR0huLkxVXDmLMC29mB53igFTARfQteWjT_Kw6_QOAPVu0)
mermaid graph syntax @@ -333,16 +333,18 @@ flowchart BT batch2-batch[["batch"]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] end - - batch0-insertion --> fakePrevEvent0{{fake_prev_event}} - batch1-insertion --> fakePrevEvent1{{fake_prev_event}} - batch2-insertion --> fakePrevEvent2{{fake_prev_event}} - batch0-insertionBase[/insertion\] ---------------> A batch0-batch -.-> batch0-insertionBase[/insertion\] batch1-batch -.-> batch0-insertion batch2-batch -.-> batch1-insertion + + + %% alignment links + batch2-insertion --- alignment1 + %% make the alignment links/nodes invisible + style alignment1 visibility: hidden,color:transparent; + linkStyle 17 stroke-width:2px,fill:none,stroke:none; ```
@@ -458,9 +460,10 @@ The structure of the "marker" event looks like: } ``` -![](https://user-images.githubusercontent.com/558581/137203021-d5f5dcfe-3e47-4ee2-9041-232c13090218.png) +![](https://user-images.githubusercontent.com/558581/152883769-cfd1bd30-3c18-47a3-8631-3d51af540d1d.png) + -[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBtYXJrZXIxPlwibWFya2VyXCJdIC0tLS0-IEIgLS0tLS0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBiYXRjaDBcbiAgICAgICAgYmF0Y2gwLWJhdGNoW1tcImJhdGNoXCJdXSAtLT4gYmF0Y2gwLTIoKFwiMlwiKSkgLS0-IGJhdGNoMC0xKCgxKSkgLS0-IGJhdGNoMC0wKCgwKSkgLS0-IGJhdGNoMC1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgc3ViZ3JhcGggYmF0Y2gxXG4gICAgICAgIGJhdGNoMS1iYXRjaFtbXCJiYXRjaFwiXV0gLS0-IGJhdGNoMS0yKChcIjJcIikpIC0tPiBiYXRjaDEtMSgoMSkpIC0tPiBiYXRjaDEtMCgoMCkpIC0tPiBiYXRjaDEtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBiYXRjaDJcbiAgICAgICAgYmF0Y2gyLWJhdGNoW1tcImJhdGNoXCJdXSAtLT4gYmF0Y2gyLTIoKFwiMlwiKSkgLS0-IGJhdGNoMi0xKCgxKSkgLS0-IGJhdGNoMi0wKCgwKSkgLS0-IGJhdGNoMi1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG5cbiAgICBiYXRjaDAtaW5zZXJ0aW9uIC0tPiBmYWtlUHJldkV2ZW50MHt7ZmFrZV9wcmV2X2V2ZW50fX1cbiAgICBiYXRjaDEtaW5zZXJ0aW9uIC0tPiBmYWtlUHJldkV2ZW50MXt7ZmFrZV9wcmV2X2V2ZW50fX1cbiAgICBiYXRjaDItaW5zZXJ0aW9uIC0tPiBmYWtlUHJldkV2ZW50Mnt7ZmFrZV9wcmV2X2V2ZW50fX1cblxuICAgIFxuICAgIG1hcmtlcjEgLS4tPiBiYXRjaDAtaW5zZXJ0aW9uQmFzZVxuICAgIGJhdGNoMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF0gLS0tLS0tLS0tLS0tLS0tPiBBXG4gICAgYmF0Y2gwLWJhdGNoIC0uLT4gYmF0Y2gwLWluc2VydGlvbkJhc2VbL2luc2VydGlvblxcXVxuICAgIGJhdGNoMS1iYXRjaCAtLi0-IGJhdGNoMC1pbnNlcnRpb25cbiAgICBiYXRjaDItYmF0Y2ggLS4tPiBiYXRjaDEtaW5zZXJ0aW9uXG4iLCJtZXJtYWlkIjoie1xuICBcInRoZW1lXCI6IFwiZGVmYXVsdFwiXG59IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ) +[Mermaid live editor playground 
link](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqNlN9vwiAQx_8VQmJSk7oJb6uJycz2F2xv4gOW05JSMEDnjPF_H2390dpqdg_t9fje8QGOHnFqBOAEb5TZpxm3Hi2-mUbBXLneWr7LkJI_0IQqK7jNwZI5w43H8ApNgs3Ron53bI7em0zQonHuaq-5T7PprXrzPalfyyXDtROmqOaYX0ZpFDFMGR6P21ESRaQbmUbRtBuR2oH10ujl69VlbNViHOIjd3zkKR8Z5CM9PtLjI__ge7SH9I6RPmWkg4y0x0h7jPTfe9g8z-2CJi8Dp7Dg7txZQyPd-uhRa7Vb5vE0Q7DtwxzObOloT0fausuqRyPEldzqArQPN0fnDnWKXFOq9dyk5Jpd8ByQz-C-zKsOF9UhqX-kk2t13jfnD6olJagelUr6Q4IyKQToODXK2MRbrt2O26CaNblV2a86n7yFQtbkMNlL4bOE7n7jjVQq0UZD3AzV_gzHuABbcCnCP-NY1WE4wBbAcBJcARteKs8w06cgLXeCe_gU0huLkw1XDmLMS2--DjrFAamEi-hD8tDNxVl1-gNT2W_l)
mermaid graph syntax @@ -483,17 +486,19 @@ flowchart BT batch2-batch[["batch"]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] end - - batch0-insertion --> fakePrevEvent0{{fake_prev_event}} - batch1-insertion --> fakePrevEvent1{{fake_prev_event}} - batch2-insertion --> fakePrevEvent2{{fake_prev_event}} - marker1 -.-> batch0-insertionBase batch0-insertionBase[/insertion\] ---------------> A batch0-batch -.-> batch0-insertionBase[/insertion\] batch1-batch -.-> batch0-insertion batch2-batch -.-> batch1-insertion + + + %% alignment links + batch2-insertion --- alignment1 + %% make the alignment links/nodes invisible + style alignment1 visibility: hidden,color:transparent; + linkStyle 19 stroke-width:2px,fill:none,stroke:none; ```
@@ -506,20 +511,19 @@ In order to show the display name and avatar for the historical messages, the state provided by `state_events_at_start` needs to resolve when one of the historical messages is fetched. -It's probably most semantic to have the state float outside of the normal DAG -in a chain by referencing a fake `prev_event`. Then the insertion event -reference the last piece in the floating state chain. +It's probably most semantic to have the historical state float outside of the +normal DAG in a chain by specifying no `prev_events` (empty `prev_events=[]`) +for the first one. Then the insertion event can reference the last piece in the +floating state chain. -In Synapse, the fake `prev_event` causes it to look like an "outlier" because -the homeserver can't fully fetch and resolve the state at the point. As a -result, the state will not be resolved into the current state of the room, -and it won't be visible in the chat history. This allows us to insert multiple -batches without having a bunch of `@mxid joined the room` noise between each -batch. +In Synapse, historical state is marked as an `outlier`. As a result, the state +will not be resolved into the current state of the room, and it won't be visible +in the chat history. This allows us to insert multiple batches without having a +bunch of `@mxid joined the room` noise between each batch. 
-![](https://user-images.githubusercontent.com/558581/137247868-b7d5b996-02ac-49f8-bbfa-1417ddc60bae.png) +![](https://user-images.githubusercontent.com/558581/152884091-b4fe23e2-e019-4d05-af24-bbfb4f656b05.png) -[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBtYXJrZXIxPlwibWFya2VyXCJdIC0tLS0-IEIgLS0tLS0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBiYXRjaDBcbiAgICAgICAgYmF0Y2gwLWJhdGNoW1tcImJhdGNoXCJdXSAtLT4gYmF0Y2gwLTIoKFwiMlwiKSkgLS0-IGJhdGNoMC0xKCgxKSkgLS0-IGJhdGNoMC0wKCgwKSkgLS0-IGJhdGNoMC1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG4gICAgc3ViZ3JhcGggYmF0Y2gxXG4gICAgICAgIGJhdGNoMS1iYXRjaFtbXCJiYXRjaFwiXV0gLS0-IGJhdGNoMS0yKChcIjJcIikpIC0tPiBiYXRjaDEtMSgoMSkpIC0tPiBiYXRjaDEtMCgoMCkpIC0tPiBiYXRjaDEtaW5zZXJ0aW9uWy9pbnNlcnRpb25cXF1cbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBiYXRjaDJcbiAgICAgICAgYmF0Y2gyLWJhdGNoW1tcImJhdGNoXCJdXSAtLT4gYmF0Y2gyLTIoKFwiMlwiKSkgLS0-IGJhdGNoMi0xKCgxKSkgLS0-IGJhdGNoMi0wKCgwKSkgLS0-IGJhdGNoMi1pbnNlcnRpb25bL2luc2VydGlvblxcXVxuICAgIGVuZFxuXG5cbiAgICBiYXRjaDAtaW5zZXJ0aW9uIC0tPiBtZW1iZXJCb2IwKFtcIm0ucm9vbS5tZW1iZXIgKGJvYilcIl0pIC0tPiBtZW1iZXJBbGljZTAoW1wibS5yb29tLm1lbWJlciAoYWxpY2UpXCJdKSAtLT4gZmFrZVByZXZFdmVudDB7e2Zha2VfcHJldl9ldmVudH19XG4gICAgYmF0Y2gxLWluc2VydGlvbiAtLT4gbWVtYmVyQm9iMShbXCJtLnJvb20ubWVtYmVyIChib2IpXCJdKSAtLT4gbWVtYmVyQWxpY2UxKFtcIm0ucm9vbS5tZW1iZXIgKGFsaWNlKVwiXSkgLS0-IGZha2VQcmV2RXZlbnQxe3tmYWtlX3ByZXZfZXZlbnR9fVxuICAgIGJhdGNoMi1pbnNlcnRpb24gLS0-IG1lbWJlckJvYjIoW1wibS5yb29tLm1lbWJlciAoYm9iKVwiXSkgLS0-IG1lbWJlckFsaWNlMihbXCJtLnJvb20ubWVtYmVyIChhbGljZSlcIl0pIC0tPiBmYWtlUHJldkV2ZW50Mnt7ZmFrZV9wcmV2X2V2ZW50fX1cblxuICAgIG1hcmtlcjEgLS4tPiBiYXRjaDAtaW5zZXJ0aW9uQmFzZVxuICAgIGJhdGNoMC1pbnNlcnRpb25CYXNlWy9pbnNlcnRpb25cXF0gLS0tLS0tLS0tLS0tLS0tPiBBXG4gICAgYmF0Y2gwLWJhdGNoIC0uLT4gYmF0Y2gwLWluc2VydGlvbkJhc2VbL2luc2VydGlvblxcXVxuICAgIGJhdGNoMS1iYXRjaCAtLi0-IGJhdGNoMC1pbnNlcnRpb25cbiAgICBiYXRjaDItYmF0Y2ggLS4tPiBiYXRjaD
EtaW5zZXJ0aW9uXG4iLCJtZXJtYWlkIjoie1xuICBcInRoZW1lXCI6IFwiZGVmYXVsdFwiXG59IiwidXBkYXRlRWRpdG9yIjpmYWxzZSwiYXV0b1N5bmMiOnRydWUsInVwZGF0ZURpYWdyYW0iOmZhbHNlfQ) +[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqVlE1ugzAQha9izcqRkhR7yQIpUXuCdhdnYWBSUDGOjGlVRbl7DU5aftNkNh6Pn_0-DRpOkOgUIYRDob-STBpLtm-iJC6qOn438piRIv9EX2pCSfOBhkUCfCZgT1YuIrJt115EZONvYpn6ZPB2LG2SBX-v-_2qXXY7AW3iLBqP6HrKKRXABSwW3SqjlPUrAaVBv5KXFRqb63L39JsKse8wTvGxAR-7yccm-diIj4342B18cz3kA0Z-k5FPMvIRIx8x8rt66DfDppPV2j2jUMVotjoOqINTa6O1WvsiobGOFw7VG_ripsgTnNLK5qBVd-zYrB17wI7da8dn7fgDdvwfO294mTzvM-ztVlY43fXmpP-pyNyUdqdv3mbqu3fnYvpmr2lDHevqYAkKjZJ56v5Lp-aeAJuhQgGhS1M8yLqwAkR5dtL6mEqLL2lutYHwIIsKlyBrq1-_ywRCa2q8ip5z6SZGXVTnH4twitQ)
mermaid graph syntax @@ -543,9 +547,9 @@ flowchart BT end - batch0-insertion --> memberBob0(["m.room.member (bob)"]) --> memberAlice0(["m.room.member (alice)"]) --> fakePrevEvent0{{fake_prev_event}} - batch1-insertion --> memberBob1(["m.room.member (bob)"]) --> memberAlice1(["m.room.member (alice)"]) --> fakePrevEvent1{{fake_prev_event}} - batch2-insertion --> memberBob2(["m.room.member (bob)"]) --> memberAlice2(["m.room.member (alice)"]) --> fakePrevEvent2{{fake_prev_event}} + batch0-insertion -.-> memberBob0(["m.room.member (bob)"]) --> memberAlice0(["m.room.member (alice)"]) + batch1-insertion -.-> memberBob1(["m.room.member (bob)"]) --> memberAlice1(["m.room.member (alice)"]) + batch2-insertion -.-> memberBob2(["m.room.member (bob)"]) --> memberAlice2(["m.room.member (alice)"]) marker1 -.-> batch0-insertionBase batch0-insertionBase[/insertion\] ---------------> A From c6a60b1a8df522ed24152b1ba8c02fbd5a69a023 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 15 Feb 2022 21:39:26 -0600 Subject: [PATCH 22/68] GitHub now supports mermaid natively See https://github.blog/2022-02-14-include-diagrams-markdown-files-mermaid/ --- .../2716-batch-send-historical-messages.md | 38 ------------------- 1 file changed, 38 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index ce5cc6da69..c223bb106f 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -250,14 +250,6 @@ the DAG. We're going to build from this in the next sections. 
- Each batch of messages is older-in-time than the last (`batch1` is older-in-time than `batch0`, etc) - -![](https://user-images.githubusercontent.com/558581/137199056-f7e17437-0c98-4a06-9af1-eec8f026229c.png) - -[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#eyJjb2RlIjoiZmxvd2NoYXJ0IEJUXG4gICAgc3ViZ3JhcGggbGl2ZVxuICAgICAgICBCIC0tLS0tLS0tLS0tLS0-IEFcbiAgICBlbmRcbiAgICBcbiAgICBzdWJncmFwaCBiYXRjaDBcbiAgICAgICAgYmF0Y2gwLTIoKFwiMlwiKSkgLS0-IGJhdGNoMC0xKCgxKSkgLS0-IGJhdGNoMC0wKCgwKSlcbiAgICBlbmRcblxuICAgIHN1YmdyYXBoIGJhdGNoMVxuICAgICAgICBiYXRjaDEtMigoXCIyXCIpKSAtLT4gYmF0Y2gxLTEoKDEpKSAtLT4gYmF0Y2gxLTAoKDApKVxuICAgIGVuZFxuICAgIFxuICAgIHN1YmdyYXBoIGJhdGNoMlxuICAgICAgICBiYXRjaDItMigoXCIyXCIpKSAtLT4gYmF0Y2gyLTEoKDEpKSAtLT4gYmF0Y2gyLTAoKDApKVxuICAgIGVuZFxuXG4gICAgXG4gICAgYmF0Y2gwLTAgLS0tLS0tLT4gQVxuICAgIGJhdGNoMS0wIC0tPiBBXG4gICAgYmF0Y2gyLTAgLS0-IEFcbiAgICBcbiAgICAlJSBhbGlnbm1lbnQgbGlua3MgXG4gICAgYmF0Y2gwLTAgLS0tIGJhdGNoMS0yXG4gICAgYmF0Y2gxLTAgLS0tIGJhdGNoMi0yXG4gICAgJSUgbWFrZSB0aGUgbGlua3MgaW52aXNpYmxlIFxuICAgIGxpbmtTdHlsZSAxMCBzdHJva2Utd2lkdGg6MnB4LGZpbGw6bm9uZSxzdHJva2U6bm9uZTtcbiAgICBsaW5rU3R5bGUgMTEgc3Ryb2tlLXdpZHRoOjJweCxmaWxsOm5vbmUsc3Ryb2tlOm5vbmU7IiwibWVybWFpZCI6IntcbiAgXCJ0aGVtZVwiOiBcImRlZmF1bHRcIlxufSIsInVwZGF0ZUVkaXRvciI6ZmFsc2UsImF1dG9TeW5jIjp0cnVlLCJ1cGRhdGVEaWFncmFtIjpmYWxzZX0) - -
-mermaid graph syntax - ```mermaid flowchart BT subgraph live @@ -289,8 +281,6 @@ flowchart BT linkStyle 11 stroke-width:2px,fill:none,stroke:none; ``` -
- #### Adding "insertion" and "batch" events @@ -308,13 +298,6 @@ navigate the DAG. - We use `m.room.batch` events to indicate which `m.room.insertion` event it connects to by its `m.next_batch_id` field. -![](https://user-images.githubusercontent.com/558581/152883498-7acf0750-5742-47b3-8644-008f24f9396f.png) - -[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqNk92KwjAQhV8lBIQKdTW5WejCwsruE7h3xovYjDaYJpKkuiK--6atP7WtYi7a6eTM4Us6c8SpEYATvFJmn2bcejT9ZRqF5Yrl2vJthpTcQZ0q1xSNetYn-qoloEUdtEyW3KfZ5GZTf4-q13zOcBUwvFig0uy8S6OIYcrwcNjMkigi95lJFE3uM1I7sF4aPR9fQ8YWDcY-PtLiI0_5SC8f6fCRDh95ge_RHdIWI33KSHsZaYeRdhjpy3dYP9sXP-UO7gvRo55p9gIavfX8xa5Zo5I8q2zoaEdHmrrLcQYDxJVc6xy0D72vNw7dmVxLyvPcpORanfMNIJ9B22asw6g5JPVOOrlU55ly_qAaUoKqXamkPyQok0KAjlOjjE285dptuQ2qj7q2tJ1V9eQ9GFmzgdFeCp8ldPsXr6RSiTYa4nqrij9wjHOwOZciTP2x9GE4wObAcBJCASteKM8w06cgLbaCe_gR0huLkxVXDmLMC29mB53igFTARfQteWjT_Kw6_QOAPVu0) - -
-mermaid graph syntax - ```mermaid flowchart BT subgraph live @@ -347,8 +330,6 @@ flowchart BT linkStyle 17 stroke-width:2px,fill:none,stroke:none; ``` -
- The structure of the insertion event looks like: ```js @@ -460,14 +441,6 @@ The structure of the "marker" event looks like: } ``` -![](https://user-images.githubusercontent.com/558581/152883769-cfd1bd30-3c18-47a3-8631-3d51af540d1d.png) - - -[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqNlN9vwiAQx_8VQmJSk7oJb6uJycz2F2xv4gOW05JSMEDnjPF_H2390dpqdg_t9fje8QGOHnFqBOAEb5TZpxm3Hi2-mUbBXLneWr7LkJI_0IQqK7jNwZI5w43H8ApNgs3Ron53bI7em0zQonHuaq-5T7PprXrzPalfyyXDtROmqOaYX0ZpFDFMGR6P21ESRaQbmUbRtBuR2oH10ujl69VlbNViHOIjd3zkKR8Z5CM9PtLjI__ge7SH9I6RPmWkg4y0x0h7jPTfe9g8z-2CJi8Dp7Dg7txZQyPd-uhRa7Vb5vE0Q7DtwxzObOloT0fausuqRyPEldzqArQPN0fnDnWKXFOq9dyk5Jpd8ByQz-C-zKsOF9UhqX-kk2t13jfnD6olJagelUr6Q4IyKQToODXK2MRbrt2O26CaNblV2a86n7yFQtbkMNlL4bOE7n7jjVQq0UZD3AzV_gzHuABbcCnCP-NY1WE4wBbAcBJcARteKs8w06cgLXeCe_gU0huLkw1XDmLMS2--DjrFAamEi-hD8tDNxVl1-gNT2W_l) - -
-mermaid graph syntax - ```mermaid flowchart BT subgraph live @@ -501,8 +474,6 @@ flowchart BT linkStyle 19 stroke-width:2px,fill:none,stroke:none; ``` -
- #### Add in the historical state @@ -521,13 +492,6 @@ will not be resolved into the current state of the room, and it won't be visible in the chat history. This allows us to insert multiple batches without having a bunch of `@mxid joined the room` noise between each batch. -![](https://user-images.githubusercontent.com/558581/152884091-b4fe23e2-e019-4d05-af24-bbfb4f656b05.png) - -[Mermaid live editor playground link](https://mermaid-js.github.io/mermaid-live-editor/edit/#pako:eNqVlE1ugzAQha9izcqRkhR7yQIpUXuCdhdnYWBSUDGOjGlVRbl7DU5aftNkNh6Pn_0-DRpOkOgUIYRDob-STBpLtm-iJC6qOn438piRIv9EX2pCSfOBhkUCfCZgT1YuIrJt115EZONvYpn6ZPB2LG2SBX-v-_2qXXY7AW3iLBqP6HrKKRXABSwW3SqjlPUrAaVBv5KXFRqb63L39JsKse8wTvGxAR-7yccm-diIj4342B18cz3kA0Z-k5FPMvIRIx8x8rt66DfDppPV2j2jUMVotjoOqINTa6O1WvsiobGOFw7VG_ripsgTnNLK5qBVd-zYrB17wI7da8dn7fgDdvwfO294mTzvM-ztVlY43fXmpP-pyNyUdqdv3mbqu3fnYvpmr2lDHevqYAkKjZJ56v5Lp-aeAJuhQgGhS1M8yLqwAkR5dtL6mEqLL2lutYHwIIsKlyBrq1-_ywRCa2q8ip5z6SZGXVTnH4twitQ) - -
-mermaid graph syntax - ```mermaid flowchart BT subgraph live @@ -558,8 +522,6 @@ flowchart BT batch2-batch -.-> batch1-insertion ``` -
- From 65e5f7bcba29a563df18cb43ec88f529f50672f4 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 14 Apr 2022 02:03:17 -0500 Subject: [PATCH 23/68] Incorporate in feedback --- proposals/2716-batch-send-historical-messages.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index c223bb106f..c5ce221b32 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -107,8 +107,9 @@ those events. We would have to accept the redaction if it came over federation to avoid split-brained rooms. Because we also can't use the `historical` power level for controlling who can -send these events in the existing room version, we instead only allow the room -`creator` to send `m.room.insertion`, `m.room.batch`, and `m.room.marker` events. +send these events in the existing room version, we always persist but instead +only process and give meaning to the `m.room.insertion`, `m.room.batch`, and +`m.room.marker` events when the room `creator` sends them. @@ -180,7 +181,7 @@ Request response: // insertion event ID is returned in this field. // // When `?batch_id` is provided, this field is not present because we can hang - // the history off the insertione event specified associated by the batch ID. + // the history off the insertion event specified associated by the batch ID. 
"base_insertion_event_id": "$pmmaTamxhcyLrrOKSrJf3c1zNmfvsE5SGpFpgE_UvN0" } ``` @@ -433,8 +434,7 @@ The structure of the "marker" event looks like: "type": "m.room.marker", "sender": "@appservice:example.org", "content": { - "m.insertion_id": insertion_event.event_id, - "m.historical": true + "m.insertion_id": insertion_event.event_id }, "room_id": "!jEsUZKDJdhlrceRyVU:example.org", "origin_server_ts": 1626914158639, From d7cf78984f26cfdac9599d01c249c8abff3fa4fd Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 11 May 2022 20:19:42 -0500 Subject: [PATCH 24/68] Fix little mistakes --- proposals/2716-batch-send-historical-messages.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index c5ce221b32..7eb75ca7af 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -181,7 +181,7 @@ Request response: // insertion event ID is returned in this field. // // When `?batch_id` is provided, this field is not present because we can hang - // the history off the insertion event specified associated by the batch ID. + // the history off the insertion event specified and associated by the batch ID. "base_insertion_event_id": "$pmmaTamxhcyLrrOKSrJf3c1zNmfvsE5SGpFpgE_UvN0" } ``` @@ -194,7 +194,7 @@ the chat history which also allows us to insert multiple batches without having bunch of `@mxid joined the room` noise between each batch. **The state will not be resolved into the current state of the room.** -`events` is chronological list of events you want to insert. For Synapse, +`events` is a chronological list of events you want to insert. For Synapse, there is a reverse-chronological constraint on batches so once you insert one batch of messages, you can only insert older an older batch after that. 
**tldr; Insert from your most recent batch of history -> oldest history.** @@ -434,7 +434,7 @@ The structure of the "marker" event looks like: "type": "m.room.marker", "sender": "@appservice:example.org", "content": { - "m.insertion_id": insertion_event.event_id + "m.marker.insertion": insertion_event.event_id }, "room_id": "!jEsUZKDJdhlrceRyVU:example.org", "origin_server_ts": 1626914158639, From d016b7d997eb35170a540e145272c9cba7499370 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Fri, 3 Jun 2022 01:51:11 -0500 Subject: [PATCH 25/68] Emphasize has *all* history as it's the key differentiator for that statement compared to the rest in the list --- proposals/2716-batch-send-historical-messages.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 7eb75ca7af..21649b043b 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -377,8 +377,8 @@ To lay out the different types of servers consuming these historical messages - This pretty much works out of the box. It's possible to just add the historical events to the database and they're available. The new endpoint is just a mechanism to insert the events. - 1. Federated remote server that already has all scrollback history and then new - history is inserted + 1. Federated remote server that already has *all* scrollback history and then + new history is inserted - The big problem is how does a HS know it needs to go fetch more history if they already fetched all of the history in the room? 
We're solving this with "marker" events which are sent on the "live" timeline and point back From 2544a3f45d152fffc305ce218290af8550a20719 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Fri, 3 Jun 2022 01:59:24 -0500 Subject: [PATCH 26/68] Address markers being lost in timeline gaps (marker events as state) See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r782499674 --- .../2716-batch-send-historical-messages.md | 30 +++++++++---------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 21649b043b..e7add2c1ff 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -53,9 +53,9 @@ Here is what scrollback is expected to look like in Element: the DAG, we navigate from an insertion event to the batch event that points at it, up the historical messages to the insertion event, then repeat the process - - `m.room.marker`: Used to hint to homeservers (and potentially to cache bust - on clients) that there is new history back time that you should go fetch next - time someone scrolls back around the specified insertion event. + - `m.room.marker`: State event used to hint to homeservers (and potentially to + cache bust on clients) that there is new history back time that you should go + fetch next time someone scrolls back around the specified insertion event. **Content fields:** @@ -406,18 +406,17 @@ To lay out the different types of servers consuming these historical messages the scenario where the federated HS already has all the history in the room, so it won't do a full sync of the room again. 
- Unlike the historical events sent via `/batch_send`, **the "marker" event is - sent separately as a normal event on the "live" timeline** so that comes down - incremental sync and is available to all homeservers regardless of how much - scrollback history they already have. - - Note: If a server joins after a "marker" event is sent, it could be lost - in the middle of the timeline and they could jump back in time past the - "marker" and never pick it up. But `backfill/` response should have - historical messages included. It gets a bit hairy if the server has the - room backfilled, the user leaves, a "marker" event is sent, more messages - put it back in the timeline, the user joins back, jumps back in the - timeline and misses the "marker" and expects to see the historical - messages. They will be missing the historical messages until they can - backfill the gap where they left. + sent separately as a normal state event on the "live" timeline** so that + comes down incremental sync and is available to all homeservers regardless of + how much scrollback history they already have. And since it's state it never + gets lost in a timeline gap and is immediately aparent to all servers that + join. + - Also instead of overwriting the same generic `state_key: ""` over and over, + the expected behavior is send each "marker" event with a unique `state_key`. + This way all of the "markers" are discoverable in the current state without + us having to go through the chain of previous state to figure it all out. + This also avoids potential state resolution conflicts where only one of the + "marker" events win and we would lose the other chain history. - A "marker" event is not needed for every batch of historical messages added via `/batch_send`. 
Multiple batches can be inserted then once we're done importing everything, we can add one "marker" event pointing at the root @@ -432,6 +431,7 @@ The structure of the "marker" event looks like: ```js { "type": "m.room.marker", + "state_key": "", "sender": "@appservice:example.org", "content": { "m.marker.insertion": insertion_event.event_id From 7258f64b6056ffe59fd3133795d82db1f09e6b00 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Fri, 3 Jun 2022 15:32:31 -0500 Subject: [PATCH 27/68] Formatting --- .../2716-batch-send-historical-messages.md | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index e7add2c1ff..b540d7768f 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -53,9 +53,10 @@ Here is what scrollback is expected to look like in Element: the DAG, we navigate from an insertion event to the batch event that points at it, up the historical messages to the insertion event, then repeat the process - - `m.room.marker`: State event used to hint to homeservers (and potentially to - cache bust on clients) that there is new history back time that you should go - fetch next time someone scrolls back around the specified insertion event. + - `m.room.marker`: State event used to hint to homeservers that there is new + history back time that you should go fetch next time someone scrolls back + around the specified insertion event. Also used on clients to cache bust the + timeline. **Content fields:** @@ -367,8 +368,8 @@ The structure of the batch event looks like: #### Adding marker events -Finally, we add "marker" events into the mix so that federated remote servers -also know where in the DAG they should look for historical messages. 
+Finally, we add "marker" state events into the mix so that federated remote +servers also know where in the DAG they should look for historical messages. To lay out the different types of servers consuming these historical messages (more context on why we need "marker" events): @@ -381,9 +382,9 @@ To lay out the different types of servers consuming these historical messages new history is inserted - The big problem is how does a HS know it needs to go fetch more history if they already fetched all of the history in the room? We're solving this - with "marker" events which are sent on the "live" timeline and point back - to the "insertion" event where we inserted history next to. The HS can - then go and backfill the "insertion" event and continue navigating the + with "marker" state events which are sent on the "live" timeline and point + back to the "insertion" event where we inserted history next to. The HS + can then go and backfill the "insertion" event and continue navigating the historical batches from there. 1. Federated remote server that joins a new room with historical messages - The originating homeserver just needs to update the `/backfill` response From a828de3087dcc5522a21a65c6747cbe1b26971c8 Mon Sep 17 00:00:00 2001 From: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com> Date: Tue, 9 Aug 2022 17:07:01 +0100 Subject: [PATCH 28/68] Small typos and other fixes --- .../2716-batch-send-historical-messages.md | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index b540d7768f..3b4fa0cce6 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -51,10 +51,10 @@ Here is what scrollback is expected to look like in Element: historical messages - `m.room.batch`: This is what connects one historical batch to the other. 
In the DAG, we navigate from an insertion event to the batch event that points - at it, up the historical messages to the insertion event, then repeat the + at it, up the historical messages to the next insertion event, then repeat the process - `m.room.marker`: State event used to hint to homeservers that there is new - history back time that you should go fetch next time someone scrolls back + history that you should go fetch next time someone scrolls back around the specified insertion event. Also used on clients to cache bust the timeline. @@ -102,10 +102,10 @@ wouldn't be removed during redaction. In practice, this means: However, this MSC is mostly backwards compatible and can be used with the current room version with the fact that redactions aren't supported for `m.room.insertion`, `m.room.batch`, `m.room.marker` events. We can protect -people from this limitation by throwing an error when they try to use `PUT -/_matrix/client/r0/rooms/{roomId}/redact/{eventId}/{txnId}` to redact one of -those events. We would have to accept the redaction if it came over federation -to avoid split-brained rooms. +people from this limitation by throwing an error when they try to use [`PUT +/_matrix/client/v3/rooms/{roomId}/redact/{eventId}/{txnId}`](https://spec.matrix.org/v1.3/client-server-api/#put_matrixclientv3roomsroomidredacteventidtxnid) +to redact one of those events. We would have to accept the redaction if +it came over federation to avoid split-brained rooms. Because we also can't use the `historical` power level for controlling who can send these events in the existing room version, we always persist but instead @@ -197,7 +197,7 @@ be resolved into the current state of the room.** `events` is a chronological list of events you want to insert. For Synapse, there is a reverse-chronological constraint on batches so once you insert one -batch of messages, you can only insert older an older batch after that. 
**tldr; +batch of messages, you can only insert an older batch after that. **tldr; Insert from your most recent batch of history -> oldest history.** @@ -410,7 +410,7 @@ To lay out the different types of servers consuming these historical messages sent separately as a normal state event on the "live" timeline** so that comes down incremental sync and is available to all homeservers regardless of how much scrollback history they already have. And since it's state it never - gets lost in a timeline gap and is immediately aparent to all servers that + gets lost in a timeline gap and is immediately apparent to all servers that join. - Also instead of overwriting the same generic `state_key: ""` over and over, the expected behavior is send each "marker" event with a unique `state_key`. @@ -419,11 +419,11 @@ To lay out the different types of servers consuming these historical messages This also avoids potential state resolution conflicts where only one of the "marker" events win and we would lose the other chain history. - A "marker" event is not needed for every batch of historical messages added - via `/batch_send`. Multiple batches can be inserted then once we're done + via `/batch_send`. Multiple batches can be inserted. Then once we're done importing everything, we can add one "marker" event pointing at the root "insertion" event - If more history is decided to be added later, another "marker" can be sent to let the homeservers know again. - - When a remote federated homeserver, receives a "marker" event, it can mark + - When a remote federated homeserver receives a "marker" event, it can mark the "insertion" prev events as needing to backfill from that point again and can fetch the historical messages when the user scrolls back to that area in the future. 
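
The marker guidance in the patch above (one "marker" state event per import, each with a unique `state_key`, pointing at the root "insertion" event) can be sketched roughly as follows. This is an illustrative sketch, not part of the proposal: `build_marker_event` is a hypothetical helper, and reusing the insertion event ID as the `state_key` is an assumption made only to guarantee uniqueness.

```python
import json


def build_marker_event(insertion_event_id: str, sender: str) -> dict:
    """Build the body of an `m.room.marker` state event (illustrative only)."""
    return {
        "type": "m.room.marker",
        # Assumption: reuse the insertion event ID as the unique state_key so
        # markers for different insertion points never overwrite each other.
        "state_key": insertion_event_id,
        "sender": sender,
        "content": {
            # Points at the root "insertion" event of the imported history.
            "m.marker.insertion": insertion_event_id,
        },
    }


if __name__ == "__main__":
    marker = build_marker_event("$insertionRootEventId", "@appservice:example.org")
    print(json.dumps(marker, indent=2))
```

Keying each marker on its own insertion point means two imports never compete for the same slot in room state, which is the state resolution concern the text raises.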
From b2b5b54b41ad3b1781729d88f1ca348751ebaaac Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 9 Aug 2022 18:32:26 -0500 Subject: [PATCH 29/68] Use json5 See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r940825322 --- proposals/2716-batch-send-historical-messages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index b540d7768f..a36e0e3f7f 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -163,7 +163,7 @@ Request body: ``` Request response: -```jsonc +```json5 { // List of state event ID's we inserted "state_event_ids": [ From a7920fb84cb355e131c71900fda213609a2ccbdd Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 9 Aug 2022 20:54:45 -0500 Subject: [PATCH 30/68] Remove base graph See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941463379 --- .../2716-batch-send-historical-messages.md | 59 ++++--------------- 1 file changed, 11 insertions(+), 48 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index a36e0e3f7f..59744d5971 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -241,55 +241,10 @@ breakdown which incrementally explains how everything fits together. ### Room DAG breakdown -#### Basic batch structure +#### "insertion" and "batch" events -Here is the starting point for how the historical batch concept looks like in -the DAG. We're going to build from this in the next sections. 
- - - `A` is the oldest-in-time message - - `B` is the newest-in-time message - - `batch0` is the first batch we try to import - - Each batch of messages is older-in-time than the last (`batch1` is - older-in-time than `batch0`, etc) - -```mermaid -flowchart BT - subgraph live - B -------------> A - end - - subgraph batch0 - batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) - end - - subgraph batch1 - batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) - end - - subgraph batch2 - batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) - end - - - batch0-0 -------> A - batch1-0 --> A - batch2-0 --> A - - %% alignment links - batch0-0 --- batch1-2 - batch1-0 --- batch2-2 - %% make the links invisible - linkStyle 10 stroke-width:2px,fill:none,stroke:none; - linkStyle 11 stroke-width:2px,fill:none,stroke:none; -``` - - - -#### Adding "insertion" and "batch" events - -Next we add "insertion" and "batch" events so it's more prescriptive on how each -historical batch should connect to each other and how the homeserver can -navigate the DAG. +We use "insertion" and "batch" events to describe how each historical batch +should connect to each other and how the homeserver can navigate the DAG. - With "insertion" events, we just add them to the start of each chronological batch (where the oldest message in the batch is). The next older-in-time @@ -300,6 +255,14 @@ navigate the DAG. - We use `m.room.batch` events to indicate which `m.room.insertion` event it connects to by its `m.next_batch_id` field. 
+Here is how the historical batch concept looks like in the DAG: + + - `A` is the oldest-in-time message + - `B` is the newest-in-time message + - `batch0` is the first batch we try to import + - Each batch of messages is older-in-time than the last (`batch1` is + older-in-time than `batch0`, etc) + ```mermaid flowchart BT subgraph live From 73f4143e4725450e423ea0fb51440e7c2adffbd3 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 9 Aug 2022 22:42:51 -0500 Subject: [PATCH 31/68] ?ts is now specced See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941282190 --- .../2716-batch-send-historical-messages.md | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 59744d5971..b8d8372c61 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -20,10 +20,11 @@ This is currently not supported because: * There is no way to create messages in the context of historical room state in a room via CS or AS API - you can only create events relative to current room state. - * There is currently no way to override the timestamp on an event via the AS API. - (We used to have the concept of [timestamp - massaging](https://matrix.org/docs/spec/application_service/r0.1.2#timestamp-massaging), - but it never got properly specified) + * It is possible to override the timestamp with the `?ts` query parameter + ([timestamp + massaging](](https://spec.matrix.org/v1.3/application-service-api/#timestamp-massaging))) + using the AS API but the event will still be appended to the tip of the DAG. + It's not possible to change the DAG ordering with this. 
@@ -54,7 +55,7 @@ Here is what scrollback is expected to look like in Element: at it, up the historical messages to the insertion event, then repeat the process - `m.room.marker`: State event used to hint to homeservers that there is new - history back time that you should go fetch next time someone scrolls back + history back in time that you should go fetch next time someone scrolls back around the specified insertion event. Also used on clients to cache bust the timeline. @@ -79,8 +80,9 @@ to limit who can send "insertion", "batch", and "marker" events since someone can attempt to send them via the normal `/send` API (we don't want any nasty weird knots to reconcile either). - - `historical`: This controls who can send `m.room.insertion`, `m.room.batch`, - and `m.room.marker` in the room. + - `historical`: A new top-level field in the `content` dictionary of the room's + power levels, controlling who can send `m.room.insertion`, `m.room.batch`, + and `m.room.marker` events in the room. **Room version:** @@ -190,7 +192,7 @@ Request response: `state_events_at_start` is used to define the historical state events needed to auth the `events` like invite and join events. These events can float outside of -the normal DAG. In Synapse, these are called `outlier`'s and won't be visible in +the normal DAG. In Synapse, these are called `outlier`s and won't be visible in the chat history which also allows us to insert multiple batches without having a bunch of `@mxid joined the room` noise between each batch. 
**The state will not be resolved into the current state of the room.** From 92a765839d536e7e31d829b8a2e98c025a13a9dc Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 9 Aug 2022 22:45:03 -0500 Subject: [PATCH 32/68] Say explicit for current room version See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941378307 --- proposals/2716-batch-send-historical-messages.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index b8d8372c61..14a6f0ce40 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -112,7 +112,8 @@ to avoid split-brained rooms. Because we also can't use the `historical` power level for controlling who can send these events in the existing room version, we always persist but instead only process and give meaning to the `m.room.insertion`, `m.room.batch`, and -`m.room.marker` events when the room `creator` sends them. +`m.room.marker` events when the room `creator` sends them. This caveat/rule only +applies to existing room versions. From 1cf73951c1b0355004fcfa8bd6148d997e2970e9 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 9 Aug 2022 23:15:07 -0500 Subject: [PATCH 33/68] Say which query parameters are optional vs required See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941413667 --- proposals/2716-batch-send-historical-messages.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 14a6f0ce40..2865221fac 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -122,13 +122,15 @@ applies to existing room versions. 
Add a new endpoint, `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event_id=&batch_id=`, which can insert a batch of events historically back in time next to the given -`prev_event_id`. This endpoint can only be used by application services. - -This endpoint will handle the complexity of creating "insertion" and "batch" -events. All the application service has to do is use `?batch_id` which comes -from `next_batch_id` in the response of the batch send endpoint. `next_batch_id` -is derived from the insertion events added to each batch and is not required for -the first batch send. +`?prev_event_id` (required). This endpoint can only be used by application +services. `?batch_id` is optional and only necessary to connect the current +batch to the previous. + +This endpoint handles the complexity of creating "insertion" and "batch" events. +All the application service has to do is use `?batch_id` which comes from +`next_batch_id` in the response of the batch send endpoint to connect batches +together. `next_batch_id` is derived from the insertion events added to each +batch. 
Request body: ```json From 2433dfac660843e1f31525420568b3b70acd8685 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 9 Aug 2022 23:29:23 -0500 Subject: [PATCH 34/68] "Live timeline" so it's obvious See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941428544 --- proposals/2716-batch-send-historical-messages.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 2865221fac..a46fe005dd 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -123,8 +123,8 @@ Add a new endpoint, `POST /_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event_id=&batch_id=`, which can insert a batch of events historically back in time next to the given `?prev_event_id` (required). This endpoint can only be used by application -services. `?batch_id` is optional and only necessary to connect the current -batch to the previous. +services. `?batch_id` is not required for the first batch send and is only +necessary to connect the current batch to the previous. This endpoint handles the complexity of creating "insertion" and "batch" events. All the application service has to do is use `?batch_id` which comes from @@ -202,7 +202,7 @@ be resolved into the current state of the room.** `events` is a chronological list of events you want to insert. For Synapse, there is a reverse-chronological constraint on batches so once you insert one -batch of messages, you can only insert older an older batch after that. **tldr; +batch of messages, you can only insert an older batch after that. **tldr; Insert from your most recent batch of history -> oldest history.** @@ -214,7 +214,7 @@ This section explains the homeserver magic that happens when someone uses the breakdown which incrementally explains how everything fits together. 1. 
An "insertion" event for the batch is added to the start of the batch. - This is the starting point of the next batch and holds the `next_batch_id` + This will be the starting point of the next batch and holds the `next_batch_id` that we return in the batch send response. The application service passes this as `?batch_id` 1. A "batch" event is added to the end of the batch. This is the event that @@ -270,7 +270,7 @@ Here is how the historical batch concept looks like in the DAG: ```mermaid flowchart BT - subgraph live + subgraph live timeline B --------------------> A end @@ -412,7 +412,7 @@ The structure of the "marker" event looks like: ```mermaid flowchart BT - subgraph live + subgraph live timeline marker1>"marker"] ----> B -----------------> A end @@ -463,7 +463,7 @@ bunch of `@mxid joined the room` noise between each batch. ```mermaid flowchart BT - subgraph live + subgraph live timeline marker1>"marker"] ----> B -----------------> A end From efbee43bf08f792a548a931a8b047342a7857275 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 9 Aug 2022 23:34:09 -0500 Subject: [PATCH 35/68] Fix stable versions See: - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r940824921 - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941521259 --- proposals/2716-batch-send-historical-messages.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index a46fe005dd..7a223f90d5 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -44,7 +44,7 @@ Here is what scrollback is expected to look like in Element: **Endpoint:** - - `POST /_matrix/client/r0/rooms//batch_send?prev_event_id=&batch_id=` + - `POST /_matrix/client/v1/rooms//batch_send?prev_event_id=&batch_id=` **Event types:** @@ -120,7 +120,7 @@ applies to existing room versions. 
### New historical batch send endpoint Add a new endpoint, `POST -/_matrix/client/unstable/org.matrix.msc2716/rooms//batch_send?prev_event_id=&batch_id=`, +/_matrix/client/v1/org.matrix.msc2716/rooms//batch_send?prev_event_id=&batch_id=`, which can insert a batch of events historically back in time next to the given `?prev_event_id` (required). This endpoint can only be used by application services. `?batch_id` is not required for the first batch send and is only From b4ba8c41f895daafac52eef048c50449372c0472 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 9 Aug 2022 23:35:12 -0500 Subject: [PATCH 36/68] Feature needs to be true See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941530850 --- proposals/2716-batch-send-historical-messages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 7a223f90d5..4d09ccabc3 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -569,7 +569,7 @@ via SS API. ## Unstable prefix -Servers will indicate support for the new endpoint via a non-empty value for feature flag +Servers will indicate support for the new endpoint via a `true` value for feature flag `org.matrix.msc2716` in `unstable_features` in the response to `GET /_matrix/client/versions`. 
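
A client-side check for the feature flag described above, together with building the request URL, might look like the sketch below. It assumes the unstable path quoted earlier in this series and hypothetical helper names (`supports_batch_send`, `batch_send_url`); nothing here is mandated by the proposal.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Assumption: the unstable path quoted earlier in this patch series.
BATCH_SEND_PATH = (
    "/_matrix/client/unstable/org.matrix.msc2716/rooms/{room_id}/batch_send"
)


def supports_batch_send(versions_response: dict) -> bool:
    """True only when the flag is literally `true`, not merely non-empty."""
    features = versions_response.get("unstable_features", {})
    return features.get("org.matrix.msc2716") is True


def batch_send_url(
    base_url: str,
    room_id: str,
    prev_event_id: str,
    batch_id: Optional[str] = None,
) -> str:
    """Build the batch send URL; `batch_id` is left out for the first batch."""
    query = {"prev_event_id": prev_event_id}
    if batch_id is not None:
        query["batch_id"] = batch_id
    path = BATCH_SEND_PATH.format(room_id=quote(room_id, safe=""))
    return f"{base_url.rstrip('/')}{path}?{urlencode(query)}"
```

Checking with `is True` follows the wording change in this patch: a non-empty but non-boolean value is not treated as support.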
From 7cde5cd5a71ad005d3e817b675bc845efa9c09ca Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 9 Aug 2022 23:41:41 -0500 Subject: [PATCH 37/68] More clear phrasing See: - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941451745 - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941454016 --- proposals/2716-batch-send-historical-messages.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 4d09ccabc3..fafcfc80d0 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -253,12 +253,13 @@ should connect to each other and how the homeserver can navigate the DAG. - With "insertion" events, we just add them to the start of each chronological batch (where the oldest message in the batch is). The next older-in-time - batch can connect to that "insertion" point from the previous batch. + batch can connect to that "insertion" event from the previous batch. - The initial base "insertion" event could be from the main DAG or we can - create it ad-hoc in the first batch so the homeserver can start traversing up - the batch from there after a "marker" event points to it. - - We use `m.room.batch` events to indicate which `m.room.insertion` event it - connects to by its `m.next_batch_id` field. + create it ad-hoc in the first batch. In the latter case, a "marker" event + (detailed below) inserted into the main DAG can be used to point to the new + "insertion" event. + - `m.room.batch` events have a `m.next_batch_id` field which is used to indicate the + `m.room.insertion` event that the batch connects to. 
Here is how the historical batch concept looks like in the DAG: From 8a50cbf1f188208facc813d12430c00fcda5c1b8 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 10 Aug 2022 00:01:05 -0500 Subject: [PATCH 38/68] Add annotation that older events are at the top of the graphs See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941502629 --- proposals/2716-batch-send-historical-messages.md | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index fafcfc80d0..3efcb9c602 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -271,7 +271,8 @@ Here is how the historical batch concept looks like in the DAG: ```mermaid flowchart BT - subgraph live timeline + A --- annotation1>"Note: older events are at the top"] + subgraph live [live timeline] B --------------------> A end @@ -298,7 +299,9 @@ flowchart BT batch2-insertion --- alignment1 %% make the alignment links/nodes invisible style alignment1 visibility: hidden,color:transparent; - linkStyle 17 stroke-width:2px,fill:none,stroke:none; + linkStyle 18 stroke-width:2px,fill:none,stroke:none; + %% make the annotation links invisible + linkStyle 0 stroke-width:2px,fill:none,stroke:none; ``` @@ -413,6 +416,7 @@ The structure of the "marker" event looks like: ```mermaid flowchart BT + A --- annotation1>"Note: older events are at the top"] subgraph live timeline marker1>"marker"] ----> B -----------------> A end @@ -441,7 +445,9 @@ flowchart BT batch2-insertion --- alignment1 %% make the alignment links/nodes invisible style alignment1 visibility: hidden,color:transparent; - linkStyle 19 stroke-width:2px,fill:none,stroke:none; + linkStyle 20 stroke-width:2px,fill:none,stroke:none; + %% make the annotation links invisible + linkStyle 0 stroke-width:2px,fill:none,stroke:none; ``` @@ -464,6 +470,7 @@ bunch of 
`@mxid joined the room` noise between each batch. ```mermaid flowchart BT + A --- annotation1>"Note: older events are at the top"] subgraph live timeline marker1>"marker"] ----> B -----------------> A end @@ -490,6 +497,9 @@ flowchart BT batch0-batch -.-> batch0-insertionBase[/insertion\] batch1-batch -.-> batch0-insertion batch2-batch -.-> batch1-insertion + + %% make the annotation links invisible + linkStyle 0 stroke-width:2px,fill:none,stroke:none; ``` From 8286ca4dcd5a7891d20edb791679f4153571f9a4 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 10 Aug 2022 00:07:40 -0500 Subject: [PATCH 39/68] A --> B is just where you want to import between See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r942020934 --- proposals/2716-batch-send-historical-messages.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 3efcb9c602..5ec738a6d7 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -263,6 +263,7 @@ should connect to each other and how the homeserver can navigate the DAG. Here is how the historical batch concept looks like in the DAG: + - `A --> B` is any point in the DAG that we want to import between. 
- `A` is the oldest-in-time message - `B` is the newest-in-time message - `batch0` is the first batch we try to import From 850e2f19ef2ed9989553daad885316fea174cba8 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 10 Aug 2022 00:10:34 -0500 Subject: [PATCH 40/68] More than one reason for new room version See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941446114 --- proposals/2716-batch-send-historical-messages.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 5ec738a6d7..a5ae407678 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -86,7 +86,9 @@ weird knots to reconcile either). **Room version:** -The redaction algorithm changes are the only hard requirement for a new room +The new `historical` power level necessitates a new room version (changes the structure of `m.room.power_levels`). + +The redaction algorithm changes is also hard requirement for a new room version because we need to make sure when redacting, we only strip out fields without affecting anything at the protocol level. 
This means that we need to keep all of the structural fields that allow us to navigate the batches of From 3332ca86532f6839ed0b6d6e649e7a8de2cc35d8 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 10 Aug 2022 00:19:24 -0500 Subject: [PATCH 41/68] unioned with the state at the prev_event_id See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941499364 --- proposals/2716-batch-send-historical-messages.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index a5ae407678..bc58a6a1ad 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -194,13 +194,13 @@ Request response: } ``` - -`state_events_at_start` is used to define the historical state events needed to -auth the `events` like invite and join events. These events can float outside of -the normal DAG. In Synapse, these are called `outlier`s and won't be visible in -the chat history which also allows us to insert multiple batches without having a -bunch of `@mxid joined the room` noise between each batch. **The state will not -be resolved into the current state of the room.** +`state_events_at_start` is unioned with the state at the `prev_event_id` and is +used to define the historical state events needed to auth the `events` like +invite and join events. These events can float outside of the normal DAG. In +Synapse, these are called `outlier`s and won't be visible in the chat history +which also allows us to insert multiple batches without having a bunch of `@mxid +joined the room` noise between each batch. **The state will not be resolved into +the current state of the room.** `events` is a chronological list of events you want to insert. 
For Synapse, there is a reverse-chronological constraint on batches so once you insert one From 36681939dc077e8bac6dd0c487658922fd3f3680 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 10 Aug 2022 00:44:59 -0500 Subject: [PATCH 42/68] State events are allowed See: - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941498769 - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941499364 --- proposals/2716-batch-send-historical-messages.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index bc58a6a1ad..a12aad5877 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -202,10 +202,12 @@ which also allows us to insert multiple batches without having a bunch of `@mxid joined the room` noise between each batch. **The state will not be resolved into the current state of the room.** -`events` is a chronological list of events you want to insert. For Synapse, -there is a reverse-chronological constraint on batches so once you insert one -batch of messages, you can only insert an older batch after that. **tldr; -Insert from your most recent batch of history -> oldest history.** +`events` is a chronological list of events you want to insert. It's possible to +also include `state_events` here which will be used to auth further events in +the batch. For Synapse, there is a reverse-chronological constraint on batches +so once you insert one batch of messages, you can only insert an older batch +after that. **tldr; Insert from your most recent batch of history -> oldest +history.** #### What does the batch send endpoint do behind the scenes? 
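
The ordering rule from the patch above (each batch is chronological internally, but batches are sent from most recent to oldest, chained via `next_batch_id`) could be planned as in this sketch. The names `plan_batches`, `import_history`, and `send_batch` are hypothetical, and sorting on `origin_server_ts` is an assumption about how the application service orders its events.

```python
from typing import Callable, List, Optional


def plan_batches(events: List[dict], batch_size: int) -> List[List[dict]]:
    """Split history into batches, newest batch first.

    Events inside each batch stay in chronological (oldest -> newest) order.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    ordered = sorted(events, key=lambda e: e["origin_server_ts"])
    batches = []
    end = len(ordered)
    while end > 0:
        start = max(0, end - batch_size)
        batches.append(ordered[start:end])
        end = start
    return batches


def import_history(
    events: List[dict],
    batch_size: int,
    send_batch: Callable[[List[dict], Optional[str]], str],
) -> Optional[str]:
    """Send batches newest-first, feeding each `next_batch_id` into the next call.

    `send_batch(batch, batch_id)` is a hypothetical callable that performs the
    batch send request and returns the `next_batch_id` from the response.
    """
    batch_id = None  # not needed for the first batch
    for batch in plan_batches(events, batch_size):
        batch_id = send_batch(batch, batch_id)
    return batch_id
```

Passing the sender in as a callable keeps the ordering logic independent of any particular HTTP client.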
From 1d3f562002c7f784ac98c409e8d5130789689910 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 10 Aug 2022 01:04:15 -0500 Subject: [PATCH 43/68] Unsaved merge conflict --- proposals/2716-batch-send-historical-messages.md | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index fb596628c4..261ebfb604 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -202,26 +202,12 @@ which also allows us to insert multiple batches without having a bunch of `@mxid joined the room` noise between each batch. **The state will not be resolved into the current state of the room.** -<<<<<<< HEAD `events` is a chronological list of events you want to insert. It's possible to also include `state_events` here which will be used to auth further events in the batch. For Synapse, there is a reverse-chronological constraint on batches so once you insert one batch of messages, you can only insert an older batch after that. **tldr; Insert from your most recent batch of history -> oldest history.** -======= -`state_events_at_start` is used to define the historical state events needed to -auth the `events` like invite and join events. These events can float outside of -the normal DAG. In Synapse, these are called `outlier`'s and won't be visible in -the chat history which also allows us to insert multiple batches without having a -bunch of `@mxid joined the room` noise between each batch. **The state will not -be resolved into the current state of the room.** - -`events` is a chronological list of events you want to insert. For Synapse, -there is a reverse-chronological constraint on batches so once you insert one -batch of messages, you can only insert an older batch after that. 
**tldr; -Insert from your most recent batch of history -> oldest history.** ->>>>>>> a828de3087dcc5522a21a65c6747cbe1b26971c8 #### What does the batch send endpoint do behind the scenes? From e593c209d248a866b89997364ad66010c0d3a916 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 10 Aug 2022 01:05:55 -0500 Subject: [PATCH 44/68] More clear image See: - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r940824756 - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941460116 --- .../2716-batch-send-historical-messages.md | 2 +- .../images/2716-message-scrollback-example.png | Bin 0 -> 29696 bytes 2 files changed, 1 insertion(+), 1 deletion(-) create mode 100644 proposals/images/2716-message-scrollback-example.png diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 261ebfb604..f36f8fc344 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -37,7 +37,7 @@ would if they were sent back at that time. 
Here is what scrollback is expected to look like in Element:
-![](https://user-images.githubusercontent.com/558581/119064795-cae7e380-b9a1-11eb-9366-5e1f5e6370a8.png)
+![Two historical batches in between some existing messages](./images/2716-message-scrollback-example.png)
### Overview
diff --git a/proposals/images/2716-message-scrollback-example.png b/proposals/images/2716-message-scrollback-example.png
new file mode 100644
index 0000000000000000000000000000000000000000..e59a8edf094d99ea1bd88dc3414455f95eec7f5f
GIT binary patch
literal 29696
[binary image data for proposals/images/2716-message-scrollback-example.png omitted]
From 2c465471750ccdcfbaefe215106559ed7a92c0ab Mon Sep 17 00:00:00
2001 From: Eric Eastwood Date: Wed, 10 Aug 2022 01:08:45 -0500 Subject: [PATCH 45/68] Just use event types when referring to them See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941444583 --- .../2716-batch-send-historical-messages.md | 92 +++++++++---------- 1 file changed, 46 insertions(+), 46 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index f36f8fc344..1379ffd060 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -64,7 +64,7 @@ Here is what scrollback is expected to look like in Element: - `m.historical` (`[true|false]`): Used on any event to indicate that it was historically imported after the fact - `m.next_batch_id` (`string`): This is a random unique string for a - `m.room.insertion` event to indicate what ID the next "batch" event should + `m.room.insertion` event to indicate what ID the next `m.room.batch` event should specify in order to connect to it - `m.batch_id` (`string`): Used on `m.room.batch` events to indicate which `m.room.insertion` event it connects to by its `m.next_batch_id` field @@ -76,7 +76,7 @@ Here is what scrollback is expected to look like in Element: Since events being silently sent in the past is hard to moderate, it will probably be good to limit who can add historical messages to the timeline. The batch send endpoint is already limited to application services but we also need -to limit who can send "insertion", "batch", and "marker" events since someone +to limit who can send `m.room.insertion`, `m.room.batch`, and `m.room.marker` events since someone can attempt to send them via the normal `/send` API (we don't want any nasty weird knots to reconcile either). @@ -128,7 +128,7 @@ which can insert a batch of events historically back in time next to the given services. 
`?batch_id` is not required for the first batch send and is only necessary to connect the current batch to the previous. -This endpoint handles the complexity of creating "insertion" and "batch" events. +This endpoint handles the complexity of creating `m.room.insertion` and `m.room.batch` events. All the application service has to do is use `?batch_id` which comes from `next_batch_id` in the response of the batch send endpoint to connect batches together. `next_batch_id` is derived from the insertion events added to each @@ -213,18 +213,18 @@ history.** #### What does the batch send endpoint do behind the scenes? This section explains the homeserver magic that happens when someone uses the -`/batch_send` endpoint. If you're just trying to understand how the "insertion", -"batch", "marker" events work, you might want to just skip down to the room DAG +`/batch_send` endpoint. If you're just trying to understand how the `m.room.insertion`, +`m.room.batch`, `m.room.marker` events work, you might want to just skip down to the room DAG breakdown which incrementally explains how everything fits together. - 1. An "insertion" event for the batch is added to the start of the batch. + 1. An `m.room.insertion` event for the batch is added to the start of the batch. This will be the starting point of the next batch and holds the `next_batch_id` that we return in the batch send response. The application service passes this as `?batch_id` - 1. A "batch" event is added to the end of the batch. This is the event that + 1. A `m.room.batch` event is added to the end of the batch. This is the event that connects to an insertion event by `?batch_id`. 1. If `?batch_id` is not specified (usually only for the first batch), create a - base "insertion" event as a jumping off point from `?prev_event_id` which can + base `m.room.insertion` event as a jumping off point from `?prev_event_id` which can be added to the end of the `events` list in the response. 1. 
All of the events in the historical batch get a content field, `"m.historical": true`, to indicate that they are historical at the point of @@ -250,18 +250,18 @@ breakdown which incrementally explains how everything fits together. ### Room DAG breakdown -#### "insertion" and "batch" events +#### `m.room.insertion` and `m.room.batch` events -We use "insertion" and "batch" events to describe how each historical batch +We use `m.room.insertion` and `m.room.batch` events to describe how each historical batch should connect to each other and how the homeserver can navigate the DAG. - - With "insertion" events, we just add them to the start of each chronological + - With `m.room.insertion` events, we just add them to the start of each chronological batch (where the oldest message in the batch is). The next older-in-time - batch can connect to that "insertion" event from the previous batch. - - The initial base "insertion" event could be from the main DAG or we can - create it ad-hoc in the first batch. In the latter case, a "marker" event + batch can connect to that `m.room.insertion` event from the previous batch. + - The initial base `m.room.insertion` event could be from the main DAG or we can + create it ad-hoc in the first batch. In the latter case, a `m.room.marker` event (detailed below) inserted into the main DAG can be used to point to the new - "insertion" event. + `m.room.insertion` event. - `m.room.batch` events have a `m.next_batch_id` field which is used to indicate the `m.room.insertion` event that the batch connects to. 
@@ -282,15 +282,15 @@ flowchart BT end subgraph batch0 - batch0-batch[["batch"]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] + batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] end subgraph batch1 - batch1-batch[["batch"]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] + batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] end subgraph batch2 - batch2-batch[["batch"]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] + batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] end @@ -345,11 +345,11 @@ The structure of the batch event looks like: #### Adding marker events -Finally, we add "marker" state events into the mix so that federated remote +Finally, we add `m.room.marker` state events into the mix so that federated remote servers also know where in the DAG they should look for historical messages. To lay out the different types of servers consuming these historical messages -(more context on why we need "marker" events): +(more context on why we need `m.room.marker` events): 1. Local server - This pretty much works out of the box. It's possible to just add the @@ -359,9 +359,9 @@ To lay out the different types of servers consuming these historical messages new history is inserted - The big problem is how does a HS know it needs to go fetch more history if they already fetched all of the history in the room? We're solving this - with "marker" state events which are sent on the "live" timeline and point - back to the "insertion" event where we inserted history next to. 
The HS - can then go and backfill the "insertion" event and continue navigating the + with `m.room.marker` state events which are sent on the "live" timeline and point + back to the `m.room.insertion` event where we inserted history next to. The HS + can then go and backfill the `m.room.insertion` event and continue navigating the historical batches from there. 1. Federated remote server that joins a new room with historical messages - The originating homeserver just needs to update the `/backfill` response @@ -371,41 +371,41 @@ To lay out the different types of servers consuming these historical messages has all history, see scenario 2, if doesn't, see scenario 3. 1. For federated servers already in the room that haven't implemented MSC2716 - Those homeservers won't have historical messages available because they're - unable to navigate the "marker"/"insertion"/"batch" events. But the + unable to navigate the `m.room.marker`/`m.room.insertion`/`m.room.batch` events. But the historical messages would be available once the HS implements MSC2716 and - processes the "marker" events that point to the history. + processes the `m.room.marker` events that point to the history. --- - - A "marker" event simply points back to an "insertion" event. - - The "marker" event solves the problem of, how does a federated homeserver + - A `m.room.marker` event simply points back to an `m.room.insertion` event. + - The `m.room.marker` event solves the problem of, how does a federated homeserver know about the historical events which won't come down incremental sync? And the scenario where the federated HS already has all the history in the room, so it won't do a full sync of the room again. 
- - Unlike the historical events sent via `/batch_send`, **the "marker" event is + - Unlike the historical events sent via `/batch_send`, **the `m.room.marker` event is sent separately as a normal state event on the "live" timeline** so that comes down incremental sync and is available to all homeservers regardless of how much scrollback history they already have. And since it's state it never gets lost in a timeline gap and is immediately apparent to all servers that join. - Also instead of overwriting the same generic `state_key: ""` over and over, - the expected behavior is send each "marker" event with a unique `state_key`. + the expected behavior is send each `m.room.marker` event with a unique `state_key`. This way all of the "markers" are discoverable in the current state without us having to go through the chain of previous state to figure it all out. This also avoids potential state resolution conflicts where only one of the - "marker" events win and we would lose the other chain history. - - A "marker" event is not needed for every batch of historical messages added + `m.room.marker` events win and we would lose the other chain history. + - A `m.room.marker` event is not needed for every batch of historical messages added via `/batch_send`. Multiple batches can be inserted. Then once we're done - importing everything, we can add one "marker" event pointing at the root - "insertion" event - - If more history is decided to be added later, another "marker" can be sent to let the homeservers know again. - - When a remote federated homeserver receives a "marker" event, it can mark - the "insertion" prev events as needing to backfill from that point again and + importing everything, we can add one `m.room.marker` event pointing at the root + `m.room.insertion` event + - If more history is decided to be added later, another `m.room.marker` can be sent to let the homeservers know again. 
+ - When a remote federated homeserver receives a `m.room.marker` event, it can mark + the `m.room.insertion` prev events as needing to backfill from that point again and can fetch the historical messages when the user scrolls back to that area in the future. -The structure of the "marker" event looks like: +The structure of the `m.room.marker` event looks like: ```js { "type": "m.room.marker", @@ -423,19 +423,19 @@ The structure of the "marker" event looks like: flowchart BT A --- annotation1>"Note: older events are at the top"] subgraph live timeline - marker1>"marker"] ----> B -----------------> A + marker1>`m.room.marker`] ----> B -----------------> A end subgraph batch0 - batch0-batch[["batch"]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] + batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] end subgraph batch1 - batch1-batch[["batch"]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] + batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] end subgraph batch2 - batch2-batch[["batch"]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] + batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] end @@ -477,19 +477,19 @@ bunch of `@mxid joined the room` noise between each batch. 
flowchart BT A --- annotation1>"Note: older events are at the top"] subgraph live timeline - marker1>"marker"] ----> B -----------------> A + marker1>`m.room.marker`] ----> B -----------------> A end subgraph batch0 - batch0-batch[["batch"]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] + batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] end subgraph batch1 - batch1-batch[["batch"]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] + batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] end subgraph batch2 - batch2-batch[["batch"]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] + batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] end @@ -573,7 +573,7 @@ However, this feels needlessly complicated if the DAG approach is sufficient. ## Security considerations -The "insertion" and "batch" events add a new way for an application service to +The `m.room.insertion` and `m.room.batch` events add a new way for an application service to tie the batch reconciliation in knots(similar to the DAG knots that can happen) which can potentially DoS message and backfill navigation on the server. 
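The way the three event types renamed in this patch point at one another can be sketched as below. This is a minimal illustration, not Synapse's implementation: events are plain dictionaries, `events_by_id` and the two helper functions are hypothetical, and the content field names (`m.marker.insertion`, `m.next_batch_id`, `m.batch_id`) are the ones used at this revision of the proposal (a later patch in the series drops the `m.` prefix).

```python
from typing import Dict, List, Optional

# Content field names as of this revision of the proposal (assumption:
# adjust them if you follow the later rename that drops the "m." prefix).
MARKER_INSERTION = "m.marker.insertion"
NEXT_BATCH_ID = "m.next_batch_id"
BATCH_ID = "m.batch_id"


def insertion_for_marker(marker: dict, events_by_id: Dict[str, dict]) -> Optional[dict]:
    """Return the m.room.insertion event a marker points at, if it is known."""
    target = events_by_id.get(marker["content"].get(MARKER_INSERTION))
    if target is not None and target["type"] == "m.room.insertion":
        return target
    return None


def batch_for_insertion(insertion: dict, events: List[dict]) -> Optional[dict]:
    """Return the m.room.batch event whose batch ID matches the insertion event."""
    wanted = insertion["content"].get(NEXT_BATCH_ID)
    if wanted is None:
        return None
    for event in events:
        if event["type"] == "m.room.batch" and event["content"].get(BATCH_ID) == wanted:
            return event
    return None
```

Looking up the marker's target first and then matching batch IDs mirrors the navigation described in the proposal: a marker leads to an insertion event, and the batch event that connects to it is found by comparing IDs rather than by following `prev_events`.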
From f60c2338a5f7b44cedba43ee4fa7a3e9ef6e3cbd Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 10 Aug 2022 01:12:42 -0500 Subject: [PATCH 46/68] Self-referential batches descoped to another MSC See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941485335 --- proposals/2716-batch-send-historical-messages.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 1379ffd060..c3e9e35c00 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -209,6 +209,11 @@ so once you insert one batch of messages, you can only insert an older batch after that. **tldr; Insert from your most recent batch of history -> oldest history.** +One aspect that isn't solved yet is how to handle relations/annotations (such as +reactions, replies, and threaded conversations) that reference each other within +the same `events` batch because the events don't have `event_ids` to reference +before being persisted. A solution for this can be proposed in another MSC. + #### What does the batch send endpoint do behind the scenes? 
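The ordering rule quoted in this hunk (send the most recent batch of history first, then progressively older ones) can be sketched as follows. This is an illustrative helper, not part of the proposal: `batches_newest_first` and `batch_size` are hypothetical names, and the input is assumed to be a list already sorted oldest to newest. Each batch keeps its own events in chronological order, which is how the proposal describes the `events` payload elsewhere.

```python
from typing import Iterator, List


def batches_newest_first(events: List[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield batches in sending order: the most recent batch first.

    `events` is assumed to be sorted oldest -> newest. Every yielded batch
    keeps its events in chronological order.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    end = len(events)
    while end > 0:
        start = max(0, end - batch_size)
        yield events[start:end]
        end = start
```

Slicing from the end of the list means the oldest batch is the one that may be shorter than `batch_size`, so the batches adjacent to the live timeline are always full.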
From b081ec73b56c583328eb88ae9cda411c2d06aa5a Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 11 Aug 2022 20:54:05 -0500 Subject: [PATCH 47/68] Re-arrange to explain events and fields in table Based on: - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941411338 - https://github.com/matrix-org/matrix-spec-proposals/blob/43d224accc81a8335305d22790f7eb1d61f282f8/proposals/3672-ephemeral-location-streaming.md#proposal --- .../2716-batch-send-historical-messages.md | 225 +++++++++--------- 1 file changed, 118 insertions(+), 107 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index c3e9e35c00..7b2ac202e8 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -28,11 +28,12 @@ This is currently not supported because: + ## Proposal -### Expectation +## Expectation -Historical messages that we insert should appear in the timeline just like they +Historical messages that we import should appear in the timeline just like they would if they were sent back at that time. Here is what scrollback is expected to look like in Element: @@ -40,83 +41,92 @@ Here is what scrollback is expected to look like in Element: ![Two historical batches in between some existing messages](./images/2716-message-scrollback-example.png) -### Overview +### Any event -**Endpoint:** +key | type | description | Required +--- | --- | --- | --- | --- +`m.historical` | bool | `true` | Used on any event to hint that it was historically imported after the fact. This field should just be omitted if `false`. | no - - `POST /_matrix/client/v1/rooms//batch_send?prev_event_id=&batch_id=` -**Event types:** +### `m.room.insertion` - - `m.room.insertion`: Events that mark points in time where you can insert - historical messages - - `m.room.batch`: This is what connects one historical batch to the other. 
In - the DAG, we navigate from an insertion event to the batch event that points - at it, up the historical messages to the next insertion event, then repeat the - process - - `m.room.marker`: State event used to hint to homeservers that there is new - history back in time that you should go fetch next time someone scrolls back - around the specified insertion event. Also used on clients to cache bust the - timeline. +Events that mark points in time where you can insert historical messages. -**Content fields:** +**`m.room.insertion` event `content` field definitions:** - - `m.historical` (`[true|false]`): Used on any event to indicate that it was - historically imported after the fact - - `m.next_batch_id` (`string`): This is a random unique string for a - `m.room.insertion` event to indicate what ID the next `m.room.batch` event should - specify in order to connect to it - - `m.batch_id` (`string`): Used on `m.room.batch` events to indicate which - `m.room.insertion` event it connects to by its `m.next_batch_id` field - - `m.marker.insertion` (another `event_id` string): For `m.room.marker` events - to point at an `m.room.insertion` event by `event_id` +key | type | value | description | required +--- | --- | --- | --- | --- +`m.next_batch_id` | string | randomly generated string | This is a random unique string that the next `m.room.batch` event should specify in order to connect to it. | yes -**Power level:** +A full example of the `m.room.insertion` event: +```json5 +{ + "type": "m.room.insertion", + "sender": "@appservice:example.org", + "content": { + "m.next_batch_id": next_batch_id, + "m.historical": true + }, + "room_id": "!jEsUZKDJdhlrceRyVU:example.org", + // Doesn't affect much but good to use the same time as the closest event + "origin_server_ts": 1626914158639 +} +``` -Since events being silently sent in the past is hard to moderate, it will -probably be good to limit who can add historical messages to the timeline. 
The -batch send endpoint is already limited to application services but we also need -to limit who can send `m.room.insertion`, `m.room.batch`, and `m.room.marker` events since someone -can attempt to send them via the normal `/send` API (we don't want any nasty -weird knots to reconcile either). +### `m.room.batch` - - `historical`: A new top-level field in the `content` dictionary of the room's - power levels, controlling who can send `m.room.insertion`, `m.room.batch`, - and `m.room.marker` events in the room. +This is what connects one historical batch to the other. In the DAG, we navigate +from an insertion event to the batch event that points at it, up the historical +messages to the next insertion event, then repeat the process. -**Room version:** +**`m.room.batch` event `content` field definitions:** -The new `historical` power level necessitates a new room version (changes the structure of `m.room.power_levels`). +key | type | value | description | required +--- | --- | --- | --- | --- +`m.batch_id` | string | A batch ID from an insertion event | Used to indicate which `m.room.insertion` event it connects to by its `m.next_batch_id` field. | yes -The redaction algorithm changes is also hard requirement for a new room -version because we need to make sure when redacting, we only strip out fields -without affecting anything at the protocol level. This means that we need to -keep all of the structural fields that allow us to navigate the batches of -history in the DAG. We also only want to auth events against fields that -wouldn't be removed during redaction. 
In practice, this means: +A full example of the `m.room.batch` event: +```json5 +{ + "type": "m.room.batch", + "sender": "@appservice:example.org", + "content": { + "m.batch_id": batch_id, + "m.historical": true + }, + "room_id": "!jEsUZKDJdhlrceRyVU:example.org", + // Doesn't affect much but good to use the same time as the closest event + "origin_server_ts": 1626914158639 +} +``` - - When redacting `m.room.insertion` events, keep the `m.next_batch_id` content field around - - When redacting `m.room.batch` events, keep the `m.batch_id` content field around - - When redacting `m.room.marker` events, keep the `m.marker.insertion` content field around - - When redacting `m.room.power_levels` events, keep the `historical` content field around +### `m.room.marker` -#### Backwards compatibility +State event used to hint to homeservers that there is new +history back in time that you should go fetch next time someone scrolls back +around the specified insertion event. Also used on clients to cache bust the +timeline. -However, this MSC is mostly backwards compatible and can be used with the -current room version with the fact that redactions aren't supported for -`m.room.insertion`, `m.room.batch`, `m.room.marker` events. We can protect -people from this limitation by throwing an error when they try to use [`PUT -/_matrix/client/v3/rooms/{roomId}/redact/{eventId}/{txnId}`](https://spec.matrix.org/v1.3/client-server-api/#put_matrixclientv3roomsroomidredacteventidtxnid) -to redact one of those events. We would have to accept the redaction if -it came over federation to avoid split-brained rooms. +**`m.room.marker` event `content` field definitions:** -Because we also can't use the `historical` power level for controlling who can -send these events in the existing room version, we always persist but instead -only process and give meaning to the `m.room.insertion`, `m.room.batch`, and -`m.room.marker` events when the room `creator` sends them. 
This caveat/rule only -applies to existing room versions. +key | type | value | description | required +--- | --- | --- | --- | --- +`m.marker.insertion` | string | Another `event_id` | Used to point at an `m.room.insertion` event by its `event_id`. | yes +A full example of the `m.room.marker` event: +```json5 +{ + "type": "m.room.marker", + "state_key": "", + "sender": "@appservice:example.org", + "content": { + "m.marker.insertion": insertion_event.event_id + }, + "room_id": "!jEsUZKDJdhlrceRyVU:example.org", + "origin_server_ts": 1626914158639, +} +``` ### New historical batch send endpoint @@ -222,7 +232,7 @@ This section explains the homeserver magic that happens when someone uses the `m.room.batch`, `m.room.marker` events work, you might want to just skip down to the room DAG breakdown which incrementally explains how everything fits together. - 1. An `m.room.insertion` event for the batch is added to the start of the batch. + 1. A `m.room.insertion` event for the batch is added to the start of the batch. This will be the starting point of the next batch and holds the `next_batch_id` that we return in the batch send response. The application service passes this as `?batch_id` @@ -252,6 +262,52 @@ breakdown which incrementally explains how everything fits together. https://github.com/matrix-org/synapse/pull/9247#discussion_r588479201) +### Power levels + +Since events being silently sent in the past is hard to moderate, it will +probably be good to limit who can add historical messages to the timeline. The +batch send endpoint is already limited to application services but we also need +to limit who can send `m.room.insertion`, `m.room.batch`, and `m.room.marker` events since someone +can attempt to send them via the normal `/send` API (we don't want any nasty +weird knots to reconcile either). 
+ + - `historical`: A new top-level field in the `content` dictionary of the room's + power levels, controlling who can send `m.room.insertion`, `m.room.batch`, + and `m.room.marker` events in the room. + +### Room version + +The new `historical` power level necessitates a new room version (changes the structure of `m.room.power_levels`). + +The redaction algorithm changes is also hard requirement for a new room +version because we need to make sure when redacting, we only strip out fields +without affecting anything at the protocol level. This means that we need to +keep all of the structural fields that allow us to navigate the batches of +history in the DAG. We also only want to auth events against fields that +wouldn't be removed during redaction. In practice, this means: + + - When redacting `m.room.insertion` events, keep the `m.next_batch_id` content field around + - When redacting `m.room.batch` events, keep the `m.batch_id` content field around + - When redacting `m.room.marker` events, keep the `m.marker.insertion` content field around + - When redacting `m.room.power_levels` events, keep the `historical` content field around + + +#### Backwards compatibility with existing room versions + +However, this MSC is mostly backwards compatible and can be used with the +current room version with the fact that redactions aren't supported for +`m.room.insertion`, `m.room.batch`, `m.room.marker` events. We can protect +people from this limitation by throwing an error when they try to use [`PUT +/_matrix/client/v3/rooms/{roomId}/redact/{eventId}/{txnId}`](https://spec.matrix.org/v1.3/client-server-api/#put_matrixclientv3roomsroomidredacteventidtxnid) +to redact one of those events. We would have to accept the redaction if +it came over federation to avoid split-brained rooms. 
+ +Because we also can't use the `historical` power level for controlling who can +send these events in the existing room version, we always persist but instead +only process and give meaning to the `m.room.insertion`, `m.room.batch`, and +`m.room.marker` events when the room `creator` sends them. This caveat/rule only +applies to existing room versions. + ### Room DAG breakdown @@ -315,37 +371,6 @@ flowchart BT ``` -The structure of the insertion event looks like: -```js -{ - "type": "m.room.insertion", - "sender": "@appservice:example.org", - "content": { - "m.next_batch_id": next_batch_id, - "m.historical": true - }, - "room_id": "!jEsUZKDJdhlrceRyVU:example.org", - // Doesn't affect much but good to use the same time as the closest event - "origin_server_ts": 1626914158639 -} -``` - - -The structure of the batch event looks like: -```js -{ - "type": "m.room.batch", - "sender": "@appservice:example.org", - "content": { - "m.batch_id": batch_id, - "m.historical": true - }, - "room_id": "!jEsUZKDJdhlrceRyVU:example.org", - // Doesn't affect much but good to use the same time as the closest event - "origin_server_ts": 1626914158639 -} -``` - #### Adding marker events @@ -410,20 +435,6 @@ To lay out the different types of servers consuming these historical messages can fetch the historical messages when the user scrolls back to that area in the future. 
-The structure of the `m.room.marker` event looks like: -```js -{ - "type": "m.room.marker", - "state_key": "", - "sender": "@appservice:example.org", - "content": { - "m.marker.insertion": insertion_event.event_id - }, - "room_id": "!jEsUZKDJdhlrceRyVU:example.org", - "origin_server_ts": 1626914158639, -} -``` - ```mermaid flowchart BT A --- annotation1>"Note: older events are at the top"] From d20455fd78e91bb37ca828e5d92982598c2b05c8 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 11 Aug 2022 21:00:35 -0500 Subject: [PATCH 48/68] Fix some table formatting and better full examples --- proposals/2716-batch-send-historical-messages.md | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 7b2ac202e8..9516cceed2 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -43,7 +43,7 @@ Here is what scrollback is expected to look like in Element: ### Any event -key | type | description | Required +key | type | value | description | Required --- | --- | --- | --- | --- `m.historical` | bool | `true` | Used on any event to hint that it was historically imported after the fact. This field should just be omitted if `false`. 
| no @@ -64,9 +64,10 @@ A full example of the `m.room.insertion` event: "type": "m.room.insertion", "sender": "@appservice:example.org", "content": { - "m.next_batch_id": next_batch_id, + "m.next_batch_id": "w25ljc1kb4", "m.historical": true }, + "event_id": "$insertionabcd:example.org", "room_id": "!jEsUZKDJdhlrceRyVU:example.org", // Doesn't affect much but good to use the same time as the closest event "origin_server_ts": 1626914158639 @@ -91,9 +92,10 @@ A full example of the `m.room.batch` event: "type": "m.room.batch", "sender": "@appservice:example.org", "content": { - "m.batch_id": batch_id, + "m.batch_id": "w25ljc1kb4", "m.historical": true }, + "event_id": "$batchabcd:example.org", "room_id": "!jEsUZKDJdhlrceRyVU:example.org", // Doesn't affect much but good to use the same time as the closest event "origin_server_ts": 1626914158639 @@ -121,8 +123,9 @@ A full example of the `m.room.marker` event: "state_key": "", "sender": "@appservice:example.org", "content": { - "m.marker.insertion": insertion_event.event_id + "m.marker.insertion": "$insertionabcd:example.org" }, + "event_id": "$markerabcd:example.org", "room_id": "!jEsUZKDJdhlrceRyVU:example.org", "origin_server_ts": 1626914158639, } From a8313bd583797ae68bd1e8f3d472c3ef889bd6b4 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 11 Aug 2022 21:07:53 -0500 Subject: [PATCH 49/68] Link to depth discussion See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r943481538 --- .../2716-batch-send-historical-messages.md | 22 +++++++++++-------- 1 file changed, 13 insertions(+), 9 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 9516cceed2..59d0f51c69 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -219,8 +219,9 @@ the current state of the room.** also include `state_events` here which will be used to auth further events in the 
batch. For Synapse, there is a reverse-chronological constraint on batches so once you insert one batch of messages, you can only insert an older batch -after that. **tldr; Insert from your most recent batch of history -> oldest -history.** +after that. For more information on this Synapse constraint, see the ["Depth +discussion"](#depth-discussion) below. **tldr; Insert from your most recent batch of history -> +oldest history.** One aspect that isn't solved yet is how to handle relations/annotations (such as reactions, replies, and threaded conversations) that reference each other within @@ -251,17 +252,20 @@ breakdown which incrementally explains how everything fits together. (`[0, 1, 2]`) and is processed in that order so the `prev_events` point to it's older-in-time previous message which gives us a nice straight line in the DAG. - - **Depth discussion:** For Synapse, when persisting, we **reverse the list - (to make it reverse-chronological)** so we can still get the correct - `(topological_ordering, stream_ordering)` so it sorts between A and B as - we expect. Why? `depth` is not re-calculated when historical messages are + - **Depth discussion:** For Synapse, when + persisting, we **reverse the list (to make it reverse-chronological)** so + we can still get the correct `(topological_ordering, stream_ordering)` so + it sorts between A and B as we expect. Why? `depth` (or the + `topological_ordering`) is not re-calculated when historical messages are inserted into the DAG. This means we have to take care to insert in the right order. Events are sorted by `(topological_ordering, stream_ordering)` where `topological_ordering` is just `depth`. Normally, `stream_ordering` is an auto incrementing integer but for - `backfilled=true` events, it decrements. Historical messages are inserted - all at the same `depth`, and marked as backfilled so the `stream_ordering` - decrements and each event is sorted behind the next. (from + `backfilled=true` events, it decrements. 
Since historical messages are + inserted all at the same `depth`, the only way we can control the ordering + in between is the `stream_ordering`. Historical messages are marked as + backfilled so the `stream_ordering` decrements and each event is sorted + behind the next. (from https://github.com/matrix-org/synapse/pull/9247#discussion_r588479201) From 9d96c5cc98a8132a14445cf3c6c5249e7e5ae52a Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 11 Aug 2022 22:53:12 -0500 Subject: [PATCH 50/68] Wrapping --- proposals/2716-batch-send-historical-messages.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 59d0f51c69..eab13ddf88 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -220,8 +220,8 @@ also include `state_events` here which will be used to auth further events in the batch. For Synapse, there is a reverse-chronological constraint on batches so once you insert one batch of messages, you can only insert an older batch after that. For more information on this Synapse constraint, see the ["Depth -discussion"](#depth-discussion) below. **tldr; Insert from your most recent batch of history -> -oldest history.** +discussion"](#depth-discussion) below. 
**tldr; Insert from your most recent +batch of history -> oldest history.** One aspect that isn't solved yet is how to handle relations/annotations (such as reactions, replies, and threaded conversations) that reference each other within From 991bd84741a36afb0d36bd6a4ed9ffec10255108 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 17 Aug 2022 17:37:45 -0500 Subject: [PATCH 51/68] Fix direction --- proposals/2716-batch-send-historical-messages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index eab13ddf88..74b36f1d0e 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -335,7 +335,7 @@ should connect to each other and how the homeserver can navigate the DAG. Here is how the historical batch concept looks like in the DAG: - - `A --> B` is any point in the DAG that we want to import between. + - `A <--- B` is any point in the DAG that we want to import between. 
- `A` is the oldest-in-time message - `B` is the newest-in-time message - `batch0` is the first batch we try to import From 55551fce9874cf8e10dcda225346883175258d17 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Wed, 17 Aug 2022 17:47:03 -0500 Subject: [PATCH 52/68] Remove namespace because the event type is already the namespace See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r941444525 --- .../2716-batch-send-historical-messages.md | 31 +++++++++---------- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 74b36f1d0e..bf6d31f62f 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -45,7 +45,7 @@ Here is what scrollback is expected to look like in Element: key | type | value | description | Required --- | --- | --- | --- | --- -`m.historical` | bool | `true` | Used on any event to hint that it was historically imported after the fact. This field should just be omitted if `false`. | no +`historical` | bool | `true` | Used on any event to hint that it was historically imported after the fact. This field should just be omitted if `false`. | no ### `m.room.insertion` @@ -56,7 +56,7 @@ Events that mark points in time where you can insert historical messages. key | type | value | description | required --- | --- | --- | --- | --- -`m.next_batch_id` | string | randomly generated string | This is a random unique string that the next `m.room.batch` event should specify in order to connect to it. | yes +`next_batch_id` | string | randomly generated string | This is a random unique string that the next `m.room.batch` event should specify in order to connect to it.
| yes A full example of the `m.room.insertion` event: ```json5 @@ -64,8 +64,8 @@ A full example of the `m.room.insertion` event: "type": "m.room.insertion", "sender": "@appservice:example.org", "content": { - "m.next_batch_id": "w25ljc1kb4", - "m.historical": true + "next_batch_id": "w25ljc1kb4", + "historical": true }, "event_id": "$insertionabcd:example.org", "room_id": "!jEsUZKDJdhlrceRyVU:example.org", @@ -84,7 +84,7 @@ messages to the next insertion event, then repeat the process. key | type | value | description | required --- | --- | --- | --- | --- -`m.batch_id` | string | A batch ID from an insertion event | Used to indicate which `m.room.insertion` event it connects to by its `m.next_batch_id` field. | yes +`batch_id` | string | A batch ID from an insertion event | Used to indicate which `m.room.insertion` event it connects to by its `next_batch_id` field. | yes A full example of the `m.room.batch` event: ```json5 @@ -92,8 +92,8 @@ A full example of the `m.room.batch` event: "type": "m.room.batch", "sender": "@appservice:example.org", "content": { - "m.batch_id": "w25ljc1kb4", - "m.historical": true + "batch_id": "w25ljc1kb4", + "historical": true }, "event_id": "$batchabcd:example.org", "room_id": "!jEsUZKDJdhlrceRyVU:example.org", @@ -114,7 +114,7 @@ timeline. key | type | value | description | required --- | --- | --- | --- | --- -`m.marker.insertion` | string | Another `event_id` | Used to point at an `m.room.insertion` event by its `event_id`. | yes +`insertion_event_reference` | string | Another `event_id` | Used to point at an `m.room.insertion` event by its `event_id`. 
| yes A full example of the `m.room.marker` event: ```json5 @@ -123,7 +123,7 @@ A full example of the `m.room.marker` event: "state_key": "", "sender": "@appservice:example.org", "content": { - "m.marker.insertion": "$insertionabcd:example.org" + "insertion_event_reference": "$insertionabcd:example.org" }, "event_id": "$markerabcd:example.org", "room_id": "!jEsUZKDJdhlrceRyVU:example.org", @@ -246,7 +246,7 @@ breakdown which incrementally explains how everything fits together. base `m.room.insertion` event as a jumping off point from `?prev_event_id` which can be added to the end of the `events` list in the response. 1. All of the events in the historical batch get a content field, - `"m.historical": true`, to indicate that they are historical at the point of + `"historical": true`, to indicate that they are historical at the point of being added to a room. 1. The `state_events_at_start`/`events` payload is in **chronological** order (`[0, 1, 2]`) and is processed in that order so the `prev_events` point to @@ -293,9 +293,9 @@ keep all of the structural fields that allow us to navigate the batches of history in the DAG. We also only want to auth events against fields that wouldn't be removed during redaction. In practice, this means: - - When redacting `m.room.insertion` events, keep the `m.next_batch_id` content field around - - When redacting `m.room.batch` events, keep the `m.batch_id` content field around - - When redacting `m.room.marker` events, keep the `m.marker.insertion` content field around + - When redacting `m.room.insertion` events, keep the `next_batch_id` content field around + - When redacting `m.room.batch` events, keep the `batch_id` content field around + - When redacting `m.room.marker` events, keep the `insertion_event_reference` content field around - When redacting `m.room.power_levels` events, keep the `historical` content field around @@ -330,7 +330,7 @@ should connect to each other and how the homeserver can navigate the DAG. 
create it ad-hoc in the first batch. In the latter case, a `m.room.marker` event (detailed below) inserted into the main DAG can be used to point to the new `m.room.insertion` event. - - `m.room.batch` events have a `m.next_batch_id` field which is used to indicate the + - `m.room.batch` events have a `next_batch_id` field which is used to indicate the `m.room.insertion` event that the batch connects to. Here is how the historical batch concept looks like in the DAG: @@ -625,9 +625,6 @@ Servers will indicate support for the new endpoint via a `true` value for featur **Content fields:** - `org.matrix.msc2716.historical` - - `org.matrix.msc2716.next_batch_id` - - `org.matrix.msc2716.batch_id` - - `org.matrix.msc2716.marker.insertion` **Room version:** From 1ee23d435274eba11dcc0378133274976eb94f82 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 23 Aug 2022 20:44:10 -0500 Subject: [PATCH 53/68] Fix heading structure and more words to describe the historical property See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r952813841 --- proposals/2716-batch-send-historical-messages.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index bf6d31f62f..090e8ccece 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -27,10 +27,6 @@ This is currently not supported because: It's not possible to change the DAG ordering with this. 
- - -## Proposal - ## Expectation Historical messages that we import should appear in the timeline just like they @@ -41,7 +37,12 @@ Here is what scrollback is expected to look like in Element: ![Two historical batches in between some existing messages](./images/2716-message-scrollback-example.png) -### Any event +## Proposal + +### `historical` property on any event + +A new `historical` property is defined which can be included in the content of any +event to indicate it was retrospectively imported. key | type | value | description | Required --- | --- | --- | --- | --- From e7e435dc0f2f490777ecc33a132253f96fa9d2ce Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 23 Aug 2022 20:49:24 -0500 Subject: [PATCH 54/68] Explain when the messages in the example were sent See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r952815869 --- proposals/2716-batch-send-historical-messages.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 090e8ccece..e10074edf9 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -30,7 +30,9 @@ This is currently not supported because: ## Expectation Historical messages that we import should appear in the timeline just like they -would if they were sent back at that time. +would if they were sent back at that time. In the example below, Maria's +messages 1-6 were sent originally in the room and the "Historical" messages in +the middle were imported after the fact. 
Here is what scrollback is expected to look like in Element: From 775d3d396904434e065ee90fec51670c88b3d852 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 23 Aug 2022 21:33:46 -0500 Subject: [PATCH 55/68] Clarify that you provide it next time See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r952840357 --- proposals/2716-batch-send-historical-messages.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index e10074edf9..656178fa37 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -41,7 +41,7 @@ Here is what scrollback is expected to look like in Element: ## Proposal -### `historical` property on any event +### `historical` `content` property on any event A new `historical` property is defined which can be included in the content of any event to indicate it was retrospectively imported. @@ -242,7 +242,7 @@ breakdown which incrementally explains how everything fits together. 1. A `m.room.insertion` event for the batch is added to the start of the batch. This will be the starting point of the next batch and holds the `next_batch_id` that we return in the batch send response. The application service passes - this as `?batch_id` + this as `?batch_id` next time to continue the chain of historical messages. 1. A `m.room.batch` event is added to the end of the batch. This is the event that connects to an insertion event by `?batch_id`. 1. 
If `?batch_id` is not specified (usually only for the first batch), create a From 6ccecd7fd4f8364da346a533534063f26a650880 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 23 Aug 2022 21:40:40 -0500 Subject: [PATCH 56/68] Clarify how it connects See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r952840631 --- proposals/2716-batch-send-historical-messages.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 656178fa37..45a7dff40b 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -243,8 +243,9 @@ breakdown which incrementally explains how everything fits together. This will be the starting point of the next batch and holds the `next_batch_id` that we return in the batch send response. The application service passes this as `?batch_id` next time to continue the chain of historical messages. - 1. A `m.room.batch` event is added to the end of the batch. This is the event that - connects to an insertion event by `?batch_id`. + 1. A `m.room.batch` event is added to the end of the batch. This is the event + that connects to an `m.room.insertion` event by specifying a `batch_id` that + matches the `next_batch_id` on the `m.room.insertion` event. 1. If `?batch_id` is not specified (usually only for the first batch), create a base `m.room.insertion` event as a jumping off point from `?prev_event_id` which can be added to the end of the `events` list in the response. 
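
The chaining clarified in the two patches above (each `/batch_send` response carries a `next_batch_id`, which the application service passes as `?batch_id` on the next call) can be sketched roughly as follows. This is a minimal illustration, not the reference implementation. `batch_send_url`, `import_batches`, the `send` callable and the `base_url` value are hypothetical names. The path segment between `rooms/` and `/batch_send` is elided in the text of this series, so the sketch assumes it is the room ID. The response keys `next_batch_id` and `base_insertion_event_id` are the ones named in the proposal.

```python
from urllib.parse import quote, urlencode


def batch_send_url(base_url, room_id, prev_event_id, batch_id=None):
    """Build a /batch_send URL (assumption: the elided path segment is the room ID)."""
    query = {"prev_event_id": prev_event_id}
    # ?batch_id is omitted for the first batch and supplied for every later one.
    if batch_id is not None:
        query["batch_id"] = batch_id
    path = f"/_matrix/client/v1/rooms/{quote(room_id, safe='')}/batch_send"
    return f"{base_url}{path}?{urlencode(query)}"


def import_batches(send, base_url, room_id, prev_event_id, batches):
    """Send each request body in order, threading next_batch_id into ?batch_id.

    `send(url, body)` is a hypothetical callable that performs the POST as the
    application service and returns the parsed JSON response as a dict.
    Returns the base_insertion_event_id from the first response, which is the
    value later used as `insertion_event_reference` in the `m.room.marker` event.
    """
    batch_id = None
    base_insertion_event_id = None
    for body in batches:
        response = send(batch_send_url(base_url, room_id, prev_event_id, batch_id), body)
        if base_insertion_event_id is None:
            base_insertion_event_id = response.get("base_insertion_event_id")
        # The next call continues the chain from this batch's insertion event.
        batch_id = response["next_batch_id"]
    return base_insertion_event_id
```

Keeping the HTTP call behind `send` leaves authentication and retries to the caller; each request body would hold the `events` and `state_events_at_start` lists described in the proposal.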
From af24a5f5d89529e6327c61adc3b01a093e525dba Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 23 Aug 2022 21:41:34 -0500 Subject: [PATCH 57/68] Fix endpoint path (no unstable) See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r952839140 --- proposals/2716-batch-send-historical-messages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 45a7dff40b..09f85d7986 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -138,7 +138,7 @@ A full example of the `m.room.marker` event: ### New historical batch send endpoint Add a new endpoint, `POST -/_matrix/client/v1/org.matrix.msc2716/rooms//batch_send?prev_event_id=&batch_id=`, +/_matrix/client/v1/rooms//batch_send?prev_event_id=&batch_id=`, which can insert a batch of events historically back in time next to the given `?prev_event_id` (required). This endpoint can only be used by application services. `?batch_id` is not required for the first batch send and is only From 10599cbc244dd56ed951bbec0e04e5b95fa4fb85 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 23 Aug 2022 21:43:00 -0500 Subject: [PATCH 58/68] Add heading for new event types See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r952824158 --- proposals/2716-batch-send-historical-messages.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 09f85d7986..6e21700c45 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -51,7 +51,9 @@ key | type | value | description | Required `historical` | bool | `true` | Used on any event to hint that it was historically imported after the fact. 
This field should just be omitted if `false`. | no -### `m.room.insertion` +### New event types + +#### `m.room.insertion` Events that mark points in time where you can insert historical messages. @@ -77,7 +79,7 @@ A full example of the `m.room.insertion` event: } ``` -### `m.room.batch` +#### `m.room.batch` This is what connects one historical batch to the other. In the DAG, we navigate from an insertion event to the batch event that points at it, up the historical @@ -106,7 +108,7 @@ A full example of the `m.room.batch` event: ``` -### `m.room.marker` +#### `m.room.marker` State event used to hint to homeservers that there is new history back in time that you should go fetch next time someone scrolls back From a6b5d8fa0744373f698756e0969f6bd7b46abf02 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 23 Aug 2022 21:48:43 -0500 Subject: [PATCH 59/68] Use m.room.insertion type name in mermaid graphs --- .../2716-batch-send-historical-messages.md | 30 +++++++++---------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 6e21700c45..99e5c83ea6 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -356,20 +356,20 @@ flowchart BT end subgraph batch0 - batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] + batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/`m.room.insertion`\] end subgraph batch1 - batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] + batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/`m.room.insertion`\] end subgraph batch2 - batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> 
batch2-insertion[/insertion\] + batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/`m.room.insertion`\] end - batch0-insertionBase[/insertion\] ---------------> A - batch0-batch -.-> batch0-insertionBase[/insertion\] + batch0-insertionBase[/`m.room.insertion`\] ---------------> A + batch0-batch -.-> batch0-insertionBase batch1-batch -.-> batch0-insertion batch2-batch -.-> batch1-insertion @@ -456,21 +456,21 @@ flowchart BT end subgraph batch0 - batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] + batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/`m.room.insertion`\] end subgraph batch1 - batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] + batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/`m.room.insertion`\] end subgraph batch2 - batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] + batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/`m.room.insertion`\] end marker1 -.-> batch0-insertionBase - batch0-insertionBase[/insertion\] ---------------> A - batch0-batch -.-> batch0-insertionBase[/insertion\] + batch0-insertionBase[/`m.room.insertion`\] ---------------> A + batch0-batch -.-> batch0-insertionBase batch1-batch -.-> batch0-insertion batch2-batch -.-> batch1-insertion @@ -510,15 +510,15 @@ flowchart BT end subgraph batch0 - batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/insertion\] + batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/`m.room.insertion`\] end subgraph batch1 - batch1-batch[[`m.room.batch`]] --> 
batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/insertion\] + batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/`m.room.insertion`\] end subgraph batch2 - batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/insertion\] + batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/`m.room.insertion`\] end @@ -527,8 +527,8 @@ flowchart BT batch2-insertion -.-> memberBob2(["m.room.member (bob)"]) --> memberAlice2(["m.room.member (alice)"]) marker1 -.-> batch0-insertionBase - batch0-insertionBase[/insertion\] ---------------> A - batch0-batch -.-> batch0-insertionBase[/insertion\] + batch0-insertionBase[/`m.room.insertion`\] ---------------> A + batch0-batch -.-> batch0-insertionBase batch1-batch -.-> batch0-insertion batch2-batch -.-> batch1-insertion From 0642d888163853ce1f63c1ec8d71f25e15ed4d9b Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 23 Aug 2022 21:50:51 -0500 Subject: [PATCH 60/68] Remove backticks from mermaid graphs --- .../2716-batch-send-historical-messages.md | 28 +++++++++---------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 99e5c83ea6..e940ed3f27 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -356,19 +356,19 @@ flowchart BT end subgraph batch0 - batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/`m.room.insertion`\] + batch0-batch[[m.room.batch]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/m.room.insertion\] end subgraph batch1 - batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/`m.room.insertion`\] + 
batch1-batch[[m.room.batch]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/m.room.insertion\] end subgraph batch2 - batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/`m.room.insertion`\] + batch2-batch[[m.room.batch]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/m.room.insertion\] end - batch0-insertionBase[/`m.room.insertion`\] ---------------> A + batch0-insertionBase[/m.room.insertion\] ---------------> A batch0-batch -.-> batch0-insertionBase batch1-batch -.-> batch0-insertion batch2-batch -.-> batch1-insertion @@ -452,24 +452,24 @@ To lay out the different types of servers consuming these historical messages flowchart BT A --- annotation1>"Note: older events are at the top"] subgraph live timeline - marker1>`m.room.marker`] ----> B -----------------> A + marker1>m.room.marker] ----> B -----------------> A end subgraph batch0 - batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/`m.room.insertion`\] + batch0-batch[[m.room.batch]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/m.room.insertion\] end subgraph batch1 - batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/`m.room.insertion`\] + batch1-batch[[m.room.batch]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/m.room.insertion\] end subgraph batch2 - batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/`m.room.insertion`\] + batch2-batch[[m.room.batch]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/m.room.insertion\] end marker1 -.-> batch0-insertionBase - batch0-insertionBase[/`m.room.insertion`\] ---------------> A + batch0-insertionBase[/m.room.insertion\] ---------------> A batch0-batch -.-> batch0-insertionBase 
batch1-batch -.-> batch0-insertion batch2-batch -.-> batch1-insertion @@ -506,19 +506,19 @@ bunch of `@mxid joined the room` noise between each batch. flowchart BT A --- annotation1>"Note: older events are at the top"] subgraph live timeline - marker1>`m.room.marker`] ----> B -----------------> A + marker1>m.room.marker] ----> B -----------------> A end subgraph batch0 - batch0-batch[[`m.room.batch`]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/`m.room.insertion`\] + batch0-batch[[m.room.batch]] --> batch0-2(("2")) --> batch0-1((1)) --> batch0-0((0)) --> batch0-insertion[/m.room.insertion\] end subgraph batch1 - batch1-batch[[`m.room.batch`]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/`m.room.insertion`\] + batch1-batch[[m.room.batch]] --> batch1-2(("2")) --> batch1-1((1)) --> batch1-0((0)) --> batch1-insertion[/m.room.insertion\] end subgraph batch2 - batch2-batch[[`m.room.batch`]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/`m.room.insertion`\] + batch2-batch[[m.room.batch]] --> batch2-2(("2")) --> batch2-1((1)) --> batch2-0((0)) --> batch2-insertion[/m.room.insertion\] end @@ -527,7 +527,7 @@ flowchart BT batch2-insertion -.-> memberBob2(["m.room.member (bob)"]) --> memberAlice2(["m.room.member (alice)"]) marker1 -.-> batch0-insertionBase - batch0-insertionBase[/`m.room.insertion`\] ---------------> A + batch0-insertionBase[/m.room.insertion\] ---------------> A batch0-batch -.-> batch0-insertionBase batch1-batch -.-> batch0-insertion batch2-batch -.-> batch1-insertion From b18c214a0cf43ff87242793bb9e30e8695911cfe Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 23 Aug 2022 22:25:53 -0500 Subject: [PATCH 61/68] Add more initial explanation See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r953293502 --- .../2716-batch-send-historical-messages.md | 39 +++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git 
a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index e940ed3f27..f09f4dff78 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -38,6 +38,45 @@ Here is what scrollback is expected to look like in Element: ![Two historical batches in between some existing messages](./images/2716-message-scrollback-example.png) +To accomplish what's shown in the image, this is the basic flow: + + 1. `maria` sends messages 1-6. These represent messages in the normal "live" timeline before any history is imported. + 1. Create hitsorical batch 0 via `POST /_matrix/client/v1/rooms//batch_send?prev_event_id=` with the "Historical [xyz]" message `events` from Eric and the necessary `state_events_at_start` to auth them. + - This will return a response that contains the `next_batch_id` that we will use for the next batch. + - This also returns `base_insertion_event_id` which we will use the for the `m.room.marker` even later. + 1. Create hitsorical batch 1 via `POST /_matrix/client/v1/rooms//batch_send?prev_event_id=&batch_id=` with the "Historical [foo|bar|baz]" message `events` from Eric and the necessary `state_events_at_start` to auth them. + 1. Send a `m.room.marker` event so the history is discoverable across all federated homeservers: `PUT /_matrix/client/v3/rooms/{roomId}/send/m.room.marker/{txnId}` with `insertion_event_reference` set as the `base_insertion_event_id` from before. 
+ +The DAG for these messages ends up looking like: + +```mermaid +flowchart BT + A --- annotation1>"Note: older events are at the top"] + subgraph live timeline + marker1>m.room.marker] ----> B -----------------> A + end + + subgraph batch0 + batch0-batch[[m.room.batch]] --> batch0-2((z)) --> batch0-1((y)) --> batch0-0((x)) --> batch0-insertion[/m.room.insertion\] + end + + subgraph batch1 + batch1-batch[[m.room.batch]] --> batch1-2((baz)) --> batch1-1((bar)) --> batch1-0((foo)) --> batch1-insertion[/m.room.insertion\] + end + + + batch0-insertion -.-> memberBob0(["m.room.member (Eric)"]) + batch1-insertion -.-> memberBob1(["m.room.member (Eric)"]) + + marker1 -.-> batch0-insertionBase + batch0-insertionBase[/m.room.insertion\] ---------------> A + batch0-batch -.-> batch0-insertionBase + batch1-batch -.-> batch0-insertion + + %% make the annotation links invisible + linkStyle 0 stroke-width:2px,fill:none,stroke:none; +``` + ## Proposal From 02b5f4bd6096ce425fcc374caa6455b4a198750e Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Tue, 23 Aug 2022 22:32:10 -0500 Subject: [PATCH 62/68] Better DAG to match expectation image --- proposals/2716-batch-send-historical-messages.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index f09f4dff78..50a2970567 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -51,9 +51,9 @@ The DAG for these messages ends up looking like: ```mermaid flowchart BT - A --- annotation1>"Note: older events are at the top"] + 1 --- annotation1>"Note: older events are at the top"] subgraph live timeline - marker1>m.room.marker] ----> B -----------------> A + marker1>m.room.marker] ----> 6[Message 6] --> 5[Message 5] --> 4[Message 4] -----------------> 3[Message 3] --> 2[Message 2] --> 1[Message 1] end subgraph batch0 @@ -69,7 +69,7 @@ flowchart BT 
batch1-insertion -.-> memberBob1(["m.room.member (Eric)"]) marker1 -.-> batch0-insertionBase - batch0-insertionBase[/m.room.insertion\] ---------------> A + batch0-insertionBase[/m.room.insertion\] ---------------> 1 batch0-batch -.-> batch0-insertionBase batch1-batch -.-> batch0-insertion From 69bd287a736797563ff0bf4a0923c2791ac989bc Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 25 Aug 2022 12:02:50 -0500 Subject: [PATCH 63/68] Add example why you would use the historical content property See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r952823790 --- proposals/2716-batch-send-historical-messages.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 50a2970567..b2311d64fc 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -82,8 +82,12 @@ flowchart BT ### `historical` `content` property on any event -A new `historical` property is defined which can be included in the content of any -event to indicate it was retrospectively imported. +A new `historical` property is defined which can be included in the content of +any event to indicate it was retrospectively imported. Used as a hint/indication +to clients that history didn't originally happen in the room and to add the +right semantics to the historical messages. Perhaps a little "Historical" flag +in the corner of these messages to show that they are maybe a little less +trusted in terms of attribution. 
key | type | value | description | Required --- | --- | --- | --- | --- From 5412e80423ca2f2844ec08ad722f22733cc33a2a Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 25 Aug 2022 12:08:42 -0500 Subject: [PATCH 64/68] Remove "full" See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r953397374 --- proposals/2716-batch-send-historical-messages.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index b2311d64fc..48499cd37d 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -106,7 +106,7 @@ key | type | value | description | required --- | --- | --- | --- | --- `next_batch_id` | string | randomly generated string | This is a random unique string that the next `m.room.batch` event should specify in order to connect to it. | yes -A full example of the `m.room.insertion` event: +An example of the `m.room.insertion` event: ```json5 { "type": "m.room.insertion", @@ -134,7 +134,7 @@ key | type | value | description | required --- | --- | --- | --- | --- `batch_id` | string | A batch ID from an insertion event | Used to indicate which `m.room.insertion` event it connects to by its `next_batch_id` field. | yes -A full example of the `m.room.batch` event: +An example of the `m.room.batch` event: ```json5 { "type": "m.room.batch", @@ -164,7 +164,7 @@ key | type | value | description | required --- | --- | --- | --- | --- `insertion_event_reference` | string | Another `event_id` | Used to point at an `m.room.insertion` event by its `event_id`. 
| yes -A full example of the `m.room.marker` event: +An example of the `m.room.marker` event: ```json5 { "type": "m.room.marker", From 16a6a408de1b483a923901b589cb73bda410ee87 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 25 Aug 2022 12:09:47 -0500 Subject: [PATCH 65/68] Fix historical typo See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r953392846 --- proposals/2716-batch-send-historical-messages.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 48499cd37d..e2bfdc4d87 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -41,10 +41,10 @@ Here is what scrollback is expected to look like in Element: To accomplish what's shown in the image, this is the basic flow: 1. `maria` sends messages 1-6. These represent messages in the normal "live" timeline before any history is imported. - 1. Create hitsorical batch 0 via `POST /_matrix/client/v1/rooms//batch_send?prev_event_id=` with the "Historical [xyz]" message `events` from Eric and the necessary `state_events_at_start` to auth them. + 1. Create historical batch 0 via `POST /_matrix/client/v1/rooms//batch_send?prev_event_id=` with the "Historical [xyz]" message `events` from Eric and the necessary `state_events_at_start` to auth them. - This will return a response that contains the `next_batch_id` that we will use for the next batch. - This also returns `base_insertion_event_id` which we will use the for the `m.room.marker` even later. - 1. Create hitsorical batch 1 via `POST /_matrix/client/v1/rooms//batch_send?prev_event_id=&batch_id=` with the "Historical [foo|bar|baz]" message `events` from Eric and the necessary `state_events_at_start` to auth them. + 1. 
Create historical batch 1 via `POST /_matrix/client/v1/rooms//batch_send?prev_event_id=&batch_id=` with the "Historical [foo|bar|baz]" message `events` from Eric and the necessary `state_events_at_start` to auth them. 1. Send a `m.room.marker` event so the history is discoverable across all federated homeservers: `PUT /_matrix/client/v3/rooms/{roomId}/send/m.room.marker/{txnId}` with `insertion_event_reference` set as the `base_insertion_event_id` from before. The DAG for these messages ends up looking like: From 4a8f834c8a3be9e466dfdbd2096c1146a538b41a Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 25 Aug 2022 12:12:33 -0500 Subject: [PATCH 66/68] Explain that /batch_send does the insertion/batch dance for you See https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r953294367 --- proposals/2716-batch-send-historical-messages.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index e2bfdc4d87..01bec87d75 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -44,6 +44,7 @@ To accomplish what's shown in the image, this is the basic flow: 1. Create historical batch 0 via `POST /_matrix/client/v1/rooms//batch_send?prev_event_id=` with the "Historical [xyz]" message `events` from Eric and the necessary `state_events_at_start` to auth them. - This will return a response that contains the `next_batch_id` that we will use for the next batch. - This also returns `base_insertion_event_id` which we will use the for the `m.room.marker` even later. + - `/batch_send` inserts `m.room.insertion` and `m.room.batch` events as necessary to connect the batches into a historical chain of history. 1. 
Create historical batch 1 via `POST /_matrix/client/v1/rooms//batch_send?prev_event_id=&batch_id=` with the "Historical [foo|bar|baz]" message `events` from Eric and the necessary `state_events_at_start` to auth them. 1. Send a `m.room.marker` event so the history is discoverable across all federated homeservers: `PUT /_matrix/client/v3/rooms/{roomId}/send/m.room.marker/{txnId}` with `insertion_event_reference` set as the `base_insertion_event_id` from before. From e4193ffbc409f6b8a736ba389cf611d76ede7b99 Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 13 Apr 2023 16:31:33 -0500 Subject: [PATCH 67/68] Make it more clear what the drawbacks are Pulling from my summary comments at: - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#issuecomment-1487441010 - https://github.com/matrix-org/matrix-spec-proposals/pull/2716#issuecomment-1504262734 --- .../2716-batch-send-historical-messages.md | 112 +++++++++++++++--- 1 file changed, 96 insertions(+), 16 deletions(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 01bec87d75..119c2bfc7b 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -302,21 +302,23 @@ breakdown which incrementally explains how everything fits together. (`[0, 1, 2]`) and is processed in that order so the `prev_events` point to it's older-in-time previous message which gives us a nice straight line in the DAG. - - **Depth discussion:** For Synapse, when - persisting, we **reverse the list (to make it reverse-chronological)** so - we can still get the correct `(topological_ordering, stream_ordering)` so - it sorts between A and B as we expect. Why? `depth` (or the - `topological_ordering`) is not re-calculated when historical messages are - inserted into the DAG. This means we have to take care to insert in the - right order. 
Events are sorted by `(topological_ordering, + - **Depth discussion:** For Synapse, when persisting, + we **reverse the list (to make it reverse-chronological)** so we can still get the + correct `(topological_ordering, stream_ordering)` so it sorts between A and B as + we expect. Why? `depth` (or the `topological_ordering`) is not re-calculated when + historical messages are inserted into the DAG. This means we have to take care to + insert in the right order. Events are sorted by `(topological_ordering, stream_ordering)` where `topological_ordering` is just `depth`. Normally, - `stream_ordering` is an auto incrementing integer but for - `backfilled=true` events, it decrements. Since historical messages are - inserted all at the same `depth`, the only way we can control the ordering - in between is the `stream_ordering`. Historical messages are marked as - backfilled so the `stream_ordering` decrements and each event is sorted - behind the next. (from - https://github.com/matrix-org/synapse/pull/9247#discussion_r588479201) + `stream_ordering` is an auto incrementing integer but for `backfilled=true` + events, it decrements. Since historical messages are inserted all at the same + `depth`, the only way we can control the ordering in between is the + `stream_ordering`. Historical messages are marked as backfilled so the + `stream_ordering` decrements and each event is sorted behind the next. (from + https://github.com/matrix-org/synapse/pull/9247#discussion_r588479201). Because + ordering between events is mostly controlled by `stream_ordering`, we will run + into ordering issues over federation if it backfills in the wrong order (see the + ["Message ordering issues over + federation"](#message-ordering-issues-over-federation) section below) ### Power levels @@ -581,12 +583,68 @@ flowchart BT ``` - - ## Potential issues Also see the security considerations section below. 
+### Message ordering issues over federation + +See the ["Depth discussion"](#depth-discussion) for the appropriate context for how +ordering currently works. This works fine for the local server that imported the history +in any scenario but since current homeserver implementations rely on `stream_ordering` +(which is just when the server received the event) to tie break the +`topological_ordering`/`depth`, this will cause message out of order problems for +federating servers consuming the events. It only works if the federating server scrolls +back sequentially without jumping around in the history at all which isn't realistic +with API's like jump to date (`/timestamp_to_event`) around nowadays. + +To totally fix this problem, it would require a different [graph +linearization](https://github.com/matrix-org/gomatrixserverlib/issues/187) strategy. +Perhaps we would do some online topological ordering (Katriel–Bodlaender algorithm) +where `depth`/`topological_ordering` is dynamically updated whenever new events are +inserted into the DAG. This is something extremely sci-fi and a big task though. + + - https://github.com/matrix-org/gomatrixserverlib/issues/187 is the best reference I + know of for graph linearization (how to go from a DAG to a list of events in order) + in general though + - Related event ordering issue: https://github.com/matrix-org/matrix-spec/issues/852 + - Synapse docs on depth and stream ordering: + https://github.com/matrix-org/synapse/blob/66ad1b8984eb536608e0915722c6a0b4493bb9df/docs/development/room-dag-concepts.md#depth-and-stream-ordering + +--- + +When factoring in how to use MSC2716 with the Gitter import and the static archives, we +were hand waving over this part and planned to have a script manually scrollback across +all of the rooms on the archive server before anyone else or Google spider crawls in +some weird way. This way it will lock the sort in place for all of the historical +messages. 
Or have the static archives fetch directly from the `gitter.im` homeserver +which would be correct since it was the server that imported everything. + +Then later, online topological ordering can happen in the future and by its nature will +apply retroactively to fix any inconsistencies introduced by jumping and people permalinking. + +But we were able to accomplish the Gitter to Matrix migration message import without +MSC2716 and if your use case is just one big import blast at the beginning of the room, +the way Gitter accomplished this works now and is a lot simpler (do that instead), see +[*"Alternative for one big import blast at the start of a room (Gitter case study)"* +section below](#one-big-import-blast-gitter-case-study). + + + + +### Self-referential batches + +We probably want to come up with a solution for how to reference another event in the +same batch. Imagine wanting to reply to an earlier event in the batch. Or any other +relation like reactions and threads. + +See this [discussion +thread](https://github.com/matrix-org/matrix-spec-proposals/pull/2716#discussion_r870884150) +for ideas. + + +### Application service signals to lazy load more history + This doesn't provide a way for a HS to tell an AS that a client has tried to call `/messages` beyond the beginning of a room, and that the AS should try to lazy-insert some more messages (as per @@ -598,6 +656,9 @@ API is added here, it should include the ID of the user who's asking for history. + + + ## Alternatives We could insist that we use the SS API to import history history in this manner @@ -644,6 +705,25 @@ and retrospectively insert events into the room outside the context of the DAG However, this feels needlessly complicated if the DAG approach is sufficient. 
+### Alternative for one big import blast at the start of a room (Gitter case study) + + + +As an update, [Gitter has fully migrated to +Matrix](https://blog.gitter.im/2023/02/13/gitter-has-fully-migrated-to-matrix/) and was +able to accomplish the 141M message import with MSC2716. If your use case is just one +big import blast at the beginning of the room, the way Gitter accomplished this works +now and is a lot simpler (do this instead). + +In the Gitter case, we started with a fresh room for the historical messages and +imported one by one so the `topological_ordering` was correct. We also used +`/send?ts=xxx` to make the timestamps correct. Then connected the historical and "live" +room together with a `m.room.tombstone` and MSC3946 `predecessor` event. This +functionality is completely separate from MSC2716 and works fine today. + + + + ## Security considerations The `m.room.insertion` and `m.room.batch` events add a new way for an application service to From 1fc8b6ba7825714944ebdbc4afa178c28b16f39b Mon Sep 17 00:00:00 2001 From: Eric Eastwood Date: Thu, 13 Apr 2023 16:33:21 -0500 Subject: [PATCH 68/68] with should be without --- proposals/2716-batch-send-historical-messages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/2716-batch-send-historical-messages.md b/proposals/2716-batch-send-historical-messages.md index 119c2bfc7b..846af57896 100644 --- a/proposals/2716-batch-send-historical-messages.md +++ b/proposals/2716-batch-send-historical-messages.md @@ -711,7 +711,7 @@ However, this feels needlessly complicated if the DAG approach is sufficient. As an update, [Gitter has fully migrated to Matrix](https://blog.gitter.im/2023/02/13/gitter-has-fully-migrated-to-matrix/) and was -able to accomplish the 141M message import with MSC2716. If your use case is just one +able to accomplish the 141M message import without MSC2716. 
If your use case is just one big import blast at the beginning of the room, the way Gitter accomplished this works now and is a lot simpler (do this instead).