Remove the need to call /initialSync in getMessagesByUserIn.
At the moment we call `/initialSync` to obtain a `from` token to pass to `/messages`.
In this PR we instead omit the `from` token when calling `/messages`,
which the spec has recently permitted.
Technically this is still only in the unstable version of the spec:
https://spec.matrix.org/unstable/client-server-api/#get_matrixclientv3roomsroomidmessages
matrix-org/matrix-spec#1002

Synapse has supported this for over 2 years and Element web depends on it for threads.
matrix-org/matrix-js-sdk#2065

Given that redactions are already very heavy in Mjolnir and have been reported
as barely functional on matrix.org, I believe we should also adopt this approach.
If for some reason the spec did change before the next release (1.3), which is extremely unlikely, we can revert this commit.
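
For readers who want the approach in isolation, here is a minimal sketch of backwards pagination that omits `from` on the first request. It is not the code in this commit: `MessagesPage`, `FetchPage`, `buildQuery` and `paginateBackwards` are hypothetical names, and `fetchPage` stands in for the `client.doRequest` call to `/messages`. Sender filtering is left out and the limit handling is simplified.

```typescript
// Hypothetical shape of a /messages response; only the fields used here.
interface MessagesPage {
    chunk: { sender: string; event_id: string }[];
    end?: string;
}

// Hypothetical fetcher standing in for the real HTTP request to /messages.
type FetchPage = (query: Record<string, string>) => Promise<MessagesPage>;

// Build the query parameters, leaving `from` out when we have no token yet.
function buildQuery(from: string | null): Record<string, string> {
    return { dir: "b", ...(from ? { from } : {}) };
}

// Walk the timeline backwards until history runs out or `limit` events were handled.
async function paginateBackwards(
    fetchPage: FetchPage,
    onChunk: (events: MessagesPage["chunk"]) => Promise<void>,
    limit: number,
): Promise<number> {
    let processed = 0;
    let token: string | null = null;
    do {
        const page: MessagesPage = await fetchPage(buildQuery(token));
        const lastToken: string | null = token;
        token = page.end ?? null;
        // Handle the chunk before any early exit: the final page carries
        // events even though it has no `end` token.
        const events = page.chunk.slice(0, limit - processed);
        processed += events.length;
        if (events.length > 0) {
            await onChunk(events);
        }
        // Guard against a server that keeps returning the same token.
        if (lastToken === token) {
            break;
        }
    } while (token && processed < limit);
    return processed;
}
```

The first request carries only `dir`, so no prior sync is needed to learn a starting token. The loop ends when the response has no `end` token or the limit is reached.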
Gnuxie committed May 17, 2022
1 parent bcc3405 commit 048e679
Showing 1 changed file with 23 additions and 44 deletions.
67 changes: 23 additions & 44 deletions src/utils.ts
@@ -121,29 +121,16 @@ export async function getMessagesByUserIn(client: MatrixClient, sender: string,
}
}

/**
* Note: `rooms/initialSync` is deprecated. However, there is no replacement for this API for the time being.
* While previous versions of this function used `/sync`, experience shows that it can grow extremely
* slow (4-5 minutes long) when we need to sync many large rooms, which leads to timeouts and
* breakage in Mjolnir, see https://github.com/matrix-org/synapse/issues/10842.
*/
function roomInitialSync() {
return client.doRequest("GET", `/_matrix/client/r0/rooms/${encodeURIComponent(roomId)}/initialSync`);
}

function backfill(from: string) {
function backfill(from: string|null) {
const qs = {
filter: JSON.stringify(roomEventFilter),
from: from,
dir: "b",
... from ? { from } : {}
};
LogService.info("utils", "Backfilling with token: " + from);
return client.doRequest("GET", `/_matrix/client/r0/rooms/${encodeURIComponent(roomId)}/messages`, qs);
return client.doRequest("GET", `/_matrix/client/v3/rooms/${encodeURIComponent(roomId)}/messages`, qs);
}

// Do an initial sync first to get the batch token
const response = await roomInitialSync();

let processed = 0;
/**
* Filter events from the timeline to events that are from a matching sender and under the limit that can be processed by the callback.
@@ -160,35 +147,27 @@ export async function getMessagesByUserIn(client: MatrixClient, sender: string,
}
return messages;
}

// The recommended APIs for fetching events from a room is to use both rooms/initialSync then /messages.
// Unfortunately, this results in code that is rather hard to read, as these two APIs employ very different data structures.
// We prefer discarding the results from rooms/initialSync and reading only from /messages,
// even if it's a little slower, for the sake of code maintenance.
const timeline = response['messages']
if (timeline) {
// The end of the PaginationChunk has the most recent events from rooms/initialSync.
// This token is required be present in the PagintionChunk from rooms/initialSync.
let token = timeline['end']!;
// We check that we have the token because rooms/messages is not required to provide one
// and will not provide one when there is no more history to paginate.
while (token && processed < limit) {
const bfMessages = await backfill(token);
let lastToken = token;
token = bfMessages['end'];
if (lastToken === token) {
LogService.debug("utils", "Backfill returned same end token - returning early.");
return;
}
const events = filterEvents(bfMessages['chunk'] || []);
// If we are using a glob, there may be no relevant events in this chunk.
if (events.length > 0) {
await cb(events);
}
// We check that we have the token because rooms/messages is not required to provide one
// and will not provide one when there is no more history to paginate.
let token: string|null = null;
do {
const bfMessages: { chunk: any[], end?: string } = await backfill(token);
const lastToken: string|null = token;
token = bfMessages['end'] ?? null;
const events = filterEvents(bfMessages['chunk'] || []);
// If we are using a glob, there may be no relevant events in this chunk.
if (events.length > 0) {
await cb(events);
}
} else {
throw new Error(`Internal Error: rooms/initialSync did not return a pagination chunk for ${roomId}, this is not normal and if it is we need to stop using it. See roomInitialSync() for why we are using it.`);
}
// This check exists only because of a Synapse compliance bug https://github.com/matrix-org/synapse/issues/12102.
// We also check after processing events as the `lastToken` can be 'null' if we are at the start of the stream
// and `token` can also be 'null' as we have paginated the entire timeline, but there would be unprocessed events in the
// chunk that was returned in this request.
if (lastToken === token) {
LogService.debug("utils", "Backfill returned same end token - returning early.");
return;
}
} while (token && processed < limit)
}

/*