Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the need to call /initialSync in getMessagesByUserIn. #297

Merged
merged 3 commits into from
May 24, 2022

Conversation

Gnuxie
Copy link
Contributor

@Gnuxie Gnuxie commented May 17, 2022

At the moment we call /initialSync to give a from token to /messages.
In this PR we instead do not provide a from token when calling /messages,
which has recently been permitted in the spec
Technically this is still unstable in the spec
https://spec.matrix.org/unstable/client-server-api/#get_matrixclientv3roomsroomidmessages
matrix-org/matrix-spec#1002

Synapse has supported this for over 2 years and Element web depends on it for threads.
matrix-org/matrix-js-sdk#2065

Given that redactions are super heavy in Mjolnir already and have been reported
as barely functional on matrix.org I believe we should also adopt this approach as
if for some reason the spec did change before the next release (1.3) (extremely unlikely) we can revert this commit.

At the moment we call `/initialSync` to give a `from` token to `/messages`.
In this PR we instead do not provide a `from` token when calling `/messages`,
which has recently been permitted in the spec
Technically this is still unstable in the spec
https://spec.matrix.org/unstable/client-server-api/#get_matrixclientv3roomsroomidmessages
matrix-org/matrix-spec#1002

Synapse has supported this for over 2 years and Element web depends on it for threads.
matrix-org/matrix-js-sdk#2065

Given that redactions are super heavy in Mjolnir already and have been reported
as barely functional on matrix.org I believe we should also adopt this approach as
if for some reason the spec did change before the next release (1.3) (extremely unlikely) we can revert this commit.
@Gnuxie Gnuxie changed the title Remove need to call /initialSync in getMessagesByUserIn. Remove the need to call /initialSync in getMessagesByUserIn. May 17, 2022
@Gnuxie Gnuxie requested a review from Yoric May 17, 2022 13:46
@Yoric
Copy link
Contributor

Yoric commented May 17, 2022

That looks pretty good!
Out of curiosity, is there a way to compare performance between two runs?

Do you need to update the README to specify Synapse version requirement?

@Gnuxie
Copy link
Contributor Author

Gnuxie commented May 17, 2022

@Yoric Technically supported in Synapse since v1.48.0 (when v3 endpoints were added) so no change to the readme is required (it asks for 1.53).

As for performance, there's not an easy way to compare other than just comparing differences in response to the redact command. We also don't really know how well it is performing in the wild other than our own and the matrix.org deployment. I don't think a comparison is warranted from a technical perspective given that we used to discard the entire response from /initialSync to obtain a token to call /messages from what would almost always be the same position in the stream as now. It should be quicker and if it isn't then something terrible and absurd is going on. Quicker by how much is another question though as there are other factors to consider.

@Yoric
Copy link
Contributor

Yoric commented May 18, 2022

Quicker by how much is another question though as there are other factors to consider.

We should probably consider instrumenting performance, eventually. But that would be a different PR!

Copy link
Contributor

@Yoric Yoric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, forgot to click "submit review".

src/utils.ts Outdated
let token: string|null = null;
do {
const bfMessages: { chunk: any[], end?: string } = await backfill(token);
const lastToken: string|null = token;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that the previous token, rather than the last token?

}
// We check that we have the token because rooms/messages is not required to provide one
// and will not provide one when there is no more history to paginate.
let token: string|null = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we store the token for next time? Or don't we ever need to do this more than once per room?

Copy link
Contributor Author

@Gnuxie Gnuxie May 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We paginate backwards, so we never want to use this token again because it is far in the past, next time we will want to be in the future. It might be an optimization to say "We should store the token to see if we get to this point next time and can finish early", but that is problematic to implement and may not work at all since there's no guarantee for you to end up in the same spot (align with really, you might be one event ahead or behind, but the next chunk is going to be 10 messages) in the stream next time.

Actually I'm just wrong

We can provide a to token i think

It still seems difficult to optimize for though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's difficult to optimize for, don't worry about it. At best, if you think that a comment would be useful for the next time someone ventures in this code, leave one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem right putting the comment here because you can't make this optimization to getMessagesByUserIn alone. This optimization would be specific to redactions because it is valid to want to paginate into history you have "seen before" in other situations.

src/utils.ts Outdated
}

function backfill(from: string) {
function backfill(from: string|null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Could you take the opportunity to document?

@Gnuxie Gnuxie requested a review from Yoric May 20, 2022 16:04
Copy link
Contributor

@Yoric Yoric left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!
Let's keep on making Mjölnir faster :)

}
// We check that we have the token because rooms/messages is not required to provide one
// and will not provide one when there is no more history to paginate.
let token: string|null = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's difficult to optimize for, don't worry about it. At best, if you think that a comment would be useful for the next time someone ventures in this code, leave one.

src/utils.ts Outdated
} else {
throw new Error(`Internal Error: rooms/initialSync did not return a pagination chunk for ${roomId}, this is not normal and if it is we need to stop using it. See roomInitialSync() for why we are using it.`);
}
// This check exists only becuase of a Synapse compliance bug https://github.com/matrix-org/synapse/issues/12102.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "because".

@Gnuxie Gnuxie merged commit 558cbb3 into main May 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants