Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use SledStore to store full timeline #288

Closed
wants to merge 7 commits into from

Conversation

jsparber
Copy link
Contributor

@jsparber jsparber commented Jun 28, 2021

This implements storing and retrieving the full timeline via the SledStore.

When a sync or message response is received the events are written to the store in the same direction (Backward).
The implementation merges consecutive batches as well as overlapping batches into a single batch. For each batch that isn't consecutive the store keeps track of the prev_batch and next_batch of the batch. If a disconnected batch (a response that isn't possible to be merge to an already known batch) is received a new batch is added to the store and the prev_batch and next_batch is tracked separately. It may be possible to merge the batches at a later point, but it would require moving the entire batch from one location to an other one inside the store which may be expensive, therefore I would keep them as distinct batches that can grow independently.

When reading events, the store automatically continues reading events from an other known batch whenever possible. Once the end of the stored timeline is reached, the found events are returned and if not all requested events are found a token is added to the response that can be used to retrieve more events from the homeserver.

Please note that this doesn't implement storing the timeline to the MemoryStore

Fixes: #138

@jsparber
Copy link
Contributor Author

cc: @DevinR528

@ShadowJonathan
Copy link
Contributor

While i like the idea, i'd love to know what exactly a good reason would be for the SDK to store the timeline locally.

Technically, the timeline on the server and the timeline "locally" can become out of sync once more DAG events converge and "slot" into historical timelines. When an author expects this to happen, they might want to back-paginate to collect all of these events (somehow) and display them appropriately. Or the author might expect what a server ought to do; display the correct topological ordering of the events on a request.

This is good for caching, but I don't think its good as a Source of Truth, as the historical timeline is subject to change once federation converges, and it's relatively easy for this to happen more than 50 events ago.

Even then, duplicate events are easy to have here as a server might "append" a converged event to a sync, which then gets persisted "late" into the timeline, and when the client might back-paginate to get more events, it might encounter both the persisted "late" variant of the event, and the "correct" variant the server might supply in that moment.

@jsparber
Copy link
Contributor Author

jsparber commented Jun 29, 2021

While i like the idea, i'd love to know what exactly a good reason would be for the SDK to store the timeline locally.

Requesting events from the server can be slow. Especially for encrypted rooms we need a local stored timeline so that we don't need to decrypt events every time the user views older messages and there is a hard requirement for a local stored timeline for the search in encrypted rooms.

Honestly i forgot about about de-duplicating events. I will add it.

@ShadowJonathan
Copy link
Contributor

Is there a way to only set the SDK to store some rooms locally? i.e only DMs, like you said.

@jsparber
Copy link
Contributor Author

Is there a way to only set the SDK to store some rooms locally? i.e only DMs, like you said.

Currently no, it could be added at a later point, but i don't really see a good reason to not store the timeline.

matrix_sdk_base/src/store/sled_store/mod.rs Outdated Show resolved Hide resolved
matrix_sdk_base/src/store/sled_store/mod.rs Outdated Show resolved Hide resolved
matrix_sdk_base/src/client.rs Outdated Show resolved Hide resolved
@DevinR528
Copy link
Contributor

Ugh sorry to clutter everything up with review comments instead of "request changes" reviews that you could actually close. if you what to leave a not on the comment I can delete it when you ready or whatever.

@jsparber
Copy link
Contributor Author

jsparber commented Jul 1, 2021

Ugh sorry to clutter everything up with review comments instead of "request changes" reviews that you could actually close. if you what to leave a not on the comment I can delete it when you ready or whatever.

don't really know the difference :)

@jsparber jsparber force-pushed the messages-api branch 10 times, most recently from 9102f39 to e217d37 Compare July 8, 2021 09:34
@jsparber jsparber force-pushed the messages-api branch 2 times, most recently from 36dee64 to de04124 Compare July 16, 2021 09:52
@jsparber
Copy link
Contributor Author

i added now deduplication of events. And the removal of redacted events from the store. I think this is good now for a final review.

Copy link
Contributor

@poljar poljar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems mainly sensible, though a bit of a cleanup of the concepts and splitting up some methods seems like it could make this nicer.

matrix_sdk/src/room/common.rs Outdated Show resolved Hide resolved
matrix_sdk/src/room/common.rs Outdated Show resolved Hide resolved
matrix_sdk/src/room/common.rs Outdated Show resolved Hide resolved
matrix_sdk/src/room/common.rs Outdated Show resolved Hide resolved
matrix_sdk/src/room/common.rs Outdated Show resolved Hide resolved
})
.and_then(|e| e.event.deserialize().ok())
{
if let Ok(Some(room_version)) = self.get_room_version(room)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care about this? Can't we just use the latest room version?

Copy link
Contributor Author

@jsparber jsparber Aug 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure but redaction changes based on the room version and different values are stripped. From reading the specs it looks like only m.room.aliases has changed in version 6. Before version 6 it allowed to keep aliases after redaction. I think the room version doesn't have any implication for the store, but we may not strip the correct data.

matrix_sdk_base/src/store/sled_store/mod.rs Outdated Show resolved Hide resolved
matrix_sdk_base/src/store/sled_store/mod.rs Outdated Show resolved Hide resolved
matrix_sdk_base/src/store/sled_store/mod.rs Outdated Show resolved Hide resolved
matrix_sdk_base/src/store/sled_store/mod.rs Outdated Show resolved Hide resolved
@jsparber jsparber force-pushed the messages-api branch 4 times, most recently from 185b688 to 6ea5d94 Compare August 4, 2021 14:05
@ftilde
Copy link
Contributor

ftilde commented Nov 10, 2021

After the previous discussion, I fear that with the current matrix spec and implementations, implementing a local message store is (at least almost) useless. I've tried to summarize my thoughts below. Of course, feel free to disagree! This is based on my current (possibly flawed) understanding of the ordering of messages returned by syncs and messages: As far as I understand there are no guarantees whatsoever about the ordering of one vs. the other.

Let the following be the "true" timeline, i.e. a sequence of events with the ordering decided by the server (after reordering, what would returned by messages at a time infinitely far in the future.)

most recent <-> older
A001..A500 B001..B500 C D001..D500

Then, let this be the ordering in which events arrive at (your) homeserver. Notice that event C is delayed significantly for some reason (temporary disconnection of the homeserver of the user who sent C or something like that).

most recent <-> older
A001..A500 C B001..B500 D001..D500

Now assume that your client is running when D500 arrives, but closed after receiving B001 via sync. B001-D500 are now in the local store. In particular, it did not receive C and is thus not in the store.

After A001..A500 C have arrived at the (your) homeserver, the client is restarted. Some events A001..A??? are recevied at the first sync. Events A???(+1)..A500 can be received via the sync token and calls to messages. After A500, B001..BXYX are returned via messages, but (due to the large number of messages between B001 and C) C is very likely not in the batch. What should the client do now? Since B001 is already in the store it probably does not make sense to query further messages. But then it never displays C! The only way to make sure to receive all messages is to re-fetch all messages until the beginning of the timeline. This behavior, however, would make storing the messages useless. (Or at least almost useless: The single benefit I can think of would be to have an offline copy of a once accurate timeline.)

@ShadowJonathan does this describe your concerns expressed above as well?

@ShadowJonathan
Copy link
Contributor

Yes, as currently most of this behaviour is undefined in the spec, all homeservers take different approaches with the sync, and you cannot have a guaranteed ordering story like that. I saw that coming when I first saw this PR, so I still really recommend against merging it.

At most, an event-id -> content cache would work, not a timeline store like this.

@agraven
Copy link
Contributor

agraven commented Nov 11, 2021

While do agree that the raised issue is a concern to be considered, my main concern is that not merging this is going to lead to most client developers consuming this library implementing their own implementation of this feature, which in most cases won't be put through as much scrutiny as an implementation in the matrix-sdk itself, and additionally won't be able to draw on any of the benefits of access to internals that come from being part of the matrix-sdk crate itself.

@ftilde
Copy link
Contributor

ftilde commented Nov 11, 2021 via email

@ShadowJonathan
Copy link
Contributor

Keep in mind that messages can also be missed if the "limited"/"partial" flag was set (can't remember the exact name), which highlights a partial sync (for a particular room's timeline)

@poljar
Copy link
Contributor

poljar commented Nov 12, 2021

Such a gap would be detected by comparing the start/end tokens of two successive syncs, which should be the same if no messages were missed.

Keep in mind that messages can also be missed if the "limited"/"partial" flag was set (can't remember the exact name), which highlights a partial sync (for a particular room's timeline)

Those two are the same case, no? Either there is a gap and the limited flag is set and the tokens differ, or there isn't a gap in which case the limited flag is not set and the tokens match.

@jsparber jsparber force-pushed the messages-api branch 2 times, most recently from d72ddf3 to 116eddf Compare January 17, 2022 10:45
@ShadowJonathan
Copy link
Contributor

I believe https://github.com/matrix-org/matrix-doc/issues/3263 hasn't been mentioned yet, but it is relevant to this PR

@jsparber
Copy link
Contributor Author

jsparber commented Feb 3, 2022

With the current limits of the sync api, I think we can only implement a cache for a continues stream of events of the timeline. This will make the cache a little less efficient, but unfortunately I don't see any way around this. On duplicated events the local cache is dropped, because it's the indication that the order of the timeline on the server has changed. Till the developer requests a new stream we will keep the local order of events, and thus, we can just drop the second occurrence of the event.

For now I'm not considering implementing any cache for requests directly to get_message_events, get_context and get_room_event. Even though i think it could be possible to serve some requests locally quite easily.

In specific this will solve the following issues we faced so far in this MR:

  • No gaps in the returned and stored timeline
  • No issue with duplicated events (since the second occurrence of the events can be dropped from the stream)
  • No overlapping batches, because we always use the proper token from the previous request

The local cache is invalidated on:

  • A sync response contains a limited timeline [1]
  • We found duplicated events in the previously started stream. This ensures that we get the new and proper order of events next time the stream is started (the previously started stream won't be affected)

This will still give us a local cache of the timeline for:

  • Timeline before receiving a sync response (after the initial sync)
  • Offline use
  • Quicker loading time on cache hit.
  • Developer doesn't need to mess around with batch token

For API, i think we could use DoubleEndedStream. It's still an unstable feature, therefore, I would implement it by using two Stream, one forward stream and one backward stream starting from the current point in the timeline.

[1] Considerations about filling the gap:
In future we could try to request events to fill the gap but we still need to invalidate the cache on duplicated events. At least, we wouldn't have any issue with overlapping batches since get_message_events allows us to set the end token.

@ShadowJonathan
Copy link
Contributor

This looks promising, I want to point out that I'm very pleased with this conclusion, as initially I thought messing with the limitations was a non-starter, but delivering such a cache with the above promises seem to work, while also balancing it against developer and application ease, thanks a lot!

It looks like the force_auth is working correctly when restoring the
session, but it's not working with the call to client.login, so it
cannot be used.

The access_token is not available in the login query, so if the option
force_auth is set, this call will fail always. This patch just ignores
the force_auth if there's no session, so any query will work when
force_auth is true.

Fix matrix-org#488
@jsparber
Copy link
Contributor Author

jsparber commented Feb 14, 2022

closing in favor of #486

@jsparber jsparber closed this Feb 14, 2022
@dkasak
Copy link
Member

dkasak commented Aug 9, 2022

@jsparber, you seem to have linked to the wrong thing here. Could you fix the link? Nevermind, noticed the backref above and fixed it myself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Storing timelines in the state store