Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Deduplicate presence updates before sending them to application services #11140

Closed
wants to merge 4 commits into from

Conversation

anoadragon453
Copy link
Member

@anoadragon453 anoadragon453 commented Oct 20, 2021

Mostly addresses problem 2 of #10836 (comment).

We calculate presence updates for application services based on the
users that application service is interested in. For each of these
users, we determine which presence updates they are set to receive,
compile that into a list, and then send every update from the list
to the application service.

However, because a single presence update can cause a notification
for many different users, we're likely to end up with lots of
duplicated presence updates being collected here. Currently, all of
these are sent to the application service.

By using a Set, this deduplication happens automatically.

We calculate presence updates for application services based on the
users that application service is interested in. For each of these
users, we determine which presence updates they are set to receive,
compile that into a list, and then send every update from the list
to the application service.

However, because a single presence update can cause a notification
for many different users, we're likely to end up with lots of
duplicated presence updates being collected here. Currently, all of
these are sent to the application service.

By using a Set, this deduplication happens automatically.
This wasn't necessary, but mypy complained that 'events' could be both
a List and a Set of JsonDict. So rather than define it with a Union type
as such, I figured it'd be best to convert the other methods to use Set
as well. We may even eliminate other duplicates in the process.
@anoadragon453 anoadragon453 marked this pull request as ready for review January 14, 2022 17:45
@anoadragon453 anoadragon453 requested a review from a team as a code owner January 14, 2022 17:45
Copy link
Member

@clokep clokep left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks plausible, but I have some questions since I'm not super familiar with this code.

@@ -427,7 +427,7 @@ async def _handle_presence(
from_key=from_key,
)
time_now = self.clock.time_msec()
events.extend(
events.update(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this really be a set or do we want to de-duplicate based on user ID?

@@ -491,7 +491,7 @@ async def get_new_events_as(
):
continue

events.append(self._make_event_for(room_id))
events.add(self._make_event_for(room_id))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it being a set might not de-duplicate a lot since the object embeds a list of user IDs?

@DMRobertson
Copy link
Contributor

Looks like we forgot about this one. Is it worth resurrecting @anoadragon453?

@anoadragon453
Copy link
Member Author

Let's close it for now since it's ancient. One day I may return to finish it up.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants