Reduce federation replication traffic #2115

Merged
erikjohnston merged 9 commits into develop from erikj/dedupe_federation_repl on Apr 12, 2017

Owner

erikjohnston commented Apr 10, 2017

This is mainly done by moving the calculation of where to send presence updates from the presence handler to the transaction queue, so we only need to send the presence event (and not the destinations) across the replication connection. Before we were duplicating by sending the full state across once per destination.
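
As a rough illustration of where the saving comes from, the sketch below compares how many rows cross the replication connection under each approach. The helper names (`rows_before`, `rows_after`) and the plain dicts standing in for presence states are hypothetical and are not taken from Synapse.

```python
# Hypothetical illustration only; none of these helpers exist in Synapse.

def rows_before(states, destinations_for):
    """Old approach: destinations are resolved before replication, so the
    full state is emitted once per destination."""
    return [
        (destination, state)
        for state in states
        for destination in destinations_for(state["user_id"])
    ]


def rows_after(states):
    """New approach: each state is emitted once; the receiving side (the
    transaction queue) works out the destinations itself."""
    return list(states)


states = [{"user_id": "@alice:example.com", "presence": "online"}]
remote_hosts = ["one.example.org", "two.example.org", "three.example.org"]

before = rows_before(states, lambda user_id: remote_hosts)
after = rows_after(states)
```

With three interested remote hosts, `before` holds three copies of the same state while `after` holds one.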

synapse/federation/transaction_queue.py
+ continue
+
+ host = get_domain_from_id(user_id)
+ hosts_and_states.append(([host], local_states))
@erikjohnston

erikjohnston Apr 10, 2017

Owner

The above is copied verbatim from _get_interested_parties

+ state.user_id: state for state in states
+ })
+
+ preserve_fn(self._attempt_new_transaction)(destination)
@erikjohnston

erikjohnston Apr 10, 2017

Owner

The above is copied verbatim from the old send_presence function

synapse/federation/transaction_queue.py
@@ -224,17 +227,95 @@ def _send_pdu(self, pdu, destinations):
self._attempt_new_transaction, destination
)
- def send_presence(self, destination, states):
- if not self.can_send_to(destination):
+ @preserve_fn
@erikjohnston

erikjohnston Apr 10, 2017

Owner

Is this an acceptable thing to do style-wise?

@richvdh

richvdh Apr 11, 2017

Member

I think so. I think a comment saying `# the caller should not yield on this` on this line would help.

@erikjohnston erikjohnston requested a review from richvdh Apr 10, 2017

looks broadly plausible, but I had a bit of a nightmare trying to understand what was going on. A lot of that is because of poor comments in the existing code, so I had to go digging to understand things. Accordingly, I've made a load of requests that you comment things better; I realise that some of them aren't changing, but in the interests of trying to make work in this area less awful in the first place I'd appreciate it if we could take the opportunity to improve things.

I think moving get_interested_parties out would really help.

synapse/app/federation_sender.py
+ # way that can be replicated. This means that we don't have a way to
+ # invalidate the cache correctly.
+ # This is fine since in practice nobody uses the presence list stuff...
+ get_presence_list_accepted = PresenceStore.__dict__[
@richvdh

richvdh Apr 11, 2017

Member

Shouldn't we be doing this in a SlavedPresenceStore?

synapse/app/synchrotron.py
- states, calculate_remote_hosts=False
- )
- room_ids_to_states, users_to_states, _ = parties
+ parties = yield self._get_interested_parties(states)
@richvdh

richvdh Apr 11, 2017

Member

since we're changing the signature of _get_interested_parties anyway, it would be a good time to rename it to reflect the fact it is being used outside PresenceHandler.

see below: I think _get_interested_parties needs to move anyway.

synapse/federation/send_queue.py
@@ -187,18 +190,14 @@ def send_edu(self, destination, edu_type, content, key=None):
self.notifier.on_new_replication_data()
- def send_presence(self, destination, states):
+ def send_presence(self, states):
"""As per TransactionQueue"""
@richvdh

richvdh Apr 11, 2017

Member

can you doc the type of states anyway, please

synapse/federation/send_queue.py
- state.user_id: state
- for state in states
- })
+ local_states = filter(lambda s: self.is_mine_id(s.user_id), states)
@richvdh

richvdh Apr 11, 2017

Member

can you comment to explain why we are filtering here. I guess it's so that we only send out presence updates for our own users, but also: why would we get as far as here with an update for someone else?
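
For readers unfamiliar with the filter in question, here is a minimal sketch of keeping only local users' states. The `UserPresenceState` namedtuple, `get_domain_from_id` and `filter_local_states` below are simplified stand-ins written for illustration, not Synapse's actual implementations.

```python
from collections import namedtuple

# Simplified stand-in for Synapse's UserPresenceState (assumption).
UserPresenceState = namedtuple("UserPresenceState", ["user_id", "state"])


def get_domain_from_id(user_id):
    """Return the server name part of an ID such as "@alice:example.com"."""
    # Split on the first colon only, so a port in the server name survives.
    return user_id.split(":", 1)[1]


def filter_local_states(states, server_name):
    """Keep only the states that belong to users on `server_name`."""
    return [s for s in states if get_domain_from_id(s.user_id) == server_name]


states = [
    UserPresenceState("@alice:example.com", "online"),
    UserPresenceState("@bob:remote.example.org:8448", "offline"),
]
local_states = filter_local_states(states, "example.com")
```

The list comprehension is used instead of `filter(lambda ...)` because `filter` returns a list on Python 2 but a lazy iterator on Python 3.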

- self.presence_changed[pos] = [
- (destination, state.user_id) for state in states
- ]
+ self.presence_map.update({state.user_id: state for state in local_states})
@richvdh

richvdh Apr 11, 2017

Member

can you document what presence_map and presence_changed mean in the constructor so I can understand if this is a sane thing?
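
One plausible reading of the two fields, inferred only from the diff above and not confirmed by the source, is that `presence_map` holds the latest state per user while `presence_changed` records which users changed at each stream position. The class name `PresenceSendQueue`, the `pos` counter and `rows_since` are hypothetical.

```python
# Hypothetical sketch; the field meanings are inferred, not confirmed.

class PresenceSendQueue:
    def __init__(self):
        # user_id -> latest presence state for that user
        self.presence_map = {}
        # stream position -> list of user_ids whose presence changed there
        self.presence_changed = {}
        self.pos = 0

    def send_presence(self, states):
        """Record new states once, with no per-destination copies."""
        self.pos += 1
        self.presence_map.update({s["user_id"]: s for s in states})
        self.presence_changed[self.pos] = [s["user_id"] for s in states]

    def rows_since(self, token):
        """Return (position, state) rows for changes after `token`."""
        rows = []
        for pos in sorted(self.presence_changed):
            if pos > token:
                for user_id in self.presence_changed[pos]:
                    rows.append((pos, self.presence_map[user_id]))
        return rows
```

Under this reading, a reader of the stream always sees the most recent state for a user, even for older positions.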

synapse/federation/transaction_queue.py
@@ -78,6 +78,7 @@ def __init__(self, hs):
self.pending_edus_by_dest = edus = {}
# Presence needs to be separate as we send single aggragate EDUs
+ self.pending_presence = {}
@richvdh

richvdh Apr 11, 2017

Member

could you document what all three of these fields mean and their types?

synapse/federation/transaction_queue.py
+ @preserve_fn
+ @defer.inlineCallbacks
+ def send_presence(self, states):
+ """Send the new presence states to the appropriate destinations.
@richvdh

richvdh Apr 11, 2017

Member

"Start sending" ... "if we are not already sending updates" or something

@richvdh

richvdh Apr 12, 2017

Member

you didn't like my suggestion?

synapse/federation/transaction_queue.py
+ # hosts in those rooms.
+ room_ids_to_states = {}
+ users_to_states = {}
+ for state in states.itervalues():
@richvdh

richvdh Apr 11, 2017

Member

please let's not c&p get_interested_parties here. How about moving this lump of code to the store or something?

(For super-mega-bonus points, make it a separate PR which precedes this one...)

synapse/federation/transaction_queue.py
+ for u in plist:
+ users_to_states.setdefault(u, []).append(state)
+
+ hosts_and_states = []
@richvdh

richvdh Apr 11, 2017

Member

comment on what this is going to be, please.

synapse/federation/transaction_queue.py
+
+ hosts_and_states = []
+ for room_id, states in room_ids_to_states.items():
+ local_states = filter(lambda s: self.is_mine_id(s.user_id), states)
@richvdh

richvdh Apr 11, 2017

Member

again, why are we filtering, and why would updates for remote users be getting in here? (and could we filter before deriving room_ids_to_states and users_to_states to avoid some work? happy if you want to leave this for now to avoid changing too much at once)

synapse/handlers/presence.py
@@ -615,12 +611,12 @@ def current_state_for_users(self, user_ids):
defer.returnValue(states)
@defer.inlineCallbacks
- def _get_interested_parties(self, states, calculate_remote_hosts=True):
+ def _get_interested_parties(self, states):
@richvdh

richvdh Apr 11, 2017

Member

what are we expecting states to be here? A list(UserPresenceState) representing presence updates, presumably?

synapse/handlers/presence.py
@@ -615,12 +611,12 @@ def current_state_for_users(self, user_ids):
defer.returnValue(states)
@defer.inlineCallbacks
- def _get_interested_parties(self, states, calculate_remote_hosts=True):
+ def _get_interested_parties(self, states):
"""Given a list of states return which entities (rooms, users, servers)
@richvdh

richvdh Apr 11, 2017

Member

no more servers.

synapse/handlers/presence.py
-
- host = get_domain_from_id(user_id)
- hosts_to_states.setdefault(host, []).extend(local_states)
-
# TODO: de-dup hosts_to_states, as a single host might have multiple
@richvdh

richvdh Apr 11, 2017

Member

think this comment is dead

synapse/handlers/presence.py
"""Sends state updates to remote servers.
Args:
- hosts_to_states (dict): Mapping `server_name` -> `[UserPresenceState]`
+ hosts_to_states (list): list(state)
@richvdh

richvdh Apr 11, 2017

Member

list(UserPresenceState), no?

@richvdh richvdh assigned erikjohnston and unassigned richvdh Apr 11, 2017

erikjohnston added some commits Apr 11, 2017

Owner

erikjohnston commented Apr 11, 2017

I've moved the get_interested_* functions out into standalone functions in the handlers/presence.py file, as it doesn't feel like the sort of thing that belongs in the storage layer
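
A minimal sketch of what a standalone version might look like, assuming the room and user groupings have already been computed. The lookups `hosts_in_room` and `is_mine_id` are passed in as plain callables here purely for illustration; the real functions and their signatures may differ.

```python
# Hypothetical sketch of a standalone helper; signature is an assumption.

def get_interested_remotes(room_ids_to_states, users_to_states,
                           hosts_in_room, is_mine_id):
    """Return a list of (hosts, states) rows: each row's states should be
    sent to each host in that row."""
    hosts_and_states = []

    for room_id, states in room_ids_to_states.items():
        # Only presence for our own users is sent over federation.
        local_states = [s for s in states if is_mine_id(s["user_id"])]
        if not local_states:
            continue
        hosts_and_states.append((hosts_in_room(room_id), local_states))

    for user_id, states in users_to_states.items():
        local_states = [s for s in states if is_mine_id(s["user_id"])]
        if not local_states:
            continue
        host = user_id.split(":", 1)[1]  # server name part of the user ID
        hosts_and_states.append(([host], local_states))

    return hosts_and_states
```

Because the function takes its dependencies as arguments rather than reading them from `self`, both the presence handler and the transaction queue could call it without sharing a class.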

@erikjohnston erikjohnston assigned richvdh and unassigned erikjohnston Apr 11, 2017

synapse/federation/transaction_queue.py
@@ -78,7 +78,18 @@ def __init__(self, hs):
self.pending_edus_by_dest = edus = {}
# Presence needs to be separate as we send single aggragate EDUs
@richvdh

richvdh Apr 12, 2017

Member

not quite sure what this comment means, any more

synapse/federation/transaction_queue.py
+ # to be sent out by user_id. Entries here get processed and put in
+ # pending_presence_by_dest
+ self.pending_presence = {}
+ # Map of destination -> user_id -> UserPresenceState of pending presence
@richvdh

richvdh Apr 12, 2017

Member

could do with a blank line here
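
The two-stage flow described in those code comments could be sketched as follows. Only the field names `pending_presence` and `pending_presence_by_dest` come from the diff; the class name, `process_pending` and the `destinations_for` lookup are hypothetical.

```python
# Hypothetical sketch of the two pending-presence stages.

class PresenceQueue:
    def __init__(self, destinations_for):
        # Callable user_id -> iterable of remote hosts (assumed lookup).
        self.destinations_for = destinations_for
        # user_id -> state, waiting to have destinations worked out
        self.pending_presence = {}
        # destination -> user_id -> state, ready to go into a transaction
        self.pending_presence_by_dest = {}

    def send_presence(self, states):
        """Queue states by user_id; a later update replaces an earlier one."""
        self.pending_presence.update({s["user_id"]: s for s in states})

    def process_pending(self):
        """Move queued states into the per-destination map."""
        # Swap the dict out first so updates queued meanwhile are kept.
        states, self.pending_presence = self.pending_presence, {}
        for user_id, state in states.items():
            for destination in self.destinations_for(user_id):
                by_user = self.pending_presence_by_dest.setdefault(destination, {})
                by_user[user_id] = state
```

Keying both maps by `user_id` means repeated updates for one user collapse to the latest state before anything is sent.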

synapse/handlers/presence.py
@@ -669,7 +641,7 @@ def _push_to_remotes(self, states):
"""Sends state updates to remote servers.
Args:
- hosts_to_states (list): list(state)
+ hosts_to_states (list(UserPresenceState))
@richvdh

richvdh Apr 12, 2017

Member

s/hosts_to_states/states/. sorry for not spotting that one before

synapse/handlers/presence.py
+ each row the list of UserPresenceState should be sent to each
+ destination
+ """
+ hosts_and_states = [] # Final result to return
@richvdh

richvdh Apr 12, 2017

Member

now that you have an (excellent) description of the return value of the function, this probably doesn't really need a comment, but it's harmless enough

synapse/handlers/presence.py
+ # hosts in those rooms.
+ room_ids_to_states = {}
+ users_to_states = {}
+ for state in states.itervalues():
@richvdh

richvdh Apr 12, 2017

Member

can we not use get_interested_parties here?

@erikjohnston

erikjohnston Apr 12, 2017

Owner

Heh, for some reason I got it stuck in my head that they were different

Member

richvdh commented Apr 12, 2017

> I've moved the get_interested_* functions out into standalone functions in the handlers/presence.py file, as it doesn't feel like the sort of thing that belongs in the storage layer

Fine by me. Looks much better now.

@richvdh richvdh assigned erikjohnston and unassigned richvdh Apr 12, 2017

erikjohnston added some commits Apr 12, 2017

@erikjohnston erikjohnston assigned richvdh and unassigned erikjohnston Apr 12, 2017

lgtm

@richvdh richvdh assigned erikjohnston and unassigned richvdh Apr 12, 2017

@erikjohnston erikjohnston merged commit 247c736 into develop Apr 12, 2017

8 checks passed

Sytest Dendron (Commit) Build #1963 origin/erikj/dedupe_federation_repl succeeded in 8 min 55 sec
Sytest Dendron (Merged PR) Build finished.
Sytest Postgres (Commit) Build #2793 origin/erikj/dedupe_federation_repl succeeded in 7 min 21 sec
Sytest Postgres (Merged PR) Build finished.
Sytest SQLite (Commit) Build #2862 origin/erikj/dedupe_federation_repl succeeded in 5 min 49 sec
Sytest SQLite (Merged PR) Build finished.
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed

psaavedra added a commit to psaavedra/synapse that referenced this pull request May 19, 2017

Merge tag 'v0.21.0' into v0.21.0_no_federate_by_default
Changes in synapse v0.21.0 (2017-05-18)
=======================================

No changes since v0.21.0-rc3

Changes in synapse v0.21.0-rc3 (2017-05-17)
===========================================

Features:

* Add per user rate-limiting overrides (PR #2208)
* Add config option to limit maximum number of events requested by ``/sync``
  and ``/messages`` (PR #2221) Thanks to @psaavedra!

Changes:

* Various small performance fixes (PR #2201, #2202, #2224, #2226, #2227, #2228,
  #2229)
* Update username availability checker API (PR #2209, #2213)
* When purging, don't de-delta state groups we're about to delete (PR #2214)
* Documentation to check synapse version (PR #2215) Thanks to @hamber-dick!
* Add an index to event_search to speed up purge history API (PR #2218)

Bug fixes:

* Fix API to allow clients to upload one-time-keys with new sigs (PR #2206)

Changes in synapse v0.21.0-rc2 (2017-05-08)
===========================================

Changes:

* Always mark remotes as up if we receive a signed request from them (PR #2190)

Bug fixes:

* Fix bug where users got pushed for rooms they had muted (PR #2200)

Changes in synapse v0.21.0-rc1 (2017-05-08)
===========================================

Features:

* Add username availability checker API (PR #2183)
* Add read marker API (PR #2120)

Changes:

* Enable guest access for the 3pl/3pid APIs (PR #1986)
* Add setting to support TURN for guests (PR #2011)
* Various performance improvements (PR #2075, #2076, #2080, #2083, #2108,
  #2158, #2176, #2185)
* Make synctl a bit more user friendly (PR #2078, #2127) Thanks @APwhitehat!
* Replace HTTP replication with TCP replication (PR #2082, #2097, #2098,
  #2099, #2103, #2014, #2016, #2115, #2116, #2117)
* Support authenticated SMTP (PR #2102) Thanks @DanielDent!
* Add a counter metric for successfully-sent transactions (PR #2121)
* Propagate errors sensibly from proxied IS requests (PR #2147)
* Add more granular event send metrics (PR #2178)

Bug fixes:

* Fix nuke-room script to work with current schema (PR #1927) Thanks
  @zuckschwerdt!
* Fix db port script to not assume postgres tables are in the public schema
  (PR #2024) Thanks @jerrykan!
* Fix getting latest device IP for user with no devices (PR #2118)
* Fix rejection of invites to unreachable servers (PR #2145)
* Fix code for reporting old verify keys in synapse (PR #2156)
* Fix invite state to always include all events (PR #2163)
* Fix bug where synapse would always fetch state for any missing event (PR #2170)
* Fix a leak with timed out HTTP connections (PR #2180)
* Fix bug where we didn't time out HTTP requests to ASes  (PR #2192)

Docs:

* Clarify doc for SQLite to PostgreSQL port (PR #1961) Thanks @benhylau!
* Fix typo in synctl help (PR #2107) Thanks @HarHarLinks!
* ``web_client_location`` documentation fix (PR #2131) Thanks @matthewjwolff!
* Update README.rst with FreeBSD changes (PR #2132) Thanks @feld!
* Clarify setting up metrics (PR #2149) Thanks @encks!

@erikjohnston erikjohnston deleted the erikj/dedupe_federation_repl branch Oct 26, 2017
