Implement device lists updates over federation #1857

Merged
merged 19 commits into from Jan 30, 2017

Projects

None yet

2 participants

@erikjohnston
Member

This implements both the server and client side portions of device list update notifications. See the google doc for more information.

Should help fix vector-im/riot-web#2305

erikjohnston added some commits Jan 25, 2017
@erikjohnston erikjohnston Add basic implementation of local device list changes
2367c55
@erikjohnston erikjohnston Fix up sending of m.device_list_update edus
51e9fe3
@erikjohnston erikjohnston Implement device key caching over federation
c974116
@erikjohnston erikjohnston Add comments
fbfad76
@erikjohnston erikjohnston Handle users leaving rooms
76d40f4
@erikjohnston erikjohnston requested a review from NegativeMjark Jan 26, 2017
erikjohnston added some commits Jan 26, 2017
@erikjohnston erikjohnston Fix on sqlite: use left rather than outer join
31aca55
@erikjohnston erikjohnston Fix unit tests
b3e1f2a
@@ -27,6 +29,21 @@ class DeviceHandler(BaseHandler):
def __init__(self, hs):
super(DeviceHandler, self).__init__(hs)
+ self.hs = hs
@NegativeMjark
NegativeMjark Jan 26, 2017 Contributor

I've been doing self.is_mine_id = hs.is_mine_id in new code in a mostly futile attempt to avoid having self.hs scattered throughout the handlers.

@erikjohnston
erikjohnston Jan 27, 2017 Member

I'm not sure I necessarily like that, as its not clear that is_mine_id isn't a local function.

@NegativeMjark
NegativeMjark Jan 27, 2017 Contributor

It's kind of nice to kill of the ```self.hs`` as it means that all the accesses to the homeserver object happen during start up rather than during the execution.

Eh, the usage count is now about 50/50 so I guess it doesn't matter that much which style you pick:

$ grep -r self.hs.is_mine synapse | wc -l
34
$ grep -r self.is_mine synapse | wc -l
29
@@ -33,26 +34,23 @@ def store_device(self, user_id, device_id,
user_id (str): id of user associated with the device
device_id (str): id of device
initial_device_display_name (str): initial displayname of the
- device
- ignore_if_known (bool): ignore integrity errors which mean the
@NegativeMjark
NegativeMjark Jan 26, 2017 edited Contributor

You haven't actually removed the ignore_if_known argument.

synapse/storage/devices.py
@@ -139,3 +137,374 @@ def get_devices_by_user(self, user_id):
)
defer.returnValue({d["device_id"]: d for d in devices})
+
+ def get_device_list_remote_extremity(self, user_id):
@NegativeMjark
NegativeMjark Jan 26, 2017 Contributor

Would "last_stream_id" be a better name than "extremity"?

synapse/storage/devices.py
+ )
+
+ def mark_remote_user_device_list_as_unsubscribed(self, user_id):
+ return self._simple_delete(
@NegativeMjark
NegativeMjark Jan 26, 2017 Contributor

Could we have some docstring for this.

synapse/storage/devices.py
+ txn, [(user_id, None)], include_all_devices=True
+ )
+
+ for user_id, user_devices in devices.iteritems():
@NegativeMjark
NegativeMjark Jan 26, 2017 Contributor

This might be clearer as an if statement.

synapse/storage/devices.py
+ now_stream_id):
+ sql = """
+ SELECT user_id, device_id, max(stream_id) FROM device_lists_outbound_pokes
+ WHERE destination = ? AND stream_id > ? AND stream_id <= ? AND sent = ?
@NegativeMjark
NegativeMjark Jan 26, 2017 Contributor

Maybe write this as ? < stream_id AND stream_id <= ?

@@ -33,7 +31,7 @@ def set_e2e_device_keys(self, user_id, device_id, time_now, json_bytes):
}
)
- def get_e2e_device_keys(self, query_list):
+ def get_e2e_device_keys(self, query_list, include_all_devices=False):
@NegativeMjark
NegativeMjark Jan 26, 2017 Contributor

Whats the include_all_devices parameter for? and could it have some docstring?

synapse/storage/devices.py
+
+ results = []
+ for user_id, user_devices in devices.iteritems():
+ txn.execute(prev_sent_id_sql, (destination, user_id, True))
@NegativeMjark
NegativeMjark Jan 26, 2017 Contributor

I assume you are binding the constant True here rather than putting a literal in the SQL query as some sort of postgresql vs sqlite compatibility hack. Maybe throw in a comment to that effect?

synapse/storage/devices.py
+ SELECT coalesce(max(stream_id), 0) as stream_id
+ FROM device_lists_outbound_pokes
+ WHERE destination = ? AND user_id = ? AND sent = ?
+ """
@NegativeMjark
NegativeMjark Jan 26, 2017 edited Contributor

Is it reasonable to assume that the number of entries in the device_lists_outbound_pokes per user_id is small? Do we think that the (destination, user_id) index is good enough here?

@erikjohnston
erikjohnston Jan 27, 2017 Member

This is only going to be a problem for remote servers that have been down for ages, and so this table gets filled up.

I think that should probably be handled in a different way. Maybe a clean up job that deletes everything but the most recent poke if the entries are older than a day?

erikjohnston added some commits Jan 27, 2017
@erikjohnston erikjohnston Remove unused param
f25a4a4
@erikjohnston erikjohnston Better name 888c59c
@erikjohnston erikjohnston User if rather than for 755adff
@erikjohnston erikjohnston SQL param ordering 738a286
@erikjohnston erikjohnston Comment c517a19
@erikjohnston erikjohnston Comment
84a35f3
synapse/storage/devices.py
@@ -284,6 +285,8 @@ def _get_devices_by_remote_txn(self, txn, destination, from_stream_id,
results = []
for user_id, user_devices in devices.iteritems():
+ # We bind literal True, as its database dependent how booleans are
+ # handled.
@NegativeMjark
NegativeMjark Jan 27, 2017 edited Contributor

I might prefer something more along the lines of.

# postgresql and sqlite use different formats for boolean literals,
# so we pass True as a query parameter rather than including a
# literal True in the query string itself
" d.display_name AS device_display_name, "
" k.key_json"
- " FROM e2e_device_keys_json k"
- " LEFT JOIN devices d ON d.user_id = k.user_id"
@NegativeMjark
NegativeMjark Jan 27, 2017 edited Contributor

Presumably it's safe to change this from a LEFT JOIN to an INNER JOIN because of https://github.com/matrix-org/synapse/blob/master/synapse/storage/schema/delta/33/devices_for_e2e_keys.sql

@NegativeMjark
Contributor

Other than maybe tweaking the wording in the comments LGTM

erikjohnston added some commits Jan 27, 2017
@erikjohnston erikjohnston Hook device list updates to replication
252b503
@erikjohnston erikjohnston Add ts column to device_lists_outbound_pokes
d1e1fd6
@erikjohnston erikjohnston Always use the latest stream_id, sent or unsent 7610020
@erikjohnston erikjohnston Clear out old destination pokes.
d360c97
synapse/app/synchrotron.py
+ user_index = stream["field_names"].index("user_id")
+
+ for row in stream["rows"]:
+ logger.info("Handling device list row: %r", row)
@NegativeMjark
NegativeMjark Jan 30, 2017 Contributor

Isn't this logging a bit noisy?

synapse/storage/devices.py
@@ -458,6 +465,21 @@ def get_user_whose_devices_changed(self, from_key):
rows = yield self._execute("get_user_whose_devices_changed", None, sql, from_key)
defer.returnValue(set(row["user_id"] for row in rows))
+ def get_users_and_hosts_device_list_changes(self, from_key):
@NegativeMjark
NegativeMjark Jan 30, 2017 Contributor

I wonder if the name of this method should indicate that this is for outbound pokes?

@NegativeMjark
Contributor

Structure looks sensible to me. Might want to tweak some of the method names and tone down the logging, but otherwise LGTM

erikjohnston added some commits Jan 30, 2017
@erikjohnston erikjohnston Remove debug logging 4ac363a
@erikjohnston erikjohnston Rename func
3670025
@NegativeMjark

LGTM

@erikjohnston erikjohnston merged commit 9636b24 into develop Jan 30, 2017

7 of 8 checks passed

Sytest Dendron (Merged PR) Build finished.
Details
Sytest Dendron (Commit) Build #1434 origin/erikj/device_list_stream succeeded in 10 min
Details
Sytest Postgres (Commit) Build #2252 origin/erikj/device_list_stream succeeded in 7 min 34 sec
Details
Sytest Postgres (Merged PR) Build finished.
Details
Sytest SQLite (Commit) Build #2319 origin/erikj/device_list_stream succeeded in 6 min 7 sec
Details
Sytest SQLite (Merged PR) Build finished.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment