Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Implement device lists updates over federation #1857

Merged
merged 19 commits into from Jan 30, 2017

Conversation

Projects
None yet
2 participants
Owner

erikjohnston commented Jan 26, 2017

This implements both the server and client side portions of device list update notifications. See the google doc for more information.

Should help fix vector-im/riot-web#2305

erikjohnston added some commits Jan 25, 2017

@erikjohnston erikjohnston requested a review from NegativeMjark Jan 26, 2017

erikjohnston added some commits Jan 26, 2017

@@ -27,6 +29,21 @@ class DeviceHandler(BaseHandler):
def __init__(self, hs):
super(DeviceHandler, self).__init__(hs)
+ self.hs = hs
@NegativeMjark

NegativeMjark Jan 26, 2017

Contributor

I've been doing self.is_mine_id = hs.is_mine_id in new code in a mostly futile attempt to avoid having self.hs scattered throughout the handlers.

@erikjohnston

erikjohnston Jan 27, 2017

Owner

I'm not sure I necessarily like that, as its not clear that is_mine_id isn't a local function.

@NegativeMjark

NegativeMjark Jan 27, 2017

Contributor

It's kind of nice to kill of the ```self.hs`` as it means that all the accesses to the homeserver object happen during start up rather than during the execution.

Eh, the usage count is now about 50/50 so I guess it doesn't matter that much which style you pick:

$ grep -r self.hs.is_mine synapse | wc -l
34
$ grep -r self.is_mine synapse | wc -l
29
@@ -33,26 +34,23 @@ def store_device(self, user_id, device_id,
user_id (str): id of user associated with the device
device_id (str): id of device
initial_device_display_name (str): initial displayname of the
- device
- ignore_if_known (bool): ignore integrity errors which mean the
@NegativeMjark

NegativeMjark Jan 26, 2017 edited

Contributor

You haven't actually removed the ignore_if_known argument.

synapse/storage/devices.py
@@ -139,3 +137,374 @@ def get_devices_by_user(self, user_id):
)
defer.returnValue({d["device_id"]: d for d in devices})
+
+ def get_device_list_remote_extremity(self, user_id):
@NegativeMjark

NegativeMjark Jan 26, 2017

Contributor

Would "last_stream_id" be a better name than "extremity"?

synapse/storage/devices.py
+ )
+
+ def mark_remote_user_device_list_as_unsubscribed(self, user_id):
+ return self._simple_delete(
@NegativeMjark

NegativeMjark Jan 26, 2017

Contributor

Could we have some docstring for this.

synapse/storage/devices.py
+ txn, [(user_id, None)], include_all_devices=True
+ )
+
+ for user_id, user_devices in devices.iteritems():
@NegativeMjark

NegativeMjark Jan 26, 2017

Contributor

This might be clearer as an if statement.

synapse/storage/devices.py
+ now_stream_id):
+ sql = """
+ SELECT user_id, device_id, max(stream_id) FROM device_lists_outbound_pokes
+ WHERE destination = ? AND stream_id > ? AND stream_id <= ? AND sent = ?
@NegativeMjark

NegativeMjark Jan 26, 2017

Contributor

Maybe write this as ? < stream_id AND stream_id <= ?

@@ -33,7 +31,7 @@ def set_e2e_device_keys(self, user_id, device_id, time_now, json_bytes):
}
)
- def get_e2e_device_keys(self, query_list):
+ def get_e2e_device_keys(self, query_list, include_all_devices=False):
@NegativeMjark

NegativeMjark Jan 26, 2017

Contributor

Whats the include_all_devices parameter for? and could it have some docstring?

synapse/storage/devices.py
+
+ results = []
+ for user_id, user_devices in devices.iteritems():
+ txn.execute(prev_sent_id_sql, (destination, user_id, True))
@NegativeMjark

NegativeMjark Jan 26, 2017

Contributor

I assume you are binding the constant True here rather than putting a literal in the SQL query as some sort of postgresql vs sqlite compatibility hack. Maybe throw in a comment to that effect?

synapse/storage/devices.py
+ SELECT coalesce(max(stream_id), 0) as stream_id
+ FROM device_lists_outbound_pokes
+ WHERE destination = ? AND user_id = ? AND sent = ?
+ """
@NegativeMjark

NegativeMjark Jan 26, 2017 edited

Contributor

Is it reasonable to assume that the number of entries in the device_lists_outbound_pokes per user_id is small? Do we think that the (destination, user_id) index is good enough here?

@erikjohnston

erikjohnston Jan 27, 2017

Owner

This is only going to be a problem for remote servers that have been down for ages, and so this table gets filled up.

I think that should probably be handled in a different way. Maybe a clean up job that deletes everything but the most recent poke if the entries are older than a day?

erikjohnston added some commits Jan 27, 2017

synapse/storage/devices.py
@@ -284,6 +285,8 @@ def _get_devices_by_remote_txn(self, txn, destination, from_stream_id,
results = []
for user_id, user_devices in devices.iteritems():
+ # We bind literal True, as its database dependent how booleans are
+ # handled.
@NegativeMjark

NegativeMjark Jan 27, 2017 edited

Contributor

I might prefer something more along the lines of.

# postgresql and sqlite use different formats for boolean literals,
# so we pass True as a query parameter rather than including a
# literal True in the query string itself
" d.display_name AS device_display_name, "
" k.key_json"
- " FROM e2e_device_keys_json k"
- " LEFT JOIN devices d ON d.user_id = k.user_id"
@NegativeMjark

NegativeMjark Jan 27, 2017 edited

Contributor

Presumably it's safe to change this from a LEFT JOIN to an INNER JOIN because of https://github.com/matrix-org/synapse/blob/master/synapse/storage/schema/delta/33/devices_for_e2e_keys.sql

Contributor

NegativeMjark commented Jan 27, 2017

Other than maybe tweaking the wording in the comments LGTM

erikjohnston added some commits Jan 27, 2017

synapse/app/synchrotron.py
+ user_index = stream["field_names"].index("user_id")
+
+ for row in stream["rows"]:
+ logger.info("Handling device list row: %r", row)
@NegativeMjark

NegativeMjark Jan 30, 2017

Contributor

Isn't this logging a bit noisy?

synapse/storage/devices.py
@@ -458,6 +465,21 @@ def get_user_whose_devices_changed(self, from_key):
rows = yield self._execute("get_user_whose_devices_changed", None, sql, from_key)
defer.returnValue(set(row["user_id"] for row in rows))
+ def get_users_and_hosts_device_list_changes(self, from_key):
@NegativeMjark

NegativeMjark Jan 30, 2017

Contributor

I wonder if the name of this method should indicate that this is for outbound pokes?

Contributor

NegativeMjark commented Jan 30, 2017

Structure looks sensible to me. Might want to tweak some of the method names and tone down the logging, but otherwise LGTM

erikjohnston added some commits Jan 30, 2017

@erikjohnston erikjohnston merged commit 9636b24 into develop Jan 30, 2017

7 of 8 checks passed

Sytest Dendron (Merged PR) Build finished.
Details
Sytest Dendron (Commit) Build #1434 origin/erikj/device_list_stream succeeded in 10 min
Details
Sytest Postgres (Commit) Build #2252 origin/erikj/device_list_stream succeeded in 7 min 34 sec
Details
Sytest Postgres (Merged PR) Build finished.
Details
Sytest SQLite (Commit) Build #2319 origin/erikj/device_list_stream succeeded in 6 min 7 sec
Details
Sytest SQLite (Merged PR) Build finished.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment