This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Implement device lists updates over federation #1857

Merged: 19 commits, Jan 30, 2017

erikjohnston
Member

This implements both the server and client side portions of device list update notifications. See the google doc for more information.

Should help fix element-hq/element-web#2305

@@ -27,6 +29,21 @@ class DeviceHandler(BaseHandler):
    def __init__(self, hs):
        super(DeviceHandler, self).__init__(hs)

        self.hs = hs
Contributor

I've been doing self.is_mine_id = hs.is_mine_id in new code in a mostly futile attempt to avoid having self.hs scattered throughout the handlers.

Member Author

I'm not sure I necessarily like that, as it's not clear that is_mine_id isn't a local function.

Contributor

It's kind of nice to kill off the self.hs, as it means that all accesses to the homeserver object happen during start-up rather than during execution.

Eh, the usage count is now about 50/50 so I guess it doesn't matter that much which style you pick:

$ grep -r self.hs.is_mine synapse | wc -l
34
$ grep -r self.is_mine synapse | wc -l
29
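
For illustration, the two styles being compared could look roughly like the sketch below. FakeHomeServer, HandlerWithHs and HandlerWithBoundMethod are hypothetical stand-ins rather than Synapse classes, and the is_mine_id logic is a simplifying assumption; only the attribute names self.hs and is_mine_id come from the discussion above.

```python
class FakeHomeServer(object):
    """Hypothetical stand-in for the homeserver object (not Synapse's class)."""

    def __init__(self, hostname):
        self.hostname = hostname

    def is_mine_id(self, string):
        # Assumption for this sketch: an ID is local if its domain part matches.
        return string.split(":", 1)[1] == self.hostname


class HandlerWithHs(object):
    """Style 1: keep a reference to hs and look the method up on every call."""

    def __init__(self, hs):
        self.hs = hs

    def is_local(self, user_id):
        return self.hs.is_mine_id(user_id)


class HandlerWithBoundMethod(object):
    """Style 2: bind the method once at start-up; hs is not stored."""

    def __init__(self, hs):
        self.is_mine_id = hs.is_mine_id

    def is_local(self, user_id):
        return self.is_mine_id(user_id)


hs = FakeHomeServer("example.org")
print(HandlerWithHs(hs).is_local("@alice:example.org"))
print(HandlerWithBoundMethod(hs).is_local("@bob:elsewhere.org"))
```

Both handlers give the same answers. The second one makes the homeserver access visible only in __init__, at the cost of is_mine_id looking like a method defined on the handler itself, which is the objection raised above.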

@@ -33,26 +34,23 @@ def store_device(self, user_id, device_id,
user_id (str): id of user associated with the device
device_id (str): id of device
initial_device_display_name (str): initial displayname of the
device
ignore_if_known (bool): ignore integrity errors which mean the
Contributor

@NegativeMjark Jan 26, 2017

You haven't actually removed the ignore_if_known argument.

Member Author

done.

@@ -139,3 +137,374 @@ def get_devices_by_user(self, user_id):
)

defer.returnValue({d["device_id"]: d for d in devices})

def get_device_list_remote_extremity(self, user_id):
Contributor

Would "last_stream_id" be a better name than "extremity"?

)

def mark_remote_user_device_list_as_unsubscribed(self, user_id):
return self._simple_delete(
Contributor

Could we have a docstring for this?

txn, [(user_id, None)], include_all_devices=True
)

for user_id, user_devices in devices.iteritems():
Contributor

This might be clearer as an if statement.

now_stream_id):
sql = """
SELECT user_id, device_id, max(stream_id) FROM device_lists_outbound_pokes
WHERE destination = ? AND stream_id > ? AND stream_id <= ? AND sent = ?
Contributor

Maybe write this as ? < stream_id AND stream_id <= ?
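
As a quick check that the suggested ordering is purely a readability change, the sketch below runs both spellings against an in-memory SQLite table. The table layout, the sample rows, and the GROUP BY / ORDER BY clauses are assumptions made for the example; only the SELECT and WHERE lines are taken from the snippet above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: just the columns the query touches.
conn.execute(
    "CREATE TABLE device_lists_outbound_pokes ("
    " destination TEXT, stream_id INTEGER, user_id TEXT,"
    " device_id TEXT, sent BOOLEAN)"
)
conn.executemany(
    "INSERT INTO device_lists_outbound_pokes VALUES (?, ?, ?, ?, ?)",
    [
        ("remote.example", 1, "@a:hs", "D1", False),
        ("remote.example", 2, "@a:hs", "D1", False),
        ("remote.example", 3, "@a:hs", "D2", False),
        ("remote.example", 4, "@a:hs", "D2", False),
    ],
)

original = """
    SELECT user_id, device_id, max(stream_id) FROM device_lists_outbound_pokes
    WHERE destination = ? AND stream_id > ? AND stream_id <= ? AND sent = ?
    GROUP BY user_id, device_id
    ORDER BY device_id
"""
suggested = """
    SELECT user_id, device_id, max(stream_id) FROM device_lists_outbound_pokes
    WHERE destination = ? AND ? < stream_id AND stream_id <= ? AND sent = ?
    GROUP BY user_id, device_id
    ORDER BY device_id
"""

# Parameter order is unchanged: (destination, from_stream_id, now_stream_id, sent).
args = ("remote.example", 1, 3, False)
rows_original = conn.execute(original, args).fetchall()
rows_suggested = conn.execute(suggested, args).fetchall()
print(rows_original == rows_suggested)
```

The second form reads left to right as a range, from_stream_id < stream_id <= now_stream_id, which is presumably the point of the suggestion.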

@@ -33,7 +31,7 @@ def set_e2e_device_keys(self, user_id, device_id, time_now, json_bytes):
}
)

-    def get_e2e_device_keys(self, query_list):
+    def get_e2e_device_keys(self, query_list, include_all_devices=False):
Contributor

What's the include_all_devices parameter for? And could it have a docstring?

Member Author

done


results = []
for user_id, user_devices in devices.iteritems():
txn.execute(prev_sent_id_sql, (destination, user_id, True))
Contributor

I assume you are binding the constant True here rather than putting a literal in the SQL query as some sort of postgresql vs sqlite compatibility hack. Maybe throw in a comment to that effect?

SELECT coalesce(max(stream_id), 0) as stream_id
FROM device_lists_outbound_pokes
WHERE destination = ? AND user_id = ? AND sent = ?
"""
Contributor

@NegativeMjark Jan 26, 2017

Is it reasonable to assume that the number of entries in the device_lists_outbound_pokes per user_id is small? Do we think that the (destination, user_id) index is good enough here?

Member Author

This is only going to be a problem for remote servers that have been down for ages, and so this table gets filled up.

I think that should probably be handled in a different way. Maybe a clean up job that deletes everything but the most recent poke if the entries are older than a day?
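
A rough sketch of what such a clean-up job might look like, using SQLite for the demonstration. This is not code from the PR: the ts column (milliseconds), the function name prune_old_pokes, and the simplified schema are all assumptions made for illustration.

```python
import sqlite3

ONE_DAY_MS = 24 * 60 * 60 * 1000


def prune_old_pokes(conn, now_ms):
    """Delete pokes older than a day, but always keep the most recent poke
    for each (destination, user_id) pair, whatever its age."""
    conn.execute(
        """
        DELETE FROM device_lists_outbound_pokes
        WHERE ts < ?
          AND stream_id < (
            SELECT max(p.stream_id) FROM device_lists_outbound_pokes AS p
            WHERE p.destination = device_lists_outbound_pokes.destination
              AND p.user_id = device_lists_outbound_pokes.user_id
          )
        """,
        (now_ms - ONE_DAY_MS,),
    )


conn = sqlite3.connect(":memory:")
# Assumed schema; the ts column in particular is hypothetical.
conn.execute(
    "CREATE TABLE device_lists_outbound_pokes ("
    " destination TEXT, stream_id INTEGER, user_id TEXT,"
    " device_id TEXT, sent BOOLEAN, ts INTEGER)"
)
now_ms = 10 * ONE_DAY_MS
conn.executemany(
    "INSERT INTO device_lists_outbound_pokes VALUES (?, ?, ?, ?, ?, ?)",
    [
        # A destination that has been down for a long time.
        ("down.example", 1, "@a:hs", "D1", False, 0),
        ("down.example", 2, "@a:hs", "D1", False, ONE_DAY_MS),
        # A healthy destination with recent pokes.
        ("up.example", 3, "@a:hs", "D1", False, now_ms - 1000),
        ("up.example", 4, "@a:hs", "D1", False, now_ms),
    ],
)

prune_old_pokes(conn, now_ms)
remaining = [
    row[0]
    for row in conn.execute(
        "SELECT stream_id FROM device_lists_outbound_pokes ORDER BY stream_id"
    )
]
print(remaining)
```

Only the stale, superseded poke for the long-dead destination is removed; its latest poke survives so the remote server can still be told that something changed when it comes back.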

@@ -284,6 +285,8 @@ def _get_devices_by_remote_txn(self, txn, destination, from_stream_id,

results = []
for user_id, user_devices in devices.iteritems():
# We bind literal True, as its database dependent how booleans are
# handled.
Contributor

@NegativeMjark Jan 27, 2017

I might prefer something more along the lines of:

# postgresql and sqlite use different formats for boolean literals,
# so we pass True as a query parameter rather than including a
# literal True in the query string itself
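
To make the point concrete, here is a minimal sketch of the parameter-binding approach against SQLite only; PostgreSQL behaviour is not demonstrated here. The query text is the one quoted earlier in this review, while the table layout and sample rows are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed schema: just the columns the query touches.
conn.execute(
    "CREATE TABLE device_lists_outbound_pokes ("
    " destination TEXT, stream_id INTEGER, user_id TEXT, sent BOOLEAN)"
)
conn.executemany(
    "INSERT INTO device_lists_outbound_pokes VALUES (?, ?, ?, ?)",
    [
        ("remote.example", 5, "@a:hs", True),
        ("remote.example", 9, "@a:hs", False),
    ],
)

prev_sent_id_sql = """
    SELECT coalesce(max(stream_id), 0) as stream_id
    FROM device_lists_outbound_pokes
    WHERE destination = ? AND user_id = ? AND sent = ?
"""

# True is passed as a query parameter, so the database driver converts it
# to whatever representation the engine expects.
row = conn.execute(
    prev_sent_id_sql, ("remote.example", "@a:hs", True)
).fetchone()
print(row[0])

# With no matching rows, coalesce() falls back to 0.
empty = conn.execute(
    prev_sent_id_sql, ("remote.example", "@nobody:hs", True)
).fetchone()
print(empty[0])
```

The query string itself contains no boolean literal, so the same SQL text can be handed to a different driver without editing it.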

" d.display_name AS device_display_name, "
" k.key_json"
" FROM e2e_device_keys_json k"
" LEFT JOIN devices d ON d.user_id = k.user_id"
Contributor

@NegativeMjark Jan 27, 2017

Presumably it's safe to change this from a LEFT JOIN to an INNER JOIN because of https://github.com/matrix-org/synapse/blob/master/synapse/storage/schema/delta/33/devices_for_e2e_keys.sql
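
The condition behind that "presumably" can be sketched as follows: the two joins agree only while every key row has a matching devices row. The schema, the sample data, and the join on device_id are assumptions for the example (the quoted snippet shows only the user_id part of the ON clause), and the sketch does not reproduce what the linked delta actually does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE devices (user_id TEXT, device_id TEXT, display_name TEXT);
    CREATE TABLE e2e_device_keys_json (
        user_id TEXT, device_id TEXT, key_json TEXT
    );
    INSERT INTO devices VALUES ('@a:hs', 'D1', 'Phone');
    INSERT INTO e2e_device_keys_json VALUES ('@a:hs', 'D1', '{}');
    """
)

query = (
    "SELECT k.user_id, k.device_id, "
    " d.display_name AS device_display_name, "
    " k.key_json"
    " FROM e2e_device_keys_json k"
    " %s JOIN devices d ON d.user_id = k.user_id"
    " AND d.device_id = k.device_id"
    " ORDER BY k.user_id, k.device_id"
)


def run(join_type):
    # join_type is a fixed keyword chosen by this script, never user input.
    return conn.execute(query % (join_type,)).fetchall()


# Every key has a device row, so both joins return the same rows.
same_when_matched = run("LEFT") == run("INNER")

# Add a key with no devices row: the two joins now disagree.
conn.execute("INSERT INTO e2e_device_keys_json VALUES ('@b:hs', 'D9', '{}')")
left_rows, inner_rows = run("LEFT"), run("INNER")
print(same_when_matched, len(left_rows), len(inner_rows))
```

With an orphaned key, the LEFT JOIN still returns it with a NULL display name while the INNER JOIN drops it, so the change is safe only if the schema guarantees that no such orphans exist.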

@NegativeMjark
Contributor

Other than maybe tweaking the wording in the comments, LGTM.

user_index = stream["field_names"].index("user_id")

for row in stream["rows"]:
logger.info("Handling device list row: %r", row)
Contributor

Isn't this logging a bit noisy?

@@ -458,6 +465,21 @@ def get_user_whose_devices_changed(self, from_key):
        rows = yield self._execute("get_user_whose_devices_changed", None, sql, from_key)
        defer.returnValue(set(row["user_id"] for row in rows))

    def get_users_and_hosts_device_list_changes(self, from_key):
Contributor

I wonder if the name of this method should indicate that this is for outbound pokes?

@NegativeMjark
Contributor

Structure looks sensible to me. Might want to tweak some of the method names and tone down the logging, but otherwise LGTM

Contributor

@NegativeMjark left a comment

LGTM
