Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Send email notifications for missed messages #759

Merged
merged 80 commits into from May 10, 2016

Conversation

Projects
None yet
4 participants
Member

dbkr commented Apr 29, 2016 edited

Most functionality here is confined to adding an email pusher and other functionality used from it. Changes to the rest of the codebase are limited to exposing the previously private _filter_events_for_client and adding an email pusher for new users who register with an email.

This code will not be enabled unless the appropriate options are added to the config on existing installations. They are disabled by default on new installations.

Newly registered users will only have an email pusher set up if email notifications are enabled on the Home Server, so users will only ever start getting mails after registering or explicitly opting in.

Fixes vector-im/vector-web#108

dbkr added some commits Apr 19, 2016

First bits of emailpusher
Mostly logic of when to send an email
Add single instance & logging stuff
Copy the stuff over from http pusher that prevents multiple instances of process running at once and sets up logging and measure blocks.
Send a rather basic email notif
Also pep8 fixes
Flesh out email templating
Mostly WIP porting the room name calculation logic from the web client so our room names in the email mirror the clients.
Mime part is binary so encode it first.
Doesn't get character enocind right yet but makes it not error.
Sort member events
So names of people in a room are given in order
More variable calculation for email notifs
Include name of the person we're sending to and add summary text at the top giving an overview of what's happened.
Hopefully all remaining bits for email notifs
Add public facing base url to the server so synapse knows what URL to use when converting mxc to http urls for use in emails
Better grammar for multiple messages in a room
Say who the messages are from if there's no room name, otherwise it's a bit nonsensical
Run filter_events_for_client
so we don't accidentally mail out events people shouldn't see
Add an email pusher for new users
If they registered with an email address and email notifs are enabled on the HS

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/config/emailconfig.py
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# This file can't be called email.py because if it is, we cannot:
+import email.utils
+
+from ._base import Config
+
+
+class EmailConfig(Config):
+ """
+ Email Configuration
+ """
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Redundant docstring (also, docstring don't need to start with a newline)

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/config/emailconfig.py
+import email.utils
+
+from ._base import Config
+
+
+class EmailConfig(Config):
+ """
+ Email Configuration
+ """
+
+ def read_config(self, config):
+ self.email_enable_notifs = False
+
+ email_config = config.get("email", None)
+ if email_config:
+ self.email_enable_notifs = email_config.get("enable_notifs", True)
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Could write this as:

self.email_enable_notifs = config.get("email", {}).get("enable_notifs", True)

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/config/server.py
@@ -143,6 +148,9 @@ def default_config(self, server_name, **kwargs):
# Whether to serve a web client from the HTTP/HTTPS root resource.
web_client: True
+ # The server's public-facing base URL
+ # https://example.com:8448/
@erikjohnston

erikjohnston Apr 29, 2016

Owner
  • Would prefer if this said it was explicitly for client-server API
  • You forgot the the key name

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/push/emailpusher.py
+ should_notify_at = max(notif_ready_at, room_ready_at)
+
+ if should_notify_at < self.clock.time_msec():
+ # one of our notifications is ready for sending, so we send
+ # *one* email updating the user on their notifications,
+ # we then consider all previously outstanding notifications
+ # to be delivered.
+ yield self.send_notification(unprocessed)
+
+ yield self.save_last_stream_ordering_and_success(max([
+ ea['stream_ordering'] for ea in unprocessed
+ ]))
+ yield self.sent_notif_update_throttle(
+ push_action['room_id'], push_action
+ )
+ else:
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Should we not break here?

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/push/emailpusher.py
+ @defer.inlineCallbacks
+ def save_last_stream_ordering_and_success(self, last_stream_ordering):
+ self.last_stream_ordering = last_stream_ordering
+ yield self.store.update_pusher_last_stream_ordering_and_success(
+ self.app_id, self.email, self.user_id,
+ last_stream_ordering, self.clock.time_msec()
+ )
+
+ def seconds_until(self, ts_msec):
+ return (ts_msec - self.clock.time_msec()) / 1000
+
+ def get_room_last_notif_ts(self, last_notif_by_room, room_id):
+ if room_id in last_notif_by_room:
+ return last_notif_by_room[room_id]
+ else:
+ return 0
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Is this function not just last_notify_by_room.get(room_id, 0)?

dbkr added some commits Apr 29, 2016

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/push/mailer.py
+from synapse.api.errors import StoreError
+
+import jinja2
+import bleach
+
+import time
+import urllib
+
+
+MESSAGE_FROM_PERSON_IN_ROOM = "You have a message from %s in the %s room"
+MESSAGE_FROM_PERSON = "You have a message from %s"
+MESSAGES_FROM_PERSON = "You have messages from %s"
+MESSAGES_IN_ROOM = "There are some messages for you in the %s room"
+MESSAGES_IN_ROOMS = "Here are some messages you may have missed"
+INVITE_FROM_PERSON_TO_ROOM = "%s has invited you to join the %s room"
+INVITE_FROM_PERSON = "%s has invited you to chat"
@erikjohnston

erikjohnston Apr 29, 2016

Owner

I would be tempted to use the name parameter form here, so that you can add/remove and rearrange arguments with ease.

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/push/mailer.py
+ )
+
+ ret = {
+ "link": self.make_notif_link(notif),
+ "ts": notif['received_ts'],
+ "messages": [],
+ }
+
+ handler = self.hs.get_handlers().message_handler
+ the_events = yield handler.filter_events_for_client(
+ user_id, results["events_before"]
+ )
+ the_events.append(notif_event)
+
+ for event in the_events:
+ vars = self.get_message_vars(notif, event, room_state)
@erikjohnston

erikjohnston Apr 29, 2016

Owner

vars is a builtin ftr

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/push/mailer.py
+
+ handler = self.hs.get_handlers().message_handler
+ the_events = yield handler.filter_events_for_client(
+ user_id, results["events_before"]
+ )
+ the_events.append(notif_event)
+
+ for event in the_events:
+ vars = self.get_message_vars(notif, event, room_state)
+ if vars is not None:
+ ret['messages'].append(vars)
+
+ defer.returnValue(ret)
+
+ def get_message_vars(self, notif, event, room_state):
+ if event.type != "m.room.message":
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Should probably use the EventTypes constants from synapse.api.constants

@erikjohnston erikjohnston and 1 other commented on an outdated diff Apr 29, 2016

synapse/push/mailer.py
+ ])
+ )
+ else:
+ # Stuff's happened in multiple different rooms
+ return MESSAGES_IN_ROOMS
+
+ def make_room_link(self, room_id):
+ return "https://matrix.to/%s" % (room_id,)
+
+ def make_notif_link(self, notif):
+ return "https://matrix.to/%s/%s" % (
+ notif['room_id'], notif['event_id']
+ )
+
+ def make_unsubscribe_link(self):
+ return "https://vector.im/#/settings" # XXX: matrix.to
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Would very much prefer a direct link to an actual page with a "Unsubscribe by clicking here" button, without needing to log in.

@dbkr

dbkr Apr 29, 2016

Member

Me too, but this is something to add a little later I think.

@erikjohnston

erikjohnston May 4, 2016

Owner

I'm a bit cautious due to various legal requirements to have one.

@erikjohnston erikjohnston and 1 other commented on an outdated diff Apr 29, 2016

synapse/push/pusher.py
def create_pusher(hs, pusherdict):
+ PUSHER_TYPES = {
+ "http": HttpPusher,
+ }
+
+ if hs.config.email_enable_notifs:
+ from emailpusher import EmailPusher
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Why not a top level import?

@dbkr

dbkr Apr 29, 2016

Member

Because I wanted to make it conditional and that needs access to the HS object

@erikjohnston

erikjohnston May 4, 2016

Owner

But why do you want it conditional? I'd rather have the external libraries conditionally imported, that way the tests will at least import the EmailPusher and check that is syntactically correct.

Owner

erikjohnston commented Apr 29, 2016

If using optional dependencies then you should check in the config that the module are importable (like we do with url previews and netaddr)

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/storage/event_push_actions.py
} for row in after_read_receipt + no_read_receipt
])
@defer.inlineCallbacks
+ def get_time_of_last_push_action_before(self, stream_ordering):
+ def f(txn):
+ sql = (
+ "SELECT e.received_ts "
+ "FROM event_push_actions AS ep "
+ "JOIN events e ON ep.room_id = e.room_id AND ep.event_id = e.event_id "
+ "WHERE ep.stream_ordering > ? "
+ "ORDER BY ep.stream_ordering ASC "
+ "LIMIT 1"
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Generally we prefer to have the spaces at the start of the line, so its clearer when a space is forgotten.

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/storage/event_push_actions.py
} for row in after_read_receipt + no_read_receipt
])
@defer.inlineCallbacks
+ def get_time_of_last_push_action_before(self, stream_ordering):
+ def f(txn):
+ sql = (
+ "SELECT e.received_ts "
+ "FROM event_push_actions AS ep "
+ "JOIN events e ON ep.room_id = e.room_id AND ep.event_id = e.event_id "
+ "WHERE ep.stream_ordering > ? "
+ "ORDER BY ep.stream_ordering ASC "
+ "LIMIT 1"
+ )
+ txn.execute(sql, (stream_ordering,))
+ return txn.fetchone()
+ result = yield self.runInteraction("get_time_of_last_push_action_before", f)
+ defer.returnValue(result[0] if result is not None else None)
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Why not result[0] if result else None?

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/storage/event_push_actions.py
@@ -190,6 +212,26 @@ def f(txn):
)
defer.returnValue(result[0] or 0)
+ @defer.inlineCallbacks
+ def get_time_of_latest_push_action_by_room_for_user(self, user_id):
+ """
+ Returns only the received_ts of the last notification in each of the
+ user's rooms, in a dict by room_id
+ """
+ def f(txn):
+ txn.execute(
+ "SELECT ep.room_id, MAX(e.received_ts) "
+ "FROM event_push_actions AS ep "
+ "JOIN events e ON ep.room_id = e.room_id AND ep.event_id = e.event_id "
+ "GROUP BY ep.room_id"
+ )
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Shouldn't this be using user_id?

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/storage/event_push_actions.py
" FROM events"
" NATURAL JOIN receipts_linearized WHERE receipt_type = 'm.read'"
" GROUP BY room_id, user_id"
") AS rl "
+ "NATURAL JOIN events e "
@erikjohnston

erikjohnston Apr 29, 2016 edited

Owner

I know I've been cheeky and used NATURAL JOIN elsewhere, but can we use INNER JOIN events AS e USING (room_id, event_id) instead?

@erikjohnston erikjohnston and 1 other commented on an outdated diff Apr 29, 2016

synapse/storage/event_push_actions.py
} for row in after_read_receipt + no_read_receipt
])
@defer.inlineCallbacks
+ def get_time_of_last_push_action_before(self, stream_ordering):
+ def f(txn):
+ sql = (
+ "SELECT e.received_ts "
+ "FROM event_push_actions AS ep "
+ "JOIN events e ON ep.room_id = e.room_id AND ep.event_id = e.event_id "
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Why do we need to join against event_push_actions?

@dbkr

dbkr Apr 29, 2016

Member

Did you mean events? We join against that to get the received_ts field.

@erikjohnston erikjohnston commented on the diff Apr 29, 2016

synapse/storage/schema/delta/31/pusher_throttle.sql
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+
+CREATE TABLE pusher_throttle(
+ pusher BIGINT NOT NULL,
+ room_id TEXT NOT NULL,
+ last_sent_ts BIGINT,
+ throttle_ms BIGINT,
+ PRIMARY KEY (pusher, room_id)
+);
@erikjohnston

erikjohnston Apr 29, 2016 edited

Owner

No indices? Oh, or is the primary key enough?

@dbkr

dbkr Apr 29, 2016

Member

Yeah, primary key is implicitly an index.

@erikjohnston erikjohnston commented on the diff Apr 29, 2016

synapse/util/presentable_names.py
+ # at this point we're going to need to search the state by all state keys
+ # for an event type, so rearrange the data structure
+ room_state_bytype = _state_as_two_level_dict(room_state)
+
+ # right then, any aliases at all?
+ if "m.room.aliases" in room_state_bytype:
+ m_room_aliases = room_state_bytype["m.room.aliases"]
+ if len(m_room_aliases.values()) > 0:
+ first_alias_event = m_room_aliases.values()[0]
+ if first_alias_event.content and first_alias_event.content["aliases"]:
+ the_aliases = first_alias_event.content["aliases"]
+ if len(the_aliases) > 0 and _looks_like_an_alias(the_aliases[0]):
+ return the_aliases[0]
+
+ if not fallback_to_members:
+ return None
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Do we really want to return None?

@dbkr

dbkr Apr 29, 2016

Member

Yeah, since the option is essentially asking the function to return None rather than try to make up a name (so the caller can handle the situation where there is no actual name more naturally).

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/util/presentable_names.py
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import re
+
+# intentionally looser than what aliases we allow to be registered since
+# other HSes may allow aliases that we would not
+ALIAS_RE = re.compile(r"^#.*:.+$")
+
+ALL_ALONE = "Empty Room"
+
+
+def calculate_room_name(room_state, user_id, fallback_to_members=True):
+ # does it have a name?

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/storage/event_push_actions.py
@@ -190,6 +212,26 @@ def f(txn):
)
defer.returnValue(result[0] or 0)
+ @defer.inlineCallbacks
+ def get_time_of_latest_push_action_by_room_for_user(self, user_id):
+ """
+ Returns only the received_ts of the last notification in each of the
+ user's rooms, in a dict by room_id
+ """
+ def f(txn):
+ txn.execute(
+ "SELECT ep.room_id, MAX(e.received_ts) "
@erikjohnston

erikjohnston Apr 29, 2016

Owner

This looks expensive if received_ts isn't indexed.

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/push/emailpusher.py
+ def seconds_until(self, ts_msec):
+ return (ts_msec - self.clock.time_msec()) / 1000
+
+ def get_room_throttle_ms(self, room_id):
+ if room_id in self.throttle_params:
+ return self.throttle_params[room_id]["throttle_ms"]
+ else:
+ return 0
+
+ def get_room_last_sent_ts(self, room_id):
+ if room_id in self.throttle_params:
+ return self.throttle_params[room_id]["last_sent_ts"]
+ else:
+ return 0
+
+ def room_ready_to_notify_at(self, room_id, last_notif_time):
@erikjohnston

erikjohnston Apr 29, 2016

Owner

last_notif_time doesn't seem to be used?

Owner

erikjohnston commented Apr 29, 2016

I'm not hugely comfortable with the usages of events.received_ts, because:

  • Without more indexes some of the queries are inefficient
  • A lot of the queries are now having to join against events just for received_ts
  • received_ts is used in places to mean "when we last sent a notif", which seems like a little bit of a dodgy assumption.

Is there any transaction in replacing received_ts with:

  • keeping track of the last time we sent an email in the pushers table (or elsewhere)
  • adding a column to the event_push_actions with either when a) the push action was created or b) when the event should be emailed.
Owner

erikjohnston commented Apr 29, 2016

Do we have different behaviour between highlight and non-highlight notifications? Will we want support for doing digests for non-highlight notificaitons?

@erikjohnston erikjohnston commented on an outdated diff Apr 29, 2016

synapse/push/emailpusher.py
+import logging
+
+from synapse.util.metrics import Measure
+from synapse.util.logcontext import LoggingContext
+
+from mailer import Mailer
+
+logger = logging.getLogger(__name__)
+
+# The amount of time we always wait before ever emailing about a notification
+# (to give the user a chance to respond to other push or notice the window)
+DELAY_BEFORE_MAIL_MS = 2 * 60 * 1000
+
+THROTTLE_START_MS = 2 * 60 * 1000
+THROTTLE_MAX_MS = (2 * 60 * 1000) * (2 ** 11) # ~3 days
+
@erikjohnston

erikjohnston Apr 29, 2016

Owner

Can we also have a constant for the multiplier?

dbkr added some commits Apr 29, 2016

Member

dbkr commented Apr 29, 2016

On received_ts, I think the bit you were looking at that was treating it "when we sent a notif" was just incorrect old code you correctly spotted was no longer used. This was also the inefficient query you pointed out. Now those are gone so the remaining ones should all actually be the time the event/notif arrived. Hopefully this is ok now. (The time was last sent a notif is stored in pusher_throttle).

There's no difference currently between highlight and non-highlight notifications. Digest mails are coming later and will look broadly similar.

Owner

erikjohnston commented May 4, 2016

@NegativeMjark How does this interact with the Slaved stuff?

Contributor

NegativeMjark commented May 4, 2016

You'll need to add the new database methods to https://github.com/matrix-org/synapse/blob/develop/synapse/app/pusher.py#L98

If you add a cached thing then you'll need to add it to the appropriate thing in https://github.com/matrix-org/synapse/blob/develop/synapse/replication/slave/storage/

dbkr and others added some commits May 4, 2016

Owner

erikjohnston commented May 4, 2016

@NegativeMjark: How does packaging work with the manifest stuff? I note that all the static content is in a top level res/ folder, rather than in the package itself like the other static stuff.

dbkr and others added some commits May 4, 2016

Correct SQL statement for postgres
In standard sql, join binds tighter than comma, so we were joining on the wrong table. Postgres follows the standard (apparently).
Include no context
until we can de-dup between the context and other notifs
handle fragments correctly on mxc URLs.
switch to vector.im permalinks as matrix.to isn't ready yet.
merge overlapping notifications together.
give one message of context after a notification (in the unlikely event it exists, but it's possible thanks to throttling).
include name of app in mail templates
Owner

ara4n commented May 5, 2016

@dbkr: i've done an initial skin for vector notifs and also implemented merging of overlapping notifs. please can you take a look at ce81ccb in particular?

ara4n added some commits May 5, 2016

Member

dbkr commented May 5, 2016

lgtm

ara4n and others added some commits May 5, 2016

Switch from CSS to Table layout for HTML mails so they work in Outloo…
…k aka Word

Remove templates-vector and theme templates with variables instead
Switch to matrix.to URLs by default for links
Switch from CSS to Table layout for HTML mails so they work in Outlo…
…ok aka Word

    Remove templates-vector and theme templates with variables instead
    Switch to matrix.to URLs by default for links

@ara4n ara4n merged commit fe97b81 into develop May 10, 2016

7 of 8 checks passed

Sytest SQLite (Merged PR) Build finished.
Details
Flake8 + Packaging (Commit) Build #634 origin/dbkr/email_notifs succeeded in 30 sec
Details
Flake8 + Packaging (Merged PR) Build finished.
Details
Sytest Postgres (Commit) Build #617 origin/dbkr/email_notifs succeeded in 5 min 0 sec
Details
Sytest Postgres (Merged PR) Build finished.
Details
Sytest SQLite (Commit) Build #625 origin/dbkr/email_notifs succeeded in 3 min 25 sec
Details
Unit Tests (Commit) Build #678 origin/dbkr/email_notifs succeeded in 1 min 19 sec
Details
Unit Tests (Merged PR) Build finished.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment