Corresponding Email Overview

Erik Hetzner edited this page Apr 18, 2018 · 6 revisions

After reviewing the options, I think we should use the Email Log table as the primary data model for a correspondence email page, for the following reasons:

  1. Email bodies are stored on different data models throughout the system. Querying all of them directly, or a subset of them depending on parameters, would add a lot of overhead and complexity.
  2. It is currently the only table that records both that a message was actually sent and whether it failed. This is useful if, for instance, we want to indicate that a user's email was never sent out but still preserve the body and other information about the email.
  3. It lets us infer the "type" of an email, to a certain extent.
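
To illustrate reason 2, here is a minimal sketch of how a correspondence page could flag a failed send while still showing the stored email. The `EmailLogEntry` class and all of its field names are hypothetical and are not taken from the actual Email Log schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmailLogEntry:
    # Hypothetical fields; the real Email Log columns may differ.
    paper_id: int
    recipient: str
    subject: str
    body: str
    status: str  # assumed values: "sent" or "failed"
    error_message: Optional[str] = None


def correspondence_row(entry: EmailLogEntry) -> str:
    """Render one line for a correspondence page, flagging failed sends."""
    flag = "NOT DELIVERED" if entry.status == "failed" else "sent"
    return f"[{flag}] to {entry.recipient}: {entry.subject}"
```

Because the body stays on the entry regardless of status, a failed email can still be displayed in full alongside its failure flag.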

We are fairly close to getting the necessary scaffolding for supporting correspondence emails within Aperta. However, we have the following issues that will require refactoring in order to support certain features.

We have also only been storing emails since this PR was merged and deployed (https://github.com/Tahi-project/tahi/pull/2822), so there is currently no backlog of emails in the email log. Theoretically, it IS possible to backfill it with older messages, but I have a few points against it.

First, there is no way to determine whether emails sent prior to Jan 26 2017 were sent successfully or failed. We could assume that they were all sent successfully, or give them an "unknown" status, but either way it's a hole in the data.

Secondly, the migration itself would require iterating through every model that stores email content (Invitations, Reviewer Reports, etc.) and inserting the results into the current table. We also have no record of emails sent automatically by the system, so those cannot be backfilled at all.

Finally, attachments are currently stored as part of the message body, and this is something we need to change going forward.
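
For reference, a backfill along these lines might look roughly like the sketch below. It assumes the "unknown" status option, treats Jan 26 2017 as the point from which delivery status is recorded, and uses hypothetical record keys (`model`, `body`, `sent_on`). Automatically sent system emails have no source records at all, so nothing here can recover them.

```python
from datetime import date

# Date given above: sends before this have no recorded delivery outcome.
STATUS_KNOWN_FROM = date(2017, 1, 26)


def backfill_rows(source_records):
    """Build email-log rows for messages that predate delivery logging."""
    rows = []
    for record in source_records:
        if record["sent_on"] >= STATUS_KNOWN_FROM:
            continue  # assumption: these are already in the email log
        rows.append({
            "source_model": record["model"],  # e.g. "Invitation"
            "body": record["body"],
            "sent_on": record["sent_on"],
            "status": "unknown",  # delivery outcome cannot be reconstructed
        })
    return rows
```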

Recommended path

After a meeting on March 27, 2017, we concluded that sticking with Postgres is the preferred solution here. The high points:

  • The previous worry about size and scale is not a problem, since a search across all emails in the system isn't a requirement (hopefully, the widest scope will be a single paper). If we DO need it, we can move to Solr later.
  • Attachment search should be out of scope for this. If we do need it, Solr would be recommended (Tika can extract text from PDF and docx files quite well).
  • The table will need to know the "type" of each email for filtering purposes, but there are multiple ways to do this, such as tying the email to a task or assigning the type at send time.
  • We're assuming full text search is a wanted feature, so we should make sure the table is indexed appropriately for it (for details, see https://www.postgresql.org/docs/9.5/static/textsearch-indexes.html).
  • We do not want to use additional_context as a way to reference other models. If there's a use case that needs such a reference, it should be another foreign key column added to the table.
  • There was the possibility of using Ceph for attachment storage, but the infrastructure isn't there yet and we already keep an extra copy on S3. In the future, we should just provide a link to the attachment in emails. Earlier emails will keep the attachment in the raw source for now.
  • Links and references to email templates should be reachable through tasks, which we'll be fetching anyway since filtering by task is required.
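
As a rough sketch of how paper-scoped full text search with task filtering could fit together in Postgres, the snippet below builds an expression index statement and a parameterized query. The table name `email_logs` and the columns `paper_id`, `task_id`, `subject`, and `body` are assumptions for illustration, not the actual schema. `to_tsvector`, `plainto_tsquery`, the `@@` operator, and GIN indexes are standard Postgres features described in the documentation linked above.

```python
from typing import List, Optional, Tuple

# Expression shared by the index and the query so Postgres can use the index.
FTS_EXPR = (
    "to_tsvector('english', coalesce(subject, '') || ' ' || coalesce(body, ''))"
)

# Hypothetical index name and table name.
CREATE_FTS_INDEX = (
    f"CREATE INDEX email_logs_fts_idx ON email_logs USING gin ({FTS_EXPR})"
)


def build_search(
    paper_id: int, terms: str, task_id: Optional[int] = None
) -> Tuple[str, List[object]]:
    """Return (sql, params) for a search scoped to a single paper."""
    clauses = ["paper_id = %s", f"{FTS_EXPR} @@ plainto_tsquery('english', %s)"]
    params: List[object] = [paper_id, terms]
    if task_id is not None:
        # Filter through a real foreign key column, not additional_context.
        clauses.append("task_id = %s")
        params.append(task_id)
    sql = "SELECT id, subject FROM email_logs WHERE " + " AND ".join(clauses)
    return sql, params
```

The `%s` placeholders follow the style used by common Python Postgres drivers; the returned pair would be passed to a cursor's `execute` call. Scoping every query by `paper_id` reflects the point above that a search across all emails is not required.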

PowerPoint from the meeting: the recommendation inside has changed since the meeting, but it is kept here for historical purposes.

Attachments:

Correspondence Email (Current Schema) (application/gliffy+json)
Correspondence Email (Current Schema).png (image/png)
Correspondence Email (Current Schema).html (text/html)
Aperta Architecture with Email (application/gliffy+json)
Aperta Architecture with Email.png (image/png)
Aperta Architecture with Email.html (text/html)

CorrespondenceHistoryArch.pptx (application/vnd.openxmlformats-officedocument.presentationml.presentation)
