Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emails in which one is CCed/BCCed get treated as 'Sent'. #36

Closed
majorminors opened this issue Jul 24, 2020 · 19 comments
Closed

Emails in which one is CCed/BCCed get treated as 'Sent'. #36

majorminors opened this issue Jul 24, 2020 · 19 comments
Assignees
Labels
bug Something isn't working

Comments

@majorminors
Copy link

majorminors commented Jul 24, 2020

Error Symptoms

Emails in which one is CCed/BCCed get treated as 'Sent'.

Emails in which 'To' contains other addresses before a recognised address gets treated as 'Sent'

Notes

I decided to spin this up and have a play. It's wonderful!

I actually found it via a thread on reddit. One comment asked about how bcc and cc were handled, and you specified that you weren't sure. I thought I'd test that. The poster's concerns were astute: items that you are cced or bcced in get treated as 'Sent' mail.

I wish I could have a crack at fixing this and submit a pull request, but I have only the vaguest idea of how this all works (the plan is to use your brilliant product to learn). If you felt like it was worth your time to point me to the most relevant documentation, I would be more than willing to have a go. I think the thing I'm having most difficulty with understanding is how the cloudformation file would change if I were to play with the forks of your repos. Is your 'grapes' consolidation program included in the codepipeline autodeploy feature?

Otherwise, do let me know if there's any outputs I can provide that would be useful for you to troubleshoot the issue.

@majorminors majorminors added the bug Something isn't working label Jul 24, 2020
@majorminors majorminors changed the title PAST_THE_ERROR_MESSAGE_HERE Emails in which one is CCed/BCCed get treated as 'Sent'. Jul 24, 2020
@davidgatti
Copy link
Member

Yes, that is true, and it is a bit annoying, but since I use this this to handle newsletters, and for signups email. it is not a big problem, sure some sites put emails addresses in the wrong places :) but o well. In any case, sure you can play with the code.

First of all this is a important link: https://github.com/topics/0x4447-product-s3-email - it shows all the resources of the project. And in this case you don't have to bother about CF, the code is in the separated Lambda Functions, that you can un locally.

And you should start looking at this line https://github.com/0x4447/0x4447_product_s3_email_lambda_inbound/blob/development/index.js#L207.

Important, I won't accept code that is not formatted the way I formated it. Make sure to match the style and add meaningful comments.

@majorminors
Copy link
Author

majorminors commented Jul 27, 2020

Can definitely see the point of leaving it, considering the use case of a signups/newsletters bin.

I was thinking that this would be an ideal alternative to a backup mailserver for people (like me) who don't want the expense of spinning up two VPSs just in case one goes down. That way, could both queue the mail for forwarding with SES and also read the email coming in while the primary is down.

Anyway, we'll see if I can easily alter the code.

Cheers!

@davidgatti
Copy link
Member

O I see so you would like to automatically forward every email to S3 Email, to have a backup?

@majorminors
Copy link
Author

majorminors commented Jul 28, 2020

Oh, I didn't think of that. Although that would be a very good idea.

I was actually thinking to have an MX record pointing at the primary mailserver with, say, priority 10, and then a second MX record pointing at SES with, say, priority 20. So if the primary went down, SES and the 0x4447 s3 email stack would take over.

Alternatively, I could use SES as a relay for both use cases - so all email goes to SES. SES first tries to send it to my mail server. If that fails, SES uses the 0x4447 s3 email stack.

@davidgatti
Copy link
Member

Hmmm impressive, I had no idea you could do that. And both ways are interesting, although the MX way seams to be simpler. Since in SES you'd have to make a rule and do some configuration. But yea in this case handling CC and BBC better would be good.

I think the best way would be to to have this flow:

  1. To: is the most important field, and the email should be organized with that and ignore any other field
  2. If To is not present, use CC
  3. if CC is not present, use BCC

What do you think?

@majorminors
Copy link
Author

majorminors commented Jul 29, 2020

I think that makes a lot of sense. So:

  1. To: is the most important field, and the email should be organized with that and ignore any other field
  2. If To is not present, use CC
  3. if CC is not present, use BCC
  4. if none present, treat as sent

?

Quick side note, just so I have it written down. I tested a TO as follows:

TO: someone_else@otherdomain.com; me@SESverifieddomain.com

This also gets treated as sent. I guess the filtering is just looking for the first domain it comes across in the TO field. I will also see if I can address this edge case, after the CC BCC thing.

@majorminors
Copy link
Author

majorminors commented Jul 29, 2020

Would you prefer me to branch off the dev branch, or the master? I.e. how up to date are your dev branches? Reason I ask is that notice the commit history on the inbound lambda repo is different between the two--dev @ 74 vs master @ 86.

@davidgatti
Copy link
Member

Will check tomorrow.

@majorminors
Copy link
Author

No rush.

@davidgatti
Copy link
Member

Sorry @majorminors, I was traveling, back to the main computer. Will check everything this weekend.

@davidgatti
Copy link
Member

@majorminors Right if I merge from master to branch there are changes, but they are all PRs merges. If I check the network graph, all seams to be OK, meaning development is the main branch, and from there I did merge everything in to Master.

So everything is OK? :) and if so, yes, you should branch out from development since it is the main one, and master is only for production. Meaning you branch out from dev, then make a PR that points your branch to dev, once that is OK, i merge dev to master.

@majorminors
Copy link
Author

Hello!

My apologies for slow response this time. Didn't get a notification you'd responded.

Gotcha - my intuition was to branch from dev. Just wanted to check.

Assuming I can get this done, I'll be sure to PR your dev branch.

I'll keep you posted!

@davidgatti
Copy link
Member

No worries, and cool :)

@majorminors
Copy link
Author

Hello!

So the more I think about this, the more complicated I realise it could be if I try to account for all edge cases. The most problematic example would be if you had two or more SES verified domains (e.g. an email sent to two or more email addresses that you own) in any of the to, cc, or bcc fields.

I'm trying to keep to the basic philosophy you've specified in the readme and in the reddit thread. I.e. just a place to direct emails that aren't of critical importance.

As such, I wonder how you'd feel about:

  1. First checks to_domain for an SES verified domain and if there's a hit, send it to the inbox (i.e. the current functionality) which as you say, will account for the vast majority of cases.
  2. If there's no hit, checks to see if an SES verified domain exists anywhere else in the to, cc, or bcc fields. If there's a hit, then send it to the inbox under a path that is uncategorised/from_domain/from_account.
  3. Otherwise, place in "sent".

The reason for using a pathname called "uncategorised" would be to indicate to the user that there's something weird about the email. Rather than lulling the user into a false sense of security that any edge cases would be taken care of by the script. It seems like this would be an improvement on having it default into the sent folder without going overboard.

The alternative seems like it would be to check each domain hit and try to regex the preceding string of each hit, followed by duplicating the email n times to sort it accordingly, at which point one might as well pull a mail parser into the project. That feels like it goes against the grain of the project.

Let me know.

@davidgatti
Copy link
Member

davidgatti commented Sep 21, 2020

I will repeat what you wrote in my own way to see if we are on the same page:

So from the code, here I load all the domains added to SES: L116 and then I go over what we have in the To field here: L409

To build out the location where to send. And what you are saying is that some emails might be directed at example.com based on the To field, but then in the CC or BCC fields the domain could be foo.com, where foo.com is your domain. And you are receiving the email form an external company and they are sending the same email to you the contractor for example.

Is this the problem that you want to solve?

@majorminors
Copy link
Author

majorminors commented Sep 22, 2020

Yes, this is the kind of edge case I was thinking about. But really what I'm worried about is building code that would robustly parse all entries in the 'to', 'cc', and 'bcc' fields.

At the moment you only grab the first entry in the to field here, which is sensible because it will account for most cases. But if I grab all of the entries, like:

container.to = data.to.value.map(currentValue => currentValue.address),

I need to do something a little more robust than just split a single item as you do here. I would need to search through for any domain that is in SES. Then regex the string preceding it for the account. Then repeat that process, both in the to field and also on the cc and bcc fields until there are no matching domains left.

So, if I was sent an email:

to: dorian@mydomain.com; john@doe.com; admin@mydomain.com

This would be non-trivial to address (at least for me), and seems like it would be better served by pulling in a mail parser. Which from what I understand of this project is a bit overkill.

@davidgatti
Copy link
Member

davidgatti commented Sep 23, 2020

For sure there is a straight forward approach that we can figure out. We can do the following: write down a bunch of examples of emails that can be added to the "receiving" fields, and for each example the expected output that we want to see, this way based on that it will be easier to reason about and come up with a simple idea.

@loscil06
Copy link

You guys still working on this? Need any help for testing?

@davidgatti
Copy link
Member

I don't think so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

No branches or pull requests

3 participants