Feature Request -- Support for retrieving DMARC emails from AWS S3 bucket #107

Open
andrewhenke opened this issue Nov 6, 2023 · 18 comments

Comments

@andrewhenke

Hello!

I use Amazon Web Services for a majority of my clients, and would love to be able to configure my DMARC reporting email address, and those of my clients, so that all inbound DMARC reports are automatically stored in an S3 bucket, instead of having to set up a shared inbox through my Microsoft account, etc.

Would it be possible to use something such as the open source project Flysystem to support both local file locations and remote or cloud storage locations for retrieval of stored email messages and attachments?

I'm happy to clarify further, and answer any questions you may have. Thank you!

@liuch
Owner

liuch commented Nov 12, 2023

Hello! You can extract the files using any utility, put them in any directory on your server, and then run php utils/fetch_reports source=directory. Is this method not suitable for you?
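As a rough illustration of the directory-based approach described above, the sketch below copies objects from an S3 bucket into a local directory, which could then be passed to the fetch command. It is not part of this project. The function name `sync_reports` and its parameters are hypothetical, and `client` is assumed to be a boto3 S3 client (any object offering `get_paginator` and `download_file` would work).

```python
from pathlib import Path


def sync_reports(client, bucket: str, prefix: str, dest_dir: str) -> list:
    """Download every object under `prefix` into `dest_dir`.

    `client` is assumed to be a boto3 S3 client. Returns the local paths
    that were written. Objects are saved under their base name only, so
    keys sharing a base name would overwrite each other.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page without "Contents" means no objects matched.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                # Skip "folder" placeholder keys.
                continue
            target = dest / Path(key).name
            client.download_file(bucket, key, str(target))
            saved.append(str(target))
    return saved


# Hypothetical usage (bucket name, prefix, and directory are placeholders):
#   import boto3
#   sync_reports(boto3.client("s3"), "my-dmarc-bucket", "reports/", "/tmp/dmarc")
```

A script like this would still need to be scheduled (for example with cron) and followed by the fetch command on the same directory.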

@andrewhenke
Author

This is not suitable because syncing the contents of a remote cloud storage location onto the local machine requires custom code to be written and set up as a cron job on the server itself. It would also require a separate web interface to let the users of your system trigger the fetch and download of the files, instead of being able to do so natively within the system. Further, by using Flysystem, you would still support the native 'local directory' file retrieval method, but you would also add support for remote cloud storage locations, handled in the same manner as a 'local' directory.

Does that make sense, and can I clarify anything further?

Thank you!

@liuch
Owner

liuch commented Nov 15, 2023

I hear you. I'll check out that project one of these days. I'm not sure I have anything to test it on, though.

@andrewhenke
Author

Thank you for looking into it -- if you would like to work together on this, I would be happy to privately provide a remote storage location (via S3) for you to use for testing. Just let me know!

@williamdes
Contributor

https://min.io/

It is great for getting a local, working S3 storage set up in minutes.

@andrewhenke
Author

> https://min.io/
>
> It is great for getting a local, working S3 storage set up in minutes.

Wouldn't this solution require more than simply entering S3 IAM access credentials? Or am I not seeing what you are referencing?

@williamdes
Contributor

> > https://min.io/
> > It is great for getting a local, working S3 storage set up in minutes.
>
> Wouldn't this solution require more than simply entering S3 IAM access credentials? Or am I not seeing what you are referencing?

This solution is a drop-in replacement for AWS S3.

I use this for a client to emulate our production S3 bucket in a free and portable way.

So you can use it just like AWS S3.
Credentials and other settings will work the same.

@andrewhenke
Author

Ahhh

For my use case, I wouldn't find it useful because the DMARC reports are sent to an SES email address, where the inbound emails are then automatically processed and stored in S3.

I'll keep it in mind for future reference, however.

@liuch
Owner

liuch commented Nov 20, 2023

I believe williamdes meant using this tool for testing purposes.

@andrewhenke
Author

Ahh, my bad, didn't realize that -- that makes a lot more sense

@liuch
Owner

liuch commented Nov 28, 2023

@andrewhenke I've implemented these options for S3: key, secret, token, bucket, path, profile, endpoint, region. Have I forgotten anything? Which options (names) do you use?

@andrewhenke
Author

I should be able to tell you most accurately once I take a look at the implementation, but as of now, it looks like that should be everything that is needed! I'm extremely excited to be able to use this functionality.

On that note, is there any way to trigger the fetch/import of the emails from the storage location from within the web interface? That is the single biggest 'struggle' my team has with using the system on an enterprise level, because the ingestion of new emails requires a technical team member to access the server, instead of non-technical team members being able to trigger the ingestion of data via the web interface.

Thanks!

@liuch
Owner

liuch commented Dec 29, 2023

@andrewhenke I have just added an implementation for this. Could you please test this commit on your system? See config/conf.sample.php for details.

Note: By default, successfully processed report files are deleted from the file system. Make sure that you use copies.

@andrewhenke
Author

Certainly! I will do so in the morning -- I'm very excited to give it a try!

@andrewhenke
Author

@liuch I wanted to double check with you -- does the code that you released support automatically extracting the file attachment from the full email, or do I need to write and use an AWS Lambda function that separates the actual DMARC report attachment from the email? I would prefer not to use Lambda if this is something that the codebase supports or will support. Please let me know if you have any additional questions or need me to clarify further.

@liuch
Owner

liuch commented Dec 29, 2023

I guess I didn't read your first post carefully. I thought the bucket contained report files only (gz, zip, xml), i.e. attachments, not mail messages. My code doesn't work with messages saved as files.

Could you tell me the format of the messages saved to the bucket? Is it *.eml or something else? Maybe I can add processing for such files.

@andrewhenke
Author

No worries @liuch! By default, AWS stores the complete, raw email in the MIME format, which is described in the AWS documentation as well as in RFC 2045. There are numerous PHP libraries for processing MIME messages, such as php-mime-mail-parser, which I found rather quickly with a few searches.

Does this help?
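To illustrate what separating the report attachment from a raw MIME message involves, here is a minimal sketch using Python's standard `email` module. It is not this project's code, which is PHP. The function names, the suffix list (taken from the gz, zip, xml formats mentioned earlier in the thread), and the sample message are all assumptions made for illustration.

```python
import gzip
from email import message_from_bytes, policy
from email.message import EmailMessage

# File extensions named in the thread as report formats (gz, zip, xml).
REPORT_SUFFIXES = (".xml", ".gz", ".zip")


def extract_report_attachments(raw_message: bytes) -> list:
    """Return (filename, content) pairs for report-like parts of a message."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    found = []
    # walk() also yields the message itself, so single-part messages whose
    # body is the report file are handled as well.
    for part in msg.walk():
        if part.is_multipart():
            continue
        filename = part.get_filename()
        if filename and filename.lower().endswith(REPORT_SUFFIXES):
            # decode=True undoes the base64 / quoted-printable encoding.
            found.append((filename, part.get_payload(decode=True)))
    return found


def make_sample_message() -> bytes:
    """Build a made-up message with one gzip attachment (demo data only)."""
    msg = EmailMessage()
    msg["From"] = "reporter@example.org"
    msg["To"] = "dmarc@example.com"
    msg["Subject"] = "Report Domain: example.com"
    msg.set_content("The aggregate report is attached.")
    msg.add_attachment(
        gzip.compress(b"<feedback></feedback>"),
        maintype="application",
        subtype="gzip",
        filename="example-report.xml.gz",
    )
    return msg.as_bytes()


for name, data in extract_report_attachments(make_sample_message()):
    print(name, len(data), "bytes")
```

The extracted files could then be written to a directory and handled like any other report file.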

@liuch
Owner

liuch commented Dec 29, 2023

Thank you for the information. I can't promise to add such an implementation anytime soon, but I will definitely consider this possibility.
