abusehelper.core.mail: Generalized mail parsing #6

Merged: 37 commits merged into master from feature-mail-feeds on May 12, 2016

ghost commented Jan 6, 2016

This pull request adds the subpackage abusehelper.core.mail, which makes it possible to write a generic handler for a mail feed. The trick is that the same handler can then be used to parse mails fetched over IMAPv4 or read from a Maildir directory. Handlers can also be tested from the command line and in automated unit tests.
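To illustrate the idea of one handler behind several mail sources, here is a minimal sketch using only the Python 3 standard library. It is not the API added by this pull request: `SubjectHandler`, `run_maildir` and `run_raw` are hypothetical names made up for the example, and the real Handler and runner interfaces are documented in the included README.md.

```python
import email
import mailbox
from email.message import Message


class SubjectHandler:
    """Hypothetical handler: records the subject of every mail it is given."""

    def __init__(self):
        self.subjects = []

    def handle(self, msg: Message) -> bool:
        self.subjects.append(msg.get("Subject", ""))
        return True  # report the mail as handled


def run_maildir(path: str, handler) -> int:
    """Feed every mail in an existing Maildir directory to the handler."""
    handled = 0
    box = mailbox.Maildir(path, create=False)
    try:
        for key in box.iterkeys():
            # Re-parse the raw bytes so the handler sees a plain email Message.
            msg = email.message_from_bytes(box.get_bytes(key))
            handled += bool(handler.handle(msg))
    finally:
        box.close()
    return handled


def run_raw(raw_mails, handler) -> int:
    """Feed already fetched raw mails (for example IMAP payloads) to the handler."""
    handled = 0
    for raw in raw_mails:
        handled += bool(handler.handle(email.message_from_bytes(raw)))
    return handled
```

Because the handler only ever sees parsed messages, the same instance can be exercised in a unit test with hand-written raw mails and no mail server at all.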

The included README.md aims to document the functionality while giving a short tutorial on how to use the code.

The goal is to offer a replacement for the current mail feed related code such as abusehelper.core.imapbot and abusehelper.core.shadowservermail. This pull request doesn't remove any old functionality, though, and should therefore be fully backwards compatible.

Tasks

  • Add a base Handler class
  • Add a Message class with the ability to stream the payload without needing to read everything into memory at once.
  • Add a Mailbox runner
    • Add --concurrency option for handling multiple mails at once
  • Add an IMAPv4 runner
  • Add a command line tester
  • Add a way to write tests for handlers
  • Implement a ShadowServer mail handler
  • Update README.md
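As a rough illustration of what streaming a payload can mean, the sketch below decodes a base64-encoded body chunk by chunk instead of loading it whole. It is not the Message class from this pull request; `iter_base64_decoded` is a hypothetical helper and it assumes well-formed base64 input.

```python
import base64


def iter_base64_decoded(fileobj, chunk_size=8192):
    """Yield decoded bytes from a base64-encoded binary file object.

    Hypothetical helper: only about chunk_size bytes are held in memory
    at any time, however large the payload is.
    """
    pending = b""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        # Line breaks carry no data in base64, so drop them first.
        pending += chunk.replace(b"\r", b"").replace(b"\n", b"")
        # base64 decodes in groups of 4 characters; keep the remainder
        # until the next chunk completes the group.
        usable = len(pending) - (len(pending) % 4)
        if usable:
            yield base64.b64decode(pending[:usable])
            pending = pending[usable:]
    if pending:
        # Leftover characters mean truncated input; b64decode raises here.
        yield base64.b64decode(pending)
```

A caller can write each yielded piece straight to disk or to a parser, which is the property the task list asks for.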

ghost added the enhancement label Jan 6, 2016
ghost (Author) commented Jan 6, 2016

The current abusehelper.core.shadowservermail sets the IMAP filter to (BODY "dl.shadowserver.org" UNSEEN) by default. It would be handy if the IMAP runner, when launched with the ShadowServer handler shadowserver.Handler like this:

$ python -m abusehelper.core.mail.imapbot user@xmpp.example lobbyroom shadowserver.Handler ...

would automatically know to apply a suitable IMAP filter. Currently the IMAP runner can be explicitly configured with a filter, but the filter is not derived automatically from the handler.

Should the handler even be bothered with such things, though? The separation of concerns would be clearer if the Handler could just concentrate on parsing the mails that get thrown at it, while the runners concentrate on fetching the mails.

Currently the Maildir runner just assumes that all mails in a mailbox should be processed. Maybe the IMAP runner could do the same.
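A sketch of that split, written against the standard library's imaplib rather than the runner in this pull request: the filter is a plain argument of the fetching side, and the handler never sees it. `fetch_and_handle` and `DEFAULT_FILTER` are hypothetical names, and the connection is assumed to be an already logged-in `imaplib.IMAP4` (or `IMAP4_SSL`) object with a mailbox selected.

```python
import email

# Assumption for this sketch: without an explicit filter, process every mail.
DEFAULT_FILTER = "ALL"


def fetch_and_handle(conn, handler, search_filter=DEFAULT_FILTER):
    """Fetch the mails matching search_filter and pass each one to handler.

    conn must provide search() and fetch() like imaplib.IMAP4 does.
    handler is any callable taking a parsed message and returning a truthy
    value when the mail was handled.
    """
    status, data = conn.search(None, search_filter)
    if status != "OK":
        raise RuntimeError("IMAP search failed: %r" % (data,))

    handled = 0
    for num in data[0].split():
        status, parts = conn.fetch(num, "(RFC822)")
        if status != "OK":
            continue  # skip mails that could not be fetched
        raw = parts[0][1]  # the first item is an (envelope, raw bytes) tuple
        if handler(email.message_from_bytes(raw)):
            handled += 1
    return handled
```

With this shape the ShadowServer case becomes a matter of the operator passing `'(BODY "dl.shadowserver.org" UNSEEN)'` as `search_filter`, while the handler stays a pure parser.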

execgit (Contributor) commented Jan 7, 2016

I always specify filters manually for IMAP bots, so fine by me.

ghost (Author) commented May 10, 2016

Removed the backwards compatibility layer task, as it should not be a blocker for merging this pull request. This pull request doesn't touch abusehelper.core.imapbot or abusehelper.core.shadowservermail at all, and they can be deprecated separately if needed.

ghost changed the title from "[WIP] abusehelper.core.mail: Generalized mail parsing" to "abusehelper.core.mail: Generalized mail parsing" May 11, 2016
ghost assigned mseppanen May 11, 2016
mseppanen merged commit 8399132 into master May 12, 2016
mseppanen deleted the feature-mail-feeds branch May 12, 2016 13:30
ghost mentioned this pull request May 13, 2016