Commit

Merge pull request #148 from kyrias/nuke-learn
Rip out broken classification system
flokli committed Jun 12, 2017
2 parents 9c8a4da + 86d881d commit 2671bbe
Showing 19 changed files with 18 additions and 579 deletions.
88 changes: 4 additions & 84 deletions README.rst
@@ -16,10 +16,8 @@ Its basic task is to provide automatic tagging each time new mail is registered
with notmuch. In a classic setup, you might call it after 'notmuch new' in an
offlineimap post sync hook.

In addition to more elementary features such as adding tags based on email
headers or maildir folders, handling killed threads and spam, it can do some
heavy magic in order to *learn* how to initially tag your mails based on their
content.
It can do basic thing such as adding tags based on email headers or maildir
folders, handling killed threads and spam.

In move mode, afew will move mails between maildir folders according to
configurable rules that can contain arbitrary notmuch queries to match against
@@ -44,7 +42,6 @@ Please keep an eye on NEWS.md for important news.
Features
--------

* text classification, magic tags aka the mailing list without server
* spam handling (flush all tags, add spam)
* killed thread handling
* tags posts to lists with ``lists``, ``$list-id``
@@ -62,12 +59,6 @@ Features
Installation
------------

You'll need dbacl for the text classification:

.. code:: bash
$ aptitude install dbacl
And I'd like to suggest to install afew as your unprivileged user.
If you do, make sure ``~/.local/bin`` is in your path.

@@ -94,7 +85,6 @@ Put a list of filters into ``~/.config/afew/config``:
# This is the default filter chain
[SpamFilter]
[ClassifyingFilter]
[KillThreadsFilter]
[ListMailsFilter]
[ArchiveSentMailsFilter]
@@ -126,25 +116,13 @@ Commandline help
-h, --help show this help message and exit
Actions:
Please specify exactly one action (both update actions can be
specified simultaniously).
Please specify exactly one action.
-t, --tag run the tag filters
-l LEARN, --learn=LEARN
train the category with the messages matching the
given query
-u, --update update the categories [requires no query]
-U, --update-reference
update the reference category (takes quite some time)
[requires no query]
-c, --classify classify each message matching the given query (to
test the trained categories)
-m, --move-mails move mail files between maildir folders
Query modifiers:
Please specify either --all or --new or a query string. The default
query for the update actions is a random selection of
REFERENCE_SET_SIZE mails from the last REFERENCE_SET_TIMEFRAME days.
Please specify either --all or --new or a query string.
-a, --all operate on all messages
-n, --new operate on all new messages
@@ -232,64 +210,6 @@ For information on how to configure rules for move mode, what you can
do with it and what you can't, please refer to ``docs/move_mode``.



The real deal
-------------

Let's train on an existing tag ``spam``:

.. code:: bash
$ afew --learn spam -- tag:spam
Let's build the reference category. This is important to reduce the
false positive rate. This may take a while...


.. code:: bash
$ afew --update-reference
And now let's create a new tag from an arbitrary query result:

.. code:: bash
$ afew -vv --learn sourceforge -- sourceforge
Let's see how good the classification is:

.. code:: bash
$ afew --classify -- tag:inbox and not tag:killed
Sergio López <slpml@sinrega.org> (2011-10-08) (bug-hurd inbox lists unread) --> no match
Patrick Totzke <reply+i-1840934-9a702d09342dca2b120126b26b008d0deea1731e@reply.github.com> (2011-10-08) (alot inbox lists) --> alot
[...]
As soon as you trained some categories, afew will automatically
tag your new mails using the classifier. If you want to disable this
feature, either use the ``--enable-filters`` option to override the default
set of filters or remove the files in your afew state dir:

.. code:: bash
$ ls ~/.local/share/afew/categories
alot juggling reference_category sourceforge spam
You need to update the category files periodically. I'd suggest to run

.. code:: bash
$ afew --update
on a weekly and

.. code:: bash
$ afew --update-reference
on a monthly basis.


Have fun :)


112 changes: 0 additions & 112 deletions afew/DBACL.py

This file was deleted.

17 changes: 0 additions & 17 deletions afew/Database.py
@@ -23,7 +23,6 @@
import notmuch

from .NotmuchSettings import notmuch_settings, get_notmuch_new_tags
from .utils import extract_mail_body

class Database(object):
'''
@@ -111,22 +110,6 @@ def get_messages(self, query, full_thread = False):
for message in self.walk_thread(thread):
yield message


def mail_bodies_matching(self, *args, **kwargs):
'''
Filters each message yielded from
:func:`Database.get_messages` through
:func:`afew.utils.extract_mail_body`.
This functions accepts the same arguments as
:func:`Database.get_messages`.
:returns: an iterator over :class:`list` of :class:`str`
'''
query = self.get_messages(*args, **kwargs)
for message in query:
yield extract_mail_body(message)

def walk_replies(self, message):
'''
Returns all replies to the given message.
43 changes: 5 additions & 38 deletions afew/commands.py
@@ -39,8 +39,7 @@
# the actions
action_group = parser.add_argument_group(
'Actions',
'Please specify exactly one action (both update actions can be specified '
'simultaneously).'
'Please specify exactly one action.'
)
action_group.add_argument(
'-t', '--tag', action='store_true',
@@ -50,24 +49,6 @@
'-w', '--watch', action='store_true',
help='continuously monitor the mailbox for new files'
)
action_group.add_argument(
'-l', '--learn', action='store',
help='train the category with the messages matching the given query'
)
action_group.add_argument(
'-u', '--update', action='store_true',
help='update the categories [requires no query]'
)
action_group.add_argument(
'-U', '--update-reference', action='store_true',
help='update the reference category (takes quite some time) [requires no'
' query]'
)
action_group.add_argument(
'-c', '--classify', action='store_true',
help='classify each message matching the given query (to test the trained'
' categories)'
)
action_group.add_argument(
'-m', '--move-mails', action='store_true',
help='move mail files between maildir folders'
@@ -77,8 +58,6 @@
query_modifier_group = parser.add_argument_group(
'Query modifiers',
'Please specify either --all or --new or a query string.'
' The default query for the update actions is a random selection of'
' REFERENCE_SET_SIZE mails from the last REFERENCE_SET_TIMEFRAME days.'
)
query_modifier_group.add_argument(
'-a', '--all', action='store_true',
@@ -131,23 +110,17 @@ def main():
no_actions = len(filter_compat(None, (
args.tag,
args.watch,
args.update or args.update_reference,
args.learn,
args.classify,
args.move_mails
)))
if no_actions == 0:
sys.exit('You need to specify an action')
elif no_actions > 1:
sys.exit(
'Please specify exactly one action (both update actions can be'
' given at once)')
sys.exit('Please specify exactly one action')

no_query_modifiers = len(filter_compat(None, (args.all,
args.new, args.query)))
if no_query_modifiers == 0 and not \
(args.update or args.update_reference or args.watch) and not \
args.move_mails:
if no_query_modifiers == 0 and not args.watch \
and not args.move_mails:
sys.exit('You need to specify one of --new, --all or a query string')
elif no_query_modifiers > 1:
sys.exit('Please specify either --all, --new or a query string')
@@ -158,14 +131,8 @@ def main():
query_string = get_notmuch_new_query()
elif args.all:
query_string = ''
elif not (args.update or args.update_reference):
query_string = ' '.join(args.query)
elif args.update or args.update_reference:
query_string = '%i..%i' % (
time.time() - args.reference_set_timeframe * 24 * 60 * 60,
time.time())
else:
sys.exit('Weird... please file a bug containing your command line.')
query_string = ' '.join(args.query)

loglevel = {
0: logging.WARNING,
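
Reviewer sketch (not part of the commit): the validation that remains in main() once the learn/update/classify actions are gone can be modelled roughly as below. This is a minimal approximation, assuming only the options visible in this diff. The names build_parser and validate and the 'afew-sketch' program name are hypothetical, and the sketch returns the error message where the real code calls sys.exit().

```python
import argparse


def build_parser():
    # Hypothetical stand-in for afew's parser: only the options that are
    # visible in the diff above are modelled here.
    parser = argparse.ArgumentParser(prog='afew-sketch')
    parser.add_argument('-t', '--tag', action='store_true')
    parser.add_argument('-w', '--watch', action='store_true')
    parser.add_argument('-m', '--move-mails', action='store_true')
    parser.add_argument('-a', '--all', action='store_true')
    parser.add_argument('-n', '--new', action='store_true')
    parser.add_argument('query', nargs='*')
    return parser


def validate(args):
    """Return an error message, or None if the combination is acceptable."""
    # Exactly one of the three remaining actions must be given.
    n_actions = sum(1 for a in (args.tag, args.watch, args.move_mails) if a)
    if n_actions == 0:
        return 'You need to specify an action'
    if n_actions > 1:
        return 'Please specify exactly one action'

    # --watch and --move-mails may run without a query modifier;
    # --tag needs exactly one of --all, --new or a query string.
    n_modifiers = sum(1 for m in (args.all, args.new, args.query) if m)
    if n_modifiers == 0 and not args.watch and not args.move_mails:
        return 'You need to specify one of --new, --all or a query string'
    if n_modifiers > 1:
        return 'Please specify either --all, --new or a query string'
    return None
```

For example, validate(build_parser().parse_args(['-t', '-n'])) passes, while combining -t with -m is rejected. With the update actions removed there is no longer any special case in which two actions may be given together.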
1 change: 0 additions & 1 deletion afew/defaults/afew.config
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,6 @@

# This is the default filter chain
#[SpamFilter]
#[ClassifyingFilter]
#[KillThreadsFilter]
#[ListMailsFilter]
#[ArchiveSentMailsFilter]
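
Reviewer sketch (not part of the commit): users who listed [ClassifyingFilter] in their own ``~/.config/afew/config`` may want to check for it after upgrading. The helper below is a hypothetical illustration, not an afew API. It assumes the filter class itself is removed along with DBACL.py (that file is not shown in this excerpt) and that the config parses as plain INI.

```python
import configparser

# Assumption: this is the only filter section dropped by the commit.
REMOVED_FILTERS = {'ClassifyingFilter'}


def find_removed_filters(config_text):
    """Return the section names in config_text that name a removed filter."""
    # RawConfigParser avoids '%' interpolation surprises in user configs.
    parser = configparser.RawConfigParser(allow_no_value=True)
    parser.read_string(config_text)
    return sorted(s for s in parser.sections() if s in REMOVED_FILTERS)
```

Passing the old default chain ([SpamFilter], [ClassifyingFilter], [KillThreadsFilter], ...) as text should report the single stale section, and the new default chain should report none.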
