How to teach Bayes ham and spam #1438

dmenne · 2020-04-08T13:48:25Z

The requirement of 200 ham messages for Bayes classification is not easy to meet on low-volume servers. rspamd works reasonably well for the usual viagnigeria spam, but it fails completely for the dozens of requests per day to submit my publication to some fake-open-source journal.

I am used to using sa-learn with SpamAssassin, or rspamc with rspamd, to train ham from existing mail, but there is no rspamc. The spam/learn tab in rspamd is ridiculous: should I manually paste in hundreds of ham messages?

What is the best method to instruct a mailu installation to use an existing clean mailbox as ham-fodder?

ofthesun9 · 2020-04-08T14:21:17Z

Hello,
You can move messages to the Junk folder (using your mail client or the webmail); rspamd will learn them as spam.

If you move a message from the Junk folder back to any other folder (except the Trash folder), Mailu will learn it as ham. Not straightforward, but maybe that would work?

A more straightforward way would be to modify the existing behaviour as defined in dovecot.

dmenne · 2020-04-08T14:27:17Z

Sounds scary, but would it work? Will it be "unlearned" from spam after moving it back?

dmenne · 2020-04-08T14:31:35Z

I did it and got a strange message (this is from the neural module, so not directly relevant):

; lua; neural.lua:487: cannot learn ANN tRFANN9D954503A5235C34260: too many ham samples: 162

dmenne · 2020-04-08T14:35:04Z

classify: skip classification as ham class has not enough learns: 144, 200 required

This value was unchanged by moving the message in and out.
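For reference, the gap to the threshold can be read straight off that log line. A minimal sketch, assuming the log format shown above; `learns_missing` is a hypothetical helper name, not part of rspamd or Mailu:

```shell
#!/bin/sh
# learns_missing (hypothetical helper): takes an rspamd log line of the form
# "... not enough learns: N, M required" and prints M - N, i.e. how many
# more messages of that class still have to be learned.
learns_missing() {
  printf '%s\n' "$1" |
    sed -n 's/.*not enough learns: \([0-9][0-9]*\), \([0-9][0-9]*\) required.*/\1 \2/p' |
    while read -r have need; do
      echo $((need - have))
    done
}

# Log line copied from the message above; prints 56
learns_missing 'classify: skip classification as ham class has not enough learns: 144, 200 required'
```

With 144 of 200 ham learns, 56 more ham messages are needed before Bayes classification starts.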

ofthesun9 · 2020-04-08T18:26:47Z

> Sounds scary, but would it work? Will it be "unlearned" from spam after moving it back?

I looked into the code, and we may be missing a "fuzzy_del" action when a message is moved from ham to spam and vice versa.
I will submit a PR for this.

dmenne · 2020-04-08T18:34:41Z

Thanks for looking into it. Please post your PR # so I can follow up.

ofthesun9 · 2020-04-09T08:13:29Z

Yes, I will.
To come back to your question "How to teach Bayes ham and spam":

To learn a spam message: put it in the Junk folder.
To learn a ham message: you could create a mailbox "Ham" and create the following file, /mailu/overrides/dovecot.conf:

plugin {
  # Learn ham
  imapsieve_mailbox3_name = Ham
  imapsieve_mailbox3_from = *
  imapsieve_mailbox3_causes = COPY
  imapsieve_mailbox3_before = file:/conf/report-ham.sieve
}

Any new message put in this folder will be learnt as ham (for both bayes and fuzzy).
To learn from an already existing mailbox, if moving the messages back and forth is not suitable, you can run rspamc from within the dovecot container:

rspamc -h antispam:11334 -P mailu -f 13 fuzzy_add /mail/user\@example.com/.Ham_Learn/cur/

This should learn every file located in the Ham_Learn mailbox of user@example.com.
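Note that fuzzy_add feeds the fuzzy storage; rspamc also has a learn_ham command for the Bayes classifier, which is what the "200 required" counter refers to. A minimal sketch that issues both for one Maildir folder, assuming the same host, password and fuzzy flag as in the command above; `learn_ham_dir` and `DRY_RUN` are hypothetical names used only for illustration:

```shell
#!/bin/sh
# Sketch: feed an existing Maildir folder to fuzzy storage and to Bayes.
# Host (antispam:11334), password (mailu) and flag (13) are copied from the
# command above. learn_ham_dir and DRY_RUN are hypothetical names.
# By default the commands are only printed; set DRY_RUN=0 to execute them.
learn_ham_dir() {
  dir="$1"
  for cmd in "-f 13 fuzzy_add" "learn_ham"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "rspamc -h antispam:11334 -P mailu $cmd $dir"
    else
      # $cmd is left unquoted on purpose so it splits into separate arguments
      rspamc -h antispam:11334 -P mailu $cmd "$dir"
    fi
  done
}

learn_ham_dir '/mail/user@example.com/.Ham_Learn/cur/'
```

Run as is, this only prints the two rspamc commands. With DRY_RUN=0 it has to run where rspamc is available, for example inside the dovecot container as described above.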

dmenne · 2020-04-09T08:22:02Z

Thanks. I know about the rspamc method, but I tried to run it from the rspamd container. Maybe put this info somewhere in the docs?

1440: Relearn messages for fuzzy storage r=mergify[bot] a=ofthesun9

## What type of PR?

enhancement, bugfix

## What does this PR do?

This PR adds an rspamc fuzzy_del to the ham & spam scripts, in order to cover [relearning](https://rspamd.com/doc/faq.html#can-i-relearn-messages-for-fuzzy-storage-or-for-statistics) from Junk list to Ham list and vice versa.

### Related issue(s)

#1438

## Prerequisites

Before we can consider review and merge, please make sure the following list is done and checked. If an entry is not applicable, you can check it or remove it from the list.

- [x] Added 1438.bugfix

Co-authored-by: ofthesun9 <olivier@ofthesun.net>

HorayNarea · 2020-04-12T22:50:35Z

In the meantime you can also learn spam/ham in the web interface of rspamd under yourmaildomain.tld/admin/antispam/#scan (which only works if you are logged in as an administrator).

dmenne · 2020-04-13T07:08:53Z

As I wrote in the original posting: "The spam/learn tab in rspamd is ridiculous - should I manually paste in hundreds of ham messages?"

1470: Adding faq entry: How to teach Bayes ham and spam #1438 r=mergify[bot] a=ofthesun9

Fix #1438

## What type of PR?

documentation (faq)

## What does this PR do?

This PR adds an FAQ entry to cover #1438.

### Related issue(s)

closes #1438

## Prerequisites

- [x] In case of feature or enhancement: documentation updated accordingly

Co-authored-by: ofthesun9 <olivier@ofthesun.net>

Fix Mailu#1438

ofthesun9 added a commit to ofthesun9/Mailu that referenced this issue Apr 9, 2020

Newsfragment for Mailu#1438

32d7162

ofthesun9 mentioned this issue Apr 9, 2020

Relearn messages for fuzzy storage #1440

Merged

1 task

ofthesun9 mentioned this issue May 1, 2020

Adding faq entry: How to teach Bayes ham and spam #1438 #1470

Merged

1 task

bors bot closed this as completed in 888ce1b May 5, 2020

sholl pushed a commit to sholl/Mailu that referenced this issue Jun 26, 2020

Newsfragment for Mailu#1438

fe6fd66

sholl pushed a commit to sholl/Mailu that referenced this issue Jun 26, 2020

Adding faq entry to cover Mailu#1438

cb373be

Fix Mailu#1438

woj-tek mentioned this issue Sep 1, 2022

rspamc doesn't seem to correctly teach HAM / spam filtering almost doesn't work #2442

Closed

7 tasks
