How to teach Bayes ham and spam #1438

Closed
dmenne opened this issue Apr 8, 2020 · 10 comments · Fixed by #1470

Comments

@dmenne

dmenne commented Apr 8, 2020

The requirement of 200 ham messages for Bayes classification is not easy to meet on low-volume servers. rspamd works reasonably well for the usual viagnigeria spam, but it fails completely for the dozens of requests per day to submit my publication to some fake-open-source journal.

I am used to using sa-learn with SpamAssassin, or rspamc with rspamd, to train ham from existing mail, but there is no rspamc. The spam/learn tab in rspamd is ridiculous - should I manually paste in hundreds of ham messages?

What is the best method to instruct a mailu installation to use an existing clean mailbox as ham-fodder?

@ofthesun9
Contributor

Hello,
You can move messages to the Junk folder (using your mail client or the webmail); rspamd will learn them as spam.

If you move a message from the Junk folder back to any other folder (except the Trash folder), Mailu will learn it as ham. Not straightforward, but maybe that would work?
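
The folder rule described above can be restated as a small decision function. This is only an illustration of the rule as worded in this comment, not Mailu's implementation (Mailu does this through dovecot); the function name and the folder-name constants are hypothetical.

```python
from typing import Optional

# Folder names as used in this thread; adjust if your setup names them differently.
JUNK = "Junk"
TRASH = "Trash"


def learn_action(source: str, destination: str) -> Optional[str]:
    """Return "spam", "ham", or None for a message moved between folders."""
    if destination == JUNK:
        # Anything moved into Junk is learned as spam.
        return "spam"
    if source == JUNK and destination != TRASH:
        # Rescued from Junk into a regular folder: learned as ham.
        return "ham"
    # Every other move (including Junk -> Trash) triggers no learning.
    return None
```

For example, `learn_action("Junk", "INBOX")` gives `"ham"`, while `learn_action("Junk", "Trash")` gives `None`.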

A more straightforward way would be to modify the existing behaviour as defined in dovecot.

@dmenne
Author

dmenne commented Apr 8, 2020

Sounds scary, but would it work? Will it be "unlearned" from spam after moving it back?

@dmenne
Author

dmenne commented Apr 8, 2020

I did it and got a strange message (this is from the neural module, so not directly relevant):

; lua; neural.lua:487: cannot learn ANN tRFANN9D954503A5235C34260: too many ham samples: 162

@dmenne
Author

dmenne commented Apr 8, 2020

classify: skip classification as ham class has not enough learns: 144, 200 required

This value was not changed by moving messages in and out.
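
To track how far a low-volume server is from the threshold, the log line above can be parsed. This is a minimal sketch: the regular expression is written against the single line quoted in this thread, and the assumption that the spam class produces a line of the same shape is mine, not confirmed here. The function name is hypothetical.

```python
import re
from typing import Optional, Tuple

# Pattern derived from the one log line quoted in this thread.
_PATTERN = re.compile(
    r"(ham|spam) class has not enough learns: (\d+), (\d+) required"
)


def missing_learns(log_line: str) -> Optional[Tuple[str, int]]:
    """Return (class, number of learns still needed), or None if no match."""
    match = _PATTERN.search(log_line)
    if match is None:
        return None
    label = match.group(1)
    have, need = int(match.group(2)), int(match.group(3))
    return label, max(need - have, 0)


line = ("classify: skip classification as ham class has not enough learns: "
        "144, 200 required")
print(missing_learns(line))  # ('ham', 56)
```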

@ofthesun9
Contributor

Sounds scary, but would it work? Will it be "unlearned" from spam after moving it back?

I looked into the code, and we may be missing a "fuzzy_del" action when moving a message from ham to spam and vice versa.
I will submit a PR for this.
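
As a rough sketch of what such a relearn step could look like, the helper below builds the two rspamc invocations: a fuzzy_del against the old class followed by a fuzzy_add for the new one. This is not the code of the PR. The host, password and ham flag 13 are taken from the rspamc example elsewhere in this thread; the spam flag value and the function name are hypothetical placeholders.

```python
from typing import List

HOST = "antispam:11334"  # controller address used in this thread
PASSWORD = "mailu"       # controller password used in this thread
HAM_FLAG = 13            # flag used for ham in this thread's example
SPAM_FLAG = 11           # hypothetical: check your own fuzzy rule flags


def relearn_commands(path: str, new_class: str) -> List[List[str]]:
    """Build argv lists: delete the old fuzzy hash, then add the new one."""
    if new_class not in ("ham", "spam"):
        raise ValueError("new_class must be 'ham' or 'spam'")
    if new_class == "ham":
        old_flag, new_flag = SPAM_FLAG, HAM_FLAG
    else:
        old_flag, new_flag = HAM_FLAG, SPAM_FLAG
    base = ["rspamc", "-h", HOST, "-P", PASSWORD]
    return [
        base + ["-f", str(old_flag), "fuzzy_del", path],
        base + ["-f", str(new_flag), "fuzzy_add", path],
    ]
```

The commands are only built, not executed, so the result can be inspected before passing each list to `subprocess.run`.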

@dmenne
Author

dmenne commented Apr 8, 2020

Thanks for looking into it. Please post your PR number so I can follow up.

@ofthesun9
Contributor

Yes, I will.
To come back to your question, "How to teach Bayes ham and spam":

  • to learn a spam message: put it in the Junk folder
  • to learn a ham message: you could create a mailbox "Ham" and create the following file, /mailu/overrides/dovecot.conf:
plugin {
  # Learn ham
  imapsieve_mailbox3_name = Ham
  imapsieve_mailbox3_from = *
  imapsieve_mailbox3_causes = COPY
  imapsieve_mailbox3_before = file:/conf/report-ham.sieve
}

Any new message put in this folder will be learned as ham (for both Bayes and fuzzy).
To learn from an already existing mailbox, if moving the messages back and forth is not suitable, you can run rspamc from within the dovecot container:

rspamc -h antispam:11334 -P mailu -f 13 fuzzy_add /mail/user\@example.com/.Ham_Learn/cur/

This should learn every file located in the Ham_Learn mailbox of user@example.com.
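
Note that the command above uses fuzzy_add, which feeds fuzzy storage; rspamc's command for the Bayes classifier is learn_ham. A minimal sketch for bulk-learning an existing Maildir folder as Bayes ham could look like the following. It assumes the same controller address and password as the command above and is meant to run where rspamc is available (the dovecot container, per this thread). The function names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def ham_learn_commands(
    maildir_cur: Path,
    host: str = "antispam:11334",  # controller address from this thread
    password: str = "mailu",       # controller password from this thread
) -> List[List[str]]:
    """Build one `rspamc learn_ham` argv list per message file in cur/."""
    commands = []
    for message in sorted(maildir_cur.iterdir()):
        if message.is_file():  # skip any subdirectories
            commands.append(
                ["rspamc", "-h", host, "-P", password, "learn_ham", str(message)]
            )
    return commands


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands, or execute them when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
```

A dry run such as `run_all(ham_learn_commands(Path("/mail/user@example.com/.Ham_Learn/cur")))` prints the commands first, so the paths can be checked before anything is learned.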

@dmenne
Author

dmenne commented Apr 9, 2020

Thanks. I know about the rspamc method, but I tried to run it from the rspamd container. Maybe put this info somewhere in the docs?

ofthesun9 added a commit to ofthesun9/Mailu that referenced this issue Apr 9, 2020
bors bot added a commit that referenced this issue Apr 12, 2020
1440:  Relearn messages for fuzzy storage r=mergify[bot] a=ofthesun9

## What type of PR?
enhancement, bugfix

## What does this PR do?
This PR adds an rspamc fuzzy_del call to the ham and spam scripts, in order to cover
[relearning](https://rspamd.com/doc/faq.html#can-i-relearn-messages-for-fuzzy-storage-or-for-statistics) from the Junk list to the Ham list and vice versa.

### Related issue(s)
#1438

## Prerequisites
Before we can consider review and merge, please make sure the following list is done and checked.
If an entry is not applicable, you can check it or remove it from the list.

- [x] Added 1438.bugfix


Co-authored-by: ofthesun9 <olivier@ofthesun.net>
@HorayNarea
Member

In the meantime, you can also learn spam/ham in the web interface of rspamd at yourmaildomain.tld/admin/antispam/#scan (which only works if you are logged in as an administrator).

@dmenne
Author

dmenne commented Apr 13, 2020

As I wrote in the original posting: "The spam/learn tab in rspamd is ridiculous - should I manually paste in hundreds of ham messages?"

bors bot added a commit that referenced this issue May 4, 2020
1470: Adding faq entry: How to teach Bayes ham and spam #1438 r=mergify[bot] a=ofthesun9

Fix #1438

## What type of PR?
documentation (faq)

## What does this PR do?
This PR adds an FAQ entry to cover #1438

### Related issue(s)
closes #1438 

## Prerequisites
- [x] In case of feature or enhancement: documentation updated accordingly



Co-authored-by: ofthesun9 <olivier@ofthesun.net>
bors bot added a commit that referenced this issue May 5, 2020
1470: Adding faq entry: How to teach Bayes ham and spam #1438 r=muhlemmer a=ofthesun9
@bors bors bot closed this as completed in 888ce1b May 5, 2020
sholl pushed a commit to sholl/Mailu that referenced this issue Jun 26, 2020