Rspamd unexpectedly stops working #1869

Open
gibzer opened this Issue Oct 18, 2017 · 8 comments

gibzer commented Oct 18, 2017

Classification (Please choose one option):

  • Crash/Hang/Data loss
  • WebUI/Usability
  • Serious bug
  • Other bug
  • Feature
  • Enhancement

Reproducibility (Please choose one option):

  • Always
  • Sometimes
  • Rarely
  • Unable
  • I didn’t try
  • Not applicable

Rspamd version:

1.6.4

Operating system, CPU, memory and environment:

Ubuntu 16.04

Description (Please provide a descriptive summary of the issue):

Rspamd sometimes stops working unexpectedly. All Rspamd processes are still running (main, rspamd_proxy, controller, normal, log_helper, hs_helper); none of them is down. The Rspamd log is clean, as if no mail were coming in. The Postfix log shows this:

Oct 18 10:29:33 mail postfix/smtpd[5768]: warning: milter inet:localhost:11332: can't read SMFIC_OPTNEG reply packet header: Connection timed out
Oct 18 10:29:33 mail postfix/smtpd[5768]: warning: milter inet:localhost:11332: read error in initial handshake
Oct 18 10:29:33 mail postfix/smtpd[5768]: NOQUEUE: milter-reject: CONNECT from mmail8.mail365.ru[83.222.116.101]: 451 4.7.1 Service unavailable - try again later; proto=SMTP
Oct 18 10:29:33 mail postfix/smtpd[5768]: NOQUEUE: milter-reject: EHLO from mmail8.mail365.ru[83.222.116.101]: 451 4.7.1 Service unavailable - try again later; proto=SMTP helo=<mmail8.mail365.ru>

This can happen anywhere from once a month to twice a week, so I can't identify a pattern.

Compile errors (if any):

Steps to reproduce:

Expected results:

Actual results:

Debugging information (see details here):

Configuration:

Additional information:

fenice2 commented Oct 19, 2017

This sounds like it might be a problem with the MTA rather than Rspamd. Can you check whether the milter protocol is set correctly to "6"? You can check the protocol version with "postconf milter_protocol". Which MTA are you using?

gibzer commented Oct 20, 2017

I'm not sure the problem is caused by the MTA. I use Postfix. When this happens, restarting Rspamd makes everything work again, so there is no need to restart the MTA.

smtpd_milters = inet:localhost:11332, inet:localhost:54000
non_smtpd_milters = inet:localhost:11332, inet:localhost:54000
milter_protocol = 6
milter_mail_macros = i {mail_addr} {client_addr} {client_name} {auth_authen}
milter_default_action = tempfail

fenice2 commented Oct 21, 2017

I see you have multiple milters in there. What's the second one for? Is that the one that might be failing when mail flow stops? I believe "milter_default_action" applies to both milters unless you specify different actions for each milter. Have you tried setting "milter_default_action" to accept to see if that allows mail to flow?

gibzer commented Oct 23, 2017

The second one is the AVG antivirus milter. The log shows that the failing milter is Rspamd, and the error goes away after an Rspamd restart. I tried milter_default_action = accept: mail flows, but without the Rspamd check.

gibzer commented Nov 16, 2017

The problem still persists. Got 2 crashes within 3 days. The Rspamd logs are clean and all processes are running. I have to use Monit to watch the Postfix log for Rspamd errors and restart it...
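
Since the processes stay up while the handshake times out, a process check alone would not catch this state. A minimal sketch of an active probe follows, assuming the milter packet layout of a 4-byte big-endian length, the command byte `O` (SMFIC_OPTNEG), and three 32-bit fields (version, actions, protocol flags). The function name, the bitmask values and the 5-second timeout are illustrative assumptions, and it has not been verified against Rspamd itself.

```python
import socket
import struct

SMFIC_OPTNEG = b"O"  # option negotiation command byte


def milter_answers_optneg(host="localhost", port=11332, timeout=5.0):
    """Return True if the milter replies to an OPTNEG packet within `timeout`."""
    # Version 6 matches milter_protocol = 6; the two bitmasks are
    # illustrative "offer everything" values, not taken from Postfix.
    payload = SMFIC_OPTNEG + struct.pack("!III", 6, 0x1FF, 0x1FFFFF)
    packet = struct.pack("!I", len(payload)) + payload
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(packet)
            # 4 length bytes plus the command byte of the reply.
            header = _read_exact(sock, 5)
    except OSError:
        # Covers refused connections and timeouts alike.
        return False
    return header is not None and header[4:5] == SMFIC_OPTNEG


def _read_exact(sock, n):
    """Read exactly n bytes, or return None if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None
        buf += chunk
    return buf
```

A monitoring job could call `milter_answers_optneg()` periodically and restart Rspamd when it returns False, instead of waiting for Postfix to log the timeout.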

jrampin commented Nov 16, 2017

@gibzer are you still getting this issue? Have you found a workaround? I had this once two weeks ago, and again 2 days ago...

gibzer commented Nov 17, 2017

Yes, I still have it. The only workaround is to use Monit to restart Rspamd when Postfix logs that it can't connect to Rspamd.
Monit config:

check file maillog with path /var/log/mail.log
    if match "warning: milter inet:localhost:11332: can't read SMFIC_OPTNEG reply packet header: Connection timed out" then exec "/usr/sbin/service rspamd restart"
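
The same match can be sketched outside Monit, for example to test the pattern against old logs before relying on it. This is only an illustration: the function name `timed_out_milters` is hypothetical, and the regular expression is built from the Postfix warning quoted earlier in this thread.

```python
import re

# Matches the Postfix warning quoted in this thread. Host and port are
# captured so that either configured milter can be told apart.
MILTER_TIMEOUT = re.compile(
    r"warning: milter inet:(?P<host>[^:\s]+):(?P<port>\d+): "
    r"can't read SMFIC_OPTNEG reply packet header: Connection timed out"
)


def timed_out_milters(lines):
    """Return the set of (host, port) milters that hit an OPTNEG timeout."""
    hits = set()
    for line in lines:
        match = MILTER_TIMEOUT.search(line)
        if match:
            hits.add((match.group("host"), int(match.group("port"))))
    return hits
```

Feeding it the lines of /var/log/mail.log shows whether port 11332 (Rspamd) or 54000 (the second milter) is the one timing out.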

jrampin commented Nov 26, 2017

@gibzer I've done 2 things:

  1. Increased the memory to 16GB of RAM
  2. Added these two parameters to the redis.conf file

/etc/redis/redis.conf


maxmemory 1024mb
maxmemory-policy volatile-lru

I haven't had that error for a while now. Both servers reach up to 10GB of used memory and then release it after a while.

Hopefully it can help you! :)
