Many ossec-maild processes stuck at ep_poll #1848

Closed
alanwevans opened this issue Feb 28, 2020 · 12 comments

Comments

@alanwevans

Many ossec-maild processes are stuck on ep_poll and eventually the system becomes unusable.

OSSEC Version: 3.6.0 (also present in at least 3.4.0) installed from Atomic RPMs
OS: CentOS 7.7

OSSEC Maild config:

<ossec_config>
  <global>
    <email_notification>yes</email_notification>
    <email_to>REDACTED</email_to>
    <smtp_server>127.0.0.1</smtp_server>
    <email_from>REDACTED</email_from>
    <email_maxperhour>1024</email_maxperhour>
  </global>  
...

The local MTA is postfix

# rpm -q postfix
postfix-2.10.1-7.x86_64

And there's nothing interesting in /var/log/maillog to indicate problems between ossec-maild and postfix itself.

Process list

The following output is from a system where OSSEC has been running for ~ 40 minutes. Process ID 21540 is the "main" ossec-maild process and you can see many processes stuck in the Sleep state at ep_poll (I have removed most of these for brevity). You can also see that there are some processes in the Running state that have high system times. Occasionally I have been ending up with <defunct> processes but there were none at the time of the output below.

# ps -eo f,s,pid,ppid,c,pri,ni,addr,sz,wchan=WIDE-WCHAN-COLUMN,stime,time,cmd | grep -e WCHAN -e ossec-maild
F S   PID  PPID  C PRI  NI ADDR    SZ WIDE-WCHAN-COLUMN STIME     TIME CMD
5 S 21540     1  0  19   0    - 15327 poll_schedule_tim 11:08 00:00:00 /var/ossec/bin/ossec-maild
5 S 21541 21540  0  19   0    - 15326 ep_poll           11:08 00:00:00 /var/ossec/bin/ossec-maild
1 S 24645 21540  0  19   0    - 15327 ep_poll           11:23 00:00:00 /var/ossec/bin/ossec-maild
... 23 lines removed for brevity
1 S 27486 21540  0  19   0    - 15327 ep_poll           11:35 00:00:00 /var/ossec/bin/ossec-maild
1 R 27909 21540 81  19   0    - 15327 -                 11:37 00:17:42 /var/ossec/bin/ossec-maild
1 S 27910 21540  0  19   0    - 15327 ep_poll           11:37 00:00:00 /var/ossec/bin/ossec-maild
... 78 lines removed for brevity
1 S 28567 21540  0  19   0    - 15327 ep_poll           11:39 00:00:00 /var/ossec/bin/ossec-maild
1 R 28568 21540 99  19   0    - 15327 -                 11:39 00:19:42 /var/ossec/bin/ossec-maild
1 S 28569 21540  0  19   0    - 15327 ep_poll           11:39 00:00:00 /var/ossec/bin/ossec-maild
... 26 lines removed for brevity
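
To track how the stuck processes accumulate, the listing above can be condensed into one count per process state and wait channel. The following is a minimal sketch, not part of OSSEC: the function name `summarise_maild` is hypothetical, and it assumes input lines of the form `STATE WCHAN CMD`, as produced by a trimmed-down version of the `ps` invocation above.

```shell
#!/bin/sh
# Hypothetical helper: count ossec-maild processes per (state, wait channel).
# Expects "STATE WCHAN CMD..." lines on stdin; any header line is ignored
# because its third field is not a path ending in ossec-maild.
summarise_maild() {
    awk '$3 ~ /ossec-maild$/ { count[$1 " " $2]++ }
         END { for (k in count) print count[k], k }' | LC_ALL=C sort -k2
}
```

It could be fed with `ps -eo s,wchan=WIDE-WCHAN-COLUMN,cmd | summarise_maild`. A steadily growing count on the `S ep_poll` line would match the behaviour described here.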

"No socket." errors
I noticed something else strange. After ossec-maild receives SIGTERM, /var/ossec/logs/ossec.log is flooded with "ERROR: No socket." messages.

2020/02/27 11:07:33 ossec-maild: DEBUG: Running OS_Sendmail()
2020/02/27 11:07:33 ossec-maild: DEBUG: Running OS_Sendmail()
2020/02/27 11:07:33 ossec-maild: DEBUG: Running OS_Sendmail()
2020/02/27 11:07:33 ossec-maild: DEBUG: Running OS_Sendmail()
2020/02/27 11:08:45 ossec-maild(1225): INFO: SIGNAL [(15)-(Terminated)] Received. Exit Cleaning...
2020/02/27 11:08:45 ossec-maild: ERROR: No socket.
.. 275 lines removed for brevity
2020/02/27 11:08:45 ossec-maild: ERROR: No socket.
2020/02/27 11:08:53 ossec-maild: INFO: Started (pid: 21540).
2020/02/27 11:09:08 ossec-maild: DEBUG: Running OS_Sendmail()
2020/02/27 11:17:08 ossec-maild: DEBUG: Running OS_Sendmail()
2020/02/27 11:17:08 ossec-maild: DEBUG: Running OS_Sendmail()

Maybe something kicked off by event_dispatch()?

event_dispatch();
if (os_sock <= 0) {
    ErrorExit("ossec-maild: ERROR: No socket.");
}
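
To put a number on the flood of "No socket." messages, the log can be grouped by timestamp. This is a rough sketch under the assumption that every line starts with a date and a time field, as in the excerpt above. The function name `count_no_socket` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count "ERROR: No socket." lines per timestamp.
# Reads ossec.log-style lines on stdin and prints "DATE TIME COUNT".
count_no_socket() {
    awk '/ossec-maild: ERROR: No socket\./ { n[$1 " " $2]++ }
         END { for (t in n) print t, n[t] }' | LC_ALL=C sort
}
```

Running `count_no_socket < /var/ossec/logs/ossec.log` right after a restart would show how many of these errors were logged in the same second as the SIGTERM. That figure could then be compared with the number of ossec-maild processes that existed before the restart.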

@ddpbsd
Member

ddpbsd commented Feb 29, 2020

I've seen similar problems. PR #1842 is an attempt to fix some of the problems. Testing would be appreciated!

@sic-bordeaux

sic-bordeaux commented Aug 7, 2020

I have a similar issue: about 15 ossec-maild processes after the service has been running for 1 hour, and about 350 after 1 day.
However, I do not see any "ossec-maild: ERROR: No socket." messages, nor any "DEBUG: Running OS_Sendmail()" messages.

I only have:

grep ossec-maild /var/ossec/logs/ossec.log
2020/08/07 09:57:42 ossec-maild: DEBUG: Starting ...
2020/08/07 09:57:42 ossec-maild: INFO: Chrooted to directory: /var/ossec
2020/08/07 09:57:42 ossec-maild [dns]: INFO: Starting osdns

OSSEC version: 1:3.6.0-12032.el7.art (also tested with 1:3.6.0-11279.el7.art)
OS: CentOS Linux release 7.8.2003 (Core)

local mta: postfix-2.10.1-9.el7.x86_64 (also tested with remote smtp server)

ossec-maild conf:

  <global>
    <email_notification>yes</email_notification>
    <email_to>REDACTED</email_to>
    <smtp_server>localhost</smtp_server>
    <email_from>REDACTED</email_from>
    <email_maxperhour>1024</email_maxperhour>
  </global>
  <email_alerts>
    <email_to>REDACTED2</email_to>
    <level>10</level>
    <event_location>SPECIFIC_LOCATION</event_location>
    <do_not_delay />
    <do_not_group />
  </email_alerts>
  <email_alerts>
    <email_to>REDACTED</email_to>
    <do_not_delay />
    <do_not_group />
  </email_alerts>

debug enabled with the following command:
/var/ossec/bin/ossec-control enable debug && /var/ossec/bin/ossec-control

Is there a way to increase ossec-maild verbosity?

How can I test #1848 (comment)?

As a workaround, I added the following to the /var/ossec/logs/ossec.log section of /etc/logrotate.d/ossec-hids:

    daily
    postrotate
        /var/ossec/bin/ossec-control reload > /dev/null 2>&1
    endscript
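
A variant of this workaround would reload only when the process count has actually grown, rather than unconditionally once a day. The sketch below is an assumption-laden illustration: the function name `maild_needs_reload` and the default limit of 50 are hypothetical and not taken from OSSEC.

```shell
#!/bin/sh
# Hypothetical check: succeed (exit 0) when the given ossec-maild process
# count exceeds a limit. The default limit of 50 is an arbitrary assumption.
maild_needs_reload() {
    count=$1
    limit=${2:-50}
    [ "$count" -gt "$limit" ]
}
```

From cron it might be used as `maild_needs_reload "$(pgrep -c -x ossec-maild)" 50 && /var/ossec/bin/ossec-control reload`. Like the logrotate snippet, this only hides the symptom and does not address why the processes pile up.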

@ddpbsd
Member

ddpbsd commented Aug 7, 2020

@sic-bordeaux I recommend testing #1891 instead.

@sic-bordeaux

How can I test it?

@ddpbsd
Member

ddpbsd commented Aug 7, 2020

@sic-bordeaux

git clone https://github.com/ossec/ossec-hids.git && cd ossec-hids && git pull https://github.com/ddpbsd/ossec-hids.git "revert_dns" && sudo ./install.sh

@sic-bordeaux

Thank you. For the moment I just ran git pull https://github.com/ddpbsd/ossec-hids.git "revert_dns" && cd src && make TARGET=server && cp src/ossec-maild /var/ossec/bin/ossec-maild && /var/ossec/bin/ossec-control reload

I'll test it for 2 hours before reverting to the previous version for the weekend, and run more tests next week.

@sic-bordeaux

It does not work. Mails are not sent at all.
I'll update the whole project next week.

@ddpbsd
Member

ddpbsd commented Aug 7, 2020

@sic-bordeaux Strange, works for me. Try changing localhost to 127.0.0.1 for your smtp server.

@sic-bordeaux

Thanks, mails are sent when I use the IP address instead of the name, with only the compiled ossec-maild binary copied to the ossec bin dir.
I'll let you know if the number of processes keeps growing after a few hours.

@sic-bordeaux

Seems to be OK for now. Only 1 ossec-maild process, and mails are still being delivered.

@sic-bordeaux

Everything is OK after one week with this ossec-maild binary.
Thanks.

@atomicturtle
Member

closed as resolved


4 participants