[dev.icinga.com #2340] async patch from ndoutils 1.5 (ipc, kernel message queue) #871

Closed
icinga-migration opened this Issue Feb 21, 2012 · 4 comments


This issue has been migrated from Redmine: https://dev.icinga.com/issues/2340

Created by mfriedrich on 2012-02-21 13:55:16 +00:00

Assignee: (none)
Status: Closed (closed on 2012-05-05 11:38:53 +00:00)
Target Version: (none)
Last Update: 2014-12-08 14:46:34 +00:00 (in Redmine)


the recent ndoutils 1.5 release introduces a so-called "asynchronous kernel message queue patch", which merely splits up the communication in ndo2db: after reading data from the socket, it puts the packet on a message queue and forks a child to read and process from that queue, making the overall connection asynchronous.
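
for illustration, the design described above can be sketched roughly as follows. this is not the patch's code: Python's `multiprocessing.Queue` stands in for the System V kernel message queue, and `run_pipeline`, `process_packets` and the upper-casing "work" are hypothetical placeholders for reading from the socket and inserting into the database. it assumes a Unix system where the `fork` start method is available.

```python
import multiprocessing as mp

# Assumption: the patch forks a child, so use the "fork" start method (Unix only).
CTX = mp.get_context("fork")
SENTINEL = None  # hypothetical end-of-stream marker, not part of the patch


def process_packets(queue, results):
    """Child process: drain the queue and handle each packet in order."""
    while True:
        packet = queue.get()
        if packet is SENTINEL:
            break
        # Placeholder for the slow part (the database inserts in ndo2db).
        results.put(packet.upper())


def run_pipeline(packets):
    """Parent: enqueue packets without waiting for them to be processed."""
    queue = CTX.Queue()
    results = CTX.Queue()
    child = CTX.Process(target=process_packets, args=(queue, results))
    child.start()

    for packet in packets:
        queue.put(packet)  # returns quickly; the child works in the background
    queue.put(SENTINEL)

    # Collect results before joining so the child can flush its output queue.
    processed = [results.get() for _ in packets]
    child.join()
    return processed
```

the parent only pays the cost of `queue.put`, which is why the core is no longer blocked; the total amount of work the child has to do is unchanged.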

while the design is pretty clear and avoids the need for a multithreaded architecture, the implementation depends heavily on kernel parameters and how they are tuned. unless it has been properly tested in many environments, such a patch cannot be included upstream.

furthermore, the patch only solves the so-called "blocking the core" problem by putting a message queue in between, letting the core push everything to it and get back to work very quickly. the main problem with the patch is that the amount of data on startup is quite large (full configs plus retained state dump) - see the RFC for changing the order of data dumping in idoutils in #1934.
therefore the ido2db daemon has to process that full amount of data first and only then starts inserting actual status data - which can take ages, and in the meantime the web guis querying the idoutils db are not up to date.

imho the gui shouldn't show inaccurate, out-of-date data, but rather tell the user that the daemon is currently inserting data.

either way, the patch is a good catch on the core side, but it does not address the REAL bottleneck: the database queries and the flow of data into the database, which would need multiple database connections and multiple workers to improve performance.

another option would be to replace ipc (dangerous) with a third-party message queue system like zeromq or rabbitmq. but that is highly experimental and should NOT be made available in an upstream release.

attached below is what we keep for reference and further discussion.

http://exchange.nagios.org/directory/Patches/NDOUtils/ndoutils-2Dasync/details

Patch adds asynchronous database queries to the ndoutils tool. Tested against version 1.4b9 with 5k service checks per minute.
It is recommended to remove all message queues owned by user nagios before ndoutils startup.

This can be achieved by script: 
for i in `ipcs -q | grep nagios |awk '{print $2}'`; do ipcrm -q $i; done 



It is strongly recommended to tune message queue parameters:

kernel.msgmnb 

kernel.msgmax 

kernel.msgmni 

Errors from the asynchronous process are logged via syslog, e.g.:

ndo2db-3x: Error: queue send error. - means that the process cannot put a message into the queue.
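
to get a feel for how the three kernel parameters named above interact, here is a minimal sketch, assuming a Linux system where they are exposed under `/proc/sys/kernel` (`msgmax`: maximum size of a single message in bytes, `msgmnb`: default maximum number of bytes in one queue, `msgmni`: maximum number of queues). the helper names `read_msg_limits` and `max_queued_messages` are made up for this example, and the calculation is only a rough upper bound, not what the patch itself does.

```python
from pathlib import Path

PARAMS = ("msgmax", "msgmnb", "msgmni")


def read_msg_limits(base="/proc/sys/kernel"):
    """Return {name: value} for each message queue limit found under base."""
    limits = {}
    for name in PARAMS:
        path = Path(base) / name
        if path.is_file():
            limits[name] = int(path.read_text().strip())
    return limits


def max_queued_messages(limits, message_size):
    """Rough upper bound on how many messages of message_size bytes fit in one queue."""
    if message_size <= 0:
        raise ValueError("message_size must be positive")
    if message_size > limits["msgmax"]:
        return 0  # a single message larger than msgmax cannot be sent at all
    return limits["msgmnb"] // message_size
```

once the queue holds `msgmnb` bytes, further sends either block or fail, which is the kind of situation the "queue send error" log line points to. calling `max_queued_messages(read_msg_limits(), 1024)` on the monitoring host gives a quick idea of how little headroom a single queue may have compared to a full startup dump.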

Attachments


Relations:


Updated by mfriedrich on 2012-02-21 13:59:35 +00:00

  • File added ndoutils-async-v1.1.patch

Updated by mfriedrich on 2012-02-21 14:01:31 +00:00

  • File added ndoutils_1.5_svn_commit.txt.zip

Updated by mfriedrich on 2012-05-05 11:38:53 +00:00

  • Status changed from Feedback to Closed

requires an in-depth queuing rewrite - not possible with the current code.


Updated by mfriedrich on 2014-12-08 14:46:34 +00:00

  • Project changed from 18 to Core, Classic UI, IDOUtils
  • Category set to IDOUtils