Check did not exit properly / Failed to register iobroker #433
Comments
I'll start by gathering some information. How did you install it? What guide did you follow? Was it this one? In the forum thread you mentioned "container"; can you elaborate on this, please? Comparing your working SLES 12.3 system against the openSUSE 42.3 system, what is the output of these commands:
Yes, I used https://support.nagios.com/kb/article.php?id=96#SUSE
@19Alex neither of your links works (the file does, though). I'm not sure if this is an encoding issue on your end or on GitHub's. Copying and pasting both links works fine, so maybe there's nothing you need to do and this is just an FYI for others on the thread.
@douglasawh thank you for your feedback, I didn't test them. But I have fixed them now.
Uh - well heck. I wish I would've asked for your |
That's very interesting; the limits.conf on the new SLES box only has comment lines in it. Would that mean it has no limits? I have changed the values as asked on the openSUSE box and restarted it. I'll let you know in the morning how it performed through the night.
This is all very interesting. Before I went home yesterday I saw 560 improperly exited checks in a period of 2 hours. I immediately increased the values to 1000000. This morning I see only 52 improperly exited checks since midnight. I am now going to remove the limits altogether and see what that does.
Yes. I checked openSUSE Leap 42.2 and 42.3 and they seem to have the same limits implemented. However, 42.1 does not, so I assume based on the comment
Good news is that it sounds like your issue is resolved. I've created the following KB article based on the information you gave us; this will help others in the future.
@hedenface do you think it's possible for Nagios Core to detect that such limits are causing this issue and add some more detailed logging?
This is not good. I removed the limits, and the file now only has comment lines, but I still witnessed a few dozen improperly exited checks per day. It's way less than before, however.
@box293 I think that sounds like a great idea for an addition to Core 4.4. (#434) @19Alex Yes, but it's pretty obviously some kind of limit at this point. What do your cgroups look like? I'm not an openSUSE guy, but maybe the following:
I have been struggling with a very strange problem on the support forum. dwhitfield and bheden have asked me to enter my problem here at GitHub. The entire story can be read at the support forum: https://support.nagios.com/forum/viewtopic.php?f=7&t=45389
In essence, wproc is not catching the check output. Around the same time, Nagios fails to register iobrokers for stdout and stderr:
```
[1504231962] wproc: Core Worker 68003: Failed to register iobroker for stdout
[1504231962] wproc: Core Worker 68003: Failed to register iobroker for stderr
[1504231962] Warning: Check of host 'VM-MIJNHELICON2' did not exit properly!
[1504231962] HOST ALERT: VM-MIJNHELICON2;UNREACHABLE;SOFT;1;(Host check did not exit properly)
```
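To track how often this happens over time, the warnings can be counted per hour from the log. The following is a rough sketch that assumes every line has the `[epoch] message` shape shown in the excerpt above; `count_improper_exits` is a hypothetical helper name, and the log path in the usage note is an assumption that depends on the installation.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Matches lines of the form "[1504231962] some message".
LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def count_improper_exits(lines):
    """Count 'did not exit properly' warnings per UTC hour."""
    per_hour = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        timestamp, message = match.groups()
        # Count only the Warning line so that the matching HOST ALERT
        # entry for the same check is not counted a second time.
        if message.startswith("Warning:") and "did not exit properly" in message:
            when = datetime.fromtimestamp(int(timestamp), tz=timezone.utc)
            per_hour[when.strftime("%Y-%m-%d %H:00")] += 1
    return per_hour
```

It could be used as `count_improper_exits(open("/usr/local/nagios/var/nagios.log"))`, adjusting the path to wherever the log lives on the system in question.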
The problem only occurred on openSUSE 42.3 x64, which I installed twice because I doubted myself. Only after installing on CentOS and running problem-free did I begin to suspect the OS. I later installed it on SLES 12.3 x64, also without a glitch.
From my point of view, the move to SLES solved my problem. But I can imagine you might want to investigate the openSUSE box a little further. I still have it running.