icinga2 daemonizing problem under OpenSuse 42.2 #5103

Closed
melaniebauer opened this issue Mar 30, 2017 · 4 comments
Labels: area/checks (Check execution and results), bug (Something isn't working)

melaniebauer commented Mar 30, 2017

Hi,

the problem has been described in detail in the following thread and should be made an issue as it cannot be easily solved:
https://monitoring-portal.org/index.php?thread/40270-error-premature-eof-kills-icinga2-service/&postID=246839#post246839

Short summary:
icinga2 crashes when started with -d, with checker errors. Without -d (started in gdb or shell) it runs fine.

Software used:
openSUSE Leap 42.2 (on a virtual machine)
icinga2 r2.6.2-1 (with Director)

Log
===
My log shows normal checks (and a lot of Director activity ;-)). Then a critical/checker entry appears that looks like this (hostname changed to host1):

[2017-03-20 14:01:58 +0100] critical/checker: Exception occured while checking 'host1!snmp_uptime_switch': Error: parse error: premature EOF

(right here) ------^

(0) libbase.so.2.6.2: <unknown function> (+0xe2801) [0x7f4542871801]
(1) libbase.so.2.6.2: icinga::JsonDecode(icinga::String const&) (+0x37e) [0x7f4542871dae]
(2) libbase.so.2.6.2: icinga::Process::Run(boost::function<void (icinga::ProcessResult const&)> const&) (+0x4ff) [0x7f45428808af]
(3) libicinga.so.2.6.2: icinga::PluginUtility::ExecuteCommand(boost::intrusive_ptr<icinga::Command> const&, boost::intrusive_ptr<icinga::Checkable> const&, boost::intrusive_ptr<icinga::CheckResult> const&, std::vector<std::pair<icinga::String, boost::intrusive_ptr<icinga::Object> >, std::allocator<std::pair<icinga::String, boost::intrusive_ptr<icinga::Object> > > > const&, boost::intrusive_ptr<icinga::Dictionary> const&, bool, boost::function<void (icinga::Value const&, icinga::ProcessResult const&)> const&) (+0x8fc) [0x7f453ee84f3c]
(4) libmethods.so.2.6.2: icinga::PluginCheckTask::ScriptFunc(boost::intrusive_ptr<icinga::Checkable> const&,boost::intrusive_ptr<icinga::CheckResult> const&, boost::intrusive_ptr<icinga::Dictionary> const&, bool) (+0x5bb) [0x7f453ea90e5b]
(5) libmethods.so.2.6.2: <unknown function> (+0x13a56) [0x7f453ea8da56]
(6) libmethods.so.2.6.2: <unknown function> (+0xcf8d) [0x7f453ea86f8d]
(7) libbase.so.2.6.2: icinga::Function::Invoke(std::vector<icinga::Value, std::allocator<icinga::Value> > const&) (+0x3c) [0x7f454289194c]
(8) libicinga.so.2.6.2: icinga::CheckCommand::Execute(boost::intrusive_ptr<icinga::Checkable> const&, boost::intrusive_ptr<icinga::CheckResult> const&, boost::intrusive_ptr<icinga::Dictionary> const&, bool) (+0x163) [0x7f453ed8a973]
(9) libicinga.so.2.6.2: icinga::Checkable::ExecuteCheck() (+0xa7f) [0x7f453ee9b7af]
(10) libchecker.so.2.6.2: <unknown function> (+0x2726f) [0x7f453e83226f]
(11) libbase.so.2.6.2: icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) (+0x365) [0x7f454287a1e5]
(12) libboost_thread.so.1.54.0: <unknown function> (+0xcc0a) [0x7f45434cfc0a]
(13) libpthread.so.0: <unknown function> (+0x8734) [0x7f4541dd8734]
(14) libc.so.6: clone (+0x6d) [0x7f45437c4d3d]
(0) Executing check for object 'host1!snmp_uptime_switch'

Next entry:
[2017-03-20 14:05:28 +0100] information/ConfigItem: Activated all objects.
This entry appears when I restart the icinga2 service.
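
The backtrace above shows icinga::JsonDecode being called from icinga::Process::Run, so the parser apparently received input that ended before the JSON document was complete. The following is a minimal Python sketch of that failure mode only, not Icinga's actual code; `decode_reply` and the sample payload are hypothetical names made up for illustration.

```python
import json


def decode_reply(raw):
    """Decode a JSON reply; return None if it is empty or cut off."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # An empty or truncated buffer ends up here. This is the Python
        # counterpart of a "premature EOF" parse error.
        return None


complete = '{"rc": 0, "pid": 1234}'   # hypothetical reply payload
truncated = complete[:10]             # simulates a reply that was cut short

print(decode_reply(complete))
print(decode_reply(truncated))
print(decode_reply(""))
```

Treating an incomplete reply as a failed check rather than letting the exception escape is one possible way to keep a single bad read from taking the whole process down.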

Status
=====
# systemctl status icinga2.service
shows something like this (output from just a moment ago):

icinga2.service - Icinga host/service/network monitoring system
Loaded: loaded (/usr/lib/systemd/system/icinga2.service; disabled; vendor preset: disabled)
Active: failed (Result: signal) since Wed 2017-03-22 11:58:26 CET; 18s ago
Process: 27289 ExecStart=/usr/sbin/icinga2 daemon -d -e ${ICINGA2_ERROR_LOG} (code=exited, status=0/SUCCESS)
Process: 27225 ExecStartPre=/usr/lib/icinga2/prepare-dirs /etc/sysconfig/icinga2 (code=exited, status=0/SUCCESS)
Main PID: 27317 (code=killed, signal=ABRT)

Mar 22 11:53:15 icinga2 icinga2[27289]: [2017-03-22 11:53:15 +0100] information/ConfigItem: Instantiated 54 HostGroups.
Mar 22 11:53:15 icinga2 icinga2[27289]: [2017-03-22 11:53:15 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
Mar 22 11:53:15 icinga2 icinga2[27289]: [2017-03-22 11:53:15 +0100] information/ConfigItem: Instantiated 1316 Hosts.
Mar 22 11:53:15 icinga2 icinga2[27289]: [2017-03-22 11:53:15 +0100] information/ConfigItem: Instantiated 43 Users.
Mar 22 11:53:15 icinga2 icinga2[27289]: [2017-03-22 11:53:15 +0100] information/ConfigItem: Instantiated 14 UserGroups.
Mar 22 11:53:15 icinga2 systemd[1]: Started Icinga host/service/network monitoring system.
Mar 22 11:58:07 icinga2 icinga2[27289]: [2017-03-22 11:53:15 +0100] information/ConfigI
Mar 22 11:58:26 icinga2 systemd[1]: icinga2.service: Main process exited, code=killed, status=6/ABRT
Mar 22 11:58:26 icinga2 systemd[1]: icinga2.service: Unit entered failed state.
Mar 22 11:58:26 icinga2 systemd[1]: icinga2.service: Failed with result 'signal'.
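
In the output above, "code=killed, status=6/ABRT" means the main process was terminated by signal 6, which is SIGABRT. That is the signal raised by abort(), and a C++ program typically ends this way when an exception is not handled, which would be consistent with the checker exception in the log, although the status output alone does not prove it. A small Python sketch for decoding such a line; `killed_by` is a hypothetical helper, not part of systemd or Icinga:

```python
import re
import signal

STATUS_LINE = "icinga2.service: Main process exited, code=killed, status=6/ABRT"


def killed_by(line):
    """Return the signal name if the line reports code=killed, else None."""
    match = re.search(r"code=killed, status=(\d+)/", line)
    if match is None:
        return None
    # Map the numeric signal to its symbolic name.
    return signal.Signals(int(match.group(1))).name


print(killed_by(STATUS_LINE))
```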


Here are the corresponding configuration files for the check, generated by the Director in /var/lib/icinga2/api/zones/director-global/director/commands.conf

command: 
=========
object CheckCommand "snmp-uptime-switch" {
    import "plugin-check-command"

    command = [ PluginDir + "/check_snmp" ]
    arguments += {
        "-C" = "$snmp_community$"
        "-H" = {
            required = true
            value = "$snmp_address$"
        }
        "-P" = "$snmp_version$"
        "-c" = "$snmp_crit$"
        "-o" = {
            required = true
            value = "$snmp_oid$"
        }
        "-w" = "$snmp_warn$"
    }
    vars.snmp_address = "$address$"
    ....
}
(vars are set in ....)


service-template:
===============
template Service "snmp_uptime_switch" {
    check_command = "snmp-uptime-switch"
    ....
}
(vars are set in ....)


service-apply ( I assign services according to host-variables, they can be nicely chosen from a list in the director):
===============================================================================================
apply Service "snmp_uptime_switch" {
    import "snmp_uptime_switch"

    assign where host.vars.services1 == "uptime" || host.vars.services2 == "uptime" || host.vars.services3 == "uptime" || host.vars.services4 == "uptime" || host.vars.services5 == "uptime"

    import DirectorOverrideTemplate
}

Manually
========
./check_snmp -H <host> -C <community> -P <version> -c <range> -o <OID> 
returns CRITICAL - Plugin timed out while executing system call
This is what systemctl does when it starts the icinga2 service:
/usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e /var/log/icinga2/error.log 
I then ran the same command under gdb, but without the -d flag. In that case icinga2 did not crash after a short while.

Then I started icinga2 from the shell without the -d flag. It prints a lot of output to the terminal and does not crash, at least not within minutes as it did before (I did not let it run overnight, though). I have just assigned a ping4 service to all hosts with IPv4 and it is still running.
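
To repeat the manual plugin run from the "Manually" section in a controlled way, a small wrapper can enforce a timeout and map the exit code to a state. This is a hedged sketch in Python, not how Icinga 2 runs plugins internally. `run_plugin` and `fake_plugin` are hypothetical names, and the stand-in command replaces check_snmp so the example runs anywhere. The exit-code mapping (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) is the usual monitoring-plugin convention.

```python
import subprocess
import sys

# Exit-code convention used by monitoring plugins.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv, timeout=10.0):
    """Run a plugin command; return (state, first line of its output)."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this is raised.
        return "UNKNOWN", "plugin did not finish within %.1f s" % timeout
    lines = proc.stdout.splitlines()
    first_line = lines[0] if lines else ""
    return STATES.get(proc.returncode, "UNKNOWN"), first_line


# Stand-in for check_snmp: prints one line and exits with code 2.
fake_plugin = [sys.executable, "-c",
               "import sys; print('CRITICAL - demo'); sys.exit(2)"]
print(run_plugin(fake_plugin))
```

Replacing `fake_plugin` with the real check_snmp argument list would show whether the plugin itself times out independently of how icinga2 is started.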
@dnsmichi dnsmichi added bug Something isn't working area/checks Check execution and results labels Mar 30, 2017
@dnsmichi
Contributor

I've edited the original issue to include all relevant information. Please include it directly in future issues as well, since monitoring-portal.org may be unreachable or down when someone is looking into this issue.

@dnsmichi
Contributor

Could be related: #4794


DanielP81 commented Apr 7, 2017

Same problem here after a completely fresh installation:
SLES 12 SP2
Icinga 2.6.2
Icinga Director 1.3.1
Icinga2 crashes after a new check is scheduled.

I have only 132 hosts with the hostalive check and no further configuration.

I see two errors on several servers:

Error1)
Exception occured while checking '...': Error: Function call 'fork' failed with error code 11, 'Resource temporarily unavailable'
(0) Executing check for object '...'

Error2)
Exception occured while checking '...': Error: parse error: premature EOF
(right here) ------^

(0) libbase.so.2.6.2: <unknown function> (+0xb12c5) [0x7fea5cb862c5]
(1) libbase.so.2.6.2: icinga::JsonDecode(icinga::String const&) (+0x37e) [0x7fea5cb876ae]
(2) libbase.so.2.6.2: icinga::Process::Run(boost::function<void (icinga::ProcessResult const&)> const&) (+0x4ff) [0x7fea5cbbc43f]
(3) libicinga.so.2.6.2: icinga::PluginUtility::ExecuteCommand(boost::intrusive_ptr<icinga::Command> const&, boost::intrusive_ptr<icinga::Checkable> const&, boost::intrusive_ptr<icinga::CheckResult> const&, std::vector<std::pair<icinga::String, boost::intrusive_ptr<icinga::Object> >, std::allocator<std::pair<icinga::String, boost::intrusive_ptr<icinga::Object> > > > const&, boost::intrusive_ptr<icinga::Dictionary> const&, bool, boost::function<void (icinga::Value const&, icinga::ProcessResult const&)> const&) (+0x945) [0x7fea589add85]
(4) libmethods.so.2.6.2: icinga::PluginCheckTask::ScriptFunc(boost::intrusive_ptr<icinga::Checkable> const&, boost::intrusive_ptr<icinga::CheckResult> const&, boost::intrusive_ptr<icinga::Dictionary> const&, bool) (+0x5a4) [0x7fea585b6af4]
(5) libmethods.so.2.6.2: <unknown function> (+0x1c4e6) [0x7fea585b84e6]
(6) libmethods.so.2.6.2: <unknown function> (+0xff4d) [0x7fea585abf4d]
(7) libbase.so.2.6.2: icinga::Function::Invoke(std::vector<icinga::Value, std::allocator<icinga::Value> > const&) (+0x3c) [0x7fea5cbd344c]
(8) libicinga.so.2.6.2: icinga::CheckCommand::Execute(boost::intrusive_ptr<icinga::Checkable> const&, boost::intrusive_ptr<icinga::CheckResult> const&, boost::intrusive_ptr<icinga::Dictionary> const&, bool) (+0x163) [0x7fea588bbe23]
(9) libicinga.so.2.6.2: icinga::Checkable::ExecuteCheck() (+0xa7f) [0x7fea589c59ef]
(10) libchecker.so.2.6.2: <unknown function> (+0x245ef) [0x7fea5834e5ef]
(11) libbase.so.2.6.2: icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) (+0x365) [0x7fea5cbb5d75]
(12) libboost_thread.so.1.54.0: <unknown function> (+0xcc8a) [0x7fea5d5dac8a]
(13) libpthread.so.0: <unknown function> (+0x8734) [0x7fea5a24e734]
(14) libc.so.6: clone (+0x6d) [0x7fea59f8cd3d]


(0) Executing check for object '...'
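
On Linux, error code 11 in the first error is EAGAIN, which fork() returns when a process or task limit has been reached. A quick way to look at one such limit from Python is sketched below. Whether this limit is the one being hit here is an assumption, not something the logs confirm; systemd's per-unit TasksMax= setting is another limit that may be worth checking, since it applies only when the daemon runs as a service. `fmt` is a hypothetical helper.

```python
import errno
import os
import resource

# On Linux, errno.EAGAIN is 11, the error code reported in the log.
print(errno.EAGAIN, os.strerror(errno.EAGAIN))

# Per-user limit on processes and threads; fork() fails with EAGAIN
# once this limit is reached.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)


def fmt(limit):
    """Render a resource limit, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


print("RLIMIT_NPROC soft=%s hard=%s" % (fmt(soft), fmt(hard)))
```

Run it as the user the daemon runs under, since the limit is per user.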

@dnsmichi dnsmichi added the help wanted Extra attention is needed label Apr 26, 2017
@dnsmichi dnsmichi self-assigned this May 11, 2017
@dnsmichi dnsmichi assigned Al2Klimov and unassigned dnsmichi May 31, 2017
@dnsmichi
Contributor

I believe this is fixed with #5477.

@dnsmichi dnsmichi removed the help wanted Extra attention is needed label Aug 17, 2017