[dev.icinga.com #11714] Crash in UnameHelper #4182

Closed
icinga-migration opened this issue May 3, 2016 · 12 comments

@icinga-migration

commented May 3, 2016

This issue has been migrated from Redmine: https://dev.icinga.com/issues/11714

Created by erikandersonmx on 2016-05-03 02:28:06 +00:00

Assignee: gbeutner
Status: Resolved (closed on 2016-05-10 07:50:04 +00:00)
Target Version: 2.4.8
Last Update: 2016-05-11 12:41:08 +00:00 (in Redmine)

Icinga Version: 2.4.7-1~ppa1~trusty1
Backport?: Not yet backported
Include in Changelog: 1

Clean install of 2.4.7-1~ppa1~trusty1 on Ubuntu 14.04:

sudo dpkg -l | grep icinga
iU  icinga2                              2.4.7-1~ppa1~trusty1                amd64        host and network monitoring system
iU  icinga2-bin                          2.4.7-1~ppa1~trusty1                amd64        host and network monitoring system - daemon
iF  icinga2-common                       2.4.7-1~ppa1~trusty1                all          host and network monitoring system - common files
ii  icinga2-doc                          2.4.7-1~ppa1~trusty1                all          host and network monitoring system - documentation
ii  libicinga2                           2.4.7-1~ppa1~trusty1                amd64        host and network monitoring system - internal libraries

In /var/log/icinga2/startup.log:

Segmentation fault (core dumped)

In dmesg:

[4785551.447919] icinga2[3185]: segfault at 0 ip 00007fb5897cd23b sp 00007ffff3885470 error 4 in libc-2.19.so[7fb58975f000+1bb000]

Changesets

2016-05-10 07:46:48 +00:00 by gbeutner eab2fb7

Fix crash in UnameHelper()

fixes #11714

2016-05-12 09:08:21 +00:00 by gbeutner 7f8a921

Fix crash in UnameHelper()

fixes #11714

Relations:

@icinga-migration


commented May 3, 2016

Updated by erikandersonmx on 2016-05-03 02:48:17 +00:00

More info:

==> /var/log/apport.log <==
ERROR: apport (pid 9598) Mon May  2 20:46:50 2016: called for pid 9556, signal 11, core limit 0
ERROR: apport (pid 9598) Mon May  2 20:46:50 2016: executable: /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 (command line "/usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon --validate")
ERROR: apport (pid 9598) Mon May  2 20:46:50 2016: is_closing_session(): no DBUS_SESSION_BUS_ADDRESS in environment
ERROR: apport (pid 9598) Mon May  2 20:46:50 2016: apport: report /var/crash/_usr_lib_x86_64-linux-gnu_icinga2_sbin_icinga2.0.crash already exists and unseen, doing nothing to avoid disk usage DoS
@icinga-migration


commented May 3, 2016

Updated by erikandersonmx on 2016-05-03 02:56:36 +00:00

Full crashlog: https://gist.github.com/erikanderson/7c38a4d8d12c81acdf04f2866bf5ab21

@icinga-migration


commented May 6, 2016

Updated by mfriedrich on 2016-05-06 14:12:49 +00:00

  • Status changed from New to Feedback
  • Assigned to set to erikandersonmx

Any crash logs underneath /var/log/icinga2?
Can you easily reproduce the error, and run icinga2 with gdb generating a full bt (check the development docs)?

@icinga-migration


commented May 9, 2016

Updated by erikandersonmx on 2016-05-09 01:55:05 +00:00

There are files in `/var/log/icinga2/crash`, but they are empty. Unfortunately, I can't easily reproduce it yet.

@icinga-migration


commented May 9, 2016

Updated by mfriedrich on 2016-05-09 15:52:23 +00:00

Hm. Could you provide some more details about your current setup?

  • How many hosts, services and check interval
  • Cluster setup? (zones.conf)
  • Icinga 2 clients involved?
  • How long does the config validation take? (time icinga2 daemon -C)
@icinga-migration


commented May 9, 2016

Updated by erikandersonmx on 2016-05-09 17:29:35 +00:00

  • 230 hosts, about 2500 services, check interval for almost all services is 1m
  • One zone per host with master zone present on all nodes
  • Client installed on each host, API is piggybacking on puppet ssl cert
  • validation is under 1 second

One note: previously we were just using `service icinga2 reload` on the icinga node, but we are currently testing it on all nodes. We have also changed the event engine to `poll` on all nodes except the icinga node (which is set to `epoll`).

@icinga-migration


commented May 10, 2016

Updated by gbeutner on 2016-05-10 07:42:09 +00:00

Stacktrace:
 #0  _IO_fgets (buf=0x7ffd705eb6c0 "", n=1024, fp=0x7ffd705eb6c0) at iofgets.c:50
         count = 0
         old_error = 1885254992
 #1  0x00007fe9b1f9a430 in ?? () from /usr/lib/x86_64-linux-gnu/icinga2/libbase.so
 No symbol table info available.
 #2  0x00007fe9b1f9a59e in icinga::Utility::GetPlatformArchitecture() () from /usr/lib/x86_64-linux-gnu/icinga2/libbase.so
 No symbol table info available.
 #3  0x00007fe9b1fbfe85 in icinga::Application::DisplayInfoMessage(std::ostream&, bool) () from /usr/lib/x86_64-linux-gnu/icinga2/libbase.so
 No symbol table info available.
 #4  0x00007fe9b1fdfa1e in icinga::Application::ExceptionHandler() () from /usr/lib/x86_64-linux-gnu/icinga2/libbase.so
 No symbol table info available.
 #5  0x00007fe9b16596d6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #6  0x00007fe9b1659703 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #7  0x00007fe9b1659922 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
 No symbol table info available.
 #8  0x00007fe9b1f94cb9 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/icinga2/libbase.so
 No symbol table info available.
 #9  0x00007fe9b1ffb911 in void boost::throw_exception(boost::thread_resource_error const&) () from /usr/lib/x86_64-linux-gnu/icinga2/libbase.so
 No symbol table info available.
 #10 0x00007fe9b1fb9980 in icinga::WorkQueue::Enqueue(boost::function const&, icinga::WorkQueuePriority, bool) () from /usr/lib/x86_64-linux-gnu/icinga2/libbase.so
 No symbol table info available.
 #11 0x00007fe9b1c694bf in icinga::ConfigItem::CommitNewItems(boost::intrusive_ptr const&, icinga::WorkQueue&, std::vector, std::allocator > >&) () from /usr/lib/x86_64-linux-gnu/icinga2/libconfig.so
 No symbol table info available.
 #12 0x00007fe9b1c6deb5 in icinga::ConfigItem::CommitItems(boost::intrusive_ptr const&, icinga::WorkQueue&, std::vector, std::allocator > >&) () from /usr/lib/x86_64-linux-gnu/icinga2/libconfig.so
 No symbol table info available.
 #13 0x00007fe9b1964e6e in icinga::DaemonUtility::LoadConfigFiles(std::vector > const&, std::vector, std::allocator > >&, icinga::String const&, icinga::String const&) () from /usr/lib/x86_64-linux-gnu/icinga2/libcli.so
 No symbol table info available.
 #14 0x00007fe9b1974469 in icinga::DaemonCommand::Run(boost::program_options::variables_map const&, std::vector > const&) const () from /usr/lib/x86_64-linux-gnu/icinga2/libcli.so
 No symbol table info available.
 #15 0x0000000000411acd in ?? ()
 No symbol table info available.
 #16 0x000000000040f17a in ?? ()
 No symbol table info available.
 #17 0x00007fe9b1041ec5 in __libc_start_main (main=0x40f110, argc=4, argv=0x7ffd705f0708, init=, fini=, rtld_fini=, stack_end=0x7ffd705f06f8) at libc-start.c:287
         result = 
         unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, 6922956623478753145, 4256327, 140726488729344, 0, 0, -6923817269168208007, -6935496874659313799}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x419920, 0x7ffd705f0708}, data = {prev = 0x0, cleanup = 0x0, canceltype = 4299040}}}
         not_first_call = 
 #18 0x000000000040f270 in ?? ()
 No symbol table info available.
@icinga-migration


commented May 10, 2016

Updated by gbeutner on 2016-05-10 07:47:11 +00:00

  • Category set to libbase
  • Status changed from Feedback to Assigned
  • Assigned to changed from erikandersonmx to gbeutner
  • Target Version set to 2.4.8

Looks like Icinga 2 crashed while trying to report an out-of-resources condition. This might be related to #8137.
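
For illustration only: in the stack trace, frame #0 is `_IO_fgets`, reached from `icinga::Utility::GetPlatformArchitecture()` while the exception handler was reporting a `boost::thread_resource_error`. The sketch below is not the Icinga 2 source and not the applied changeset. It assumes the helper reads the output of `uname` through `popen()`, and it shows one way such a helper could fall back to a placeholder instead of handing an invalid stream to `fgets()` when the system is out of resources. The names `FirstLineOrFallback` and `UnameSketch` are hypothetical.

```cpp
#include <stdio.h>   // popen(), pclose(), fgets() (POSIX)
#include <string>

// Hypothetical helper: run a shell command and return the first line of its
// output, or `fallback` if the command cannot be started or prints nothing.
static std::string FirstLineOrFallback(const char *cmd, const std::string& fallback)
{
    FILE *fp = popen(cmd, "r");

    // popen() returns NULL when fork() or pipe() fails, which is plausible
    // when the process has already run out of resources.
    if (fp == nullptr)
        return fallback;

    char line[1024];
    const bool ok = (fgets(line, sizeof(line), fp) != nullptr);
    pclose(fp);

    if (!ok)
        return fallback;

    std::string result(line);

    // Strip trailing line terminators.
    while (!result.empty() && (result.back() == '\n' || result.back() == '\r'))
        result.pop_back();

    return result.empty() ? fallback : result;
}

// Hypothetical wrapper: query a single `uname` flag, e.g. 'm' for the
// machine architecture. Assumes `uname` is available on PATH.
static std::string UnameSketch(char type)
{
    std::string cmd = "uname -";
    cmd += type;
    cmd += " 2>/dev/null";

    return FirstLineOrFallback(cmd.c_str(), "Unknown");
}
```

With this shape, a call such as `UnameSketch('m')` returns either the reported architecture or the string `"Unknown"`, so a crash handler that prints platform details can still finish its report when `popen()` fails.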

@icinga-migration


commented May 10, 2016

Updated by gbeutner on 2016-05-10 07:50:04 +00:00

  • Status changed from Assigned to Resolved
  • Done % changed from 0 to 100

Applied in changeset eab2fb7.

@icinga-migration


commented May 10, 2016

Updated by erikandersonmx on 2016-05-10 20:38:34 +00:00

Thank you for looking at this, really appreciate it

@icinga-migration


commented May 11, 2016

Updated by mfriedrich on 2016-05-11 12:40:48 +00:00

  • Relates set to 8137
@icinga-migration


commented May 11, 2016

Updated by mfriedrich on 2016-05-11 12:41:08 +00:00

  • Subject changed from segfault at 0 ip ... error 4 in libc-2.19.so to Crash in UnameHelper

icinga-migration added this to the 2.4.8 milestone Jan 17, 2017
