[dev.icinga.com #702] Solaris 10: Bus Error (core dumped) when starting icinga #353
This issue has been migrated from Redmine: https://dev.icinga.com/issues/702
Created by antonxx on 2010-08-11 07:33:53 +00:00
I have now followed the same steps on solaris that I used when compiling on linux.
My current status:
icinga with classical web interface works on suse linux 11.1 64 bit.
On solaris 10 (sparc) I stumble over the step in
When looking at the dump, I see:
Note: I just compiled nagios 3.2.1 + the nagios plugins 1.4.15 (the same used with icinga)
2010-09-24 12:03:12 +00:00 by mfriedrich 69d5fab
2010-09-24 16:38:26 +00:00 by mfriedrich 8ca33ed
Updated by mfriedrich on 2010-08-12 22:01:07 +00:00
ok, some things to think about.
this patch adds profiler_init() in icinga.c without any check on whether profiling is enabled or disabled.
that could be a possible leak causing solaris to dump core,
although the trace points towards the config verification.
but as a matter of fact, the output in #572 points out that the drop_privileges function with getgid and getuid is faulty.
this leads to the following ideas:
and check what has changed since 1.0.1, as that version was working fine.
Updated by antonxx on 2010-08-13 15:00:34 +00:00
Note: I compile as normal user.
As normal user I get:
as root I get
does this help?
Updated by raindog on 2010-08-18 16:34:13 +00:00
I've encountered the same issue on Solaris 10 with Icinga 1.0.3, compiled as non root - core with cgis only.
bash-3.00$ /sw_ux/scripts/icinga checkconfig
bash-3.00$ gcc --version
There is also another issue when compiling that has been around for a while. I know the workaround of copying sprintf.o to the common directory.
bash-3.00$ make all
***** Error code 1
***** Error code 1
Updated by mfriedrich on 2010-08-18 16:57:47 +00:00
the snprintf target was an attempt at fixing this for solaris. it has not been touched since, as there was no further feedback.
it would be great if you could test that branch and report back.
besides - are there any ready-to-use solaris vms available?
Updated by raindog on 2010-08-18 17:50:23 +00:00
Tried your changes for the snprintf issue ...
***** Error code 1
***** Error code 1
Updated by antonxx on 2010-08-18 20:38:18 +00:00
After registration you can go to:
and here you can grab a virtualbox appliance (get it from www.virtualbox.org).
After unzipping the zip file, start your virtualbox and go to:
File -> import appliance
You can use this vm for free, but as I understand it, only for development purposes.
(By the way oracle just announced they would stop OpenSolaris!)
Updated by Meier on 2010-09-10 18:06:42 +00:00
And they just released Solaris 10u9. Also there are some plans about Solaris Express.
Updated by LarsEngels on 2010-09-14 15:37:49 +00:00
gdb ./icinga /var/core/core_ecpmon01_icinga_0_0_1284475992_11339
Updated by mfriedrich on 2010-09-14 16:05:41 +00:00
i consider gcc3 the root of all evil, and as a matter of fact that #define trick does not work with gcc3. the strdup cannot duplicate the string as there is no source address in memory - best guess so far.
in order to remove this bug, I'll revert the commit d60c8af but leave the notificationsescalated macrofix in place.
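The strdup guess above comes down to calling strdup() on a pointer that has no valid source string. A minimal sketch of a defensive wrapper follows. The helper name `safe_strdup` is hypothetical and not taken from the Icinga source; it only illustrates the idea of checking the source before copying.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Icinga): duplicate a string, but
 * return NULL instead of reading through a NULL source pointer. */
char *safe_strdup(const char *src)
{
    size_t len;
    char *copy;

    if (src == NULL)
        return NULL;            /* nothing to copy */

    len = strlen(src) + 1;      /* include the terminating NUL */
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, src, len);
    return copy;                /* caller frees; NULL on failure */
}
```

A caller would still have to handle the NULL result, so this only turns a crash into a checkable error.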
Updated by mfriedrich on 2010-09-23 18:22:05 +00:00
ok. taking current git master from 23-09-2010 18:00 dbe4749
gcc version 3.4.6
installed like this, with some ssl configure hacks: https://dev.icinga.org/projects/icinga-core/wiki/Setup_Solaris_VM
40b98f2 in mfriedrich/solaris
compiled as user, installed via sudo into /usr/local/icinga
run as daemon, root: fine
-bash-3.00# /usr/local/icinga/bin/icinga /usr/local/icinga/etc/icinga.cfg
run via init-script
-bash-3.00# /etc/init.d/icinga start
but with -d it does not on the shell.
next hackup - echo the initscript output of chkconfig, where the segfault happens.
-bash-3.00# /usr/local/icinga/bin/icinga -v /usr/local/icinga/etc/icinga.cfg > /usr/local/icinga/var/icinga.chk 2>&1
-bash-3.00# /usr/local/icinga/bin/icinga -v /usr/local/icinga/etc/icinga.cfg > /dev/null 2>&1
taken old init-script from 1.0.1 - same dump. so it has something to do with the > ... param somehow. opened pipe while opening files? prohibited by some mechanism like selinux?
the segfault clearly shows an access violation in mmap. which leads into the shared.c directive introduced after 1.0.1
Updated by mfriedrich on 2010-09-24 08:43:14 +00:00
regarding memory allocation.
i've now done reversed quicksort. got commit sha1 from 1.0.2 and 1.0.1 and stepped half way down, running gdb all the time on the checked out branches (iirc i was at test25 then)
point is, that
this is when the eventprofiler steps in.
running through what it does.
profiler_init(); is called. even if event_profiling is disabled.
within profiler_init() several profiler_add() calls.
profiler_add() allocates memory like this
afterwards, nothing special happens.
profiler_item is int, int, double, char*
ok, so it just re-allocates more memory.
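To make the allocation pattern described above concrete, here is a minimal sketch of a realloc-based array of items with the stated layout (int, int, double, char*). The field names, the globals and the function name `profiler_add_sketch` are assumptions for illustration; this is not the actual profiler_add() code.

```c
#include <stdlib.h>
#include <string.h>

/* Field names are assumptions; the thread only gives the types. */
typedef struct {
    int    state;
    int    counter;
    double elapsed;
    char  *name;
} profiler_item;

profiler_item *items = NULL;   /* grows by one slot per call */
size_t item_count = 0;

/* Append one zeroed item with a private copy of name.
 * Returns 0 on success, -1 on bad input or allocation failure. */
int profiler_add_sketch(const char *name)
{
    profiler_item *grown;
    char *copy;
    size_t len;

    if (name == NULL)
        return -1;

    len = strlen(name) + 1;
    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);

    /* Assign the realloc result to a temporary so the old block
     * is not lost if the call fails. */
    grown = realloc(items, (item_count + 1) * sizeof(*items));
    if (grown == NULL) {
        free(copy);
        return -1;
    }
    items = grown;

    /* realloc does not initialise the new slot. */
    memset(&items[item_count], 0, sizeof(items[item_count]));
    items[item_count].name = copy;
    item_count++;
    return 0;
}
```

The two points worth checking against the real code are whether the realloc result is checked before use and whether the new slot is initialised before anything reads it.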
what if it allocates too much for the current process?
ok, man pages.
what can be resolved
=> comment profiler_init(); call in icinga.c - everything works fine (x86 and sparc tested).
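Instead of commenting the call out entirely, the call could be guarded by the config option. The following is only a sketch: `event_profiling_enabled` is an assumed flag name (the thread only mentions event_profiling), and `profiler_init_stub` and `startup_sketch` are stand-ins for the real profiler_init() and the startup code in icinga.c.

```c
/* Assumed flag name; set from the config in a real daemon. */
int event_profiling_enabled = 0;

/* Records whether the stand-in initialiser ran. */
int profiler_initialized = 0;

/* Stand-in for the real profiler_init(). */
void profiler_init_stub(void)
{
    profiler_initialized = 1;
}

/* Startup path: only touch the profiler when it is enabled. */
void startup_sketch(void)
{
    if (event_profiling_enabled)
        profiler_init_stub();
}
```

With the flag left at 0 the profiler code is never entered, which matches the behaviour observed after commenting the call out.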
Updated by mfriedrich on 2010-09-24 10:25:11 +00:00
needed to debug if dangling pointers might happen.
Updated by mfriedrich on 2010-09-27 08:06:37 +00:00
runs fine on x86 and sparc. x86 gdb session over the weekend did not throw anything special.
re-open if you encounter any other error.