[dev.icinga.com #11686] Icinga Crash with the workflow Create_Host-> Downtime for the Host -> Delete Downtime -> Remove Host #4170
This issue has been migrated from Redmine: https://dev.icinga.com/issues/11686
Created by Christian_vlc on 2016-04-27 09:38:26 +00:00
Icinga crashes when I remove a host after having created and then removed a downtime for that host.
dmesg shows a segfault, but the Icinga logs contain no error (not even the debug log).
After the crash I have to restart Icinga, and only then can I remove the host.
I tried with both flexible and fixed downtimes, but the result is the same every time.
2016-05-09 12:30:12 +00:00 by gbeutner b8e911b
2016-05-12 09:08:21 +00:00 by gbeutner d82db2a
Updated by saurabh_hirani on 2016-04-28 10:19:43 +00:00
I can reproduce this. Create host, schedule a downtime for the host, delete the downtime, remove the host: this leads to the exact same error:
Apr 28 07:39:12 vagrant kernel: [11121.769790] traps: icinga2 general protection ip:7f199c3c722f sp:7f1995c80950 error:0 in libstdc++.so.6.0.16[7f199c313000+e2000]
This is a pretty normal activity as hosts will transition through this cycle. Causing an entire icinga2 instance to fail is highly problematic in production systems.
Updated by mfriedrich on 2016-05-02 14:27:12 +00:00
Updated by mfriedrich on 2016-05-02 14:51:29 +00:00
Looks like a race condition to me. At this stage the host object is not fully available when enforcing a dynamic_pointer_cast. Turns out when debugging and stepping into the functions, it works as expected from inside the debugger.
Updated by mfriedrich on 2016-05-02 15:10:39 +00:00
For some reason DependencyGraph::GetParents() returns an incomplete parent object vector, which causes trouble here. If you stop the debugger and let another thread finish updating the dependency graph, everything is fine.
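To illustrate the kind of race described above, here is a minimal sketch of a dependency graph whose readers take a snapshot under the same lock that writers use. This is not Icinga 2's actual code: the class name `SketchDependencyGraph`, the string identifiers, and the argument order are all assumptions made for the example, and the thread does not say whether the eventual fix looks like this.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a dependency graph; NOT Icinga 2's real class.
// Objects are identified by plain strings to keep the sketch self-contained.
class SketchDependencyGraph {
public:
    // Record that `child` has `parent` as a parent (direction is arbitrary here).
    void AddDependency(const std::string& child, const std::string& parent) {
        assert(!child.empty() && !parent.empty());
        std::lock_guard<std::mutex> lock(m_Mutex);
        m_Parents[child].insert(parent);
    }

    // Remove a previously recorded dependency; unknown pairs are ignored.
    void RemoveDependency(const std::string& child, const std::string& parent) {
        std::lock_guard<std::mutex> lock(m_Mutex);
        auto it = m_Parents.find(child);
        if (it == m_Parents.end())
            return;
        it->second.erase(parent);
        if (it->second.empty())
            m_Parents.erase(it);
    }

    // Return a copy made while holding the lock, so the caller never
    // observes the container while another thread is halfway through
    // an update.
    std::vector<std::string> GetParents(const std::string& child) const {
        std::lock_guard<std::mutex> lock(m_Mutex);
        auto it = m_Parents.find(child);
        if (it == m_Parents.end())
            return {};
        return std::vector<std::string>(it->second.begin(), it->second.end());
    }

private:
    mutable std::mutex m_Mutex;
    std::map<std::string, std::set<std::string>> m_Parents;
};
```

Because both writers and `GetParents()` hold the same mutex, a reader sees the graph either before or after an update, never in between. Callers would still need to handle entries that refer to objects deleted after the snapshot was taken.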
Updated by saurabh_hirani on 2016-05-07 04:04:37 +00:00
Got that. Thanks for the update, Michael. I am not good at C/C++, so I cannot follow the code well enough to contribute a fix. But I can help out with scenario testing if that is useful.