[dev.icinga.com #2676] unify check scheduling replacement logic #997

Closed
icinga-migration opened this Issue Jun 11, 2012 · 4 comments

icinga-migration commented Jun 11, 2012

This issue has been migrated from Redmine: https://dev.icinga.com/issues/2676

Created by mfriedrich on 2012-06-11 15:39:04 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2012-08-25 13:35:55 +00:00)
Target Version: 1.7.2
Last Update: 2012-08-25 13:35:55 +00:00 (in Redmine)

Icinga Version: 1.7.1
OS Version: Debian

A replacement patch for the new_event "hash" logic released in 1.7, originating from Nagios SVN (Andreas Ericsson). Needs proper analysis and discussion, as well as tests.

Attachments

Changesets

2012-07-06 11:21:41 +00:00 by mfriedrich be88bd5

core: unify check scheduling replacement logic for new events (Andreas Ericsson) #2676

Previously, the logic for scheduling a new event was changed using the
new_event attribute. The actual scheduling of a new event now happens
in one generalized place, after the decision to do so has been made.
Furthermore, next_check_event is now correctly assigned to that new event
for the host|service check (which may be a bug in previous versions).

refs #2676
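
A minimal sketch of the "decide first, then schedule in one place" idea described in this commit message. This is not the actual Icinga code: the types `sketch_event` and `sketch_service`, the queue helpers, and `sketch_schedule_check` are hypothetical names, and the replacement rule (keep the existing event unless the new check is forced or earlier) is a simplifying assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical, simplified stand-ins for the core's event and service types. */
typedef struct sketch_event {
    time_t run_time;
    void *event_data;                 /* object this event belongs to */
    struct sketch_event *next;
} sketch_event;

typedef struct sketch_service {
    time_t next_check;
    sketch_event *next_check_event;   /* the one pending check event, or NULL */
} sketch_service;

static sketch_event *event_queue = NULL;   /* unsorted singly linked list */

/* Unlink an event from the queue (does not free it). */
void queue_remove(sketch_event *ev) {
    sketch_event **pp = &event_queue;
    while (*pp != NULL && *pp != ev)
        pp = &(*pp)->next;
    if (*pp != NULL)
        *pp = ev->next;
}

int queue_length(void) {
    int n = 0;
    for (sketch_event *e = event_queue; e != NULL; e = e->next)
        n++;
    return n;
}

/* Returns 1 if a new event was scheduled, 0 if the existing one was kept,
 * -1 on allocation failure. */
int sketch_schedule_check(sketch_service *svc, time_t check_time, int forced) {
    assert(svc != NULL);
    sketch_event *old = svc->next_check_event;

    /* Step 1: only decide whether a new event is needed. */
    int schedule_new = 1;
    if (old != NULL && !forced && old->run_time <= check_time)
        schedule_new = 0;             /* existing event already runs soon enough */
    if (!schedule_new)
        return 0;

    /* Step 2: one shared place that creates the new event, drops the old
     * one, and records the new event in next_check_event. */
    sketch_event *ev = calloc(1, sizeof *ev);
    if (ev == NULL)
        return -1;
    if (old != NULL) {
        queue_remove(old);
        free(old);
    }
    ev->run_time = check_time;
    ev->event_data = svc;
    ev->next = event_queue;
    event_queue = ev;

    svc->next_check = check_time;
    svc->next_check_event = ev;
    return 1;
}
```

In this sketch the queue never holds more than one check event per service, because the only code path that adds an event also removes the previous one and updates `next_check_event`.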

2012-08-18 14:37:52 +00:00 by mfriedrich 379b712

* core: fix duplicated events on check scheduling logic for new events (Andreas Ericsson) #2676 #2993

Previously, the logic for scheduling a new event was changed using the
new_event attribute. The actual scheduling of a new event now happens
in one generalized place, after the decision to do so has been made.
Furthermore, next_check_event is now correctly assigned to that new event
for the host|service check (which is a bug in previous versions, causing
duplicate events under different circumstances).

refs #2676
refs #2993

Conflicts:

	Changelog

2012-08-19 17:09:21 +00:00 by mfriedrich ec9c5e3

core: avoid duplicate events when scheduling forced host|service check (Imri Zvik) #2993

Previously, we introduced a hash-like implementation of
host|service->next_check_event, pointing directly to the next
scheduled event. This pointer is used within
schedule_host|service_check to determine whether to reuse the
already assigned event or to schedule a new one. Since we
did not populate the event_data (host or service object) when
adding a new event to the scheduler, this became insane: a new
event was always scheduled, but the old one was never discarded.

This was partly fixed in #2676 by refactoring that detection
and saving next_check_event accordingly. But for already scheduled
events which were forced (overridden), another bug was unveiled.

Now that we add the reverse pointer from the host|service event_data
back to the newly created event when forcing a check, we make sure
that those events are checked correctly and executed/discarded in the
first place, instead of always creating a new event, separated from the rest.

Basically, using the previous implementation (with and without the fix
from #2676), we created an event bomb under various circumstances,
especially when future events would have been overridden (forced checks).
As events usually result in checks, which can result in perfdata, this
could possibly be the root cause for #2924 too, as other users have
reported on the portal.

http://www.monitoring-portal.org/wbb/index.php?page=Thread&threadID=26352

With the kind patch provided by Imri Zvik, this now works as expected.
I adapted the debug output a bit myself, so with debug_level=272 it is
easy to see what happens during event scheduling, rather than the
non-telling mess before.

kudos to Imri Zvik for the patch.

refs #2993
refs #2676
refs #2182
refs #2924
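
The duplicate-event problem and the reverse-pointer fix described above can be illustrated with a small model. This is a sketch under stated assumptions, not the patch by Imri Zvik: `demo_host`, `demo_event`, `demo_force_check` and the `link_back` flag are hypothetical names, and the model only captures the idea that an event the host does not point to can never be discarded.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

typedef struct demo_host demo_host;

/* Hypothetical, simplified event: knows which host it belongs to. */
typedef struct demo_event {
    time_t run_time;
    demo_host *event_data;
    struct demo_event *next;
} demo_event;

struct demo_host {
    demo_event *next_check_event;   /* pending check event, or NULL */
};

static demo_event *demo_queue = NULL;

/* Count queued events that belong to the given host. */
int demo_pending_for(const demo_host *h) {
    int n = 0;
    for (demo_event *e = demo_queue; e != NULL; e = e->next)
        if (e->event_data == h)
            n++;
    return n;
}

/* Allocate an event for the host and push it onto the queue. */
demo_event *demo_add_event(demo_host *h, time_t t) {
    demo_event *ev = calloc(1, sizeof *ev);
    if (ev == NULL)
        return NULL;
    ev->run_time = t;
    ev->event_data = h;
    ev->next = demo_queue;
    demo_queue = ev;
    return ev;
}

/* Unlink an event from the queue and free it. */
void demo_discard(demo_event *ev) {
    demo_event **pp = &demo_queue;
    while (*pp != NULL && *pp != ev)
        pp = &(*pp)->next;
    if (*pp != NULL) {
        *pp = ev->next;
        free(ev);
    }
}

/* Force a check at time t.
 * link_back == 0 models the behaviour described as buggy: the host never
 * learns about its new event, so later calls cannot discard it.
 * link_back == 1 models the fix: the host points back to the new event. */
int demo_force_check(demo_host *h, time_t t, int link_back) {
    assert(h != NULL);
    if (h->next_check_event != NULL) {
        demo_discard(h->next_check_event);   /* only a known event can go */
        h->next_check_event = NULL;
    }
    demo_event *ev = demo_add_event(h, t);
    if (ev == NULL)
        return -1;
    if (link_back)
        h->next_check_event = ev;
    return 0;
}
```

Forcing a check three times without the back link leaves three pending events for the same host in this model, while the linked variant leaves exactly one.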

2012-08-19 17:29:57 +00:00 by mfriedrich f32fbf8

core: avoid duplicate events when scheduling forced host|service check (Imri Zvik) #2993

Conflicts:
	Changelog

Relations:

icinga-migration commented Jul 6, 2012

Updated by mfriedrich on 2012-07-06 11:25:42 +00:00

  • Category set to Check Scheduling
  • Status changed from New to Assigned
  • Assigned to set to mfriedrich
  • Target Version set to 1.8

icinga-migration commented Jul 20, 2012

Updated by mfriedrich on 2012-07-20 18:19:01 +00:00

  • Status changed from Assigned to Feedback
  • Done % changed from 0 to 90

icinga-migration commented Aug 19, 2012

Updated by mfriedrich on 2012-08-19 18:25:14 +00:00

  • Target Version changed from 1.8 to 1.7.2
  • Icinga Version set to 1
  • OS Version set to Debian

This is a fix for the bug introduced with #2182 and will partly solve #2993, as well as #2964.

icinga-migration commented Aug 25, 2012

Updated by mfriedrich on 2012-08-25 13:35:55 +00:00

  • Status changed from Feedback to Resolved
  • Done % changed from 90 to 100

@icinga-migration icinga-migration added this to the 1.7.2 milestone Jan 17, 2017
