Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[dev.icinga.com #11019] next_check noise in the IDO #3870

Closed
icinga-migration opened this issue Jan 22, 2016 · 11 comments
Closed

[dev.icinga.com #11019] next_check noise in the IDO #3870

icinga-migration opened this issue Jan 22, 2016 · 11 comments

Comments

@icinga-migration
Copy link
Member

@icinga-migration icinga-migration commented Jan 22, 2016

This issue has been migrated from Redmine: https://dev.icinga.com/issues/11019

Created by tgelf on 2016-01-22 15:19:55 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2016-01-22 18:12:40 +00:00)
Target Version: 2.4.2
Last Update: 2016-02-23 09:58:44 +00:00 (in Redmine)

Icinga Version: 2.4.1
Backport?: Already backported
Include in Changelog: 1

Currently as of single updates for next_check 66-75% of all IDO queries are pretty useless overhead. next_check should usually be updated with the status update, only exception should be command-triggered reschedules.

Cheers,
Thomas

Changesets

2016-01-22 17:40:14 +00:00 by mfriedrich 2a11b27

Properly set the next check time for active and passive checks

fixes #7287
refs #11019

2016-01-22 17:42:15 +00:00 by mfriedrich b960850

DB IDO: Only update 'next_check' column when manually scheduling a check

Otherwise the changes from #7287 already take care of setting
the proper next check time from inside ProcessCheckResult().

There is no need to use the generic OnNextCheckChanged signal
but instead we're using a new one, locally just for DB IDO.

fixes #11019

2016-02-23 08:24:57 +00:00 by mfriedrich 9ca7245

Properly set the next check time for active and passive checks

fixes #7287
refs #11019

2016-02-23 08:25:05 +00:00 by mfriedrich 9dc37c5

DB IDO: Only update 'next_check' column when manually scheduling a check

Otherwise the changes from #7287 already take care of setting
the proper next check time from inside ProcessCheckResult().

There is no need to use the generic OnNextCheckChanged signal
but instead we're using a new one, locally just for DB IDO.

fixes #11019

Relations:

@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Jan 22, 2016

Updated by mfriedrich on 2016-01-22 15:27:32 +00:00

That's intended as otherwise external interfaces such as Icinga Web 2 won't get the next_check field updated in time and would have to wait for the next check result triggering a status update. While I agree that it is used sub-optimal, I don't really see a different implementation for this.

@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Jan 22, 2016

Updated by tgelf on 2016-01-22 15:43:08 +00:00

Well, that's an implementation problem then and needs to be solved in one way or the other. The next check has to be rescheduled the moment a check get's executed or once it terminates. In both scenarios the next timestamp should already be available once we store the result to the database. No need to fire additional 2-3 queries for the very same field.

The time needed to fix an issue generating 66-75% superfluous queries would IMO be well spent.

@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Jan 22, 2016

Updated by mfriedrich on 2016-01-22 15:56:56 +00:00

If you'll remove the update on the 'next_check' column once you'll press "reschedule next check" from inside the web interface, you might encounter lags on large scale systems executing the check "later", and only then updating the IDO status table at once when processing the check result.

There's also an issue with just rescheduling the check not "now", but at a later point in time. Then you won't ordinarily see an update inside Icinga Web 2, until the next check happened and updated the database. The difficulty here is - the DB IDO feature does not care about such event conditions, it will certainly fire the update when "next check" changes in order to help resolve the issue with not up-to-date columns in external interfaces.

There are other events triggering changes inside the DB IDO - see #6051 for details. Obviously "next_check" is the most common one. We certainly could just kill the next_check db events and leave that to status updates, but I guess that will open yet another issue then that the next check time is not accurately updated.

@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Jan 22, 2016

Updated by tgelf on 2016-01-22 16:08:02 +00:00

As far as I remember I mentioned "check now" in the related customer support issue. I fully agree, no doubt that that one should result in dedicated queries - but this should than somehow be handled with distinct function calls. What I'm talking about here are about 20+ millions of superfluous queries a day in a mid-sized setup. You'll find A LOT of useful information (numbers, statistics, performance and flame graphs...) in the mentioned customer issue. I strongly suggest to work through that material before taking any farther actions ;-) The related issues I created on dev.icinga.org are meant to have dedicated Icinga bug reports to track the individual problems we encountered.

Cheers,
Thomas

@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Jan 22, 2016

Updated by mfriedrich on 2016-01-22 16:11:53 +00:00

  • Status changed from New to Assigned
  • Assigned to set to mfriedrich
  • Target Version set to 2.5.0

I'll think about a proper signal or solution, I fully agree on the original problem of dealing with such updates. Though that fix depends on the resolved #7287 where the next check time is properly set (it hasn't been before why the additional next_check update was probably a workaround fix).

@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Jan 22, 2016

Updated by mfriedrich on 2016-01-22 16:12:08 +00:00

  • Blocked set to 7287
@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Jan 22, 2016

Updated by mfriedrich on 2016-01-22 16:14:24 +00:00

  • Priority changed from Normal to High

Note:

  • Remove OnNextCheckChanged signal from DbEvents class, but not the handler
  • Introduce a safe way to DB IDO updates for non-forced not-now checks for updating the next check column in DB IDO
@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Jan 22, 2016

Updated by mfriedrich on 2016-01-22 18:12:09 +00:00

While working on a straight-forward fix for this issue, I came around the problem with rescheduling checks for passively received check results. Details: https://dev.icinga.org/issues/7287#note-26

I've tried to not depend on #7287 being fixed but in that case those commits depend on each other which is why I'll reference it on both issues. Though their apply order does not matter.
The second commit will ensure that next_check updates are only fired when explicitly required by scheduling a forced check, the one of sending in check results has been reduced to 0 simply going for the OnNewCheckResult host/servicestatus update.

@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Jan 22, 2016

Updated by mfriedrich on 2016-01-22 18:12:40 +00:00

  • Status changed from Assigned to Resolved
  • Done % changed from 0 to 100

Applied in changeset b960850.

@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Jan 25, 2016

Updated by mfriedrich on 2016-01-25 10:26:44 +00:00

  • Target Version changed from 2.5.0 to 2.4.2
@icinga-migration
Copy link
Member Author

@icinga-migration icinga-migration commented Feb 23, 2016

Updated by gbeutner on 2016-02-23 09:58:44 +00:00

  • Backport? changed from Not yet backported to Already backported
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant
You can’t perform that action at this time.