[dev.icinga.com #1100] only insert service and host checks when they are finally processed, increase performance by replacing insert/update with single insert #502

Closed
icinga-migration opened this Issue Jan 11, 2011 · 11 comments

icinga-migration commented Jan 11, 2011

This issue has been migrated from Redmine: https://dev.icinga.com/issues/1100

Created by mfriedrich on 2011-01-11 20:27:57 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2011-01-20 14:08:09 +00:00)
Target Version: 1.3
Last Update: 2014-12-08 14:34:46 +00:00 (in Redmine)


this approach reduces the insert on duplicate key update queries to plain insert queries.

the idea is to store only the final result set and not write the values of a just-started check into the database (storing them is what later triggers the update statement, because a unique key violation takes place).

taking this a step further, this will remove an update query from postgres too, and reduce the merge trick in oracle to a simple insert prepared statement.

consider this a patch for 1.3.0 unstable, like some other opsview patches from their svn too.

http://labs.opsview.com/2010/07/opsview-is-75-faster-than-a-standard-nagios-database-implementation/

In a standard Nagios plus database implementation, you use NDOutils to store information in a database. While we think NDOutils is fantastic, there are some major limitations with it as you monitor more hosts. With Opsview, we want to scale. We’ve already done lots of work with NDOutils, including adding view-like helper tables, updating the database asynchronously, improved indices and speeding up the time to load the configuration at a Nagios reload. Now we want to share an amazing improvement we’ve discovered.

We know that the nagios_servicechecks table is the most heavily used table. This records every result that flows into Nagios, whether it is actively or passively checked. The statement to add a row in that table is an INSERT … ON DUPLICATE KEY UPDATE ….

However, this has problems. In our experience with the Opsview Data Warehouse – where we took best practise information from datawarehouse experts – fact tables should not have unique keys unless they really are unique. There needs to be suitable indices to help the queries, but uniqueness means that some records may be updated when you expect to have a new record instead.

This gave us pause to wonder why the statement was an UPDATE. Further investigation showed that Nagios was sending extra messages to the database for processing.

The flow was:

   1. a service check is initiated with an NEBTYPE_SERVICECHECK_INITIATE event being fired. NDOutils adds a new row into the table with start times but no result
   2. a NEBTYPE_SERVICECHECK_ASYNC_PRECHECK was being fired – this is to allow other broker modules to intercept a service check execution. This was being sent to NDOutils, but not processed
   3. finally, a NEBTYPE_SERVICECHECK_PROCESSED event was fired – this updates the earlier row with the results of the check

In order to work out the “earlier row”, NDOutils used the unique index which consists of the instance_id, object id, start time and start time usec (micro seconds). However, with passive check results, the start time usec is always set to 0. This means it is possible to lose results if you have checks which have the same start time for the same object.
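
A minimal sketch of that failure mode, assuming a simplified table layout and using Python's sqlite3 module as a stand-in for the real database. The INSERT … ON DUPLICATE KEY UPDATE behaviour is emulated by an insert that falls back to an update on a key clash; the function name and the sample values are hypothetical.

```python
import sqlite3

def store_with_unique_key(results):
    """Insert each result; on a unique key clash, overwrite the existing row."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE servicechecks ("
        " instance_id INTEGER, service_object_id INTEGER,"
        " start_time INTEGER, start_time_usec INTEGER, output TEXT,"
        " UNIQUE (instance_id, service_object_id, start_time, start_time_usec))"
    )
    for row in results:
        try:
            db.execute("INSERT INTO servicechecks VALUES (?, ?, ?, ?, ?)", row)
        except sqlite3.IntegrityError:
            # Same key already stored: the "earlier row" is updated in place.
            db.execute(
                "UPDATE servicechecks SET output = ?"
                " WHERE instance_id = ? AND service_object_id = ?"
                " AND start_time = ? AND start_time_usec = ?",
                (row[4], row[0], row[1], row[2], row[3]),
            )
    return [r[0] for r in db.execute("SELECT output FROM servicechecks ORDER BY rowid")]

# Two passive results for the same object in the same second, both with usec 0.
passive_results = [
    (1, 42, 1295460000, 0, "first passive result"),
    (1, 42, 1295460000, 0, "second passive result"),
]
stored = store_with_unique_key(passive_results)
print(stored)
```

Only one row survives in this sketch, and it carries the second result: the first one has been silently overwritten.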

We took the view that (1) and (2) were not necessary. That meant (3) was the only event that needed to be processed by NDOutils. So our change was to tell (1) and (2) not to send information to NDOutils, and to update the command for (3) to do a straight INSERT, rather than an INSERT … ON DUPLICATE KEY UPDATE ….. This saved an index lookup.
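
The change can be sketched by replaying the three events of one check through both flows. This is an illustration under stated assumptions, not the NDOutils or IDOUtils code: sqlite3 stands in for the database, the table layout is simplified, the insert-or-update is emulated with an insert that falls back to an update, and `old_flow` and `new_flow` are hypothetical names.

```python
import sqlite3

# Event names mirror the NEBTYPE_SERVICECHECK_* constants from the flow above.
INITIATE, ASYNC_PRECHECK, PROCESSED = "INITIATE", "ASYNC_PRECHECK", "PROCESSED"

def make_db(unique_key):
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE servicechecks (service_object_id INTEGER,"
        " start_time INTEGER, start_time_usec INTEGER, output TEXT)"
    )
    if unique_key:
        db.execute(
            "CREATE UNIQUE INDEX uniq ON servicechecks"
            " (service_object_id, start_time, start_time_usec)"
        )
    else:
        db.execute("CREATE INDEX by_start ON servicechecks (start_time)")
    return db

def old_flow(db, events):
    """INITIATE and PROCESSED both reach the database as insert-or-update."""
    statements = 0
    for kind, obj, sec, usec, output in events:
        if kind == ASYNC_PRECHECK:
            continue  # received, but never processed
        try:
            db.execute("INSERT INTO servicechecks VALUES (?, ?, ?, ?)",
                       (obj, sec, usec, output))
        except sqlite3.IntegrityError:
            db.execute(
                "UPDATE servicechecks SET output = ? WHERE service_object_id = ?"
                " AND start_time = ? AND start_time_usec = ?",
                (output, obj, sec, usec),
            )
        statements += 1  # one logical insert-or-update statement per event
    return statements

def new_flow(db, events):
    """Only PROCESSED reaches the database, as a plain INSERT."""
    statements = 0
    for kind, obj, sec, usec, output in events:
        if kind != PROCESSED:
            continue
        db.execute("INSERT INTO servicechecks VALUES (?, ?, ?, ?)",
                   (obj, sec, usec, output))
        statements += 1
    return statements

events = [
    (INITIATE, 42, 1295460000, 386602, None),
    (ASYNC_PRECHECK, 42, 1295460000, 386602, None),
    (PROCESSED, 42, 1295460000, 386602, "OK"),
]
old_db, new_db = make_db(unique_key=True), make_db(unique_key=False)
old_statements = old_flow(old_db, events)
new_statements = new_flow(new_db, events)
old_rows = old_db.execute("SELECT output FROM servicechecks").fetchall()
new_rows = new_db.execute("SELECT output FROM servicechecks").fetchall()
print(old_statements, new_statements, old_rows == new_rows)
```

Both flows end with the same stored row, but the second one gets there with half the statements and without the key lookup.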

We also changed the database index to reflect this whilst making it much smaller. The index used to consist of (start_time, instance_id, service_object_id, start_time_usec) – this meant for each row, the index was adding another 36 bytes. However, we changed it to (start_time) – only 8 bytes. Opsview only has 1 instance_id, so it is not necessary to include it in the index.

If you are keeping score, here are the improvements:

    * Reduced number of events sent to NDOutils by 66%
    * Reduced number of SQL statements by 50%
    * Changed 1 SQL statement, making it a smaller statement and saving an index lookup
    * Reduced the size of one index by 77%

To test this was easy. As Opsview uses an asynchronous method of updating the database, you can change a debug file and Opsview will automatically start copying the data that would be pushed to the database. This gave us an NDO data packet. We then updated this data packet to have 10000 events of the same object. And then we pushed this to our database instance.

Results? 10000 records was taking 23 seconds to update the database. With our changes, this reduced down to 6 seconds! We’re thrilled that this has speeded up one of the most common database operations.
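
The quoted figures can be re-derived from the raw numbers in the text (3 events down to 1, 2 statements down to 1, 36 index bytes down to 8, 23 seconds down to 6). The helper name below is hypothetical; the post appears to truncate rather than round its percentages.

```python
def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

events_saved = reduction_percent(3, 1)      # events sent per check
statements_saved = reduction_percent(2, 1)  # SQL statements per check
index_saved = reduction_percent(36, 8)      # index bytes per row
runtime_saved = reduction_percent(23, 6)    # seconds for 10000 records

for label, value in [
    ("events", events_saved),
    ("statements", statements_saved),
    ("index size", index_saved),
    ("runtime", runtime_saved),
]:
    print(f"{label}: {value:.1f}% less")
```

The runtime reduction works out to roughly 74%, which lines up with the "75% faster" in the post's URL.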

NDOutils is distributed under the GPL, which stipulates that all changes have to be available to our users. We go one better because Opsview is open source and we publish our source code, so everyone can benefit from our findings. Our complete patch list (for our 3rd party software) is here.

The specific patch for this change is here.

This improvement is shipped with Opsview Enterprise 3.8.0. Keep your eyes out for more performance tuning enhancements and new features that we will be adding to Opsview in the next few months!

https://secure.opsera.com/wsvn/wsvn/opsview/trunk/?#path_trunk_
https://secure.opsera.com/wsvn/wsvn/opsview/trunk/opsview-base/patches/?#path_trunk_opsview-base_patches_
https://secure.opsera.com/wsvn/wsvn/opsview/trunk/opsview-base/patches/ndoutils_no_unique_key_on_servicechecks.patch

Changesets

2011-01-20 09:23:38 +00:00 by mfriedrich 20b6343

idoutils: only insert service and host checks when they are finally processed, increase performance by replacing insert/update with single insert (idea by Opsview/Opsera Ltd with mysql and servicechecks) #1100

kudos and credits to Opsview for the idea and initial implementation.

this is extended for hostchecks and all 3 supported rdbms.

upgrade scripts are available for 1.3.0

massive performance increase!

fixes #1100

icinga-migration commented Jan 11, 2011

Updated by mfriedrich on 2011-01-11 20:32:39 +00:00

the provided patch uses a non-normalized query though. keep that in mind when applying it; the query needs to be implemented according to the sql standard.

icinga-migration commented Jan 11, 2011

Updated by Anonymous on 2011-01-11 21:27:59 +00:00

I have no understanding of SQL queries, but cutting the number of update queries in half and making them inserts instead sounds to me like a big performance increase =)

icinga-migration commented Jan 19, 2011

Updated by mfriedrich on 2011-01-19 15:26:01 +00:00

for oracle, this will need a replaced prepared statement and the normal bindings. since this is a reduction/renaming of current code (MERGE already holds the needed INSERT, and the bound values are available multiple times), this is rather easy to implement on the current code base.

best would be to apply that in several steps - mysql, then pgsql, then oracle. and if it works out, then for hostchecks too.

icinga-migration commented Jan 19, 2011

Updated by mfriedrich on 2011-01-19 15:42:11 +00:00

ok, hostchecks is rather similar. as an addon to the above explanations:

stepping into base/checks.c on hostchecks

INITIATE

        /*********** EXECUTE THE CHECK AND PROCESS THE RESULTS **********/

#ifdef USE_EVENT_BROKER
        /* send data to event broker */
        end_time.tv_sec=0L;
        end_time.tv_usec=0L;
        broker_host_check(NEBTYPE_HOSTCHECK_INITIATE,NEBFLAG_NONE,NEBATTR_NONE,hst,HOST_CHECK_ACTIVE,hst->current_state,hst->state_type,start_time,end_time,hst->host_check_command,hst->latency,0.0,host_check_timeout,FALSE,0,NULL,NULL,NULL,NULL,NULL);
#endif  

PROCESSED

#ifdef USE_EVENT_BROKER
        /* send data to event broker */
        broker_host_check(NEBTYPE_HOSTCHECK_PROCESSED,NEBFLAG_NONE,NEBATTR_NONE,hst,HOST_CHECK_ACTIVE,hst->current_state,hst->state_type,start_time,end_time,hst->host_check_command,hst->latency,hst->execution_time,host_check_timeout,FALSE,hst->current_state,NULL,hst->plugin_output,hst->long_plugin_output,hst->perf_data,NULL);
#endif 

see? for INITIATE the core blanks out end_time next to other stuff (execution time, return code, plugin output), so unfinished data is sent to ido2db.

so taking only the fully processed *check will get all available data from

        /* execute the host check */
        host_result=execute_sync_host_check_3x(hst);

        /* process the host check result */
        process_host_check_result_3x(hst,host_result,old_plugin_output,check_options,FALSE,use_cached_result,check_timestamp_horizon);

        /* free memory */
        my_free(old_plugin_output);

        log_debug_info(DEBUGL_CHECKS,1,"* Sync host check done: new state=%d\n",hst->current_state);

        /* high resolution end time for event broker */
        gettimeofday(&end_time,NULL);

so thanks again to Opsview, this patch really makes sense :-)

icinga-migration commented Jan 19, 2011

Updated by mfriedrich on 2011-01-19 15:43:59 +00:00

  • Subject changed from "only insert servicechecks when they are finished" to "only insert service and host checks when they are finally processed, increase performance by replacing insert/update with single insert"

icinga-migration commented Jan 19, 2011

Updated by mfriedrich on 2011-01-19 15:50:11 +00:00

ok, for oracle i made a good choice a while ago - the prepared statements are named in a modular way, so only the function called for preparing/freeing the statements in db.c needs to change.

the binding stays unchanged in dbqueries.c

                        if(!OCI_BindUnsignedBigInt(idi->dbinfo.oci_statement_servicechecks, MT(":X1"), (big_uint *) data[0])) {
                                return IDO_ERROR;
                        }

icinga-migration commented Jan 19, 2011

Updated by mfriedrich on 2011-01-19 17:25:39 +00:00

mysql and postgres inserts are luckily the same, as idoutils queries are normalized.

i ran my tests with postgres first, on the hostchecks - that looks good.

 hostcheck_id | instance_id | host_object_id | check_type | is_raw_check | current_check_attempt | max_check_attempts | state | state_type |     start_time      | start_time_usec |      end_time       | end_time_usec | command_object_id |           command_args           | command_line | timeout | early_timeout | execution_time | latency | return_code |                                     output                                     | long_output |       perfdata       
--------------+-------------+----------------+------------+--------------+-----------------------+--------------------+-------+------------+---------------------+-----------------+---------------------+---------------+-------------------+----------------------------------+--------------+---------+---------------+----------------+---------+-------------+--------------------------------------------------------------------------------+-------------+----------------------
        13569 |           1 |            138 |          0 |            0 |                     5 |                  5 |     2 |          1 | 2011-01-19 18:01:39 |          280334 | 2011-01-19 18:15:02 |        255306 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.06987 |       0 |           2 | test_host_133 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13570 |           1 |            132 |          0 |            0 |                     5 |                  5 |     2 |          1 | 2011-01-19 18:01:39 |          269314 | 2011-01-19 18:15:02 |        255754 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.15237 |       0 |           2 | test_host_127 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13571 |           1 |            150 |          0 |            0 |                     5 |                  5 |     2 |          1 | 2011-01-19 18:01:39 |          354457 | 2011-01-19 18:15:02 |        260253 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.20235 |       0 |           2 | test_host_145 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13572 |           1 |            144 |          0 |            0 |                     5 |                  5 |     2 |          1 | 2011-01-19 18:01:39 |          293078 | 2011-01-19 18:15:02 |        260685 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.28466 |       0 |           2 | test_host_139 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13573 |           1 |            156 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:02 |          256087 | 2011-01-19 18:15:12 |        236605 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.29295 |       0 |           2 | test_host_151 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13574 |           1 |              6 |          0 |            0 |                     1 |                  5 |     0 |          1 | 2011-01-19 18:15:02 |          295845 | 2011-01-19 18:15:12 |        237536 |                 2 | up!$HOSTSTATE:test_router_0$     |              |      30 |             0 |        0.29368 |  23.295 |           0 | test_host_000 (checked by icinga-dev) OK: ok hostcheck                         |             | 
        13575 |           1 |              7 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:02 |          711864 | 2011-01-19 18:15:12 |        264209 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.42038 |  15.711 |           2 | test_host_001 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13576 |           1 |             13 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:03 |          347958 | 2011-01-19 18:15:12 |        467872 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.17196 |   7.347 |           2 | test_host_007 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13577 |           1 |             19 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:05 |           50386 | 2011-01-19 18:15:12 |        474177 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.03053 |    0.05 |           2 | test_host_013 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13578 |           1 |             84 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:12 |          232731 | 2011-01-19 18:15:22 |        495865 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.05695 |       0 |           2 | test_host_079 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13579 |           1 |            102 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:12 |          259080 | 2011-01-19 18:15:22 |        496448 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.09403 |       0 |           2 | test_host_097 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13580 |           1 |             90 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:12 |          238850 | 2011-01-19 18:15:22 |        496981 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.18454 |       0 |           2 | test_host_085 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13581 |           1 |            108 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:12 |          271788 | 2011-01-19 18:15:22 |        497541 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.15389 |       0 |           2 | test_host_103 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13582 |           1 |            114 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:12 |          296689 | 2011-01-19 18:15:22 |        498117 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.14342 |       0 |           2 | test_host_109 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13583 |           1 |             96 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:12 |          245076 | 2011-01-19 18:15:22 |        498690 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.22269 |       0 |           2 | test_host_091 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13584 |           1 |            126 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:12 |          440951 | 2011-01-19 18:15:22 |        499245 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.04508 |       0 |           2 | test_host_121 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13585 |           1 |            120 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:12 |          308027 | 2011-01-19 18:15:22 |        499801 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.19323 |       0 |           2 | test_host_115 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13586 |           1 |            132 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:12 |          470735 | 2011-01-19 18:15:22 |        500337 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.03211 |       0 |           2 | test_host_127 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13587 |           1 |             25 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:13 |          230263 | 2011-01-19 18:15:22 |        501418 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |         0.0562 |   0.229 |           2 | test_host_019 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13588 |           1 |             31 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:22 |          501923 | 2011-01-19 18:15:32 |         81026 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |         0.0563 |   0.501 |           2 | test_host_025 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13589 |           1 |             37 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:31 |           49182 | 2011-01-19 18:15:32 |         81440 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.03036 |   0.048 |           2 | test_host_031 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13590 |           1 |             43 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:39 |          228253 | 2011-01-19 18:15:42 |        102676 |                 2 | random!$HOSTSTATE:test_router_1$ |              |      30 |             0 |        0.05615 |   0.227 |           2 | test_host_037 (checked by icinga-dev) DOWN: random hostcheck: parent host down |             | 
        13591 |           1 |             44 |          0 |            0 |                     1 |                  5 |     0 |          1 | 2011-01-19 18:15:48 |          148264 | 2011-01-19 18:15:52 |        218669 |                 2 | up!$HOSTSTATE:test_router_2$     |              |      30 |             0 |        0.05764 |   0.147 |           0 | test_host_038 (checked by icinga-dev) OK: ok hostcheck                         |             | 
        13592 |           1 |             49 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:15:57 |            7995 | 2011-01-19 18:16:02 |         95492 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.05813 |   0.007 |           2 | test_host_043 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13593 |           1 |             55 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:16:05 |          120199 | 2011-01-19 18:16:12 |        261988 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.05272 |   0.119 |           2 | test_host_049 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13594 |           1 |              7 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:16:12 |          262506 | 2011-01-19 18:16:22 |        173965 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |         0.0416 |       0 |           2 | test_host_001 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13595 |           1 |             61 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:16:14 |           45544 | 2011-01-19 18:16:22 |        176372 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.02861 |   0.045 |           2 | test_host_055 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 
        13596 |           1 |             43 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:16:22 |          178589 | 2011-01-19 18:16:32 |        116916 |                 2 | random!$HOSTSTATE:test_router_1$ |              |      30 |             0 |        0.03927 |       0 |           2 | test_host_037 (checked by icinga-dev) DOWN: random hostcheck: parent host down |             | 
        13597 |           1 |             49 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:16:22 |          181491 | 2011-01-19 18:16:32 |        117202 |                 2 | up!$HOSTSTATE:test_router_1$     |              |      30 |             0 |        0.05815 |       0 |           2 | test_host_043 (checked by icinga-dev) DOWN: up hostcheck: parent host down     |             | 

servicechecks too.

 servicecheck_id | instance_id | service_object_id | check_type | current_check_attempt | max_check_attempts | state | state_type |     start_time      | start_time_usec |      end_time       | end_time_usec | command_object_id | command_args | command_line | timeout | early_timeout | execution_time | latency | return_code |                                                   output                                                   | long_output |           perfdata           
-----------------+-------------+-------------------+------------+-----------------------+--------------------+-------+------------+---------------------+-----------------+---------------------+---------------+-------------------+--------------+--------------+---------+---------------+----------------+---------+-------------+------------------------------------------------------------------------------------------------------------+-------------+------------------------------
           54178 |           1 |              3235 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:01:39 |          386602 | 2011-01-19 18:01:39 |        477005 |                 0 |              |              |      60 |             0 |         0.0904 |   0.386 |           0 | test_host_151 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def                      |             | 
           54179 |           1 |              3215 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:01:39 |          392987 | 2011-01-19 18:01:39 |        526588 |                 0 |              |              |      60 |             0 |         0.1336 |   0.392 |           0 | test_host_150 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def                      |             | 
           54180 |           1 |              1784 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:15:02 |          298810 | 2011-01-19 18:15:02 |        363035 |                 0 |              |              |      60 |             0 |        0.06422 |  22.298 |           0 | test_host_078 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def                      |             | 
           54181 |           1 |              1804 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:15:02 |          311853 | 2011-01-19 18:15:02 |        409628 |                 0 |              |              |      60 |             0 |        0.09778 |  22.311 |           0 | test_host_079 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def                      |             | 
           54182 |           1 |              1824 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:15:02 |          319848 | 2011-01-19 18:15:02 |        588622 |                 0 |              |              |      60 |             0 |        0.26877 |  21.319 |           0 | test_host_080 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def                      |             | 
           54183 |           1 |              1844 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:15:02 |          323488 | 2011-01-19 18:15:02 |        591170 |                 0 |              |              |      60 |             0 |        0.26768 |  21.323 |           0 | test_host_081 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def                      |             | 
           54184 |           1 |              1884 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:15:02 |          413808 | 2011-01-19 18:15:02 |        623889 |                 0 |              |              |      60 |             0 |        0.21008 |  20.413 |           0 | test_host_083 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def                      |             | 
           54185 |           1 |              1924 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:15:02 |          427577 | 2011-01-19 18:15:02 |        645356 |                 0 |              |              |      60 |             0 |        0.21778 |  19.427 |           0 | test_host_085 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def                      |             | 
           54186 |           1 |              1904 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:15:02 |          420771 | 2011-01-19 18:15:02 |        690422 |                 0 |              |              |      60 |             0 |        0.26965 |   19.42 |           0 | test_host_084 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def                      |             | 
           54187 |           1 |              2024 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:15:02 |          684618 | 2011-01-19 18:15:02 |        828838 |                 0 |              |              |      60 |             0 |        0.14422 |  17.684 |           0 | test_host_090 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def                      |             | 

icinga-migration commented Jan 19, 2011

Updated by mfriedrich on 2011-01-19 17:30:41 +00:00

now for mysql,

mysql> select * from icinga_hostchecks limit 10;
+--------------+-------------+----------------+------------+--------------+-----------------------+--------------------+-------+------------+---------------------+-----------------+---------------------+---------------+-------------------+------------------------------+--------------+---------+---------------+----------------+---------+-------------+----------------------------------------------------------------------------+-------------+----------+
| hostcheck_id | instance_id | host_object_id | check_type | is_raw_check | current_check_attempt | max_check_attempts | state | state_type | start_time          | start_time_usec | end_time            | end_time_usec | command_object_id | command_args                 | command_line | timeout | early_timeout | execution_time | latency | return_code | output                                                                     | long_output | perfdata |
+--------------+-------------+----------------+------------+--------------+-----------------------+--------------------+-------+------------+---------------------+-----------------+---------------------+---------------+-------------------+------------------------------+--------------+---------+---------------+----------------+---------+-------------+----------------------------------------------------------------------------+-------------+----------+
|            1 |           1 |          12843 |          0 |            0 |                     5 |                  5 |     2 |          1 | 2011-01-19 18:23:42 |           58097 | 2011-01-19 18:26:31 |        594341 |             12733 | up!$HOSTSTATE:test_router_1$ |              |      30 |             0 |        0.04097 |       0 |           2 | test_host_061 (checked by icinga-dev) DOWN: up hostcheck: parent host down |             |          | 
|            2 |           1 |          12757 |          0 |            0 |                     5 |                  5 |     2 |          1 | 2011-01-19 18:23:42 |           53541 | 2011-01-19 18:26:31 |        594636 |             12733 | up!$HOSTSTATE:test_router_1$ |              |      30 |             0 |        0.05511 |       0 |           2 | test_host_055 (checked by icinga-dev) DOWN: up hostcheck: parent host down |             |          | 
|            3 |           1 |          12776 |          0 |            0 |                     5 |                  5 |     2 |          1 | 2011-01-19 18:23:42 |           73905 | 2011-01-19 18:26:31 |        594901 |             12733 | up!$HOSTSTATE:test_router_1$ |              |      30 |             0 |        0.08804 |       0 |           2 | test_host_073 (checked by icinga-dev) DOWN: up hostcheck: parent host down |             |          | 
|            4 |           1 |          12844 |          0 |            0 |                     5 |                  5 |     2 |          1 | 2011-01-19 18:23:42 |           70814 | 2011-01-19 18:26:31 |        597809 |             12733 | up!$HOSTSTATE:test_router_1$ |              |      30 |             0 |        0.18493 |       0 |           2 | test_host_067 (checked by icinga-dev) DOWN: up hostcheck: parent host down |             |          | 
|            5 |           1 |          12734 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:26:31 |          608124 | 2011-01-19 18:26:38 |        196228 |             12733 | up!$HOSTSTATE:test_router_1$ |              |      30 |             0 |        0.05607 |       0 |           2 | test_host_019 (checked by icinga-dev) DOWN: up hostcheck: parent host down |             |          | 
|            6 |           1 |          12865 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:26:31 |          598536 | 2011-01-19 18:26:38 |        196494 |             12733 | up!$HOSTSTATE:test_router_1$ |              |      30 |             0 |        0.07368 |       0 |           2 | test_host_079 (checked by icinga-dev) DOWN: up hostcheck: parent host down |             |          | 
|            7 |           1 |          12846 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:26:31 |          604868 | 2011-01-19 18:26:38 |        196759 |             12733 | up!$HOSTSTATE:test_router_1$ |              |      30 |             0 |        0.11296 |       0 |           2 | test_host_085 (checked by icinga-dev) DOWN: up hostcheck: parent host down |             |          | 
|            8 |           1 |          12756 |          0 |            0 |                     1 |                  5 |     0 |          1 | 2011-01-19 18:26:31 |          619812 | 2011-01-19 18:26:38 |        197223 |             12733 | up!$HOSTSTATE:test_router_0$ |              |      30 |             0 |        0.11476 |  12.619 |           0 | test_host_042 (checked by icinga-dev) OK: ok hostcheck                     |             |          | 
|            9 |           1 |          12828 |          0 |            0 |                     1 |                  5 |     2 |          1 | 2011-01-19 18:26:31 |          616411 | 2011-01-19 18:26:38 |        197647 |             12733 | up!$HOSTSTATE:test_router_1$ |              |      30 |             0 |        0.12462 |  13.616 |           2 | test_host_001 (checked by icinga-dev) DOWN: up hostcheck: parent host down |             |          | 
|           10 |           1 |          12918 |          0 |            0 |                     1 |                  5 |     0 |          1 | 2011-01-19 18:26:31 |          622808 | 2011-01-19 18:26:38 |        197911 |             12733 | up!$HOSTSTATE:test_router_4$ |              |      30 |             0 |        0.12092 |   2.622 |           0 | test_host_046 (checked by icinga-dev) OK: ok hostcheck                     |             |          | 
+--------------+-------------+----------------+------------+--------------+-----------------------+--------------------+-------+------------+---------------------+-----------------+---------------------+---------------+-------------------+------------------------------+--------------+---------+---------------+----------------+---------+-------------+----------------------------------------------------------------------------+-------------+----------+
10 rows in set (0.00 sec)

the first rows take longer because the initial config dump takes ages and skews the check latency. once the core is in sync with idoutils, it runs flawlessly.
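To put a number on that, here is a small sketch that computes the gap between start and end timestamps for hostcheck rows 1 and 5 from the table above. It assumes that "take longer" refers to the difference between `start_time`/`start_time_usec` and `end_time`/`end_time_usec`; the helper name `check_wall_time` is made up for this illustration and is not part of idoutils.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

def check_wall_time(start, start_usec, end, end_usec):
    """Seconds between a check's start and end timestamps (hypothetical helper)."""
    t0 = datetime.strptime(start, FMT) + timedelta(microseconds=start_usec)
    t1 = datetime.strptime(end, FMT) + timedelta(microseconds=end_usec)
    return (t1 - t0).total_seconds()

# hostcheck_id 1: started while the config dump was still running
first = check_wall_time("2011-01-19 18:23:42", 58097,
                        "2011-01-19 18:26:31", 594341)

# hostcheck_id 5: started after the core was in sync with idoutils
later = check_wall_time("2011-01-19 18:26:31", 608124,
                        "2011-01-19 18:26:38", 196228)

print(f"row 1: {first:.3f}s, row 5: {later:.3f}s")
```

Under that assumption, row 1 spans roughly 169.5 seconds and row 5 roughly 6.6 seconds, while the reported `execution_time` of both checks is well under a second.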

oracle will be an overnight express test.

mysql> select * from icinga_servicechecks limit 10;
+-----------------+-------------+-------------------+------------+-----------------------+--------------------+-------+------------+---------------------+-----------------+---------------------+---------------+-------------------+--------------+--------------+---------+---------------+----------------+---------+-------------+---------------------------------------------------------------------------------------+-------------+----------+
| servicecheck_id | instance_id | service_object_id | check_type | current_check_attempt | max_check_attempts | state | state_type | start_time          | start_time_usec | end_time            | end_time_usec | command_object_id | command_args | command_line | timeout | early_timeout | execution_time | latency | return_code | output                                                                                | long_output | perfdata |
+-----------------+-------------+-------------------+------------+-----------------------+--------------------+-------+------------+---------------------+-----------------+---------------------+---------------+-------------------+--------------+--------------+---------+---------------+----------------+---------+-------------+---------------------------------------------------------------------------------------+-------------+----------+
|               1 |           1 |               283 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:23:42 |           99751 | 2011-01-19 18:23:42 |        172293 |                 0 |              |              |      60 |             0 |        0.07254 |   0.079 |           0 | test_host_002 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def |             |          | 
|               2 |           1 |              1999 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:23:42 |          134539 | 2011-01-19 18:23:42 |        208757 |                 0 |              |              |      60 |             0 |        0.07422 |   0.134 |           0 | test_host_078 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def |             |          | 
|               3 |           1 |               286 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:23:42 |          128965 | 2011-01-19 18:23:42 |        232195 |                 0 |              |              |      60 |             0 |        0.10323 |   0.128 |           0 | test_host_006 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def |             |          | 
|               4 |           1 |               276 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:23:42 |          126112 | 2011-01-19 18:23:42 |        249237 |                 0 |              |              |      60 |             0 |        0.12312 |   0.125 |           0 | test_host_004 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def |             |          | 
|               5 |           1 |              1980 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:23:42 |          131501 | 2011-01-19 18:23:42 |        277692 |                 0 |              |              |      60 |             0 |        0.14619 |   0.131 |           0 | test_host_077 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def |             |          | 
|               6 |           1 |               670 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:23:43 |          143064 | 2011-01-19 18:23:43 |        174150 |                 0 |              |              |      60 |             0 |        0.03109 |   0.142 |           0 | test_host_008 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def |             |          | 
|               7 |           1 |              2037 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:23:43 |          150954 | 2011-01-19 18:23:43 |        186872 |                 0 |              |              |      60 |             0 |        0.03592 |    0.15 |           0 | test_host_080 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def |             |          | 
|               8 |           1 |              2018 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:23:43 |          148470 | 2011-01-19 18:23:43 |        200957 |                 0 |              |              |      60 |             0 |        0.05249 |   0.148 |           0 | test_host_079 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def |             |          | 
|               9 |           1 |               705 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:23:43 |          145963 | 2011-01-19 18:23:43 |        210418 |                 0 |              |              |      60 |             0 |        0.06445 |   0.145 |           0 | test_host_010 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def |             |          | 
|              10 |           1 |               743 |          0 |                     1 |                  3 |     0 |          1 | 2011-01-19 18:23:44 |          159940 | 2011-01-19 18:23:44 |        190404 |                 0 |              |              |      60 |             0 |        0.03046 |   0.159 |           0 | test_host_012 (checked by icinga-dev) OK: ok warning: '$this$ is a test"te$st\abc\def |             |          | 
+-----------------+-------------+-------------------+------------+-----------------------+--------------------+-------+------------+---------------------+-----------------+---------------------+---------------+-------------------+--------------+--------------+---------+---------------+----------------+---------+-------------+---------------------------------------------------------------------------------------+-------------+----------+
10 rows in set (0.00 sec)

icinga-migration commented Jan 20, 2011

Updated by mfriedrich on 2011-01-20 08:47:40 +00:00

  • Done % changed from 0 to 90

oracle is fine too.

servicechecks

1   318869389   21  43181   0   1   3   0   1   20-Jän-2011 8:08:09 63268   20-Jän-2011 8:08:09 164239  0           60  0   0,10097 0,062   0   test_host_093 (checked by icinga-dev) OK: ok warning: ''$this$ is a test"te$st\abc\def          
2   318869390   21  43221   0   1   3   0   1   20-Jän-2011 8:08:10 102342  20-Jän-2011 8:08:10 157484  0           60  0   0,05514 0,102   0   test_host_095 (checked by icinga-dev) OK: ok warning: ''$this$ is a test"te$st\abc\def          
3   318869391   21  43241   0   1   3   0   1   20-Jän-2011 8:08:10 95376   20-Jän-2011 8:08:10 198484  0           60  0   0,10311 0,095   0   test_host_096 (checked by icinga-dev) OK: ok warning: ''$this$ is a test"te$st\abc\def          
4   318869392   21  43281   0   1   3   0   1   20-Jän-2011 8:08:11 113310  20-Jän-2011 8:08:11 171809  0           60  0   0,0585  0,112   0   test_host_098 (checked by icinga-dev) OK: ok warning: ''$this$ is a test"te$st\abc\def          
5   318869393   21  43261   0   1   3   0   1   20-Jän-2011 8:08:11 118302  20-Jän-2011 8:08:11 175346  0           60  0   0,05704 0,117   0   test_host_097 (checked by icinga-dev) OK: ok warning: ''$this$ is a test"te$st\abc\def          

hostchecks

1   1851412 21  41150   0   0   1   5   0   1   20-Jän-2011 8:05:56 156157  20-Jän-2011 8:06:06 132733  41138   up                                  30  0   0,05589 0,155   0   test_host_005 (checked by icinga-dev) OK: ok hostcheck                                                  
2   1851413 21  41153   0   0   1   5   0   1   20-Jän-2011 8:05:56 164423  20-Jän-2011 8:06:06 133191  41107   up!$HOSTSTATE:test_router_3$        30  0   0,06275 0,164   0   test_host_009 (checked by icinga-dev) OK: ok hostcheck                                                  
3   1851414 21  41157   0   0   1   5   2   1   20-Jän-2011 8:05:56 160908  20-Jän-2011 8:06:06 134038  41107   up!$HOSTSTATE:test_router_1$        30  0   0,07518 0,16    2   test_host_013 (checked by icinga-dev) DOWN: up hostcheck: parent host down                              
4   1851415 21  41145   0   0   1   5   0   1   20-Jän-2011 8:05:56 153098  20-Jän-2011 8:06:06 134498  41102                                       30  0   0,08469 0,152   0   test                                                                            in=0c;;;0 out=0c;;;0    
5   1851416 21  41149   0   0   1   5   0   1   20-Jän-2011 8:05:56 181003  20-Jän-2011 8:06:06 134946  41107   up!$HOSTSTATE:test_router_4$        30  0   0,07785 0,18    0   test_host_004 (checked by icinga-dev) OK: ok hostcheck                                                  
6   1851417 21  41160   0   0   1   5   0   1   20-Jän-2011 8:06:10 161037  20-Jän-2011 8:06:16 234002  41138   up                                  30  0   0,033   0,16    0   test_host_017 (checked by icinga-dev) OK: ok hostcheck                                                  
7   1851418 21  41159   0   0   1   5   0   1   20-Jän-2011 8:06:10 170263  20-Jän-2011 8:06:16 234275  41107   up!$HOSTSTATE:test_router_4$        30  0   0,0407  0,17    0   test_host_016 (checked by icinga-dev) OK: ok hostcheck                                                  
8   1851419 21  41161   0   0   1   5   0   1   20-Jän-2011 8:06:10 167355  20-Jän-2011 8:06:16 234527  41107   up!$HOSTSTATE:test_router_0$        30  0   0,05136 0,167   0   test_host_018 (checked by icinga-dev) OK: ok hostcheck                                                  
9   1851420 21  41158   0   0   1   5   0   1   20-Jän-2011 8:06:10 164404  20-Jän-2011 8:06:16 234789  41107   up!$HOSTSTATE:test_router_3$        30  0   0,06859 0,164   0   test_host_015 (checked by icinga-dev) OK: ok hostcheck                                                  

icinga-migration commented Jan 20, 2011

Updated by mfriedrich on 2011-01-20 14:08:09 +00:00

  • Status changed from Assigned to Resolved
  • Done % changed from 90 to 100

works for me. will backport the hostchecks for opsview too.

[13:01:59]  the @opsview twitter account told me to ask you directly about the servicecheck performance improvements
[13:02:31]  this one - https://dev.icinga.org/issues/1100
[13:02:40]  i was wondering why opsview doesn't do the same with hostchecks?
[13:02:47]  Time
[13:03:24]  ah so if i send you a patch you'll add that? ;-)
[13:03:56]  Would be nice to get a patch in this direction, yes :)
[13:04:17]  ok. which version is your main source .. 1.4b7 right?
[13:07:25]  @dnsmichi: 1.4b7 - yes. Though it's probably easier to checkout https://secure.opsera.com/svn/opsview/trunk/opsview-base/
[13:07:33]  and run make ndoutils
[13:08:12]  I then cp -pr ndoutils-1.4b7 ndoutils-1.4b7.original and hack away at ndoutils-1.4b7 and then do a diff -ur between the two directories for the patch
[13:08:42]  Be aware that the ndomod.o won't "just work" because we have different object structures

icinga-migration commented Dec 8, 2014

Updated by mfriedrich on 2014-12-08 14:34:46 +00:00

  • Project changed from 18 to Core, Classic UI, IDOUtils
  • Category set to IDOUtils

icinga-migration added this to the 1.3 milestone Jan 17, 2017
