Skip to content
This repository has been archived by the owner. It is now read-only.

[dev.icinga.com #173] fix duplicate entries in table servicechecks (mysql) #82

Closed
icinga-migration opened this issue Nov 3, 2009 · 1 comment

Comments

@icinga-migration
Copy link
Member

@icinga-migration icinga-migration commented Nov 3, 2009

This issue has been migrated from Redmine: https://dev.icinga.com/issues/173

Created by mfriedrich on 2009-11-03 11:37:31 +00:00

Assignee: mfriedrich
Status: Resolved (closed on 2009-11-03 12:01:07 +00:00)
Target Version: 1.0 Stable
Last Update: 2009-11-03 12:01:07 +00:00 (in Redmine)


my comment from nagios-devel

Hi,

Øyvind Nordang wrote:

When I check the servicechecks table each check of a service is listed
twice, but not with the same information in both rows.

I've noticed that myself and that seems to be a bug. Maybe it concerns the unique constraint where the UPDATE should happen. But it could also be that some data is sent incorrectly and therefore no matching can be done.

In dbhandlers.c on line 1518 all I can think of is that ON DUPLICATE
KEY UPDATE does not work as it should. I haven't had time to look at
the code and programming isn't my strong side.

if(asprintf(&buf,"INSERT INTO %s SET %s, command_object_id='%lu',
command_args='%s', command_line='%s' ON DUPLICATE KEY UPDATE %s"
,ndo2db_db_tablenames[NDO2DB_DBTABLE_SERVICECHECKS]
,buf1
,command_id
,es[0]
,es[1]
,buf1
)==-1)
buf=NULL;

Also the creation of "buf1" is interesting. The approach of building the query buffer for both INSERT and UPDATE might lead into a real error too. The only difference is that the INSERT statement gets command_object_id, command_args and command_line

The defined keys in db/mysql.sql

KEY `instance_id` (`instance_id`),
KEY `service_object_id` (`service_object_id`),
KEY `start_time` (`start_time`)

Kind of strange, no unique definition ... but let's step further into analysis.

Problem regarding that routine is, that the constraint never matches.
I'm trying to resolve this error from Icinga IDOUtils where the error exists too. The main differences are that the queries are normalized (INSERT INTO .. () VALUES () instead of INSERT INTO .. SET ..=..) but the constraints are the same as in NDOUtils. Also the debugoutput is quite enhanced to see what's going on.

I've run several tests with Postgres and Oracle, and regarding especially right now Oracle, the defined constraint also matches and entries do not get duplicated. Postgres also works fine.

The reason why this is happening is quite simple - the queries point exactly to the unique constraint.

Postgres:
UPDATE %s SET ... WHERE instance_id=%lu AND service_object_id=%lu AND start_time=%s AND start_time_usec=%lu

Oracle:
MERGE INTO %s USING DUAL ON (instance_id=%lu AND service_object_id=%lu AND start_time=%s AND start_time_usec=%lu) WHEN MATCHED THEN UPDATE SET ... WHEN NOT MATCHED THEN INSERT ... VALUES ...

Conclusion to that is that it is a MySQL problem regarding the defined keys and the ON DUPLICATE KEY statement.

To step further into that, I've only selected the columns which are necessary to see the difference::

mysql> select servicecheck_id, service_object_id, start_time, start_time_usec, end_time, end_time_usec, command_object_id, execution_time from icinga_servicechecks;

-----------------------------------*------------------------------------~~-----------------------------------~~------------------*---------------+
| servicecheck_id | service_object_id | start_time | start_time_usec | end_time | end_time_usec | command_object_id | execution_time |

-----------------------------------*------------------------------------~~-----------------------------------~~------------------*---------------+

This block should be have been updated e.g.

521 6421 2009-11-03 10:34:58 245848 1970-01-01 01:00:00 0 6193 0
522 6422 2009-11-03 10:34:58 339129 1970-01-01 01:00:00 0 6193 0
523 6423 2009-11-03 10:35:02 188397 1970-01-01 01:00:00 0 6193 0
524 6424 2009-11-03 10:35:02 329014 1970-01-01 01:00:00 0 6193 0
525 6425 2009-11-03 10:35:03 91160 1970-01-01 01:00:00 0 6193 0
526 6307 2009-11-03 10:35:03 305725 1970-01-01 01:00:00 0 6193 0
527 6426 2009-11-03 10:35:04 57256 1970-01-01 01:00:00 0 6193 0
528 6427 2009-11-03 10:35:04 150767 1970-01-01 01:00:00 0 6193 0
529 6429 2009-11-03 10:35:05 172237 1970-01-01 01:00:00 0 6193 0
530 6428 2009-11-03 10:35:05 241257 1970-01-01 01:00:00 0 6193 0

But it doesn't, you can see the INSERT statements right after a while in the history.

531 6421 2009-11-03 10:34:58 245848 2009-11-03 10:35:02 798210 0 4.55236
532 6422 2009-11-03 10:34:58 339129 2009-11-03 10:35:02 801988 0 4.46286
533 6430 2009-11-03 10:35:16 721993 1970-01-01 01:00:00 0 6193 0
534 6423 2009-11-03 10:35:02 188397 2009-11-03 10:35:06 395320 0 4.20692
535 6424 2009-11-03 10:35:02 329014 2009-11-03 10:35:06 786733 0 4.45772
536 6307 2009-11-03 10:35:03 305725 2009-11-03 10:35:08 45368 0 4.73964
537 6425 2009-11-03 10:35:03 91160 2009-11-03 10:35:08 389047 0 5.29789
538 6426 2009-11-03 10:35:04 57256 2009-11-03 10:35:08 674087 0 4.61683
539 6427 2009-11-03 10:35:04 150767 2009-11-03 10:35:09 322330 0 5.17156
540 6429 2009-11-03 10:35:05 172237 2009-11-03 10:35:09 524681 0 4.35244
541 6428 2009-11-03 10:35:05 241257 2009-11-03 10:35:09 900865 0 4.65961
542 6431 2009-11-03 10:35:17 26969 1970-01-01 01:00:00 0 6193 0
543 6432 2009-11-03 10:35:17 133886 1970-01-01 01:00:00 0 6193 0
544 6430 2009-11-03 10:35:16 721993 2009-11-03 10:35:20 996513 0 4.27452
545 6431 2009-11-03 10:35:17 26969 2009-11-03 10:35:21 391540 0 4.36457
546 6432 2009-11-03 10:35:17 133886 2009-11-03 10:35:21 396431 0 4.26255

Let's focus on service_object_id=6421 (i've cleared the IP address)

The first query generates the entry where end_time is 0 ... FROM_UNIXTIME(0), 0 ... start_time is 1257240898 (= 2009-11-03 10:34:58 GMT+1)

521 6421 2009-11-03 10:34:58 245848 1970-01-01 01:00:00 0 6193 0

[1257240916.978113] [002.0] [pid=2673] INSERT INTO icinga_servicechecks (instance_id, service_object_id, check_type, current_check_attempt, max_check_attempts, state, state_type, start_time, start_time_usec, end_time, end_time_usec, timeout, early_timeout, execution_time, latency, return_code, output, long_output, perfdata, command_object_id, command_args, command_line) VALUES (1, 6421, 0, 1, 3, 0, 1, FROM_UNIXTIME(1257240898), 245848, FROM_UNIXTIME(0), 0, 60, 0, 0.000000, 0.245000, 0, 'PING OK - Packet loss = 0, RTA = 17\.43 ms', '', '', 6193, '100\.0,20!500\.0,60', '/opt/icinga/libexec/check_ping -H 1\.2\.3\.4 -w 100\.0,20 -c 500\.0,60% -p 5 4') ON DUPLICATE KEY UPDATE check_type='0', current_check_attempt='1', max_check_attempts='3', state='0', state_type='1', start_time=FROM_UNIXTIME(1257240898), start_time_usec='245848', end_time=FROM_UNIXTIME(0), end_time_usec='0', timeout='60', early_timeout='0', execution_time='0.000000', latency='0.245000', return_code='0', output='PING OK Packet loss = 0%, RTA = 17\.43 ms', long_output='', perfdata=''

For that check, we are waiting for end_time and end_time_usec and truly need the UPDATE condition, the INSERT will create this faulty and duplicate entry (short version!).

531 6421 2009-11-03 10:34:58 245848 2009-11-03 10:35:02 798210 0 4.55236

[1257240917.081916] [002.0] [pid=2673] INSERT INTO icinga_servicechecks (instance_id, service_object_id, check_type, current_check_attempt, max_check_attempts, state, state_type, start_time, start_time_usec, end_time, end_time_usec, timeout, early_timeout, execution_time, latency, return_code, output, long_output, perfdata, command_object_id, command_args, command_line) VALUES (1, 6421, 0, 1, 3, 0, 1, FROM_UNIXTIME(1257240898), 245848, FROM_UNIXTIME(1257240902), 798210, 60, 0, 4.552360, 0.245000, 0, 'PING OK - Packet loss = 0, RTA = 75\.77 ms', '', '', 0, '', '') ON DUPLICATE KEY UPDATE check_type='0', current_check_attempt='1', max_check_attempts='3', state='0', state_type='1', start_time=FROM_UNIXTIME(1257240898), start_time_usec='245848', end_time=FROM_UNIXTIME(1257240902), end_time_usec='798210', timeout='60', early_timeout='0', execution_time='4.552360', latency='0.245000', return_code='0', output='PING OK - Packet loss = 0, RTA = 75\.77 ms', long_output='', perfdata=''

As you may see on a full output on that row, the INSERT puts empty command_line, empty command_object_id etc into the DB which is wrong.

Now let's compare on the unique constraint, where ON DUPLICATE KEY shoud trigger on the UPDATE statement:

instance_id - same!
service_object_id - same!
start_time - same!

Well regarding the fact that the unique constraint is missing and instead normal keys are defined, let's turn the other way around - why are the hostchecks running fine without duplicate entries? Hmmm, because the unique key is defined in a correct manner.

servicechecks:
KEY `instance_id` (`instance_id`),
KEY `service_object_id` (`service_object_id`),
KEY `start_time` (`start_time`)

hostchecks:
UNIQUE KEY `instance_id` (`instance_id`,`host_object_id`,`start_time`,`start_time_usec`)

Whoever did that to the SQL-Statement during table creation......

Correct way of getting that to work will be defining a real unique key and then ON DUPLICATE KEY will trigger the UPDATE statement.

I will work on a patch and provide it when finished testing.

Kind regards,
Michael

@icinga-migration

This comment has been minimized.

Copy link
Member Author

@icinga-migration icinga-migration commented Nov 3, 2009

Updated by mfriedrich on 2009-11-03 12:01:07 +00:00

  • Status changed from New to Resolved
  • Done % changed from 0 to 100

resolution is by defining the unique key instead of the keys.

ALTER table icinga_servicechecks DROP KEY instance_id;
ALTER table icinga_servicechecks DROP KEY service_object_id;
ALTER table icinga_servicechecks DROP KEY start_time;

ALTER table icinga_servicechecks ADD UNIQUE KEY `instance_id` (`instance_id`,`service_object_id`,`start_time`,`start_time_usec`);

mysql> show keys from icinga_servicechecks;
+----------------------+------------+-------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table                | Non_unique | Key_name    | Seq_in_index | Column_name       | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------------+------------+-------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| icinga_servicechecks |          0 | PRIMARY     |            1 | servicecheck_id   | A         |           0 |     NULL | NULL   |      | BTREE      |         | 
| icinga_servicechecks |          0 | instance_id |            1 | instance_id       | A         |        NULL |     NULL | NULL   |      | BTREE      |         | 
| icinga_servicechecks |          0 | instance_id |            2 | service_object_id | A         |        NULL |     NULL | NULL   |      | BTREE      |         | 
| icinga_servicechecks |          0 | instance_id |            3 | start_time        | A         |        NULL |     NULL | NULL   |      | BTREE      |         | 
| icinga_servicechecks |          0 | instance_id |            4 | start_time_usec   | A         |           0 |     NULL | NULL   |      | BTREE      |         | 
+----------------------+------------+-------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
5 rows in set (0.00 sec)

then getting those queries

[1257248851.261462] [002.0] [pid=17833] INSERT INTO icinga_servicechecks (instance_id, service_object_id, check_type, current_check_attempt, max_check_attempts, state, state_type, start_time, start_time_usec, end_time, end_time_usec, timeout, early_timeout, execution_time, latency, return_code, output, long_output, perfdata, command_object_id, command_args, command_line) VALUES (1, 6540, 0, 1, 3, 0, 1, FROM_UNIXTIME(1257248851), 233434, FROM_UNIXTIME(0), 0, 60, 0, 0.000000, 0.233000, 0, 'PING OK - Packet loss = 0%, RTA = 1\.26 ms', '', '', 6512, '150\.0,20%!500\.0,60%', '/opt/icinga/libexec/check_ping -H host\.at -w 150\.0,20% -c 500\.0,60% -p 5 -4') ON DUPLICATE KEY UPDATE check_type='0', current_check_attempt='1', max_check_attempts='3', state='0', state_type='1', start_time=FROM_UNIXTIME(1257248851), start_time_usec='233434', end_time=FROM_UNIXTIME(0), end_time_usec='0', timeout='60', early_timeout='0', execution_time='0.000000', latency='0.233000', return_code='0', output='PING OK - Packet loss = 0%, RTA = 1\.26 ms', long_output='', perfdata=''

this one updates only endtime and so on.

[1257248856.188902] [002.0] [pid=17833] INSERT INTO icinga_servicechecks (instance_id, service_object_id, check_type, current_check_attempt, max_check_attempts, state, state_type, start_time, start_time_usec, end_time, end_time_usec, timeout, early_timeout, execution_time, latency, return_code, output, long_output, perfdata, command_object_id, command_args, command_line) VALUES (1, 6540, 0, 1, 3, 0, 1, FROM_UNIXTIME(1257248851), 233434, FROM_UNIXTIME(1257248855), 341614, 60, 0, 4.108180, 0.233000, 0, 'PING OK - Packet loss = 0%, RTA = 1\.30 ms', '', '', 0, '', '') ON DUPLICATE KEY UPDATE check_type='0', current_check_attempt='1', max_check_attempts='3', state='0', state_type='1', start_time=FROM_UNIXTIME(1257248851), start_time_usec='233434', end_time=FROM_UNIXTIME(1257248855), end_time_usec='341614', timeout='60', early_timeout='0', execution_time='4.108180', latency='0.233000', return_code='0', output='PING OK - Packet loss = 0%, RTA = 1\.30 ms', long_output='', perfdata=''

and here we go:

|             923 |              6540 | 2009-11-03 12:47:31 |          233434 | 2009-11-03 12:47:35 |        341614 |              6512 |        4.10818 |

modified in mysql.sql and added mysql-upgrade-1.0.sql with the ALTER statements (to prepare for stable release).

already pushed to master!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant
You can’t perform that action at this time.