Skip to content

Commit

Permalink
Merge pull request #686 from Icinga/missing-history-index
Browse files Browse the repository at this point in the history
Add a correct composite `INDEX` for the history table
  • Loading branch information
julianbrost committed Apr 8, 2024
2 parents d9dc16d + 79d6f7e commit 0a9f5f1
Show file tree
Hide file tree
Showing 5 changed files with 75 additions and 3 deletions.
64 changes: 64 additions & 0 deletions doc/04-Upgrading.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,70 @@
Specific version upgrades are described below. Please note that version upgrades are incremental.
If you are upgrading across multiple versions, make sure to follow the steps for each of them.

## Upgrading to Icinga DB v1.1.2

Please apply the `1.1.2.sql` upgrade script to your database. For package installations, you can find this file at
`/usr/share/icingadb/schema/mysql/upgrades/` or `/usr/share/icingadb/schema/pgsql/upgrades/`, depending on your
database vendor.

!!! warning

With MySQL and MariaDB, a locking issue can occur if the schema upgrade is applied while the history view is
accessed in Icinga DB Web. This can result in the upgrade being delayed unnecessarily and blocking other queries.
Please see [unblock history tables](#unblock-history-tables) for how to detect and resolve this situation.

### Unblock History Tables

!!! info

You don't need to read this section if you are using PostgreSQL. This applies to MySQL/MariaDB users only.

In order to fix a loading performance issue of the history view in Icinga DB Web, this upgrade script adds an
appropriate index on the `history` table. Creating this new index normally takes place without blocking any other
queries. However, this may hang for a relatively considerable time, blocking all Icinga DB queries on all`*_history`
tables and the `history` table inclusively if there is an ongoing, long-running query on the `history` table. One way
of causing this to happen is if an Icinga Web user accesses the `icingadb/history` view just before you are running
this script. Depending on how many entries you have in the history table, Icinga DB Web may take quite a long time to
load, until your web servers timeout (if any) kicks in.

When you observe that the upgrade script has been taking unusually long (`> 60s`) to complete, you can perform the
following analysis on another console and unblock it if necessary. It is important to note though that the script may
need some time to perform the reindexing on the `history` table even if it is not blocked. Nonetheless, you can use the
`show processlist` command to determine whether an excessive number of queries have been stuck in a waiting state.

```
MariaDB [icingadb]> show processlist;
+------+-----+-----+----------+-----+------+---------------------------------+------------------------------------+-----+
| Id | ... | ... | db | ... | Time | State | Info | ... |
+------+-----+-----+----------+-----+------+---------------------------------+------------------------------------+-----+
| 1475 | ... | ... | icingadb | ... | 1222 | Waiting for table metadata lock | INSERT INTO "notification_history" | ... |
| 1485 | ... | ... | icingadb | ... | 1262 | Creating sort index | SELECT history.id, history.... | ... |
| 1494 | ... | ... | icingadb | ... | 1224 | Waiting for table metadata lock | ALTER TABLE history ADD INDEX ... | ... |
| 1499 | ... | ... | icingadb | ... | 1215 | Waiting for table metadata lock | INSERT INTO "notification_history" | ... |
| 1500 | ... | ... | icingadb | ... | 1215 | Waiting for table metadata lock | INSERT INTO "state_history" ... | ... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
+------+-----+-----+----------+-----+------+---------------------------------+------------------------------------+-----+
```

In the above output are way too many Icinga DB queries, including the `ALTER TABLE history ADD INDEX` query from the
upgrade script, waiting for a metadata lock, they are just minimised to the bare essentials. Unfortunately, only one of
these queries is holding the `table metadata lock` that everyone else is now waiting for, which in this case is a
`SELECT` statement initiated by Icinga DB Web in the `icingadb/history` view, which takes an unimaginably long time.
Note that there might be multiple `SELECT` statements started before the upgrade script in your case when the history
view of your Icinga DB Web is opened by different Icinga Web users at the same time.

You can now either just wait for the `SELECT` statements to finish by themselves and let them block the upgrade script
and all Icinga DB queries on all `*_history` tables or forcibly terminate them and let the remaining queries do their
work. In this case, cancelling that one blocking `SELECT` query will let the upgrade script continue normally without
blocking any other queries.
```
MariaDB [icingadb]> kill 1485;
```
In case you are insecure about which Icinga DB Web queries are blocking, you may simply cancel all long-running
`SELECT` statements listed with `show processlist` (see column `Time`). Cancelling a `SELECT` query will neither
crash Icinga DB nor corrupt your database, so feel free to abort every single one of them matching the Icinga DB
database (see column `db`).

## Upgrading to Icinga DB v1.1.1

Please apply the `1.1.1.sql` upgrade script to your database.
Expand Down
2 changes: 1 addition & 1 deletion schema/mysql/schema.sql
Original file line number Diff line number Diff line change
Expand Up @@ -1289,7 +1289,7 @@ CREATE TABLE history (
CONSTRAINT fk_history_notification_history FOREIGN KEY (notification_history_id) REFERENCES notification_history (id) ON DELETE CASCADE,
CONSTRAINT fk_history_state_history FOREIGN KEY (state_history_id) REFERENCES state_history (id) ON DELETE CASCADE,

INDEX idx_history_event_time (event_time) COMMENT 'History filtered/ordered by event_time',
INDEX idx_history_event_time_event_type (event_time, event_type) COMMENT 'History filtered/ordered by event_time/event_type',
INDEX idx_history_acknowledgement (acknowledgement_history_id),
INDEX idx_history_comment (comment_history_id),
INDEX idx_history_downtime (downtime_history_id),
Expand Down
3 changes: 3 additions & 0 deletions schema/mysql/upgrades/1.1.2.sql
Original file line number Diff line number Diff line change
@@ -1 +1,4 @@
UPDATE icingadb_schema SET timestamp = UNIX_TIMESTAMP(timestamp / 1000) * 1000 WHERE timestamp > 20000000000000000;

ALTER TABLE history ADD INDEX idx_history_event_time_event_type (event_time, event_type) COMMENT 'History filtered/ordered by event_time/event_type';
ALTER TABLE history DROP INDEX idx_history_event_time;
4 changes: 2 additions & 2 deletions schema/pgsql/schema.sql
Original file line number Diff line number Diff line change
Expand Up @@ -2074,7 +2074,7 @@ ALTER TABLE history ALTER COLUMN comment_history_id SET STORAGE PLAIN;
ALTER TABLE history ALTER COLUMN flapping_history_id SET STORAGE PLAIN;
ALTER TABLE history ALTER COLUMN acknowledgement_history_id SET STORAGE PLAIN;

CREATE INDEX idx_history_event_time ON history(event_time);
CREATE INDEX idx_history_event_time_event_type ON history(event_time, event_type);
CREATE INDEX idx_history_acknowledgement ON history(acknowledgement_history_id);
CREATE INDEX idx_history_comment ON history(comment_history_id);
CREATE INDEX idx_history_downtime ON history(downtime_history_id);
Expand All @@ -2095,7 +2095,7 @@ COMMENT ON COLUMN history.comment_history_id IS 'comment_history.comment_id';
COMMENT ON COLUMN history.flapping_history_id IS 'flapping_history.id';
COMMENT ON COLUMN history.acknowledgement_history_id IS 'acknowledgement_history.id';

COMMENT ON INDEX idx_history_event_time IS 'History filtered/ordered by event_time';
COMMENT ON INDEX idx_history_event_time_event_type IS 'History filtered/ordered by event_time/event_type';
COMMENT ON INDEX idx_history_host_service_id IS 'Host/service history detail filter';

CREATE TABLE sla_history_state (
Expand Down
5 changes: 5 additions & 0 deletions schema/pgsql/upgrades/1.1.2.sql
Original file line number Diff line number Diff line change
Expand Up @@ -137,3 +137,8 @@ BEGIN
RETURN (100 * (total_time - problem_time)::decimal / total_time)::decimal(7, 4);
END;
$$;

CREATE INDEX CONCURRENTLY idx_history_event_time_event_type ON history(event_time, event_type);
COMMENT ON INDEX idx_history_event_time_event_type IS 'History filtered/ordered by event_time/event_type';

DROP INDEX idx_history_event_time;

0 comments on commit 0a9f5f1

Please sign in to comment.