/var/log/icinga2/icinga2.log is growing very fast on satellites #6542

stevie-sy · 2018-08-10T15:19:13Z

We have a HA configuration with 2 master nodes, 2 satellites in the Network Zone and 2 satellites in the Server Zone. For configuration we use the director.

Since some time - we noticed it after upgrade to Version 2.9.1 - that icinga2.log ist growing and growing very fast from one point to another. But only on the satellites in the Network Zone. It seems it happend after deployment with the director.

For example: that are times when there was a deplolyment today:

That is the visualisation with grafana where you can see the growing of the log file:

and direct in the bash:

I think the problem are the ServiceSets which wie generate with the director (see also Icinga/icingaweb2-module-director#1589). Because it will deploy them at every node. So if the ServerSet is for example for the Server Zone, the satellite in the Network Zone won't find the devices to apply. And if I take a look into the icinga2.log at the time of the depolyment ist is full of Messages like this:

the only work arround is to empty the file with
> icinga2.log

Your Environment

•Director version (System - About): Git Master 71ad855
•Icinga Web 2 version and modules (System - About): 2.6.1
•Icinga 2 version (icinga2 --version): r2.9.1-1
•Operating System and version: CentOS 7
•Webserver, PHP versions: Apache 2.4.6-80.el7, rh-php 7.1.8-1.el7

dbu1986 · 2018-08-17T08:04:27Z

Awesome, opening an issue for this has been on my agenda, too.
+1!

Same here:

Your Environment

Director version (System - About): 1.5.1
Icinga Web 2 version and modules (System - About): 2.6.1
Icinga 2 version (icinga2 --version): 2.9.1
Operating System and version: RHEL 7.5

Currently we are "handling" this problem by creating a dedicated check_disk service for the logging volume /var, which executes an EventCommand @ warning free space <15%. The EventCommand calls a shell script which executes: > icinga2.log

Rotating & zipping the log may be another approach to handle this, depending on the need to keep the logs.

dbu1986 · 2018-08-20T07:38:26Z

More thoughts on this:

The commits are not exclusively triggered by Director's Deployments. I triggered 4 Deploys this morning:

The icinga2.log counts much more sets of Apply Rule mismatches (generally headed by the message 'Commiting config item(s)', so i grepped for it):

[root@satellite_z1n1 icinga2]# grep 'Committing config item' icinga2.log | wc -l
421

Each commit triggers 690 warnings of Apply Rule mismatches on one of our satellites in Zone A, which seems valid, since theses services need to be applied to hosts in a different Zone B. The log on a sattelite in Zone B looks vice versa: 294 warnings of Apply Rule mismatches, because these services need to be deployed at Zone A.

It may also be helpful to know the reason for the huge amount of commits.

stevie-sy · 2018-08-20T12:54:03Z

@dbu1986 a script for deleting the log file would be a work-around, we thought also in our department. But it first it would be nice if this would be patched and like you wrote why this happend.

dnsmichi · 2018-08-22T10:17:59Z

When apply rules are configured, and they do not match anywhere, the user should be made aware of them. This is a general principle with Icinga and the config compiler, and has been sitting there since 2.0. I'm not sure which change has had an impact on generating more of such non-matching apply rules.

In any case, I would analyse where these apply rules source from, and why they are deployed into the satellite node. Highly likely they do exist as global zone and are synced to any cluster node, or they are wrongly pushed to the satellite via cluster config sync and the corresponding zone.

Removing the configuration warning, or changing its verbosity level is not an option here, as this would remove a well known configuration validation feature used in many places.

stevie-sy · 2018-08-22T12:39:08Z

@dnsmichi thanks for your answer.
I think the "problem" in our case is the director. Because he depolys the ServerSets-configuration into /var/lib/icinga2/api/zones/director-global/director/servicesets.conf on every node in the HA-Cluster.
But why the icinga produce in a couple of minutes so a big log file is the mystery. Maybe you can fix this in combination with the director. Like my Suggestion in the Issue here: Icinga/icingaweb2-module-director#1589

Thomas-Gelf · 2018-08-22T12:51:58Z

I'm currently on vacation, just some quick thoughts on this:

when there are 4 Director-triggered deployments on a single day, but 420 restarts (if I correctly understood your numbers), then there is something weird going on
690 warnings for every one of those 420 restarts may account for 30MB log entries, but never for 11 GB
the linked feature request for the Director is reasonable and can be implemented, it will however not solve your main issue here

Cheers,
Thomas

dbu1986 · 2018-08-22T13:02:21Z

When apply rules are configured, and they do not match anywhere, the user should be made aware of them.

Yes, absolutely! I highly appreciate these messages and it's not my intention to get them suppressed.

The Servicesets are created via Director, and deployed via global zone "director-global". Here is an example:

[2018-08-22 13:16:40 +0200] warning/ApplyRule: Apply rule 'SERVICENAME' (in /var/lib/icinga2/api/zones/director-global/director/servicesets.conf: 13096:1-13096:58) for type 'Service' does not match anywhere!

This seems to be standard behaviour. Director currently does not offer an option to set the zone per Serviceset (Icinga/icingaweb2-module-director#1589).

In my environment the global zone "director-global" is configured on both master nodes and all satellite nodes (2 satellites per zone). Aside of that I have a global zone "globalconfig" configured on all nodes (Masters, Satellites, and client hosts) for deploying Director's CheckCommands. The intention of using these two global zones was to avoid non-matching Apply Rule warnings at all client hosts.

However, I'm confident that the logs will get much smaller if we get the deploying of apply rules solved, I think that is a side effect. But I'm not sure if we have another issue here which causes the high rate of commits. Anyway there still would be a questionble high rate of "Committing config item(s)" mentions in the log.

We also have a problem of reload looping. Sorry for not mentioning earlier, it just came to my mind. At this moment the icinga2.log's look like this:

MasterA (configuration master):

...
[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
...

MasterB:

...
[2018-08-22 14:14:21 +0200] information/ConfigItem: Committing config item(s).
[2018-08-22 14:14:21 +0200] information/ConfigItem: Instantiated 1 Comment.
[2018-08-22 14:14:21 +0200] information/ConfigItem: Triggering Start signal for config items
[2018-08-22 14:14:21 +0200] information/ConfigItem: Activated all objects.
[2018-08-22 14:14:21 +0200] information/ConfigItem: Committing config item(s).
[2018-08-22 14:14:21 +0200] information/ConfigItem: Instantiated 1 Comment.
[2018-08-22 14:14:21 +0200] information/ConfigItem: Triggering Start signal for config items
[2018-08-22 14:14:21 +0200] information/ConfigItem: Activated all objects.
[2018-08-22 14:14:21 +0200] information/ConfigItem: Committing config item(s).
[2018-08-22 14:14:21 +0200] information/ConfigItem: Instantiated 1 Comment.
[2018-08-22 14:14:21 +0200] information/ConfigItem: Triggering Start signal for config items
[2018-08-22 14:14:21 +0200] information/ConfigItem: Activated all objects.
[2018-08-22 14:14:21 +0200] information/ConfigItem: Committing config item(s).
[2018-08-22 14:14:21 +0200] information/ConfigItem: Instantiated 1 Comment.
[2018-08-22 14:14:21 +0200] information/ConfigItem: Triggering Start signal for config items
[2018-08-22 14:14:21 +0200] information/ConfigItem: Activated all objects.
...

satellite_z1n1 & satellite_z1n2:

same as on MasterB, but additionally with all the apply rules mismatch warnings. That may be the reason for their huge logs.

Any advice, how I can get this under control? I've been struggling for days trying to debug when this starts and when it ends. Currently I'm establishing another separate environment for testing our config, but for now I don't have any clue where this is coming from.

stevie-sy · 2018-08-22T13:13:37Z

@Thomas-Gelf thanks for the answer. Maybe it is a mixture of more things? But I think it reduces the warnings in the logs, if the director deploy the servicesets in the correct zones. For example in our case if the serviceset will deploy it only for the server-zone and not for the network Zone and vice versa.

One information for you: in our configuration we import the hosts every 2 hours from our CMDB via CSV. If there are changes the directore will sync. Sometimes we have to do some manual configuration.

But why icinga produce so much warnings is the other point.

For now: enjoy your holidays ;-) you deserve it! You are doing a good job with the director!

Thomas-Gelf · 2018-08-22T13:20:35Z

@stevie-sy: glad to hear that, thank you! Just wanted to point you to that discrepancy in your numbers, you should try to dig deeper. Eventually you can track down a looping behavior similar to what @dbu1986 describes. Your log file is so huge, take an average log line length and do the math on your own. Regular deployments can hardly account for even 0.1% of that data.

dnsmichi · 2018-09-14T11:20:36Z

I'm wondering why these messages are coming in from the satellite to the master. This should have been denied earlier.

[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.

https://github.com/Icinga/icinga2/blob/master/lib/remote/apilistener-configsync.cpp#L79

Please show the Endpoint and Zone configuration on your master, including all the satellites.

dbu1986 · 2018-09-17T11:14:49Z

6542 zonesconf.zip

stevie-sy · 2018-10-29T12:44:07Z

I want to give you a small update to this Issue:
After updating to the newest Version of Icinga (2.10.x) and the latest version of the director in the master branch we didn't got this behavior again. We also startet with a new and empty director database.

Maybe this [(https://github.com//pull/6663]) from the update 2.10.1 helped.

If @dbu1986 didn't get this Problem also again, maybe this could be closed.

dnsmichi · 2018-10-29T12:49:43Z

Likely it helps a bit if you have many created objects and non-matching apply rules. It popped into my mind how to silence these logged lines, so I just did it.

stevie-sy · 2018-10-29T13:00:59Z

And you did a good job (y).
if @Thomas-Gelf can also supress this with the director like my suggestion, that would also help I think. If it happen again than it need some more analysis and investigation like you wrote before.

dnsmichi · 2018-10-29T13:06:35Z

The thing I'm still curious about is that log line

[2018-08-22 14:16:24 +0200] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.

This must never happen, and I want to know how to reproduce this. @dbu1986 Unfortunately the zones configuration doesn't tell me much, how's the configuration deployed and synced top down, what exactly causes this behaviour?

dbu1986 · 2018-11-05T13:51:11Z

@dnsmichi

https://github.com/Icinga/icinga2/blob/master/lib/remote/apilistener-configsync.cpp#L79

I assume you are reffering to #L224 ? The log message might be related to the ConfigDeleteObjectAPIHandler, since here no further details on the discarded object are being written into the log (following origin->FromClient->GetIdentity() ).
Could you provide further details on what Icinga2 is attempting to do here?

, what exactly causes this behaviour?

I'd desperately love to reveal that, too 😄 Unfortunately I have no idea what triggers it and how to find a point to start from.

However, I'm not sure if I got your question well, but as mentioned above, we're using HA distributed monitoring infrastructure in top/down mode, containing 2 Master, 3 Zones with 2 Satellites each, 4 Zones with 1 Satellite each, and a load of clients per Zone. Most of the objects are managed by Director (~10.000 Services, ~800 Hosts), plus ~5.000 Services created as runtime objects via REST-API on the config master, plus <20 Services defined in a configfile on the config master.

Additionally we are using a ticket system, fed by Icinga2 Notifications (passed in a perl script via REST API), and as a return path the ticket system sends Acknowledgements and Comments back to Icinga2 asynchronously.

One thing coming up my mind is the discrepancy between the timestamps of these messages. Today's messages as an example:
MasterA:

[root@MasterA ~]# grep "Discarding.*from 'satellite_z1n1" /var/log/icinga2/icinga2.log | head -40
[2018-11-05 04:06:40 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:06:53 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:08:02 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:08:06 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:08:53 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:09:16 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:09:20 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:09:24 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:09:31 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:09:51 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:09:52 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:15:51 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:22:28 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:22:50 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:23:51 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:24:35 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:24:49 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:25:02 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:25:10 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:25:43 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:26:05 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:26:15 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:26:41 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:30:23 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:39:10 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:39:39 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:44:47 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:48:54 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:49:04 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:49:05 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:55:37 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:56:42 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 04:59:50 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 05:05:16 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 05:18:12 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 05:20:43 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 05:24:39 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 05:39:24 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 05:42:23 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[2018-11-05 05:55:07 +0100] warning/ApiListener: Discarding 'config update object' message from 'satellite_z1n1'.
[root@MasterA ~]#

satellite_z1n1:

[root@satellite_z1n1 ~]# grep "Sending config updates for endpoint 'MasterA" /var/log/icinga2/icinga2.log
[2018-11-05 08:53:44 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 08:54:05 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 08:58:30 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 08:58:48 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 09:01:16 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 09:01:35 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 09:14:51 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 09:15:10 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 09:22:30 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 09:22:52 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 10:06:06 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 10:06:23 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 10:29:28 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 10:37:54 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 10:37:54 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 10:45:04 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 10:48:34 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 10:51:28 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 11:22:48 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 11:23:05 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 11:29:26 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 11:29:45 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 12:20:54 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 12:21:12 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 12:26:17 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 12:26:36 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 12:55:54 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[2018-11-05 14:12:35 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.
[root@satellite_z1n1 ~]#

I somehow expected the message timestamps to nearly match to each other, but the 'discarding' messages at the master started around 04:06 and the 'sending config' messages at the satellite started much later, around 08:53. So there may be config updates on the satellite, that I am totally missing. Director's deployment log is listing today's first deployment at 07:52. And the icinga2 system processes are up and running since a couple of days, so I can also exclude a restart of Icinga2.

A few log lines from satellite_z1n1 before the first 'sending config' message to MasterA:

[2018-11-05 08:53:25 +0100] information/WorkQueue: #10 (JsonRpcConnection, #1) items: 1, rate: 111.45/s (6687/min 32724/5min 98056/15min);
[2018-11-05 08:53:26 +0100] information/ApiListener: New client connection for identity 'q4dee1syw4s' from [139.1.67.48]:44861 (no Endpoint object found for identity)
[2018-11-05 08:53:26 +0100] information/JsonRpcConnection: Received certificate request for CN 'q4dee1syw4s' signed by our CA.
[2018-11-05 08:53:26 +0100] information/JsonRpcConnection: The certificate for CN 'q4dee1syw4s' is valid and uptodate. Skipping automated renewal.
[2018-11-05 08:53:27 +0100] information/ApiListener: New client connection for identity 'linclutest01' from [139.1.67.126]:33682 (no Endpoint object found for identity)
[2018-11-05 08:53:31 +0100] information/ApiListener: New client connection for identity 'clientvm250509' from [139.1.66.123]:55057 (no Endpoint object found for identity)
[2018-11-05 08:53:38 +0100] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
[2018-11-05 08:53:40 +0100] information/ApiListener: New client connection for identity 'cldvm00004' from [172.30.14.11]:44500 (no Endpoint object found for identity)
[2018-11-05 08:53:40 +0100] information/JsonRpcConnection: Received certificate request for CN 'cldvm00004' signed by our CA.
[2018-11-05 08:53:40 +0100] information/JsonRpcConnection: The certificate for CN 'cldvm00004' is valid and uptodate. Skipping automated renewal.
[2018-11-05 08:53:40 +0100] warning/TlsStream: TLS stream was disconnected.
[2018-11-05 08:53:40 +0100] warning/JsonRpcConnection: API client disconnected for identity 'MasterA'
[2018-11-05 08:53:40 +0100] warning/ApiListener: Removing API client for endpoint 'MasterA'. 0 API clients left.
[2018-11-05 08:53:44 +0100] information/ApiListener: New client connection for identity 'MasterA' from [139.1.170.8]:39106
[2018-11-05 08:53:44 +0100] information/ApiListener: Requesting new certificate for this Icinga instance from endpoint 'MasterA'.
[2018-11-05 08:53:44 +0100] information/ApiListener: Sending config updates for endpoint 'MasterA' in zone 'Z_Master'.

Could the 'API client disconnected' message be related to this?

hardoverflow · 2018-11-16T09:05:25Z

We have the same problem here.

Your Environment

Icinga 2 version (icinga2 --version): r2.10.2-1
Operating System and version: CentOS 7
Icinga Web 2 version: 2.6.1

[2018-11-16 06:55:00 +0100] information/ConfigObjectUtility: Created and activated object 'host.example.com!Service Name!8b488fc1-26a3-47cd-bfbe-97010e9a9ddd' of type 'Downtime'.

This is flooding our logs.

cat /var/log/icinga2/icinga2.log | grep "Created and activated object" | wc -l
277086

dnsmichi · 2018-11-19T07:46:19Z

Are you syncing the ScheduledDowntime objects via global zone to that satellite? Those downtime creation messages can either originate from REST API creation calls (or the legacy command pipe) or they are created from those SD objects.

dnsmichi · 2018-11-22T07:56:19Z

For further tests, please remove any ScheduledDowntime objects/apply rules from your configuration, and test again. I'd like to know whether this has an influence here or not.

Al2Klimov · 2018-11-23T19:17:33Z

@lippserd @dnsmichi After lots of other tasks, tech. problems and (uncached) Icinga 2 builds I've finally set up a playground for this and tried to reproduce the issue. Either I've overseen an elephant (again) or there's something really interesting (and creepy) with that ScheduledDowntimes.

@dnsmichi Let's talk offline about the tech. details on monday.

Al2Klimov · 2018-11-27T16:34:35Z

The ScheduledDowntimes by themselves purr like a cat. But the created downtimes disappear suddenly.

Al2Klimov · 2018-11-28T09:42:19Z

The web app shown here polls the Icinga 2 API all the time and appends new downtimes to its list and colors them green. Once a downtime disappear, it colors it red.

refs #6542

dnsmichi · 2018-12-07T09:08:00Z

This seems to be the following case:

Downtime -> ExpiredHandler() kicks in after 60 seconds
GetObject in HasValidConfigOwner() returns are nullptr
Downtime is deleted
SD detects that there is downtime missing, so it again creates one

Both calls lead into lib/remote/configobjectutility.cpp where CreateObject() always logs that it has created a runtime object (a downtime in this case). When a downtime is deleted, nothing is logged.

Therefore a local loop exists which immediately stops on restart, but then begins again after the expired downtime handler kicks in.

It doesn't matter whether the zone is the master or the satellite, creation and deletion happens locally.
If the configuration activation is fast enough this might not happen.

refs #6542

refs #6542 (cherry picked from commit bbcf469)

refs #6542 (cherry picked from commit 39e0d78)

refs #6542 (cherry picked from commit 5431eae)

refs #6542 (cherry picked from commit c98d6a5)

dnsmichi · 2019-02-11T12:49:38Z

@ALL

Can you please test the snapshot packages, it includes all referenced issues and PRs.

Thanks,
Michael

dnsmichi · 2019-02-11T13:22:04Z

ref/NC/585559
ref/IP/9673

dnsmichi · 2019-02-25T12:51:21Z

ref/IP/9673 reported this working as expected. Since there's no more feedback here, I consider this being resolved.

dnsmichi added question area/configuration DSL, parser, compiler, error handling area/log Logging related labels Aug 22, 2018

dnsmichi added the needs feedback We'll only proceed once we hear from you again label Sep 14, 2018

Al2Klimov self-assigned this Nov 23, 2018

Al2Klimov mentioned this issue Nov 27, 2018

Revert "Remove Downtime objects w/o reference to ScheduledDowntime objects" #6802

Closed

Al2Klimov removed their assignment Nov 29, 2018

Al2Klimov added a commit that referenced this issue Dec 4, 2018

Add and populate Downtime#authoritative_zone

4801162

refs #6542

Al2Klimov added a commit that referenced this issue Dec 4, 2018

Downtime#HasValidConfigOwner(): respect Downtime#authoritative_zone

a0a2008

refs #6542

Al2Klimov added a commit that referenced this issue Dec 6, 2018

Implement ScheduledDowntime::AllConfigIsLoaded()

6f838ee

refs #6542

Al2Klimov added a commit that referenced this issue Dec 6, 2018

Downtime#HasValidConfigOwner(): wait for ScheduledDowntimes

c78c702

refs #6542

Al2Klimov mentioned this issue Dec 6, 2018

Downtime#HasValidConfigOwner(): wait for ScheduledDowntimes #6826

Merged

Al2Klimov added a commit that referenced this issue Dec 6, 2018

Implement ScheduledDowntime::AllConfigIsLoaded()

bbcf469

refs #6542

Al2Klimov added a commit that referenced this issue Dec 6, 2018

Downtime#HasValidConfigOwner(): wait for ScheduledDowntimes

39e0d78

refs #6542

dnsmichi added bug Something isn't working and removed needs feedback We'll only proceed once we hear from you again question labels Dec 7, 2018

dnsmichi assigned Al2Klimov Dec 7, 2018

dnsmichi added this to the 2.10.3 milestone Dec 7, 2018

Al2Klimov removed their assignment Dec 7, 2018

Al2Klimov added a commit that referenced this issue Dec 17, 2018

Add and populate Downtime#authoritative_zone

5431eae

refs #6542

Al2Klimov added a commit that referenced this issue Dec 17, 2018

Downtime#HasValidConfigOwner(): respect Downtime#authoritative_zone

c98d6a5

refs #6542

Al2Klimov mentioned this issue Dec 17, 2018

Don't delete downtimes in satellite zones #6821

Merged

stevie-sy mentioned this issue Jan 25, 2019

/var/lib/icinga2/api/log/ growing although endpoints are connected #6910

Closed

dnsmichi pushed a commit that referenced this issue Feb 11, 2019

Implement ScheduledDowntime::AllConfigIsLoaded()

7c71218

refs #6542 (cherry picked from commit bbcf469)

dnsmichi pushed a commit that referenced this issue Feb 11, 2019

Downtime#HasValidConfigOwner(): wait for ScheduledDowntimes

a242569

refs #6542 (cherry picked from commit 39e0d78)

dnsmichi pushed a commit that referenced this issue Feb 11, 2019

Add and populate Downtime#authoritative_zone

53248a7

refs #6542 (cherry picked from commit 5431eae)

dnsmichi pushed a commit that referenced this issue Feb 11, 2019

Downtime#HasValidConfigOwner(): respect Downtime#authoritative_zone

b888e06

refs #6542 (cherry picked from commit c98d6a5)

dnsmichi added the needs feedback We'll only proceed once we hear from you again label Feb 11, 2019

dnsmichi removed the needs feedback We'll only proceed once we hear from you again label Feb 25, 2019

dnsmichi closed this as completed Feb 25, 2019

dnsmichi added ref/IP ref/NC labels Apr 25, 2019

stevie-sy mentioned this issue Aug 26, 2020

Add (optional) allowed endpoints to global zones configuration #8182

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

/var/log/icinga2/icinga2.log is growing very fast on satellites #6542

/var/log/icinga2/icinga2.log is growing very fast on satellites #6542

stevie-sy commented Aug 10, 2018 •

edited

Loading

dbu1986 commented Aug 17, 2018

dbu1986 commented Aug 20, 2018 •

edited

Loading

stevie-sy commented Aug 20, 2018

dnsmichi commented Aug 22, 2018

stevie-sy commented Aug 22, 2018

Thomas-Gelf commented Aug 22, 2018

dbu1986 commented Aug 22, 2018

stevie-sy commented Aug 22, 2018

Thomas-Gelf commented Aug 22, 2018

dnsmichi commented Sep 14, 2018 •

edited

Loading

dbu1986 commented Sep 17, 2018

stevie-sy commented Oct 29, 2018

dnsmichi commented Oct 29, 2018

stevie-sy commented Oct 29, 2018

dnsmichi commented Oct 29, 2018

dbu1986 commented Nov 5, 2018

hardoverflow commented Nov 16, 2018

dnsmichi commented Nov 19, 2018

dnsmichi commented Nov 22, 2018

Al2Klimov commented Nov 23, 2018

Al2Klimov commented Nov 27, 2018

Al2Klimov commented Nov 28, 2018

dnsmichi commented Dec 7, 2018

dnsmichi commented Feb 11, 2019

dnsmichi commented Feb 11, 2019 •

edited

Loading

dnsmichi commented Feb 25, 2019 •

edited

Loading

/var/log/icinga2/icinga2.log is growing very fast on satellites #6542

/var/log/icinga2/icinga2.log is growing very fast on satellites #6542

Comments

stevie-sy commented Aug 10, 2018 • edited Loading

Your Environment

dbu1986 commented Aug 17, 2018

Your Environment

dbu1986 commented Aug 20, 2018 • edited Loading

stevie-sy commented Aug 20, 2018

dnsmichi commented Aug 22, 2018

stevie-sy commented Aug 22, 2018

Thomas-Gelf commented Aug 22, 2018

dbu1986 commented Aug 22, 2018

stevie-sy commented Aug 22, 2018

Thomas-Gelf commented Aug 22, 2018

dnsmichi commented Sep 14, 2018 • edited Loading

dbu1986 commented Sep 17, 2018

stevie-sy commented Oct 29, 2018

dnsmichi commented Oct 29, 2018

stevie-sy commented Oct 29, 2018

dnsmichi commented Oct 29, 2018

dbu1986 commented Nov 5, 2018

hardoverflow commented Nov 16, 2018

Your Environment

dnsmichi commented Nov 19, 2018

dnsmichi commented Nov 22, 2018

Al2Klimov commented Nov 23, 2018

Al2Klimov commented Nov 27, 2018

Al2Klimov commented Nov 28, 2018

dnsmichi commented Dec 7, 2018

dnsmichi commented Feb 11, 2019

dnsmichi commented Feb 11, 2019 • edited Loading

dnsmichi commented Feb 25, 2019 • edited Loading

stevie-sy commented Aug 10, 2018 •

edited

Loading

dbu1986 commented Aug 20, 2018 •

edited

Loading

dnsmichi commented Sep 14, 2018 •

edited

Loading

dnsmichi commented Feb 11, 2019 •

edited

Loading

dnsmichi commented Feb 25, 2019 •

edited

Loading