Client tries to sync config (downtimes) back to Master #4969

Closed
marcofl opened this issue Feb 1, 2017 · 16 comments
Labels
area/distributed Distributed monitoring (master, satellites, clients) bug Something isn't working stalled Blocked or not relevant yet

Comments

@marcofl

marcofl commented Feb 1, 2017

Hi everyone,

I noticed some very strange behaviour. When we create downtimes (via icingaweb2 or configured recurring downtimes), the downtime is written to the _api config package (e.g. /var/lib/icinga2/api/packages/_api/icinga-server-1485934526-1/conf.d/downtimes). It is then also synced to the affected client zone via the API and saved in /var/lib/icinga2/api/packages/_api/icinga-client-1485934667-0/conf.d/downtimes on the client.

I tried moving the configured downtimes out of the global template zone (in zones.d on the master) and into the master zone instead, but they still get synced to the client.

First, is this intended? Why does the downtime get synced to the client as runtime config when the client could apply the downtime on its own via a global-template apply?

The strange part, though, is that the client tries to sync the config back to the master via the API. As I'm using top-down config mode, I can't see the point of this:

in the log of the master:

[2017-02-01 09:22:09 +0100] warning/ApiListener: Ignoring config update from 'icinga-client.vagrant.dev' for object 'icinga-client.vagrant.dev!puppet!icinga-server-1485935357-36' of type 'Downtime'. 'api' does not accept config.
[2017-02-01 09:22:09 +0100] warning/ApiListener: Ignoring config update from 'icinga-client.vagrant.dev' for object 'icinga-client.vagrant.dev!apt!icinga-server-1485934526-0' of type 'Downtime'. 'api' does not accept config.

The message appears for each downtime that is currently active / scheduled for the client and for which there is a file in: /var/lib/icinga2/api/packages/_api/icinga-client-1485934667-0/conf.d/downtimes

When accept_config is enabled on the master (which is not needed, afaik), the message changes from "Ignoring" to "Discarding". This tells me that it actually does not want or need this config update at all.

I never saw those messages in Icinga2 2.5, but I can't say for sure whether this was introduced in 2.6.

The actual problem is that it seems to slow down icinga2 restarts on the master dramatically, and it spams the log, as we have around 50 host and 3000 service downtimes (configured by Puppet for non-production systems).
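
To gauge how much of the master log these warnings take up, a small script along the following lines could tally them per client and object type. This is only a sketch: the regular expression is derived from the two warning lines quoted above, it assumes the "Discarding" variant has the same structure, and the function name `count_rejected_updates` is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the warning lines quoted in this issue. Assumption:
# the "Discarding" variant uses the same layout as the "Ignoring" one.
REJECTED_UPDATE = re.compile(
    r"warning/ApiListener: (?:Ignoring|Discarding) config update from "
    r"'(?P<endpoint>[^']+)' for object '(?P<object>[^']+)' "
    r"of type '(?P<type>[^']+)'"
)


def count_rejected_updates(lines):
    """Count rejected config updates per (endpoint, object type) pair."""
    counts = Counter()
    for line in lines:
        match = REJECTED_UPDATE.search(line)
        if match:
            counts[(match["endpoint"], match["type"])] += 1
    return counts


# The two log lines from this report, used as sample input.
sample = [
    "[2017-02-01 09:22:09 +0100] warning/ApiListener: Ignoring config update "
    "from 'icinga-client.vagrant.dev' for object "
    "'icinga-client.vagrant.dev!puppet!icinga-server-1485935357-36' "
    "of type 'Downtime'. 'api' does not accept config.",
    "[2017-02-01 09:22:09 +0100] warning/ApiListener: Ignoring config update "
    "from 'icinga-client.vagrant.dev' for object "
    "'icinga-client.vagrant.dev!apt!icinga-server-1485934526-0' "
    "of type 'Downtime'. 'api' does not accept config.",
]

for (endpoint, obj_type), n in count_rejected_updates(sample).items():
    print(f"{endpoint}: {n} rejected {obj_type} update(s)")
```

For a real log, the same function can be fed an open file object instead of the sample list, since it only iterates over lines.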

@dnsmichi
Contributor

dnsmichi commented Feb 2, 2017

Removing the 'discard' messages by changing the log level will happen with #4930.

@dnsmichi dnsmichi added the area/distributed Distributed monitoring (master, satellites, clients) label Feb 2, 2017
@dnsmichi
Contributor

dnsmichi commented Feb 2, 2017

Your messages should be silently discarded instead of checking accept_config beforehand. I'll fix that as part of #4930.

@dnsmichi dnsmichi added this to the 2.7.0 milestone Feb 2, 2017
@dnsmichi dnsmichi self-assigned this Feb 2, 2017
@dnsmichi
Contributor

dnsmichi commented Feb 2, 2017

Re-opening this issue as discussed with @Thomas-Gelf @lazyfrosch @lippserd - the original problem is that the satellite sends those downtime/comment messages to the parent and does not discard them locally.

@dnsmichi dnsmichi reopened this Feb 2, 2017
@marcofl
Author

marcofl commented Feb 2, 2017

Thank you for the quick response.

Why does a client or satellite even try to send config updates to the upper / parent cluster zone? From my understanding, in top-down config mode, there is only one way for configuration to flow - downwards. What did I miss here?

@dnsmichi
Contributor

dnsmichi commented Feb 2, 2017

Objects can be created at runtime, e.g. a Downtime object. Such a config object may still be synced to the top (at least it was planned to fix the current non-working behaviour). We agreed not to allow such objects to be sent to the parent zone members, and maybe to enable comment/downtime sync to the top later on with specific ACLs, if any.

@dnsmichi
Contributor

dnsmichi commented Feb 2, 2017

The issue with the non-working behaviour is described here: #3719

@dnsmichi dnsmichi removed this from the 2.7.0 milestone Feb 6, 2017
@dnsmichi
Contributor

dnsmichi commented Feb 6, 2017

TODO notes:

  • If there is no client directly connected (which happens inside OnActiveChanged) RelayMessage() must ensure not to send to all parents
  • Add a new param to RelayMessage() / SyncRelayMessage() which does not propagate the message to the parent zone of the local zone
  • Changes require runtime tests for any cluster message synced
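
The second TODO item can be pictured with a toy model of the zone tree. The sketch below is not Icinga 2's C++ implementation: the `Zone` class, the `relay_targets` function and the `include_parent` flag are hypothetical names, used only to show the idea of a relay parameter that stops a message from travelling to the parent zone of the local zone.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Zone:
    """Hypothetical, simplified zone: a name plus an optional parent zone."""
    name: str
    parent: Optional[str] = None


def relay_targets(zones: Dict[str, Zone], local: str,
                  include_parent: bool = True) -> List[str]:
    """Return the zones a message from `local` would be relayed to.

    Child zones are always targets. The parent zone is only a target when
    `include_parent` is true, which models the proposed new parameter.
    """
    targets = sorted(z.name for z in zones.values() if z.parent == local)
    parent = zones[local].parent
    if include_parent and parent is not None:
        targets.append(parent)
    return targets


# Example three-level setup: master -> satellite -> client.
zones = {
    "master": Zone("master"),
    "satellite": Zone("satellite", parent="master"),
    "client": Zone("client", parent="satellite"),
}

# Behaviour described in this issue: the parent zone is included.
print(relay_targets(zones, "satellite"))
# Proposed behaviour for runtime objects such as downtimes.
print(relay_targets(zones, "satellite", include_parent=False))
```

With the flag off, a client at the bottom of the tree has no relay targets at all, so no downtime messages would reach the master for it to reject.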

@marcofl
Author

marcofl commented Feb 7, 2017

Okay, thank you for clarification.

I can see the need to have icingaweb2 running on a satellite which uses the local commandpipe to create ACKs, Downtimes, etc. that get synced to the master zone and pop up in the icingaweb there.

Did I get this right, that ignoring those config updates was introduced in 2.4 and worked before (#3719)?

@dnsmichi
Contributor

dnsmichi commented Feb 7, 2017

Yep. Unfortunately, nothing has been changed or fixed ever since, so those config updates to parent zones remain useless for the time being.

@marcofl
Author

marcofl commented Mar 27, 2017

Hi @dnsmichi,
actually this is becoming a bigger problem for us. We currently have the issue that our master zone icinga2 dies from time to time (the process is still running, but there are no updates to the IDO and no checks are executed in the master zone, or at least we don't get results), and we don't know where this comes from. It could be related to #5093 or #5080, though.

We have 10000 downtimes or more because we put dev / setup-phase hosts and services into permanent downtime, which actually creates a downtime for every day for each host and service.

The logfile is nearly unusable due to those "discarding config update" messages when the icinga2 master zone starts.

@dnsmichi
Contributor

The logging should be gone in 2.6.x, the rest is an open todo.

@dnsmichi dnsmichi removed their assignment May 31, 2017
@dnsmichi dnsmichi added the bug Something isn't working label May 31, 2017
@syswombat

I have this error in the master log, where 'hugin-munin.kozo.ch' is the client, but I can't find the cause. Three other clients I set up work fine so far.

[2018-07-08 09:33:03 +0200] information/ApiListener: New client connection for identity 'hugin-munin.kozo.ch' from [::ffff:10.147.42.63]:55282 (no Endpoint object found for identity)
[2018-07-08 09:33:03 +0200] warning/ApiListener: Ignoring config update from 'hugin-munin.kozo.ch' for object 'hugin-munin.kozo.ch!load!hugin-munin.kozo.ch-1531030727-0' of type 'Downtime'. 'api' does not accept config.

@SimonHoenscheid

I have the same behavior in my logs

@dnsmichi
Contributor

Which versions are involved for both instances?

@SimonHoenscheid

2.9.1-1

@htriem
Contributor

htriem commented Jan 23, 2020

The bottom-up sync functionality will be required for future autodiscovery/inventory features and will remain in place for the time being. When we iterate on the aforementioned functionality, we will evaluate possible optimizations.

@htriem htriem closed this as completed Jan 23, 2020

5 participants