support for localhost-masquerading-as-node lacking, esp. in async mode #1415

Open
shallot opened this issue Aug 2, 2021 · 1 comment
shallot commented Aug 2, 2021

Hi,

For a few years now I've been using a set of Munin plugins from the MikroTik wiki, https://auth.mikrotik.com/wiki/Munin_Monitoring

Essentially, what these seem to do is define nodes with address 127.0.0.1, but then use the plugin link name to deduce the actual hostname to which to send the requests.

This is practically the same as what https://gallery.munin-monitoring.org/plugins/munin-contrib/snmp__mikrotik/ does: it parses the hostname to connect to out of $0, the script's own filename.
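To make the mechanism concrete, here is a minimal sketch of deriving the target host from the plugin link name. It is not the actual plugin code: the `snmp_<host>_<suffix>` naming pattern, the `LINK_PATTERN` regex, and the `host_from_link` helper are assumptions for illustration only.

```python
import os
import re
import sys

# Assumed link naming scheme: snmp_<host>_<suffix>.
# The real plugins may use a different pattern.
LINK_PATTERN = re.compile(r"^snmp_(?P<host>.+)_(?P<suffix>[^_]+)$")


def host_from_link(path):
    """Return the host embedded in a plugin link name, or None if absent."""
    name = os.path.basename(path)
    match = LINK_PATTERN.match(name)
    if match is None:
        return None
    return match.group("host")


if __name__ == "__main__":
    # sys.argv[0] plays the role of $0: the name the plugin was invoked under.
    print(host_from_link(sys.argv[0]))
```

With a link such as `/etc/munin/plugins/snmp_router1.example.com_mikrotik` (a made-up name), the helper would yield `router1.example.com`, while the node itself is still defined with address 127.0.0.1.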

This by and large works, but there's a significant problem when one of the remote nodes goes offline.
First, the code typically starts out using long default timeouts of e.g. 30s. This is practically untenable with a shorter update_rate (I used 60s). But even after you reduce the timeout significantly (I used 3s), the problem is still compounded by the fact that there are many separate small plugins generating individual graphs, instead of one big multigraph plugin that experiences network problems only once per session.
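As a rough illustration of the timeout point, the sketch below shows a plugin-side probe that gives up quickly when the remote host is down. It assumes a plain TCP connection; `is_reachable` is a hypothetical helper, the host and port are placeholders, and the 3s default just mirrors the value mentioned above. The real plugins may talk to the device quite differently.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Try one TCP connection and report whether it succeeded within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts alike.
        return False
```

Even with such a check, each small plugin still pays the timeout separately against a dead host, so N plugins can cost up to N times the timeout per run. That is the compounding effect described above.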

But I somehow made it work, and it was acceptable over a period of many years. Worst case, I had to mark the dead hosts with update no, and that would alleviate any issues. I don't think I ever filed this as an issue here because it seemed like just a fact of life.

However, I recently introduced munin-asyncd into the picture, and now the problem appears to be back, but worse - even when I make munin-update stop connecting to munin-node for the dead nodes, munin-asyncd on localhost keeps trying to contact them itself, and chokes.

It's worse than the original method, as the whole thing becomes so lagged, I actually get timeouts on async calls and lose data from localhost. I noticed this through the fact that localhost munin_stats and munin_update plugins went missing from munin-html output.

I actually had to move away /etc/munin/plugins/mikrotik* links for dead hosts in order to unclog that.

I think the original use case works better because munin-node recognizes the distinction between just list and list remote.host, but munin-asyncd seems to keep hammering everything defined on localhost, regardless of any exceptions in the master config.

Can something be done about this?

TIA

shallot commented Aug 2, 2021

One more thing. What makes this particularly weird is that the ~munin-async/ directory contains nothing saved about those remote host plugins, even for the good nodes.

Could it be that munin-asyncd is just doing something like config, realizing the hostname doesn't match, and skipping? But when config calls involve remote host calls, it still gets hung up on that?
