Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Support] Information about check_memory vendored into Debian's nagios-plugins-contrib #104

Closed
amotl opened this issue Mar 29, 2022 · 11 comments
Assignees
Labels

Comments

@amotl
Copy link

amotl commented Mar 29, 2022

Dear Davide,

first things first: Thanks a stack for conceiving and maintaining this collection of excellent check programs.

Introduction

Because this is absolutely not a bug report related to your software at all, please apologize for not using the issue template. It is merely a report where we want to share some observations and caveats when operating Icinga on Debian machines and upgrading to newer releases like Debian bullseye. In this manner, we are humbly asking you to label this issue with info, support or similar and close it right away if you see fit.

Observations

It looks like the nagios-plugins-linux Debian source package recently vendored your monitoring programs 1, effectively replacing its previous check_memory program. It happened with those commits:

@Flowdalic already reported that to the Debian bugtracker at #1005908:

In version 35 check_memory is a Perl script and the arguments are the low-water mark of free memory, after which an warning or critical result will be reported. In version 37 check_memory is an ELF and the arguments are the high-water mark of used memory after which a warning or critical result will be reported.
[...]
Hence, invoking the program with the previous parameters will yield different results depending on the installed version. This is clearly not desirable and would require that users need to adjust their monitoring setup to work around the behavior change.

So, effectively, for Debian users transitioning to nagios-plugins-contrib >= 36.20211207, the check semantics will be inverted. This is a somewhat unfortunate change for them.

Mitigations

As it is not really a bug report (your software works perfectly fine at all!) but more a support recipe for the community, our main aim is to share the steps we took to mitigate this problem on an Icinga instance monitoring systems running both old and new versions.

So, to all people using nagios-plugins-contrib or monitoring-plugins-contrib on Debian: The check_memory program by @madrisan has the --available option which should be used in this context of making the program's behavior compatible with the previous variant written in Perl. However, the program's --warning|--critical threshold control options will also need slight adjustments.

Effectively, if that command would have worked before:

check_memory --warning "20%" --critical "10%"

people should now use an invocation like that:

check_memory --available --warning "20%:" --critical "10%:"

Closing words

We hope you appreciate our report here despite the report should go to Debian instead. We will check how to respond to the original ticket at #1005908 properly, referencing this issue. If someone from the community can also do that, we will be more than happy.

With kind regards,
Andreas.

Footnotes

  1. https://salsa.debian.org/nagios-team/pkg-nagios-plugins-contrib/-/tree/master/madrisan-nagios-plugins-linux

@amotl amotl added the bug label Mar 29, 2022
@amotl
Copy link
Author

amotl commented Mar 29, 2022

After knowing about the different call signature, the challenge was to adjust the Icinga configuration appropriately to support both variants. As we didn't find a way to inquire the information at runtime which package version would be installed, like

$ dpkg-query --show --showformat '${Package} ${Version}\n' monitoring-plugins-contrib
monitoring-plugins-contrib 37.20220312

we decided to introduce and propagate another flag to from the host object to the service object and further to the command object. Based on this flag, the call signature to the check_memory program is adjusted at runtime.

Hereby, we are sharing the corresponding configuration snippets. Hopefully, they will be helpful to all people who are in need of them.

object Host "foobar.example.org" {

    vars.os = "Debian"
    vars.kernel = "Linux"

    // Newer shipments of `check_memory` have a different call signature.
    // Needed for {monitoring,nagios}-plugins-contrib >= 36.20211207.
    vars.memory_madrisan = true
}
apply Service "Main memory" {

  check_command = "memory"
  assign where host.vars.kernel == "Linux"
  ignore where host.vars.check_memory == false

  // Define thresholds.
  vars.warning = host.vars.memory_warning || "20%"
  vars.critical = host.vars.memory_critical || "10%"

  // Newer shipments of `check_memory` have a different call signature.
  // Needed for {monitoring,nagios}-plugins-contrib >= 36.20211207.
  vars.madrisan = host.vars.memory_madrisan
  if (vars.madrisan) {
    vars.warning  += ":"
    vars.critical += ":"
  }
}
object CheckCommand "memory" {
    /*
    Universal memory check command accounting for different 
    variants of »check_memory« installed on the system.

    Newer shipments of `check_memory` have a different call signature.
    Needed for {monitoring,nagios}-plugins-contrib >= 36.20211207.
    */
    import "plugin-check-command"

    command = [ "/usr/lib/nagios/plugins/check_memory" ]

    arguments = {
        "--available" = {
            description = "display the free/available memory"
            set_if = "$madrisan$"
        }
        "--warning" = {
            value = "$warning$"
            set_if = {{ if ( len(macro("$warning$")) != 0 ) { return 1 } else { return 0 } }}
            description = "Trigger warning when available memory is lower than X"
        }
        "--critical" = {
            value = "$critical$"
            set_if = {{ if ( len(macro("$critical$")) != 0 ) { return 1 } else { return 0 } }}
            description = "Trigger critical when available memory is lower than X"
        }
    }

}

@amotl
Copy link
Author

amotl commented Mar 29, 2022

Another option would have been not to swap in the program under the same name 1 while retiring the old one. In this case, the program would be installed as /usr/lib/nagios/check_madrisan_memory and could have been handled a bit better in transitioning scenarios. However, this suggestion is really meant to be addressed to the Debian maintainers.

Footnotes

  1. https://salsa.debian.org/nagios-team/pkg-nagios-plugins-contrib/-/commit/eadc785#51d93ced3dcde8185a2a9c39bea1a788328b46ee

@amotl
Copy link
Author

amotl commented Mar 30, 2022

Dear @Flowdalic,

we can see that you acknowledged our original post, thank you. Can we humbly ask you to forward the relevant information to the Debian issue tracker if you can find some minutes? Maybe it will be enough to refer to this ticket?

Thank you in advance and with kind regards,
Andreas.

@madrisan madrisan self-assigned this Mar 30, 2022
@madrisan madrisan added support and removed bug labels Mar 30, 2022
@madrisan
Copy link
Owner

Thanks for your report @amotl
I was not aware of this commit in Debian 👍
So, please tell me if I can do something for enhancing the compatibility with the previous Debian package.
Perhaps a compilation switch?

@amotl
Copy link
Author

amotl commented Mar 30, 2022

Dear Davide,

thank you for your kind reply on this issue. We just pinged @bzed and @waja about it at bzed/pkg-nagios-plugins-contrib#93, trying to condense the relevant information further.

We would like to have them and Peter Palfrader have their verdict what would be best for the Debian vendoring as we are not accustomed to the corresponding details. However, we wanted to share our thoughts on the two different aspects which make the new implementation of check_memory incompatible with the old one, from the perspective of an average Debian user, and how to improve the situation slightly.

1. Threshold value evaluation inversion

Perhaps a compilation switch?

We don't know about a good solution to the aspect of the --available option yet. While a compilation switch would be able to essentially invert the operation of your check_memory program, wouldn't that be very confusing for users in the long run to find your program installed on their machines in two different variants? Currently, we see two options.

1.1. Messy workaround

Maybe it will be sufficient to advise adding this snippet to the CheckCommand configuration section?

"--available" = {
    description = "display the free/available memory"
    set_if = "$madrisan$"
}

However, people who are not aware of this will experience "silent corruption".

1.2. Break things instead of screwing things up

Like mentioned at #104 (comment), a safer option would be to adjust the retirement commit bzed/pkg-nagios-plugins-contrib@eadc7856 to ship the program as /usr/lib/nagios/check_madrisan_memory like before, while still removing the old Perl implementation. At least this will signal users that something has changed instead of screwing up the sensor integrity.

2. Threshold value syntax

The other aspect are the additional : colon characters on the threshold values, currently needed by @madrisan's implementation. We compensated for this with that snippet of Icinga configuration initially shared at #104 (comment):

  // Newer shipments of `check_memory` have a different call signature.
  // Needed for {monitoring,nagios}-plugins-contrib >= 36.20211207.
  vars.madrisan = host.vars.memory_madrisan
  if (vars.madrisan) {
    vars.warning  += ":"
    vars.critical += ":"
  }

please tell me if I can do something for enhancing the compatibility

If it would fit the semantics of your program, would it be possible to make those colon characters optional? In this manner, we could reduce the footprint of the workaround at least slightly.

With kind regards,
Andreas.

@madrisan
Copy link
Owner

madrisan commented Apr 2, 2022

As described in the official nagios documentation, the two syntaxes are not equivalent: 20: means less than 20 while 20 would mean in our context greater than 20.
Note also that the range is parsed by the function parse_range_string which is the same coded in the nagios plugins.
So I cannot make the : optional.

@dnmvisser
Copy link

Hii, I am in the same boat, migration hosts from Debian 11 to Debian 12, and now the memory checks fail. But, with a little ternary shell logic things can work on both versions.

Instead of this:

/usr/lib/nagios/plugins/check_memory -w 20% -c 10%

we now use this:

/usr/lib/nagios/plugins/check_memory "$(( $(lsb_release -rs) < 12 )) && echo -w 20% -c 10% || echo -a -w 20%: -c 10%:"

We use the check_by_ssh wrapper, so in reality the command_line looks like this:

 command_line /usr/lib/nagios/plugins/check_by_ssh -q -t 60 -H '$HOSTADDRESS$' '/usr/lib/nagios/plugins/check_memory "$(( $(lsb_release -rs) < 12 )) && echo -w $ARG1$ -c $ARG2$ || echo -a -w $ARG1$: -c $ARG2$:"'

@madrisan
Copy link
Owner

Hi, thanks for the report. So weird. I'll check ASAP.

@madrisan
Copy link
Owner

madrisan commented Sep 26, 2023

Can you open a new issue please?

@dnmvisser
Copy link

Can you open a new issue please?

My comment/workaround was more meant for reference by others that start using Debian 12 and end up in the same situation. The problem is not so much with this/your script, but rather with the choice of Debian to move from one script to another (incompatible) script.

@waja
Copy link

waja commented Sep 26, 2023

The problem is not so much with this/your script, but rather with the choice of Debian to move from one script to another (incompatible) script.

It's not incompatible, but you might need to parameter it in another way. And yes, it requires to adjust your configuration.

Repository owner locked and limited conversation to collaborators Mar 27, 2024
@madrisan madrisan converted this issue into discussion #147 Mar 27, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
Projects
None yet
Development

No branches or pull requests

4 participants