Linux agent: Fixes for ntpq handling, for non-Ubuntu and non-systemd #116

Closed
rawiriblundell wants to merge 6 commits into Checkmk:master from rawiriblundell:linux_agent_ntp_fixes

Conversation

@rawiriblundell
Contributor

Hi,
cc8af1a does not resolve the issues I raised in October (see e.g. d17e58#r35653501). I have also noticed another issue: a service named 'ntp' is assumed, whereas on some distributions it may be named 'ntpd'. According to your contribution guidelines, you're building for Ubuntu, but that's somewhat short-sighted. In the commercial/enterprise world (i.e. potential Checkmk customers/users), the likes of RHEL and SUSE cannot be ignored.

In this PR, I propose some code that will hopefully resolve all of the issues raised. This should theoretically work across varied Linux distributions, whether they are newer and systemd-based, newer and not systemd-based, or legacy systems running another init system.
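
A minimal sketch of the systemd side of this idea, assuming the unit is called either ntp.service or ntpd.service as discussed above. The function name `ntp_service_active` and its return-code convention are hypothetical and not part of the PR, and `systemctl is-active --quiet` is used here in place of parsing `systemctl` output with awk.

```shell
# Hypothetical helper, not taken from the PR.
# Returns 0 if one of the candidate units is active,
#         1 if systemctl exists but neither unit is active,
#         2 if systemctl is not available (non-systemd host).
ntp_service_active() {
    command -v systemctl >/dev/null 2>&1 || return 2
    # Candidate unit names are an assumption based on the names in this thread.
    for _ntp_unit in ntp.service ntpd.service; do
        if systemctl is-active --quiet "$_ntp_unit" 2>/dev/null; then
            return 0
        fi
    done
    return 1
}

# Possible use inside the agent (inpath is the agent's existing helper):
#   if ntp_service_active && inpath ntpq; then ... fi
```

Returning a distinct code for "no systemctl" leaves the decision about non-systemd hosts to the caller rather than hiding it in the helper.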

As an example of the systemd lock-in, I spun up a CentOS 5.11 VM (there is, very sadly, a LOT of RHEL 5.11 in govt and enterprise):

[root@cent5 yum.repos.d]# bash /tmp/check_mk_agent -d 2>&1 | grep ntp
    zfs get -t filesystem,volume -Hp name,quota,used,avail,mountpoint,type 2>/dev/null
ntp                               23416  5048 00:00:00       06:01 10998 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root                              61220   728 00:00:00       00:01 11150 grep ntp
ntp                               23416  5048 00:00:00       06:01 10998 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root                              61220   728 00:00:00       00:01 11150 grep ntp
    timesync_status=$(systemctl status ntp | awk '{if(NR==3) print $2}')
        if inpath ntpq; then
            run_cached -s ntp 30 "waitmax 5 ntpq -np | sed -e 1,2d -e 's/^\(.\)/\1 /' -e 's/^ /%/' || true"

That's the output of ps and the script code. With the proposed code in place:

[root@cent5 yum.repos.d]# bash /tmp/check_mk_agent.dev -d 2>&1 | grep ntp
    zfs get -t filesystem,volume -Hp name,quota,used,avail,mountpoint,type 2>/dev/null
ntp                               23416  5048 00:00:00    04:37:16 10998 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root                              61224   820 00:00:00       00:00 12350 grep -i ntp
ntp                               23416  5048 00:00:00    04:37:16 10998 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root                              61224   820 00:00:00       00:00 12350 grep -i ntp
    # Identify whether ntp/ntpd/openntpd is active
    if [ "$(systemctl | awk '/ntp.service|ntpd.service/{print $3; exit}')" = "active" ]; then
        # If so, check for ntpq, if it's present, run it
        if inpath ntpq; then
            run_cached -s ntp 30 "waitmax 5 ntpq -np | sed -e 1,2d -e 's/^\(.\)/\1 /' -e 's/^ /%/' || true"
# If we're not on a systemd based host, try ntpq anyway
    if inpath ntpq; then
        run_cached -s ntp 30 "waitmax 5 ntpq -np | sed -e 1,2d -e 's/^\(.\)/\1 /' -e 's/^ /%/' || true"
+ inpath ntpq
+ command -v ntpq
+ run_cached -s ntp 30 'waitmax 5 ntpq -np | sed -e 1,2d -e '\''s/^\(.\)/\1 /'\'' -e '\''s/^ /%/'\'' || true'
+ local 'section=echo '\''<<<ntp:cached(1578969649,30)>>>'\'' ; '
+ '[' ntp = -m ']'
+ '[' ntp = -ma ']'
+ local NAME=ntp
+ local 'CMDLINE=echo '\''<<<ntp:cached(1578969649,30)>>>'\'' ; waitmax 5 ntpq -np | sed -e 1,2d -e '\''s/^\(.\)/\1 /'\'' -e '\''s/^ /%/'\'' || true'
+ CACHEFILE=/var/lib/check_mk_agent/cache/ntp.cache
+ '[' -e /var/lib/check_mk_agent/cache/ntp.cache.new ']'
+ '[' -s /var/lib/check_mk_agent/cache/ntp.cache ']'
++ stat -c %Y /var/lib/check_mk_agent/cache/ntp.cache
+ [[ ntp == local_* ]]
+ sed -e '/^<<<.*\(:cached(\).*>>>/!s/^<<<\([^>]*\)>>>$/<<<\1:cached(1578954544,30)>>>/' /var/lib/check_mk_agent/cache/ntp.cache
<<<ntp:cached(1578954544,30)>>>
+ '[' '!' -e /var/lib/check_mk_agent/cache/ntp.cache.new ']'
+ echo 'set -o noclobber ; exec > "/var/lib/check_mk_agent/cache/ntp.cache.new" || exit 1 ; echo '\''<<<ntp:cached(1578969649,30)>>>'\'' ; waitmax 5 ntpq -np | sed -e 1,2d -e '\''s/^\(.\)/\1 /'\'' -e '\''s/^ /%/'\'' || true && mv "/var/lib/check_mk_agent/cache/ntp.cache.new" "/var/lib/check_mk_agent/cache/ntp.cache" || rm -f "/var/lib/check_mk_agent/cache/ntp.cache" "/var/lib/check_mk_agent/cache/ntp.cache.new"'
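
For readers unfamiliar with the sed expression passed to run_cached in the trace above, the following sketch applies the same three expressions to a small hand-written sample. The function name `format_ntpq_peers` and the sample peer lines (documentation addresses from 192.0.2.0/24) are illustrative assumptions, not output captured from the VM.

```shell
# Hypothetical wrapper around the sed expressions used in the agent.
format_ntpq_peers() {
    # 1,2d          drop the two header lines printed by ntpq -np
    # s/^\(.\)/\1 / put a space after the first character (the peer tally code)
    # s/^ /%/       if that tally code was a blank, replace it with '%'
    sed -e 1,2d -e 's/^\(.\)/\1 /' -e 's/^ /%/'
}

# Hand-written sample input, for illustration only.
printf '%s\n' \
    '     remote           refid      st t' \
    '======================================' \
    '*192.0.2.10      .GPS.            1 u' \
    ' 192.0.2.12      .INIT.          16 u' | format_ntpq_peers
# Prints:
# * 192.0.2.10      .GPS.            1 u
# % 192.0.2.12      .INIT.          16 u
```

The effect is that the tally code always becomes its own whitespace-separated field, so peers without a tally character still yield the same number of columns.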

For reference, the current code run on a CentOS 7 VM:

[root@localhost tmp]# bash check_mk_agent.master -d 2>&1 | grep ntp
ntpd.service                                  disabled
[ extra stuff removed]
Unit ntp.service could not be found.

At present there is no additional logic for non-systemd hosts, i.e. ntpq is run unconditionally if it exists. For one possible approach to this, see section_ntp() here: https://github.com/rawiriblundell/checkMK/blob/merged_nix_agent/agents/check_mk_agent.merged#L1861
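
One possible shape for such a check on a non-systemd host, sketched under the assumption that the daemon's process name is ntpd (as in the ps output above). This is not the section_ntp() implementation linked above, and the function name `ntp_daemon_running` is hypothetical.

```shell
# Hypothetical helper, not taken from the PR or from section_ntp().
# Succeeds if a process whose name is exactly "ntpd" is running.
ntp_daemon_running() {
    if command -v pgrep >/dev/null 2>&1; then
        pgrep -x ntpd >/dev/null 2>&1
    else
        # Fallback for hosts without pgrep: list command names only
        # and look for an exact, whole-line match.
        ps -e -o comm= 2>/dev/null | grep -qx ntpd
    fi
}

# Possible use inside the agent (inpath and run_cached are the agent's
# existing helpers; the quoted command is elided here):
#   if ntp_daemon_running && inpath ntpq; then
#       run_cached -s ntp 30 "..."
#   fi
```

Matching on the exact process name avoids the false positive visible in the ps output above, where `grep ntp` itself shows up in the results.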

jonaskluger added the bug and Component: Checks & Agents labels on Jan 16, 2020
@jonaskluger
Contributor

Hello Rawiri,

thank you for your help with this fix!

We will have a closer look at your patch and discuss internally whether and how we can integrate it into the official code base.

Best
Jonas

Internal Ref: CMK-3605

Comment thread agents/check_mk_agent.linux Outdated
@anthonyh209
Contributor

Hi Rawiri,

I wanted to clarify a few things before I include your changes (see above). Thank you again for your suggestions. I had your comments from October in mind, but unfortunately they have disappeared.

Best,
Anthony

Comment thread agents/check_mk_agent.linux Outdated
@rawiriblundell
Contributor Author

This appears to have been resolved with ae6103c

rawiriblundell deleted the linux_agent_ntp_fixes branch on January 16, 2021 12:14