4.2.0 not reloading #158

Closed
PitoneMaledetto opened this issue Aug 30, 2016 · 8 comments

@PitoneMaledetto

PitoneMaledetto commented Aug 30, 2016

Hi, on the advice of the Nagios Core Support Manager I am referencing this thread to address an issue I am having with reloading the configuration on Nagios Core 4.2.0 (and previously on 4.1.1).
The reload does not work, forcing me to restart Nagios every time I change the configuration.
Please let me know if you need further details beyond those already given in the forum thread linked above.
Thanks

@jfrickson
Contributor

After exchanging a few messages on the support thread, I'm switching here.

@PitoneMaledetto run the following commands as root:

    rm /etc/systemd/system/multi-user.target.wants/nagios.service
    rm /etc/systemd/system/nagios.service
    systemctl daemon-reload

Then, edit /etc/init.d/nagios and starting at line 255, change this:

    reload|force-reload)
        if test "$checkconfig" = "true"; then
            printf "Running configuration check...\n"
            check_config
        fi

        if test ! -f $NagiosRunFile; then
            $0 start
        else
            pid_nagios
            if status_nagios > /dev/null; then
                printf "Reloading nagios configuration...\n"
                killproc_nagios HUP
                echo "done"
            else
                $0 stop
                $0 start
            fi
        fi
        ;;

to this:

    reload|force-reload)
        if test "$checkconfig" = "true"; then
            printf "Running configuration check...\n"
            check_config
        fi

        if test ! -f $NagiosRunFile; then
            printf "DEBUG - Running $0 start\n"
            $0 start
        else
            pid_nagios
            if status_nagios > /dev/null; then
                printf "DEBUG - killproc_nagios HUP\n"
                printf "Reloading nagios configuration...\n"
                killproc_nagios HUP
                echo "done"
            else
                printf "DEBUG - running $0 stop and $0 start\n"
                $0 stop
                $0 start
            fi
        fi
        ;;

Then run tail -f on nagios.log, run /etc/init.d/nagios reload, and tell me which DEBUG line gets printed and whether nagios.log shows a reload happening.
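
For anyone repeating this test, the DEBUG check can be scripted. The following is a minimal sketch, not part of the init script: the function name reload_debug_branch is hypothetical, and it only assumes the three DEBUG strings added in the patch above.

```shell
#!/bin/sh
# Classify the captured output of the patched init script by its DEBUG marker.
# Reads the output on stdin and prints one of: start | hup | stop-start | none
# (reload_debug_branch is a hypothetical helper name, not part of Nagios.)
reload_debug_branch() {
    out=$(cat)
    case "$out" in
        *"DEBUG - killproc_nagios HUP"*)    echo "hup" ;;
        *"DEBUG - running "*" stop and "*)  echo "stop-start" ;;
        *"DEBUG - Running "*" start"*)      echo "start" ;;
        *)                                  echo "none" ;;
    esac
}

# Possible usage, as root on the Nagios host:
#   /etc/init.d/nagios reload 2>&1 | reload_debug_branch
```

A result of "none" suggests the init script's reload branch never ran, for example because the call was redirected elsewhere (such as through systemctl).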

@PitoneMaledetto
Author

PitoneMaledetto commented Sep 1, 2016

Hi John,
I have made all the changes as instructed.
The reload still restarts the service rather than reloading it. I am not getting any DEBUG messages; I get this instead:

    root@dev-nagios-01:/usr/local/nagios/libexec# /etc/init.d/nagios reload
    [....] Reloading nagios configuration (via systemctl): nagios.serviceWarning: Unit file of nagios.service changed on disk, 'systemctl daemon-reload' recommended.
    . ok

nagios.log
    [1472721997] Caught SIGHUP, restarting...
    [1472721997] Event broker module 'NERD' deinitialized successfully.
    [1472721997] Nagios 4.2.0 starting... (PID=10034)
    [1472721997] Local time is Thu Sep 01 10:26:37 BST 2016
    [1472721997] LOG VERSION: 2.0
    [1472721997] qh: Socket '/usr/local/nagios/var/rw/nagios.qh' successfully initialized
    [1472721997] qh: core query handler registered
    [1472721997] nerd: Channel hostchecks registered successfully
    [1472721997] nerd: Channel servicechecks registered successfully
    [1472721997] nerd: Channel opathchecks registered successfully
    [1472721997] nerd: Fully initialized and ready to rock!
    [1472721997] wproc: Successfully registered manager as @wproc with query handler
    [1472721997] wproc: Registry request: name=Core Worker 11827;pid=11827
    [1472721997] wproc: Registry request: name=Core Worker 11825;pid=11825
    [1472721997] wproc: Registry request: name=Core Worker 11828;pid=11828
    [1472721997] wproc: Registry request: name=Core Worker 11826;pid=11826

The log shows a restart of the service, as per [1472721997] Caught SIGHUP, restarting...

P.S. force-reload behaves the same.

@PitoneMaledetto
Author

Just a thought.
systemctl status reports the service as running, but I did not get anything back from the reload. I am not sure whether a reload message should be printed on the terminal, so I can't be 100% sure about the daemon reload.
Moreover, systemctl stop nagios.service and systemctl start nagios.service do work when executed.

@jfrickson
Contributor

It actually looks like it's working now. When the log says Caught SIGHUP, restarting..., it's not really restarting. It is doing a reload at that point.

I recommend you run the systemctl daemon-reload command, make a change in the nagios config, do a reload, and see if the change is reflected in the UI. I strongly suspect it will work.
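
One way to confirm that the SIGHUP reached Nagios is to look for its log entry. This is a rough sketch only: last_sighup_time is a hypothetical helper, it assumes one log entry per line in the "[epoch] message" form shown earlier in this thread, and the log path in the usage comment is a guess based on the /usr/local/nagios/var paths in that log.

```shell
#!/bin/sh
# Print the epoch timestamp of the most recent "Caught SIGHUP, restarting..."
# entry in a Nagios log file, or nothing if there is no such entry.
# (last_sighup_time is a hypothetical helper name.)
last_sighup_time() {
    awk '
        /Caught SIGHUP, restarting/ { ts = $1 }          # remember "[epoch]"
        END {
            if (ts != "") print substr(ts, 2, length(ts) - 2)   # strip [ and ]
        }
    ' "$1"
}

# Possible usage (log path is an assumption):
#   last_sighup_time /usr/local/nagios/var/nagios.log
```

If the printed timestamp matches the moment the reload was issued, the signal was delivered; whether the configuration change took effect still has to be checked in the UI.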

@PitoneMaledetto
Author

Now I get:

    root@dev-nagios-01:/usr/local/nagios/etc/objects/cy/servers# /etc/init.d/nagios reload
    [ ok ] Reloading nagios configuration (via systemctl): nagios.service.

So I assume it is normal that all the hosts get re-scanned too?
I ask because I remember that in previous versions of Nagios Core only the changes in configuration were PENDING.

@jfrickson
Contributor

In your nagios.cfg file, check the values of the retain_state_information, state_retention_file, retention_update_interval, use_retained_program_state, use_retained_scheduling_info, and all the retained_*_mask variables. I suspect one or more of them is set incorrectly.
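
A quick way to list those values is a grep over the main config file. This is a minimal sketch: show_retention_settings is a hypothetical helper, and the path in the usage comment assumes nagios.cfg lives under /usr/local/nagios/etc.

```shell
#!/bin/sh
# Print the active (uncommented) state-retention directives from a nagios.cfg.
# (show_retention_settings is a hypothetical helper name.)
show_retention_settings() {
    grep -E '^[[:space:]]*(retain_state_information|state_retention_file|retention_update_interval|use_retained_program_state|use_retained_scheduling_info|retained_[a-z_]+_mask)[[:space:]]*=' "$1"
}

# Possible usage (path is an assumption):
#   show_retention_settings /usr/local/nagios/etc/nagios.cfg
```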

@PitoneMaledetto
Author

Indeed it was:

Generic host definition

    define host{
        name                            generic-host    ; The name of this host template
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       0               ; Retain status information across program restarts
        retain_nonstatus_information    0               ; Retain non-status information across program restarts
        notification_period             24x7            ; Send host notifications at any time
        check_command                   check-host-alive
        register                        0               ; DONT REGISTER THIS DEFINITION
    }
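
Other templates may carry the same setting. As a rough sketch, a recursive grep can locate object definitions where retention is switched off; find_retention_disabled is a hypothetical helper, the directory in the usage comment is an assumption, and grep -r is a common extension rather than strict POSIX.

```shell
#!/bin/sh
# List file:line:text for every object directive that disables retention,
# i.e. retain_status_information or retain_nonstatus_information set to 0.
# (find_retention_disabled is a hypothetical helper name.)
find_retention_disabled() {
    grep -rnE '^[[:space:]]*retain_(non)?status_information[[:space:]]+0([[:space:];]|$)' "$1"
}

# Possible usage (directory is an assumption):
#   find_retention_disabled /usr/local/nagios/etc/objects
```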

To close this ticket, would you say that removing these files:

    rm /etc/systemd/system/multi-user.target.wants/nagios.service
    rm /etc/systemd/system/nagios.service

and then reloading the daemon did the trick?

@jfrickson
Contributor

I would say yes. Most probably, those files were the cause.
