Services doesn't updates since +/- 10 days #3405

Closed
stylersnico opened this issue Apr 21, 2016 · 26 comments


@stylersnico
Contributor

Hi,

The "services" using Nagios plugins haven't updated for almost 10 days.

If I run this command, I get the following output, with all the correct information:

runuser -l librenms -c '/opt/librenms/check-services.php -d'

DEBUG!

SQL[SELECT * FROM devices AS D, services AS S WHERE S.device_id = D.device_id ORDER by D.device_id DESC]
Nagios Service - 36
Request: /usr/lib/nagios/plugins/check_ftp -H xxx
Perf Data - DS: time, Value: 0.026305, UOM: s
Response: FTP OK - 0.026 second response time on xxx port 21 [220 Lyon FTP server ready.]
Service DS: {
"time": "s"
}
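For readers wondering where the "Perf Data" line in the debug output comes from: Nagios plugins conventionally print a status text, then `|`, then items of the form `label=value[UOM];warn;crit;min;max`. Below is a minimal Python sketch of that split. The raw plugin line is hypothetical (the thread only shows the parsed result), and the sketch does not reflect how check-services.php is actually implemented.

```python
import re

# Matches one perfdata item: label=value[UOM]. The optional
# ;warn;crit;min;max thresholds that may follow are ignored.
PERF_ITEM = re.compile(
    r"(?P<label>[^=\s]+)=(?P<value>-?\d+(?:\.\d+)?)(?P<uom>[a-zA-Z%]*)"
)


def parse_plugin_output(line):
    """Split a Nagios plugin line into (response, {label: (value, uom)})."""
    response, _, perf = line.partition("|")
    metrics = {}
    for item in perf.split():
        match = PERF_ITEM.match(item)
        if match:
            metrics[match["label"]] = (float(match["value"]), match["uom"])
    return response.strip(), metrics


# Hypothetical raw plugin line: only the response text and the time value
# appear in the thread; the thresholds after the UOM are made up.
sample = (
    "FTP OK - 0.026 second response time on xxx port 21 "
    "[220 Lyon FTP server ready.]|time=0.026305s;;;0.000000;10.000000"
)
response, metrics = parse_plugin_output(sample)
print(response)
print(metrics)
```

Quoted labels containing spaces are not handled here; the sketch only covers the simple case shown in the debug output.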

But on the web interface I get this (an outdated message):

FTP OK - 0.025 second response time on xxx port 21 [220 Lyon FTP server ready.]

If I run:

./validate.php

I get this:

Version info:
Commit SHA: 6694acc
DB Schema: 112
PHP: 5.6.19-0+deb8u1
MySQL: 10.0.23-MariaDB-0+deb8u1
RRDTool: 1.4.8
SNMP: NET-SNMP version: 5.7.2.1

[OK] Database connection successful

Any idea why my service statuses on the web interface are stuck in time?

Thanks

@adaniels21487
Contributor

@stylersnico
Can you please post a screenshot?
What leads you to believe that "FTP OK - 0.025 second response time on xxx port 21 [220 Lyon FTP server ready.]" is an outdated message?

There were some major changes to the services infrastructure recently, so this may have caused the issue, but I'm not clear on what the issue is :)

@stylersnico
Contributor Author

Hi @adaniels21487,
Thanks for your answer.

Here is a clearer example.
I have a check on certificate expiration.

The CLI check shows the right expiration date.
But on the web interface nothing changes (you can check the correct expiry date on the website):

capture

Service checks only update if they have a state change or if I create a new check.
Apart from that, nothing updates.

@adaniels21487
Contributor

Thanks @stylersnico I see the issue now.
I will submit a PR shortly.

@laf
Member

laf commented Apr 25, 2016

Thanks @adaniels21487

@stylersnico
Contributor Author

Thanks, let us know if you have any update on this :) 👍

@adaniels21487
Contributor

@stylersnico this should be fixed now. Please let me know how it goes.

@laf laf closed this as completed Apr 25, 2016
@stylersnico
Contributor Author

Hi, services are updated now, but the services on my main dashboard still all show in red:
capture

@laf
Member

laf commented Apr 26, 2016

Have you updated?

@adaniels21487
Contributor

Ahh, the widget. I will get that sorted shortly.
Is there anything else you have spotted as broken?

@stylersnico
Contributor Author

stylersnico commented Apr 26, 2016

@laf yes, this morning.
@adaniels21487 No, not for the moment :)

@jmaliska

Hi, with the latest update I still see some things broken. The widget on the dashboard is OK now, but I still see the services as Down:
screenshot 01

In Alerts > Notifications, all services are shown as Down.

List of all services looks like this:
screenshot 02

And when I want to add a service on the device page, it looks like this:
screenshot 03
(also, there are no types in the Type menu)

@adaniels21487
Contributor

  1. Services Down - Does your install still think all the services are down (what does the dashboard say?), or are the services up but the alarms have not cleared? It sounds like the latter.
  2. Service List & Add Services - All looks fine on my install. Can you please send me the HTML source for your list services page? Something is not right here.

@stylersnico
Contributor Author

@adaniels21487 all the services are fine this morning.
Everything was working on the services page, but the widget on my dashboard was failing.

Now everything seems to work without any action on my side :)

Thanks again man, great job 👍

@adaniels21487
Contributor

Good to hear. Apologies for the inconvenience.

@jmaliska

@adaniels21487 the dashboard widget shows no services down, so it looks like only the alarms have not cleared.

Regarding the services HTML source, is it the file services.inc.php?

I already tried cloning the git repo to another folder and compared both using WinMerge, but I didn't find any differences.

@adaniels21487
Contributor

@jmaliska

Clearing the alerts - I am not a guru on the alerting system, but a quick look at my install shows that alerts.state in the db defines open alerts, you MAY be able to run the following SQL to clear all alerts:

update alerts set state=0 where state=1;

As mentioned, I am not terribly familiar with the alerting system, you should get a second opinion on IRC.

The source - something is getting into the HTML to screw it up. On the list services page that is broken, please right click, view source, pastebin it and send the link here.

@jmaliska

@adaniels21487
I tried the SQL update, but unfortunately it didn't help: LibreNMS created alerts for all services again.

The HTML source is here http://pastebin.com/25HEzDWN

@laf
Member

laf commented Apr 28, 2016

Post your alert rule for services. I expect it needs updating.

@jmaliska

@laf
For example SSH:
screenshot 02

@adaniels21487
Contributor

@jmaliska
Re: Alerts - @laf is on the money. The meaning of the status field changed during the update. It is now: 0 = Ok, 1 = Warning, 2 = Critical. I suspect you need to change:

From: %services.service_status != 1
To:   %services.service_status != 0
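To illustrate why the old rule raises alerts for healthy services under the new status codes, here is a small Python sketch. It models the two rule conditions as plain predicates; this is an assumption made for illustration, not how the LibreNMS alerting engine evaluates rules.

```python
# Status codes as described in the thread after the update.
STATUS_NAMES = {0: "Ok", 1: "Warning", 2: "Critical"}


def old_rule(service_status):
    """Pre-update rule from the thread: %services.service_status != 1."""
    return service_status != 1


def new_rule(service_status):
    """Updated rule from the thread: %services.service_status != 0."""
    return service_status != 0


for code, name in STATUS_NAMES.items():
    print(
        f"{code} ({name}): old rule alerts={old_rule(code)}, "
        f"new rule alerts={new_rule(code)}"
    )
```

With the old condition, a service in state 0 (Ok) satisfies `!= 1` and so keeps alerting, which would be consistent with alerts reappearing after they were cleared by hand.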

Re: HTML - I can see plenty of PHP in your HTML. If you are really seeing this in your browser, your server is not properly rendering the PHP.

@laf
Member

laf commented Apr 28, 2016

The php thing is odd - we don't use shorthand <? anywhere so this has got to be a local issue.

@jmaliska

@adaniels21487
Thanks, I updated the services alert rules and I don't see alerts in Alerts > Notifications anymore, but I still see them on the Dashboard, see screenshot.
screenshot 03

Regarding the PHP thing, I just tried a new installation of LibreNMS on a new Oracle Linux 6 template, but it is still the same. I don't know if it will help, but here is the version info from validate.php:
Version info:
Commit SHA: 940e98e
DB Schema: 114
PHP: 5.3.3
MySQL: 5.1.73-log
RRDTool: 1.3.8
SNMP: NET-SNMP version: 5.5
and Apache httpd version is: 2.2.15

@jmaliska

My colleague just helped me: setting short_open_tag to On in php.ini fixed the PHP issues.
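For anyone hitting the same symptom, a quick way to see what a php.ini file sets for this directive is sketched below in Python. The helper name and the sample php.ini excerpt are hypothetical, and the scan is deliberately simple: it ignores sections and only looks at `key = value` lines.

```python
def short_open_tag_enabled(ini_text):
    """Return the last short_open_tag value found in php.ini text, or None."""
    enabled = None
    for raw in ini_text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "short_open_tag":
            # php.ini treats On, 1, True and Yes as true for boolean directives.
            enabled = value.strip('"').lower() in ("on", "1", "true", "yes")
    return enabled


# Hypothetical php.ini excerpt, not taken from the thread.
sample_ini = """
; Allow the <? tag.
short_open_tag = Off
"""
print(short_open_tag_enabled(sample_ini))
print(short_open_tag_enabled(sample_ini.replace("Off", "On")))
```

Note that this only reads a file's text; the value PHP actually uses can also be overridden elsewhere (for example in per-directory or web server configuration).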

@jmaliska

jmaliska commented May 2, 2016

I'm just guessing, but on page html/pages/front/default.php, line 83 there is this select:
SELECT * FROM services AS S, devices AS D WHERE S.device_id = D.device_id AND service_status = 'down' AND D.ignore = '0' AND S.service_ignore = '0' AND D.status = '1' LIMIT

When I set service_status = '2', the Dashboard front boxes show only the services that are really down. When it is set to service_status = 'down', services that are not down are also shown as down.
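One plausible explanation for this observation, assuming service_status is now stored as an integer (per the 0/1/2 mapping given earlier in the thread): when MySQL compares a number with a string, it casts the string to a number, and a non-numeric string such as 'down' becomes 0. The comparison would then select the rows whose status is 0, i.e. Ok. The Python sketch below only approximates that cast to show the effect; the helper names and sample rows are hypothetical.

```python
import re

# Leading numeric prefix, roughly what MySQL uses when casting a string
# to a number for a comparison.
NUMERIC_PREFIX = re.compile(r"\s*[-+]?(\d+(\.\d*)?|\.\d+)([eE][-+]?\d+)?")


def mysql_like_number(text):
    """Approximate MySQL's string-to-number cast: non-numeric text becomes 0."""
    match = NUMERIC_PREFIX.match(text)
    return float(match.group(0)) if match else 0.0


def rows_matching(statuses, literal):
    """Emulate `WHERE service_status = '<literal>'` on an integer column."""
    return [s for s in statuses if s == mysql_like_number(literal)]


statuses = [0, 0, 2, 1]  # hypothetical rows: two Ok, one Critical, one Warning
print(rows_matching(statuses, "down"))  # selects the Ok rows
print(rows_matching(statuses, "2"))     # selects the Critical row
```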

@laf
Copy link
Member

laf commented May 2, 2016

Please create a separate issue for this.

@jmaliska

jmaliska commented May 2, 2016

Sure, I just did.
Thanks

@lock lock bot locked as resolved and limited conversation to collaborators May 20, 2018