New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LiveStatusQuery spam in icinga2.log #5953

Closed
kajla opened this Issue Jan 8, 2018 · 6 comments

Comments

Projects
None yet
3 participants
@kajla
Copy link

kajla commented Jan 8, 2018

Dear Developers,

I recently noticed that the Icinga2 LiveStatusQuery spam errors every minute to the icinga2.log file.

Current Behavior

...
[2018-01-08 13:13:02 +0100] critical/Socket: send() failed with error code 32, "Broken pipe"
[2018-01-08 13:13:02 +0100] critical/LivestatusQuery: Cannot write query response to socket.
[2018-01-08 13:13:02 +0100] critical/LivestatusQuery: Cannot write query response to socket.
[2018-01-08 13:13:02 +0100] critical/ThreadPool: Exception thrown in event handler:
Error: Tried to read from closed socket.

        (0) libbase.so.2.8.0: <unknown function> (+0xca801) [0x7f0ba1fca801]
        (1) libbase.so.2.8.0: icinga::NetworkStream::Read(void*, unsigned long, bool) (+0x75) [0x7f0ba1febbf5]
        (2) libbase.so.2.8.0: icinga::StreamReadContext::FillFromStream(boost::intrusive_ptr<icinga::Stream> const&, bool) (+0x58) [0x7f0ba1fb2268]
        (3) libbase.so.2.8.0: icinga::Stream::ReadLine(icinga::String*, icinga::StreamReadContext&, bool) (+0x74) [0x7f0ba1fb23a4]
        (4) liblivestatus.so.2.8.0: icinga::LivestatusListener::ClientHandler(boost::intrusive_ptr<icinga::Socket> const&) (+0x13a) [0x7f0b9ce274ea]
        (5) libbase.so.2.8.0: icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) (+0x365) [0x7f0ba20082c5]
        (6) libboost_thread.so.1.54.0: <unknown function> (+0xcc0a) [0x7f0ba2c34c0a]
        (7) libpthread.so.0: <unknown function> (+0x8744) [0x7f0ba1543744]
        (8) libc.so.6: clone (+0x6d) [0x7f0ba2f29aad]


[2018-01-08 13:13:02 +0100] critical/Socket: send() failed with error code 32, "Broken pipe"
[2018-01-08 13:13:02 +0100] critical/LivestatusQuery: Cannot write query response to socket.
[2018-01-08 13:13:02 +0100] critical/LivestatusQuery: Cannot write query response to socket.
[2018-01-08 13:13:02 +0100] critical/ThreadPool: Exception thrown in event handler:
Error: Tried to read from closed socket.

        (0) libbase.so.2.8.0: <unknown function> (+0xca801) [0x7f0ba1fca801]
        (1) libbase.so.2.8.0: icinga::NetworkStream::Read(void*, unsigned long, bool) (+0x75) [0x7f0ba1febbf5]
        (2) libbase.so.2.8.0: icinga::StreamReadContext::FillFromStream(boost::intrusive_ptr<icinga::Stream> const&, bool) (+0x58) [0x7f0ba1fb2268]
        (3) libbase.so.2.8.0: icinga::Stream::ReadLine(icinga::String*, icinga::StreamReadContext&, bool) (+0x74) [0x7f0ba1fb23a4]
        (4) liblivestatus.so.2.8.0: icinga::LivestatusListener::ClientHandler(boost::intrusive_ptr<icinga::Socket> const&) (+0x13a) [0x7f0b9ce274ea]
        (5) libbase.so.2.8.0: icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) (+0x365) [0x7f0ba20082c5]
        (6) libboost_thread.so.1.54.0: <unknown function> (+0xcc0a) [0x7f0ba2c34c0a]
        (7) libpthread.so.0: <unknown function> (+0x8744) [0x7f0ba1543744]
        (8) libc.so.6: clone (+0x6d) [0x7f0ba2f29aad]


[2018-01-08 13:13:02 +0100] critical/Socket: send() failed with error code 32, "Broken pipe"
[2018-01-08 13:13:02 +0100] critical/LivestatusQuery: Cannot write query response to socket.
[2018-01-08 13:13:02 +0100] critical/LivestatusQuery: Cannot write query response to socket.
[2018-01-08 13:13:02 +0100] critical/ThreadPool: Exception thrown in event handler:
Error: Tried to read from closed socket.

        (0) libbase.so.2.8.0: <unknown function> (+0xca801) [0x7f0ba1fca801]
        (1) libbase.so.2.8.0: icinga::NetworkStream::Read(void*, unsigned long, bool) (+0x75) [0x7f0ba1febbf5]
        (2) libbase.so.2.8.0: icinga::StreamReadContext::FillFromStream(boost::intrusive_ptr<icinga::Stream> const&, bool) (+0x58) [0x7f0ba1fb2268]
        (3) libbase.so.2.8.0: icinga::Stream::ReadLine(icinga::String*, icinga::StreamReadContext&, bool) (+0x74) [0x7f0ba1fb23a4]
        (4) liblivestatus.so.2.8.0: icinga::LivestatusListener::ClientHandler(boost::intrusive_ptr<icinga::Socket> const&) (+0x13a) [0x7f0b9ce274ea]
        (5) libbase.so.2.8.0: icinga::ThreadPool::WorkerThread::ThreadProc(icinga::ThreadPool::Queue&) (+0x365) [0x7f0ba20082c5]
        (6) libboost_thread.so.1.54.0: <unknown function> (+0xcc0a) [0x7f0ba2c34c0a]
        (7) libpthread.so.0: <unknown function> (+0x8744) [0x7f0ba1543744]
        (8) libc.so.6: clone (+0x6d) [0x7f0ba2f29aad]
...

Context

Spam errors to the log file.
Socket file:
srw-rw---- 1 icinga icingacmd 0 Jan 8 13:07 /var/run/icinga2/cmd/livestatus=

# file /var/run/icinga2/cmd/livestatus
/var/run/icinga2/cmd/livestatus: socket

Your Environment

  • Version used (icinga2 --version): r2.8.0-1

  • Operating System and version: SUSE Linux Enterprise Server 12 SP3

  • Enabled features (icinga2 feature list): api checker compatlog ido-mysql livestatus mainlog notification

  • Icinga Web 2 version and modules (System - About):
    businessprocess | 2.1.0
    director | 1.4.2
    doc | 2.5.0
    fileshipper | 1.0.1
    monitoring | 2.5.0
    translation | 2.5.0

  • Config validation (icinga2 daemon -C):

information/cli: Icinga application loader (version: r2.8.0-1)
information/cli: Loading configuration file(s).
information/ConfigItem: Committing config item(s).
information/ApiListener: My API identity: icinga2.lan
warning/ApplyRule: Apply rule 'SMTP' (in /var/lib/icinga2/api/packages/director/icinga2-1515164638-0/zones.d/director-global/service_apply.conf: 13:1-13:20) for type 'Service' does not match anywhere!
warning/ApplyRule: Apply rule 'SMTPS' (in /var/lib/icinga2/api/packages/director/icinga2-1515164638-0/zones.d/director-global/service_apply.conf: 24:1-24:21) for type 'Service' does not match anywhere!
information/ConfigItem: Instantiated 1 ApiListener.
information/ConfigItem: Instantiated 3 Zones.
information/ConfigItem: Instantiated 1 Endpoint.
information/ConfigItem: Instantiated 3 ApiUsers.
information/ConfigItem: Instantiated 1 FileLogger.
information/ConfigItem: Instantiated 1 NotificationComponent.
information/ConfigItem: Instantiated 1 UserGroup.
information/ConfigItem: Instantiated 212 CheckCommands.
information/ConfigItem: Instantiated 1 IcingaApplication.
information/ConfigItem: Instantiated 130 Hosts.
information/ConfigItem: Instantiated 6 HostGroups.
information/ConfigItem: Instantiated 5 TimePeriods.
information/ConfigItem: Instantiated 5 Users.
information/ConfigItem: Instantiated 6 ServiceGroups.
information/ConfigItem: Instantiated 314 Services.
information/ConfigItem: Instantiated 1 CheckerComponent.
information/ConfigItem: Instantiated 1 CompatLogger.
information/ConfigItem: Instantiated 1 LivestatusListener.
information/ConfigItem: Instantiated 1 IdoMysqlConnection.
information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
information/cli: Finished validating the configuration file(s).

/etc/icinga2/features-enabled/livestatus.conf file contents:

/**
 *  The livestatus library implements the livestatus query protocol.
 */

library "livestatus"

object LivestatusListener "livestatus" {
  socket_type = "unix"
  socket_path = "/var/run/icinga2/cmd/livestatus"
}

Thank you in advance.

@dnsmichi

This comment has been minimized.

Copy link
Member

dnsmichi commented Jan 8, 2018

If your client application does not close the socket, or wait for processing the response, such errors occur. This is a clear indication of something going wrong, nothing I would call "spam".

Kind regards,
Michael

@kajla

This comment has been minimized.

Copy link
Author

kajla commented Jan 8, 2018

Oh, I see.
I used Thruk to SLA reports. It needs a LiveStatus. Do you think it doesn't close this socket?
What can I do with it?

Thank you in advance.

@dnsmichi

This comment has been minimized.

Copy link
Member

dnsmichi commented Jan 8, 2018

I'd investigate on the time such queries take, extract what the application fires and debug that. You may enable Icinga 2's debug log to see more, but the rest really is network (sockets) and the remote application.

Cheers,
Michael

@Lali1006

This comment has been minimized.

Copy link

Lali1006 commented Jan 10, 2018

Hi,

I have same problem in Icinga 2.6.2, with Thruk 2.18.
My icinga2.log conatins a lot of Livestatus query errors:
[2018-01-09 12:36:57 +0100] critical/LivestatusQuery: Cannot write query response to socket.

I haven't socket and thread pool error in the log, just this line. I have got same result in the debug.log.

My Thruk collected some information about the problem:
[ERROR][Thruk] No Backend available
[2018/01/08 10:10:16][ERROR][Thruk] on page: https://*********/thruk/cgi-bin/status.cgi?host=all&servicestatustypes=28&_=1515399710886
[2018/01/08 10:10:16][ERROR][Thruk] All: ERROR: failed to connect - Connection refused. (*********:6557)
[2018/01/08 10:10:16][ERROR][Thruk] Error in: /thruk/cgi-bin/status.cgi
[2018/01/08 10:10:16][ERROR][Thruk] failed to connect - Connection refused at /usr/share/thruk/lib/Monitoring/Livestatus/Class/Lite.pm line 380.
at /usr/lib64/thruk/perl5/Plack/Util.pm line 142.
eval {...} called at /usr/lib64/thruk/perl5/Plack/Util.pm line 142
Plack::Util::run_app('CODE(0x2da2040)', 'HASH(0x3bf7db0)') called at /usr/lib64/thruk/perl5/Plack/Handler/FCGI.pm line 143
Plack::Handler::FCGI::run('Plack::Handler::FCGI=HASH(0x10ecc48)', 'CODE(0x2da2040)') called at /usr/share/thruk/script/thruk_fastcgi.pl line 24

My httpd error.log content:
broken pipe at /usr/share/thruk/lib/Thruk/Backend/Pool.pm line 30.
Thruk::Backend::Pool::ANON('PIPE') called at /usr/lib64/thruk/perl5/Plack/Handler/FCGI.pm line 161
eval {...} called at /usr/lib64/thruk/perl5/Plack/Handler/FCGI.pm line 161
Plack::Handler::FCGI::run('Plack::Handler::FCGI=HASH(0x207dc48)', 'CODE(0x3d33050)') called at /usr/share/thruk/script/thruk_fastcgi.pl line 24
[Mon Jan 08 08:53:28.588033 2018] [fcgid:warn] [pid 76590] mod_fcgid: cleanup zombie process 112971
ERROR: got signal TERM while handling request, possible timeout in https://***/thruk/cgi-bin/history.cgi?entries=100&start=2017-05-01+00:00:00&end=2018-01-09+00:00&archive=&host=&service=Filesys$
at /usr/share/thruk/lib/Thruk.pm line 543.

@kajla

This comment has been minimized.

Copy link
Author

kajla commented Jan 10, 2018

My Thruk version is same; 2.18.
We have different Icinga2 version with the same error. I think it's a Thruk issue.
Am I correct, @dnsmichi ?

Thank you.

@dnsmichi

This comment has been minimized.

Copy link
Member

dnsmichi commented Apr 5, 2018

I'd say so.

@dnsmichi dnsmichi closed this Apr 5, 2018

@dnsmichi dnsmichi removed the needs-feedback label Apr 5, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment