Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logger time and connection issues #363

Closed
mrcage opened this issue Oct 14, 2016 · 14 comments
Closed

logger time and connection issues #363

mrcage opened this issue Oct 14, 2016 · 14 comments

Comments

@mrcage
Copy link

mrcage commented Oct 14, 2016

Since version 2.26.2, we have troubles using logger in conjunction with other software, e.g. Apache 2.4 or Java-bases software using log4j syslog forwarder, to send logs to syslog-ng using logger via a pipe (as recommended in Apache HTTPD documentation).

As soon as your syslog-ng (3.7) gets restarted, no messages can be delivered via logger to syslog anymore, and I get errors like the following (in my Apache error_log file):

logger: write failed: Connection refused
logger: write failed: Transport endpoint is not connected
logger: write failed: Transport endpoint is not connected
logger: write failed: Transport endpoint is not connected
logger: write failed: Transport endpoint is not connected
logger: write failed: Transport endpoint is not connected

Our Java-baased application hanged up completely sometimes.

There is also an other isse in that the time of the log-message does not get updated. In the logfiles written by syslog, the timestamp is always the time at which the application which uses logger has been started.

Replacing the binary of logger with an older one (e.g. from 2.25.2) solves these issues.

Operating system: Gentoo Linux 64-bit
Version 2.28.2 solves the isse with the timestamps, but the issue with the connection is not resolved with the latest stable version...

@karelzak
Copy link
Collaborator

Yes, the issue with timestamp has been fixed by commit 7db029e. It's definitely better to use v2.28 than v2.26.

The issue with connection seems painful, but I need more information. Can you try strace the logger? Did you try another setup (--tcp / --udp, --server, etc)? What about --socket-errors=on.

Anyway, this works for me:

/etc/syslog-ng/syslog-ng.conf:

source s_kzak_tcp { tcp( ip(127.0.0.1) port(514) ); };
destination d_kzak { file("/var/log/kzak"); };
log { source(s_kzak_tcp); filter(f_default); destination(d_kzak); };

and command line:

echo "This is my logger message" | ./logger --tcp --server 127.0.0.1 --port 514

the log file:

cat /var/log/kzak 
Oct 18 12:29:22 127.0.0.1 1 2016-10-18T12:29:22.077898+02:00 ws kzak - - [timeQuality tzKnown="1" isSynced="1" syncAccuracy="526895"] This is my logger message

And it also works with --rfc3164 if you like the old style.

@karelzak
Copy link
Collaborator

Note that the default port is "syslog-conn", I use --port to be sure in my examples. The syslog version from the example above is syslog-ng-3.6.2-5.fc24.x86_64

@mrcage
Copy link
Author

mrcage commented Oct 19, 2016

Commands like this

echo "This is my logger message" | ./logger --tcp --server 127.0.0.1 --port 514

work without any problem; but the issue appears when I use logger from inside another program like apache or with the log4j library...

@mhdali
Copy link

mhdali commented Oct 31, 2016

We're seeing this error as well.

We have rsyslog running in different docker container, unix socket is linked between container like bellow.

The issue happened only when we restart rsyslog container, however when I send log from inside the container logger test it works fine, only the apache logger does NOT.

$ logger -V
logger from util-linux 2.25.2

Our setup

rsyslog:

docker run -v /opt/rsyslog/dev:/opt/rsyslog/dev rsyslog

NOTE: rsyslog unix socket is configured to different path on above container /opt/rsyslog/dev/log

apache:

docker run -v /opt/rsyslog/dev:/opt/rsyslog/dev apache entrypoint.sh

entrypoint.sh:

#!/bin/bash
ln -s /opt/rsyslog/dev/log /dev/log
apachectl -DFOREGROUND

Apache logging config:

ErrorLog "|/usr/bin/logger -S 25000 -t ads-production -p local6.err"
CustomLog "|/usr/bin/logger -S 25000 -t ads-production -p local6.notice" combined

Strace

Here is our strace output:

"X.X.X.X - - [31/Oct/2016:11:28:02 +0800] \"GET /server-status?auto HTTP/1.1\" 200 947 \"-\" \"-\" \"0/1566\" \"27.254.96.154:80\"\n", 4096) = 126
sendmsg(3, {msg_name(0)=NULL, msg_iov(2)=[{"<181>Oct 31 11:28:02 ads-production: ", 37}, {"X.X.X.X - - [31/Oct/2016:11:28:02 +0800] \"GET /server-status?auto HTTP/1.1\" 200 947 \"-\" \"-\" \"0/1566\" \"27.254.96.154:80\"", 125}], msg_controllen=0, msg_flags=0}, 0) = -1 ENOTCONN (Transport endpoint is not connected)
write(2, "logger: ", 8)                 = 8
write(2, "send message failed", 19)     = 19
write(2, ": ", 2)                       = 2
write(2, "Transport endpoint is not connected\n", 36) = 36
read(0, "X.X.X.X - - [31/Oct/2016:11:29:10 +0800] \"GET /server-status?auto HTTP/1.1\" 200 946 \"-\" \"-\" \"0/1631\" \"27.254.96.154:80\"\n", 4096) = 126
sendmsg(3, {msg_name(0)=NULL, msg_iov(2)=[{"<181>Oct 31 11:29:10 ads-production: ", 37}, {"X.X.X.X - - [31/Oct/2016:11:29:10 +0800] \"GET /server-status?auto HTTP/1.1\" 200 946 \"-\" \"-\" \"0/1631\" \"27.254.96.154:80\"", 125}], msg_controllen=0, msg_flags=0}, 0) = -1 ENOTCONN (Transport endpoint is not connected)
write(2, "logger: ", 8)                 = 8
write(2, "send message failed", 19)     = 19
write(2, ": ", 2)                       = 2
write(2, "Transport endpoint is not connected\n", 36) = 36
read(0, "X.X.X.X - - [31/Oct/2016:11:29:41 +0800] \"GET /server-status?auto HTTP/1.1\" 200 948 \"-\" \"-\" \"0/1680\" \"27.254.96.154:80\"\n", 4096) = 126
sendmsg(3, {msg_name(0)=NULL, msg_iov(2)=[{"<181>Oct 31 11:29:41 ads-production: ", 37}, {"X.X.X.X - - [31/Oct/2016:11:29:41 +0800] \"GET /server-status?auto HTTP/1.1\" 200 948 \"-\" \"-\" \"0/1680\" \"27.254.96.154:80\"", 125}], msg_controllen=0, msg_flags=0}, 0) = -1 ENOTCONN (Transport endpoint is not connected)
write(2, "logger: ", 8)                 = 8
write(2, "send message failed", 19)     = 19
write(2, ": ", 2)                       = 2
write(2, "Transport endpoint is not connected\n", 36) = 36
read(0, "X.X.X.X - - [31/Oct/2016:11:32:10 +0800] \"GET /server-status?auto HTTP/1.1\" 200 947 \"-\" \"-\" \"0/1692\" \"27.254.96.154:80\"\n", 4096) = 126
sendmsg(3, {msg_name(0)=NULL, msg_iov(2)=[{"<181>Oct 31 11:32:10 ads-production: ", 37}, {"X.X.X.X - - [31/Oct/2016:11:32:10 +0800] \"GET /server-status?auto HTTP/1.1\" 200 947 \"-\" \"-\" \"0/1692\" \"27.254.96.154:80\"", 125}], msg_controllen=0, msg_flags=0}, 0) = -1 ENOTCONN (Transport endpoint is not connected)

@mrcage
Copy link
Author

mrcage commented Nov 4, 2016

Any clue why this is happening? Seems I'm not the only one...

@dtjm
Copy link

dtjm commented Sep 6, 2017

I can reproduce this pretty easily with util-linux-2.29.2:

  1. Run rsyslog with config $AddUnixListenSocket /dev/log
  2. Open STDIN stream to logger: cat - | ./logger
  3. Send a message to logger by typing in STDIN with newline
  4. Stop rsyslog
  5. Send another line and get output from logger: logger: send message failed: Connection refused
  6. Further lines cause logger to print: logger: send message failed: Transport endpoint is not connected
  7. Restart rsyslog and logger still is unable to send lines to rsyslog.

@dtjm
Copy link

dtjm commented Sep 6, 2017

Found the offending commit (1d57503) using git bisect:

1d57503378bdcd838365d625f6d2d0a09da9c29d is the first bad commit
commit 1d57503378bdcd838365d625f6d2d0a09da9c29d
Author: Sami Kerola <kerolasa@iki.fi>
Date:   Sat Aug 9 00:49:46 2014 +0100

    logger: allow use of --id=ppid when logging locally

    There is no obvious way to make syslog(3) to print both pid or ppid, so
    duplicate the libc syslog() to logger.  Making the ppid printing work
    using unix socket has side effect of local becoming capable to use both
    rfc format output, which is hopefully seen as good thing.  The
    syslog_local() is format wise one-to-one copy with glibc syslog(3)
    format.

    Signed-off-by: Sami Kerola <kerolasa@iki.fi>

:040000 040000 f1c56a5556327f374fc074ab6a6ed5a36e1d4e8a 826bbe9d830ce41b5850c34789bb3a33901c7708 M	misc-utils

Seems like in that commit the code no longer uses the syslog syscall which will try to open the socket if it's not already open (https://linux.die.net/man/3/syslog), and I don't see any logic to re-open the socket in case it gets disconnected.

@vernade
Copy link

vernade commented Oct 6, 2017

We are having the exact same problem with using logger from within Apache 2.2.
Our logger version:
logger2 from util-linux 2.30
Our logging configuration:
CustomLog "| /usr/bin/logger -t apache_access -p local6.debug" ourlogformat
ErrorLog "| /usr/bin/logger -t apache_error -p local6.debug"

Apache uses piped logging and writes access and error logs line by line to the started logger binary. When rsyslog is restarted (CentOS 6), the logger loses the connection to the unix socket and logging from Apache stops.

Apache uses reliable logging, which means that the logging process is restarted on failure.

But as the logger does not fail as such (the process does not stop) and only writes errors to the console, Apache has no information that the logger has failed. It seems that stopping the logger would be appropriate, as the error seems unrecoverable - maybe the behaviour could be made configurable with a command line option (--stop-on-failure or similar). Apache would then be able to restart the logger correctly.

Another option would be for the logger to try to reestablish the connection to the unix socket when the connection is lost.

@karelzak
Copy link
Collaborator

I see the issue. The glibc syslog() re-connect on failed send(). I'll implement the same for logger.

Sorry for the delay.

karelzak added a commit that referenced this issue Oct 10, 2017
The libc syslog() reconnects on failed send(). We need the same thing
as logger(1) is expected as long time running tool. For example
recommended Apache configuration is:

	ErrorLog "| /usr/bin/logger -t apache_error -p local6.debug"

The issue is that connection endpoint (e.g. syslogd) maybe restarted.
The simple way how to test is:

	for i in $(seq 0 3600); do echo "This is message number $i"; sleep 1; done | logger --tcp --server 127.0.0.1 --port 514

and restart your syslog. The current implementation gets SIGPIPE or
write warning message, but it never reconnect.

Addresses: #363
Signed-off-by: Karel Zak <kzak@redhat.com>
@karelzak
Copy link
Collaborator

The problem should be fixed in the master branch (and will be in v2.31; next week).

It would be nice if you can test it in your environment before the release, just

  $  git clone git://git.kernel.org/pub/scm/utils/util-linux/util-linux.git
  $ cd util-linux
  $ ./autogen.sh
  $ ./configure --disable-all-programs --enable-logger
  $ make

... and use the "logger" binary.

Thanks (and again sorry for the delay).

@karelzak
Copy link
Collaborator

And also commit e92d55e is relevant for this issue.

@vernade
Copy link

vernade commented Oct 17, 2017

I tested the described issue with a new logger binary, it works as intended. If a Apache is running with a pipe to logger is and syslog is restarted, it continues to log correctly.

It can easily reproduced on the command line:

  • first shell: start logger (e.g. logger -u /dev/log)
  • second shell: tail -f /var/log/messages
  • first shell: type something - it appears in /var/log/messages
  • third shell: restart rsyslog
  • first shell: type something - it appears in /var/log/messages

The last test works with the new logger and fails with the old one.

@b10s
Copy link

b10s commented Oct 18, 2017

@vernade what do you mean under "old one"? Which version? And what do you mean under "new logger" as well?

@vernade
Copy link

vernade commented Oct 18, 2017

The "old logger" is: logger from util-linux 2.30
The "new logger"is: logger from util-linux 2.31-rc1 (compiled from current sources)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants