
[WIP, RFC] new logger: rawfifo (infra for plugin) #23198

Closed
wants to merge 1 commit

Conversation

AkihiroSuda (Member)

What I did

Added a new logger called rawfifo.

This can be used for implementing logging plugins outside the daemon process.

Related: #18604
Related: #18001 (comment)

How I did it

  • Introduced a new interface, RawLogger, which embeds Logger.
  • Converted Copier from a struct into an interface.
  • Introduced two implementations of Copier: MessageCopier and RawCopier.
    MessageCopier is identical to the former Copier.
    RawCopier differs from MessageCopier in that it does not try to detect '\n'; it just copies the raw log stream to an io.Writer returned from RawLogger.RawWriter(). (A minimal sketch of these interfaces follows this list.)
  • Introduced a rawfifo logging driver which implements RawLogger (and Logger as well).
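
For reference, here is a minimal sketch of the interfaces described above. The names follow this PR, but the exact signatures (and the trimmed Logger/Message stand-ins) are illustrative assumptions, not the PR's code:

```go
// Sketch only: Logger and Message are trimmed stand-ins for the
// existing daemon/logger types; signatures are assumptions.
package logger

import "io"

type Message struct {
	Line   []byte
	Source string // "stdout" or "stderr"
}

type Logger interface {
	Log(*Message) error
	Name() string
	Close() error
}

// RawLogger embeds Logger and additionally exposes a raw stream writer.
type RawLogger interface {
	Logger
	// RawWriter returns the io.Writer that RawCopier copies raw bytes
	// to, without scanning for '\n' or wrapping bytes in Messages.
	RawWriter(source string) (io.Writer, error)
}

// Copier, formerly a struct, becomes an interface with two
// implementations: MessageCopier (the old behavior) and RawCopier.
type Copier interface {
	Run()
	Wait()
	Close()
}
```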

How to verify it

Scenario 1: standard workload

Execute:

host$ docker run -it --log-driver rawfifo --log-opt="rawfifo-dir=/tmp/t1" alpine sh

Make sure you can execute a command in the container interactively.

container$ echo hello

Make sure the output is copied to the log:

host$ cat /tmp/t1/stdout

You may hit an issue when you try to stop the container; this will be fixed after the core design has been discussed.

Scenario 2: make sure infinite output does not lead to OOM

Execute:

host$ docker run -itd --log-driver rawfifo --log-opt="rawfifo-dir=/tmp/t2" alpine sh -c "while :; do date; done"

Wait for 5 or more minutes, then make sure there is no significant impact on CPU and RAM usage:

host$ for f in $(seq 1 $((5 * 60)));do uptime; free -mt; sleep 1; done

Make sure the output is copied to the log.
Note that some output was blocked while we were waiting for those several minutes.

host$ cat -n /tmp/t2/stdout | less
1: Thu Jun  2 07:00:00 UTC 2016
2: Thu Jun  2 07:00:00 UTC 2016
3: Thu Jun  2 07:00:00 UTC 2016
...
...
XXX0: Thu Jun  2 07:00:00 UTC 2016  <<- the pipe buffer was full at that time
XXX1: Thu Jun  2 07:05:00 UTC 2016
XXX2: Thu Jun  2 07:05:00 UTC 2016

Scenario 3: make sure infinite output without \n does not lead to OOM (see also #20600, #18057)

Execute:

host$ docker run -itd --log-driver rawfifo --log-opt="rawfifo-dir=/tmp/t3" alpine sh -c "cat /dev/zero"

Wait for 5 or more minutes, then make sure there is no significant impact on CPU and RAM usage:

host$ for f in $(seq 1 $((5 * 60)));do uptime; free -mt; sleep 1; done

Open the log, and make sure there is still no significant impact on CPU and RAM usage:

host$ cat /tmp/t3/stdout
host$ for f in $(seq 1 $((5 * 60)));do uptime; free -mt; sleep 1; done

Description for the changelog

new logger: rawfifo

A picture of a cute animal (not mandatory but encouraged)

https://upload.wikimedia.org/wikipedia/commons/4/40/Adelie_Penguins_on_iceberg.jpg

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>


Motivation: split the logging routine from the daemon process

I know this PR is controversial, and GELF is the standard way of connecting the Docker daemon to other logging software (#18001 (comment)).

But keeping complicated logging routines (including GELF) in the daemon itself can lead to performance and stability issues.

So I suggest making the in-daemon routines as small as possible with this rawfifo driver, and doing the substantial work outside the daemon process (i.e., in a plugin).

Since we can limit resource usage of external logging processes via cgroups, we can prevent them from crashing the daemon.

If the external logging process does not read from the fifo, writes to stdout are blocked, and the daemon does not need to busy itself buffering writes and waiting for '\n'.

Even if I cannot get positive comments on my implementation, I would like to hear how we can split loggers from the daemon in terms of performance and stability 😄

Tags

RawLogger does not support tags because it just copies the raw log stream from a container to a named pipe without appending metadata.

So it is recommended to specify a rawfifo-dir whose path contains the container name and other information instead:

docker run --log-driver=rawfifo --log-opt rawfifo-dir=/tmp/alpine/c1 --name c1 alpine

Timestamps

RawLogger does not support timestamps.
It is up to the external logging process to read logs as quickly as possible and assign timestamps itself.
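
To illustrate, a hypothetical external reader (not part of this PR) could consume the fifo from the Tags example above and assign its own timestamps:

```go
// Hypothetical out-of-daemon log reader; the fifo path matches the
// rawfifo-dir example above, everything else is illustrative.
package main

import (
	"bufio"
	"fmt"
	"os"
	"time"
)

func main() {
	// Opening the read end of a fifo blocks until the writer (the
	// daemon) opens the other end.
	f, err := os.Open("/tmp/alpine/c1/stdout")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	s := bufio.NewScanner(f)
	for s.Scan() {
		// The raw stream carries no metadata, so the reader decides
		// the timestamp format and value.
		fmt.Printf("%s %s\n", time.Now().Format(time.RFC3339Nano), s.Text())
	}
}
```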

TODO

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>
@thaJeztah thaJeztah added the status/needs-attention Calls for a collective discussion during a review session label Jun 2, 2016

@sirupsen (Contributor) commented Jun 2, 2016

Years ago when we designed our logging pipeline we decided not to use named pipes because they are fundamentally blocking unless you have a reader on the other side of the pipe, meaning that you lose log messages if that reader needs to restart, or worse, you tie up the Docker daemon. This needs to be extremely carefully engineered to handle a crashed reader at the other end of the pipe.

@dqminh (Contributor) commented Jun 2, 2016

@sirupsen hmm, iirc you should be able to make the named pipe non-blocking and change its size, and there are also some tricks that allow us to write to the pipe without any reader yet; maybe that helps with the concern?

Otherwise we could also look at the msg queue facility provided by the kernel as an alternative to this.
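
For context, the usual Linux trick for writing to a fifo before any reader attaches is to open it O_RDWR; a rough sketch of that trick (paths and flags are illustrative, and this is not code from the PR):

```go
// Sketch of writing to a fifo with no reader attached (Linux-specific).
package main

import "golang.org/x/sys/unix"

func main() {
	const path = "/tmp/t1/stdout"
	_ = unix.Mkfifo(path, 0o600)

	// O_RDWR on a fifo succeeds immediately on Linux; a plain O_WRONLY
	// would block (or fail with ENXIO under O_NONBLOCK) until a reader
	// opens the other end.
	fd, err := unix.Open(path, unix.O_RDWR|unix.O_NONBLOCK, 0)
	if err != nil {
		panic(err)
	}
	defer unix.Close(fd)

	// With O_NONBLOCK, writes fail with EAGAIN once the pipe buffer
	// fills instead of blocking, so the caller can drop or retry.
	_, _ = unix.Write(fd, []byte("hello\n"))
}
```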

@sirupsen (Contributor) commented Jun 2, 2016

We use sysvmq for this, yeah; it's actually awesome and often overlooked. The problem is that you'd have to disable the IPC namespace (--ipc with Docker) to use it well across containers.

That said, a named pipe can still be valuable to many people, but it should come with a strong disclaimer (unless there are ways to make it non-blocking and/or safe enough not to block the daemon logging thread).

@AkihiroSuda (Member, Author)

Thank you all for the comments.

Actually, even when there is no reader yet, the container can write some output to its stdout/stderr until the pipe buffer fills up (see also "Scenario 1" and "Scenario 2").

In Linux, a pipe's default capacity is 64 KiB (PIPE_BUF, the atomic-write limit, is 4096 bytes), and you can change the capacity using fcntl(F_SETPIPE_SZ).
Maybe we can dynamically increase the buffer size depending on how fast it fills up.
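
A quick sketch of resizing a pipe with F_SETPIPE_SZ via golang.org/x/sys/unix (values are illustrative):

```go
package main

import (
	"fmt"
	"os"

	"golang.org/x/sys/unix"
)

func main() {
	r, w, err := os.Pipe()
	if err != nil {
		panic(err)
	}
	defer r.Close()
	defer w.Close()

	// Request a 1 MiB capacity; the kernel rounds up to a power of two,
	// and unprivileged callers are capped by /proc/sys/fs/pipe-max-size.
	size, err := unix.FcntlInt(w.Fd(), unix.F_SETPIPE_SZ, 1<<20)
	if err != nil {
		panic(err)
	}
	fmt.Println("pipe capacity is now", size, "bytes")
}
```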

So I think this helps with the concern, as @dqminh mentioned.
No message loss or blocking will happen as long as the reader starts up immediately and serves stably.

However, we still have to address the reader crash issue as @sirupsen mentioned.

Approach 1: propagate SIGPIPE to the container program when the reader exited

Pro: This keeps the original UNIX semantics and does nothing particular to Docker.

Con: The process exits on SIGPIPE by default. Even though the approach keeps the original UNIX semantics, this may surprise users.

Approach 2: reopen the pipe in the daemon and wait for reader restart; propagate nothing to the container program

Pro: The container program keeps running when the reader crashes.

Con: Some log messages will be lost while the pipe is reopened. If the reader cannot restart for some reason, output will block once the buffer runs out. This may surprise users.
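
A rough sketch of what Approach 2 could look like on the daemon side (a hypothetical helper, not code from this PR):

```go
// Hypothetical package sketching Approach 2.
package rawfifo

import (
	"io"
	"os"
)

// writeWithReopen copies src into the fifo at path; when the reader
// disappears (write returns EPIPE), it reopens the fifo and continues.
// Bytes read while no reader was attached are dropped, which is
// exactly the Con described above.
func writeWithReopen(path string, src io.Reader) error {
	buf := make([]byte, 32*1024)
	for {
		// Blocks until an external reader (re)opens the fifo.
		f, err := os.OpenFile(path, os.O_WRONLY, 0)
		if err != nil {
			return err
		}
		for {
			n, rerr := src.Read(buf)
			if n > 0 {
				if _, werr := f.Write(buf[:n]); werr != nil {
					break // reader crashed: reopen and continue
				}
			}
			if rerr != nil {
				f.Close()
				if rerr == io.EOF {
					return nil
				}
				return rerr
			}
		}
		f.Close()
	}
}
```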

Approach 3: use other MQs

Each has its own pros and cons.

Approach 4: close this PR and find out other performance/stability isolation technique

For the OOM issue, several approaches are suggested in #18057.
Basically it should be handled with a trick for extremely long (or endless, say cat /dev/zero) lines.
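
For illustration, one such trick could bound how many bytes are buffered while waiting for '\n' and emit a partial message when the cap is hit (a hypothetical sketch, not code from #18057 or this PR):

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// copyLinesBounded emits complete lines, but never buffers more than
// max bytes: an endless line is emitted in max-sized partial chunks.
func copyLinesBounded(r io.Reader, max int, emit func([]byte)) error {
	br := bufio.NewReaderSize(r, max)
	for {
		line, err := br.ReadSlice('\n')
		if len(line) > 0 {
			emit(line) // partial (no '\n') when err == bufio.ErrBufferFull
		}
		switch err {
		case nil, bufio.ErrBufferFull:
			// keep going: next iteration continues the stream
		case io.EOF:
			return nil
		default:
			return err
		}
	}
}

func main() {
	// A 100-byte "line" copied with a 16-byte cap.
	r := strings.NewReader(strings.Repeat("x", 100) + "\n")
	_ = copyLinesBounded(r, 16, func(b []byte) { fmt.Printf("%q\n", b) })
}
```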

For the CPU usage issues (#21181, #19209), I am not sure how we can address them.

@AkihiroSuda (Member, Author)

@sirupsen Can I hear your opinion 😃?
Do you prefer sysvmq over POSIX mq?

@cpuguy83 (Member)

I definitely applaud the effort here, but I'm not sure this is the right direction.
I think we should work towards using the plugin system to allow out-of-process drivers.

@AkihiroSuda (Member, Author)

@cpuguy83
Yes, this is just a first step toward the plugin system.
The plugin API should be defined after deciding which MQ (named FIFO, POSIX mq, SysV mq, ...) to use.

So perhaps the rawfifo driver should not be exposed to the CLI.

@thaJeztah (Member)

OK, we discussed this in the maintainers session and think we should close this for now while we look for the right approach.

Thanks for contributing though!

@thaJeztah thaJeztah closed this Jul 14, 2016
@thaJeztah thaJeztah reopened this Jul 14, 2016
@thaJeztah (Member)

oops closed the wrong PR

@thaJeztah (Member)

darn, was the good one, sorry for the noise
