Proposal: Logging drivers #7195

Closed
crosbymichael opened this Issue Jul 23, 2014 · 93 comments

@crosbymichael
Contributor

crosbymichael commented Jul 23, 2014

Improved logging support

Topics:

  • Logging drivers
  • Initial logging drivers
  • Default driver improvements

Logging drivers

The driver interface should expose the smallest set of methods that logging drivers need to
implement their functionality. Stdout and stderr will still be the source of logging for containers
in this proposal. Docker will, however, take the raw streams from the containers and create discrete
messages delimited by writes. Each parsed message struct will then be sent to the logging drivers.

type Message struct {
    // ContainerID is the id of the container where the message originated
    ContainerID string

    // RawMessage is the raw bytes from the write
    RawMessage []byte

    // Source specifies where this message originated: stderr, stdout, syslog
    Source string

    // Time is the time the message was received
    Time time.Time

    // Fields are user defined fields attached to the message
    Fields map[string]string
}

type Driver interface {
    // Log records a single message; the message itself carries the container id and source stream
    Log(message *Message) error

    // ReadLog fetches the messages for a specific id
    ReadLog(containerID string) (messages []*Message, err error)

    // CloseLog tells the driver that no more log messages will be written for the specific id
    // drivers can implement this to their requirements, it may mean compressing the logs or deleting
    // them off of the disk
    CloseLog(containerID string) error

    // Close ensures that any writes for the logger are properly flushed and can be
    // stopped without data loss
    Close() error
}

When creating or initializing the drivers they will be provided with a key/value map with the user defined configuration specific to the driver. Each driver will also be provided a root directory where it is able to store and manage any type of state on disk.
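
As a rough illustration of the constructor shape described above, here is a minimal sketch of the "none" driver. The name NewNoneDriver, its signature, and the root path are assumptions made for this example; the proposal only says that a driver receives a key/value map and a root directory.

```go
package main

import (
	"fmt"
	"time"
)

// Message mirrors the struct from the proposal.
type Message struct {
	ContainerID string
	RawMessage  []byte
	Source      string
	Time        time.Time
	Fields      map[string]string
}

// Driver mirrors the interface from the proposal.
type Driver interface {
	Log(message *Message) error
	ReadLog(containerID string) ([]*Message, error)
	CloseLog(containerID string) error
	Close() error
}

// noneDriver is a hypothetical driver that drops every message.
type noneDriver struct {
	root string            // directory the driver may use for on-disk state
	opts map[string]string // user defined configuration for this driver
}

// NewNoneDriver is an assumed constructor: root directory plus options map.
func NewNoneDriver(root string, opts map[string]string) (Driver, error) {
	return &noneDriver{root: root, opts: opts}, nil
}

func (d *noneDriver) Log(*Message) error                 { return nil }
func (d *noneDriver) ReadLog(string) ([]*Message, error) { return nil, nil }
func (d *noneDriver) CloseLog(string) error              { return nil }
func (d *noneDriver) Close() error                       { return nil }

func main() {
	d, err := NewNoneDriver("/var/lib/docker/logging/none", nil)
	if err != nil {
		panic(err)
	}
	err = d.Log(&Message{
		ContainerID: "abc123",
		RawMessage:  []byte("hello\n"),
		Source:      "stdout",
		Time:        time.Now(),
	})
	msgs, _ := d.ReadLog("abc123")
	fmt.Println(err, len(msgs)) // nothing is stored, so no messages come back
}
```

Keeping the constructor outside the Driver interface lets each driver validate its own options before any container starts logging.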

Initial logging drivers

none - This driver will ignore the streams and log nothing for the containers. This is a totally valid
driver: because the docker daemon has to manage the logs for all containers, logging is a memory and
performance bottleneck on the daemon.

default - This driver will be the implementation of logs that docker currently has. It is a single
file on disk of JSON objects, each holding the message, timestamp, and stream of one log entry,
separated by newline characters.

syslog - This driver will write to a syslog socket and use the tag field to insert the container id.
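
To make the default driver's file layout concrete, here is a minimal sketch of writing and reading newline-separated JSON entries. The jsonEntry type and its field names ("log", "stream", "time") are assumptions for illustration, not a statement of the exact on-disk schema.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"time"
)

// jsonEntry is an assumed shape for one log entry in the file.
type jsonEntry struct {
	Log    string    `json:"log"`
	Stream string    `json:"stream"`
	Time   time.Time `json:"time"`
}

// encodeEntry renders one entry as a JSON object followed by a newline.
func encodeEntry(e jsonEntry) ([]byte, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

// readEntries decodes a sequence of newline-separated JSON objects.
func readEntries(data []byte) ([]jsonEntry, error) {
	var out []jsonEntry
	dec := json.NewDecoder(bytes.NewReader(data))
	for {
		var e jsonEntry
		err := dec.Decode(&e)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, e)
	}
}

func main() {
	var file []byte // stands in for the single log file on disk
	for _, e := range []jsonEntry{
		{Log: "listening on :8080\n", Stream: "stdout", Time: time.Now()},
		{Log: "connection refused\n", Stream: "stderr", Time: time.Now()},
	} {
		line, err := encodeEntry(e)
		if err != nil {
			panic(err)
		}
		file = append(file, line...)
	}
	entries, err := readEntries(file)
	fmt.Println(len(entries), err)
}
```

Because JSON escapes the newline inside each message, one entry always occupies exactly one line of the file.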

Default driver improvements

One of the biggest issues with the default driver is that there is no log truncation or rotation. Both of
these issues need to be addressed. We can either truncate based on filesize or date. I believe filesize
is better.

The truncation size can default to 10mb, with an option to override it when you select the driver.
Rotation can also be set to a specific size limit, defaulting to 500mb. To change the defaults I propose a
--logging-opt flag on the daemon, similar to --storage-opt for the storage drivers.
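
A minimal sketch of how a driver might turn these options into byte limits follows. The helper names, the accepted suffixes, and the choice of binary multiples are assumptions; the proposal does not say whether "mb" means 10^6 or 2^20 bytes.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sizeUnits is an assumed suffix table using binary multiples.
// Longer suffixes come first so "mb" is not mistaken for "b".
var sizeUnits = []struct {
	suffix string
	mult   int64
}{
	{"gb", 1 << 30},
	{"mb", 1 << 20},
	{"kb", 1 << 10},
	{"b", 1},
}

// parseSize converts strings such as "20mb" or "1gb" to a byte count.
func parseSize(s string) (int64, error) {
	s = strings.ToLower(strings.TrimSpace(s))
	for _, u := range sizeUnits {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.ParseInt(strings.TrimSuffix(s, u.suffix), 10, 64)
			if err != nil || n < 0 {
				return 0, fmt.Errorf("invalid size %q", s)
			}
			return n * u.mult, nil
		}
	}
	return 0, fmt.Errorf("invalid size %q: missing unit", s)
}

// limits holds the two thresholds named in the proposal.
type limits struct {
	Truncation int64
	Rotation   int64
}

// limitsFromOpts starts from the proposed defaults (10mb and 500mb)
// and applies any --logging-opt overrides.
func limitsFromOpts(opts map[string]string) (limits, error) {
	l := limits{Truncation: 10 << 20, Rotation: 500 << 20}
	var err error
	if v, ok := opts["truncation"]; ok {
		if l.Truncation, err = parseSize(v); err != nil {
			return l, err
		}
	}
	if v, ok := opts["rotation"]; ok {
		if l.Rotation, err = parseSize(v); err != nil {
			return l, err
		}
	}
	return l, nil
}

func main() {
	l, err := limitsFromOpts(map[string]string{"truncation": "20mb", "rotation": "1gb"})
	fmt.Println(l.Truncation, l.Rotation, err)
}
```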

Usage

The usage for this feature will be managed via the daemon:

docker -d --logging none
docker -d --logging default --logging-opt truncation=20mb --logging-opt rotation=1gb

LK4D4 Jul 23, 2014

Contributor

What about choosing the driver per container? For example, I don't want logs for elasticsearch, but I do want logs for prosody.

crosbymichael Jul 23, 2014

Contributor

A few questions that I have: what should we do about timestamps? Should the read return some type of structured data, or should we still manage this as just streams?

Any suggestions and modifications to this proposal are welcome.

cpuguy83 Jul 23, 2014

Contributor

I would still manage them as just streams and allow the implementation to handle it.

LK4D4 Jul 23, 2014

Contributor

Hm, actually I think that we already have a perfect interface for logging: io.Writer :) And for none we have ioutil.Discard.

brianm Jul 23, 2014

Contributor

I encourage being able to attach streams to syslog out of the box. While syslog may not be sexy in 2014, it works everywhere.

crosbymichael Jul 23, 2014

Contributor

@brianm would you be interested in working on a syslog driver for this initial push?

brianm Jul 23, 2014

Contributor

Happy to!

crosbymichael Jul 23, 2014

Contributor

@brianm sounds good to me. I like to have a few different drivers so that it keeps the interface honest and makes sure that we are accounting for different needs within the driver.

I'm guessing for things like syslog we will need to pass options when we create the driver. Maybe something like:

driver, err := syslog.NewDriver("/var/lib/docker/logging/syslog", map[string]string{
    "priority": "1",
    "socket": "/somepath",
})
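
A minimal sketch of what a driver could do with such an options map is shown below. The syslogConfig type, the parseSyslogOpts helper, and the default values are assumptions for this example; only the "priority" and "socket" keys come from the snippet above.

```go
package main

import (
	"fmt"
	"strconv"
)

// syslogConfig is a hypothetical typed view of the options map.
type syslogConfig struct {
	Priority int
	Socket   string
}

// parseSyslogOpts validates the user supplied options and rejects
// unknown keys so that typos surface when the driver is created.
func parseSyslogOpts(opts map[string]string) (syslogConfig, error) {
	cfg := syslogConfig{Priority: 6, Socket: "/dev/log"} // assumed defaults
	for k, v := range opts {
		switch k {
		case "priority":
			p, err := strconv.Atoi(v)
			if err != nil {
				return cfg, fmt.Errorf("invalid priority %q: %v", v, err)
			}
			cfg.Priority = p
		case "socket":
			cfg.Socket = v
		default:
			return cfg, fmt.Errorf("unknown syslog option %q", k)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := parseSyslogOpts(map[string]string{
		"priority": "1",
		"socket":   "/somepath",
	})
	fmt.Println(cfg.Priority, cfg.Socket, err)
}
```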

jamtur01 Jul 23, 2014

Contributor

+1 to syslog driver and config.

crosbymichael Jul 23, 2014

Contributor

I just added the syslog driver to the proposal

kuon Jul 24, 2014

Contributor

I am currently evaluating a gazillion ways of getting my logs to the right place with docker (app forwarding logs directly, agent in the container, agent in another container, agent on the host, syslog, ...), and having docker log directly to syslog would solve it easily. All apps could just use stdout/stderr.

As for per-container configuration, we should be able to give the sender name and the facility.

The other possibility is to turn off logging in docker and use systemd or supervisord to forward the logs of each container to syslog.

LK4D4 Jul 24, 2014

Contributor

In case someone missed my first comment:

  • Drivers should be configurable per container
  • Drivers should implement io.Writer, so we get a null writer and a syslog writer for free from the stdlib. This is the Go-ish way to do this.

crosbymichael Jul 24, 2014

Contributor

I can see the need for this to be per container, but configuration will be weird on the daemon running multiple logging drivers.

-1 on the io.Writer: we need to distinguish stdout and stderr in some of the drivers, so we cannot use one interface. The streams are already io.Writers coming in, so we are still good.
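
One way to reconcile the two positions is sketched below: each container stream stays an io.Writer, and a small adapter tags every write with its source before handing it to a message-based driver. The streamWriter type and newStreamWriters helper are hypothetical names for this illustration, and Message is a trimmed copy of the struct from the proposal.

```go
package main

import (
	"fmt"
	"io"
	"time"
)

// Message is a trimmed copy of the struct from the proposal.
type Message struct {
	ContainerID string
	RawMessage  []byte
	Source      string
	Time        time.Time
}

// streamWriter is a hypothetical adapter: it turns each Write on one
// container stream into a Message tagged with that stream's name.
type streamWriter struct {
	containerID string
	source      string
	log         func(*Message) error
}

func (w *streamWriter) Write(p []byte) (int, error) {
	raw := make([]byte, len(p))
	copy(raw, p) // an io.Writer must not retain p after returning
	err := w.log(&Message{
		ContainerID: w.containerID,
		RawMessage:  raw,
		Source:      w.source,
		Time:        time.Now(),
	})
	if err != nil {
		return 0, err
	}
	return len(p), nil
}

// newStreamWriters returns one writer per stream, both feeding the same sink.
func newStreamWriters(id string, log func(*Message) error) (stdout, stderr io.Writer) {
	stdout = &streamWriter{containerID: id, source: "stdout", log: log}
	stderr = &streamWriter{containerID: id, source: "stderr", log: log}
	return stdout, stderr
}

func main() {
	var got []*Message
	sink := func(m *Message) error { got = append(got, m); return nil }
	stdout, stderr := newStreamWriters("abc123", sink)

	fmt.Fprint(stdout, "started\n")
	fmt.Fprint(stderr, "warning\n")
	for _, m := range got {
		fmt.Printf("%s: %q\n", m.Source, m.RawMessage)
	}
}
```

In this sketch the sink function stands in for a driver's Log method, so a driver that needs the stream name reads it from Source while the container side keeps writing to plain io.Writers.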

crosbymichael Jul 24, 2014

Contributor

I updated the Driver interface to include CloseLog for signaling to the driver that no more logs will be written for a specific id.

LK4D4 Jul 24, 2014

Contributor

> we need to distinguish stdout and stderr

Yup, and I totally want the possibility of having different drivers for them.

markcartertm Jul 25, 2014

Should the logging driver support a syslog server option?
This will make it easier to troubleshoot when containers are dynamically assigned to hosts by an orchestration layer.
docker -d --logging syslog

cpuguy83 Jul 26, 2014

Contributor

@markcartertm Are you questioning the idea of including a syslog driver (which is much discussed above and part of the proposal) or did you not see the discussion?

solidsnack Jul 26, 2014

An implementation that chunked logs by time could be helpful for async log archiving strategies, for example if logs were stored by minute. I'm not sure how this would interact with the truncation option.

kuon Jul 28, 2014

Contributor

A first step that would allow the "usual" log processing systems to work is to make docker logrotate compliant. At present there is no way to make docker re-create the log files after a rotation without restarting the containers: kill -HUP on the docker daemon restarts all the containers.

crosbymichael Jul 30, 2014

Contributor

I think the last question here that needs to be answered is, should this be per container or a daemon wide option?

cpuguy83 Jul 30, 2014

Contributor

@crosbymichael People will want per container with a default set at the daemon level.
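
A minimal sketch of that precedence rule follows. The LogConfig type and the resolve helper are hypothetical names for this illustration; nothing in the thread fixes how per-container settings would be represented.

```go
package main

import "fmt"

// LogConfig is a hypothetical pairing of a driver name with its options.
type LogConfig struct {
	Driver string
	Opts   map[string]string
}

// resolve picks the container's logging config when it names a driver
// and falls back to the daemon-wide default otherwise. Options are not
// merged, since options for one driver may be meaningless for another.
func resolve(daemon LogConfig, container *LogConfig) LogConfig {
	if container == nil || container.Driver == "" {
		return daemon
	}
	return *container
}

func main() {
	daemon := LogConfig{Driver: "default", Opts: map[string]string{"truncation": "20mb"}}

	fmt.Println(resolve(daemon, nil).Driver)                            // falls back to the daemon default
	fmt.Println(resolve(daemon, &LogConfig{Driver: "none"}).Driver)     // container override wins
}
```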

kuon Jul 30, 2014

Contributor

Syslog configuration could be (in order of priority, from low to high):

  • Daemon level default
  • Image default
  • Container config

The configuration at the image level would obviously not include all options (like where to log) but rather what to log (stderr/stdout/both, include timestamp or not or add formatting).

crosbymichael Jul 30, 2014

Contributor

@kuon we cannot do anything at the image level because it makes images lose portability. Things like this should be host-specific/runtime dependencies.

wking Jul 30, 2014

On Wed, Jul 30, 2014 at 10:49:07AM -0700, Michael Crosby wrote:

> we cannot do anything at the image level because it makes images
> lose portability.

Setting defaults at the image level shouldn't compromise portability.
We already do this with other image metadata (e.g. via a Dockerfile's
CMD, ENTRYPOINT, EXPOSE, …).

cpuguy83 Jul 30, 2014

Contributor

@wking all these things you listed are things that happen inside the container.
Log handling happens outside the container.

crosbymichael Jul 30, 2014

Contributor

@wking Yes, they do when it's something specific about what type of logging drivers you have installed on a specific docker host. The settings in the image are all portable. This is why VOLUMES /home/michael:/root is not allowed in the image: not every docker host will have a folder structure with /home/michael.

kuon Jul 30, 2014

Contributor

I guess relying on environment variables (like -e LOGLEVEL=INFO and such) is OK for this use case. It was more an idea than a "thought through" proposal.

wking Jul 30, 2014

On Wed, Jul 30, 2014 at 10:59:23AM -0700, Michael Crosby wrote:

> Yes, they do when it's something specific about what type of logging
> drivers you have installed on a specific docker host.

Right, so which driver shouldn't be in the image config, but what
gets passed from the container to the logging interface should be.
@kuon's suggestions:

Wed, Jul 30, 2014 at 10:47:59AM -0700, kuon:

> The configuration at the image level would obviously not include all
> options (like where to log) but rather what to log
> (stderr/stdout/both, include timestamp or not or add formatting).

all sound portable to me.

shykes Jul 30, 2014

Collaborator

I would prefer the logging interface to be message-oriented. It turns out "logs as continuous byte streams" (i.e. stdout/stderr) are only a fraction of the total universe of logs out there. Basically every logging system, from syslog to splunk to loggly to systemd to logstash, expects discrete messages with a primary payload and key-value metadata attached to it. So I think that should be our logging primitive, and stdout/stderr should be chopped up and converted to discrete messages at the container boundary, before being ingested that way. Of course, along the way these chopped-up messages can be annotated with extra fields, for example "stream: stdout" or something similar.

I think our message format should define a strict schema, with reserved fields and a special area for flexible userdata fields.

crosbymichael Jul 30, 2014

Contributor

@shykes what is a "discrete" message when we are only dealing with stdout and stderr? Do we read until a \n and call that a message, or do we chunk by the Write that we get?

daniel-garcia Jul 30, 2014

Contributor

+1 to @shykes's comments. A scheme like that would allow containers to report all sorts of semi-structured data, such as white-box metrics. For example: metric_name=foo, ts=X, val=Z, tag1=baz

erikh Jul 30, 2014

Contributor

Logstash internally converts from input plugins to JSON and then uses output plugins to reformat to output to the desired store or transport. Perhaps we could use its model as a source of inspiration?

-Erik

unclejack Jul 30, 2014

Contributor

Dockerfile-level specification of the logging driver: -1
This configuration is set on each Docker daemon as needed; the Dockerfile shouldn't specify the logging driver. Doing so would break the reuse of the same image across all environments (dev vs. production and so on).

Daemon-level default logging driver: +1
Being able to specify the driver for each container: +1

crosbymichael Jul 30, 2014

Contributor

How are we supposed to get key-value format from this without having people modify their applications to support this?

daniel-garcia Jul 30, 2014

Contributor

Could the meaning of "discrete" for stdout/stderr be defined by a container-level option? The default driver could "do its best" by breaking on newlines, or writes, or whatever; the json logging already breaks up the messages today... just reuse that.

Why can't there exist a new interface at /dev/dockerlog that has a special meaning?

shykes Jul 30, 2014

Collaborator

@crosbymichael for each logging format commonly used by applications, we should have an adapter which does the translation to our internal logging format. As a start we can bundle the 3 most requested adapters: stdout/stderr, syslog, and files. Then perhaps later we can allow 3rd-party adapters.

The problem right now is that many applications commonly use syslog and files. If our internal logging API doesn't support those well, then it will be difficult for Docker logging drivers to receive all logs.

About the stdout/stderr adapter: I think we have the choice between 1) newline-split and 2) write-split. I think write-split makes more sense: it allows multi-line logs, as long as the application writes them in a single write(2) call. I would like to discuss this part more; I agree it's important.
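
The difference between the two options can be sketched as follows. The function names are hypothetical, and the slice of byte slices stands in for the sequence of write(2) calls an application makes.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitByWrite treats every write as exactly one message, so a
// multi-line payload written in one call stays together.
func splitByWrite(writes [][]byte) [][]byte {
	out := make([][]byte, 0, len(writes))
	for _, w := range writes {
		out = append(out, append([]byte(nil), w...)) // copy each write
	}
	return out
}

// splitByNewline ignores write boundaries: it joins the writes into one
// stream and cuts a message after every '\n'. A trailing partial line
// is emitted as a final message.
func splitByNewline(writes [][]byte) [][]byte {
	stream := bytes.Join(writes, nil)
	var out [][]byte
	for len(stream) > 0 {
		i := bytes.IndexByte(stream, '\n')
		if i < 0 {
			out = append(out, stream)
			break
		}
		out = append(out, stream[:i+1])
		stream = stream[i+1:]
	}
	return out
}

func main() {
	writes := [][]byte{
		[]byte("panic: boom\n  at main.go:10\n"), // one write, two lines
		[]byte("done\n"),
	}
	// Write-split yields 2 messages; newline-split yields 3.
	fmt.Println(len(splitByWrite(writes)), len(splitByNewline(writes)))
}
```

With write-split the stack trace in the first write arrives as a single message, which is the multi-line case described above.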

zepouet Aug 7, 2014

Hello,

I have read all the comments on this feature proposal and would like to add some of my own. As it happens, I have spent the last few weeks working on a tool to retrieve and analyse Docker's logs. To share where I am coming from: we use Docker for our homebrew PaaS solution, and log management is a critical function for us.

At first, we coded the whole logic to retrieve logs and expose them to users ourselves.
We were about to add new features when we decided to run a gap analysis of existing solutions.
We then decided to take a step back and select Logstash.

Our approach is to feed Logstash with the logs produced by our apps. We use either files or syslog (and its flavor syslog4jjappender for the Java apps). We then chose Elasticsearch to build a data warehouse of logs (sorting/indexing), irrespective of the containers.
Additionally, deleting a container won't delete its logs.

We believe the decisions on "stdout/stderr, file, or syslog" and on "runtime or Dockerfile configuration" are quite relevant,
and we considered several options.

My humble opinion is as follows:

  • We should let people (sysadmins or devops teams) use the tools they are familiar with (Kibana, Logstash, and so on; the list is open-ended).
  • They should not be asked to deal with the configuration of containers.
  • All those log reader tools already come with plugins to parse various log formats. It would be a tremendous amount of work to recode everything in Docker, even with a modular plugin approach.
  • We should let the developer or container architect decide which log files to retain, based on severity level (error, warn, info, debug, custom, ...), possibly statically, since developers are the people best placed to know where the relevant information and logs are.

We would therefore like to propose a new instruction for the Dockerfile:

LOGS[TAG]=
LOGS[DEBUG]=/var/log/apache2/access.log
LOGS[DEBUG]=/opt/tomcat/logs/valve-access.log
LOGS[INFO]=/opt/tomcat/logs/catalina.out
LOGS[ERROR]=/var/log/apache2/error.log
LOGS[ERROR]=/var/log/apache2/mod_jk.log
LOGS[ERROR]=/var/log/apache2/mod_rewrite.log
LOGS[ERROR]=/opt/tomcat/logs/error.log

The keywords used above (DEBUG, ERROR, ...) are free text (i.e. a folksonomy as opposed to a taxonomy).

People would therefore be free to add new keywords such as "AUDIT".

LOGS[AUDIT]=/opt/tomcat/logs/audits.log

Even better, they could gain access to an archive folder for rotated logs.

LOGS[ARCHIVES]=/archives/**/*.log [WILDCARD pattern inspired by ruby syntax]
  • The Docker API would allow retrieving all logs of a given container based on one or more tags (e.g. INFO + AUDIT). From there a client (be it a person or a piece of software) would query logs through an API call (REST API or socket).

  • The Docker log driver should be able to retrieve and list all tags attached to a container: "docker logs -list 1e4328"

    {
      "debug": [
        { "file1": "/var/log/apache2/access.log" },
        { "file2": "/opt/tomcat/logs/valve-access.log" }
      ],
      "info": [
        { "file3": "/opt/tomcat/logs/catalina.out" }
      ],
      "error": [
        { "file4": "/var/log/apache2/error.log" },
        { "file5": "/var/log/apache2/mod_jk.log" },
        { "file6": "/var/log/apache2/mod_rewrite.log" },
        { "file7": "/opt/tomcat/logs/error.log" }
      ],
      "audit": [
        { "file8": "/opt/tomcat/logs/audits.log" }
      ],
      "archives": [
        { "file9": "/archives/logs/catalina.1.log" },
        { "file10": "/archives/logs/catalina.2.log" }
      ]
    }
    
  • The Docker log driver should also be able to scroll backward through the files of a given tag with the command: "docker logs -tags:AUDIT+INFO 1e4328"

    • the last n lines, just as a tail command would. Option: --tail:n
    • only new lines since the last call. Option: --sinceLastCall
    • grep the lines directly, without returning all the information to the client:
      docker logs -grep 'mySpecificMessage' 1e4328
    • an option --json to retrieve a JSON-based file with extra information (file source/origin details + extra metadata). Default: false.
    • the driver should also be able to retrieve a file directly, e.g.
      docker logs -keys:file3 1e4328

Usage examples:
1 - Let's assume an admin who needs a list of all errors of a given app. They would launch:

docker logs -tags:ERROR 1e4328 --tail:500

They would receive the last 500 error lines, across apps, that the developer classified as relevant in case of errors.

2 - Let's now assume a software client.
It may call the Docker API on a regular basis with

 docker logs -tags:INFO 1e4328 --sinceLastCall 

It would receive only the new lines, to do whatever processing it needs (storage, triggering alerts, ...)
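
One way the --sinceLastCall behaviour could work is to remember how many bytes of a log file were already returned and, on the next call, return only what follows. The sketch below is an assumption about how this might be done, not a description of any Docker feature: it expects a plain append-only file, keeps the offset in a hypothetical state file, and new_lines_since_last_call is an illustrative name. Rotation or truncation is handled only by starting again from the beginning.

```shell
#!/bin/sh
# Hypothetical helper: print only the bytes appended to a log file since the
# previous call, tracking the last returned size in a small state file.
#   $1: log file to read
#   $2: state file holding the byte offset reached by the previous call
new_lines_since_last_call() {
  log_file=$1
  state_file=$2
  offset=0
  [ -f "$state_file" ] && offset=$(cat "$state_file")
  # wc may pad its output with spaces on some systems, so strip them.
  size=$(wc -c < "$log_file" | tr -d ' ')
  # If the file shrank (rotated or truncated), start again from the beginning.
  [ "$size" -lt "$offset" ] && offset=0
  if [ "$size" -gt "$offset" ]; then
    # tail -c +N starts at byte N (1-based); head -c stops at the size we
    # measured, so bytes written during this call are left for the next one.
    tail -c "+$((offset + 1))" "$log_file" | head -c "$((size - offset))"
  fi
  printf '%s\n' "$size" > "$state_file"
}
```

Because the bookkeeping is byte-based, a line that is still being written when the call happens may be split across two calls; a real implementation would probably cut at the last newline instead.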

My opinion is that log rotation and archiving should be the responsibility of the individual products that produce the logs.
Log4j and Logback are good examples in the Java world.

The core added value of Docker would be to focus on easy access to those tagged logs (through file, JSON or REST API). The static configuration in the Dockerfile could also be externalized (properties, JSON or XML format) and given at runtime for each container, so that an administrator could override the default Dockerfile configuration set by the developer/architect.

In the future we could imagine a message queue or an event broker (publish/subscribe) to inform external applications. Such a feature is not relevant to the current question; it is more general.

Thank you for reading :-) And thank you for the previous comments too!
Best regards,

Nicolas


" César - And this Paris, is it really that much bigger than Marseille?
M. Brun - In Paris, I saw at least fifty Canebières! "
(Marius - Scene 3, Marcel Pagnol, 1946 adaptation)

vbalazs Aug 11, 2014

@zepouet I think the feature you described shouldn't be a part of Docker; there are tools out there that already do similar things better. We shouldn't try to reimplement those.

There is a long-running conversation on the docker-dev list with several use cases and important points: https://groups.google.com/d/topic/docker-dev/3paGTWD6xyw/discussion

discordianfish Aug 11, 2014

Contributor

I agree with @LK4D4: per-container logging drivers would be very useful, since for anything with high log output (load balancers, caches, etc.; basically every service in a traffic tier that does per-request logging) this logging would probably become a bottleneck.

Besides that, people have different auditing requirements. It's common that you just can't lose logs, so it should be possible to not rotate logs and to stop the container rather than have it keep doing things it cannot log.

bwhaley Aug 20, 2014

@jdef in his comment, @shykes says phase 2 of a logging subsystem will handle syslog output via the same mechanism as stderr and stdout. So I see no harm in using -v /dev/log:/dev/log until then.

jdef Aug 20, 2014

Contributor

@bwhaley that's the plan. Though it can complicate using docker with various orchestration tools.

crosbymichael Aug 27, 2014

Contributor

I guess one of the last questions is if this should be per container or a daemon level option. @shykes what do you think?

shykes Aug 27, 2014

Collaborator

I would say we can start daemon-wide, and then add per-container later?

crosbymichael Aug 27, 2014

Contributor

Yes, I would feel more comfortable defining these types of settings on the daemon, at least initially until we can see how they are being used.

zjeraar Aug 29, 2014

I really like where this is going, and I totally agree with @shykes: let's first do this daemon-wide. I'd say per-container would be a nice-to-have for now.

For the syslog driver, I'd really like to see the container name in the tag field.

randywallace Aug 30, 2014

IMVHO I think a syslog plugin for docker is completely pointless. Syslog daemons that run on practically every distribution already support everything that has been discussed here. This comment outlines precisely why I feel this way. This is not meant as a flame, but as an alternative discussion that relieves some pressure off the docker devs and puts more responsibility on the Dockerfile maintainer where in this case I feel (again, IMVHO) it belongs.

I am not saying that the logging doesn't need some work; the piece about handling/rotating the stderr/stdout of the container itself is incredibly useful b/c for long-running containers pushing a lot of logs to those pipes results in the issues previously described regarding disk usage. This will at some point need to be solved, though, to cover the bevy of trusted builds that currently send everything to stderr/stdout.

configuring syslog output within the container

I find that the following options work beautifully (these should obviously be expanded):

  • Most apps themselves provide syslog output via configuration. If one doesn't, it should (and it probably can be set up but just isn't documented very well). This is especially true for Java apps via slf4j, logback, log4j, etc. Dockerfiles should modify/ADD the correct syslog daemon configuration endpoints. My example is for elasticsearch's logging config (this is for @LK4D4), usually found in config/logging.yml. The conversionPattern could be mangled by a startup script via sed, etc., to throw in the container id, hostname, or whatever you want (instead of elasticsearch:, or perhaps nothing if you are fine with just the hostname showing up in the syslog). Here is the relevant snippet (I didn't include the default console appender and level in this example):
rootLogger: INFO, syslog
appender:
  syslog:
    type: syslog
    header: true
    syslogHost: <THE_HOST_SYSLOG_DAEMON>
    Facility: USER
    layout:
      type: pattern
      conversionPattern: "elasticsearch: [%p] %t/%c{1} - %m"
  • A wrapper startup script that exec's out everything to logger. For non-daemonized processes running in a wrapper, this just magically works and handles stderr/stdout appropriately (written as a boilerplate that can be modified to run in a sourced file easily)
#!/bin/bash

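# `host` prints "syslog has address <ip>" separated by spaces, so take the 4th whitespace-separated field
if host syslog > /dev/null 2>&1; then HOST_SYSLOG_DAEMON=$(host syslog | head -n 1 | awk '{print $4}'); fi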

_enable_syslog=${SYSLOG:-true}
_host_syslog_daemon="${HOST_SYSLOG_DAEMON:-172.17.42.1}" # perhaps loaded by --env/-e; ${VAR:-DEFAULT} notation sets default if ENV variable does not exist
_unique_proc_name="randywallace/test_syslog_image"
_facility='local0' # or local1 thru local7, cron, user, etc...
_syslog_and_stdout_stderr=${TEE_OUTPUT:-false} # true/false; also could be a --env

# docker run -e TEE_OUTPUT=true -e HOST_SYSLOG_DAEMON=1.2.3.4 -d my_image
# or
# docker run --link my-rsyslog-container:syslog -d my_image
# or disabled completely
# docker run -e SYSLOG=false -d my_image

__logger() {
  local LEVEL=${1:-info}
  sed -u -r -e 's/\\n/ /g' -e 's/\s\-{3,}/;/g' -e 's/\-{3,}\s//g' |\
  /usr/bin/logger -p ${_facility}.${LEVEL}  -t "${_unique_proc_name}[$$]" -n "${_host_syslog_daemon}"
}

run_logger() {
  if $_enable_syslog; then
    if $_syslog_and_stdout_stderr; then
      tee -a >(__logger $1)
    else
      __logger $1
    fi
  else
    if [ "$1" = "err" ]; then
      cat >&2
    else
      cat
    fi
  fi
}

# Catch all STDOUT and STDERR traffic
exec > >(run_logger info) 2> >(run_logger err)

log() { echo "INFO: $*" | run_logger ; }
error() { echo "ERROR: $*" | run_logger err ; }
critical() { echo "EMERGENCY: $*" | run_logger emerg; exit 1 ; }
alert() { echo "ALERT: $*" | run_logger alert ; }
notice() { echo "NOTICE: $*" | run_logger notice ; }
debug() { echo "DEBUG: $*" | run_logger debug ; }
warning() { echo "WARN: $*" | run_logger warn ; }

log "info"
error "error"
alert "alert"
notice "notice"
debug "debug"
warning "warning"

echo "STDOUT output"
echo "STDERR output" >&2

critical "critical... exiting"

identifying the host to receive syslog traffic

  • use an ENV setting in the dockerfile (see wrapper example above) to indicate your preferred default syslog host. Or, for public Dockerfiles, use a default config (SYSLOG=false in the wrapper above) that is caught at startup to disable syslog output.
  • use a docker container with a volume on /var/log to the host (perhaps on /var/log/docker/syslog/ at the host) and a syslog daemon (I use rsyslog personally). Then EXPOSE the syslog port (514) and link that container to your other containers and specify that link alias in your wrapper (no need to specify the dynamic IP b/c it shows up in /etc/hosts, an example is given in the wrapper that uses 'syslog' for the link alias).
  • Use the actual host daemon, if there is one (boot2docker does not have a syslog daemon, so I use a container and volume). This defaults to 172.17.42.1 unless docker is configured differently, but I don't ever need to change that so I set this IP statically. It would be nice if the docker0 Bridge Gateway IP was configured in /etc/hosts on the containers so that I could specify that in cases in which I may need to change the bridge subnet or something. It may already be there, food for thought.
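
The three ways of choosing the syslog destination listed above can be folded into a single lookup order. The function below is a minimal sketch of that idea, not part of the wrapper as posted: resolve_syslog_host is a hypothetical name, the optional hosts-file argument exists only to make the lookup easy to try out, and the fallback address is the docker0 default quoted in the list.

```shell
#!/bin/sh
# Hypothetical helper: pick the syslog destination in this order:
#   1. an explicit HOST_SYSLOG_DAEMON environment variable,
#   2. a "syslog" link alias found in an /etc/hosts-style file,
#   3. the default docker0 bridge address mentioned above (172.17.42.1).
#   $1: hosts file to inspect (defaults to /etc/hosts)
resolve_syslog_host() {
  hosts_file=${1:-/etc/hosts}
  if [ -n "${HOST_SYSLOG_DAEMON:-}" ]; then
    printf '%s\n' "$HOST_SYSLOG_DAEMON"
    return 0
  fi
  if [ -r "$hosts_file" ]; then
    # Address of the first non-comment line naming "syslog"; stop scanning a
    # line at a trailing "#" comment so commented-out names are not matched.
    alias_ip=$(awk '$1 !~ /^#/ {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break
        if ($i == "syslog") { print $1; exit }
      }
    }' "$hosts_file")
    if [ -n "$alias_ip" ]; then
      printf '%s\n' "$alias_ip"
      return 0
    fi
  fi
  printf '%s\n' '172.17.42.1'
}
```

In the wrapper, the result could be assigned to _host_syslog_daemon in place of the host lookup, which would also remove the dependency on the host command being installed in the image.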

Profit

  • The hostname of the container shows up in the syslog in all cases. Why not set this to something useful when you run the container? If you're forwarding logs from syslog to logstash/splunk/etc., the IP of the forwarding syslog server will show up, so you can always identify where container X came from.
  • The syslog daemon does not have to exist on the same host as the container. Why fight with tailing /var/log/syslog on 10 docker hosts if you could do it on one?
  • You can use syslog daemon configs to do whatever you want with that stuff getting thrown at it to include

Conclusion

The problem of logging is not a new one, and I seriously doubt that docker could create enough plugins, command line options, etc. to satisfy everybody. This is why rsyslog, syslog-ng, syslogd, papertrail, logstash, graylog2, splunk, fluentd, etc. exist. We've already seen this battle start here, and I don't want to be around when the smoke clears. I hope what I've said here, though, may help some of you to come up with your own solutions that could be working today!

And, if you have problems with the container's logs getting too full (those that are generated from stderr/stdout), don't send them there at all and use my example wrapper above to get rid of that problem completely!

kuon Aug 30, 2014

Contributor

For what it's worth, I am now using systemd to launch containers and forward logs to syslog, together with logrotate in copytruncate mode. This works fine.

I am not arguing one way or the other, just saying that this setup works today and gives per-container configuration options.

LK4D4 Aug 30, 2014

Contributor

@kuon This is a good way, but docker's internal mechanism for writing logs to stdout is not perfect. If you write long lines or have a high flow of logs, you will get huge memory and CPU overhead just for writing logs to the container's stdout. So having native syslog support would be great anyway.

cpuguy83 Aug 30, 2014

Contributor

@randywallace The point is to do something with the collected logs that we already have in Docker. Instead of forcing people to implement something on their own, Docker can provide the facility to do it without having to hack stuff around (like running a syslog daemon inside your container).

ilowe Sep 9, 2014

I would like to echo the sentiment of @brianm and @zepouet and maybe suggest that there are really two discussions here:

The first one is, frankly, up to @crosbymichael and co. and it concerns how Docker handles an issue with the current implementation of logging. I think the guys are trying hard to come up with forward-facing solutions that will provide paths to new features, and I applaud that effort.

The second discussion, however, is being alluded to in previous comments; that is: is it appropriate for Docker to dictate a "logging framework"? No matter how many drivers "we" add, no matter how many options and config files, the tacit assumption becomes "everybody does logging like (or a subset of how) Docker does it".

The UNIX way is to do one thing and do it well. In the case of logging, this means let syslog do the work; for rotating logs we have logrotate, etc.

I think it would be better to have more intelligent ways to handle mounting and cross-mnt namespace access so that solutions described above like mounting /dev/log actually work. In the real world, I can't just drop all my YetAnotherSyslog code because I want to containerize something and I don't want to be a second-class citizen just because of that.

@randywallace provides an example of how easy it already is to set up logging using existing tools. I just think we're not thinking outside the box on how to provide a generally useful solution that also handles this case.

All of this is without mentioning the performance issues in containers at high load if the Docker daemon has to handle each packet. In high traffic situations, this is an absolute non-starter. We need to have access to kernel primitives at this point and a multi-layered userspace logging solution is going to force me to disable it and cobble my own each time.

As I said above, the more immediate decision is important for handling issues with log management. Of course, that should be solved. But I would hate to feel that Docker as an organization was wasting money and time building stuff I already have. The featureset so far is so out-of-the-box (both in terms of innovation and usability) that I really hate to see such a mundane and already-solved concern become the responsibility of the docker daemon.

To head off and forestall any other comments to the effect that "wouldn't it be nice if docker managed your logfiles" let me say that yes, it would. I would also like a built-in webserver so that I can launch my Node.js apps. I just don't feel that beyond the scope of improving the current stdout/stderr system this path leads anywhere but to having a whole group of people disabling docker log management and bending over backwards to use something else.

Of course, this should all be taken in the context of an assumption that the goal is to have advanced container technology that does its thing and otherwise lets you do what you want (i.e. maintaining neutrality). If docker is becoming a bit more of an "app hosting platform" where you can fully customize the inside and all the plumbing is handled for you, then this is definitely the way to go.

In case code speaks louder than words, I'm willing to work on a PR with a sketch of something if I get at least two people who will read it.

frank-dspeed Sep 9, 2014

I reviewed it all. I think the only thing that needs to happen is for the log files to become rotatable; anything else will probably break my existing setups. If I need logs from a process, I can gather them on the Docker host at the OS level with existing tools. I don't need Docker to handle that.

discordianfish Sep 15, 2014

Contributor

My 2 cents: It should be easy to get some default logging (i.e. something without that huge memory and CPU overhead) and possible (without unnecessary overhead) to integrate your own logging as @randywallace suggested, all of that as lean as possible. Therefore I wouldn't try to interpret the log stream and would just implement the bare minimum of features for the default logging (truncate, rotate and maybe some 'tailing' to get only the last x lines).

kuon Sep 15, 2014

Contributor

@frank-dspeed You can rotate docker logs with logrotate using copytruncate, see #7333

tve Sep 25, 2014

After reading all the comments, I still feel that Docker needs to do something to simplify logging. Some comments mentioned using -v /dev/log:/dev/log, but that apparently doesn't really work: the link breaks when syslog is restarted, because syslog creates a fresh /dev/log and leaves all running containers logging into a dead pipe. One can work around that by moving /dev/log to a directory of its own, such as /tmp/syslog/log, and doing -v /mnt/syslog:/dev (suggested in http://jpetazzo.github.io/2014/08/24/syslog-docker/), but now all containers share /dev.

Suggestions made by @randywallace don't help me at all, unless I'm missing something: many apps can't log to a remote syslog; they expect a local syslog device.

mhart Oct 8, 2014

Any news on this?

What @tve and others have said is very important for anyone using -v /dev/log:/dev/log – the link does indeed get broken if syslog restarts:

$ docker run -d -v /dev/log:/dev/log ubuntu sh -c 'while true; do logger hello; sleep 5; done'

$ tail -f /var/log/syslog
2014-10-08T04:04:17.009793+00:00 notice logger: hello
2014-10-08T04:04:22.014052+00:00 notice logger: hello
2014-10-08T04:04:27.018377+00:00 notice logger: hello
^C

$ sudo restart rsyslog

$ tail -f /var/log/syslog
... no new logs from docker container

Which makes this a very brittle solution...

afolarin Oct 20, 2014

Is there a reason I shouldn't just do this (as of v1.3) if I want to inspect logs container by container, log file by log file?

$ docker exec -it my-container cat /path/to/my.log

@wking referenced this issue Oct 23, 2014: NG: Logging #635 (Closed)

randywallace Oct 27, 2014

@mhart You shouldn't need to mount the log device. If your syslog daemon on the host is listening on the gateway of the docker bridge (172.17.42.1 by default), this should work just fine, even across host syslog daemon restarts:

docker run -d --name logger_test ubuntu:raring /bin/bash -c 'while true; do /usr/bin/logger -n $(grep default < <(ip route) | grep -Eo "([0-9]{1,3}[\.]){3}[0-9]{1,3}") -p user.info -P 514 -i -t logger_test -u /tmp/unused hello; sleep 1; done'

lennartkoopmann Oct 29, 2014

Graylog2 developer here. I don't have much practical experience with Docker yet, so I can't go into Docker configuration specifics, but I thought I'd join and leave a few comments based on my experience building a logging system over the last 4-5 years:

Keep it as simple as possible and hand off the logging to, for example, the local syslog subsystem as soon as possible. Tools like rsyslog or syslog-ng have spent enormous amounts of time on letting the user flexibly configure things. Simple tasks like choosing the facility are easy to build, but you can spend a lot of time implementing, for example, the different TCP syslog framing methods. Do not try to build anything yourself that the popular syslog daemons are doing already. They should be available on basically every platform that Docker runs on.

If you want to ship Docker with log management capabilities like basic search, archiving or live tailing, then use a log management system that already exists. Graylog2, for example, has REST APIs that can be built upon while all the data management is abstracted. Even implementing something like log rotation yourself can go wrong in many different ways and cause OS compatibility nightmares.

You will also want to think about avoiding a Docker log silo that only contains Docker logs. You need to have all your logs (network hardware, OS, applications) in one place for proper correlation.

Key=Value pairs are a good way to structure data. There is also GELF, which Graylog2, Logstash, fluentd and nxlog speak. Structured syslog as defined in RFC 5424 is probably the most compliant approach but could cause issues with maximum message length.

Just my suggestions. :)

maximkulkin Dec 10, 2014

Contributor

Hey guys, any volunteers to try the #9513 patch and provide some feedback?

crosbymichael Jan 30, 2015

Contributor

I think @LK4D4 said my current proposal here is a little too complex and he is looking into something much simpler.

LK4D4 Jan 30, 2015

Contributor

@crosbymichael Not very much simpler, but yes :) I'll prepare "proposal with code"

icecrime Mar 31, 2015

Contributor

I think this is closed by #10568.

varshneyjayant Jun 22, 2015

Can we configure it to send application logs to rsyslog on the host machine?

oncletom Jun 22, 2015

@psquickitjayant by using --log-driver=syslog :-) cf. https://docs.docker.com/reference/run/#logging-drivers-log-driver

varshneyjayant Jun 22, 2015

@oncletom Thanks for the information. As I understand it, this sends the container's stdout/stderr logs to the host syslog. Is it also possible to send logs from applications running inside the container, such as Apache, cron, Nginx, etc.?
