If a remote TCP syslog server is down, docker does not start. #21966

Open
tecnobrat opened this Issue Apr 12, 2016 · 38 comments

Comments

@tecnobrat

tecnobrat commented Apr 12, 2016

If you try to start a container with the syslog log driver pointed at a remote TCP syslog endpoint that is currently unavailable, the container will fail to start.

This is especially problematic in a production environment where you are trying to ship logs to a central place. You could have network issues reaching your syslog server, and this will stop your production apps from running. I would say that having logs be delayed is preferable to not being able to start your service.

If a node is unresponsive:
$ docker run --log-opt syslog-address=tcp://google.com:1212 --log-driver=syslog nginx
(wait about a minute)
docker: Error response from daemon: Failed to initialize logging driver: dial tcp 216.58.216.142:1212: getsockopt: connection timed out.

Or if it flat out rejects the connection:
docker run --log-opt syslog-address=tcp://localhost:1212 --log-driver=syslog nginx
docker: Error response from daemon: Failed to initialize logging driver: dial tcp 127.0.0.1:1212: getsockopt: connection refused.

In both of these cases, your container doesn't launch. We could use UDP, which would avoid this problem, but then any intermittent network issue means you lose logs entirely.
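For comparison, the UDP variant of the same command (the address here is a placeholder) does start the container even when the collector is unreachable, since no connection handshake is needed; delivery is simply best-effort:

$ docker run --log-driver=syslog --log-opt syslog-address=udp://logs.example.com:514 nginx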

I would suggest that Docker still start the container, but cache the logs internally and keep attempting to connect and flush them, instead of failing outright.

Output of docker version:

Client:
 Version:      1.10.3
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   20f81dd
 Built:        Thu Mar 10 21:49:11 2016
 OS/Arch:      darwin/amd64

Server:
 Version:      1.10.3
 API version:  1.22
 Go version:   go1.5.3
 Git commit:   20f81dd
 Built:        Thu Mar 10 21:49:11 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 223
 Running: 15
 Paused: 0
 Stopped: 208
Images: 552
Server Version: 1.10.3
Storage Driver: aufs
 Root Dir: /mnt/sda1/var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 1037
 Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
 Volume: local
 Network: null host bridge
Kernel Version: 4.1.19-boot2docker
Operating System: Boot2Docker 1.10.3 (TCL 6.4.1); master : 625117e - Thu Mar 10 22:09:02 UTC 2016
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 3.858 GiB
Name: default
ID: WIKN:KOFK:L2P5:I5VU:JGGX:UPHB:4PCB:L7FC:PX3N:F42A:XC4N:AEGA
Debug mode (server): true
 File Descriptors: 137
 Goroutines: 215
 System Time: 2016-04-12T20:46:33.827097672Z
 EventsListeners: 1
 Init SHA1:
 Init Path: /usr/local/bin/docker
 Docker Root Dir: /mnt/sda1/var/lib/docker
Username: tecnobrat
Registry: https://index.docker.io/v1/
Labels:
 provider=virtualbox
@sandrokeil

sandrokeil commented Apr 21, 2016

It would also be useful to use a file buffer instead and send the logs to the syslog server once it is available again.

@mlaventure

Contributor

mlaventure commented Apr 21, 2016

Duplicate of #22129 (although the trigger is different, the reasoning is the same).

@thaJeztah

Member

thaJeztah commented Apr 21, 2016

@mlaventure the other way around; this was reported before 😄

@lacarvalho91

lacarvalho91 commented May 13, 2016

I'm also having this problem; is there something in the works for this?

@programmerq

Contributor

programmerq commented May 13, 2016

$ docker version
Client:
 Version:      1.11.1
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   5604cbe
 Built:        Wed Apr 27 00:34:20 2016
 OS/Arch:      darwin/amd64

Server:
 Version:      1.11.1
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   8b63c77
 Built:        Tue May 10 10:39:20 2016
 OS/Arch:      linux/amd64

$ docker run --log-opt syslog-address=tcp://google.com:1212 --log-driver=syslog nginx:alpine
docker: Error response from daemon: Failed to initialize logging driver: dial tcp 209.141.120.170:1212: getsockopt: connection refused.

1.11 still has similar behavior, though it took longer than the originally described "about a minute".

@michaelwilde

michaelwilde commented May 25, 2016

It appears any logging driver (including Splunk) has an issue if a TCP connection is required and the remote node is not available. It would be preferable for the container to launch and for some modicum of retry to be available, so services are not affected. I'm not exactly sure what happens to the container if it starts and then the remote log receiver goes down... I'd hope the container doesn't die.

@gbolo

gbolo commented May 27, 2016

Any update on this? We need this fixed for tcp+tls.

@michaelwilde

michaelwilde commented May 27, 2016

...cuz we're at a point now where we are having to tell customers: "well, if your remote TCP endpoint for logging isn't available at the time the container is started (for whatever reason), your darn container just won't start". People do want to use more sophisticated logging AND they shouldn't have to rely on syslog UDP (which isn't that great for multiline logs), so they're left with logging to a filesystem, or using some other logging APIs within their app... taking us back to 2005. We've got to get to a point where the container will always start if possible and the logging drivers are a reliable option.

@ionutalexandruisac

ionutalexandruisac commented Jun 3, 2016

Having the same issue with containers not being able to start if, for example, the Splunk logging driver has a connection issue. Any news on when we can expect a fix for this? You would expect a soft fail in such situations; instead you get a really hard one.

@sandrokeil

sandrokeil commented Jun 3, 2016

You can use logspout as a workaround.
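For instance, a typical logspout invocation (the endpoint is a placeholder; see the logspout README for the full option list) leaves containers on the default json-file driver and forwards their output from the Docker socket, so an unreachable syslog endpoint can no longer block container startup:

$ docker run -d --name=logspout \
    --volume=/var/run/docker.sock:/var/run/docker.sock \
    gliderlabs/logspout \
    syslog+tcp://logs.example.com:514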

@ionutalexandruisac

ionutalexandruisac commented Jun 15, 2016

Thank you @sandrokeil for the workaround; unfortunately it won't do any good for me.
Any updates regarding this issue? Is anybody from Docker actually looking at it?

@cpuguy83

Contributor

cpuguy83 commented Jun 15, 2016

@ionutalexandruisac We've seen it. Logging is tricky: change something and you piss off half the user base; don't change it and you piss off the other half.

Right now Docker tries to ensure you never lose logs; unfortunately, in this case that means you can't even start your container... and perhaps worse, after your container is started, if the remote endpoint goes down for an extended period your container will be blocked on I/O.

Probably need an option to allow for lossy logs that works across all drivers, protocols, etc.

@michaelwilde Docker does not support multiline logging anyway, so syslog UDP would be fine here.
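For reference, per-container delivery-mode options along these lines did ship in later Docker releases (they appear further down this thread; the address below is a placeholder). mode=non-blocking buffers log lines in memory up to max-buffer-size and drops messages rather than blocking the container when the buffer fills; as later comments show, though, it affects delivery after startup, not driver initialization:

$ docker run -d \
    --log-driver=syslog --log-opt syslog-address=tcp://logs.example.com:514 \
    --log-opt mode=non-blocking --log-opt max-buffer-size=4m \
    nginx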

@lkm

lkm commented Jun 15, 2016

@cpuguy83 Not losing logs is a good thing, but potential downtime is (I assume) worse for everybody. Couldn't you fall back to file-based logging if the log host is down and then send the logs when it comes back up? Perhaps a somewhat more difficult request to implement.

The option to simply drop log events when the log host is down would also be welcome!

@ionutalexandruisac

ionutalexandruisac commented Jun 15, 2016

@cpuguy83 - Thanks for your reply.
While I understand your point of view, I think @lkm is quite right: downtime is much worse than losing logs.
I guess it would be fairly simple to implement a feature to "allow container to start/run even if the logging driver is unavailable".
You could leave the default as it is now and make the feature optional, so that users who want it can have it. We have some critical services running on Docker, and it's really sad to see them crash and be unable to restart just because something happened to Splunk and it's temporarily unavailable.

@nickperry

nickperry commented Jun 16, 2016

The ideal in my opinion would be to expose 3 options for what to do when a remote logging driver cannot reach its endpoint:

Block
Spool
Discard

Discard could presumably be passed down to the log driver to implement. Or would it be better to have the driver report back-pressure / failure up to Docker to drop, so that this state can be logged? A driver would also need a way to detect restoration of connectivity and start accepting messages again.

Spool could come later and needs significant discussion and engineering. Could it be that it's split into two options - internal spooling handled by Docker and external spooling, which is passed down to the log driver?

@michaelwilde @glennblock @outcoldman I guess the Splunk driver would only be able to support discard, not spool, due to the decision to go with HEC and not having a universal forwarder?

See also #16207 (comment) (also note next two comments in that thread).
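As a concrete sketch of that proposal, a hypothetical failure-mode option (an invented name, not an actual Docker flag) might be set per container like this:

# failure-mode is hypothetical; it illustrates the Block / Spool / Discard proposal above
$ docker run -d \
    --log-driver=syslog \
    --log-opt syslog-address=tcp://logs.example.com:514 \
    --log-opt failure-mode=spool \
    nginx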

@nickperry

nickperry commented Jun 16, 2016

One workaround for Splunk customers who want spooling would be to run the Splunk universal forwarder on the Docker host with a TCP:// input and point the Splunk Docker log driver at that.

EDIT - sorry, I was confused. We'd need a HEC input, not a TCP input, and we can't run the HEC app on a universal forwarder.

@outcoldman

Contributor

outcoldman commented Jun 16, 2016

I like this discussion.

@michaelwilde if the link between Splunk and the host dies while the container is running, the container will keep running; all logs will go to the log configured for the Docker daemon as error messages similar to Failed to send log "{MESSAGE}".

We have seen that some of our customers have issues with the logging driver failing the container start when it cannot connect to the remote host.

In 99% of cases this issue is caused by misconfiguration of the logging driver: a wrong URL or a wrong route to the Splunk host. We verify the connection intentionally, just to inform the user that the Splunk logging driver is configured wrong.

In case of a big failure, when the driver cannot connect to Splunk for some reason (say the link between the Docker host and the Splunk cluster is down) but the customer needs to scale out right now, we still want to notify the user that logging does not work and that they need to take action; the easiest action is to switch to the json-file logging driver. Later, when the link between the Docker host and Splunk is fixed, the customer can index the JSON log as well to keep all the logs in one place.

@nickperry can this workflow be applied to you? Do you see any issues with it, or a better way?

Btw, installing the Splunk forwarder locally alongside the Docker daemon is still a good solution. It gives you a lot, including retries and other benefits.

Btw, I am working right now on some improvements for the Splunk logging driver; one of them will be --log-opt splunk-verify-connection=true|false (which is the Discard logic mentioned by @nickperry). If somebody is interested, they can ping me at DockerCon 2016 in Seattle (Splunk will have a booth and a small presentation) and I can show these improvements.

@nickperry

nickperry commented Jun 16, 2016

@outcoldman Thanks a lot for that response.

We've experimented with consuming the json-file logs via the universal forwarder, but multi-line stuff like Java stack traces is causing us problems.

--log-opt splunk-verify-connection=true|false sounds great. What are your other enhancements?

@glennblock

glennblock commented Jun 16, 2016

+1 on the concerns here. This has been something we have heard brought up several times by customers using our driver. The new param @outcoldman is adding is specifically for this purpose. By default the driver behaves as it does today, but if you opt in you can override that.

@nickperry have you tried using the Transaction command? It can help assemble the lines when you search. We don't have a native way to handle this at ingest time via HEC yet.

@michaelwilde

michaelwilde commented Jun 16, 2016

@outcoldman @nickperry @cpuguy83 @lkm @glennblock While we can do as much as we can at Splunk to ensure "things work", this is less about Splunk and more about Docker's singular decision that if a logging driver (that uses TCP) fails to connect to its destination host, the container does not start. One can make the argument "if one cannot log, one should not run". If that's going to be the default, then fine... but we'd all like to see an option where the container will press on and run itself (even with a logging driver CLI arg)--and of course re-attempt connection to said logging driver's remote host.

Syslog UDP is nice, but most people--when they can--use TCP for syslog (I think the trackability of TCP pleases most network admins vs. the fire-and-forget, hope-it-arrives method of UDP syslogging) #opinion

@akvadrako

akvadrako commented Aug 19, 2016

It would be really nice if the log drivers were not an additional point of failure. Ideally,

  • containers still run when they can't connect to their TCP endpoint
  • the driver continuously tries to reconnect if the connection fails at any point

@LaurentDumont

LaurentDumont commented Oct 7, 2016

I've just run into this with the Docker Splunk driver. If for some reason the remote Splunk server doesn't answer, the container fails to start. I'm a bit surprised that there isn't a fallback method where it could default to some other logging method instead of just stopping.

@michaelwilde

michaelwilde commented Oct 7, 2016

I think Docker has a belief that if a container can't log, the container shouldn't run. A good part of me agrees with the concept, but in practice there's a good use case for letting the container start and periodically retrying as a fallback.

@lkm

lkm commented Oct 8, 2016

@michaelwilde I think the one sane way to run this in production is to follow @nickperry's approach and run the universal forwarder Docker image on every server, mapping its HEC port (8088) to localhost. We use forwarder management to avoid manually configuring all the forwarders for inputs.

@LaurentDumont

LaurentDumont commented Oct 8, 2016

The safest solution would probably be to log to a file and ship everything with the forwarder. I can't see a situation (besides a full disk) where the Docker daemon can't write.

The weird thing is that there is a "none" option for the logging driver. It could be the default fallback option if the primary logging driver fails.
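A sketch of that approach, assuming a host-level forwarder and arbitrary rotation values: keep the json-file driver with rotation so the disk can't fill, and tail the resulting files from the host:

$ docker run -d \
    --log-driver=json-file \
    --log-opt max-size=10m \
    --log-opt max-file=3 \
    nginx
# a forwarder on the host then tails /var/lib/docker/containers/<id>/<id>-json.log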

@outcoldman

Contributor

outcoldman commented Oct 8, 2016

Btw, for the Splunk logging driver we have also implemented retry logic, so with --log-driver splunk --log-opt splunk-verify-connection=false you will be able to start your container, and if Splunk is not available we will keep trying to send logs. See https://github.com/docker/docker/blob/master/docs/admin/logging/splunk.md#advanced-options for details.
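Put together, a run command using those options might look like this (the URL and token are placeholders; splunk-verify-connection=false skips the startup connection check so the container can start even if Splunk is unreachable):

$ docker run -d \
    --log-driver=splunk \
    --log-opt splunk-url=https://splunk.example.com:8088 \
    --log-opt splunk-token=<HEC-token> \
    --log-opt splunk-verify-connection=false \
    nginx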

@LaurentDumont

LaurentDumont commented Oct 8, 2016

Weirdly enough, I get unknown log opt 'splunk-verify-connection' for splunk log driver with Docker 1.12.1 and Docker-Compose.

Scratch that, just saw it was for the 1.13 milestone :(

Or at least, it seems that the PR was never merged into 1.12 #24652

😢

@thaJeztah

Member

thaJeztah commented Oct 10, 2016

@LaurentDumont the PR was continued in #25786, which is on the 1.13 milestone

@marcusmyers

marcusmyers commented Jun 1, 2017

Any more news on this? We are still experiencing this with 2017.15.0-CE.

@port22

port22 commented Jul 6, 2017

Does not work with

Docker version 17.06.0-ce-rc4, build 29fcd5d

either:

# docker run -d --log-driver=fluentd --log-opt="fluentd-address=localhost:24224" \
    --log-opt="fluentd-async-connect" nginx
8cd56b8003bad9592a8f044f4992e137382bedfbe8c5a4450bbc66f26d8f9850
docker: Error response from daemon: failed to initialize logging driver: 
dial tcp [::1]:24224: getsockopt: connection refused.

in contrast to the docs, which say:

If container cannot connect to the Fluentd daemon, the container stops immediately unless the fluentd-async-connect option is used.
@vozerov

vozerov commented Sep 19, 2017

@port22 try running it like this:

docker run -d --log-driver=fluentd --log-opt="fluentd-address=localhost:24224" --log-opt="fluentd-async-connect=true" nginx

worked for me

@bbergshaven

bbergshaven commented Sep 28, 2017

Any updates for the syslog driver as well?

@bitbrain

bitbrain commented Dec 12, 2017

In Amazon ECS I get the following error when the logging driver is not available for some reason:

CannotStartContainerError: API error (500): Failed to initialize logging driver.

This seems to be related somehow.

Any news on this?

@kinnalru

kinnalru commented Feb 17, 2018

What about the syslog driver now?

@shantanugadgil

shantanugadgil commented Apr 23, 2018

Same issue, same requirement.

I was hoping the mode and max-buffer-size options would be helpful if the remote syslog server came online after some time, but no such luck! 😢 😞

# docker --version
Docker version 18.03.0-ce, build 0520e24

# docker run -it --log-driver syslog --log-opt syslog-address=tcp://logs.mydomain.com:514 --log-opt mode=non-blocking --log-opt max-buffer-size=4m alpine ping 127.0.0.1
docker: Error response from daemon: failed to initialize logging driver: dial tcp aaa.bbb.ccc.ddd:514: getsockopt: connection refused.
@abstatic

abstatic commented Jul 10, 2018

I am also facing the same issue with ECS. If for whatever reason the logging endpoint is down, the container won't even start.

Do we need to put our logging machine under HA too :(

@deevodavis71

deevodavis71 commented Jul 31, 2018

Same issue using the "syslog" logging driver on macOS, Docker version 18.03.1-ce, build 9ee9f40. It kind of makes centralised logging pointless if the containers won't even start when the syslog server isn't available! I have tried the "non-blocking" option but it still fails to start...

@jortkoopmans

jortkoopmans commented Aug 3, 2018

@deevodavis71: I think you need to recreate the containers in order to get the new host-level logging driver settings propagated. You can check the situation using 'docker inspect' on the stuck containers, under the 'LogConfig' section.

Just ran into this problem as well, Docker 18.06-ce. Going UDP for now.
