Sometimes service health is checked only once for docker services #6096

Himura2la · 2019-12-27T09:27:12Z

Do you want to request a feature or report a bug?

Bug

What did you do?

Use docker-compose labels with the docker provider for blue-green deployment
Use healthcheck feature to decide when I can stop the old service instance
By the way, the feature requested in "Traefik healthcheck should not pass by default for newly discovered backends" (#4544) would be great for my task

What did you expect to see?

New container starts with the same labels, so it is registered as a second instance of the same service
First healthcheck on it fails, and Traefik does not route requests to it
Healthcheck repeats as configured using the traefik.http.services.service.loadbalancer.healthcheck.interval label.
Once the healthcheck passes, the deployment script stops the old instance.

What did you see instead?

New container starts with the same labels, so it is registered as a second instance of the same service
First healthcheck on the new instance fails, and Traefik "removes the instance from server list" (wording from logs).
The instance then becomes available and responds with 200 to a direct request to the healthcheck endpoint
Traefik does not check it again, and the instance remains DOWN forever.

As a workaround, I changed the deployment script so that it checks the endpoint directly and stops the old instance once the new one is OK. When the new instance is the only one left in the service, it is health-checked again and fortunately becomes UP.
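
A minimal sketch of such a direct check, assuming the deployment script can reach the new container's health endpoint (for example by running inside the same Docker network). The function names, the 14 attempts, and the 10-second delay are illustrative choices and are not taken from the actual deployment script.

```python
import time
import urllib.error
import urllib.request


def endpoint_is_healthy(url, timeout=9.0):
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


def wait_until_healthy(check, attempts=14, delay=10.0, sleep=time.sleep):
    """Call `check()` up to `attempts` times, pausing `delay` seconds between tries.

    Returns the 1-based attempt number that succeeded, or None if all failed.
    """
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

A call such as `wait_until_healthy(lambda: endpoint_is_healthy("http://172.18.0.4/bar"))` would then gate the step that stops the old instance. The `sleep` parameter is only there so the loop can be exercised without real waiting.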

This does not happen every time, and I have not managed to determine the conditions needed to reproduce it.

The only clue I noticed is possibly related to this comment on issue #3834:

It works as expected if the first healthcheck runs twice
It gets stuck if the first healthcheck runs only once
(according to logs)

Output of `traefik version`:

Version:      2.1.1
Codename:     cantal
Go version:   go1.13.5
Built:        2019-12-12T19:01:37Z
OS/Arch:      linux/amd64

What is your environment & configuration (arguments, toml, provider, platform, ...)?

entryPoints:
  myservices:
    address: ":80"
providers:
  docker:
    endpoint: "unix:///var/run/docker.sock"
    exposedByDefault: false
    network: webgateway
api:
  dashboard: true
  insecure: true
log:
  level: info

  service:
    build:
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.service.rule=Host(`foo.bar.com`)"
        - "traefik.http.routers.service.entryPoints=service-entrypoint"
        - "traefik.http.services.service.loadbalancer.healthcheck.path=/"
        - "traefik.http.services.service.loadbalancer.healthcheck.interval=10s"
        - "traefik.http.services.service.loadbalancer.healthcheck.timeout=9s"
    networks:
      - webgateway

If applicable, please paste the log output in DEBUG level (`--log.level=DEBUG` switch)

**deploy new instance of service1**
time="2019-12-24T17:08:59Z" level=warning msg="Health check failed, removing from server list. Backend: \"service1@docker\" URL: \"http://172.18.0.6:80\" Weight: 1 Reason: HTTP request failed: Get http://172.18.0.6:80/: dial tcp 172.18.0.6:80: connect: connection refused"
time="2019-12-24T17:08:59Z" level=warning msg="Health check failed, removing from server list. Backend: \"service1@docker\" URL: \"http://172.18.0.6:80\" Weight: 1 Reason: HTTP request failed: Get http://172.18.0.6:80/: dial tcp 172.18.0.6:80: connect: connection refused"
time="2019-12-24T17:08:59Z" level=error msg="server not found"
time="2019-12-24T17:09:10Z" level=warning msg="Health check up: Returning to server list. Backend: \"service1@docker\" URL: \"http://172.18.0.6:80\" Weight: 1"
**remove old instance of service1, everything is fine**

**deploy new instance of service2**
time="2019-12-24T17:12:29Z" level=warning msg="Health check failed, removing from server list. Backend: \"service2@docker\" URL: \"http://172.18.0.3:80\" Weight: 1 Reason: HTTP request failed: Get http://172.18.0.3:80/: dial tcp 172.18.0.3:80: connect: connection refused"
**172.18.0.3 stuck in unhealthy status forever**


Himura2la · 2019-12-30T13:55:38Z

I grabbed the deployment script logs from our CI:

Deploying 'foo_BLUE' in place of 'foo_GREEN'...
Using default tag: latest
latest: Pulling from foo
804555ee0376: Already exists
970251047358: Already exists
f3d4c41a4fd1: Already exists
32afd03f1854: Already exists
e014e07c0b51: Pulling fs layer
ccc5b75dfcb4: Pulling fs layer
e014e07c0b51: Verifying Checksum
e014e07c0b51: Download complete
ccc5b75dfcb4: Download complete
e014e07c0b51: Pull complete
ccc5b75dfcb4: Pull complete
Digest: sha256:f124b9d3gcf14be126f68274dcc5c8e36d9e41768997251eee8a8cfbe671f730
Status: Downloaded newer image for our.internal.registry/foo:latest
our.internal.registry/foo:latest
8a3f2f662b2a0f1731b75ec4453b63a5ce987256c092a7d3bf0f42544c429f42
Container started. Starting health check...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (1)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (2)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (3)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (4)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (5)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (6)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (7)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (8)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (9)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (10)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (11)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (12)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (13)...
Using Traefik health check:
    Traefik API response: {"loadBalancer":{"servers":[{"url":"http://172.18.0.4:80"},{"url":"http://172.18.0.3:80"}],"healthCheck":{"path":"/bar","interval":"10s","timeout":"9s"},"passHostHeader":true},"status":"enabled","usedBy":["foo@docker"],"serverStatus":{"http://172.18.0.3:80":"UP","http://172.18.0.4:80":"DOWN"},"name":"foo@docker","provider":"docker","type":"loadbalancer"}
New 'foo_BLUE' service (172.18.0.4) seems unhealthy. Waiting (14)...
Using manual health check, because Traefik seems hung (HACK):
    Requesting http://172.18.0.4/bar in the docker network:

<div>
    page content
</div>

New 'foo_BLUE' service seems operational. Stopping 'foo_GREEN' project...
foo_GREEN
Deployment successful!

The delay between checks is 10 seconds.
If the Traefik check fails 14 times, the last 5 checks are performed manually like this: docker run --rm --network webgateway busybox wget -qO- "$health_check_address"
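
For illustration, the "Using Traefik health check" step can be sketched as reading the `serverStatus` map from the API response shown in the log above. This is a sketch under assumptions: the helper names are hypothetical, and fetching the response is left out because the thread does not show which API URL the script queries.

```python
import json


def server_status(api_response, server_url):
    """Return the status string reported for `server_url`, or None if it is absent."""
    service = json.loads(api_response)
    return service.get("serverStatus", {}).get(server_url)


def new_instance_is_up(api_response, server_url):
    """True only when the API response reports the given server as UP."""
    return server_status(api_response, server_url) == "UP"
```

With the response from the log, `new_instance_is_up(response, "http://172.18.0.4:80")` stays false on every poll, which is why the script eventually falls back to the manual check.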

zvymazal · 2020-02-04T11:06:46Z

I'm encountering exactly the same behavior with the latest Traefik release. It's clearly a healthcheck issue, as the service is correctly registered with Traefik and it's responding to HTTP requests from the Traefik container, but no healthchecks are being sent. Exactly as described above: once the old container is shut down, the healthchecks resume and the container is correctly reported as healthy again.

Version:      2.1.3
Codename:     cantal
Go version:   go1.13.6
Built:        2020-01-21T17:30:29Z
OS/Arch:      linux/amd64

zvymazal · 2020-02-13T15:58:49Z

I have tried to add some extra debug logging and reproduce the behavior to pinpoint where the problem can be. Here's what I observed:

If an event triggers https://github.com/containous/traefik/blob/0c90f6afa24ef390fec43ca654f806915e821daa/pkg/server/service/service.go#L201 then BackendConfig configurations get re-created and eventually healthchecks are updated at: https://github.com/containous/traefik/blob/0c90f6afa24ef390fec43ca654f806915e821daa/pkg/healthcheck/healthcheck.go#L115
The problem seems to be that disabledURLs is a property of BackendConfig and the value can get lost when configuration is updated. This is the case if two events trigger this behavior at nearly the same time. Exactly what is happening in: https://github.com/containous/traefik/blob/0c90f6afa24ef390fec43ca654f806915e821daa/pkg/server/routerfactory.go#L74
Then it's only a matter of specific timing if the server's URL is preserved in the healthcheck configuration or not.

To me it would make more sense to store the disabled URLs on the LB so that they cannot get lost when the healthcheck configuration is changed or when multiple events arrive in short succession. I'm not a Go developer and have no insight into the rest of the code, though.
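
The timing problem described above can be illustrated with a toy model. This is not Traefik's code: the class names, the `probe` callback, and the `preserve_disabled` switch are all hypothetical, and the model only mirrors what the debug log further down suggests, namely that the load balancer's server list is shared while each launch gets a fresh backend config with an empty `disabledURLs`.

```python
class LoadBalancer:
    """Holds the servers that currently receive traffic."""

    def __init__(self, servers):
        self.servers = list(servers)


class BackendConfig:
    """Health-check state for one backend; the disabled list lives here."""

    def __init__(self, lb):
        self.lb = lb
        self.disabled_urls = []

    def check(self, probe):
        enabled = list(self.lb.servers)  # snapshot before restoring anything
        # Re-probe disabled servers and put the recovered ones back on the LB.
        recovered = [url for url in self.disabled_urls if probe(url)]
        self.disabled_urls = [u for u in self.disabled_urls if u not in recovered]
        self.lb.servers.extend(recovered)
        # Probe enabled servers and move the failing ones to the disabled list.
        for url in enabled:
            if not probe(url):
                self.lb.servers.remove(url)
                self.disabled_urls.append(url)


def simulate(preserve_disabled):
    url = "http://172.19.0.3:80"
    lb = LoadBalancer([url])
    state = {"up": False}

    def probe(_url):
        return state["up"]

    first = BackendConfig(lb)
    first.check(probe)  # container not listening yet: url moves to disabled_urls

    second = BackendConfig(lb)  # second launch replaces the first config
    if preserve_disabled:
        second.disabled_urls = list(first.disabled_urls)

    state["up"] = True  # the container is answering now
    second.check(probe)
    return lb.servers
```

In this model `simulate(False)` leaves the server list empty, because the URL is neither enabled nor disabled after the replacement, so no later check ever probes it. `simulate(True)` restores the server, which is the effect the suggestion above aims for.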

Please find the extra debug logs below, prefixed with >>> and referencing the particular line in the source code (HEAD 0c90f6afa24ef390fec43ca654f806915e821daa).

Docker compose file for traefik:

version: "3.7"

services:
  traefik:
    image: "containous/traefik:latest"
    container_name: "traefik"
    command:
      - "--accesslog=true"
      - "--accesslog.bufferingsize=100"
      - "--accesslog.filepath=/var/log/traefik/access.log"
      - "--accesslog.format=json"
      - "--api=false"
      - "--api.dashboard=false"
      - "--entrypoints.web.address=:80"
      - "--entryPoints.web.forwardedHeaders.insecure"
      - "--log.format=common"
      - "--log.level=DEBUG"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
    networks:
      - "proxy"
    ports:
      - "80:80"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"

networks:
  proxy:
    driver: bridge
    name: proxy

Docker compose file for application:

version: "3.7"

services:
  web:
    image: "xxx/xxx"
    networks:
      - "proxy"
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.web.rule=hostregexp(`{host:.+}`)"
      - "traefik.http.routers.web.entrypoints=web"
      - "traefik.http.routers.web.middlewares=retry@docker,header_https@docker"
      - "traefik.http.services.web.loadbalancer.healthcheck.headers.X-Forwarded-Proto=https"
      - "traefik.http.services.web.loadbalancer.healthcheck.interval=5s"
      - "traefik.http.services.web.loadbalancer.healthcheck.path=/_ping"
      - "traefik.http.services.web.loadbalancer.healthcheck.port=80"
      - "traefik.http.services.web.loadbalancer.healthcheck.timeout=4s"
      - "traefik.http.middlewares.retry.retry.attempts=3"
      - "traefik.http.middlewares.header_https.headers.customrequestheaders.X-Forwarded-Proto=https"

networks:
  proxy:
    external: true
    name: proxy

Debug log:

traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Configuration received from provider docker: {\"http\":{\"routers\":{\"web\":{\"entryPoints\":[\"web\"],\"middlewares\":[\"retry@docker\",\"header_https@docker\"],\"service\":\"web\",\"rule\":\"hostregexp(`{host:.+}`)\"}},\"middlewares\":{\"header_https\":{\"headers\":{\"customRequestHeaders\":{\"X-Forwarded-Proto\":\"https\"}}},\"retry\":{\"retry\":{\"attempts\":3}}},\"services\":{\"web\":{\"loadBalancer\":{\"servers\":[{\"url\":\"http://172.19.0.3:80\"}],\"healthCheck\":{\"path\":\"/_ping\",\"port\":80,\"interval\":\"5s\",\"timeout\":\"4s\",\"headers\":{\"X-Forwarded-Proto\":\"https\"}},\"passHostHeader\":true}}}},\"tcp\":{},\"udp\":{}}" providerName=docker
traefik    | time="2020-02-13T15:08:05Z" level=debug msg=">>>: routerfactory.go:L61: Entering CreateRouters"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg=">>>: router.go:L73: Entering BuildHandlers - tls: false"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Creating Middleware (ResponseModifier)" middlewareName=header_https@docker middlewareType=Headers entryPointName=web routerName=web@docker
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Creating middleware" routerName=web@docker serviceName=web middlewareName=pipelining middlewareType=Pipelining entryPointName=web
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Creating load-balancer" entryPointName=web routerName=web@docker serviceName=web
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Creating server 0 http://172.19.0.3:80" serviceName=web serverName=0 entryPointName=web routerName=web@docker
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Added outgoing tracing middleware web" middlewareType=TracingForwarder middlewareName=tracing entryPointName=web routerName=web@docker
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Creating middleware" middlewareType=Headers entryPointName=web routerName=web@docker middlewareName=header_https@docker
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Setting up customHeaders/Cors from %v{map[X-Forwarded-Proto:https] map[] false [] []  [] 0 false [] [] false false  map[] false 0 false false false false  false false      false}" middlewareName=header_https@docker middlewareType=Headers entryPointName=web routerName=web@docker
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Adding tracing to middleware" entryPointName=web routerName=web@docker middlewareName=header_https@docker
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Creating middleware" middlewareName=retry@docker middlewareType=Retry entryPointName=web routerName=web@docker
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Adding tracing to middleware" entryPointName=web routerName=web@docker middlewareName=retry@docker
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Creating middleware" entryPointName=web middlewareName=traefik-internal-recovery middlewareType=Recovery
traefik    | time="2020-02-13T15:08:05Z" level=debug msg=">>>: service.go:L202 : Entering LaunchHealthCheck"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Setting up healthcheck for service web@docker with [Hostname:  Headers: map[X-Forwarded-Proto:https] Scheme:  Path: /_ping Port: 80 Interval: 5s Timeout: 4s]" serviceName=web@docker
traefik    | time="2020-02-13T15:08:05Z" level=warning msg=">>>: healthcheck.go:L115 : new backend: [name: web@docker Options: [Hostname:  Headers: map[X-Forwarded-Proto:https] Scheme:  Path: /_ping Port: 80 Interval: 5s Timeout: 4s] disabledURLs: []]"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Initial health check for backend: \"web@docker\""
traefik    | time="2020-02-13T15:08:05Z" level=debug msg=">>>: healthcheck.go:L152 : Enabled URLs: [http://172.19.0.3:80]"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg=">>>: healthcheck.go:L152 : Disabled URLs: []"
traefik    | time="2020-02-13T15:08:05Z" level=warning msg="Health check failed, removing from server list. Backend: \"web@docker\" URL: \"http://172.19.0.3:80\" Weight: 1 Reason: HTTP request failed: Get http://172.19.0.3:80/_ping: dial tcp 172.19.0.3:80: connect: connection refused"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg=">>>: healthcheck.go:L190 : backend status: [name: web@docker Options: [Hostname:  Headers: map[X-Forwarded-Proto:https] Scheme:  Path: /_ping Port: 80 Interval: 5s Timeout: 4s] disabledURLs: [[url: http://172.19.0.3:80 weight: 1]]]"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg=">>>: router.go:L73: Entering BuildHandlers - tls: true"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg=">>>: service.go:L202 : Entering LaunchHealthCheck"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Setting up healthcheck for service web@docker with [Hostname:  Headers: map[X-Forwarded-Proto:https] Scheme:  Path: /_ping Port: 80 Interval: 5s Timeout: 4s]" serviceName=web@docker
traefik    | time="2020-02-13T15:08:05Z" level=warning msg=">>>: healthcheck.go:L115 : old backend: [name: web@docker Options: [Hostname:  Headers: map[X-Forwarded-Proto:https] Scheme:  Path: /_ping Port: 80 Interval: 5s Timeout: 4s] disabledURLs: [[url: http://172.19.0.3:80 weight: 1]]]"
traefik    | time="2020-02-13T15:08:05Z" level=warning msg=">>>: healthcheck.go:L115 : new backend: [name: web@docker Options: [Hostname:  Headers: map[X-Forwarded-Proto:https] Scheme:  Path: /_ping Port: 80 Interval: 5s Timeout: 4s] disabledURLs: []]"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Stopping current health check goroutines of backend: web@docker"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="No default certificate, generating one"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg="Initial health check for backend: \"web@docker\""
traefik    | time="2020-02-13T15:08:05Z" level=debug msg=">>>: healthcheck.go:L152 : Enabled URLs: []"
traefik    | time="2020-02-13T15:08:05Z" level=debug msg=">>>: healthcheck.go:L152 : Disabled URLs: []"
traefik    | time="2020-02-13T15:08:10Z" level=debug msg="Refreshing health check for backend: web@docker"
traefik    | time="2020-02-13T15:08:10Z" level=debug msg=">>>: healthcheck.go:L152 : Enabled URLs: []"
traefik    | time="2020-02-13T15:08:10Z" level=debug msg=">>>: healthcheck.go:L152 : Disabled URLs: []"
traefik    | time="2020-02-13T15:08:15Z" level=debug msg="Refreshing health check for backend: web@docker"
traefik    | time="2020-02-13T15:08:15Z" level=debug msg=">>>: healthcheck.go:L152 : Enabled URLs: []"
traefik    | time="2020-02-13T15:08:15Z" level=debug msg=">>>: healthcheck.go:L152 : Disabled URLs: []"
traefik    | time="2020-02-13T15:08:20Z" level=debug msg="Refreshing health check for backend: web@docker"
traefik    | time="2020-02-13T15:08:20Z" level=debug msg=">>>: healthcheck.go:L152 : Enabled URLs: []"
traefik    | time="2020-02-13T15:08:20Z" level=debug msg=">>>: healthcheck.go:L152 : Disabled URLs: []"

Please feel free to reach out if there's some more info I might be able to provide.


clownba0t · 2020-02-19T04:52:40Z

I seem to be seeing the same, or at least a very similar, issue when containers are restarted when docker comes back up after a host reboot - in this case, a vanilla Traefik v2.1.4 container and a single application container that registers itself with Traefik. The symptoms appear to be identical to those posted by others - two initial health checks are started very close to one another, one fails, then all subsequent health checks (including the second initial health check) run but do nothing and the server continues to appear down to Traefik despite actually being up.

clownba0t · 2020-02-19T12:28:11Z

I also seem to be seeing this happen on another host I manage that has two applications behind Traefik. In this case, application B's health check starts failing when application A is deployed (application B is not deployed, so its container remains unchanged throughout).

The root cause would appear to be the same, although in this case it's very interesting that the health check for an existing and unchanged container is affected. It seems that the docker provider restarts health checks for all containers in response to any single container event?

zvymazal · 2020-02-19T12:37:34Z

> It seems that the docker provider restarts health checks for all containers in response to any single container event?

Yes, this seems to be the case.

traefiker · 2020-02-25T15:30:10Z

Closed by #6372.

Himura2la · 2020-02-26T06:25:24Z

Should #3834 also be closed?

juliens · 2020-02-26T08:34:19Z

@Himura2la no, because the change is only on the 2.1 codebase.

traefiker added the status/0-needs-triage label Dec 27, 2019

juliens added area/healthcheck kind/bug/possible and removed status/0-needs-triage labels Dec 27, 2019

ldez added this to issues in v2 via automation Feb 4, 2020

zvymazal added a commit to zvymazal/traefik that referenced this issue Feb 14, 2020

Disable healthcheck re-launch for https routers

18e5963

Workaround for traefik#6096

ldez assigned juliens Feb 19, 2020

juliens mentioned this issue Feb 25, 2020

Launch healthcheck only one time instead of two #6372

Merged

traefiker added this to the 2.1 milestone Feb 25, 2020

traefiker closed this as completed Feb 25, 2020

v2 automation moved this from issues to Done Feb 25, 2020

traefik locked and limited conversation to collaborators Apr 8, 2020

traefiker added the status/5-frozen-due-to-age label Apr 8, 2020
