HAProxy terminates socket connections when reloaded by Consul Template #525

Closed
brunomacf opened this issue Jan 27, 2016 · 8 comments

@brunomacf

I'm using Consul Template to render the HAProxy config file that load-balances a set of Thrift services. The configuration file for consul-template is:

template {
  source = "/var/template/haproxy.ctmpl"
  destination = "/etc/haproxy/haproxy.cfg"
  command = "haproxy -f /etc/haproxy/haproxy.cfg -sf $(pidof haproxy) &"
}

and for HAProxy it is:

global
    maxconn 2000
    pidfile /var/run/haproxy.pid
    log     127.0.0.1 local0
    log     127.0.0.1 local1 notice
    debug

    # echo "" | nc -U /var/run/haproxy.sock
    stats   socket /var/run/haproxy.sock mode 777

resolvers docker
    nameserver dns "172.17.0.1:53"

defaults
    log         global
    mode        tcp
    option      dontlognull
    option      tcplog
    retries     3
    timeout     check    5s
    timeout     client   1000s
    timeout     connect  1000s
    timeout     server   10s

frontend service1
    bind *:9000
    mode tcp
    default_backend service1_backend

frontend service2
    bind *:9001
    mode tcp
    default_backend service2_backend

backend service1_backend
    mode tcp
    balance roundrobin {{ range service "service1" }}
    server {{ .Node }} {{ .Address }}:{{ .Port }} {{ end }}

backend service2_backend
    mode tcp
    balance roundrobin {{ range service "service2" }}
    server {{ .Node }} {{ .Address }}:{{ .Port }} {{ end }}

listen statistics
    bind *:8778
    mode http
    stats enable
    stats show-desc BigDinosaur HAProxy Status
    stats uri /

The problem is that every time Consul Template reloads the HAProxy config file, the Thrift socket connections passing through the HAProxy server are killed and all services retry to connect. Is there a workaround for this?

P.S. 1: Everything is running in Docker containers, but I don't think that's the problem. :)

P.S. 2: I'm running Node.js services (Node.js 5.3.0), consul-template 0.12, and HAProxy 1.6.
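
For reference, the -sf $(pidof haproxy) part of the reload command asks the new HAProxy process to tell the old ones to perform a soft stop, i.e. release the listening sockets but finish their current sessions before exiting. A minimal reload sketch (same paths as above, purely illustrative) that also validates the rendered config before reloading, so a bad render does not take the proxy down:

#!/bin/sh
# Hypothetical reload wrapper using the same paths as the config above.
CFG=/etc/haproxy/haproxy.cfg

# Refuse to reload if the newly rendered config is invalid.
haproxy -c -f "$CFG" || exit 1

# -sf tells the previously running HAProxy processes to stop listening
# and exit once their current sessions are finished (soft stop).
haproxy -f "$CFG" -sf $(pidof haproxy) &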

@sethvargo
Contributor

Hi @brunomacf

Consul Template is just executing a subprocess, so I think this might be an HAProxy issue itself. It could also be a duplicate of #442 manifesting itself differently.

Unfortunately I am not familiar with Thrift sockets with respect to HAProxy. Are you able to reproduce this issue without Consul Template? That will determine whether this is a CT issue or an HAProxy one.
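
For example, you could keep a client connected through the proxy and run the same reload command from your consul-template config by hand, with Consul Template out of the picture:

# Manually trigger the same reload while a long-lived Thrift
# connection is open through HAProxy.
haproxy -f /etc/haproxy/haproxy.cfg -sf $(pidof haproxy) &

If the connection drops on a manual reload too, the behaviour comes from HAProxy (or its configuration) rather than from Consul Template.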

@brunomacf
Author

Hi @sethvargo

As you pointed out, it seems the problem is more with HAProxy and how it reloads (not so gracefully, in my opinion) than with Consul Template. But one thing comes to mind: my logs show that Consul Template is reloading HAProxy very often, even though my containers stay the same. It would be great if Consul Template reloaded the HAProxy config file only when there is a change in the Consul catalog (e.g., a new container is registered or goes down). Is there some configuration of this kind yet?

Thanks a lot for your reply.

@sethvargo
Contributor

Hi @brunomacf

Consul Template will only execute the command if the data has changed. What does your template look like?
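
If the concern is renders happening too close together, the wait (quiescence) setting can also dampen how quickly a change triggers a render and command; something along these lines in the consul-template config (values are illustrative):

# Wait for at least 5s of quiet after a change before rendering and
# running the command, but never delay longer than 30s.
wait = "5s:30s"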

@brunomacf
Author

My HAProxy template is exactly this:

global
    maxconn 2000
    pidfile /var/run/haproxy.pid
    log     127.0.0.1 local0
    log     127.0.0.1 local1 notice
    debug

    # echo "" | nc -U /var/run/haproxy.sock
    stats   socket /var/run/haproxy.sock mode 777

resolvers docker
    nameserver dns "172.17.0.1:53"

defaults
    log         global
    mode        tcp
    option      dontlognull
    option      tcplog
    retries     3
    timeout     check    5s
    timeout     client   1000s
    timeout     connect  1000s
    timeout     server   10s

frontend service1
    bind *:9000
    mode tcp
    default_backend service1_backend

frontend service2
    bind *:9001
    mode tcp
    default_backend service2_backend

backend service1_backend
    mode tcp
    balance roundrobin {{ range service "service1" }}
    server {{ .Node }} {{ .Address }}:{{ .Port }} {{ end }}

backend service2_backend
    mode tcp
    balance roundrobin {{ range service "service2" }}
    server {{ .Node }} {{ .Address }}:{{ .Port }} {{ end }}

listen statistics
    bind *:8778
    mode http
    stats enable
    stats show-desc BigDinosaur HAProxy Status
    stats uri /

But what I'm seeing here is HAProxy being reloaded very often (roughly every 10 seconds). The logs are (using docker-compose to run everything together):

service2    | Retrying connection...
haproxy     | 00000000:service1.accept(0007)=0009 from [172.17.0.1:43561]
service2    | Retry connection in 255 ms
haproxy     | 00000000:service1_backend.srvcls[0009:000a]
haproxy     | 00000000:service1_backend.clicls[0009:000a]
haproxy     | 00000000:service1_backend.closed[0009:000a]
service2    | Retrying connection...
haproxy     | 00000001:service1.accept(0007)=0009 from [172.17.0.1:43565]
service2    | Retry connection in 255 ms
haproxy     | 00000001:service1_backend.srvcls[0009:000a]
haproxy     | 00000001:service1_backend.clicls[0009:000a]
haproxy     | 00000001:service1_backend.closed[0009:000a]
service2    | Retrying connection...
haproxy     | 00000002:service1.accept(0007)=0009 from [172.17.0.1:43569]
service2    | Retry connection in 255 ms
haproxy     | 00000002:service1_backend.srvcls[0009:000a]
haproxy     | 00000002:service1_backend.clicls[0009:000a]
haproxy     | 00000002:service1_backend.closed[0009:000a]
service2    | Retrying connection...
haproxy     | 00000003:service1.accept(0007)=0009 from [172.17.0.1:43573]
service2    | Retry connection in 255 ms
haproxy     | 00000003:service1_backend.srvcls[0009:000a]
haproxy     | 00000003:service1_backend.clicls[0009:000a]
haproxy     | 00000003:service1_backend.closed[0009:000a]
service2    | Retrying connection...
haproxy     | 00000004:service1.accept(0007)=0009 from [172.17.0.1:43577]

P.S. 1: Unfortunately my docker-compose does not log timestamps, but the "Retry connection" message pops up about every 10 seconds, as I said. :)

P.S. 2: When I connect service2 to service1 directly (without HAProxy), the reconnection attempts go away as expected, so the problem must be with the HAProxy reloads.

@sethvargo
Contributor

Hi @brunomacf

Can you please show the Consul Template logs in debug mode? They will tell you why the template is being re-rendered and which dependencies are changing. Those logs don't indicate to me that CT is restarting the process, just that the process is being restarted.
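
For example (path is a placeholder), debug logging can be enabled with the log level flag, or with the log_level setting in the config file:

# Hypothetical invocation; -log-level can also be set via log_level
# in the configuration file.
consul-template -config=/path/to/config.hcl -log-level=debug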

@brunomacf
Author

The logs since the HAProxy container's initialization are shown below (consul-template in debug mode). The account_v1 service registered in Consul (through gliderlabs/registrator) is the "service2" I was talking about above, and it is accessed by a REST API container.

2016/01/27 17:14:06 [DEBUG] ("service(mongodb)") Consul returned 1 services
2016/01/27 17:14:06 [DEBUG] ("service(mongodb)") 1 services after health check status filtering
2016/01/27 17:14:06 [DEBUG] (view) "service(mongodb)" no new data (contents were the same)
2016/01/27 17:14:06 [DEBUG] ("service(mongodb)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:12 WaitTime:1m0s Token: Near:}
2016/01/27 17:14:06 [DEBUG] ("service(account_v1)") Consul returned 1 services
2016/01/27 17:14:06 [DEBUG] ("service(account_v1)") 1 services after health check status filtering
2016/01/27 17:14:06 [DEBUG] (view) "service(account_v1)" no new data (contents were the same)
2016/01/27 17:14:06 [DEBUG] ("service(account_v1)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:12 WaitTime:1m0s Token: Near:}
2016/01/27 17:14:21 [INFO] consul-template v0.12.2
2016/01/27 17:14:21 [INFO] (runner) creating new runner (dry: false, once: false)
2016/01/27 17:14:21 [DEBUG] (runner) final config (tokens suppressed):

{
  "path": "/var/template/template.cfg",
  "consul": "consul.service.consul:8500",
  "auth": {
    "enabled": false,
    "username": "",
    "password": ""
  },
  "vault": {
    "renew": true,
    "ssl": {
      "enabled": true,
      "verify": true
    }
  },
  "ssl": {
    "enabled": false,
    "verify": true
  },
  "syslog": {
    "enabled": false,
    "facility": "LOCAL0"
  },
  "max_stale": 1000000000,
  "templates": [
    {
      "source": "/var/template/haproxy.ctmpl",
      "destination": "/etc/haproxy/haproxy.cfg",
      "command": "haproxy -f /etc/haproxy/haproxy.cfg -sf $(pidof haproxy) \u0026",
      "perms": 420,
      "backup": false
    }
  ],
  "retry": 5000000000,
  "wait": {
    "min": 2000000000,
    "max": 10000000000
  },
  "pid_file": "",
  "log_level": "debug",
  "deduplicate": {
    "enabled": false,
    "prefix": "consul-template/dedup/",
    "ttl": 15000000000
  },
  "reap": false
}

2016/01/27 17:14:21 [INFO] (runner) creating consul/api client
2016/01/27 17:14:21 [DEBUG] (runner) setting consul address to consul.service.consul:8500
2016/01/27 17:14:21 [INFO] (runner) creating vault/api client
2016/01/27 17:14:21 [DEBUG] (runner) enabling vault SSL
2016/01/27 17:14:21 [INFO] (runner) creating Watcher
2016/01/27 17:14:21 [INFO] (runner) starting
2016/01/27 17:14:21 [DEBUG] (runner) running initial templates
2016/01/27 17:14:21 [INFO] (runner) running
2016/01/27 17:14:21 [DEBUG] (runner) checking template /var/template/haproxy.ctmpl
2016/01/27 17:14:21 [INFO] (runner) was not watching 4 dependencies
2016/01/27 17:14:21 [INFO] (watcher) adding "service(mongodb)"
2016/01/27 17:14:21 [DEBUG] (watcher) "service(mongodb)" starting
2016/01/27 17:14:21 [INFO] (watcher) adding "service(account_v1)"
2016/01/27 17:14:21 [DEBUG] (watcher) "service(account_v1)" starting
2016/01/27 17:14:21 [INFO] (watcher) adding "service(account_v1)"
2016/01/27 17:14:21 [DEBUG] (watcher) "service(account_v1)" already exists, skipping
2016/01/27 17:14:21 [INFO] (watcher) adding "service(mongodb)"
2016/01/27 17:14:21 [DEBUG] (watcher) "service(mongodb)" already exists, skipping
2016/01/27 17:14:21 [INFO] (runner) diffing and updating dependencies
2016/01/27 17:14:21 [DEBUG] (runner) enabling quiescence for "/var/template/haproxy.ctmpl"
2016/01/27 17:14:21 [INFO] (runner) watching 2 dependencies
2016/01/27 17:14:21 [DEBUG] (view) "service(mongodb)" starting fetch
2016/01/27 17:14:21 [DEBUG] ("service(mongodb)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:0 WaitTime:1m0s Token: Near:}
2016/01/27 17:14:21 [DEBUG] (view) "service(account_v1)" starting fetch
2016/01/27 17:14:21 [DEBUG] ("service(account_v1)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:0 WaitTime:1m0s Token: Near:}
2016/01/27 17:14:21 [ERR] (view) "service(account_v1)" health services: error fetching: Get http://consul.service.consul:8500/v1/health/service/account_v1?passing=1&stale=&wait=60000ms: dial tcp: lookup consul.service.consul on 172.17.0.1:53: server misbehaving
2016/01/27 17:14:21 [INFO] (view) "service(account_v1)" errored, retrying in 5s
2016/01/27 17:14:21 [ERR] (runner) watcher reported error: health services: error fetching: Get http://consul.service.consul:8500/v1/health/service/account_v1?passing=1&stale=&wait=60000ms: dial tcp: lookup consul.service.consul on 172.17.0.1:53: server misbehaving
2016/01/27 17:14:21 [INFO] (runner) running
2016/01/27 17:14:21 [DEBUG] (runner) checking template /var/template/haproxy.ctmpl
2016/01/27 17:14:21 [INFO] (runner) missing data for 2 dependencies
2016/01/27 17:14:21 [INFO] (runner) diffing and updating dependencies
2016/01/27 17:14:21 [DEBUG] (runner) "service(account_v1)" is still needed
2016/01/27 17:14:21 [DEBUG] (runner) "service(mongodb)" is still needed
2016/01/27 17:14:21 [INFO] (runner) watching 2 dependencies
2016/01/27 17:14:21 [ERR] (view) "service(mongodb)" health services: error fetching: Get http://consul.service.consul:8500/v1/health/service/mongodb?passing=1&stale=&wait=60000ms: dial tcp: lookup consul.service.consul on 172.17.0.1:53: server misbehaving
2016/01/27 17:14:21 [INFO] (view) "service(mongodb)" errored, retrying in 5s
2016/01/27 17:14:21 [ERR] (runner) watcher reported error: health services: error fetching: Get http://consul.service.consul:8500/v1/health/service/mongodb?passing=1&stale=&wait=60000ms: dial tcp: lookup consul.service.consul on 172.17.0.1:53: server misbehaving
2016/01/27 17:14:21 [INFO] (runner) running
2016/01/27 17:14:21 [DEBUG] (runner) checking template /var/template/haproxy.ctmpl
2016/01/27 17:14:21 [INFO] (runner) missing data for 2 dependencies
2016/01/27 17:14:21 [INFO] (runner) diffing and updating dependencies
2016/01/27 17:14:21 [DEBUG] (runner) "service(mongodb)" is still needed
2016/01/27 17:14:21 [DEBUG] (runner) "service(account_v1)" is still needed
2016/01/27 17:14:21 [INFO] (runner) watching 2 dependencies
2016/01/27 17:14:26 [DEBUG] (view) "service(account_v1)" starting fetch
2016/01/27 17:14:26 [DEBUG] ("service(account_v1)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:0 WaitTime:1m0s Token: Near:}
2016/01/27 17:14:26 [DEBUG] (view) "service(mongodb)" starting fetch
2016/01/27 17:14:26 [DEBUG] ("service(mongodb)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:0 WaitTime:1m0s Token: Near:}
2016/01/27 17:14:26 [DEBUG] ("service(account_v1)") Consul returned 1 services
2016/01/27 17:14:26 [DEBUG] ("service(account_v1)") 1 services after health check status filtering
2016/01/27 17:14:26 [INFO] (view) "service(account_v1)" received data
2016/01/27 17:14:26 [DEBUG] (view) "service(account_v1)" starting fetch
2016/01/27 17:14:26 [DEBUG] ("service(account_v1)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:15 WaitTime:1m0s Token: Near:}
2016/01/27 17:14:26 [DEBUG] (runner) receiving dependency "service(account_v1)"
2016/01/27 17:14:26 [INFO] (runner) running
2016/01/27 17:14:26 [DEBUG] (runner) checking template /var/template/haproxy.ctmpl
2016/01/27 17:14:26 [INFO] (runner) missing data for 1 dependencies
2016/01/27 17:14:26 [INFO] (runner) diffing and updating dependencies
2016/01/27 17:14:26 [DEBUG] (runner) "service(mongodb)" is still needed
2016/01/27 17:14:26 [DEBUG] (runner) "service(account_v1)" is still needed
2016/01/27 17:14:26 [INFO] (runner) watching 2 dependencies
2016/01/27 17:14:26 [DEBUG] ("service(mongodb)") Consul returned 1 services
2016/01/27 17:14:26 [DEBUG] ("service(mongodb)") 1 services after health check status filtering
2016/01/27 17:14:26 [INFO] (view) "service(mongodb)" received data
2016/01/27 17:14:26 [DEBUG] (view) "service(mongodb)" starting fetch
2016/01/27 17:14:26 [DEBUG] ("service(mongodb)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:15 WaitTime:1m0s Token: Near:}
2016/01/27 17:14:26 [DEBUG] (runner) receiving dependency "service(mongodb)"
2016/01/27 17:14:26 [INFO] (runner) running
2016/01/27 17:14:26 [DEBUG] (runner) checking template /var/template/haproxy.ctmpl
2016/01/27 17:14:26 [INFO] (runner) diffing and updating dependencies
2016/01/27 17:14:26 [DEBUG] (runner) "service(mongodb)" is still needed
2016/01/27 17:14:26 [DEBUG] (runner) "service(account_v1)" is still needed
2016/01/27 17:14:26 [INFO] (runner) watching 2 dependencies
2016/01/27 17:14:28 [INFO] (runner) quiescence minTimer fired for /var/template/haproxy.ctmpl
2016/01/27 17:14:28 [INFO] (runner) received template "/var/template/haproxy.ctmpl" from quiescence
2016/01/27 17:14:28 [INFO] (runner) running
2016/01/27 17:14:28 [DEBUG] (runner) checking template /var/template/haproxy.ctmpl
2016/01/27 17:14:28 [DEBUG] (runner) checking ctemplate &{Source:/var/template/haproxy.ctmpl Destination:/etc/haproxy/haproxy.cfg Command:haproxy -f /etc/haproxy/haproxy.cfg -sf $(pidof haproxy) & Perms:-rw-r--r-- Backup:false}
2016/01/27 17:14:28 [DEBUG] (runner) wouldRender: true, didRender: false
2016/01/27 17:14:28 [INFO] (runner) diffing and updating dependencies
2016/01/27 17:14:28 [DEBUG] (runner) "service(mongodb)" is still needed
2016/01/27 17:14:28 [DEBUG] (runner) "service(account_v1)" is still needed
2016/01/27 17:14:28 [DEBUG] (runner) enabling quiescence for "/var/template/haproxy.ctmpl"
2016/01/27 17:14:28 [INFO] (runner) watching 2 dependencies
2016/01/27 17:15:27 [DEBUG] ("service(mongodb)") Consul returned 1 services
2016/01/27 17:15:27 [DEBUG] ("service(mongodb)") 1 services after health check status filtering
2016/01/27 17:15:27 [DEBUG] (view) "service(mongodb)" no new data (index was the same)
2016/01/27 17:15:27 [DEBUG] ("service(mongodb)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:15 WaitTime:1m0s Token: Near:}
2016/01/27 17:15:27 [DEBUG] ("service(account_v1)") Consul returned 1 services
2016/01/27 17:15:27 [DEBUG] ("service(account_v1)") 1 services after health check status filtering
2016/01/27 17:15:27 [DEBUG] (view) "service(account_v1)" no new data (index was the same)
2016/01/27 17:15:27 [DEBUG] ("service(account_v1)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:15 WaitTime:1m0s Token: Near:}
2016/01/27 17:16:27 [DEBUG] ("service(mongodb)") Consul returned 1 services
2016/01/27 17:16:27 [DEBUG] ("service(mongodb)") 1 services after health check status filtering
2016/01/27 17:16:27 [DEBUG] (view) "service(mongodb)" no new data (index was the same)
2016/01/27 17:16:27 [DEBUG] ("service(mongodb)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:15 WaitTime:1m0s Token: Near:}
2016/01/27 17:16:28 [DEBUG] ("service(account_v1)") Consul returned 1 services
2016/01/27 17:16:28 [DEBUG] ("service(account_v1)") 1 services after health check status filtering
2016/01/27 17:16:28 [DEBUG] (view) "service(account_v1)" no new data (index was the same)
2016/01/27 17:16:28 [DEBUG] ("service(account_v1)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:15 WaitTime:1m0s Token: Near:}
2016/01/27 17:17:29 [DEBUG] ("service(mongodb)") Consul returned 1 services
2016/01/27 17:17:29 [DEBUG] ("service(mongodb)") 1 services after health check status filtering
2016/01/27 17:17:29 [DEBUG] (view) "service(mongodb)" no new data (index was the same)
2016/01/27 17:17:29 [DEBUG] ("service(mongodb)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:15 WaitTime:1m0s Token: Near:}
2016/01/27 17:17:31 [DEBUG] ("service(account_v1)") Consul returned 1 services
2016/01/27 17:17:31 [DEBUG] ("service(account_v1)") 1 services after health check status filtering
2016/01/27 17:17:31 [DEBUG] (view) "service(account_v1)" no new data (index was the same)
2016/01/27 17:17:31 [DEBUG] ("service(account_v1)") querying Consul with &{Datacenter: AllowStale:true RequireConsistent:false WaitIndex:15 WaitTime:1m0s Token: Near:}

The "Consul returned 1 services" is quite often, right? Is this normal even if my services are not changing at all? I don't know whether this is breaking my thrift sockets connections or something else!

P.S. 15263: Sorry for my bad English... it sucks! :)

@sethvargo
Contributor

Hi @brunomacf

That log output does not show any indication that Consul Template is restarting the process. When Consul Template executes the command, there will be a log line like "executing command "haproxy reload ..."", which I am not seeing in that output.

The template you posted above and the log output here also do not seem to match.

The log output from HAProxy also doesn't seem to indicate that the HAProxy process is restarting, but rather that it cannot connect to the given address and is retrying the connection.

@brunomacf
Author

@sethvargo I'm closing this issue because the problem was between the screen of my laptop and the chair I was sitting on (if you know what I mean)... In my HAProxy configuration I set a client timeout of 10s, so HAProxy was closing connections after that time. Thanks very much for the support, man! :)
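
For reference, the relevant knobs here are the idle timeouts in the defaults (or per-proxy) section. A sketch of the kind of change that addresses this for long-lived Thrift/TCP connections (values are illustrative):

defaults
    mode        tcp
    # Raise the idle timeouts so long-lived Thrift connections are not
    # closed by HAProxy after a few seconds of inactivity.
    timeout     client   1h
    timeout     server   1h
    timeout     connect  5s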
