Idle connections over overlay network end up in a broken state after 15 minutes #31208
@christopherchines With IPVS or any other man-in-the-middle NAT/firewall, the TCP keep-alive timer has to be tuned when you have "silent" long-lived sessions. I will add a note about this in the documentation. The connection would have been terminated if the TCP packet had been delivered to a different backend and resulted in a RST from that backend. But I guess what's happening here is that after the initial session expires, when IPVS gets a TCP packet that is not a SYN, it drops it instead of sending it to the backend. This makes sense because, from IPVS's point of view, it's a new TCP session that doesn't have the SYN bit set. |
@christopherobin How did you manage to work around this issue? I'm having the problem between my app, which creates a connection pool, and my DB. After being idle for 15 minutes the app is not able to reconnect. |
@GabKlein My current setup uses the following:
I took the values from https://access.redhat.com/solutions/23874 and tweaked them slightly for our setup. We haven't run into the issue since. To check if it's working you can use |
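A minimal sketch of keepalive tuning in the spirit of the linked Red Hat article, plus one way to verify it; the values and the verification command are illustrative assumptions, not necessarily what the commenter used:

```
# /etc/sysctl.d/90-tcp-keepalive.conf -- illustrative values, kept below
# the 900-second IPVS expiration discussed in this issue
net.ipv4.tcp_keepalive_time = 600    # first probe after 10 minutes idle
net.ipv4.tcp_keepalive_intvl = 30    # then re-probe every 30 seconds
net.ipv4.tcp_keepalive_probes = 10   # give up after 10 unanswered probes

# verify: established sockets should show a "timer:(keepalive,...)" entry
# ss -tno state established
```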
Thank you @christopherobin, I'm going to give it a shot. Is adding these settings as a sysctl file the best way? Do you have to reboot nodes or restart services to apply them? |
I'm using Ansible to provision my servers and it stores the variables in a sysctl drop-in file. If you are not rebooting you can create the files and reload sysctl manually. |
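Assuming the standard sysctl layout (a drop-in file under /etc/sysctl.d/, which is a guess at the path elided above), the settings can be applied without a reboot:

```
# re-read every /etc/sysctl.d/*.conf drop-in without rebooting
sudo sysctl --system

# or apply a single value immediately for testing
sudo sysctl -w net.ipv4.tcp_keepalive_time=600
```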
Pls check if this comment is applicable here. |
@mavenugo @christopherobin @GabKlein @sanimej Hey team, this Moby GitHub issue has no assignee; can someone give us an overview of where this is at? Thank you |
@christopherobin Thanks a lot! Tweaking those settings did the trick for us. |
We are facing this issue too. Here is a simple way to reproduce it with netcat.

server.yml:

```yaml
version: '3.2'
services:
  server:
    image: multicloud/netcat
    ports:
      - 9898
    command: -lp 9898
    deploy:
      mode: replicated
      replicas: 1
    stdin_open: true
    tty: true
    networks:
      - test-timeout
networks:
  test-timeout:
    external: true
```

client.yml:

```yaml
version: '3.2'
services:
  client:
    image: multicloud/netcat
    command: server 9898
    deploy:
      mode: replicated
      replicas: 1
    stdin_open: true
    tty: true
    networks:
      - test-timeout
networks:
  test-timeout:
    external: true
```
Tested with Ubuntu and CentOS. I think this problem is related to the default service discovery of swarm, because it does not occur in |
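A sketch of how to run this reproduction, assuming a swarm is already initialized; the stack names are illustrative, and since the client dials the DNS name server, the deployment names may need adjusting so that name resolves:

```
# create the attachable overlay network both files declare as external
docker network create --driver overlay --attachable test-timeout

# deploy both sides, then leave the nc connection idle for ~15 minutes
# before trying to send more data through it
docker stack deploy -c server.yml server
docker stack deploy -c client.yml client
```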
This problem is due to the IPVS kernel module. Look at this line: https://github.com/torvalds/linux/blob/master/net/netfilter/ipvs/ip_vs_proto_tcp.c#L366 The IPVS timeout for an established TCP connection defaults to 900 seconds (15 minutes). Compare that with the default kernel keepalive parameters, where the first probe is only sent after two hours (net.ipv4.tcp_keepalive_time = 7200), long after IPVS has already expired the entry. |
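For illustration, the IPVS timeouts can be inspected, and adjusted, from inside the relevant network namespace with ipvsadm; the namespace ID below is the example one from this issue, not a value to copy verbatim:

```
# show the current timeouts -- prints: Timeout (tcp tcpfin udp): 900 120 300
nsenter --net=/var/run/docker/netns/2cc18e502f81 ipvsadm -l --timeout

# raise the TCP established timeout to one hour (argument order: tcp tcpfin udp)
nsenter --net=/var/run/docker/netns/2cc18e502f81 ipvsadm --set 3600 120 300
```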
On the other hand, tuning kernel parameters like net.ipv4.tcp_keepalive_time only works around the problem from the endpoints' side. |
@christopherobin, @bm-skutzke, how were you able to set net.ipv4.tcp_keepalive_time? This is a namespaced kernel parameter and it looks like tuning sysctl parameters is not yet possible in Docker swarm mode: #25209, #33649. I tried to bake the parameters into my Docker images, which didn't work. I am asking because, based on your comments, it seems like both of you managed to pull that off in a Docker swarm mode setup somehow? |
@vassilvk I have been running my own VMs and bare-metal servers so I didn't run into your issue, and I'm not entirely sure what the best way is to do it for Docker on Windows. Baking the parameters into the Docker images themselves won't work (since the init in your container won't apply anything from those files and, like you said, they are namespaced), so you'll need to do it at the host level. I'd recommend opening an issue on https://github.com/linuxkit/linuxkit to have it baked into the default image, and maybe try to make your own image in the meantime. It might also be possible to set it by abusing a privileged container. |
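A sketch of that last idea, assuming a privileged container sharing the host's network namespace can write the host-level net sysctls (illustrative, untested on Docker for Windows' LinuxKit VM):

```
# run a throwaway privileged container in the host's network namespace
# and set the keepalive parameter there; net.* sysctls are per-netns,
# so --net=host makes this apply to the host
docker run --rm --privileged --net=host alpine \
  sysctl -w net.ipv4.tcp_keepalive_time=600
```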
Thanks @christopherobin - makes sense. |
We recently ran into this issue using Docker CE 18.03.1 on CentOS 7, with Swarm overlay networking and endpoint mode virtual IP (vip). Our workaround is to set the database service to endpoint mode dnsrr. My question is: Is anyone working on this issue, or do you have any other recommendations regarding workarounds? (Other than switching to Kubernetes.) Thanks in advance! |
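A minimal compose sketch of that workaround; the service name and image are made up for illustration:

```yaml
version: '3.3'   # deploy.endpoint_mode needs compose file format 3.3+
services:
  db:
    image: postgres:10            # hypothetical database service
    deploy:
      endpoint_mode: dnsrr        # resolve task IPs directly, bypassing the IPVS virtual IP
    networks:
      - backend
networks:
  backend:
    driver: overlay
```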
@ju-la-berger - I solved the issue on my end by using keep-alive for the application-level connection. This is protocol specific (I am using gRPC). If Netty HTTP supports keep-alive, maybe you can try that. |
Please refer to: #37466 (comment) and https://success.docker.com/article/ipvs-connection-timeout-issue |
Let me close this issue, with the comments above referring to solutions and to how to configure the relevant timeouts. |
WIP Pull request for setting sysctl for swarm services: #37701 / moby/swarmkit#2729 |
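For reference, a sketch of what that work enables; the --sysctl flag for services shipped after most of this thread (Docker 19.03+), so treat the syntax as an assumption on older engines:

```
# set the keepalive timer per service instead of per host
docker service create \
  --name app \
  --sysctl net.ipv4.tcp_keepalive_time=600 \
  nginx:alpine
```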
Description
In a swarm setup using overlay networks, idle connections between two services will end up in a broken state after 15 minutes.
The issue is related to the way the Docker overlay network routes packets: iptables first marks them, then IPVS forwards them to the right hosts. However, the default expiration for connections in IPVS is 900 seconds (see ipvsadm -l --timeout), after which IPVS stops forwarding packets even though the TCP connection still exists. If this happens, any new packet on the connection will try to go to the virtual IP for that service, which has no valid resolution, resulting in a broken state where the connection is stuck in limbo while the kernel forever tries to resolve that virtual IP.

Steps to reproduce the issue:
1. Create two services attached to the same overlay network.
2. docker exec in both of them; in one, start a nc command in listen mode, and in the other one connect to that nc server by using the service name DNS.
3. Wait for the IPVS entry to expire (15 minutes by default), then go to the proper netns and find your connection by doing nsenter --net=2cc18e502f81 ipvsadm -l.
4. Try to send data from the nc client: tcpdump shows lots of ARP packets going out.

Describe the results you received:
The packet never reaches the target; the kernel is stuck doing ARP requests over and over.
Describe the results you expected:
Either have the connection properly time out, or find a way to restore the routing in IPVS.
Additional information you deem important (e.g. issue happens only occasionally):
This can currently be worked around by setting net.ipv4.tcp_keepalive_time to less than 900 seconds, to make sure the TCP connection doesn't expire in IPVS, but I'm not sure it's a valid way to deal with this; at the very least this behavior should be documented.

Output of docker version:

Output of docker info:

Additional environment details (AWS, VirtualBox, physical, etc.):
My current test setup is 5 Vagrant boxes (2 managers + 3 workers), but it should happen in any environment.