portmap: delete UDP conntrack entries on teardown #123

Closed
squeed opened this issue Feb 20, 2018 · 15 comments

@squeed squeed commented Feb 20, 2018

As observed in kubernetes/kubernetes#59033, a quick teardown + spinup of portmappings can cause UDP "flows" to be lost, thanks to stale conntrack entries.

From the original issue:

  1. A server pod exposes a UDP host port.
  2. A client sends packets to the server pod through the host port. This creates a conntrack entry.
  3. The server pod's IP changes for whatever reason, such as the pod being recreated.
  4. Due to the nature of UDP and conntrack, new requests from the same client to the host port keep hitting the stale conntrack entry.
  5. The client observes a traffic black hole.
@squeed squeed added the bug label Feb 20, 2018

@brantburnett brantburnett commented Aug 24, 2018

I have found that this bug doesn't apply only to changing pod IPs. It can also occur when incoming traffic is being dropped because there is no pod, and a pod is then added that should start receiving the traffic.

Complete details and steps to reproduce: projectcalico/felix#1880

@vmendi vmendi commented Mar 21, 2019

We are also affected by this. We lose metrics whenever the Datadog agent restarts. Are there any plans to fix it? Is any workaround available?

Thanks

@Suckzoo Suckzoo commented Apr 9, 2019

It seems this issue has been open for a year. Is there any plan to fix it? Or could you let me know how to fix this problem myself?

@RohanKurane RohanKurane commented May 9, 2019

Hello,

I think I am hitting a similar issue.

I deployed a pod with a hostPort, created using type:portmap.
When I delete the pod and re-deploy the same pod with the same name, I get the following error:

Warning  FailedCreatePodSandBox  1m    kubelet, ip-xxxxxxxxx.ec2.internal  Failed create pod sandbox: rpc error: code = Unknown desc = failed to add hostport mapping for sandbox k8s_server_default_436fb151-71b1-11e9-b2d9-128d1a3304a4_0(7993b317dc75381c0ed08a75019fde6c2fe70aaf5697af6cc1f0e85c7afddfd6): cannot open hostport 50051 for pod k8s_server_default_436fb151-71b1-11e9-b2d9-128d1a3304a4_0_: listen tcp :50051: bind: address already in use

Is this the same issue? If so, is there a workaround until it is fixed?
I am using the CRI-O runtime, not Docker. I do not believe I saw this error when I used Docker.

I am using the following CNI plugins

     wget -qO- https://github.com/containernetworking/cni/releases/download/${CNI_VERSION}/cni-amd64-${CNI_VERSION}.tgz | bsdtar -xvf - -C /opt/cni/bin
     wget -qO- https://github.com/containernetworking/plugins/releases/download/${CNI_PLUGIN_VERSION}/cni-plugins-amd64-${CNI_PLUGIN_VERSION}.tgz | bsdtar -xvf - -C /opt/cni/bin

Thanks

@Simwar Simwar commented May 16, 2019

It seems this PR aimed to fix the issue: kubernetes/kubernetes#59286
But I believe this one is also needed so that the conntrack binary is installed in the right path: kubernetes/kubernetes#64640
On GKE, for example, you need to run the conntrack binary via the toolbox (toolbox conntrack -D -p udp).
The workaround there is to run toolbox conntrack -D -p udp after the pod is restarted to clean up the conntrack entry.

There is another workaround, although it is not ideal.
You can use an initContainer to run the conntrack command:

initContainers:
  - image: <conntrack-image>
    imagePullPolicy: IfNotPresent
    name: conntrack
    securityContext:
      allowPrivilegeEscalation: true
      capabilities:
        add: ["NET_ADMIN"]
    command: ['sh', '-c', 'conntrack -D -p udp']

You need to set hostNetwork: true for this to work, which is why it is not ideal.

@msiebuhr msiebuhr commented Oct 8, 2019

I've experimented a bit with this, and some small fixes are needed. In particular, conntrack exits with code 1 if no connections are deleted, causing the pod to get stuck in an init loop.

As to the container, I've used gcr.io/google-containers/toolbox:20190523-00 (overview here).

initContainers:
  - image: 'gcr.io/google-containers/toolbox:20190523-00'
    imagePullPolicy: 'IfNotPresent'
    name: 'conntrack'
    securityContext:
      allowPrivilegeEscalation: true
      capabilities:
        add: ["NET_ADMIN"]
    # conntrack exits with code 1 when nothing is deleted, so force success.
    command: ['sh', '-c', 'conntrack -D -p udp --dport 8125 || true']
I'm having an issue with statsd + Datadog, which uses UDP port 8125, so I'm only deleting entries for that port. I didn't need hostNetwork: true anywhere, but that may be a consequence of running on GKE.
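
The exit-code handling can also be written out explicitly rather than discarding the status altogether. This is a minimal sketch, assuming (as reported in this comment) that conntrack returns 1 when no entry was deleted. The names `flush_udp_port` and `CONNTRACK_BIN` are hypothetical and introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: delete UDP conntrack entries for one destination port.
# CONNTRACK_BIN is an assumed override so the binary can be swapped
# (for example, for a wrapper script); it defaults to plain `conntrack`.
flush_udp_port() {
    port="$1"
    rc=0
    "${CONNTRACK_BIN:-conntrack}" -D -p udp --dport "$port" || rc=$?
    # Exit code 1 is what conntrack is reported to return when nothing was
    # deleted, so treat it as success. This sketch cannot tell that case
    # apart from other conntrack failures that also return 1.
    if [ "$rc" -eq 0 ] || [ "$rc" -eq 1 ]; then
        return 0
    fi
    # Anything else (for example 127, command not found) is passed through.
    return "$rc"
}
```

Calling `flush_udp_port 8125` as the initContainer command would let the pod start when there was nothing to delete, while still failing visibly if the conntrack binary is missing from the image.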

@msiebuhr msiebuhr commented Oct 8, 2019

The above does seem to fix things for low-traffic client services, but our high-traffic services seem to get new conntrack entries set up between running the initContainer and (in our case) datadog-agent starting up.

If I SSH into the machine and manually run the conntrack command, things come back as expected.
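
One possible way to narrow that window is to repeat the delete several times while the agent is starting, instead of running it once in the initContainer. This is an untested assumption, not something confirmed in this thread, and it only shrinks the race rather than removing it. The names `flush_repeatedly` and `CONNTRACK_BIN`, and the attempt and interval parameters, are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: repeat the UDP conntrack delete several times.
# Arguments: port, number of attempts, seconds to sleep between attempts.
flush_repeatedly() {
    port="$1"
    attempts="$2"
    interval="$3"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Ignore the exit status: conntrack reports failure when no entry matched.
        "${CONNTRACK_BIN:-conntrack}" -D -p udp --dport "$port" || true
        i=$((i + 1))
        # Skip the sleep after the last attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `flush_repeatedly 8125 5 1` would issue five deletes about one second apart. It has to run from a process that has NET_ADMIN and access to the conntrack binary.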

@raghurampai raghurampai commented Apr 8, 2020

This issue still exists in the latest version. I encountered it with a DaemonSet in the same scenario described in the issue. The workaround does have the corner case mentioned by msiebuhr.

Has anyone been able to solve it some other way?

@AnishShah AnishShah commented Nov 17, 2020

I can work on this issue. Is it okay if we simply flush conntrack entries, as is done in kubernetes/kubernetes#59286? Are there any concerns?

@aojea aojea commented Nov 17, 2020

I have a PR for this; let me check something first.

@aojea aojea commented Nov 17, 2020

Never mind @AnishShah, you deserve the honor. I just saw the comment in the k/k issue.
Sorry.

@aojea aojea commented Nov 17, 2020

@AnishShah this is the approach I was taking: https://github.com/containernetworking/plugins/compare/master...aojea:conntrack?expand=1 . Feel free to reuse it or discard it.
@squeed can you help us here?

@AnishShah AnishShah commented Nov 18, 2020

@aojea Since you have already created the PR, you can go ahead with it.

@AnishShah AnishShah commented Nov 18, 2020

@aojea I had a brief look at your PR. I think we should delete UDP conntrack entries on both teardown and setup. Even in kubernetes/kubernetes#59286, it seems that conntrack entries are deleted when the hostport iptables rules are set up.

I tested your PR with UDP client and server pods:

apiVersion: v1
kind: Namespace
metadata:
  name: udp
---
apiVersion: v1
kind: Pod 
metadata:
  name: udp-server
  namespace: udp
spec:
    containers:
      - name: udp-server
        image: aardvarkx1/udp-server
        imagePullPolicy: Always
        ports:
          - containerPort: 10001
            protocol: UDP
            hostPort: 10001
            name: udp-test
---
apiVersion: v1
kind: Pod
metadata:
  name: udp-client
  namespace: udp
spec:
    containers:
      - name: udp-client
        image: aardvarkx1/udp-client
        imagePullPolicy: Always
        env:
          - name: SERVER_ADDRESS
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP

I noticed one conntrack entry:

$ sudo conntrack -L -p UDP --orig-port-dst 10001
udp      17 29 src=10.8.0.19 dst=10.128.0.71 sport=33371 dport=10001 [UNREPLIED] src=10.8.0.24 dst=10.128.0.71 sport=10001 dport=33371 mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 1 flow entries have been shown.

When I deleted udp-server, the conntrack entry changed to:

$ sudo conntrack -L -p UDP --orig-port-dst 10001
udp      17 28 src=10.8.0.19 dst=10.128.0.71 sport=33371 dport=10001 [UNREPLIED] src=10.128.0.71 dst=10.8.0.19 sport=10001 dport=33371 mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 1 flow entries have been shown.

Notice the change in the src/dst IPs for the reply direction. When I created the udp-server pod again, I saw no change in the conntrack entry; it stayed the same as above. I confirmed that the DNAT rules were created.

When I deleted this conntrack entry, I started seeing the correct conntrack entry again:

$ sudo conntrack -D -p UDP --orig-port-dst 10001
udp      17 28 src=10.8.0.19 dst=10.128.0.71 sport=33371 dport=10001 [UNREPLIED] src=10.128.0.71 dst=10.8.0.19 sport=10001 dport=33371 mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 1 flow entries have been deleted.

$ sudo conntrack -L -p UDP --orig-port-dst 10001
udp      17 29 src=10.8.0.19 dst=10.128.0.71 sport=33371 dport=10001 [UNREPLIED] src=10.8.0.25 dst=10.128.0.71 sport=10001 dport=33371 mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 1 flow entries have been shown.
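
The check done by eye above, comparing the source address of the reply tuple with the current pod IP, can be scripted. This is a minimal sketch that assumes IPv4 addresses and the `conntrack -L` line format shown in the outputs above. The function names `reply_src` and `is_stale` are hypothetical.

```shell
#!/bin/sh
# Print the source address of the reply tuple (the second src= field)
# from one line of `conntrack -L` output. IPv4 only.
reply_src() {
    printf '%s\n' "$1" | grep -o 'src=[0-9.]*' | sed -n '2p' | cut -d= -f2
}

# Succeed (exit 0) when the reply source differs from the expected pod IP,
# meaning the entry does not point at the current backend.
# Arguments: conntrack line, expected pod IP.
is_stale() {
    [ "$(reply_src "$1")" != "$2" ]
}
```

With the entry seen after udp-server was recreated, `reply_src` yields 10.128.0.71, so `is_stale "$line" 10.8.0.25` succeeds. For the entry seen after the manual delete it fails, because the reply source is already 10.8.0.25.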

@aojea aojea commented Nov 18, 2020

@AnishShah I was thinking that, since this is always local traffic on the same host, we only care about creation. When the new pod is created, it needs to flush the previous conntrack entries to take over that traffic. Flushing the entries on deletion doesn't matter: if no new pod is going to start listening on the exposed port, the traffic will not hit any pod anyway.
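
To make the "flush on creation" idea concrete, here is a rough sketch of which deletes a setup step would issue for a pod's port mappings. It is not the code in the linked branch. The function name `udp_flush_commands` and the "protocol:hostPort" input format are assumptions made for illustration, and the sketch only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical sketch: print one conntrack delete command per UDP host port.
# Input: port mappings as "protocol:hostPort" arguments (an assumed format).
udp_flush_commands() {
    for mapping in "$@"; do
        proto="${mapping%%:*}"
        port="${mapping#*:}"
        case "$proto" in
            udp|UDP)
                # Only UDP mappings are affected by the stale-entry problem
                # described in this issue.
                echo "conntrack -D -p udp --orig-port-dst $port"
                ;;
        esac
    done
}
```

For example, `udp_flush_commands tcp:50051 udp:10001` prints a single delete for port 10001 and skips the TCP mapping.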
