K3s - Traefik 2.2 UDP packetloss #7000

Closed
dragon2611 opened this issue Jul 5, 2020 · 7 comments
Labels: area/udp, kind/bug/possible (a possible bug that needs analysis before it is confirmed or fixed), status/5-frozen-due-to-age

dragon2611 commented Jul 5, 2020

What did you do?

Configured Traefik to listen for TeamSpeak 3 (UDP 9987, TCP 30033, TCP 10011) and Mumble (TCP/UDP 64738).
Traefik runs with host networking on the nodes, as there are no load balancers and I want Traefik to see the origin IP for some middlewares I use for other services.

What did you expect to see?

Traefik passes UDP connections through to the pod.

What did you see instead?

UDP connections succeed, but TeamSpeak reports between 5% and 20% packet loss. With Mumble you can connect, but there is no voice traffic.

It doesn't seem to make a difference whether you hit Traefik on the node running the pod or on a different one.

Output of traefik version: (What version of Traefik are you using?)

2.2.1

What is your environment & configuration (arguments, toml, provider, platform, ...)?

K3s v1.18.4+k3s1 running on 3 nodes (master + 2 workers, Calico + WireGuard networking)

Reconfiguring the service to use a NodePort instead of going through Traefik solves the packet loss issue.

rtribotte added the kind/bug/possible and area/udp labels and removed the status/0-needs-triage label on Jul 6, 2020
jbdoumenjou (Member) commented Jul 6, 2020

Hi @dragon2611,

Could you share your configuration and logs, and provide a reproducible example?

rtribotte added this to issues in v2 via automation Jul 6, 2020
dragon2611 (Author) commented Jul 6, 2020

Edit: Mumble doesn't show much in the way of diagnostic info; in fact, I didn't notice something was wrong until a friend reported there was no audio (it's a bit difficult to test that without a second client connected).

TS3 shows the packet loss in the client info of a user connected through Traefik. If it helps, the loss is in the outbound direction (i.e. from the server back towards the client); the inbound direction seems fine.
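
For anyone trying to reproduce this without a TeamSpeak client, the loss can be measured with a small probe. The following is only a sketch, assuming you can run a UDP echo responder behind the Traefik entry point in place of the TS3 server (TeamSpeak itself will not echo arbitrary datagrams). The names `run_echo_server` and `measure_loss` are made up for this example and are not part of Traefik or TeamSpeak.

```python
import socket
import struct
import threading


def run_echo_server(host="127.0.0.1", port=0):
    """Start a UDP echo responder in a daemon thread; return (socket, bound_port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))

    def loop():
        while True:
            try:
                data, addr = sock.recvfrom(2048)
                sock.sendto(data, addr)
            except OSError:
                return  # socket was closed

    threading.Thread(target=loop, daemon=True).start()
    return sock, sock.getsockname()[1]


def measure_loss(host, port, count=200, timeout=0.5):
    """Send numbered datagrams one at a time; return the fraction never echoed back."""
    echoed = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        for seq in range(count):
            payload = struct.pack("!I", seq)
            try:
                client.sendto(payload, (host, port))
                # Skip late echoes of earlier probes until ours arrives.
                while client.recvfrom(2048)[0] != payload:
                    pass
                echoed += 1
            except OSError:
                pass  # timeout or ICMP error: count this probe as lost
    return 1 - echoed / count


if __name__ == "__main__":
    # Loopback self-test; point measure_loss at the Traefik entry point instead.
    server, port = run_echo_server()
    print(f"round-trip loss: {measure_loss('127.0.0.1', port, count=50):.1%}")
    server.close()
```

Note that this measures round-trip loss, so it cannot tell the outbound direction from the inbound one the way the TS3 client statistics do. Running it once through Traefik and once through the NodePort should still show whether the proxy path is the one dropping packets.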

The TS config. Sadly it predates me putting it in git, so I just commented out the NodePort and uncommented the ingress routes.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ts
  labels:
   app: ts
spec:
  serviceName: ts
  replicas: 1
  selector:
    matchLabels:
      app: ts
  template:
    metadata:
      labels:
        app: ts
    spec:
      containers:
      - name: ts
        image: dragon2611/mystuff:teamspeak-latest
        volumeMounts:
        - name: ts
          mountPath: /data
        env:
        - name: TS3SERVER_LICENSE
          value: accept

      imagePullSecrets:
      - name: dragon-dockerhub
      nodeSelector:
        nodetype: worker
        location: de
      volumes:
      - name: ts
        persistentVolumeClaim:
           claimName: ts

---

apiVersion: v1
kind: Service
metadata:
  name: ts
spec:
#  type: NodePort
  selector:
    app: ts
  ports:
  - name: tstcp1
    protocol: TCP
    port: 30033
    targetPort: 30033
#    nodePort: 30033
  - name: tstcp2
    protocol: TCP
    port: 10011
    targetPort: 10011
#    nodePort: 10011
  - name: tsudp
    protocol: UDP
    port: 9987
    targetPort: 9987
 #   nodePort: 9987

---
kind: IngressRouteUDP
apiVersion: traefik.containo.us/v1alpha1
metadata:
 name: tsudp
spec:
  entryPoints:
   - tsudp1
  routes:
  - services:
    - name: ts
      port: 9987
---

kind: IngressRouteTCP
apiVersion: traefik.containo.us/v1alpha1
metadata:
 name: tstcp1
spec:
  entryPoints:
   - tstcp1
  routes:
  - match: HostSNI(`*`)
    services:
    - name: ts
      port: 30033
---

kind: IngressRouteTCP
apiVersion: traefik.containo.us/v1alpha1
metadata:
 name: tstcp2
spec:
  entryPoints:
   - tstcp2
  routes:
  - match: HostSNI(`*`)
    services:
    - name: ts
      port: 10011

My Traefik 2.2 config

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ingressroutes.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: IngressRoute
    plural: ingressroutes
    singular: ingressroute
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: middlewares.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: Middleware
    plural: middlewares
    singular: middleware
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ingressroutetcps.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: IngressRouteTCP
    plural: ingressroutetcps
    singular: ingressroutetcp
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: ingressrouteudps.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: IngressRouteUDP
    plural: ingressrouteudps
    singular: ingressrouteudp
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: tlsoptions.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: TLSOption
    plural: tlsoptions
    singular: tlsoption
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: tlsstores.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: TLSStore
    plural: tlsstores
    singular: tlsstore
  scope: Namespaced

---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: traefikservices.traefik.containo.us

spec:
  group: traefik.containo.us
  version: v1alpha1
  names:
    kind: TraefikService
    plural: traefikservices
    singular: traefikservice
  scope: Namespaced

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: traefik-ingress-controller

rules:
  - apiGroups:
      - ""
    resources:
      - services
      - endpoints
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
    resources:
      - ingresses/status
    verbs:
      - update
  - apiGroups:
      - traefik.containo.us
    resources:
      - middlewares
      - ingressroutes
      - traefikservices
      - ingressroutetcps
      - ingressrouteudps
      - tlsoptions
      - tlsstores
    verbs:
      - get
      - list
      - watch

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: traefik-ingress-controller

roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: traefik-ingress-controller
subjects:
  - kind: ServiceAccount
    name: traefik-ingress-controller
    namespace: traefik

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: traefik-ingress-controller
  namespace: traefik
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: traefik-ingress-controller
  namespace: traefik
  labels:
    k8s-app: traefik-ingress-lb
spec:
  selector:
    matchLabels:
      k8s-app: traefik-ingress-lb
      name: traefik-ingress-lb
  template:
    metadata:
      labels:
        k8s-app: traefik-ingress-lb
        name: traefik-ingress-lb
    spec:
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 60
      hostNetwork: true
      containers:
      - image: traefik:v2.2
        name: traefik-ingress-lb
        ports:
        - name: http
          containerPort: 80
          hostPort: 80
        - name: https
          containerPort: 443
          hostPort: 443
        - name: admin
          containerPort: 8080
          hostPort: 8080
        securityContext:
          capabilities:
            drop:
            - ALL
            add:
            - NET_BIND_SERVICE
        args:
        - --api
        - --serversTransport.insecureSkipVerify=true
        - --api.dashboard=true
        - --providers.kubernetescrd
        - --log.Level=DEBUG
        - --entryPoints.websecure.address=:443
        - --entryPoints.web.address=:80
        - --entryPoints.tstcp1.address=:30033
        - --entryPoints.tstcp2.address=:10011
        - --entryPoints.tsudp1.address=:9987/udp
#        - --entryPoints.mumbletcp.address=:64738
#        - --entryPoints.mumbleudp.address=:64738/udp

---
kind: IngressRoute
apiVersion: traefik.containo.us/v1alpha1
metadata:
  name: traefik-dashboard
  namespace: traefik
spec:
  entryPoints:
    - websecure
  routes:
  - match: Host(`traefik.mydomain.net`)
    kind: Rule
    services:
    - name: api@internal
      kind: TraefikService
      port: 8080
    middlewares:
    - name: common-ipwhitelist
      namespace: default
  tls:
   secretName: fbeast-wildcard
   namespace: default
---

TS3 server log

2020-07-06 21:13:41.774843|INFO    |ServerLibPriv |   |TeamSpeak 3 Server 3.12.1 (2020-03-27 10:38:47)
2020-07-06 21:13:41.818289|INFO    |ServerLibPriv |   |SystemInformation: Linux 5.4.0-40-generic #44-Ubuntu SMP Tue Jun 23 00:01:04 UTC 2020 x86_64 Binary: 64bit
2020-07-06 21:13:41.818357|WARNING |ServerLibPriv |   |The system locale is set to "C" this can cause unexpected behavior. We advice you to repair your locale!
2020-07-06 21:13:41.818368|INFO    |ServerLibPriv |   |Using hardware aes
2020-07-06 21:13:41.818908|INFO    |DatabaseQuery |   |dbPlugin name:    SQLite3 plugin, Version 3, (c)TeamSpeak Systems GmbH
2020-07-06 21:13:41.818932|INFO    |DatabaseQuery |   |dbPlugin version: 3.11.1
2020-07-06 21:13:41.854663|INFO    |DatabaseQuery |   |checking database integrity (may take a while)
2020-07-06 21:13:47.014131|WARNING |Accounting    |   |Unable to open licensekey.dat, falling back to limited functionality
2020-07-06 21:13:47.014614|INFO    |Accounting    |   |Licensing Information
2020-07-06 21:13:47.014644|INFO    |Accounting    |   |licensed to       : Anonymous
2020-07-06 21:13:47.014657|INFO    |Accounting    |   |type              : No License
2020-07-06 21:13:47.014678|INFO    |Accounting    |   |starting date     : Sat Feb  1 00:00:00 2020
2020-07-06 21:13:47.014693|INFO    |Accounting    |   |ending date       : Mon Feb  1 00:00:00 2021
2020-07-06 21:13:47.014709|INFO    |Accounting    |   |max virtualservers: 1
2020-07-06 21:13:47.014720|INFO    |Accounting    |   |max slots         : 32
2020-07-06 21:13:47.775229|INFO    |              |   |Puzzle precompute time: 741
2020-07-06 21:13:47.775932|INFO    |FileManager   |   |listening on 0.0.0.0:30033, [::]:30033
2020-07-06 21:13:47.977495|INFO    |VirtualServerBase|1  |listening on 0.0.0.0:9987, [::]:9987
2020-07-06 21:13:47.978911|INFO    |Query         |   |listening for query on 0.0.0.0:10011, [::]:10011
2020-07-06 21:13:47.979296|INFO    |              |   |creating QUERY_SSH_RSA_HOST_KEY file: ssh_host_rsa_key
2020-07-06 21:13:49.910033|INFO    |Query         |   |listening for ssh query on 0.0.0.0:10022, [::]:10022
2020-07-06 21:13:49.910195|INFO    |CIDRManager   |   |updated query_ip_whitelist ips: 127.0.0.1/32, ::1/128,
2020-07-06 21:13:53.804365|INFO    |VirtualServerBase|1  |client connected 'dragon2611'(id:157) from 10.42.157.12:43159
2020-07-06 21:13:58.635384|INFO    |VirtualServerBase|1  |file download from (id:0), '/icon_3150910763' by client 'dragon2611'(id:157)
2020-07-06 21:13:58.952599|INFO    |              |   |Error opening file "files/virtualserver_1/internal/icons/icon_3150910763": No such file or directory

Traefik logs from the pod I believe I was hitting

These logs from one of the Traefik pods might be useful. I can grab all of the logs, but would probably need to upload them somewhere.

Note that I removed and redeployed TS and Traefik. I suspect the middleware error is due to that and I need to re-apply it, although that middleware isn't used in either the TS or Mumble configs.

time="2020-07-06T21:13:28Z" level=error msg="middleware \"default-common-ipwhitelist@kubernetescrd\" does not exist" entryPointName=websecure routerName=traefik-traefik-dashboard-cf352835b92177dcc686@kubernetescrd
time="2020-07-06T21:13:28Z" level=debug msg="Creating middleware" middlewareName=traefik-internal-recovery middlewareType=Recovery entryPointName=websecure
time="2020-07-06T21:13:28Z" level=debug msg="No default certificate, generating one"
time="2020-07-06T21:13:29Z" level=error msg="the service \"default-tstcp2-673acf455cb2dab0b43a@kubernetescrd\" does not exist" entryPointName=tstcp2 routerName=default-tstcp2-673acf455cb2dab0b43a@kubernetescrd
time="2020-07-06T21:13:29Z" level=error msg="the service \"default-tstcp1-673acf455cb2dab0b43a@kubernetescrd\" does not exist" entryPointName=tstcp1 routerName=default-tstcp1-673acf455cb2dab0b43a@kubernetescrd
time="2020-07-06T21:13:29Z" level=error msg="the udp service \"default-tsudp-0@kubernetescrd\" does not exist" entryPointName=tsudp1 routerName=default-tsudp-0@kubernetescrd
time="2020-07-06T21:13:42Z" level=error msg="Error configuring TLS: secret traefik/fbeast-wildcard does not exist" ingress=traefik-dashboard namespace=traefik providerName=kubernetescrd
time="2020-07-06T21:13:42Z" level=debug msg="Configuration received from provider kubernetescrd: {\"http\":{\"routers\":{\"traefik-traefik-dashboard-cf352835b92177dcc686\":{\"entryPoints\":[\"websecure\"],\"middlewares\":[\"default-common-ipwhitelist\"],\"service\":\"api@internal\",\"rule\":\"Host(`traefik.flying-beast.net`)\",\"tls\":{}}}},\"tcp\":{\"routers\":{\"default-tstcp1-673acf455cb2dab0b43a\":{\"entryPoints\":[\"tstcp1\"],\"service\":\"default-tstcp1-673acf455cb2dab0b43a\",\"rule\":\"HostSNI(`*`)\"},\"default-tstcp2-673acf455cb2dab0b43a\":{\"entryPoints\":[\"tstcp2\"],\"service\":\"default-tstcp2-673acf455cb2dab0b43a\",\"rule\":\"HostSNI(`*`)\"}},\"services\":{\"default-tstcp1-673acf455cb2dab0b43a\":{\"loadBalancer\":{\"servers\":[{\"address\":\"10.42.107.45:30033\"}]}},\"default-tstcp2-673acf455cb2dab0b43a\":{\"loadBalancer\":{\"servers\":[{\"address\":\"10.42.107.45:10011\"}]}}}},\"udp\":{\"routers\":{\"default-tsudp-0\":{\"entryPoints\":[\"tsudp1\"],\"service\":\"default-tsudp-0\"}},\"services\":{\"default-tsudp-0\":{\"loadBalancer\":{\"servers\":[{\"address\":\"10.42.107.45:9987\"}]}}}},\"tls\":{}}" providerName=kubernetescrd
time="2020-07-06T21:13:42Z" level=debug msg="Middleware name not found in config (ResponseModifier)" entryPointName=websecure routerName=traefik-traefik-dashboard-cf352835b92177dcc686@kubernetescrd middlewareName=default-common-ipwhitelist@kubernetescrd middlewareType=undefined
time="2020-07-06T21:13:42Z" level=debug msg="Added outgoing tracing middleware api@internal" entryPointName=websecure routerName=traefik-traefik-dashboard-cf352835b92177dcc686@kubernetescrd middlewareName=tracing middlewareType=TracingForwarder
time="2020-07-06T21:13:42Z" level=error msg="middleware \"default-common-ipwhitelist@kubernetescrd\" does not exist" entryPointName=websecure routerName=traefik-traefik-dashboard-cf352835b92177dcc686@kubernetescrd
time="2020-07-06T21:13:42Z" level=debug msg="Creating middleware" middlewareType=Recovery entryPointName=websecure middlewareName=traefik-internal-recovery
time="2020-07-06T21:13:42Z" level=debug msg="No default certificate, generating one"
time="2020-07-06T21:13:42Z" level=debug msg="Creating TCP server 0 at 10.42.107.45:10011" serviceName=default-tstcp2-673acf455cb2dab0b43a serverName=0 entryPointName=tstcp2 routerName=default-tstcp2-673acf455cb2dab0b43a@kubernetescrd
time="2020-07-06T21:13:42Z" level=debug msg="Adding route * on TCP" entryPointName=tstcp2 routerName=default-tstcp2-673acf455cb2dab0b43a@kubernetescrd
time="2020-07-06T21:13:42Z" level=debug msg="Creating TCP server 0 at 10.42.107.45:30033" entryPointName=tstcp1 routerName=default-tstcp1-673acf455cb2dab0b43a@kubernetescrd serviceName=default-tstcp1-673acf455cb2dab0b43a serverName=0
time="2020-07-06T21:13:42Z" level=debug msg="Adding route * on TCP" entryPointName=tstcp1 routerName=default-tstcp1-673acf455cb2dab0b43a@kubernetescrd
time="2020-07-06T21:13:42Z" level=debug msg="Creating UDP server 0 at 10.42.107.45:9987" entryPointName=tsudp1 routerName=default-tsudp-0@kubernetescrd serverName=0 serviceName=default-tsudp-0
time="2020-07-06T21:13:43Z" level=debug msg="Handling connection from 154.x1.1xx.xx:51016"
time="2020-07-06T21:13:43Z" level=error msg="Error while serving UDP: read udp 10.42.157.12:35496->10.42.107.45:9987: read: connection refused"
time="2020-07-06T21:13:43Z" level=debug msg="Error while terminating connection: close udp 10.42.157.12:35496->10.42.107.45:9987: use of closed network connection"
time="2020-07-06T21:13:44Z" level=debug msg="Handling connection from 154.x1.1xx.xx:51016"
time="2020-07-06T21:13:44Z" level=error msg="Error while serving UDP: read udp 10.42.157.12:43814->10.42.107.45:9987: read: connection refused"
time="2020-07-06T21:13:44Z" level=debug msg="Error while terminating connection: close udp 10.42.157.12:43814->10.42.107.45:9987: use of closed network connection"
time="2020-07-06T21:13:44Z" level=debug msg="Handling connection from 154.x1.1xx.xx:51016"
time="2020-07-06T21:13:44Z" level=error msg="Error while serving UDP: read udp 10.42.157.12:48734->10.42.107.45:9987: read: connection refused"
time="2020-07-06T21:13:44Z" level=debug msg="Error while terminating connection: close udp 10.42.157.12:48734->10.42.107.45:9987: use of closed network connection"
time="2020-07-06T21:13:45Z" level=debug msg="Handling connection from 154.x1.1xx.xx:51016"
time="2020-07-06T21:13:45Z" level=error msg="Error while serving UDP: read udp 10.42.157.12:35323->10.42.107.45:9987: read: connection refused"
time="2020-07-06T21:13:45Z" level=debug msg="Error while terminating connection: close udp 10.42.157.12:35323->10.42.107.45:9987: use of closed network connection"
time="2020-07-06T21:13:47Z" level=debug msg="Handling connection from 154.x1.1xx.xx:51016"
time="2020-07-06T21:13:47Z" level=error msg="Error while serving UDP: read udp 10.42.157.12:33376->10.42.107.45:9987: read: connection refused"
time="2020-07-06T21:13:47Z" level=debug msg="Error while terminating connection: close udp 10.42.157.12:33376->10.42.107.45:9987: use of closed network connection"
time="2020-07-06T21:13:47Z" level=debug msg="Handling connection from 154.x1.1xx.xx:51016"

sharknoon commented

I have the same problem. I was very excited when Traefik finally announced UDP support, but I wasn't able to use it. My TeamSpeak server (running on Docker) had very high packet loss, and I was forced to move the traffic back to the bridge network.

This is my very basic Traefik configuration:

traefik.yml

entryPoints:
  teamspeak-voice:
    address: ":9987/udp"

dynamic_conf.yml

udp:
  routers:
    teamspeak-voice:
      entryPoints:
        - "teamspeak-voice"
      service: teamspeak-voice@file

  services:
    teamspeak-voice:
      loadBalancer:
        servers:
          - address: "teamspeak:9987"

I have the official TeamSpeak container, named teamspeak, running with no additional configuration in the same network (traefik-net). I have also exposed port 9987/udp.

The client connects to the server successfully, but at some moments the traffic cuts off completely and at others it stutters.

After I reverted to the direct bridge network, all of these problems went away.

SantoDE (Collaborator) commented Oct 13, 2020

Hey all,

I tried to reproduce your issue on a fresh DigitalOcean droplet with the following docker-compose file:

version: "3.7"
services:
  teamspeak:
    image: mbentley/teamspeak
    environment:
      TS3SERVER_LICENSE: accept
    labels:
      - "traefik.udp.routers.ts3.entrypoints=ts3"
      - "traefik.enable=true"

  traefik:
    image: traefik:v2.3
    restart: always
    command:
      - "--log.level=DEBUG"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.traefik.address=:8080"
      - "--entrypoints.ts3.address=:9987/udp"
    ports:
      - "80:80"
      - "8080:8080"
      - "9987:9987/udp"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"

However, I couldn't. I tested with a group of 4 people and the connection stayed stable for all of us throughout the test. I guess we will need more information or another reproducible example.

traefiker (Contributor) commented

Hi! I'm Træfiker 🤖 the bot in charge of tidying up the issues.

I have to close this one because of its lack of activity 😞

Feel free to re-open it or join our Community Forum.

v2 automation moved this from issues to Done Dec 13, 2020
BlackKadabra commented

@SantoDE that's expected, because you used Traefik 2.3 and the author uses 2.2.1.

Version 2.2.2 has two UDP fixes:
[udp] Fix mem leak on UDP connections (#6815 by ddtmachado)
[udp] Avoid overwriting already received UDP messages (#6797 by cbachert)

This probably fixes the problem the author encountered; he just needs to update Traefik (tested yesterday).
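
For readers wondering how "overwriting already received UDP messages" could show up as packet loss, here is a small illustration of that general class of bug. It is only a sketch under assumptions: I have not checked what #6797 actually changed, Traefik is written in Go rather than Python, and the function names below are invented for the example. The idea is that a read loop which queues a view into one reused receive buffer can clobber a datagram before the consumer has read it.

```python
import queue


def read_loop_shared(datagrams, out):
    """Risky pattern: queue a view into one receive buffer that gets reused."""
    buf = bytearray(1500)
    for datagram in datagrams:
        n = len(datagram)
        memoryview(buf)[:n] = datagram      # stands in for sock.recv_into(buf)
        out.put(memoryview(buf)[:n])        # no copy: still aliases buf


def read_loop_copy(datagrams, out):
    """Safer pattern: hand each consumer its own copy of the payload."""
    buf = bytearray(1500)
    for datagram in datagrams:
        n = len(datagram)
        memoryview(buf)[:n] = datagram
        out.put(bytes(buf[:n]))             # private copy, safe to reuse buf


def drain(out):
    """Read everything queued so far, as a slow consumer would."""
    items = []
    while not out.empty():
        items.append(bytes(out.get()))
    return items


if __name__ == "__main__":
    packets = [b"AAAA", b"BBBB", b"CCCC"]
    for loop in (read_loop_shared, read_loop_copy):
        q = queue.Queue()
        loop(packets, q)
        print(loop.__name__, drain(q))
```

With the shared buffer, the slow consumer sees the last payload three times, so the first two datagrams are effectively lost; with the copy, all three arrive intact. To a voice client, lost or duplicated payloads like this would look much like packet loss.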

dragon2611 (Author) commented

I'll check which version I'm running, as I've updated since opening this, and re-enable the UDP listener.

traefik locked and limited conversation to collaborators Jan 31, 2021