Integrity check fails on passwords containing $ [was: Auth server return bad url] #102

Closed
vipcxj opened this issue Sep 22, 2023 · 24 comments
Labels: priority: high, type: bug

vipcxj commented Sep 22, 2023

The auth server returns this ICE config; note the broken urls entry (the scheme and transport are missing):

{
  "iceServers":[
      {
        "username":"1695376501:user1",
        "credential":"EPddI2tMN9vtfGMhup1RYE5nSkA=",
        "urls":[":192.168.0.247:3478?transport="]
      }
  ],
  "iceTransportPolicy":"all"
}

Here is the log of the auth server:

2023-09-22T08:53:57.132824604Z	LEVEL(-2)	configmap-controller	reset ConfigMap store	{"configs": "store (1 objects): {version=\"v1alpha1\",admin:{name=\"stunner-daemon\",logLevel=\"all:INFO\",health-check=\"http://0.0.0.0:8086\"},auth:{realm=\"stunner.l7mp.io\",type=\"longterm\",shared-secret=\"<SECRET>\"},listeners=[\"stunner/owt-udp-gateway/owt-udp-listener\":{://$STUNNER_ADDR:3478?transport=<32768-65535>,public=-:-,cert/key=-/-,routes=[]}],clusters=[]}"}
2023-09-22T08:53:57.13284211Z	LEVEL(-5)	ctrl-runtime	Reconcile successful	{"controller": "configmap", "object": {"name":"stunnerd-config","namespace":"stunner"}, "namespace": "stunner", "name": "stunnerd-config", "reconcileID": "a52e5f73-3bf8-4397-b1b6-24f9a6c190f0"}
2023-09-22T08:53:57.393834523Z	LEVEL(-5)	ctrl-runtime	Reconciling	{"controller": "configmap", "object": {"name":"stunnerd-config","namespace":"stunner"}, "namespace": "stunner", "name": "stunnerd-config", "reconcileID": "89fb25cf-7e77-4f27-8260-243e2f33596c"}
2023-09-22T08:53:57.393856977Z	INFO	configmap-controller	reconciling	{"gateway-config": "stunner/stunnerd-config"}
2023-09-22T08:53:57.393954362Z	LEVEL(-2)	configmap-controller	reset ConfigMap store	{"configs": "store (1 objects): {version=\"v1alpha1\",admin:{name=\"stunner-daemon\",logLevel=\"all:INFO\",health-check=\"http://0.0.0.0:8086\"},auth:{realm=\"stunner.l7mp.io\",type=\"longterm\",shared-secret=\"<SECRET>\"},listeners=[\"stunner/owt-udp-gateway/owt-udp-listener\":{://$STUNNER_ADDR:31768?transport=<32768-65535>,public=-:31768,cert/key=-/-,routes=[]}],clusters=[]}"}
2023-09-22T08:53:57.393975489Z	LEVEL(-5)	ctrl-runtime	Reconcile successful	{"controller": "configmap", "object": {"name":"stunnerd-config","namespace":"stunner"}, "namespace": "stunner", "name": "stunnerd-config", "reconcileID": "89fb25cf-7e77-4f27-8260-243e2f33596c"}
2023-09-22T08:54:01.829351338Z	LEVEL(-5)	ctrl-runtime	Reconciling	{"controller": "configmap", "object": {"name":"stunnerd-config","namespace":"stunner"}, "namespace": "stunner", "name": "stunnerd-config", "reconcileID": "d49b5ba7-3cb6-41bf-8378-1a6a0d3eea89"}
2023-09-22T08:54:01.829441931Z	INFO	configmap-controller	reconciling	{"gateway-config": "stunner/stunnerd-config"}
2023-09-22T08:54:01.829587518Z	LEVEL(-2)	configmap-controller	reset ConfigMap store	{"configs": "store (1 objects): {version=\"v1alpha1\",admin:{name=\"stunner-daemon\",logLevel=\"all:INFO\",health-check=\"http://0.0.0.0:8086\"},auth:{realm=\"stunner.l7mp.io\",type=\"longterm\",shared-secret=\"<SECRET>\"},listeners=[\"stunner/owt-udp-gateway/owt-udp-listener\":{://192.168.0.247:3478?transport=<32768-65535>,public=192.168.0.247:3478,cert/key=-/-,routes=[]}],clusters=[]}"}
2023-09-22T08:54:01.82960411Z	LEVEL(-5)	ctrl-runtime	Reconcile successful	{"controller": "configmap", "object": {"name":"stunnerd-config","namespace":"stunner"}, "namespace": "stunner", "name": "stunnerd-config", "reconcileID": "d49b5ba7-3cb6-41bf-8378-1a6a0d3eea89"}
2023-09-22T08:54:56.920937223Z	INFO	handler	GetIceAuth: serving ICE config request	{"params": {"service":"turn","username":"user1","ttl":3600}}
2023-09-22T08:54:56.920979416Z	DEBUG	handler	getIceServerConf: serving ICE config request	{"params": {"service":"turn","username":"user1","ttl":3600}}
2023-09-22T08:54:56.920985649Z	DEBUG	handler	getIceServerConfForStunnerConf: considering Stunner config	{"stunner-config": "{version=\"v1alpha1\",admin:{name=\"stunner-daemon\",logLevel=\"all:INFO\",health-check=\"http://0.0.0.0:8086\"},auth:{realm=\"stunner.l7mp.io\",type=\"longterm\",shared-secret=\"<SECRET>\"},listeners=[\"stunner/owt-udp-gateway/owt-udp-listener\":{://192.168.0.247:3478?transport=<32768-65535>,public=192.168.0.247:3478,cert/key=-/-,routes=[]}],clusters=[]}", "params": {"service":"turn","username":"user1","ttl":3600}}
2023-09-22T08:54:56.921018655Z	DEBUG	handler	considering Listener	{"namespace": "stunner", "gateway": "owt-udp-gateway", "listener": "owt-udp-listener"}
2023-09-22T08:54:56.921031429Z	DEBUG	handler	getIceServerConfForStunnerConf: ready	{"repsonse": {"credential":"Q8ara2lUtQ8/vvKSlqAoXVW1bH8=","urls":[":192.168.0.247:3478?transport="],"username":"1695376496:user1"}}
2023-09-22T08:54:56.921098868Z	DEBUG	handler	getIceServerConf: ready	{"repsonse": {"iceServers":[{"credential":"Q8ara2lUtQ8/vvKSlqAoXVW1bH8=","urls":[":192.168.0.247:3478?transport="],"username":"1695376496:user1"}],"iceTransportPolicy":"all"}}
2023-09-22T08:54:56.921109065Z	INFO	handler	GetIceAuth: ready	{"response": {"iceServers":[{"credential":"Q8ara2lUtQ8/vvKSlqAoXVW1bH8=","urls":[":192.168.0.247:3478?transport="],"username":"1695376496:user1"}],"iceTransportPolicy":"all"}, "status": 200}
2023-09-22T08:55:01.455624663Z	INFO	handler	GetIceAuth: serving ICE config request	{"params": {"service":"turn","username":"user1","ttl":3600}}
2023-09-22T08:55:01.455658526Z	DEBUG	handler	getIceServerConf: serving ICE config request	{"params": {"service":"turn","username":"user1","ttl":3600}}
2023-09-22T08:55:01.45566384Z	DEBUG	handler	getIceServerConfForStunnerConf: considering Stunner config	{"stunner-config": "{version=\"v1alpha1\",admin:{name=\"stunner-daemon\",logLevel=\"all:INFO\",health-check=\"http://0.0.0.0:8086\"},auth:{realm=\"stunner.l7mp.io\",type=\"longterm\",shared-secret=\"<SECRET>\"},listeners=[\"stunner/owt-udp-gateway/owt-udp-listener\":{://192.168.0.247:3478?transport=<32768-65535>,public=192.168.0.247:3478,cert/key=-/-,routes=[]}],clusters=[]}", "params": {"service":"turn","username":"user1","ttl":3600}}
2023-09-22T08:55:01.455689067Z	DEBUG	handler	considering Listener	{"namespace": "stunner", "gateway": "owt-udp-gateway", "listener": "owt-udp-listener"}
2023-09-22T08:55:01.455701758Z	DEBUG	handler	getIceServerConfForStunnerConf: ready	{"repsonse": {"credential":"EPddI2tMN9vtfGMhup1RYE5nSkA=","urls":[":192.168.0.247:3478?transport="],"username":"1695376501:user1"}}
2023-09-22T08:55:01.455755771Z	DEBUG	handler	getIceServerConf: ready	{"repsonse": {"iceServers":[{"credential":"EPddI2tMN9vtfGMhup1RYE5nSkA=","urls":[":192.168.0.247:3478?transport="],"username":"1695376501:user1"}],"iceTransportPolicy":"all"}}
2023-09-22T08:55:01.455762762Z	INFO	handler	GetIceAuth: ready	{"response": {"iceServers":[{"credential":"EPddI2tMN9vtfGMhup1RYE5nSkA=","urls":[":192.168.0.247:3478?transport="],"username":"1695376501:user1"}],"iceTransportPolicy":"all"}, "status": 200}

Here is the log of the stunner pod:

03:29:56.051942 main.go:82: stunnerd INFO: watching configuration file at "/etc/stunnerd/stunnerd.conf"
03:29:56.052247 reconcile.go:113: stunner INFO: setting loglevel to "all:INFO"
03:29:56.052280 reconcile.go:141: stunner WARNING: running with no listeners
03:29:56.052395 reconcile.go:157: stunner WARNING: running with no clusters: all traffic will be dropped
03:29:56.052409 reconcile.go:177: stunner INFO: reconciliation ready: new objects: 2, changed objects: 0, deleted objects: 0, started objects: 0, restarted objects: 0
03:29:56.052423 reconcile.go:181: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: plaintext, listeners: NONE, active allocations: 0
03:29:56.055569 reconcile.go:113: stunner INFO: setting loglevel to "all:INFO"
03:29:56.055620 server.go:19: stunner INFO: listener stunner/owt-tcp-gateway/owt-tcp-listener: [tcp://10.233.74.92:3478<32768:65535>] (re)starting
03:29:56.055687 server.go:161: stunner INFO: listener stunner/owt-tcp-gateway/owt-tcp-listener: TURN server running
03:29:56.055693 reconcile.go:177: stunner INFO: reconciliation ready: new objects: 2, changed objects: 2, deleted objects: 0, started objects: 1, restarted objects: 0
03:29:56.055703 reconcile.go:181: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: longterm, listeners: stunner/owt-tcp-gateway/owt-tcp-listener: [tcp://10.233.74.92:3478<32768:65535>], active allocations: 0
08:52:45.642881 reconcile.go:113: stunner INFO: setting loglevel to "all:INFO"
08:52:45.643022 reconcile.go:177: stunner INFO: reconciliation ready: new objects: 0, changed objects: 2, deleted objects: 0, started objects: 0, restarted objects: 0
08:52:45.643051 reconcile.go:181: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: longterm, listeners: stunner/owt-tcp-gateway/owt-tcp-listener: [tcp://10.233.74.92:3478<32768:65535>], active allocations: 0
08:53:18.252190 config.go:347: watch-config WARNING: config file deleted "REMOVE", disabling watcher
08:53:20.252891 config.go:283: watch-config WARNING: waiting for config file "/etc/stunnerd/stunnerd.conf"
08:53:30.253135 config.go:283: watch-config WARNING: waiting for config file "/etc/stunnerd/stunnerd.conf"
08:53:40.252690 config.go:283: watch-config WARNING: waiting for config file "/etc/stunnerd/stunnerd.conf"
08:53:50.252410 config.go:283: watch-config WARNING: waiting for config file "/etc/stunnerd/stunnerd.conf"
08:53:57.254904 reconcile.go:113: stunner INFO: setting loglevel to "all:INFO"
08:53:57.254999 reconcile.go:157: stunner WARNING: running with no clusters: all traffic will be dropped
08:53:57.255015 server.go:19: stunner INFO: listener stunner/owt-udp-gateway/owt-udp-listener: [udp://10.233.74.92:3478<32768:65535>] (re)starting
08:53:57.255022 server.go:42: stunner INFO: setting up UDP listener socket pool at 10.233.74.92:3478 with 16 readloop threads
08:53:57.255282 server.go:161: stunner INFO: listener stunner/owt-udp-gateway/owt-udp-listener: TURN server running
08:53:57.255293 reconcile.go:177: stunner INFO: reconciliation ready: new objects: 1, changed objects: 0, deleted objects: 2, started objects: 1, restarted objects: 0
08:53:57.255301 reconcile.go:181: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: longterm, listeners: stunner/owt-udp-gateway/owt-udp-listener: [udp://10.233.74.92:3478<32768:65535>], active allocations: 0
08:53:57.394702 reconcile.go:113: stunner INFO: setting loglevel to "all:INFO"
08:53:57.394733 reconcile.go:157: stunner WARNING: running with no clusters: all traffic will be dropped
08:53:57.394739 reconcile.go:177: stunner INFO: reconciliation ready: new objects: 0, changed objects: 1, deleted objects: 0, started objects: 0, restarted objects: 0
08:53:57.394750 reconcile.go:181: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: longterm, listeners: stunner/owt-udp-gateway/owt-udp-listener: [udp://10.233.74.92:3478<32768:65535>], active allocations: 0
08:54:03.331831 reconcile.go:113: stunner INFO: setting loglevel to "all:INFO"
08:54:03.331866 reconcile.go:157: stunner WARNING: running with no clusters: all traffic will be dropped
08:54:03.331872 reconcile.go:177: stunner INFO: reconciliation ready: new objects: 0, changed objects: 1, deleted objects: 0, started objects: 0, restarted objects: 0
08:54:03.331884 reconcile.go:181: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: longterm, listeners: stunner/owt-udp-gateway/owt-udp-listener: [udp://10.233.74.92:3478<32768:65535>], active allocations: 0
08:54:03.331937 reconcile.go:113: stunner INFO: setting loglevel to "all:INFO"
08:54:03.331956 reconcile.go:157: stunner WARNING: running with no clusters: all traffic will be dropped
08:54:03.331959 reconcile.go:177: stunner INFO: reconciliation ready: new objects: 0, changed objects: 1, deleted objects: 0, started objects: 0, restarted objects: 0
08:54:03.331966 reconcile.go:181: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: longterm, listeners: stunner/owt-udp-gateway/owt-udp-listener: [udp://10.233.74.92:3478<32768:65535>], active allocations: 0
08:54:03.421273 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.421795 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.424394 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.426351 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.434381 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.443200 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.472294 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.513291 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.537337 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.551542 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.556254 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:03.731422 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
08:54:05.391543 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: BadFormat for message/cookie: 34353637 is invalid magic cookie (should be 2112a442)
08:54:05.757221 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: BadFormat for message/cookie: 34353637 is invalid magic cookie (should be 2112a442)

Here is my config:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: GatewayClass
metadata:
  name: stunner-gatewayclass
spec:
  controllerName: "stunner.l7mp.io/gateway-operator"
  parametersRef:
    group: "stunner.l7mp.io"
    kind: GatewayConfig
    name: stunner-gatewayconfig
    namespace: stunner
  description: "STUNner is a WebRTC ingress gateway for Kubernetes"

---
apiVersion: stunner.l7mp.io/v1alpha1
kind: GatewayConfig
metadata:
  name: stunner-gatewayconfig
  namespace: stunner
spec:
  realm: stunner.l7mp.io
  authType: ephemeral
  sharedSecret: 'XXXXXXXXXX'
  loadBalancerServiceAnnotations:
    kubernetes.io/elb.class: shared
    kubernetes.io/elb.id: XXXXXXXXXX
    kubernetes.io/elb.lb-algorithm: LEAST_CONNECTIONS
    kubernetes.io/elb.session-affinity-flag: 'on'
    kubernetes.io/elb.session-affinity-option: '{"type": "SOURCE_IP", "persistence_timeout": 15}'
    kubernetes.io/elb.health-check-flag: 'on'
    kubernetes.io/elb.health-check-option: '{"delay": 3, "timeout": 15, "max_retries": 3}'
    kubernetes.io/elb.enable-transparent-client-ip: "true"


---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: Gateway
metadata:
  name: owt-udp-gateway
  namespace: stunner
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: owt-udp-listener
      port: 3478
      protocol: UDP
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: owt-media-plane
  namespace: stunner
spec:
  parentRefs:
    - name: owt-udp-listener
  rules:
    - backendRefs:
        - name: owt-server
          namespace: default
rg0now added the priority: high, type: bug, and type: question labels and removed the type: bug label on Sep 22, 2023
rg0now (Member) commented Sep 22, 2023

An immediate problem is that your UDPRoute cannot attach to the Gateway: parentRef.Name must match the name of the parent Gateway, not the name of the listener:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: owt-media-plane
  namespace: stunner
spec:
  parentRefs:
    - name: owt-udp-gateway
  rules:
    - backendRefs:
        - name: owt-server
          namespace: default

You can see why this is a problem in the stunnerd logs:

08:54:03.331956 reconcile.go:157: stunner WARNING: running with no clusters: all traffic will be dropped

If you want to attach the UDPRoute to a specific listener on the gateway, you must specify both the Gateway name and the listener name in the parentRef:

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: owt-media-plane
  namespace: stunner
spec:
  parentRefs:
    - name: owt-udp-gateway
      sectionName: owt-udp-listener
  rules:
    - backendRefs:
        - name: owt-server
          namespace: default

This though still doesn't explain the other weirdness, the TURN errors:

08:54:03.421273 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header

This means that either the client sends truncated packets, some middlebox chops off the meaningful part of our TURN packets, or this is the wildest case of an MTU issue I've ever seen. I'd need to see a tcpdump to know what's wrong here, but it is definitely not a STUNner issue.
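If it helps, a capture taken right next to stunnerd is usually the most telling. Something along these lines should work (a sketch only: the pod name is a placeholder, and the stunnerd image may not ship tcpdump, in which case an ephemeral debug container or a node-level capture does the same job):

# capture TURN traffic on the listener port from inside the stunnerd pod
kubectl -n stunner exec -it <stunnerd-pod> -- tcpdump -ni any udp port 3478 -w /tmp/turn.pcap
# copy the capture out for inspection, e.g. in Wireshark
kubectl -n stunner cp <stunnerd-pod>:/tmp/turn.pcap ./turn.pcap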

rg0now (Member) commented Sep 22, 2023

Well, and then there is the auth issue. This is weird: we fail to generate the TURN protocol scheme and the transport protocol in the returned URL, which is clearly a bug. Did you install from the stable channel or the latest version from the dev channel? There has been a massive rewrite lately that might have broken the dev installs; we'll look into that ASAP.

vipcxj (Author) commented Sep 22, 2023

@rg0now Thank you for your quick response. I found I couldn't connect to the TURN server: it returned 401. So I changed the auth type from ephemeral to static, but the problem remained. Then I found your response and changed the config, and it worked. When I changed the auth type back, I couldn't reach the server again. In the log, I found:

10:11:39.029140 handlers.go:38: stunner-auth INFO: longterm auth request: username="1695381093:12345" realm="stunner.l7mp.io" srcAddr=192.168.0.82:38130
10:11:39.029170 handlers.go:53: stunner-auth INFO: longterm auth request: success
10:11:39.029240 server.go:194: turn ERROR: error when handling datagram: failed to handle Allocate-request from 192.168.0.82:38130: integrity check failed

What does it mean?

> Well, and then there is the auth issue. This is weird: we fail to generate the TURN protocol scheme and the transport protocol in the returned URL, which is clearly a bug. Did you install from the stable channel or the latest version from the dev channel? There has been a massive rewrite lately that might have broken the dev installs; we'll look into that ASAP.

I installed the stable operator weeks ago. At first it returned a good URL; today I found the URL is broken. I have not reinstalled the operator since.

I changed the auth type back to static, and I can connect to the server again. Here is the log:

10:20:50.456079 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
10:20:52.997995 handlers.go:25: stunner-auth INFO: plaintext auth request: username="user-1" realm="stunner.l7mp.io" srcAddr=192.168.0.82:13830
10:20:53.050236 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
10:20:53.089937 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
10:20:53.121255 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
10:20:53.214464 allocation.go:290: turn INFO: No Permission or Channel exists for 10.233.75.18:39127 on allocation 10.233.97.159:32869
10:20:53.256763 handlers.go:25: stunner-auth INFO: plaintext auth request: username="user-1" realm="stunner.l7mp.io" srcAddr=192.168.0.82:13830
10:20:53.256809 handlers.go:84: stunner-auth INFO: permission granted on listener "stunner/owt-udp-gateway/owt-udp-listener" for client "192.168.0.82:13830" to peer 10.233.75.18 via cluster "stunner/owt-media-plane"
10:20:55.920946 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header
10:20:55.975054 server.go:194: turn ERROR: error when handling datagram: failed to create stun message from packet: unexpected EOF: not enough bytes to read header

By the way, there are logs of similar errors such as "not enough bytes to read header" and "BadFormat for message/cookie: 34353637 is invalid magic cookie (should be 2112a442)". It seems that STUNner still works fine despite these errors.

rg0now (Member) commented Sep 22, 2023

This was me breaking the auth service which, combined with a bug in our Helm charts that defaults the auth-service container image version to the latest dev version even in the stable install, meant that everyone was immediately affected.

We'll try to settle this ASAP. Meanwhile you can manually downgrade the auth-service container image version to v0.15.0; that should fix this until we stabilize the dev version and roll a new release. Or you can use a fixed (static) TURN credential for now.
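For example, something like this should do it (a sketch only: the namespace, Deployment, and container names below are assumptions, so check what your install actually uses with kubectl get deployments -A first):

# pin the auth service image to the last stable tag (names are illustrative)
kubectl -n stunner set image deployment/stunner-auth-server \
    stunner-auth-server=docker.io/l7mp/stunner-auth-server:0.15.0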

As for the truncated TURN messages, that's still unexplained. I doubt that issue has anything to do with the auth service bug; maybe a misbehaving client? We'll see once we fix the auth-service.

vipcxj (Author) commented Sep 22, 2023

Here are my STUNner images:

docker.io/l7mp/stunnerd:0.15.0
docker.io/l7mp/stunner-auth-server:dev

I have changed docker.io/l7mp/stunner-auth-server:dev to docker.io/l7mp/stunner-auth-server:0.15.0.

The URL is correct now, but the ephemeral auth type still doesn't work:

10:58:15.365317 handlers.go:38: stunner-auth INFO: longterm auth request: username="1695383890:12345" realm="stunner.l7mp.io" srcAddr=192.168.0.82:6508
10:58:15.365345 handlers.go:53: stunner-auth INFO: longterm auth request: success
10:58:15.365429 server.go:194: turn ERROR: error when handling datagram: failed to handle Allocate-request from 192.168.0.82:6508: integrity check failed

And the error hasn't changed.

rg0now (Member) commented Sep 22, 2023

From the below it seems that the auth credentials are accepted by STUNner:

10:58:15.365345 handlers.go:53: stunner-auth INFO: longterm auth request: success

The subsequent error is then caused by your client generating a wrong integrity hash in the packets:

10:58:15.365429 server.go:194: turn ERROR: error when handling datagram: failed to handle Allocate-request from 192.168.0.82:6508: integrity check failed

Can you please confirm that it works with static username/password pairs and that only the ephemeral auth credentials trigger this issue? Judging from the weird errors you get, it could be some misbehaving TURN client. Are you trying to connect from a browser?
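For context, the ephemeral ("longterm") credentials follow the usual TURN REST API convention: the username is an expiry timestamp plus a user id, and the password is the base64-encoded HMAC-SHA1 of that username keyed with the shared secret. A rough client-side sketch (assumptions: the default HMAC-SHA1 mechanism and a one-hour TTL):

SECRET='my-shared-secret'                  # must be byte-for-byte the secret stunnerd holds
USERNAME="$(($(date +%s) + 3600)):user1"   # "<expiry-unix-timestamp>:<user-id>"
CREDENTIAL=$(printf '%s' "$USERNAME" | openssl dgst -sha1 -hmac "$SECRET" -binary | base64)
echo "$USERNAME / $CREDENTIAL"

If the secret the client (or the auth server) uses for this HMAC differs even by one byte from the one stunnerd has, the TURN MESSAGE-INTEGRITY check fails exactly like above.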

rg0now (Member) commented Sep 22, 2023

I cannot reproduce the ephemeral auth issue with the latest dev version. Please uninstall stunnerd and the gateway-operator and reinstall from the dev channel and report back any issue you find. Thx!

As per the "truncated TURN packets problem: please use the simple-tunnel tutorial to check whether your cluster is OK. That demo uses our own TURN client so at least the "misbehaving client" problem will go away.

vipcxj (Author) commented Sep 23, 2023

> From the below it seems that the auth credentials are accepted by STUNner:
>
> 10:58:15.365345 handlers.go:53: stunner-auth INFO: longterm auth request: success
>
> The subsequent error is then caused by your client generating a wrong integrity hash in the packets:
>
> 10:58:15.365429 server.go:194: turn ERROR: error when handling datagram: failed to handle Allocate-request from 192.168.0.82:6508: integrity check failed
>
> Can you please confirm that it works with static username/password pairs and that only the ephemeral auth credentials trigger this issue? Judging from the weird errors you get, it could be some misbehaving TURN client. Are you trying to connect from a browser?

I have tried this many times and only ephemeral auth triggers this issue. I can't say for sure whether this has been an issue before, as this seems to be the first time I've tried to use ephemeral auth; it's always been static auth before. I'll do some further research next Monday.

vipcxj (Author) commented Jan 8, 2024

Ephemeral auth still doesn't work. Here are the errors:

04:04:58.347104 reconcile.go:180: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: static, listeners: NONE, active allocations: 0
04:04:58.347664 cds_client.go:83: cds-client INFO: connection successfully opened to config discovery server at ws://10.16.0.27:13478/api/v1/configs/stunner/stunner-study-ai?watch=true
04:04:58.348430 reconcile.go:112: stunner INFO: setting loglevel to "all:INFO"
04:04:58.348492 server.go:28: stunner INFO: listener stunner/stunner-study-ai/stunner-study-ai-udp-listener: [turn-udp://10.16.0.140:3478<0:0>] (re)starting
04:04:58.348507 server.go:45: stunner INFO: setting up UDP listener socket pool at 0.0.0.0:3478 with 16 readloop threads
04:04:58.348865 server.go:166: stunner INFO: listener stunner/stunner-study-ai/stunner-study-ai-udp-listener: TURN server running
04:04:58.348874 reconcile.go:176: stunner INFO: reconciliation ready: new objects: 2, changed objects: 1, deleted objects: 0, started objects: 1, restarted objects: 0
04:04:58.348883 reconcile.go:180: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: ephemeral, listeners: stunner/stunner-study-ai/stunner-study-ai-udp-listener: [turn-udp://10.16.0.140:3478<0:0>], active allocations: 0
04:05:21.743309 reconcile.go:112: stunner INFO: setting loglevel to "all:INFO"
04:05:21.743356 reconcile.go:176: stunner INFO: reconciliation ready: new objects: 0, changed objects: 2, deleted objects: 0, started objects: 0, restarted objects: 0
04:05:21.743366 reconcile.go:180: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: ephemeral, listeners: stunner/stunner-study-ai/stunner-study-ai-udp-listener: [turn-udp://10.16.0.140:3478<0:0>], active allocations: 0
04:06:41.100143 handlers.go:38: stunner-auth INFO: ephemeral auth request: username="1704690399:1" realm="stunner.l7mp.io" srcAddr=10.16.0.129:14411
04:06:41.100172 handlers.go:53: stunner-auth INFO: ephemeral auth request: success
04:06:41.100224 server.go:202: turn ERROR: Failed to handle datagram: failed to handle Allocate-request from 10.16.0.129:14411: integrity check failed
04:06:54.791511 handlers.go:38: stunner-auth INFO: ephemeral auth request: username="1704690412:1" realm="stunner.l7mp.io" srcAddr=10.16.0.129:2687
04:06:54.791550 handlers.go:53: stunner-auth INFO: ephemeral auth request: success
04:06:54.791618 server.go:202: turn ERROR: Failed to handle datagram: failed to handle Allocate-request from 10.16.0.129:2687: integrity check failed
04:07:11.944288 handlers.go:38: stunner-auth INFO: ephemeral auth request: username="1704690430:1" realm="stunner.l7mp.io" srcAddr=10.16.0.129:19056
04:07:11.944317 handlers.go:53: stunner-auth INFO: ephemeral auth request: success
04:07:11.944381 server.go:202: turn ERROR: Failed to handle datagram: failed to handle Allocate-request from 10.16.0.129:19056: integrity check failed
04:07:16.036893 handlers.go:38: stunner-auth INFO: ephemeral auth request: username="1704690434:1" realm="stunner.l7mp.io" srcAddr=10.16.0.129:19341
04:07:16.036921 handlers.go:53: stunner-auth INFO: ephemeral auth request: success

When I change to static auth, the error disappears and it works.

rg0now (Member) commented Jan 8, 2024

I cannot reproduce this issue with the latest stable version. I tested with simple-tunnel and changed the authentication to ephemeral:

kubectl apply -f - <<EOF
apiVersion: stunner.l7mp.io/v1
kind: GatewayConfig
metadata:
  name: stunner-gatewayconfig
  namespace: stunner
spec:
  authRef:
    kind: Secret
    name: stunner-auth-secret
    namespace: stunner
  dataplane: default
  realm: stunner.l7mp.io
---
apiVersion: v1
kind: Secret
metadata:
  name: stunner-auth-secret
  namespace: stunner
type: Opaque
stringData:
  type: ephemeral
  secret: my-shared-secret
EOF

Then tested with turncat:

cd <stunner>
export IPERF_ADDR=$(kubectl get svc iperf-server -o jsonpath="{.spec.clusterIP}")
go run cmd/turncat/main.go --log=all:INFO udp://127.0.0.1:5000 k8s://stunner/udp-gateway:udp-listener  udp://$IPERF_ADDR:5001
iperf -c localhost -p 5000 -u -i 1 -l 100 -b 8000 -t 5
...
[  1] 0.0000-5.0114 sec  5.18 KBytes  8.46 Kbits/sec   3.593 ms 0/53 (0%) 16.024/12.296/101.769/8.420 ms 10 pps 10/0(0) pkts 0.065999

vipcxj (Author) commented Jan 9, 2024

@rg0now I tried again; it still doesn't work, same error. Is there a Docker image bundled with turncat that I can use to test?

levaitamas (Member) commented Jan 9, 2024

Hi @vipcxj!

> @rg0now I tried again; it still doesn't work, same error. Is there a Docker image bundled with turncat that I can use to test?

l7mp/net-debug has turncat built-in.
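One way to spin it up for a quick test (a sketch; the flags and the shell inside the image are assumptions, adjust the namespace as needed):

# start a throwaway debug pod
kubectl -n stunner run net-debug -it --rm --restart=Never --image=l7mp/net-debug -- sh
# inside the pod, turncat should be on the PATH
turncat --help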

vipcxj (Author) commented Jan 9, 2024

@levaitamas I found the problem! It's the secret key.
When I use the secret key from your example it works; when I change back to our secret key, it fails.
Here is our secret key:

apiVersion: v1
kind: Secret
metadata:
  name: stunner-study-ai-auth
  namespace: stunner
type: Opaque
stringData:
  type: ephemeral
  secret: 4h$sdgh[k)tAgjTR54gfjus3wayrt
  # secret: my-shared-secret

rg0now (Member) commented Feb 1, 2024

This is (hopefully) resolved now. Feel free to reopen if the problem persists.

rg0now closed this as completed on Feb 1, 2024
vipcxj (Author) commented Feb 2, 2024

@rg0now which commit solved it? I have shortened the secret as a workaround, so it seems I will not trigger this issue any more.

rg0now (Member) commented Feb 2, 2024

Sorry for the confusion, I thought we had resolved it here. Can you please confirm that the secret 4h$sdgh[k)tAgjTR54gfjus3wayrt does not work while my-shared-secret does? If yes, then I'll re-label this as a bug and try to look into it.

rg0now reopened this on Feb 2, 2024
vipcxj (Author) commented Feb 2, 2024

I can't replace the secret right now, but I confirm that I did have this issue when I commented earlier, and it's not clear whether you've made any fixes since then. I fixed it by using a shorter secret, so it seemed that a secret that is too long triggers the issue.

rg0now (Member) commented Feb 2, 2024

Thx for the report, I'm trying to track this down now. My bet would be that symbols like $ and the like are what make the operator go south, but we'll see.

rg0now added the type: bug label and removed the type: question and status: cannot reproduce labels on Feb 2, 2024
rg0now (Member) commented Feb 2, 2024

So this seems like a good old UNIX/Bash gotcha.

If you use the below to set the secret, then any stringData value containing a symbol like $ gets corrupted:

kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: stunner-auth-secret
  namespace: stunner
type: Opaque
stringData:
  type: ephemeral
  secret: "4h$sdgh[k)tAgjTR54gfjus3wayrt"
EOF
kubectl -n stunner get secrets stunner-auth-secret --template={{.data.secret}} | base64 -d
4h[k)tAgjTR54gfjus3wayrt

The good news is that the corrupted secret nicely finds its way into the STUNner dataplane config, so at least we don't mess things up during the base64 encode/decode roundtrip:

stunnerctl -n stunner config udp-gateway -o jsonpath='{.auth.credentials.secret}'
4h[k)tAgjTR54gfjus3wayrt

However, place your Secret into a YAML file, say, /tmp/my-secret.yaml, like this:

cat /tmp/my-secret.yaml 
apiVersion: v1
kind: Secret
metadata:
  name: stunner-auth-secret
  namespace: stunner
type: Opaque
stringData:
  type: ephemeral
  secret: "4h$sdgh[k)tAgjTR54gfjus3wayrt"

And then kubectl-apply it and all is fine:

kubectl apply -f /tmp/my-secret.yaml
kubectl -n stunner get secrets stunner-auth-secret --template={{.data.secret}} | base64 -d
4h$sdgh[k)tAgjTR54gfjus3wayrt
stunnerctl -n stunner config udp-gateway -o jsonpath='{.auth.credentials.secret}' 
4h$sdgh[k)tAgjTR54gfjus3wayrt

So I guess shell escaping bites us here: the shell interprets $sdgh in the secret as a shell variable and substitutes it with its value (the empty string). Let's try to escape the $ in the secret (observe how we use \$ now):

kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: stunner-auth-secret
  namespace: stunner
type: Opaque
stringData:
  type: ephemeral
  secret: "4h\$sdgh[k)tAgjTR54gfjus3wayrt"
EOF
kubectl -n stunner get secrets stunner-auth-secret --template={{.data.secret}} | base64 -d
4h$sdgh[k)tAgjTR54gfjus3wayrt

So this fixes it, and this is indeed a shell escaping thingie...

Today I (re)learned something important again: just use standard YAML manifests that do not get processed by the shell and then all is fine.
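Another option, if you prefer the heredoc style, is to quote the heredoc delimiter so the shell performs no expansion at all inside the body:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: stunner-auth-secret
  namespace: stunner
type: Opaque
stringData:
  type: ephemeral
  secret: "4h$sdgh[k)tAgjTR54gfjus3wayrt"
EOF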

vipcxj (Author) commented Feb 2, 2024

@rg0now I get the auth config from the STUNner auth server, so even if the secret got corrupted by something, it shouldn't matter what I end up with, because I don't fetch the secret from Kubernetes directly.

vipcxj (Author) commented Feb 2, 2024

@rg0now I have also tried configuring the secret both directly in the GatewayConfig and via an authRef to a Secret; nothing changed. The issue did not disappear until I changed the secret. By the way, the new secret does not contain '$' but still contains '@', and it is 13 bytes long.

rg0now (Member) commented Feb 2, 2024

Can you please specify exactly what didn't work for you? Just post all the YAMLs and command lines you used so that I can reproduce the problem.

vipcxj (Author) commented Feb 2, 2024

@rg0now

apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: stunner-study-ai
  labels:
    {{- include "common.labels" . | nindent 4 }}
spec:
  controllerName: "stunner.l7mp.io/gateway-operator"
  parametersRef:
    group: "stunner.l7mp.io"
    kind: GatewayConfig
    name: stunner-study-ai
    namespace: stunner
  description: "STUNner is a WebRTC ingress gateway for Kubernetes"
---
apiVersion: v1
kind: Namespace
metadata:
  name: stunner
---
apiVersion: v1
kind: Secret
metadata:
  name: stunner-study-ai-auth
  namespace: stunner
type: Opaque
stringData:
  type: ephemeral
  secret: 4h$sdgh[k)tAgjTR54gfjus3wayrt
---
apiVersion: stunner.l7mp.io/v1
kind: GatewayConfig
metadata:
  name: stunner-study-ai
  namespace: stunner
spec:
  realm: stunner.l7mp.io
  authRef:
    kind: Secret
    name: stunner-study-ai-auth
    namespace: stunner
  # dataplane: default
  # authType: plaintext
  # userName: "user-1"
  # password: "pass-1"
  loadBalancerServiceAnnotations:
    kubernetes.io/elb.class: performance
    kubernetes.io/elb.id: XXXXXXXXXXX
    kubernetes.io/elb.lb-algorithm: LEAST_CONNECTIONS 
    kubernetes.io/elb.session-affinity-mode: SOURCE_IP
    kubernetes.io/elb.session-affinity-option: '{"persistence_timeout": "15"}'
    kubernetes.io/elb.health-check-flag: 'off'
    # kubernetes.io/elb.health-check-option: '{"delay": 3, "timeout": 15, "max_retries": 3}'
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: stunner-study-ai
  namespace: stunner
spec:
  gatewayClassName: stunner-study-ai
  listeners:
    - name: stunner-study-ai-udp-listener
      port: 3478
      protocol: TURN-UDP
---
apiVersion: stunner.l7mp.io/v1
kind: UDPRoute
metadata:
  name: stunner-study-ai
  namespace: stunner
spec:
  parentRefs:
    - name: stunner-study-ai
  rules:
    - backendRefs:
        - name: video-server
          namespace: default

Then my WebRTC client fetches the ICE auth config from the STUNner auth server, and it is unable to connect to the peer.

04:04:58.347104 reconcile.go:180: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: static, listeners: NONE, active allocations: 0
04:04:58.347664 cds_client.go:83: cds-client INFO: connection successfully opened to config discovery server at ws://10.16.0.27:13478/api/v1/configs/stunner/stunner-study-ai?watch=true
04:04:58.348430 reconcile.go:112: stunner INFO: setting loglevel to "all:INFO"
04:04:58.348492 server.go:28: stunner INFO: listener stunner/stunner-study-ai/stunner-study-ai-udp-listener: [turn-udp://10.16.0.140:3478<0:0>] (re)starting
04:04:58.348507 server.go:45: stunner INFO: setting up UDP listener socket pool at 0.0.0.0:3478 with 16 readloop threads
04:04:58.348865 server.go:166: stunner INFO: listener stunner/stunner-study-ai/stunner-study-ai-udp-listener: TURN server running
04:04:58.348874 reconcile.go:176: stunner INFO: reconciliation ready: new objects: 2, changed objects: 1, deleted objects: 0, started objects: 1, restarted objects: 0
04:04:58.348883 reconcile.go:180: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: ephemeral, listeners: stunner/stunner-study-ai/stunner-study-ai-udp-listener: [turn-udp://10.16.0.140:3478<0:0>], active allocations: 0
04:05:21.743309 reconcile.go:112: stunner INFO: setting loglevel to "all:INFO"
04:05:21.743356 reconcile.go:176: stunner INFO: reconciliation ready: new objects: 0, changed objects: 2, deleted objects: 0, started objects: 0, restarted objects: 0
04:05:21.743366 reconcile.go:180: stunner INFO: status: READY, realm: stunner.l7mp.io, authentication: ephemeral, listeners: stunner/stunner-study-ai/stunner-study-ai-udp-listener: [turn-udp://10.16.0.140:3478<0:0>], active allocations: 0
04:06:41.100143 handlers.go:38: stunner-auth INFO: ephemeral auth request: username="1704690399:1" realm="stunner.l7mp.io" srcAddr=10.16.0.129:14411
04:06:41.100172 handlers.go:53: stunner-auth INFO: ephemeral auth request: success
04:06:41.100224 server.go:202: turn ERROR: Failed to handle datagram: failed to handle Allocate-request from 10.16.0.129:14411: integrity check failed
04:06:54.791511 handlers.go:38: stunner-auth INFO: ephemeral auth request: username="1704690412:1" realm="stunner.l7mp.io" srcAddr=10.16.0.129:2687
04:06:54.791550 handlers.go:53: stunner-auth INFO: ephemeral auth request: success
04:06:54.791618 server.go:202: turn ERROR: Failed to handle datagram: failed to handle Allocate-request from 10.16.0.129:2687: integrity check failed
04:07:11.944288 handlers.go:38: stunner-auth INFO: ephemeral auth request: username="1704690430:1" realm="stunner.l7mp.io" srcAddr=10.16.0.129:19056
04:07:11.944317 handlers.go:53: stunner-auth INFO: ephemeral auth request: success
04:07:11.944381 server.go:202: turn ERROR: Failed to handle datagram: failed to handle Allocate-request from 10.16.0.129:19056: integrity check failed
04:07:16.036893 handlers.go:38: stunner-auth INFO: ephemeral auth request: username="1704690434:1" realm="stunner.l7mp.io" srcAddr=10.16.0.129:19341
04:07:16.036921 handlers.go:53: stunner-auth INFO: ephemeral auth request: success

After I change authType to plaintext, the issue disappears. Of course, changing the secret to a shorter one also works.

rg0now changed the title from "Auth server return bad url" to "Integrity check fails on passwords containing $ [was: Auth server return bad url]" on Feb 2, 2024
rg0now closed this as completed in 1bb46b7 on Feb 2, 2024
rg0now (Member) commented Feb 2, 2024

This should be fixed in 1bb46b7 and will be available in dev once the CI pipeline has finished building the image.

The bug was in STUNner: we apply environment-variable substitution to the config file while parsing it (this allows customizing per-gateway configs to the per-pod context locally), and this plays badly with TURN credential parsing if the password or the secret contains a $ symbol. In such cases we treat everything that comes after the $ as the name of an environment variable and substitute it (with the empty string).
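The effect is easy to reproduce outside STUNner with any environment-substituting tool, for instance GNU gettext's envsubst (just an illustration of the failure mode, not our actual parsing code):

printf 'secret: "4h$sdgh[k)tAgjTR54gfjus3wayrt"\n' | envsubst
secret: "4h[k)tAgjTR54gfjus3wayrt"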

Thanks a lot @vipcxj for helping us track this down; this was a particularly ugly one. Feel free to reopen if the problem persists.
