Example app udp-greeter.yaml not working - help needed #133

Closed
velak4340 opened this issue Mar 25, 2024 · 10 comments
@velak4340

Hi,
I am trying to set up the example to make sure this solution can be used for my scenario, and it looks like all the pods/services are running fine. But when I test the example to make sure everything works, I get the error below. I am running this example on an AWS EKS cluster. Any ideas what could be wrong?

./turncat - k8s://stunner/udp-gateway:udp-listener udp://${PEER_IP}:9001
12:08:58.342349 turncat.go:570: turncat WARNING: relay setup failed for client /dev/stdin: could not allocate new TURN relay transport for client file:/dev/stdin: all retransmissions failed for F0PK7+FzGLlJGP6B

NAME                                                               READY   STATUS    RESTARTS   AGE
pod/stunner-auth-5c488547b-96755                                   1/1     Running   0          3d23h
pod/stunner-gateway-operator-controller-manager-79448cb5f5-p9x9f   2/2     Running   0          3d23h
pod/udp-gateway-7bd49f95d9-k8l6d                                   1/1     Running   0          3d3h

NAME                                                                  TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
service/stunner-auth                                                  ClusterIP      x.x.x.x      <none>        8088/TCP         3d23h
service/stunner-config-discovery                                      ClusterIP      x.x.x.x      <none>        13478/TCP        3d23h
service/stunner-gateway-operator-controller-manager-metrics-service   ClusterIP      x.x.x.x      <none>        8443/TCP         3d23h
service/udp-gateway                                                   LoadBalancer   x.x.x.x      *********     3478:32616/UDP   3d3h

rg0now added the type: question (Further information is requested) label on Mar 25, 2024
rg0now (Member) commented Mar 25, 2024

Can you please provide more info? We'd need at least the output from kubectl get gateways,gatewayconfigs,gatewayclasses,udproutes.stunner.l7mp.io --all-namespaces -o yaml, plus the logs from the operator and one of the stunnerd pods (if running), and anything else you think is important for tracking this down.
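
For reference, the following commands should collect all of the above in one go. This is only a sketch: it assumes the operator runs in the stunner-system namespace and the dataplane pods carry the app=stunner label in the stunner namespace, so adjust names to your install:

# dump the Gateway API resources STUNner uses
kubectl get gateways,gatewayconfigs,gatewayclasses,udproutes.stunner.l7mp.io --all-namespaces -o yaml > stunner-resources.yaml
# operator logs (all containers of the controller-manager pod)
kubectl -n stunner-system logs deploy/stunner-gateway-operator-controller-manager --all-containers=true > operator.log
# stunnerd dataplane logs, full history
kubectl -n stunner logs -l app=stunner --tail=-1 > stunnerd.log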

@velak4340 (Author)

Hi @rg0now, thanks for the reply. I have attached some of the information you requested. I don't see any stunnerd pods running...
stunner-udp-gateway-log.txt
stunner-gateway-operator-logs.txt
stunnerconfig.txt

rg0now (Member) commented Mar 26, 2024

I can't see any apparent problem with your setup. Can you please elevate the loglevel on the gateway so that we see why the connection hangs and rerun the test? Here is a simple way to set the maximum loglevel:

kubectl -n stunner patch gatewayconfig stunner-gatewayconfig --type=merge -p '{"spec": {"logLevel": "all:TRACE"}}'
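
Once the debugging session is over, the same kind of patch can restore a quieter setting (all:INFO is assumed here as the usual default; adjust as needed):

kubectl -n stunner patch gatewayconfig stunner-gatewayconfig --type=merge -p '{"spec": {"logLevel": "all:INFO"}}'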

rg0now self-assigned this on Mar 26, 2024
velak4340 (Author) commented Mar 27, 2024

I updated the log level in stunner-gatewayconfig to trace and in stunner-gateway-operator-controller-manager to debug before capturing these logs... please check if it helps.
stunner-gateway-operator-controller.txt

One thing I noticed is that the LoadBalancer (Network LB) service exposes only UDP. Should it expose TCP as well? If so, how do I do that?
service/udp-gateway LoadBalancer x.x.x.x ********* 3478:32616/UDP 3d3h

rg0now (Member) commented Mar 27, 2024

Unfortunately I'm no expert in AWS load-balancers, but you may be on the right track here: last time we looked at it, AWS required a TCP health-checker to accept a UDP LoadBalancer. Can you experiment with the following annotations added to the Gateway?

stunner.l7mp.io/enable-mixed-protocol-lb: "true"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "8086"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "http"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/live"
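
For illustration, here is roughly how these annotations would sit on the Gateway from the example. This is only a sketch: the listener spec is reproduced from the udp-greeter example from memory, and the apiVersion and protocol name (TURN-UDP) should be double-checked against the CRD version you have installed:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: udp-gateway
  namespace: stunner
  annotations:
    # let STUNner create a mixed-protocol LoadBalancer
    stunner.l7mp.io/enable-mixed-protocol-lb: "true"
    # point the AWS NLB health check at stunnerd's liveness endpoint
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "8086"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "http"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/live"
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: TURN-UDP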

What is strange is that stunner-udp-gateway-log.txt actually shows a successful authentication attempt from someone (please check the source IP: is that one of your pods, or is it coming from the outside?). Can you resend stunner-udp-gateway-log.txt, but this time with the elevated loglevel?

@useafterfree

I am also seeing this problem:

turncat -v - k8s://stunner/udp-gateway:udp-listener udp://${PEER_IP}:9001:

08:41:32.460190 main.go:81: turncat-cli DEBUG: Reading STUNner config from URI "k8s://stunner/udp-gateway:udp-listener"
08:41:32.460296 main.go:163: turncat-cli DEBUG: Searching for CDS server
08:41:32.460312 k8s_client.go:154: cds-fwd DEBUG: Obtaining kubeconfig
08:41:32.461017 k8s_client.go:161: cds-fwd DEBUG: Creating a Kubernetes client
08:41:32.461312 k8s_client.go:196: cds-fwd DEBUG: Querying CDS server pods in namespace "<all>" using label-selector "stunner.l7mp.io/config-discovery-service=enabled"
08:41:32.488454 k8s_client.go:367: cds-fwd DEBUG: Found pod: stunner-system/stunner-gateway-operator-controller-manager-foo-bar
08:41:32.488604 k8s_client.go:376: cds-fwd DEBUG: Creating a SPDY stream to API server using URL "https://10.0.1.4:16443/api/v1/namespaces/stunner-system/pods/stunner-gateway-operator-controller-manager-foo-bar/portforward"
08:41:32.488725 k8s_client.go:384: cds-fwd DEBUG: Creating a port-forwarder to pod
08:41:32.488771 k8s_client.go:400: cds-fwd DEBUG: Waiting for port-forwarder...
08:41:32.516363 k8s_client.go:419: cds-fwd DEBUG: Port-forwarder connected to pod stunner-system/stunner-gateway-operator-controller-manager-foo-bar at 127.0.0.1:37641
08:41:32.516420 cds_api.go:215: cds-client DEBUG: GET: loading config for gateway stunner/udp-gateway from CDS server 127.0.0.1:37641
08:41:32.527517 main.go:88: turncat-cli DEBUG: Generating STUNner authentication client
08:41:32.527574 main.go:95: turncat-cli DEBUG: Generating STUNner URI
08:41:32.527591 main.go:102: turncat-cli DEBUG: Starting turncat with STUNner URI: turn://8.0.0.8:3478?transport=udp
08:41:32.527637 turncat.go:186: turncat INFO: Turncat client listening on file://stdin, TURN server: turn://8.0.0.8:3478?transport=udp, peer: udp:10.152.183.128:9001
08:41:32.527653 main.go:118: turncat-cli DEBUG: Entering main loop
08:41:32.527739 turncat.go:227: turncat DEBUG: new connection from client /dev/stdin
08:41:32.535533 client.go:110: turnc DEBUG: Resolved STUN server 8.0.0.8:3478 to 8.0.0.8:3478
08:41:32.535563 client.go:119: turnc DEBUG: Resolved TURN server 8.0.0.8:3478 to 8.0.0.8:3478

rg0now (Member) commented Apr 2, 2024

Can you please elevate the loglevel on the gateway so that we see why the connection hangs and rerun the test?

Just to make it clear: after elevating the stunnerd loglevel to all:TRACE, please repeat the turncat test and post the logs from the stunnerd pod, and not from the operator. The below would do it for the current setup:

kubectl -n stunner logs $(kubectl -n stunner get pod -l app=stunner -o jsonpath='{.items[0].metadata.name}')
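
If the app=stunner label does not match anything in your install, following the logs of the dataplane Deployment directly should work as well (assuming the udp-gateway Deployment seen in the pod listing above lives in the stunner namespace):

kubectl -n stunner logs -f deploy/udp-gateway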

This is because we need to see whether the connection request from turncat has made it to stunnerd (if not, then this is a LB issue), and if it did, then what happened to the connection after authentication. The last line of the log we see above is this:

05:41:38.565349 handlers.go:25: stunner-auth INFO: static auth request: username="user-1" realm="stunner.l7mp.io" srcAddr=X.X.X.81:39986

We need to see what happened afterwards in the dataplane. Frankly, the whole thing is quite mysterious: if the authentication request were not successful then we would see that clearly in the logs, but if it was (like in our case), then why did the client not continue with establishing the connection? That's what the trace level logs would reveal (I hope).

One minor silly thing: after running turncat, try to send something and press Enter, because turncat waits on the standard input for data to be sent to the greeter. I guess you know that anyway, just to be absolutely sure.
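
One way to script this instead of typing interactively is to pipe a test line into turncat's standard input; the sleep below is only a guess to keep stdin open long enough for the greeter's reply to arrive before turncat exits:

(echo "hello greeter"; sleep 2) | ./turncat - k8s://stunner/udp-gateway:udp-listener udp://${PEER_IP}:9001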

@useafterfree

> Just to make it clear: after elevating the stunnerd loglevel to all:TRACE, please repeat the turncat test and post the logs from the stunnerd pod, and not from the operator. [...]

Does turncat automatically add in the credentials from the deployment, or do we have to add them to the UDP connection string?

rg0now (Member) commented Apr 2, 2024

Theoretically, it should. It actually asks the operator for the running config of the gateway corresponding to the k8s:// URI, so it should see up-to-date settings. It even generates its own ephemeral auth credential if that's what you've set. So try turncat as above (without the auth credentials), and if you get an authentication error then that's a bug.
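
For what it's worth, the static credentials turncat should pick up can be cross-checked against the GatewayConfig (this assumes static/plaintext authentication with the userName/password fields, as in the example setup):

kubectl -n stunner get gatewayconfig stunner-gatewayconfig -o jsonpath='{.spec.userName}{"/"}{.spec.password}{"\n"}'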

rg0now (Member) commented Apr 18, 2024

Closing this for now, feel free to reopen if anything new comes up.
