Unable to get Cloudrun example working #2690

Closed
polds opened this issue Aug 20, 2021 · 11 comments

@polds

polds commented Aug 20, 2021

I posted about this over on the forum, but the more I tweak it, the more I believe that I'm probably not the cause. I'm trying to get the Cloudrun example running in either Cloudrun or GKE (GKE is the ideal goal). I was originally trying a reverse proxy through nginx. To rule out a packet forwarding issue, I then threw together a really simple Go app that tries to connect to both the IPv4 and the IPv6 address of one of my Tailscale services.
I can see both GKE and Cloudrun joining my network with ephemeral tokens and being allocated an IPv6 address. From my local network I can hit the running service using its Tailscale IP, but the GKE and Cloudrun instances are unable to hit any of my running services.

Both of the gists I posted above work locally using a standard docker build/run, connecting with Tailscale, etc. It just doesn't work on GCP. Using the Go app example on Cloudrun and GKE, the IPv4 test times out. Over IPv6, Cloudrun reports an i/o timeout and GKE reports connect: cannot assign requested address.

This is a bugreport generated from a GKE pod: BUG-3df2c46a509f2ad1cef11ebfc6df074a7094fc2e47f19e4a4d34feedb585986f-20210820043224Z-ff622957e28b3d52, I could probably get one from a cloudrun instance if it'd help.

I'm looking for some ideas about what might be happening.

@DentonGentry
Contributor

Could you say what IP address the GKE app is trying to connect to at the time of the BUG-3df2c46a509f2ad1cef11ebfc6df074a7094fc2e47f19e4a4d34feedb585986f-20210820043224Z-ff622957e28b3d52 ?

@DentonGentry
Contributor

Looking at https://gist.github.com/polds/2ffdbd1251a76b6c9287df809b26880f#file-nginx-conf-L11

        proxy_pass http://100.99.71.121:5000; 

Use of an ephemeral auth key was mentioned: ephemeral nodes only get a Tailscale IPv6 address, of the form fd7a:115c:a1e0:ab12:1234:1234:1234:1234. It won't be able to connect to an IPv4 address of the form 100.x.y.z.

You'd need to look up the IPv6 address of the 100.99.71.121, which can be done using tailscale ip -6 hostname

The proxy_pass will look like (note the brackets):

        proxy_pass http://[fd7a:115c:a1e0:ab12:1234:1234:1234:1234]:5000; 
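The same bracket rule applies when a URL is built in code rather than in an nginx config. A minimal Go sketch, assuming the target host and port arrive as separate strings; `tailnetURL` is a hypothetical helper name, not something from the gists in this thread:

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// tailnetURL (hypothetical helper) builds an http URL for a host and port.
// net.JoinHostPort wraps IPv6 literals in square brackets and leaves
// IPv4 addresses and hostnames unchanged.
func tailnetURL(host, port string) string {
	u := url.URL{Scheme: "http", Host: net.JoinHostPort(host, port)}
	return u.String()
}

func main() {
	// IPv6 literal: brackets are added automatically.
	fmt.Println(tailnetURL("fd7a:115c:a1e0:ab12:4843:cd96:6263:4779", "5000"))
	// IPv4 literal: no brackets needed.
	fmt.Println(tailnetURL("100.99.71.121", "5000"))
}
```

Concatenating the address and port by hand with a colon would produce an ambiguous host for the IPv6 case, which is why the sketch goes through `net.JoinHostPort`.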

@polds
Author

polds commented Aug 20, 2021

Could you say what IP address the GKE app is trying to connect to

That comes from the second gist, which tries both the IPv4 address (http://100.99.71.121:5000) and the IPv6 address (http://[fd7a:115c:a1e0:ab12:4843:cd96:6263:4779]:5000) in succession (neither of which works).

You'd need to look up the IPv6 address of the 100.99.71.121

I've tried the same with the nginx config using the IPv6 address which didn't work either.

As a question, my local docker runs also pull an IPv6 address but can connect to the IPv4 address just fine. Why would that change?

Also as a further question, does that mean ephemeral keys cannot use subnet routers?

@DentonGentry
Contributor

As a question, my local docker runs also pull an IPv6 address but can connect to the IPv4 address just fine. Why would that change?

If the local Docker container is also using an Ephemeral key to authenticate, then it only gets an IPv6 address. If it is managing to connect to an IPv4 endpoint, then it is bridging through the host to do so.

Also as a further question, does that mean ephemeral keys cannot use subnet routers?

Nodes authenticated using an Ephemeral key can use subnet routers for IPv6 prefixes. IPv4 routes like 10.1.1.0/24 will not work.
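The rule described in this thread (an ephemeral node has only an IPv6 address, so only IPv6 targets and IPv6 routes are usable) can be checked before deploying. The following Go sketch is only an illustration of that rule as stated here; the function names are hypothetical, and it needs Go 1.18 or later for `net/netip`:

```go
package main

import (
	"fmt"
	"net/netip"
)

// usableTarget (hypothetical) reports whether an address literal is a
// native IPv6 address, which is what an IPv6-only node can dial directly.
func usableTarget(target string) (bool, error) {
	addr, err := netip.ParseAddr(target)
	if err != nil {
		return false, err
	}
	// Exclude IPv4-mapped IPv6 forms such as ::ffff:100.99.71.121.
	return addr.Is6() && !addr.Is4In6(), nil
}

// usableRoute (hypothetical) applies the same check to a subnet route
// written in CIDR notation, e.g. 10.1.1.0/24.
func usableRoute(cidr string) (bool, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return false, err
	}
	return prefix.Addr().Is6() && !prefix.Addr().Is4In6(), nil
}

func main() {
	for _, t := range []string{"100.99.71.121", "fd7a:115c:a1e0:ab12:4843:cd96:6263:4779"} {
		ok, err := usableTarget(t)
		fmt.Println(t, ok, err)
	}
	ok, err := usableRoute("10.1.1.0/24")
	fmt.Println("10.1.1.0/24", ok, err)
}
```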

@DentonGentry
Contributor

fd7a:115c:a1e0:ab12:4843:cd96:6263:4779 is node ID [g6dGy]. Shortly before the BUG-3df2c46a509f2ad1cef11ebfc6df074a7094fc2e47f19e4a4d34feedb585986f-20210820043224Z-ff622957e28b3d52 we see what looks like fd7a:115c:a1e0:ab12:4843:cd96:6263:4779 trying to connect to the GKE node (it sent a call-me-maybe through the DERP relay):

2021-08-20 04:48:46.56512355 +0000 UTC: magicsock: disco: d:3c0bf5063ba74d06->d:ae63304db1e18149 ([g6dGy], derp-10) sent call-me-maybe
2021-08-20 04:48:46.565163692 +0000 UTC: magicsock: disco: d:3c0bf5063ba74d06->d:ae63304db1e18149 ([g6dGy], derp-10) sent ping tx=2269c0b5e291
2021-08-20 04:48:46.58736709 +0000 UTC: magicsock: disco: d:3c0bf5063ba74d06<-d:ae63304db1e18149 ([g6dGy], 127.3.3.40:10)  got pong tx=2269c0b5e291 latency=22ms pong.src=127.3.3.40:10
2021-08-20 04:48:49.564866241 +0000 UTC: magicsock: disco: timeout waiting for pong e26309f3daad from 10.5.0.4:41641 ([g6dGy], d:ae63304db1e18149)
2021-08-20 04:48:49.564994499 +0000 UTC: magicsock: disco: timeout waiting for pong b744c2185dc9 from 73.118.208.63:41645 ([g6dGy], d:ae63304db1e18149)
2021-08-20 04:48:49.565036374 +0000 UTC: magicsock: disco: timeout waiting for pong 76d9d8096a1e from 192.168.1.55:41641 ([g6dGy], d:ae63304db1e18149)
2021-08-20 04:48:49.565255984 +0000 UTC: magicsock: disco: timeout waiting for pong 1935eb9243a7 from 198.54.131.77:41641 ([g6dGy], d:ae63304db1e18149)

The GKE node tries sending pings to every IP address it knows of for fd7a:115c:a1e0:ab12:4843:cd96:6263:4779, and the path that works and gets a response is... 127.3.3.40:10 ?
Does that make sense?

@polds
Author

polds commented Aug 20, 2021

I'm not sure what that 127.3.3.40:10 address is, but based on that it does look like it should be responding?

I've thrown together a repo that has 3 different attempts at getting Cloudrun to work - GKE is still the desired target, but I figure if I can get one working the other would be easy. I'm hoping you might be able to take a look and just see if I'm just plain blatantly missing something.

The repo is at tailscale-poc. The three different (and failed) technologies I'm using are:

At the moment I've got these deployed with:

gcloud builds submit --config=cloudbuild.yaml --substitutions=_TAILSCALE_AUTH="tskey-mykey",_TAILSCALE_ENDPOINT="http://[fd7a:115c:a1e0:ab12:4843:cd96:6263:4779]:5000"

So they are all attempting to reach the IPv6 target of a service, and none of them are working, except when run locally.

Thanks for looking at the logs for this. I'm stumped.

@DentonGentry
Contributor

https://github.com/polds/tailscale-poc/blob/main/cmd/fetch-headers/main.go uses http.NewRequest*, which does not automatically use a SOCKS5 proxy from environment variables. It has to be told to do so, either by making the URL be socks5:// or by setting the Proxy:

        client := &http.Client{
                Transport: &http.Transport{
                        Proxy: http.ProxyFromEnvironment,
                },
        }

@polds
Author

polds commented Aug 20, 2021

Thanks for pointing that out and taking a look. It still doesn't work, and also wouldn't explain the other two containers. Do you happen to have a working Cloudrun container/repo I could try deploying into my network to see if it's just something I've done incorrectly?

This is the error I see from the Go example:

fetching http://[fd7a:115c:a1e0:ab12:4843:cd96:6263:4779]:5000...

unable to issue http request: Get "http://[fd7a:115c:a1e0:ab12:4843:cd96:6263:4779]:5000": dial tcp [fd7a:115c:a1e0:ab12:4843:cd96:6263:4779]:5000: connect: network is unreachable

@DentonGentry
Contributor

DentonGentry commented Aug 20, 2021

The code I used in CloudRun, which eventually turned into the content of the Knowledge Base article, connects to an MQTT server in my house. The code can be seen at https://gist.github.com/DentonGentry/157c87a20fabfa83e798426d5b44a057

The Paho MQTT client for Go I used will use the ALL_PROXY environment variable automatically if it is set, and connect via SOCKS5.


In the Go example you showed, I'd expect to see something about use of a Proxy. I think it still isn't connecting to SOCKS5; instead it is making a direct HTTP connection and reporting that the address is unreachable.

@polds
Author

polds commented Aug 20, 2021

Switching my Go example to use HTTP_PROXY instead of ALL_PROXY has made the Go example work on Cloudrun. You were correct about it not using the SOCKS5 proxy. I suspect the nginx and caddy examples are doing something similar. I'll have to find their settings for using SOCKS proxies and see if that helps.

Progress! (And thanks again)

@DentonGentry
Contributor

Since it seems like the effort is unblocked, I'll close this now.

2 participants