Tailscale not installing and stuck in "deploying" status #1874
Comments
➤ Bug Clerk commented: Thank you for submitting this TrueNAS Bug Report! So that we can quickly investigate your issue, please attach a Debug file and any other information related to this issue through our secure and private upload service below. Debug files can be generated in the UI by navigating to System > Advanced > Save Debug. https://ixsystems.atlassian.net/servicedesk/customer/portal/15/group/37/create/153 |
➤ Tyler Jiles commented: i have uploaded the debug file |
➤ Bonnie Follweiler commented: Good Morning Tyler Jiles. I have moved this ticket into our queue to review. An engineering representative will update with any further questions or details in the near future. |
Hello, can you please share the output of the following commands (run them on the host):
sudo k3s kubectl get secret -n ix-tailscale
sudo k3s kubectl get sa -n ix-tailscale
sudo k3s kubectl get deploy -n ix-tailscale tailscale -o jsonpath={'.spec.template.spec.automountServiceAccountToken'} | jq
sudo k3s kubectl get rolebinding -n ix-tailscale -o jsonpath={'.items'} | jq
sudo k3s kubectl get role -n ix-tailscale -o jsonpath={'.items'} | jq
Thanks |
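For anyone gathering this output, the five commands requested above can be wrapped so that everything lands in one file to paste into the issue. This is only a rough sketch, not an official TrueNAS tool: the function name `collect_tailscale_diag`, the `KUBECTL` override variable, and the default file name `tailscale-diag.txt` are made up for illustration. Only the kubectl invocations themselves come from the request above, and the `| jq` pretty-printing is left out so the sketch does not depend on jq being installed.

```shell
#!/usr/bin/env bash
# Sketch: run the diagnostic commands requested above and save their output.
# KUBECTL and collect_tailscale_diag are illustrative names, not part of TrueNAS.
KUBECTL="${KUBECTL:-sudo k3s kubectl}"
NS="ix-tailscale"

collect_tailscale_diag() {
  # $1: output file (hypothetical default name)
  local out="${1:-tailscale-diag.txt}"
  {
    echo "## secrets"
    $KUBECTL get secret -n "$NS"
    echo "## service accounts"
    $KUBECTL get sa -n "$NS"
    echo "## automountServiceAccountToken"
    $KUBECTL get deploy -n "$NS" tailscale -o jsonpath='{.spec.template.spec.automountServiceAccountToken}'
    echo
    echo "## rolebindings"
    $KUBECTL get rolebinding -n "$NS" -o jsonpath='{.items}'
    echo
    echo "## roles"
    $KUBECTL get role -n "$NS" -o jsonpath='{.items}'
    echo
  } > "$out" 2>&1   # keep error messages too; they are often the useful part
  echo "wrote $out"
}
```

After sourcing the file on the host, running `collect_tailscale_diag` writes the combined output to `tailscale-diag.txt` in the current directory.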
Hello, I'm having the same issue with it being stuck in "deploying" forever. I ran the command and here's the output: NAME TYPE DATA AGE |
What does your log say? |
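If the web UI log viewer is slow or empty, the same information can usually be pulled from the host shell. A minimal sketch, assuming the app lives in the `ix-tailscale` namespace with a deployment named `tailscale` (as in the commands earlier in this thread); the function names and the `KUBECTL` variable are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: print recent container logs and namespace events for the Tailscale app.
# KUBECTL, tailscale_logs and tailscale_events are illustrative names.
KUBECTL="${KUBECTL:-sudo k3s kubectl}"

tailscale_logs() {
  # Last N log lines (default 50) from a pod behind the "tailscale" deployment.
  $KUBECTL logs -n ix-tailscale deploy/tailscale --tail="${1:-50}"
}

tailscale_events() {
  # Namespace events, oldest first (probe failures, restart back-offs, ...).
  $KUBECTL get events -n ix-tailscale --sort-by=.lastTimestamp
}
```

For example, `tailscale_logs 100` prints the last 100 log lines, and `tailscale_events` lists the scheduling and probe events for the namespace.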
I believe this might be the same issue as I've had for months every time I've tried any Tailscale chart for TrueNAS SCALE. The container logs always die with:
This happens both with the iX chart, and also the Truecharts version of Tailscale. |
Hmm, |
@stavros-k I actually don't remember. It was about 6 months ago that I originally installed it. Is there a way to check after the fact? I'm happy to try both options on fresh installs of TrueNAS SCALE on a couple of VMs. I've had this issue since day one with both charts, and through all new versions up to the current v23.10.1. Tailscale was basically the first thing I tried to set up. |
What user do you use to login to TrueNAS webui? |
Ah, I have an |
Interesting, I'll spin a VM and make some tests to check if that might be the issue. |
Bummer, I just installed a VM with 23.10.1 with |
@stavros-k I have a |
@stavros-k I have the exact same issue as @jimeh in their comment - stuck on permission and getting authkey. Tried with/without the following options: host network, advertise routes, and accept DNS. I log into the TrueNAS Scale web UI as root, fyi. Edited: Actually, my logs end with
|
Yeah, this was mostly to make sure it isn't something very obvious. I wouldn't expect this to be the issue. |
@stavros-k if it helps, the initial install for me was v22.12.3.1 on 2023-07-01. The first three things I did were create a ZFS pool, create a custom user, and then attempt to run the Tailscale chart. I haven't had time yet to properly poke around and check the CA certs in the container against those that the HTTPS API endpoints are signed with. I'll try to put aside time for that next week. |
That would be great, Thanks! |
Seeing the same issue. Set up as Admin user during setup and I do have a root user. Output of requested commands:
App logs
|
Just commenting here to say that I have the exact same issue when I try installing Tailscale (both iX and Truecharts versions) with Cobia. I've tried running Tailscale on each new Cobia release, but still nothing. I have to revert back to Bluefin to get Tailscale to deploy properly. I'm currently still on the last release of Bluefin due to this. |
Ditto. Fresh 23.10.1.3 install, logged in as admin, stuck on Deploying whenever "Host Network" is checked. Tried all checkbox combinations; it is definitely related to the "Host Network" setting. Edit: Completely removing the Tailscale app and reinstalling with "Host Network" checked did work, so it appears that changing the existing instance is actually the issue. Still no Tailscale interface visible in the GUI. Will keep experimenting. Edit 2: Even though the interface does not show in the GUI, it is there, and working. I can ping, SSH and otherwise manage other Tailscale nodes via their IPs. |
@Cellobita I wonder why that is for you. I've tried with "Host Network" both enabled and disabled. Doesn't change the issue and the container's logs still say the same few lines. When the next Cobia release drops, I'll try again to see if there's any change. |
@maru801 I have no idea, this is my first Tailscale deployment - still learning the ropes. Keep in mind that I had to delete the existing Tailscale instance and reinstall with "userspace" unchecked and "host network" checked. This did the trick, and afterwards I did the same procedure on a few more TN Scale servers, and they all worked without a hitch. |
@maru801 no luck for me either with @Cellobita's instructions. A bit off-topic, but while we wait for a fix: For those looking for an alternative while this issue with Tailscale+Cobia is fixed, I've set up a Debian VM in TrueNAS's Virtualization tab, and installed Tailscale within that VM. Would still prefer to run it as a chart app though so I don't have to run an entire VM just for Tailscale. Some things to keep in mind when setting this up:
|
@jenesuispasbavard Thanks for the tips. I'm still on Bluefin as Tailscale still works with that for me. It's TN Scale Cobia that's giving me this issue. |
Does anyone having this issue use custom domain and/or additional domains? |
@stavros-k I believe I just left that all at their defaults when I installed Truenas Scale. I just gave it a custom host name. I only have "local" under the domain and no other additional domains. Sorry, I'm not going to be able to help test this as I'm waiting until the next Cobia release to update to it and try again with this issue. |
Hello, @aaronpoweruser |
No, I have never heard of that domain, I am not behind a vpn or proxy. |
You might wanna check your DNS setup. Something seems to be hijacking requests and redirecting them somewhere else. |
Updated DNS via router and TrueNAS, with no immediate effect (I did not flush the DNS cache, so it's possible my changes did not take). Tailscale is working after upgrading to |
I decided to do a fresh install of the latest Truenas Scale release (v23.10.1.3) to see if I could get Tailscale to work with all of my settings erased. Nothing. After running Tailscale, when I try to check its logs, nothing shows up for a good while until I get the same 4 lines that @jenesuispasbavard says they're getting (ending with the "context deadline exceeded" line). I've looked around my router's settings to see if I could update the DNS like @aaronpoweruser did, but my router doesn't appear to have anywhere I can do this myself. I reverted back to Truenas Scale 22.12.4.2. To note, I was able to almost get it to work one time. I tried running Tailscale with userspace unchecked and host network checked (as others in this thread said to try), and it almost worked. The app actually accepted the ssh key I gave it (and I was able to give access on my Tailscale homepage). However, it didn't do anything else after that. It just froze doing nothing. |
SCALE 23.10.2 should be available tomorrow, but its bug tracker did not show any fixed issues obviously related to this. I have been using Tailscale in 23.10.1.3 and it is working perfectly for me, on all five SCALE devices that I have under my management (they all updated flawlessly to 1.58.2 a few days ago). I know this is infuriating for those that can't get it to work, but still worth mentioning. |
Just in case, I've tried running Tailscale again with Scale v23.10.2 and nothing. Reiterating others' previous logs, here is what the log says (after loading for a minute):
|
I have the same error on v.23.10.2
Host Networking - enabled edit: upgraded from TN Core to Scale. Does that matter? reading on the previous comments,
--> No edit 2: I am using an admin user (non-root). I tried to install it using the root user in the GUI too though, but I get the same error. I tried changing my nameserver to 1.1.1.1 in Global Configuration, but still same error Output of the k3s command:
|
I tried once again to fix this issue now that the first stable release of Dragonfish is out for Scale. I updated to v24.04.0. So I managed to fix my issue. Turns out it was the nameserver I was using. I was using my default gateway as one of the three nameservers. I switched to the following:
I think I only needed one of the two, but either way, Tailscale loaded up fine after the change. To add more detail, I found that if I include my default gateway address as one of the three nameservers, Tailscale is not able to load and gets stuck with the permissions issue. As long as I don't include my default gateway at all and use other DNS addresses, this issue is fixed for me. |
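To see whether a system matches the situation described above (default gateway listed as a nameserver), something like the following could be run on the host. This is a hedged sketch: it assumes the active resolver configuration is in `/etc/resolv.conf`, and the function names `list_nameservers` and `gateway_is_nameserver` are invented for this example. It only reports the configuration and does not change anything.

```shell
#!/usr/bin/env bash
# Sketch: check whether a given gateway IP is listed as a nameserver.
# Assumes a resolv.conf-style file; function names are illustrative.

# Print every nameserver address found in the file ($1, default /etc/resolv.conf).
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed (exit 0) if gateway IP $1 appears as a nameserver in file $2.
gateway_is_nameserver() {
  list_nameservers "${2:-/etc/resolv.conf}" | grep -qxF -- "$1"
}

# Example use (not executed here):
#   gw="$(ip -4 route show default | awk '{ print $3; exit }')"
#   if gateway_is_nameserver "$gw"; then
#     echo "default gateway $gw is also configured as a nameserver"
#   fi
```

If the check succeeds, temporarily switching to other nameservers in Network > Global Configuration, as described above, is one way to test whether the gateway's DNS is the culprit.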
@maru801 That's an interesting find. Is your default gw one of those that ISPs send? Or is it some other like pf/opn sense or other brand? Any notable settings in the DNS configuration in your default gw? I'm still not able to reproduce tho, I have my default gw as my only nameserver, on 2 locations. |
@stavros-k I have been using the default nameserver that's auto-populated when you first install Truenas. And yes, it's the default gateway that my ISP sends (my network is still running with an old ISP-provided all-in-one router+modem). I'm not using anything like pf/opn sense. In fact, the only things I ever changed in the Global Config section for network is the hostname (and now the two nameserver additions). Everything else is how it was auto-populated when I installed Truenas. I'm thinking that this issue might be regional or ISP specific. Like maybe everyone that gets this issue shares something in common with their ISP that's just incompatible with Tailscale. I was using my ISP's default DNS server just fine on Bluefin. Anything newer so far (Cobia, Dragonfish) results in Tailscale not being able to start with my ISP's DNS server. I have no idea what changed in the code after Cobia to cause this. I don't know if this is related, but if it helps (since it's also a network issue), I also found that with Dragonfish, I'm not able to access another server on my local network by using its hostname. I was able to on Bluefin. Now, I need to specify the IP address instead of the hostname. |
@stavros-k I'm also suffering from this issue after upgrading to the stable Dragonfish release (was working when on RC). I will try to provide all the instructions you've asked for above. I installed as root on 23 before upgrading to 24 RC then finally to 24 release which broke. Using TailScale from TrueNAS not TrueCharts, I've enabled Advertise Exit Nodes with my local network under advertised routes. Userspace and Accept DNS unchecked, Host Network is checked (for the advertised routes). Networking is through a setup bridge (br0) with a static IP, Nameservers is only gw.
Any help would be greatly appreciated as I can't find any solutions in this thread that work for me. |
I don't think it's a Tailscale issue, because the error is that Tailscale can't access the Kubernetes secret store to get/set some details. But it does indeed seem to be more related to DNS. (It's always DNS lol). |
Okay, I found a way to fix my issue, and I'm posting it in the hope that it helps others. I found at least two possible causes, but I'm honestly not sure which fix did it. I had set up a bridge (br0) for my Home Assistant OS VM install on the TrueNAS, and while everything seemed to work at first after the simple switch (just setting up the br0 instance with the static IP of the old interface), I noticed that the errors in the logs were all about accessing the Kubernetes IP ranges (172.xxx.xxx.xxx). After digging around, I found that under Apps > Settings > Advanced Settings there is a section for specifying details about the route interface, and it was empty. It seemed not to impact anything, but I had never checked this location before. I set up my router as the default gateway and my bridge here. Then, as @stavros-k mentioned it's possibly DNS related, I went to my Network page and added Quad9 and Cloudflare under Name Servers so that everything would have some form of default (before, it just had the default route with my router). After this, for good measure, I rebooted the TrueNAS, and when things came back up I was able to get Tailscale to finally work again. (I did a full reinstall with a new key as part of my previous troubleshooting; unsure if that was needed.) |
Removing the app and reinstalling with userspace unchecked and host checked worked for me. |
tailscale is stuck in the deploying status and seems to be failing to launch (see history below):
2023-12-04 18:16:06  Back-off restarting failed container tailscale in pod tailscale-6bdbbf4876-bgwrp_ix-tailscale(31c45dfc-6b4d-4738-96ba-f19a5c9664bf)
2023-12-04 18:11:09  Startup probe failed: command "tailscale status" timed out
2023-12-04 18:10:54  Created container tailscale
2023-12-04 18:10:54  Started container tailscale
2023-12-04 18:10:52  Created pod: tailscale-6bdbbf4876-bgwrp
2023-12-04 18:10:52  Successfully assigned ix-tailscale/tailscale-6bdbbf4876-bgwrp to ix-truenas
2023-12-04 18:10:52  Add eth0 [172.16.0.8/16] from ix-net
2023-12-04 18:10:52  Container image "tailscale/tailscale:v1.54.0" already present on machine
2023-12-04 18:10:51  Scaled up replica set tailscale-6bdbbf4876 to 1