tailscale: services.tailscale.loginServer && services.tailscale.preAuthKey && systemd.services.tailscale-autoconnect #204622

Open
ghuntley opened this issue Dec 5, 2022 · 3 comments

Comments

@ghuntley
Member

ghuntley commented Dec 5, 2022

Project description

The goal is to improve operability between services.tailscale.* and services.headscale.*

Metadata

Opening up a GitHub issue for discussion about adjusting services.tailscale to enable:

  1. Defining a custom preAuthKey (it is expected that someone would configure ACLs to deny by default any freshly provisioned servers with this key)
  2. Defining a custom control plane (i.e. allowing someone to wire services.tailscale into services.headscale)
  3. Shipping a systemd unit for one-shot activation of 1) and 2)
  4. Potentially shipping extraUpFlags 👇 for usage with systemd.services.tailscale-autoconnect
    extraUpFlags = mkOption {
        type = types.listOf types.str;
        description = lib.mdDoc "Extra flags passed to the Tailscale up command.";
        example = literalExpression ''[ "--exit-node" "exitnode.example.com" ]'';
        default = [];
    };
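
A minimal sketch of how the extraUpFlags list above might be consumed by the one-shot unit. It assumes the option lives under services.tailscale and that the unit is the tailscale-autoconnect job proposed in this issue; neither exists in the module yet. lib.escapeShellArgs quotes each list element, so flags containing spaces or shell metacharacters survive intact.

```nix
{ config, lib, pkgs, ... }:

let
  cfg = config.services.tailscale;
in
{
  # Assumption: extraUpFlags is declared as in the option snippet above.
  systemd.services.tailscale-autoconnect.script = ''
    ${pkgs.tailscale}/bin/tailscale up ${lib.escapeShellArgs cfg.extraUpFlags}
  '';
}
```

With the example value from the option, the generated command would be `tailscale up '--exit-node' 'exitnode.example.com'`.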

Proposed API something like this but no strong feelings, I'm after the outcome here...

{ pkgs, config, lib, ... }:

with lib;

let
  cfg = config.services.tailscale;

in
{

  ###### interface

  options.services.tailscale = {
    loginServer = mkOption {
      type = types.str;
      default = "https://controlplane.tailscale.com";
      description = lib.mdDoc ''Base URL of the Tailscale (or Headscale) control server.'';
      example = literalExpression ''"https://controlplane.example.com"'';
    };

    preAuthKey = mkOption {
      type = types.str;
      default = "";
      description = lib.mdDoc ''Node pre-authentication key; if it begins with "file:", then it's a path to a file containing the authkey'';
      example = literalExpression ''"file:/var/lib/tailscale/auth.key"'';
    };
  };

  ###### implementation

  config = mkIf cfg.enable {

    environment.systemPackages = with pkgs; [ tailscale ];

    # create a oneshot job to authenticate to Tailscale
    systemd.services.tailscale-autoconnect = {
      description = "Automatic connection to Tailscale";

      # make sure tailscaled is running before trying to connect to tailscale
      # (the daemon unit shipped by the NixOS module is tailscaled.service)
      after = [ "network-pre.target" "tailscaled.service" ];
      wants = [ "network-pre.target" "tailscaled.service" ];
      wantedBy = [ "multi-user.target" ];

      # set this service as a oneshot job
      serviceConfig.Type = "oneshot";

      # have the job run this shell script
      script = with pkgs; ''
        # wait for tailscaled to settle
        sleep 2

        # check if we are already authenticated to tailscale
        status="$(${tailscale}/bin/tailscale status --json | ${jq}/bin/jq -r .BackendState)"
        if [ "$status" = "Running" ]; then # if so, then do nothing
          exit 0
        fi

        # otherwise authenticate with tailscale; quote both values for the shell
        ${tailscale}/bin/tailscale up --login-server=${lib.escapeShellArg cfg.loginServer} --authkey=${lib.escapeShellArg cfg.preAuthKey}
      '';
    };
  };
}
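
For illustration, a host configuration using the proposed options might look like the sketch below. The URL and the key path are hypothetical placeholders, and loginServer / preAuthKey are the options proposed in this issue, not options that exist in the module today.

```nix
{
  services.tailscale = {
    enable = true;

    # Hypothetical Headscale URL; replace with your own control server.
    loginServer = "https://headscale.example.com";

    # Hypothetical path. The "file:" prefix asks tailscale to read the key
    # from disk instead of treating the string itself as the key.
    preAuthKey = "file:/run/secrets/tailscale-authkey";
  };
}
```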

cc existing maintainers of both packages // @kradalby @danderson @mbaillie @twitchyliquid64

@kradalby
Member

kradalby commented Dec 5, 2022

@Xe might also be interested in this (author of https://tailscale.com/blog/nixos-minecraft/, which is where I copied my first iteration of this).

I think the extra flags option is needed, at least for me to find this useful (my version).

Maybe change the sleep 2 to a loop that checks $status = "Running" with a timeout instead, to make it a bit more robust and not dependent on timing?
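
One reading of this suggestion, as a rough sketch: poll tailscaled until it reports any backend state, giving up after a fixed number of attempts, and only then decide whether to authenticate. The 30-attempt limit and the one-second interval are arbitrary assumptions, and the `tailscale up` call that would follow is left out.

```nix
{ pkgs, ... }:

{
  systemd.services.tailscale-autoconnect.script = ''
    # Poll for up to ~30 seconds instead of sleeping for a fixed time.
    attempt=0
    status=""
    while [ "$attempt" -lt 30 ]; do
      # "|| true" keeps a failing probe from aborting the script.
      status="$(${pkgs.tailscale}/bin/tailscale status --json 2>/dev/null \
        | ${pkgs.jq}/bin/jq -r .BackendState 2>/dev/null || true)"

      # Any non-empty state means the daemon answered.
      if [ -n "$status" ]; then
        break
      fi

      attempt=$((attempt + 1))
      sleep 1
    done

    # Already connected: nothing to do.
    if [ "$status" = "Running" ]; then
      exit 0
    fi

    # The "tailscale up" call from the proposal would follow here.
  '';
}
```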

@Xe
Contributor

Xe commented Dec 5, 2022

I did actually just write another iteration of this for my blog (in an upcoming post about Terraform crimes), but I put the authkey in /etc/tailscale/authkey. My implementation has a systemd one shot job that connects the machine to Tailscale after lustrating the key over from Ubuntu running cloud-init and nixos-infect. You can see my code here: https://github.com/Xe/automagic-terraform-nixos/blob/main/main.tf#L51-L83

I would find this useful, especially because it would make it a lot easier for me to automate the creation of NixOS machines on my tailnet.

I have to admit that the sleep 2 is there as a part of the hacking process for that Minecraft post and is fairly arbitrarily chosen (sleeping for 1 second wasn't reliable enough). At the time tailscale up didn't wait for tailscaled to be ready and now it does, so that sleep 2 may not actually be relevant.

I'd also suggest creating a separate autoLogin or autoConnect set of options for this, i.e.:

  options.services.tailscale.autoConnect = {
    # mkEnableOption prepends "Whether to enable", so pass a noun phrase
    enable = mkEnableOption "automatic connection to your tailnet via the following options";

    loginServer = mkOption {
      type = types.str;
      default = "https://controlplane.tailscale.com";
      description = lib.mdDoc ''Base URL of the Tailscale (or Headscale) control server.'';
      example = literalExpression ''"https://controlplane.example.com"'';
    };

    preAuthKey = mkOption {
      type = types.str;
      default = "/var/lib/tailscale/auth.key";
      description = lib.mdDoc ''Node pre-authentication key; if it begins with "file:", then it's a path to a file containing the authkey'';
      example = literalExpression ''"/var/lib/tailscale/auth.key"'';
    };
  };

Then you could have your mkIf on config.services.tailscale.autoConnect.enable instead. You could also add a ConditionPathExists to the oneshot job for the preAuthKey path that the user provides.
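
A sketch of what that could look like, assuming the autoConnect options above and treating preAuthKey as a path to a key file (with or without a leading "file:"). The unit name and the dependency on tailscaled.service are assumptions carried over from the proposal, not an existing module.

```nix
{ config, lib, pkgs, ... }:

let
  cfg = config.services.tailscale.autoConnect;

  # Strip an optional "file:" prefix so systemd sees a plain path.
  keyPath = lib.removePrefix "file:" cfg.preAuthKey;
in
{
  config = lib.mkIf cfg.enable {
    systemd.services.tailscale-autoconnect = {
      description = "Automatic connection to Tailscale";
      after = [ "tailscaled.service" ];
      wants = [ "tailscaled.service" ];
      wantedBy = [ "multi-user.target" ];
      serviceConfig.Type = "oneshot";

      # Skip the unit (without marking it failed) when the key file is absent.
      unitConfig.ConditionPathExists = keyPath;

      script = ''
        ${pkgs.tailscale}/bin/tailscale up \
          --login-server=${lib.escapeShellArg cfg.loginServer} \
          --authkey=file:${lib.escapeShellArg keyPath}
      '';
    };
  };
}
```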

@danderson
Contributor

We almost certainly want to wait a beat before exploring this. Tailscale 1.34 is about to ship with user account switching, which changes the interfaces for connecting an account and specifying custom control servers. If we try to ship something now with 1.32, we'll just have to tear it out again in a week.

The authkey is a secret, so per NixOS's conventions should only be specifiable as a file on disk IMO. So I would lose the file: syntax and call it authkeyPath or something. If someone really wants to ship a world-readable authkey in their image, they can always use one of the various file management parts of NixOS to install a secret on disk.
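
As a rough illustration of the file-only variant, the option could be declared along these lines. The name authKeyPath is hypothetical (taken from the "authkeyPath or something" wording above), and the unit is the tailscale-autoconnect job proposed in this issue.

```nix
{ config, lib, pkgs, ... }:

let
  cfg = config.services.tailscale;
in
{
  options.services.tailscale.authKeyPath = lib.mkOption {
    # Pass a string such as "/run/secrets/..." rather than a Nix path literal,
    # otherwise the file would be copied into the world-readable store.
    type = lib.types.nullOr lib.types.path;
    default = null;
    example = "/run/secrets/tailscale-authkey";
    description = lib.mdDoc "Path to a file containing the node pre-authentication key.";
  };

  config = lib.mkIf (cfg.enable && cfg.authKeyPath != null) {
    # Only the path ends up in the unit; the secret itself stays on disk.
    systemd.services.tailscale-autoconnect.script = ''
      ${pkgs.tailscale}/bin/tailscale up --authkey=file:${lib.escapeShellArg cfg.authKeyPath}
    '';
  };
}
```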

Lose the sleep in the shell script. The logic you actually want there is to loop waiting for the Unix socket path to exist. Once it exists, the CLI can connect to tailscaled without races. The status check also needs to be a bit more intelligent, because otherwise every time the authentication expires or the admin de-authorizes the machine in the admin panel, this logic will freak out and try to reauth, which will knock the machine offline rather than do anything useful.
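
The socket wait could be sketched roughly as follows. The socket path is an assumption (the usual Linux default; it differs if tailscaled is started with another --socket value), and the 30-second limit is arbitrary. This covers only the race on startup, not the smarter status check.

```nix
{ pkgs, ... }:

{
  systemd.services.tailscale-autoconnect.script = ''
    # Assumed socket location; adjust if tailscaled uses a different --socket.
    socket=/run/tailscale/tailscaled.sock

    # Wait up to ~30 seconds for the socket to appear, then give up loudly.
    attempt=0
    until [ -S "$socket" ]; do
      attempt=$((attempt + 1))
      if [ "$attempt" -ge 30 ]; then
        echo "tailscaled socket $socket did not appear" >&2
        exit 1
      fi
      sleep 1
    done

    # From here on the CLI can talk to tailscaled without racing its startup.
  '';
}
```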

Honestly, this logic is complex enough that it almost certainly wants to be a program more advanced than a shell script. Ideally we'd teach the tailscale CLI to do "only do this login if you've never been authed before" or something, because that's behavior we want all over the place, and these tailscale up hacks only do that if you ignore the edge cases.
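
Until the CLI has such a mode, one stopgap is a marker file that records the first successful login, so the unit never runs again afterwards. This is a sketch of exactly the kind of hack described above, with its own edge cases (for example, the marker outliving a wiped tailscaled state directory). The marker path and the key path are hypothetical.

```nix
{ pkgs, ... }:

let
  # Hypothetical marker file recording that the first login succeeded.
  stamp = "/var/lib/tailscale/.autoconnect-done";
in
{
  systemd.services.tailscale-autoconnect = {
    serviceConfig.Type = "oneshot";

    # The leading "!" negates the condition: run only while the marker is absent.
    unitConfig.ConditionPathExists = "!${stamp}";

    script = ''
      # Create the marker only if the login itself succeeded.
      ${pkgs.tailscale}/bin/tailscale up --authkey=file:/run/secrets/tailscale-authkey \
        && touch ${stamp}
    '';
  };
}
```

Because the marker is only written after a successful `tailscale up`, a later key expiry or de-authorization does not trigger a re-login attempt.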

In general I have no objections to the idea, just ^ a bunch of thoughts about how to implement it so that it won't break constantly in future :)


5 participants