Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/acme: 20.03 -> 20.09 regression with opensmtpd #101389

Closed
wizeman opened this issue Oct 22, 2020 · 13 comments · Fixed by #147784 · May be fixed by #123261
Closed

nixos/acme: 20.03 -> 20.09 regression with opensmtpd #101389

wizeman opened this issue Oct 22, 2020 · 13 comments · Fixed by #147784 · May be fixed by #123261
Labels
0.kind: bug 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md

Comments

@wizeman
Copy link
Member

wizeman commented Oct 22, 2020

Describe the bug
When upgrading from 20.03 to 20.09, opensmtpd stopped working (apparently) due to ownership changes in #91121:

Oct 22 14:40:01 hostname systemd[1]: Started opensmtpd.service.
Oct 22 14:40:03 hostname smtpd[758]: warn:  /var/lib/acme/mx.example.com/fullchain.pem: not owned by uid 0
Oct 22 14:40:03 hostname smtpd[758]: smtpd: load_pki_tree: failed to load certificate file
Oct 22 14:40:03 hostname systemd[1]: opensmtpd.service: Main process exited, code=exited, status=1/FAILURE
Oct 22 14:40:03 hostname systemd[1]: opensmtpd.service: Failed with result 'exit-code'.

As you can see, opensmtpd seems to require the certificates to be owned by root.
I tried adding chown root fullchain.pem key.pem to security.acme.certs.postRun but it seems that the ownership resets to acme when rebooting.

To Reproduce

  1. Configure acme to generate a certificate
  2. Configure opensmtpd to use the certificate

Expected behavior
opensmtpd should start without errors.

Notify maintainers
@m1cr0man @lheckemann @mweinelt @flokli

Metadata

 - system: `"x86_64-linux"`
 - host os: `Linux 5.4.72, NixOS, 20.09.git.4d3d4221cad (Nightingale)`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.3.7`
@andir
Copy link
Member

andir commented Oct 25, 2020

Merely a workaround but you could probably copy them to another directory after each run. This isn't pretty but would at least unblock you - if this is a blocker at all.

In the long run we might want to consider supporting this and maybe OpenSMTPD can be taugtht to not enforce the uid 0 rule. Removing the code that does that specific check seems rather trivial: https://github.com/OpenSMTPD/OpenSMTPD/blob/a6a39cb0cb6ee72d194f84d6763997df7e0c0bf3/usr.sbin/smtpd/ssl.c#L117-L121. I wonder what they threat model looked like. Is their assumption having files as another uid is less/more safe?

@wizeman
Copy link
Member Author

wizeman commented Oct 25, 2020

Merely a workaround but you could probably copy them to another directory after each run. This isn't pretty but would at least unblock you - if this is a blocker at all.

Thanks, I will try this!

I wonder what they threat model looked like. Is their assumption having files as another uid is less/more safe?

Presumably, having a file as another uid is less safe, otherwise it wouldn't make sense to enforce that rule. One reason might be that if a file has 600 or 400 permissions and is owned by root, then it can only be read by root. If it is owned by another uid, then it can be read by root and that uid. But I don't know if this is the actual reason for the rule.

@m1cr0man
Copy link
Contributor

Presumably, having a file as another uid is less safe, otherwise it wouldn't make sense to enforce that rule. One reason might be that if a file has 600 or 400 permissions and is owned by root, then it can only be read by root. If it is owned by another uid, then it can be read by root and that uid. But I don't know if this is the actual reason for the rule.

The reason for enforcing the acme user is twofold:

  1. We didn't want the ACME services to run as root as there was no technical reason to do so and it posed a security risk (you have a program contacting some HTTP service and writing files to disk, it should be minimally privileged)
  2. In order to ensure that all certs could be created/updated by the ACME services it was necessary to enforce the acme user everywhere.

Copying them to another directory would be the best option here. The real thing you're fighting with regards to adding chown to postRun is the acme-fixperms.service, but even if that worked the 750 permissions would break the renew service entirely (as the acme user wouldn't be able to write new certs). I think this is a particularly unique issue, given that they are enforcing UID 0 for certs (to what end?), and having thought it through, allowing the user to be changed would be more cumbersome than simply copying the certs in the postRun script.

It might be possible to do some adaptation in the service scripts so that all the chown commands use $USER instead of acme. That would mean you could then override all the systemd service users by hand (via `systemd.services..serviceConfig.User`) and get what you need. However I don't see this solving any issues that a copy command in postRun doesn't solve, and is also more complex to use.

@flokli
Copy link
Contributor

flokli commented Oct 25, 2020

I'd also propose copying them to another directory for now, outside of the acme module.

Once systemd 247 has landed, this should probably be revisited. It adds LoadCredential, which can be used to pass around credentials to processes without them having access to the bare files themselves.

Then, files will be made readable by the process that runs the service:

The data is accessible from the unit's processes via the file system, at a read-only
location that (if possible and permitted) is backed by non-swappable memory. The data is only
accessible to the user associated with the unit, via the
User=/DynamicUser= settings (as well as the superuser). […]

This will probably be root, at least currently, as we run opensmtpd.service take care of changing to a less privileged user.

In the more long-term, it might be a good idea to go into discussions with upstream of running opensmtpd, as a less-privileged user, and provide the necessary capabilities to bind on low ports, like we do with nginx.

@wizeman
Copy link
Member Author

wizeman commented Oct 25, 2020

I can confirm that copying to another directory seems to work for me. Thanks!

@aneeshusa
Copy link
Contributor

I'm running into the same issue with PostgreSQL (sanitized):

Oct 26 00:58:36 myhost systemd[1]: Starting PostgreSQL Server...
Oct 26 00:58:37 myhost postgres[19852]: [19852] FATAL:  private key file "/var/lib/acme/myhost.example.com/key.pem" must be owned by the database user or root
Oct 26 00:58:37 myhost postgres[19852]: [19852] LOG:  database system is shut down
Oct 26 00:58:37 myhost systemd[1]: postgresql.service: Main process exited, code=exited, status=1/FAILURE
Oct 26 00:58:37 myhost systemd[1]: postgresql.service: Failed with result 'exit-code'.
Oct 26 00:58:37 myhost systemd[1]: Failed to start PostgreSQL Server.

We do already run PostgreSQL as the user postgres, and it's unclear if LoadCredential will work for it from the systemd docs. Looking at the implementation it seems to try to use ACLs and otherwise fallback to file system ownership. Might just have to try it once we get systemd 247.

@aneeshusa
Copy link
Contributor

I added some pre-start code to do the "copy certs to a new directory" and it looks like PostgreSQL also cares about the permissions:

Oct 26 01:24:14 myhost systemd[1]: Starting PostgreSQL Server...
Oct 26 01:24:14 myhost postgres[20921]: [20921] FATAL:  private key file "/run/postgresql/certs/key.pem" has group or world access
Oct 26 01:24:14 myhost postgres[20921]: [20921] DETAIL:  File must have permissions u=rw (0600) or less if owned by the database user, or permissions u=rw,g=r (0640) or less if owned by root.
Oct 26 01:24:14 myhost postgres[20921]: [20921] LOG:  database system is shut down
Oct 26 01:24:14 myhost systemd[1]: postgresql.service: Main process exited, code=exited, status=1/FAILURE

Handling it in my pre-start code but something to keep in mind when developing a better solution.

@stale
Copy link

stale bot commented Apr 26, 2021

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Apr 26, 2021
@fadenb
Copy link
Contributor

fadenb commented Apr 26, 2021

Did not verify (yet) whether this still breaks.
If this is solved we should document how it was solved.

@m1cr0man
Copy link
Contributor

Totally forgot about this actually. It's been so long that @flokli's comment about using LoadCredential can and should be tested as a solution, as it's something that would work to fix the mentioned Postgres issues too.

I have to fix some bugs with the postfix tests too, was planning to do it this weekend and will do this at the same time if no one is in a rush for it.

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Apr 26, 2021
@m1cr0man
Copy link
Contributor

For both opensmtp and Postgresql using LoadCredential is a viable solution. Shown below is a minimal opensmtpd config which loads:

{ pkgs, config, ... }:
let
  mailDomain = "example.com";
  certDir = config.security.acme.certs."${mailDomain}".directory;
in {
  services.opensmtpd = {
    enable = true;
    setSendmail = false;
    serverConfiguration = ''
    pki ${mailDomain} cert "cert.pem"
    pki ${mailDomain} key "key.pem"
    listen on localhost tls pki ${mailDomain}
    action act1 relay host smtp://127.0.0.1:10027
    match for local action act1
    '';
  };

  systemd.services.opensmtpd.requires = ["acme-finished-${mailDomain}.target"];
  systemd.services.opensmtpd.serviceConfig.RuntimeDirectory = "opensmtpd";
  systemd.services.opensmtpd.serviceConfig.WorkingDirectory = "/var/run/opensmtpd";
  systemd.services.opensmtpd.serviceConfig.LoadCredential = [
    "cert.pem:${certDir}/cert.pem"
    "key.pem:${certDir}/key.pem"
  ];
  systemd.services.opensmtpd.preStart = ''
    ln -s $CREDENTIALS_DIRECTORY/cert.pem .
    ln -s $CREDENTIALS_DIRECTORY/key.pem .
  '';

  security.acme.certs."${mailDomain}".postRun = ''
    systemctl restart opensmtpd
  '';
}

This results in cert.pem and key.pem having the correct permissions for opensmtpd to start up, and from testing they are correctly updated when the service is restarted. The last few lines (from LoadCredential down) are almost identical for Postgresql. There may be a way to avoid the symlinks and RuntimeDirectory change if there is a way to use environment variables in the smtpd.conf, or if the $CREDENTIALS_DIRECTORY can be predicted (it seems to be /run/credentials/$servicename, but I dont know hot reliable that is).

The only catch I can see with this right now is that LoadCredential doesn't appear to support service reloads. This is a problem for the likes of Postgres which accepts a SIGHUP for a configuration reload from disk (and thus refreshing certs) without dropping connections. That said, some users may be fine with simply restarting the service for cert renewals.

This feature is very new to Systemd and I can't find much discussion on it beyond the RFE so if someone knows if reloading/refreshing the files created by LoadCredential is possible please let me know! This does not stop us adding a useACMEHost-like option to opensmtpd right now though. :)

@m1cr0man
Copy link
Contributor

A bit of research into whether CREDENTIALS_DIRECTORY will always be in /run/credentials/$servicename led me to this section of the systemd source, where an assumption is being made that creds_path will always be underneath /run/credentials, and thus adds a mount to ensure /run/credentials exists. I can't see exactly where it picks the subdirectory path but I imagine at this point it is similarly hardcoded.

With that, the above opensmtpd configuration can be greatly simplified:

{ pkgs, config, ... }:
let
  mailDomain = "m1cr0man.com";
  certDir = config.security.acme.certs."${mailDomain}".directory;
  credsDir = "/run/credentials/opensmtpd.service";
in {
  services.opensmtpd = {
    enable = true;
    setSendmail = false;
    serverConfiguration = ''
    pki ${mailDomain} cert "${credsDir}/cert.pem"
    pki ${mailDomain} key "${credsDir}/key.pem"
    listen on localhost tls pki ${mailDomain}
    action act1 relay host smtp://127.0.0.1:10027
    match for local action act1
    '';
  };

  systemd.services.opensmtpd.requires = ["acme-finished-${mailDomain}.target"];
  systemd.services.opensmtpd.serviceConfig.LoadCredential = [
    "cert.pem:${certDir}/cert.pem"
    "key.pem:${certDir}/key.pem"
  ];

  security.acme.certs."${mailDomain}".postRun = ''
    systemctl restart opensmtpd
  '';
}

Now we don't need to set Runtime/WorkingDirectory or add preStart scripts. This is remarkably subtle to add to existing services where specific permissions are required for the cert files.

@stale
Copy link

stale bot commented Nov 16, 2021

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Nov 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.kind: bug 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md
Projects
None yet
6 participants