nixos/acme: harden systemd units #123258

Merged
merged 2 commits into from Aug 8, 2021

mweinelt
Member

Motivation for this change

This change applies some hopefully uncontroversial hardening. Hardening is always a fickle business, and I hope I didn't step on anyone's use case.

I've run through the acme test suite and it is still working 😀

✗ PrivateNetwork=                                             Service has access to the host's network                                                                 0.5
✗ RestrictAddressFamilies=~AF_(INET|INET6)                    Service may allocate Internet sockets                                                                    0.3
✗ DeviceAllow=                                                Service has a device ACL with some special devices                                                       0.1
✗ IPAddressDeny=                                              Service does not define an IP address allow list                                                         0.2
✗ PrivateUsers=                                               Service has access to other users                                                                        0.2
✗ ProtectSystem=                                              Service has very limited write access to the OS file hierarchy                                           0.1
✗ SystemCallFilter=~@privileged                               System call allow list defined for service, and @privileged is included (e.g. chown is allowed)          0.2
✗ SystemCallFilter=~@resources                                System call allow list defined for service, and @resources is included (e.g. ioprio_set is allowed)      0.2
✗ RootDirectory=/RootImage=                                   Service runs within the host's root directory                                                            0.1
✗ UMask=                                                      Files created by service are world-readable by default                                                   0.1

→ Overall exposure level for acme-a.example.test.service: 1.5 OK :-)
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • Determined the impact on package closure size (by running nix path-info -S before and after)
  • Ensured that relevant documentation is up to date
  • Fits CONTRIBUTING.md.

RestrictRealtime = true;
RestrictSUIDSGID = true;
SystemCallArchitectures = "native";
SystemCallFilter = "@system-service ~@resources ~@ipc ~@keyring ~@resources ~@setuid";
Member Author

This is the most controversial part honestly.

As I understand it, in a mixed allow/deny list the evaluation goes from left to right and adds and removes things in order. So I add @system-service, then remove @resources, @ipc, etc. from it.

If you specify both types of this option (i.e. allow-listing and deny-listing), the first encountered will take precedence and will dictate the default action (termination or approval of a system call). Then the next occurrences of this option will add or delete the listed system calls from the set of the filtered system calls, depending of its type and the default action. (For example, if you have started with an allow list rule for read() and write(), and right after it add a deny list rule for write(), then write() will be removed from the set.)

Honestly, I'm now not sure why systemd-analyze security still shows that I allow @privileged. I'm not worried about @privileged, because we actually allow @chown.

Contributor

Is it intentional to have ~@resources there twice? Also how does this affect a call to systemctl (for example to restart another service as configured in security.acme.certs.<cert>.postRun)?

Member Author

mweinelt commented May 16, 2021

Is it intentional to have ~@resources there twice?

Indeed unintentional, thanks for spotting this.

Also how does this affect a call to systemctl (for example to restart another service as configured in security.acme.certs.<cert>.postRun)?

This is what is in @resources, so I think not at all.

❯ systemd-analyze syscall-filter @resources
@resources
    # Alter resource settings
    ioprio_set
    mbind
    migrate_pages
    move_pages
    nice
    sched_setaffinity
    sched_setattr
    sched_setparam
    sched_setscheduler
    set_mempolicy
    setpriority
    setrlimit

Unless you meant the syscall filter in general. I think @default, which is part of @system-service, takes care of that:

❯ systemd-analyze syscall-filter @default
@default
    # System calls that are always permitted
    brk
    cacheflush
    clock_getres
    clock_getres_time64
    clock_gettime
    clock_gettime64
    clock_nanosleep
    clock_nanosleep_time64
    execve
    exit
    exit_group
    futex
    futex_time64
    get_robust_list
    get_thread_area
    getegid
    getegid32
    geteuid
    geteuid32
    getgid
    getgid32
    getgroups
    getgroups32
    getpgid
    getpgrp
    getpid
    getppid
    getresgid
    getresgid32
    getresuid
    getresuid32
    getrlimit
    getsid
    gettid
    gettimeofday
    getuid
    getuid32
    membarrier
    mmap
    mmap2
    munmap
    nanosleep
    pause
    prlimit64
    restart_syscall
    rseq
    rt_sigreturn
    sched_yield
    set_robust_list
    set_thread_area
    set_tid_address
    set_tls
    sigreturn
    time
    ugetrlimit

Member Author

Remove the duplicate @resources deny entry.

Type = "oneshot";
User = "acme";
Group = mkDefault "acme";
UMask = 0022;
Member Author

mweinelt commented May 16, 2021

✗ UMask=                                                      Files created by service are world-readable by default                                                   0.1

I'm not sure, but can't we default to 0027 (640, 750) instead?
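
For readers checking the arithmetic behind "(640, 750)": a umask clears permission bits from the mode a program requests. The fragment below is only a sketch of what the stricter setting would look like; the unit name is borrowed from the log output elsewhere in this thread, and the override itself is hypothetical, not part of this PR.

```nix
{
  # Hypothetical override, shown only to illustrate the mask arithmetic.
  systemd.services."acme-example.com".serviceConfig = {
    # Regular files are usually requested as 0666, directories as 0777:
    #   0666 & ~0027 = 0640  (rw-r-----)
    #   0777 & ~0027 = 0750  (rwxr-x---)
    # With the current 0022 the results are 0644 and 0755 instead.
    UMask = "0027";
  };
}
```

Note that under 0027 the "other" class loses every bit, including the search (x) bit on newly created directories.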

Contributor

Member Author

mweinelt commented May 16, 2021

Sadly, this interferes with access to http-01 challenges.

[Sun May 16 16:42:35.722162 2021] [core:error] [pid 3269:tid 140107940099648] (13)Permission denied: [client 192.168.1.1:51694] AH00035: access to /.well-known/acme-challenge/YjIqXHTp3AifLwWilpIr2bOwrNaMqBtlkfMXeaGGXhA denied (filesystem path '/var/lib/acme/acme-challenge/.well-known') because search permissions are missing on a component of the path
[Sun May 16 16:42:37.718537 2021] [core:error] [pid 3269:tid 140107906528832] (13)Permission denied: [client 192.168.1.1:51698] AH00035: access to /.well-known/acme-challenge/YjIqXHTp3AifLwWilpIr2bOwrNaMqBtlkfMXeaGGXhA denied (filesystem path '/var/lib/acme/acme-challenge/.well-known') because search permissions are missing on a component of the path
[Sun May 16 16:42:37.719453 2021] [core:error] [pid 3269:tid 140107898136128] (13)Permission denied: [client 192.168.1.1:51700] AH00035: access to /.well-known/acme-challenge/YjIqXHTp3AifLwWilpIr2bOwrNaMqBtlkfMXeaGGXhA denied (filesystem path '/var/lib/acme/acme-challenge/.well-known') because search permissions are missing on a component of the path
[Sun May 16 16:42:40.271658 2021] [core:error] [pid 3269:tid 140107789030976] (13)Permission denied: [client 192.168.1.1:51706] AH00035: access to /.well-known/acme-challenge/5MYKzt-kQKMOzuxQ1y_tF_Fzz29w1v-S8VCYKZ4nL_E denied (filesystem path '/var/lib/acme/acme-challenge/.well-known') because search permissions are missing on a component of the path
[Sun May 16 16:42:42.274052 2021] [core:error] [pid 3271:tid 140107940099648] (13)Permission denied: [client 192.168.1.1:51710] AH00035: access to /.well-known/acme-challenge/5MYKzt-kQKMOzuxQ1y_tF_Fzz29w1v-S8VCYKZ4nL_E denied (filesystem path '/var/lib/acme/acme-challenge/.well-known') because search permissions are missing on a component of the path
[Sun May 16 16:42:43.275959 2021] [core:error] [pid 3270:tid 140107889743424] (13)Permission denied: [client 192.168.1.1:51714] AH00035: access to /.well-known/acme-challenge/5MYKzt-kQKMOzuxQ1y_tF_Fzz29w1v-S8VCYKZ4nL_E denied (filesystem path '/var/lib/acme/acme-challenge/.well-known') because search permissions are missing on a component of the path

Contributor

Ah also #106603 which I almost forgot about!

@m1cr0man
Contributor

I'm getting a warning on my system after applying these changes:

May 16 18:27:37 myserv systemd[1]: acme-example.com.service:48: Failed to parse system call, ignoring: ~@ipc
May 16 18:27:37 myserv systemd[1]: acme-example.com.service:48: Failed to parse system call, ignoring: ~@keyring
May 16 18:27:37 myserv systemd[1]: acme-example.com.service:48: Failed to parse system call, ignoring: ~@resources
May 16 18:27:37 myserv systemd[1]: acme-example.com.service:48: Failed to parse system call, ignoring: ~@setuid

Is the syntax correct? I don't know what it should be personally.

@mweinelt
Member Author

Hm, so it seems that is not the way to remove individual syscall groups after all. I'll go check the docs … again.
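
A possible reading of the warnings above: systemd only treats "~" as an inversion marker when it is the first character of a SystemCallFilter= list, so "~@ipc" in the middle of a line is parsed as a (nonexistent) syscall name. One way to express a mixed allow/deny filter is to use several SystemCallFilter= lines, which a Nix list in serviceConfig produces. This is a minimal sketch, not necessarily what the PR ended up with; the unit name comes from the log above and the choice of groups is illustrative.

```nix
{
  # Sketch only: the selection of groups is an assumption for illustration.
  systemd.services."acme-example.com".serviceConfig.SystemCallFilter = [
    # The first entry is an allow list, so everything not listed is denied.
    "@system-service"
    # A single leading "~" inverts this whole entry: every group on this
    # line is removed from the allowed set again.
    "~@resources @ipc @keyring @setuid"
  ];
}
```

Each list element becomes its own SystemCallFilter= line in the generated unit, and systemd merges them in order.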

@mweinelt
Member Author

mweinelt commented May 16, 2021

@m1cr0man I've updated the SystemCallFilter and added some comments.

I also added ProtectSystem = "strict";, so now paths you want to access need to be added to ReadOnlyPaths or ReadWritePaths. Does that sound reasonable to you? The tests are not failing, but we would also need to add security.acme.certs.<name>.directory, wherever that gets used.
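
As a rough illustration of what that means for someone who needs an extra writable location: with ProtectSystem = "strict" the file system hierarchy is mounted read-only for the service, and exceptions have to be listed explicitly. The fragment below is a hypothetical user-side override, assuming the unit name from the earlier log and /var/lib/acme as the path in question; it is not taken from this PR.

```nix
{
  # Hypothetical override; unit name and path are assumptions.
  systemd.services."acme-example.com".serviceConfig = {
    # Everything is read-only for the service unless listed below.
    ProtectSystem = "strict";
    # Paths the service may still write to.
    ReadWritePaths = [ "/var/lib/acme" ];
  };
}
```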

✗ PrivateNetwork=                                             Service has access to the host's network                                                             0.5
✗ RestrictAddressFamilies=~AF_(INET|INET6)                    Service may allocate Internet sockets                                                                0.3
✗ DeviceAllow=                                                Service has a device ACL with some special devices                                                   0.1
✗ IPAddressDeny=                                              Service does not define an IP address allow list                                                     0.2
✗ PrivateUsers=                                               Service has access to other users                                                                    0.2
✗ SystemCallFilter=~@privileged                               System call allow list defined for service, and @privileged is included (e.g. chown is allowed)      0.2
✗ RootDirectory=/RootImage=                                   Service runs within the host's root directory                                                        0.1
✗ UMask=                                                      Files created by service are world-readable by default                                               0.1

→ Overall exposure level for acme-a.example.test.service: 1.4 OK :-)

Contributor

m1cr0man left a comment

@m1cr0man I've updated the SystemCallFilter and added some comments.

I also added ProtectSystem = "strict";, so now paths you want to access need to be added to ReadOnlyPaths or ReadWritePaths. Does that sound reasonable to you?

Yeah, so I've just done some testing with this. We prefix the postRun script with "+", which according to the docs means it ignores most of the hardening imposed here.

I modified the postRun test so that it writes to a file outside of /var/lib/acme, and it was successful. This is something I was concerned about breaking in existing setups, but with this test I am confident that they won't be affected.
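
To make the "+" behaviour concrete, here is a small standalone sketch. The unit `example-hardened` and the script name are hypothetical and exist only for illustration; how the acme module wires postRun internally is not shown in this thread. Per the systemd documentation, a command line prefixed with "+" runs with full privileges, so User= and the file system namespacing options are not applied to it.

```nix
{ pkgs, ... }:
let
  # Hypothetical script that writes outside the service's writable paths.
  postRun = pkgs.writeShellScript "example-post-run" ''
    echo testing > /home/test
  '';
in
{
  systemd.services.example-hardened.serviceConfig = {
    Type = "oneshot";
    # Assumes an "acme" user exists on the system.
    User = "acme";
    ProtectSystem = "strict";
    # Main command: runs as "acme" with a read-only file system view.
    ExecStart = "${pkgs.coreutils}/bin/true";
    # "+" prefix: this command runs with full privileges, so it is not
    # confined by User= or ProtectSystem= and can write to /home/test.
    ExecStartPost = "+${postRun}";
  };
}
```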

Happy to see it merged now, but if you could apply this patch so that the test suite covers what I'm explaining above that would be great.

diff --git a/nixos/tests/acme.nix b/nixos/tests/acme.nix
index 6f98b0da378..edd387cae0c 100644
--- a/nixos/tests/acme.nix
+++ b/nixos/tests/acme.nix
@@ -105,9 +105,9 @@ in import ./make-test-python.nix ({ lib, ... }: {
         security.acme.certs."a.example.test".keyType = "ec384";
         security.acme.certs."a.example.test".postRun = ''
           set -euo pipefail
-          touch test
-          chown root:root test
-          echo testing > test
+          touch /home/test
+          chown root:root /home/test
+          echo testing > /home/test
         '';
       };
 
@@ -375,7 +375,7 @@ in import ./make-test-python.nix ({ lib, ... }: {
           switch_to(webserver, "cert-change")
           webserver.wait_for_unit("acme-finished-a.example.test.target")
           check_connection_key_bits(client, "a.example.test", "384")
-          webserver.succeed("grep testing /var/lib/acme/a.example.test/test")
+          webserver.succeed("grep testing /home/test")
 
       with subtest("Correctly implements OCSP stapling"):
           switch_to(webserver, "ocsp-stapling")

@mweinelt mweinelt requested a review from a team June 3, 2021 03:25
@mweinelt
Member Author

mweinelt commented Jul 6, 2021

Rebased after merging #121750.

@mweinelt mweinelt merged commit f49b03c into NixOS:master Aug 8, 2021
@mweinelt mweinelt deleted the acme-hardening branch August 8, 2021 13:50