launchd: Lower security permissions for daemon, startup on reboot #5698
This allows the spawning program to be the nix-daemon instead of /bin/sh. That means the Full Disk Access permission can be granted to the nix-daemon alone.
8dfe7a0 to 5d959b3
When a Darwin host was rebooted, /nix was not mounted. Let the daemon also wait until the store-mounting service has finished.
FWIW, the FDA perm issue was likely resolved in #5172
I was about to make the exact same PR. This solved two things:
I'll confirm that this change is needed for the new osx upgrade, but I don't know if removing wait4path is good, or if there's a better alternative (e.g. I use an external SSD for /nix, so I guess I'll need to research what happens when I boot without it connected).
<key>RunAtLoad</key>
<true/>
<dict>
<key>OtherJobEnabled</key>
Clever, this effectively prevents the daemon from starting without the Nix store existing (replacing the need for wait4path?)
That's the idea. I think there may be a better way, as man launchd.plist mentions that OtherJobEnabled is to be avoided. There is a RunAtLoad replacement called WatchPaths which I would like to explore a little more before this is considered done.
Yeah, I don't think PathState or WatchPaths really work here. It looks like launchd wants to eagerly check that the path to the daemon exists when they are used (even without RunAtLoad). Could be the way I am using it, of course.
There is also the LaunchEvents key, which might work for triggering on volume mount, but the docs for it are very sparse, so I didn't try.
Does this still work if you don’t have a /nix/store volume? Say you’re still on macOS X?
This suggests that the mount command run by org.nixos.darwin-store perhaps exited prior to the actual filesystem being available. That seems rather surprising.
Is there anything specific to suggest that is more likely than the PathState trigger having some latency? Do we know if the mechanism is just polling? The manpage does suggest it's both race-prone and lossy. I'm not certain what lossy means here, but my first guess would be that it might miss filesystem conditions that don't persist for longer than some polling interval?
> Is there anything specific to suggest that is more likely than the PathState trigger having some latency?
Perhaps I misunderstood, but it looks like in your second log it runs diskutil, that exits, it unloads the job, and then spawns nix-daemon. And in the first log the nix-daemon spawn fails because /nix/var/nix/profiles/default/bin/nix-daemon isn't accessible yet. Latency wouldn't cause the path to become inaccessible. So my impression was that diskutil must have exited prior to the filesystem actually being accessible and therefore launchd tried to launch nix-daemon too soon. Though that doesn't answer the question of why the nix-daemon job would have launched at all given that PathState means it shouldn't launch until the path is accessible.
Though looking at the second log now, there's a delay of several seconds in between diskutil exiting and nix-daemon being launched. So it's clearly waiting for something. PathState having latency would explain that delay (which could also just be launchd prioritizing other work prior to responding to PathState), but doesn't explain why the first log failed.
Do you have a log of a failure that includes the org.nixos.darwin-store lines?
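For anyone comparing launchd logs for this, a small filter can make the ordering of events easier to see. This is only a sketch, not part of the PR: the function name summarize_launchd_log is made up, and it assumes each log line has the layout date, time, `(system/<label> [pid])`, `<Level>:`, message.

```sh
#!/bin/sh
# Hypothetical helper (not from the PR): reduce a launchd log on stdin to the
# spawn, exit, and missing-executable lines for the org.nixos.* jobs.
# Output format: "<time> <job label> <message>".
summarize_launchd_log() {
    grep 'org\.nixos\.' |
        grep -E 'Successfully spawned|exited due to|Could not find and/or execute' |
        sed -E 's/^[0-9-]+ ([0-9:.]+) \(system\/([^ )]+)[^)]*\) <[A-Za-z]+>: /\1 \2 /'
}
```

Usage would be something like `summarize_launchd_log < /var/log/com.apple.xpc.launchd/launchd.log`, which should make it clearer whether nix-daemon was spawned before or after diskutil exited.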
@lilyball Sure. I need a little time to go set up a VM for myself.
On commit 5d959b33c5a75f3053280a8a34c711f964437766 without FileVault enabled:
sh <(curl -L https://jsoo1-nix-install-tests.cachix.org/serve/v89k06b82dkhalcdkdhnfbmrfr6fp1w9/install) --tarball-url-prefix https://jsoo1-nix-install-tests.cachix.org/serve
The first install goes well:
% grep nixos /var/log/com.apple.xpc.launchd/launchd.log
2021-12-03 14:43:25.520241 (system/org.nixos.darwin-store) <Notice>: internal event: WILL_SPAWN, code = 0
2021-12-03 14:43:25.520245 (system/org.nixos.darwin-store) <Notice>: service state: spawn scheduled
2021-12-03 14:43:25.520246 (system/org.nixos.darwin-store) <Notice>: service state: spawning
2021-12-03 14:43:25.520307 (system/org.nixos.darwin-store) <Notice>: launching: speculative
2021-12-03 14:43:25.521383 (system/org.nixos.darwin-store [740]) <Notice>: xpcproxy spawned with pid 740
2021-12-03 14:43:25.521399 (system/org.nixos.darwin-store [740]) <Notice>: internal event: SPAWNED, code = 0
2021-12-03 14:43:25.521401 (system/org.nixos.darwin-store [740]) <Notice>: service state: xpcproxy
2021-12-03 14:43:25.521468 (system) <Notice>: Bootstrap by launchctl[739] for /Library/LaunchDaemons/org.nixos.darwin-store.plist succeeded (0: )
2021-12-03 14:43:25.521506 (system/org.nixos.darwin-store [740]) <Notice>: internal event: SOURCE_ATTACH, code = 0
2021-12-03 14:43:25.526997 (system/org.nixos.darwin-store [740]) <Notice>: service state: running
2021-12-03 14:43:25.527000 (system/org.nixos.darwin-store [740]) <Notice>: internal event: INIT, code = 0
2021-12-03 14:43:25.527005 (system/org.nixos.darwin-store [740]) <Notice>: Successfully spawned diskutil[740] because speculative
2021-12-03 14:43:25.591782 (system/org.nixos.darwin-store [740]) <Notice>: signaled service: Terminated: 15
2021-12-03 14:43:25.591792 (system/org.nixos.darwin-store [740]) <Notice>: service state: SIGTERMed
2021-12-03 14:43:25.591794 (system/org.nixos.darwin-store [740]) <Notice>: scheduling cleanup in 5 sec after sending Terminated: 15
2021-12-03 14:43:25.592549 (system/org.nixos.darwin-store [740]) <Notice>: service exited: dirty = 0, supported pressured-exit = 0
2021-12-03 14:43:25.592551 (system/org.nixos.darwin-store [740]) <Notice>: exited due to SIGTERM | sent by launchd[1]
2021-12-03 14:43:25.592553 (system/org.nixos.darwin-store [740]) <Notice>: service state: exited
2021-12-03 14:43:25.592556 (system/org.nixos.darwin-store [740]) <Notice>: internal event: EXITED, code = 0
2021-12-03 14:43:25.592558 (system) <Notice>: service inactive: org.nixos.darwin-store
2021-12-03 14:43:25.592560 (system/org.nixos.darwin-store [740]) <Notice>: service state: not running
2021-12-03 14:43:25.592582 (system/org.nixos.darwin-store) <Notice>: Service only ran for 0 seconds. Pushing respawn out by 10 seconds.
2021-12-03 14:43:25.592586 (system/org.nixos.darwin-store) <Notice>: internal event: WILL_SPAWN, code = 0
2021-12-03 14:43:25.592587 (system/org.nixos.darwin-store) <Notice>: service state: spawn scheduled
2021-12-03 14:43:25.592589 (system/org.nixos.darwin-store) <Notice>: service throttled by 10 seconds
2021-12-03 14:43:35.598100 (system/org.nixos.darwin-store) <Notice>: service state: spawning
2021-12-03 14:43:35.598154 (system/org.nixos.darwin-store) <Notice>: launching: non-ipc demand
2021-12-03 14:43:35.598730 (system/org.nixos.darwin-store [746]) <Notice>: xpcproxy spawned with pid 746
2021-12-03 14:43:35.598746 (system/org.nixos.darwin-store [746]) <Notice>: internal event: SPAWNED, code = 0
2021-12-03 14:43:35.598748 (system/org.nixos.darwin-store [746]) <Notice>: service state: xpcproxy
2021-12-03 14:43:35.598750 (system/org.nixos.darwin-store [746]) <Notice>: deferred event: domain spawn response: 0
2021-12-03 14:43:35.598754 (system/org.nixos.darwin-store [746]) <Notice>: internal event: SOURCE_ATTACH, code = 0
2021-12-03 14:43:35.601500 (system/org.nixos.darwin-store [746]) <Notice>: service state: running
2021-12-03 14:43:35.601508 (system/org.nixos.darwin-store [746]) <Notice>: internal event: INIT, code = 0
2021-12-03 14:43:35.601513 (system/org.nixos.darwin-store [746]) <Notice>: Successfully spawned diskutil[746] because non-ipc demand
2021-12-03 14:43:35.647161 (system/org.nixos.darwin-store [746]) <Notice>: job state = running
2021-12-03 14:43:35.873600 (system/org.nixos.darwin-store [746]) <Notice>: service exited: dirty = 0, supported pressured-exit = 0
2021-12-03 14:43:35.873609 (system/org.nixos.darwin-store [746]) <Notice>: exited due to exit(0)
2021-12-03 14:43:35.873612 (system/org.nixos.darwin-store [746]) <Notice>: service state: exited
2021-12-03 14:43:35.873614 (system/org.nixos.darwin-store [746]) <Notice>: internal event: EXITED, code = 0
2021-12-03 14:43:35.873616 (system/org.nixos.darwin-store [746]) <Notice>: job state = exited
2021-12-03 14:43:35.873630 (system) <Notice>: service inactive: org.nixos.darwin-store
2021-12-03 14:43:35.873632 (system/org.nixos.darwin-store [746]) <Notice>: service state: not running
2021-12-03 14:43:35.874726 (system/org.nixos.darwin-store) <Notice>: job is not monitored, can't poll
2021-12-03 14:44:53.708894 (system/org.nixos.nix-daemon) <Notice>: internal event: WILL_SPAWN, code = 0
2021-12-03 14:44:53.708898 (system/org.nixos.nix-daemon) <Notice>: service state: spawn scheduled
2021-12-03 14:44:53.708900 (system/org.nixos.nix-daemon) <Notice>: service state: spawning
2021-12-03 14:44:53.708965 (system/org.nixos.nix-daemon) <Notice>: launching: speculative
2021-12-03 14:44:53.710107 (system/org.nixos.nix-daemon [2512]) <Notice>: xpcproxy spawned with pid 2512
2021-12-03 14:44:53.710124 (system/org.nixos.nix-daemon [2512]) <Notice>: internal event: SPAWNED, code = 0
2021-12-03 14:44:53.710127 (system/org.nixos.nix-daemon [2512]) <Notice>: service state: xpcproxy
2021-12-03 14:44:53.710189 (system) <Notice>: Bootstrap by launchctl[2511] for /Library/LaunchDaemons/org.nixos.nix-daemon.plist succeeded (0: )
2021-12-03 14:44:53.710249 (system/org.nixos.nix-daemon [2512]) <Notice>: internal event: SOURCE_ATTACH, code = 0
2021-12-03 14:44:53.719456 (system/org.nixos.nix-daemon [2512]) <Notice>: service state: running
2021-12-03 14:44:53.719460 (system/org.nixos.nix-daemon [2512]) <Notice>: internal event: INIT, code = 0
2021-12-03 14:44:53.719465 (system/org.nixos.nix-daemon [2512]) <Notice>: Successfully spawned nix-daemon[2512] because speculative
2021-12-03 14:44:53.821718 (system/org.nixos.nix-daemon [2512]) <Notice>: signaled service: Terminated: 15
2021-12-03 14:44:53.821730 (system/org.nixos.nix-daemon [2512]) <Notice>: service state: SIGTERMed
2021-12-03 14:44:53.821732 (system/org.nixos.nix-daemon [2512]) <Notice>: scheduling cleanup in 5 sec after sending Terminated: 15
2021-12-03 14:44:53.821902 (system/org.nixos.nix-daemon [2512]) <Notice>: service exited: dirty = 0, supported pressured-exit = 0
2021-12-03 14:44:53.821904 (system/org.nixos.nix-daemon [2512]) <Notice>: exited due to SIGTERM | sent by launchd[1]
2021-12-03 14:44:53.821906 (system/org.nixos.nix-daemon [2512]) <Notice>: service state: exited
2021-12-03 14:44:53.821909 (system/org.nixos.nix-daemon [2512]) <Notice>: internal event: EXITED, code = 0
2021-12-03 14:44:53.821911 (system) <Notice>: service inactive: org.nixos.nix-daemon
2021-12-03 14:44:53.821921 (system/org.nixos.nix-daemon [2512]) <Notice>: service state: not running
2021-12-03 14:44:53.821940 (system/org.nixos.nix-daemon) <Notice>: Service only ran for 0 seconds. Pushing respawn out by 10 seconds.
2021-12-03 14:44:53.821944 (system/org.nixos.nix-daemon) <Notice>: internal event: WILL_SPAWN, code = 0
2021-12-03 14:44:53.821946 (system/org.nixos.nix-daemon) <Notice>: service state: spawn scheduled
2021-12-03 14:44:53.821948 (system/org.nixos.nix-daemon) <Notice>: service throttled by 10 seconds
2021-12-03 14:44:53.821964 (system/org.nixos.nix-daemon) <Notice>: launch already in progress
2021-12-03 14:45:03.827328 (system/org.nixos.nix-daemon) <Notice>: service state: spawning
2021-12-03 14:45:03.827362 (system/org.nixos.nix-daemon) <Notice>: launching: xpc event
2021-12-03 14:45:03.827998 (system/org.nixos.nix-daemon [2518]) <Notice>: xpcproxy spawned with pid 2518
2021-12-03 14:45:03.828014 (system/org.nixos.nix-daemon [2518]) <Notice>: internal event: SPAWNED, code = 0
2021-12-03 14:45:03.828016 (system/org.nixos.nix-daemon [2518]) <Notice>: service state: xpcproxy
2021-12-03 14:45:03.828018 (system/org.nixos.nix-daemon [2518]) <Notice>: deferred event: domain spawn response: 0
2021-12-03 14:45:03.828022 (system/org.nixos.nix-daemon [2518]) <Notice>: internal event: SOURCE_ATTACH, code = 0
2021-12-03 14:45:03.832118 (system/org.nixos.nix-daemon [2518]) <Notice>: service state: running
2021-12-03 14:45:03.832126 (system/org.nixos.nix-daemon [2518]) <Notice>: internal event: INIT, code = 0
2021-12-03 14:45:03.832130 (system/org.nixos.nix-daemon [2518]) <Notice>: Successfully spawned nix-daemon[2518] because xpc event
After a reboot, though:
% grep nixos /var/log/com.apple.xpc.launchd/launchd.log
2021-12-03 14:56:43.145942 (system) <Notice>: pending spawn, domain in on-demand-only mode: org.nixos.darwin-store
2021-12-03 14:56:43.147175 (system) <Notice>: pending spawn, domain in on-demand-only mode: org.nixos.nix-daemon
2021-12-03 14:56:43.167185 (system/org.nixos.darwin-store) <Notice>: internal event: WILL_SPAWN, code = 0
2021-12-03 14:56:43.167188 (system/org.nixos.darwin-store) <Notice>: service state: spawn scheduled
2021-12-03 14:56:43.167190 (system/org.nixos.darwin-store) <Notice>: service state: spawning
2021-12-03 14:56:43.167230 (system/org.nixos.darwin-store) <Notice>: launching: speculative
2021-12-03 14:56:43.168707 (system/org.nixos.darwin-store [70]) <Notice>: xpcproxy spawned with pid 70
2021-12-03 14:56:43.168719 (system/org.nixos.darwin-store [70]) <Notice>: internal event: SPAWNED, code = 0
2021-12-03 14:56:43.168721 (system/org.nixos.darwin-store [70]) <Notice>: service state: xpcproxy
2021-12-03 14:56:43.200394 (system/org.nixos.nix-daemon) <Notice>: internal event: WILL_SPAWN, code = 0
2021-12-03 14:56:43.200397 (system/org.nixos.nix-daemon) <Notice>: service state: spawn scheduled
2021-12-03 14:56:43.200399 (system/org.nixos.nix-daemon) <Notice>: service state: spawning
2021-12-03 14:56:43.200421 (system/org.nixos.nix-daemon) <Notice>: launching: speculative
2021-12-03 14:56:43.201162 (system/org.nixos.nix-daemon [99]) <Notice>: xpcproxy spawned with pid 99
2021-12-03 14:56:43.201173 (system/org.nixos.nix-daemon [99]) <Notice>: internal event: SPAWNED, code = 0
2021-12-03 14:56:43.201175 (system/org.nixos.nix-daemon [99]) <Notice>: service state: xpcproxy
2021-12-03 14:56:43.233166 (system/org.nixos.darwin-store [70]) <Notice>: internal event: SOURCE_ATTACH, code = 0
2021-12-03 14:56:43.233228 (system/org.nixos.nix-daemon [99]) <Notice>: internal event: SOURCE_ATTACH, code = 0
2021-12-03 14:56:43.495702 (system/org.nixos.nix-daemon [99]) <Warning>: Could not find and/or execute program specified by service: 2: No such file or directory: /nix/var/nix/profiles/default/bin/nix-daemon
2021-12-03 14:56:43.495706 (system/org.nixos.nix-daemon [99]) <Error>: Service could not initialize: posix_spawn(/nix/var/nix/profiles/default/bin/nix-daemon) not accessible error: 0x6f: Invalid or missing Program/ProgramArguments
2021-12-03 14:56:43.495709 (system/org.nixos.nix-daemon [99]) <Error>: initialization failure: 21A559: xpcproxy + 23780 [840][F29643C9-8E6C-3632-93A1-5214FFD1DC57]: 0x6f
2021-12-03 14:56:43.495711 (system/org.nixos.nix-daemon [99]) <Notice>: Service setup event to handle failure and will not launch until it fires.
2021-12-03 14:56:43.495713 (system/org.nixos.nix-daemon [99]) <Error>: Missing executable detected. Job: 'org.nixos.nix-daemon' Executable: '/nix/var/nix/profiles/default/bin/nix-daemon'
2021-12-03 14:56:43.495715 (system/org.nixos.nix-daemon [99]) <Notice>: internal event: INIT, code = 111
2021-12-03 14:56:43.502612 (system/org.nixos.nix-daemon [99]) <Notice>: trampoline exited with code: 78
2021-12-03 14:56:43.502617 (system/org.nixos.nix-daemon [99]) <Notice>: service exited: dirty = 0, supported pressured-exit = 0
2021-12-03 14:56:43.502619 (system/org.nixos.nix-daemon [99]) <Notice>: exited due to exit(78)
2021-12-03 14:56:43.502621 (system/org.nixos.nix-daemon [99]) <Notice>: already handled failed init, ignoring
2021-12-03 14:56:43.502623 (system/org.nixos.nix-daemon [99]) <Notice>: service state: exited
2021-12-03 14:56:43.502625 (system/org.nixos.nix-daemon [99]) <Notice>: internal event: EXITED, code = 0
2021-12-03 14:56:43.502627 (system) <Notice>: service inactive: org.nixos.nix-daemon
2021-12-03 14:56:43.502641 (system/org.nixos.nix-daemon [99]) <Notice>: service state: not running
2021-12-03 14:56:43.502643 (system/org.nixos.nix-daemon) <Notice>: internal event: WILL_SPAWN, code = 0
2021-12-03 14:56:43.502647 (system/org.nixos.nix-daemon) <Notice>: service state: spawn scheduled
2021-12-03 14:56:43.972199 (system/org.nixos.darwin-store [70]) <Notice>: service state: running
2021-12-03 14:56:43.972203 (system/org.nixos.darwin-store [70]) <Notice>: internal event: INIT, code = 0
2021-12-03 14:56:43.972206 (system/org.nixos.darwin-store [70]) <Notice>: Successfully spawned diskutil[70] because speculative
2021-12-03 14:56:52.950907 (system/org.nixos.darwin-store [70]) <Notice>: job state = running
2021-12-03 14:56:53.302459 (system/org.nixos.darwin-store [70]) <Notice>: service exited: dirty = 0, supported pressured-exit = 0
2021-12-03 14:56:53.302461 (system/org.nixos.darwin-store [70]) <Notice>: exited due to exit(0)
2021-12-03 14:56:53.302463 (system/org.nixos.darwin-store [70]) <Notice>: service state: exited
2021-12-03 14:56:53.302465 (system/org.nixos.darwin-store [70]) <Notice>: internal event: EXITED, code = 0
2021-12-03 14:56:53.302467 (system/org.nixos.darwin-store [70]) <Notice>: job state = exited
2021-12-03 14:56:53.302481 (system) <Notice>: service inactive: org.nixos.darwin-store
2021-12-03 14:56:53.302483 (system/org.nixos.darwin-store [70]) <Notice>: service state: not running
With no RunAtLoad:
2021-12-03 15:07:52.156447 (system) <Notice>: pending spawn, domain in on-demand-only mode: org.nixos.darwin-store
2021-12-03 15:07:52.177604 (system/org.nixos.darwin-store) <Notice>: internal event: WILL_SPAWN, code = 0
2021-12-03 15:07:52.177606 (system/org.nixos.darwin-store) <Notice>: service state: spawn scheduled
2021-12-03 15:07:52.177608 (system/org.nixos.darwin-store) <Notice>: service state: spawning
2021-12-03 15:07:52.177633 (system/org.nixos.darwin-store) <Notice>: launching: speculative
2021-12-03 15:07:52.178811 (system/org.nixos.darwin-store [70]) <Notice>: xpcproxy spawned with pid 70
2021-12-03 15:07:52.178823 (system/org.nixos.darwin-store [70]) <Notice>: internal event: SPAWNED, code = 0
2021-12-03 15:07:52.178825 (system/org.nixos.darwin-store [70]) <Notice>: service state: xpcproxy
2021-12-03 15:07:52.241388 (system/org.nixos.darwin-store [70]) <Notice>: internal event: SOURCE_ATTACH, code = 0
2021-12-03 15:07:52.605618 (system/org.nixos.darwin-store [70]) <Notice>: service state: running
2021-12-03 15:07:52.605639 (system/org.nixos.darwin-store [70]) <Notice>: internal event: INIT, code = 0
2021-12-03 15:07:52.605643 (system/org.nixos.darwin-store [70]) <Notice>: Successfully spawned diskutil[70] because speculative
2021-12-03 15:08:01.843502 (system/org.nixos.darwin-store [70]) <Notice>: job state = running
2021-12-03 15:08:01.988518 (system/org.nixos.nix-daemon) <Notice>: internal event: WILL_SPAWN, code = 0
2021-12-03 15:08:01.988533 (system/org.nixos.nix-daemon) <Notice>: service state: spawn scheduled
2021-12-03 15:08:01.988535 (system/org.nixos.nix-daemon) <Notice>: service state: spawning
2021-12-03 15:08:01.988778 (system/org.nixos.nix-daemon) <Notice>: launching: xpc event
2021-12-03 15:08:01.990454 (system/org.nixos.nix-daemon [233]) <Notice>: xpcproxy spawned with pid 233
2021-12-03 15:08:01.990467 (system/org.nixos.nix-daemon [233]) <Notice>: internal event: SPAWNED, code = 0
2021-12-03 15:08:01.990470 (system/org.nixos.nix-daemon [233]) <Notice>: service state: xpcproxy
2021-12-03 15:08:01.990514 (system/org.nixos.nix-daemon [233]) <Notice>: internal event: SOURCE_ATTACH, code = 0
2021-12-03 15:08:02.010866 (system/org.nixos.nix-daemon [233]) <Notice>: service state: running
2021-12-03 15:08:02.010892 (system/org.nixos.nix-daemon [233]) <Notice>: internal event: INIT, code = 0
2021-12-03 15:08:02.010896 (system/org.nixos.nix-daemon [233]) <Notice>: Successfully spawned nix-daemon[233] because xpc event
2021-12-03 15:08:02.057506 (system/org.nixos.darwin-store [70]) <Notice>: service exited: dirty = 0, supported pressured-exit = 0
2021-12-03 15:08:02.057508 (system/org.nixos.darwin-store [70]) <Notice>: exited due to exit(0)
2021-12-03 15:08:02.057510 (system/org.nixos.darwin-store [70]) <Notice>: service state: exited
2021-12-03 15:08:02.057512 (system/org.nixos.darwin-store [70]) <Notice>: internal event: EXITED, code = 0
2021-12-03 15:08:02.057514 (system/org.nixos.darwin-store [70]) <Notice>: job state = exited
2021-12-03 15:08:02.057526 (system) <Notice>: service inactive: org.nixos.darwin-store
2021-12-03 15:08:02.057528 (system/org.nixos.darwin-store [70]) <Notice>: service state: not running
I thought I had tested that configuration and experienced a failure, but I may not have. Seems like PathState for the nix-daemon may be all that is required.
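A PathState-only configuration for the daemon might look roughly like the following. This is a sketch under assumptions, not the plist from this PR: the helper name write_nix_daemon_plist is hypothetical, the watched path is simply the executable path from the failing log, and any other keys the real plist carries are omitted.

```sh
#!/bin/sh
# Hypothetical helper (not from the PR): write a minimal nix-daemon
# LaunchDaemon plist that relies on KeepAlive/PathState alone.
# $1 is the output file. The watched path below is an assumption.
write_nix_daemon_plist() {
    cat > "$1" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key>
  <string>org.nixos.nix-daemon</string>
  <key>ProgramArguments</key>
  <array>
    <string>/nix/var/nix/profiles/default/bin/nix-daemon</string>
  </array>
  <key>KeepAlive</key>
  <dict>
    <!-- Only run (and keep running) while this path exists. -->
    <key>PathState</key>
    <dict>
      <key>/nix/var/nix/profiles/default/bin/nix-daemon</key>
      <true/>
    </dict>
  </dict>
</dict>
</plist>
EOF
}
```

On macOS the result can be syntax-checked with `plutil -lint <file>` before loading it. Whether PathState alone is race-free across reboots is exactly what still needs testing.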
I pushed the change reverting OtherJobEnabled.
Oh nice! Maybe the installer I was using didn't have this step, though GlobalPermissionsEnable was true for the volume it had created.
I imagine this will be one of those that needs a bit of testing. If you're willing/able to set up a cachix cache with the right name (see #4577 (comment)), successful CI runs on your branch will create an installer others can try out.
Mentioning a few people who might have thoughts: @andersk @callahad @matthewbauer @LnL7 @lilyball
I agree. I will try. I am not a cachix user currently, so it might take a little while for me to set up...
It's possible it didn't, or only did in some circumstances. I have not actually hit that issue myself, so my perspective is all secondhand through reading reports and trying to help a few people troubleshoot it. :)
@abathur I set up a cachix cache jsoo1-nix-install-tests and put a CACHIX_AUTH_TOKEN secret in my fork's secrets. Is that all that I needed to do?
I think you may also need to enable workflows in your fork. I don't really recall, but maybe there's a link or button to do so at https://github.com/jsoo1/nix/actions?
Ok, done!
bf56f90 to bfbc0c6
I am running the action on my fork so a cache can be available. Can anyone say what is going on here? https://github.com/jsoo1/nix/runs/4384590357 (cachix push failed?)
Try re-running it. It may take a few rolls... I should've mentioned earlier that it might fail. It looks like the failure is because the previous step timed out. There's some sort of flaky execution bug on darwin that can cause some hangs and/or aborts on an EOF error.
It is now LaunchOnlyOnce, so it is automatically unloaded after running. It should also be forcefully loaded by launchctl bootstrap...
scripts/create-darwin-volume.sh
<key>KeepAlive</key>
<dict>
  <key>PathState</key>
  <dict>
    <key>$NIX_ROOT/store</key>
    <false/>
  </dict>
</dict>
This part concerns me. What does it mean to try and "keep alive" the mount command? It runs, and then exits. And since the job is now marked LaunchOnlyOnce, launchd won't relaunch it. My best guess as to what this does is it says "don't run the mounter if the store is already mounted", but I'm not sure what the goal there is. If the problem is that trying to mount it when it's already mounted fails, we should update the mount command instead to be idempotent.
If we didn't mark this as LaunchOnlyOnce then I could see this as attempting to remount automatically if the nix store unmounts, though I suspect that would get rather annoying. I am a bit concerned that this new job configuration means I can't use launchctl kickstart to request the volume be remounted. I assume the use of LaunchOnlyOnce is meant to avoid having it immediately remount the volume if the user unmounts it (or to try and mount in a loop if the mount command fails).
I am pretty confused about that, also. It seems like PathState should be more of a run condition than a KeepAlive condition, don't you think? Using LaunchOnlyOnce actually unloads the service when it is done, which is the hacky way that the nix-daemon configuration works with OtherJobEnabled. The docs are quite clear that OtherJobEnabled is not actually about enabled jobs, but about loaded ones. The idea is that the nix-daemon starts after the darwin-store is unloaded.
On the plus side, that should mean you can run launchctl bootstrap system /Library/LaunchDaemons/org.nixos.darwin-store.plist to your heart's content to retry mounting the store volume.
(Edit) As to a run condition vs a keep alive condition, WatchPaths exists but for some reason did not do what I hoped. Perhaps that avenue could be explored a bit more.
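The retry mentioned above could be wrapped like this. A sketch only: retry_store_mount and the DRY_RUN switch are hypothetical conveniences, and it assumes the job really is unloaded after it runs, which is the behaviour observed here rather than something the launchd docs promise.

```sh
#!/bin/sh
# Path of the store-mounting job, as used earlier in this thread.
STORE_PLIST=/Library/LaunchDaemons/org.nixos.darwin-store.plist

# Hypothetical helper (not from the PR): re-bootstrap the store-mounting job
# to retry the mount. With DRY_RUN=1 it only prints the command it would run.
retry_store_mount() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "launchctl bootstrap system $STORE_PLIST"
        return 0
    fi
    sudo launchctl bootstrap system "$STORE_PLIST"
}
```

If the job turns out to still be loaded, the bootstrap is expected to fail rather than remount, so this is only useful as long as the unload-after-run behaviour holds.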
I missed the fact that OtherJobEnabled was set to false. I am curious if launchd guarantees that it loads all of the LaunchDaemons before evaluating any of the OtherJobEnabled conditions, because if not then it could plausibly load the daemon first, and decide it should be launched before loading the store mounter.
Also the lack of documentation on LaunchOnlyOnce about it actually unloading the job makes me nervous about this approach as well.
I agree. I would really prefer just PathState. If anyone else out there can help test, it would help a lot.
Another concern here will be that, regardless of the other ergonomics of the method, the volume doesn't just have to be mounted before the daemon starts--it also needs to be mounted before macOS tries to load or run anything that might be stored on it (files open in a restoring editor app, the shell for restoring terminal tabs/windows).
This isn't a problem for unencrypted systems, but there is a race-condition when FileVault is enabled. Any mechanism change will need to get tested against this case. We had a fairly straightforward test before, which was to enable FileVault, install Nix, open a file from the store volume in TextEdit, restart, and see if the document reopens.
Ah, yes. Makes sense. I will go try out the encrypted store branch and see what happens.
From what I recall, we never got that "it must mount before GUI" to actually be foolproof. The problem being that fstab wouldn't mount the volume before the GUI, and the LaunchDaemon that did this couldn't actually block the GUI, it would just cause the mount attempt to happen sooner and hopefully win. My recollection is that it mostly worked but sometimes failed, and that's why I actually still use an unencrypted Nix volume on my work laptop (my newer M1 laptop is using FileVault, but I've only restarted a handful of times and without a Nix-installed app running so there hasn't been much chance for the race to fail there).
Or is there something I'm missing about the current setup that actually makes it work?
Personally, it seems to me that we should be able to use autofs to have it mount automatically when /nix is accessed. There's an old technical paper on autofs that indicates that /etc/fstab is actually used by autofs so I'm not sure why the mountpoints defined there aren't mounted automatically upon access. If it did mount automatically upon access then we wouldn't expect to have an issue, because we could just attempt to launch nix-daemon and it would mount the volume for us. So the question is, why doesn't this work? And is there anything we haven't considered that would actually make it have the "mount automatically on access, blocking the operation until the mount completes" behavior? Because that would solve everything.
> This isn't a problem for unencrypted systems, but there is a race-condition when FileVault is enabled. Any mechanism change will need to get tested against this case. We had a fairly straightforward test before, which was to enable FileVault, install Nix, open a file from the store volume in TextEdit, restart, and see if the document reopens.
Ok, I tested this in a fresh VM: enabled FileVault, opened TextEdit on a file in the store volume, restarted (with "reopen windows when logging back in" checked), and TextEdit was still there with the same contents as before the reboot.
> From what I recall, we never got that "it must mount before GUI" to actually be foolproof. The problem being that fstab wouldn't mount the volume before the GUI, and the LaunchDaemon that did this couldn't actually block the GUI, it would just cause the mount attempt to happen sooner and hopefully win.
It's broadly accurate that we're depending on winning the race.
> My recollection is that it mostly worked but sometimes failed,
If anyone has receipts it'd be great to have them posted publicly, especially if it's reproducible. I had a reliable protocol for causing race-condition failures with the fstab-only mount, and when testing the daemon-mounted volume I was not once able to induce the same failure. I do not recall anyone telling me that they observed such a race-condition failure after using the test (and now release) installers. (But I'll leave room for eating crow since I'm also somewhat forgetful...)
Personally, it seems to me that we should be able to use autofs ... why doesn't this work? And is there anything we haven't considered that would actually make it have the "mount automatically on access, blocking the operation until the mount completes" behavior? Because that would solve everything.
I generally agree that this path sounded promising, but we flogged this pretty extensively back around #4181 (comment) through the end of that first PR. I've already expended days of scarce hobby time I can't get back banging my head against automount/autofs with nothing to show for it. As before, I'd be happy to have someone prove me wrong there--a simpler system-sanctioned solution would of course be better.
I should have a VM with recovery mode available soon. I want to use that to look into automount. It definitely seems like the best way, but autofs seems to have had some breaking changes over the last few macOS releases. I haven't been using Darwin or NixOS in that time, though. I'll keep you posted.
If the goal is just "stop using
All that said, the docs on |
I think the other mystery I was hoping to solve here was that the nix volume wasn't mounted on reboot for me, either. The other goal is definitely to avoid giving /bin/sh full disk access.
Both of those are sounding pretty reasonable. I was even thinking that if a full-fledged .app could be provided, code signing and all that might let us actually get capabilities we want but it seems maybe more than I want to bite off.
I totally agree here. My actions finally completed, so cachix cache at |
I'm not certain it's the "best" way to get it, but here's where I get the installer URL from https://github.com/jsoo1/nix/runs/4388555488?check_suite_focus=true#step:4:3 You'll end up with something like |
I don't suppose there's any We could also have the launch event be a Another thought I had was if we could make the LaunchDaemon for nix-daemon use |
That is a great question. I started looking around and came across this technologeeks.com/docs/launchd.pdf which mentions an incomplete list of keys in
Also an interesting thought. I think the nix-darwin daemon configuration has a socket listener configuration already (which seems to work for me?). |
There is no |
Oh I see! What do you think the purpose of this is, then? https://github.com/LnL7/nix-darwin/blob/44da835ac40dab5fd231298b59d83487382d2fab/modules/services/nix-daemon.nix#L60 |
This seems to be all that is necessary to run the daemon when /nix is mounted.
* Keep RunAtLoad=false so that the daemon executable is not found when launchd loads the service.
* Keep RunAtLoad=true in darwin-store so that it always runs.
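For readers following along, here is a rough sketch of what the two settings in that commit message could look like as plist fragments. The job labels `org.nixos.nix-daemon` and `org.nixos.darwin-store` are assumptions, and pairing `RunAtLoad=false` with `KeepAlive`/`OtherJobEnabled` is taken from the diff fragment quoted earlier in the review. The commit itself may be wired differently.

```xml
<!-- Fragment for the nix-daemon job (label assumed: org.nixos.nix-daemon) -->
<key>RunAtLoad</key>
<false/>
<!-- Assumption: keep the daemon alive only while the store-mounting job is enabled -->
<key>KeepAlive</key>
<dict>
  <key>OtherJobEnabled</key>
  <dict>
    <key>org.nixos.darwin-store</key>
    <true/>
  </dict>
</dict>

<!-- Fragment for the store-mounting job (label assumed: org.nixos.darwin-store) -->
<key>RunAtLoad</key>
<true/>
```

As I read `man launchd.plist`, `OtherJobEnabled` tracks whether the other job is enabled, not whether its mount has finished, so this narrows the window rather than strictly ordering the two jobs.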
@jsoo1 My best guess? It's probably broken. I also don't know what launchd's behavior is in the event that its socket gets replaced. I don't know if launchd ever terminates socket-launched jobs automatically (can launchd tell whether anyone is still talking to the process? If it can, then it could plausibly ask the job to exit after idling long enough), though I suspect it defers to the job to determine when to exit due to being idle. nix-daemon isn't going to idle-exit so that's probably fine, but if it does ever terminate, I have no idea if launchd will reestablish the original socket in order to relaunch it. I really do think nix-daemon should support this setup on macOS (and it should default to this configuration in the installed LaunchDaemon, and then nix-darwin should be updated accordingly), but it just doesn't look like it will work correctly today. |
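To make the socket-activation idea concrete, a launchd job can declare a listening socket in its plist roughly as sketched below. The entry name `Listeners` is an arbitrary, hypothetical choice, and the socket path is assumed to be the usual daemon socket location. This is not a tested configuration.

```xml
<!-- Fragment: launchd creates the socket and starts the job on the first connection -->
<key>Sockets</key>
<dict>
  <!-- "Listeners" is a hypothetical name chosen for this sketch -->
  <key>Listeners</key>
  <dict>
    <!-- Assumed path of the nix-daemon Unix domain socket -->
    <key>SockPathName</key>
    <string>/nix/var/nix/daemon-socket/socket</string>
  </dict>
</dict>
```

For this to work, the job has to adopt the descriptor that launchd already opened (via `launch_activate_socket` on current macOS) instead of binding the path itself. That is the piece the comment above suggests nix-daemon does not do today.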
I briefly looked into an automountd solution but that is where the trail runs cold for me. I cannot for the life of me figure out how to debug autofs not mounting the store. Recovery mode is (I think) too early to tell, and the logs just seem to not exist in the syslogs. I think it is clearly the best solution but I just cannot figure it out. |
I just filed #5739 about launchd socket activation. |
I marked this as stale due to inactivity. → More info |
Is this PR still relevant/useful? |
It's been a long time, but |
Also curious on the status of this during triaging. |
Re-skimming this and seeing that @lilyball bumped #5739 4 minutes after @jsoo1 asked if this was still useful makes me think @lilyball saw that as a good way forward. Re: That makes me think that there hasn't been (yet) a workable option here for ensuring the daemon waits for the volume, thus the interest in exploring the socket option? |
To be fair to Also I'm not sure how socket activation works in launchd but wouldn't it be prone to the same kind of race condition we have already? |
This allows the spawning program to be the nix-daemon instead of /bin/sh. That means that the Full Disk Access permission can be granted only to the nix-daemon.
This should significantly decrease the security risk of the Full Disk Access as mentioned in #4640, though it is not a complete solution.
It also solves the problem that the /nix volume may not be mounted upon reboot, and keeps the darwin-store service from restarting using the launchd analog of oneshot.
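As an illustration of the first point, the change in spawned program amounts to something like the following `ProgramArguments` contrast. The "before" form is reconstructed from the thread's mentions of /bin/sh and wait4path, and the daemon path is an assumption, so treat both fragments as a sketch rather than the exact diff.

```xml
<!-- Before (as I understand the existing plist): /bin/sh is the spawned program -->
<key>ProgramArguments</key>
<array>
  <string>/bin/sh</string>
  <string>-c</string>
  <string>/bin/wait4path /nix/var/nix/profiles/default/bin/nix-daemon &amp;&amp; exec /nix/var/nix/profiles/default/bin/nix-daemon</string>
</array>

<!-- After: launchd spawns nix-daemon directly, so Full Disk Access can be scoped to it -->
<key>ProgramArguments</key>
<array>
  <string>/nix/var/nix/profiles/default/bin/nix-daemon</string>
</array>
```

Dropping the shell wrapper also drops the wait4path step, which is why the job needs some other way to hold off until /nix is mounted.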