[RFC] Start new systemd services explicitly? #23221

Closed
copumpkin opened this issue Feb 26, 2017 · 15 comments

Comments

@copumpkin
Member

copumpkin commented Feb 26, 2017

Currently, new services added by a new NixOS configuration only get started implicitly by switch-to-configuration when targets wanting or requiring them get started. For example, in #23121, httpd.service is wantedBy multi-user.target, so it only gets started when we call systemctl start multi-user.target.

The issue arises when we want to add new services during startup, as we do with the EC2 userdata reconfiguration machinery. We currently only do this for EC2, but in #22105 I suggest we should generalize it, as more and more hosting/cloud providers (GCE, Packet.net, DigitalOcean, etc.) have similar functionality. It's a nice piece of functionality, since it lets you configure machines declaratively without needing to speak to them directly (e.g., in EC2 autoscaling groups, secure setups where the machine configuring boxes doesn't necessarily have access to the network the boxes go into, etc.).

When adding units at startup, the target that wants a given service might not be active yet. In #23121, we don't start multi-user.target because it hasn't been activated yet, and we probably shouldn't start it that early in the boot process.

I'm not sure how best to solve this. We could make switch-to-configuration.pl explicitly enumerate the new services added in a new configuration and call systemctl start --no-block on them (as suggested in this blog post), or we could possibly get the pre-existing targets (which are still starting up) to somehow trigger new services that want them. I don't know how to tell systemd to do that though...

Or maybe it makes sense to call systemctl start --no-block default.target on every invocation of switch-to-configuration.pl?
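
To make the first option concrete, here is a minimal Python sketch of enumerating newly added services and building a non-blocking start command. It is only an illustration: the real logic lives in the Perl script switch-to-configuration.pl, and the function names, the unit lists, and the choice to filter on the ".service" suffix are all assumptions made for this example. The command is built but never executed.

```python
from typing import Iterable, List


def new_service_units(old_units: Iterable[str], new_units: Iterable[str]) -> List[str]:
    """Return service units present in the new configuration but not in the old one.

    Hypothetical helper: filtering on the ".service" suffix is an assumption.
    """
    added = set(new_units) - set(old_units)
    return sorted(u for u in added if u.endswith(".service"))


def start_command(units: List[str]) -> List[str]:
    """Build (but do not run) a non-blocking systemctl start invocation."""
    if not units:
        return []
    return ["systemctl", "start", "--no-block", *units]


# Example unit lists, invented for illustration.
old = ["sshd.service", "multi-user.target"]
new = ["sshd.service", "httpd.service", "multi-user.target"]
print(start_command(new_service_units(old, new)))
```

With --no-block, systemctl queues the start job and returns without waiting for it to finish, which is why it seems suitable for a script that itself runs during boot.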

cc @edolstra @wkennington @shlevy

@fpletz
Member

fpletz commented Feb 26, 2017

In my opinion it would be preferable if we could delegate that logic to systemd, so ideally systemctl start --no-block default.target would be used. Not sure if it just works in switch-to-configuration.pl or some hackery is needed to make it work.

@copumpkin
Member Author

Yeah, it seems like all instances of unitsToStart might be replaceable by that, right? unitsToRestart and unitsToReload would still follow the existing logic.

@copumpkin
Member Author

copumpkin commented Feb 26, 2017

Staring at this, I'm growing more convinced to do what @fpletz says and replace all the unitsToStart logic with systemctl start --no-block default.target. I think the behavior would be roughly the same (i.e., it won't start hibernate.target and friends, or anything with RefuseManualStart), but my scenario will work better and there will be less code to maintain.

The main downside I can see is that we lose the "starting the following units:" message. We'd retain "the following new units were started:" though, so it doesn't seem too bad.

Any objections?

Edit: I guess we'd also need to keep some unitsToStart logic to support this use case: https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/system/activation/switch-to-configuration.pl#L406-L408
Edit 2: I think I should leave the unitsToStart logic for services, but take it out for targets, and then leave that to default.target.
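
A rough Python sketch of what Edit 2 describes, under stated assumptions: targets are dropped from the explicit start list, the remaining units are still started directly, and default.target is then started without blocking so that systemd pulls in whatever the targets want. The function name and the sample unit list are hypothetical, and the actual script is written in Perl; this only plans the commands and does not run them.

```python
from typing import Iterable, List


def plan_start_commands(units_to_start: Iterable[str]) -> List[List[str]]:
    """Plan the systemctl invocations for a switch, without executing them.

    Assumption: anything whose name ends in ".target" is left to
    default.target; every other unit is still started explicitly.
    """
    explicit = sorted(u for u in units_to_start if not u.endswith(".target"))
    commands: List[List[str]] = []
    if explicit:
        commands.append(["systemctl", "start", "--", *explicit])
    # Let systemd resolve target dependencies itself.
    commands.append(["systemctl", "start", "--no-block", "default.target"])
    return commands


# Sample input, invented for illustration.
for cmd in plan_start_commands(["multi-user.target", "httpd.service"]):
    print(" ".join(cmd))
```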

@bennofs
Contributor

bennofs commented Feb 26, 2017

Should we really start default.target? Like, if you switched to a different target (say emergency.target), that would switch you back to default, right?

@copumpkin
Member Author

I guess we should start whichever target systemd wants to start by default. I think it chooses default.target by default, but it can be overridden on the kernel command line. We'd follow the same logic?
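
As a sketch of "following the same logic", the Python snippet below picks the unit requested with systemd.unit= on a kernel command line string and falls back to default.target otherwise. The function name is hypothetical, letting the last occurrence win is an assumption, and the sketch ignores the other ways a boot target can be selected (for example runlevel-style arguments), so treat it as an illustration rather than a description of what systemd does internally.

```python
def requested_boot_unit(cmdline: str, fallback: str = "default.target") -> str:
    """Return the unit named by systemd.unit= in cmdline, else the fallback.

    Assumption: if the option appears more than once, the last value wins.
    """
    prefix = "systemd.unit="
    unit = fallback
    for arg in cmdline.split():
        if arg.startswith(prefix) and len(arg) > len(prefix):
            unit = arg[len(prefix):]
    return unit


# Example command lines, invented for illustration.
print(requested_boot_unit("root=/dev/xvda1 systemd.unit=rescue.target quiet"))
print(requested_boot_unit("root=/dev/xvda1 quiet"))
```

On a running machine the command line could be read from /proc/cmdline. Note that this says nothing about a target switched to after boot with systemctl isolate.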

@bennofs
Contributor

bennofs commented Feb 26, 2017

@copumpkin well, but you can switch to a different target after boot, by doing things like systemctl isolate emergency.target. nixos-rebuild switch would then undo that, but perhaps that's actually what we want here (this is a question of what the intended semantics are in such a case).

@copumpkin
Member Author

Yeah, not really sure. I'm just going to push up the current code as a PR and get some feedback.

copumpkin added a commit to copumpkin/nixpkgs that referenced this issue Feb 26, 2017
…lves

Instead we use default.target, which mostly has the same behavior, but allows
us to start up new services during boot, as needed by amazon-init (and presumably
cloud-init if people used it much for NixOS). See here for more info:
NixOS#23221
@copumpkin
Member Author

@fpletz @bennofs see #23224

@edolstra
Member

Maybe I don't understand the problem correctly, but if the issue is that the reconfiguration service runs before multi-user.target has been reached, isn't the solution to order it after multi-user.target (or default.target)? We probably don't want to reconfigure the system while it's still booting...

BTW, it might be interesting to look into system-update.target (see systemd.special(7)), which can be used to boot into an alternative (i.e. much smaller) configuration to run an update process. For example, the EC2 initrd could create /system-update if it sees the appropriate instance user data, to trigger the reconfiguration script in a restricted environment (i.e. with only limited networking and no other services).

@copumpkin
Member Author

copumpkin commented Feb 27, 2017

@edolstra well, in part it feels like a "flash of unstyled content" issue like the bad ol' days of web programming: in an ideal world, I'd have an EC2 API that looked like RunInstances : NixOS-Config -> Running-Machine. In practice, my best approximation at that is userdata, and it's not too unreasonable of an approximation, but if we let the machine boot to multi-user.target before applying the configuration, you get a brief glimpse (that you can SSH into and is otherwise observable) of an incidental configuration that nobody asked for except whoever built the AMI you're using. It also increases total startup time, of course. Why don't we want to reconfigure the system while it's still booting?

It seems like calling daemon-reload in the middle of multi-user.target starting up should cause it to learn about new multi-user.target dependencies without much hand-holding, but that doesn't seem to be the case.

I'll take a look at system-update.target though. Sounds potentially interesting, but also a bigger rework of the current machinery.

@copumpkin
Member Author

I guess using system-update.target would allow us to configure the kernel in userdata as well, but at the cost of an extra reboot, which could get kind of slow...

@bennofs
Contributor

bennofs commented Feb 27, 2017

@copumpkin could you perhaps just boot to system-update.target and then switch, without reboot, to multi-user.target? That wouldn't exactly correspond to the spec, but should work.

@copumpkin
Member Author

copumpkin commented Feb 27, 2017

Yeah, perhaps. Although I'm a little concerned because all the stuff about system-update.target talks about offline updates, which isn't what we want. I guess I'd make amazon-init.service depend on network-online.target and assume that it'd do the right thing and configure the network properly as a result?

@copumpkin
Member Author

The folks in #systemd are steering me away from system-update.target and towards sticking my logic into initramfs. Any thoughts there?

@copumpkin
Member Author

Somewhat moot now given 6018cf4, but I'm still going to try to improve that process.
