Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iso: only include the host's dependencies #197

Closed
wants to merge 10 commits into from

Conversation

Pacman99
Copy link
Member

fixes #194

I've only tested that the iso build evaluates. I still have to test that it builds. And this change could break the current iso build, so it definitely has to be tested thoroughly.

I came up with a pretty simple solution though, so I thought I may as well share it. Instead of importing all modules, I just added the hosts toplevel into the iso's storeContents.

cc @blaggacao

@Pacman99 Pacman99 marked this pull request as draft March 17, 2021 17:49
@nrdxp
Copy link
Collaborator

nrdxp commented Mar 17, 2021

This looks great!

I was wanting to devise a way to actually test the iso as part of CI, but building the actual iso image itself is quite expensive both in space and time. Not sure if there would be any other way to reliably test it on each change though without having to build and squash the image.

If we can come up with something, perhaps we can include it in this PR.

@blaggacao
Copy link
Contributor

This is really elegant, but at least we would need the (basic) user profiles so that remote access still works. Maybe re-importing the base suite suffices. Basically replicate how it is done here. That would solve this problem and not break the docs (though complementing them is necessary).

@Pacman99
Copy link
Member Author

I was wanting to devise a way to actually test the iso as part of CI, but building the actual iso image itself is quite expensive both in space and time. Not sure if there would be any other way to reliably test it on each change though without having to build and squash the image.

I would like to see that too. But it does take a while to build the iso. Perhaps if we could just import the iso modules into the test node and inspect different parts of the machine once started up.

Maybe re-importing the base suite suffices. Basically replicate how it is done here. That would solve this problem and not break the docs (though complementing them is necessary).

I didn't even think of that! Yeah I knew this would break things, thanks for figuring what exactly it broke :)
So I tried adding the base suite, but then that runs into the problem of home-manager missing. I think getting that to work will require a rework of hosts/default.nix and devosSystem. We might want to consider not passing ALL the modules as one list and instead seperating them. Or passing them as an attrset, so different parts of devosSystem can pick apart the necessary modules. And then a final attrValues of the attrset could be used in the final host.

@Pacman99
Copy link
Member Author

Pacman99 commented Mar 17, 2021

With #198, I can selectively import the base suite and flakeModules - which contains home-manager. Since those are the only two that declare new options(I think!), that should get any user's base suite to evaluate.

@blaggacao
Copy link
Contributor

I think core is valuable too for the unified "sysadmin experience".

@Pacman99
Copy link
Member Author

Pacman99 commented Mar 18, 2021

Yeah I'm actually thinking now of just including everything but local and lib.

And then add a note in the doc that only the base suite is imported int the iso, so any service you want started up or package installed should be in the base suite or core profile. But all packages and dependencies will still be in the nix store, just not part of the environment.

@blaggacao
Copy link
Contributor

That is a good idea, the environments should be probably as close as possible in every aspect, just not starting more systemd services than declared in suites.base and profiles/core.

I have a hard time following up on the home-mamager problwm you reported. What is it about?

@Pacman99
Copy link
Member Author

That is a good idea, the environments should be probably as close as possible in every aspect, just not starting more systemd services than declared in suites.base and profiles/core.

I add a change to forward environment.systemPackages. But either way every other dependency is in the store, they just have to be installed to the profile.

I have a hard time following up on the home-mamager problwm you reported. What is it about?

So the base suite imports the users which set the home-manager options. But without importing extern, those options don't exist, so the evaluation fails. And it indicates a general problem with trying to import parts of the host's config without the necessary modules. If a user had declared options in a custom module, then importing the base suite would then fail.

@blaggacao
Copy link
Contributor

blaggacao commented Mar 18, 2021

Ok, I understand. So we need to make sure to actually import all modules but only core, nixos-user and root-user profiles. Right?

Could we filter them out via suites.allUsers and suites.allProfiles? (that seems to have better fallback dynamics with user declared modules)

@blaggacao
Copy link
Contributor

blaggacao commented Mar 18, 2021

For a live installer we actually only need core + root. We can drop nixos user. We should actually drop nixos user, since deploy-rs (in light of serokell/deploy-rs#76) does require passwordless sudo and it's quite easy to set a password for nixos during install (you are prompted). That actually would be on first use so it doesn't matter. But still we don't need the nixos user on the live installer, do we?

@Pacman99
Copy link
Member Author

Filtering out users is a good idea, but does the root user have ssh access to the iso by default? So if I understand right, the user adds an ssh public key to the root user module which is imported to the iso build. And then you can now ssh into the system as the root user. But we don't want to include the other users as that would be unnecessary and will likely have a password.

So instead of importing the base suite, I'm now just importing suites.allUsers.root.

bors bot added a commit that referenced this pull request Mar 18, 2021
198: hosts/devosSystem: pass modules as attrset r=nrdxp a=Pacman99

This is a fairly simple change that only changes the lib api for devosSystem. But doesn't add any features by itself. Hosts now pass modules to devosSystem as an attrset. And devosSystem just grabs all modules in the set and passes it to nixosSystem. 

I plan to use this in #197 to selectively import modules. And I think it could help with nix-darwin - and other config systems - support, since not all profiles and modules are config system agnostic. This could be a workaround to add rudimentary support for other config systems by only importing the necessary modules.

Overall I think its a useful change and extends the abilities of `devosSystem`

Co-authored-by: Pacman99 <pachum99@gmail.com>
@nrdxp
Copy link
Collaborator

nrdxp commented Mar 18, 2021

I have occasion to test this out as it looks like I fudged my zfs ashift settings and need to recreate my pool. I'll report back how it goes.

@Pacman99
Copy link
Member Author

Updated to include core, global, flakeModules, and modOverrides. The last two aren't really necessary, but they will prevent an evaluation failure in the rare case someone uses home-manager or sets some other option expecting an overridden module

@Pacman99 Pacman99 marked this pull request as ready for review March 18, 2021 04:42
@blaggacao
Copy link
Contributor

but does the root user have ssh access to the iso by default?

No, but it is documented how to do that. Wording might be in need of alight tweaking after this discussion.

lib/devos/devosSystem.nix Outdated Show resolved Hide resolved
@blaggacao
Copy link
Contributor

blaggacao commented Mar 18, 2021

Updated to include core, global, flakeModules, and modOverrides.

I think this is actually necesarry (and all other modules, too) in order to be robust.

This is why I would rather invert the logic woth a selector:

  • everytjing
  • less all suites.allProfiles - except core
  • less all suites.allUsers - except root

Such a negative selector is much more robust than a positive one. Actually a positive one is not correct here.
(Because as per the problem, we know what we not want. We don't strictly know, nor need to know, what we want.)

modules.modOverrides
modules.flakeModules
modules.global
hostConfig.lib.specialArgs.suites.allUsers.root
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work as allUsers is actually a list not a set. We can just export users in suites, which we currently don't do, but assuming users haven't modified this may be a bad idea, as I actually deleted this user on my personal config and it caused a failure. Plus the default root user only sets an empty root password anyway, which I think the ISO config already does.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, at least for now, I'm just going to import the base suite. So that way all users are imported and it gives the ability for users to configure whats included in the iso.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the main goal here is to allow users to set their ssh public key.

modules.core
modules.modOverrides
modules.flakeModules
modules.global
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does future self know that it needs to add lines here if it modifies hosts/default.nix adding more modules?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like thats a good thing, that we have to evaluate the importance of modules added to hosts/default.nix before adding them to isoConfig.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future self can silently break things beyond the reach of the hosts/default.nix scope.

I think theres potential to break things either way. moving the discussion for positive vs. negative selector here.

Copy link
Contributor

@blaggacao blaggacao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the reason for why I'm more in favor of a negative selector ...

Future self can silently break things beyond the reach of the hosts/default.nix scope.

@Pacman99
Copy link
Member Author

@blaggacao I'm not entirely sure what you mean? Isn't the goal here to only import the necessary things as to prevent unecessary system startup while still copying the closure. And with that in mind I would think a positive selector is better. So in the future if more modules get added, we have to go through and evaluate if they should be in the isoConfig.

Could you perhaps share a nix expression as to what should be imported. I think might clear up confusion.

@nrdxp
Copy link
Collaborator

nrdxp commented Mar 18, 2021

If we really wanted to go deep, we could pass args to each module manually, and then, if the module happens to define a systemd service, we can filter it out of it's config session before including it. Might be more complex than it sounds though. I was experimenting with this already for #127

@blaggacao
Copy link
Contributor

blaggacao commented Mar 18, 2021

The goal can be reached in two ways, indeed.

Per #194 we want ro avoid systemd services to start up. If I recall the docs correctly, those type of things are only defined in profiles. That might be user profiles (in theory: user systemd serbices) or regular profiles.

No systemd serbice is ever avtivated by a module (maybe specified, etc. But not switched on).

But modules define options that a profile (which we need here) might use. We just don't know if and which one.

Hence, we need to include all modules, but no profile, except the ones we clearly document,so that users will know, that if they put a systemd service in one of those profiles, it will start as part of the live installer.

I suggest, we only document core and root. (And if there is no root, nothing bad happens, you just can't ssh into the live installer, where the only user you really need is root).


That way it becomes clear: for ssh to work, you need to enable openssh service in core amd put your public key in root.

That's a fair contract, users can (easily) conform to.

@blaggacao
Copy link
Contributor

So the logic would be: +-

modulesList - suites.allProfiles - suites.allUsers + core + root

@blaggacao
Copy link
Contributor

blaggacao commented Mar 18, 2021

If generally beeing able to ssh into root is unwanted, we can also choose nixos insstead. I think that might be a concern, indeed. We just would need to make sure that for deploy-rs this user has passwordless sudo. But maybe that can actually be considered an issue in upstream deploy-rs.

So let's rather use nixos and file an issue in deploy-rs.

@Pacman99
Copy link
Member Author

Pacman99 commented Mar 18, 2021

Okay I think I now clearly follow what your saying and I agree. I just pushed a commit that I believe accomplishes it. Its all modules except local and lib. Which is essentially modulesList - allProfiles. And as I said earlier, I'm just importing the entire base suite. I don't know how to filter out users, and also I don't see a point in doing so. User services are usually really light if any are even defined. And it could be nice to login to your user and get an environment your comfortable with.

Copy link
Contributor

@blaggacao blaggacao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make another try re-adding local. Maybe we can later think about tearing local into smaller pieces in search of a concise interface. This breaks image naming...

@@ -8,11 +8,16 @@ lib.nixosSystem (args // {
modpath = "nixos/modules";
cd = "installer/cd-dvd/installation-cd-minimal-new-kernel.nix";

hostConfig = (lib.nixosSystem (args // { modules = moduleList; })).config;

isoModules = (builtins.attrValues (builtins.removeAttrs modules [ "local" "lib" ]));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Result of my tests: removing local removes the correct hostname. The result is a badly named iso image:

result/iso/nixos-nixos-20.09.20210312.60b18a0-x86_64-linux.iso instead of

result/iso/nixos-POS-20.09.20210312.60b18a0-x86_64-linux.iso

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can pass hostName manually

@blaggacao
Copy link
Contributor

blaggacao commented Mar 18, 2021

I think we should revert to strictly solve #194 and leave the other stuff for another PR. It is too hard to isolate and understand cause-effects relationship with mixed concerns if every end-to-end test takes more than 2 hours.

and stuff happens (after 1.5h) 😜

8053063680 bytes (8.1 GB, 7.5 GiB) copied, 1753 s, 4.6 MB/s
dd: error writing '/dev/sda': No space left on device

@Pacman99
Copy link
Member Author

I think we should revert to strictly solve #194 and leave the other stuff for another PR.

Isn't that the entire point of all the changes in this PR? I'm confused as to what you would like to revert to. Either we import all modules or figure out a way to include the closure and import necessary modules. This PR is trying to do the latter.

It is too hard to isolate and understand cause-effects relationship with mixed concerns if every end-to-end test takes more than 2 hours.

I think its ok if we take some time to figure out this PR. Also another way to test could be to build the isoConfig's toplevel and inspect the files in there.

@blaggacao blaggacao mentioned this pull request Mar 18, 2021
5 tasks
@blaggacao
Copy link
Contributor

blaggacao commented Mar 18, 2021

I'm gonging to test now alternative implementation #202 which tries not to introduce side effects. It still includes 2bd3314 as we already started to go down that road with #198 although its not strictly necessary for solving #194

I have to admit I also did this because I have lost a bit track of all the things we discussed here, so I needed to rebuild it to understand what #194 really requires as a minimum implementation.

@Pacman99
Copy link
Member Author

Borrowed @blaggacao's idea of using disabledModules here. And I think this should avoid potential side effects.

];
environment.systemPackages = hostConfig.environment.systemPackages;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the rationale (use case) for this again?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well by including the hostConfig toplevel, all packages are now in the nix store. So I thought as convenience we could just forward to the environment.

Copy link
Contributor

@blaggacao blaggacao Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not sure if we can include the toplevel derivation, it takes a lot more time to produce the squashfs, now. Well, maybe we can leave it, but open an issue in search for a better solution (eg. w/o compression?).

So we might just leave this line, as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm currently attempting to build and I see what you mean. If this PR increases the time to build or the iso size by a lot, it might not be the best solution.

Perhaps we could just forward the systemPackages and not include the toplevel build. Or we could add a separate build that has the entire toplevel. So one for network install and one that includes everything making an offline install possible.

@blaggacao

This comment has been minimized.

@Pacman99
Copy link
Member Author

Pacman99 commented Mar 19, 2021

nix build .#nixosConfigurations.NixOS.config.system.build.iso works for me Not sure where the segfault comes from. But that seems like a nix issue.

flk iso NixOS also works

@@ -20,7 +25,12 @@ lib.nixosSystem (args // {
nix.registry = lib.mapAttrs (n: v: { flake = v; }) inputs;
isoImage.storeContents = [
self.devShell.${config.nixpkgs.system}
self.devShell.${config.nixpkgs.system}.drvPath
hostConfig.system.build.toplevel
Copy link
Contributor

@blaggacao blaggacao Mar 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is actually redundant: https://github.com/NixOS/nixpkgs/blob/a6155fc19e31fe0a49b75409d02a2d8b6bb97aa0/nixos/modules/installer/cd-dvd/iso-image.nix#L627-L638

and we need to remove the *.drvPath. Those are the size culprits. I thought you had already removed them ^^ 😸

They are equivalent to isoImage.includeSystemBuildDependencies behemoth

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh I see, fixed in latest commit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does includeSystemBuildDependencies mean? I would assume if you pass the toplevel, that would include all buildInputs for that store path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also remove the .toplevel, it's already included as in the linked default config is shown.

what does includeSystemBuildDependencies mean?

https://github.com/NixOS/nixpkgs/blob/master/nixos/modules/installer/cd-dvd/iso-image.nix#L482-L486

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you talking about this line: https://github.com/NixOS/nixpkgs/blob/a6155fc19e31fe0a49b75409d02a2d8b6bb97aa0/nixos/modules/installer/cd-dvd/iso-image.nix#L630?

That would get the toplevel of the isoConfig. Which won't include the profiles that have been disabled. And the goal is to include the profile's closure, just without including them in the config.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, just realized that my self and renamed to fullHostConfig and left a comment. That makes the difference clearer.

@blaggacao
Copy link
Contributor

Can you also merge Pacman99#3? So I can try again off of this branch here?

@blaggacao
Copy link
Contributor

@Pacman99 For various reasons, this has taken me a whole day already, and I still wasn't able to test this fully. Are you ok, if we turn over the page and continue on #202 ? It's just too hard for me to keep track of all those things while not being able to commit here... (sry)

@Pacman99
Copy link
Member Author

I'll leave this PR open as a draft. But yeah this was mainly meant to demonstrate an idea and I don't think that the idea is viable. I think we'll have to experiment more to get a better overall solution. But I'm not sure #202 is much better since its basically the same as this PR but with a few tweaks.

@Pacman99
Copy link
Member Author

Perfectly fine with continuing there though since you're the one doing most of the testing anyways itll be easier for you to update.

@Pacman99 Pacman99 marked this pull request as draft March 19, 2021 04:10
@Pacman99
Copy link
Member Author

I think the CI runner is still trying to evaluate this PR, but I don't think the CI runner is even available right now.

bors bot added a commit that referenced this pull request Mar 22, 2021
202: iso: avoid systemd service startup r=nrdxp a=blaggacao

fixes #194
alternative to #197

# Manual Tests


<details>
<summary>was unrelated</summary>

- [ ] `flk install NixOS --impure` correctly onto `/mnt` ❎ (looks like no profile is present)

Issue: #204
Upstream Issue: NixOS/nixpkgs#116938

</details>

- [x] acceptable build time / closure size ca 850MB (for a simple base OS) ✔️ 
- [x] local profile with `cage` service is disabled, that is: boots into terminal ✔️ 
- [x] success: air gapped / offline devshell enter ✔️ 
- [ ] failure: aire gapped target install: &larr; non blocking bonus item ❎ 
```console
$ flk install POS
warning: you don't have internet access; disabling some network-dependent features
building the flake in path:/iso/devos?narHash=sha265-...
warning: you don't have internet access; disabling some network-dependent features
error: unable to download 'https://api.github.com/repos/NixOS/nixpkgs/df8e3...': Couldn't resolve host name (6)
```

&rarr; detailed rationale in the commit messages

:heart: @Pacman99 for the excellent and detailed discussions in #197 and the may ideas, suggestions and code.

Co-authored-by: David Arnold <dar@xoe.solutions>
@Pacman99
Copy link
Member Author

Included in #202

@Pacman99 Pacman99 closed this Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

iso: systemd service startup
3 participants