-
-
Notifications
You must be signed in to change notification settings - Fork 14.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor nixos-install to separate out filesystem build logic #23026
Conversation
@copumpkin, thanks for your PR! By analyzing the history of the files in this pull request, we identified @edolstra, @obadz and @Shados to be potential reviewers. |
ln -s /run $mountPoint/var/run | ||
for f in /etc/resolv.conf /etc/hosts; do rm -f $mountPoint/$f; [ -f "$f" ] && cp -Lf $f $mountPoint/etc/; done | ||
for f in /etc/passwd /etc/group; do touch $mountPoint/$f; [ -f "$f" ] && mount --rbind -o ro $f $mountPoint/$f; done | ||
set -x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, I'd obviously take this out later 😄
|
||
set -e | ||
|
||
mountPoint="$1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not going to pretend this has a user-friendly CLI. It's mostly intended to be used by nixos-install.sh
and the upcoming new image building machinery right now. If someone has a suggestion on how to remove the duplicate closure vs. path stuff in this CLI, I'm all ears.
# inside one. See https://github.com/NixOS/nix/issues/1242 for more discussion. | ||
if [[ "$i" =~ \.closure$ ]]; then | ||
echo "importing serialized closure $i to $mountPoint..." | ||
fakechroot -- chroot $mountPoint nix-store --import < $i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure I could use unshare
for a user namespace here but fakechroot
will work on grsecurity hosts, whereas the user namespaces have been turned off. fakechroot
seemed pretty harmless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, with 1.12
I think you can just set the store to be a store at a different root without fakechroot
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, is there a strong reason not to just hard-code this to use 1.12 and lose the fakechroot dep?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, trying that out now 😄
I made a few changes that make the installer tests get farther, but I'm still noticing some weirdness that I haven't figured out yet. I successfully install NixOS and then fail here, on
I've been studying the current P.S: please excuse the messy PR right now; I'm trying to post my current state to aid in debugging, and will obviously clean up and improve things (like e.g., the |
Aha, @7c6f434c points out that I need the build-time closure of the system for that part of the test to pass, which ends up on the system in the current |
I don't think it's feasible to have nixos-install build the closure outside of the target - consider that on the actual LiveCD anything you download/build to the LiveCD's |
@dezgeg good point, but it seems like we could probably make the "build inside target" thing happen more explicitly, rather than the current approach which feels sort of accidental and pollutes images with build-time crap that you don't necessarily want on small images |
Okay, amazingly I was able to get all the installer tests passing. Cleaning up the change now and will update soon!! |
d9d1eb5
to
8e6f1f5
Compare
I think the latest commit works for everything, but doesn't address @dezgeg's issue yet. I'll think about how best to deal with that now. But anyone who's feeling in the mood for a review, I'd welcome it! |
@copumpkin Is the commit message also up to date? |
# Inherit binary caches from the host | ||
# TODO: will this still work with Nix 1.12 now that it has no perl? Probably not... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copumpkin Have you tested this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not with 1.12, nope. I guess there's no downside now; when I first wrote it I was trying to avoid 1.12 due to the schema version issues. I'll point it at nixUnstable
and try out the new stuff!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I've switched nixos-prepare-root
to nixUnstable
, but still haven't tested using nixUnstable
on nixos-install
. I don't think that hurts this and I don't want to force more nixUnstable
than we need.
binary_caches="$(@perl@/bin/perl -I @nix@/lib/perl5/site_perl/*/* -e 'use Nix::Config; Nix::Config::readConfig; print $Nix::Config::config{"binary-caches"};')" | ||
extraBuildFlags+=(--option "binary-caches" "$binary_caches") | ||
|
||
nixpkgs="$(readlink -f $(nix-instantiate --find-file nixpkgs))" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strictly speaking should probably have quotes around the inner $()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yessir!
unset NIX_BUILD_HOOK | ||
# A place to drop temporary closures | ||
tmpdir="$(mktemp -d)" | ||
trap "rm -rf $tmpdir" EXIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I prefer to put the trap before the mktemp
call to handle the (admittedly silly) case of an interrupt between the mktemp
and the trap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yessir!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dammit you thwarted my evil attempts to subvert everything. Fine, I'll add quotes there too!
system_root=$closure | ||
# Create a temporary file ending in .closure (so nixos-prepare-root knows to --import it) to transport the store closure | ||
# to the filesytem we're preparing. Also delete it on exit! | ||
nix-store --export $closure > $system_closure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you may want nix-store --export $(nix-store -qR $closure)
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, good point. Hadn't tested that branch
# inside one. See https://github.com/NixOS/nix/issues/1242 for more discussion. | ||
if [[ "$i" =~ \.closure$ ]]; then | ||
echo "importing serialized closure $i to $mountPoint..." | ||
fakechroot -- chroot $mountPoint nix-store --import < $i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, with 1.12
I think you can just set the store to be a store at a different root without fakechroot
# valid, delete it to get it out of the way, but as a result nothing | ||
# will work anymore.) | ||
|
||
fakechroot -- chroot $mountPoint nix-store --register-validity < $i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
# inside one. See https://github.com/NixOS/nix/issues/1242 for more discussion. | ||
if [[ "$i" =~ \.closure$ ]]; then | ||
echo "importing serialized closure $i to $mountPoint..." | ||
fakechroot -- chroot $mountPoint nix-store --import < $i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, is there a strong reason not to just hard-code this to use 1.12 and lose the fakechroot dep?
# Create the required /bin/sh symlink; otherwise lots of things | ||
# (notably the system() function) won't work. | ||
mkdir -m 0755 -p $mountPoint/bin | ||
# !!! assuming that @shell@ is in the closure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just add a shell.closure
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
if [ -z "$noChannelCopy" ] && [ -n "$channel" ]; then | ||
echo "copying channel..." | ||
fakechroot -- chroot $mountPoint nix-env --option build-use-substitutes false \ | ||
"${extraBuildFlags[@]}" -p /nix/var/nix/profiles/per-user/root/channels --set "$channel" --quiet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will entail actual building, right? Or is there a channel.closure
I missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a channel_closure
a little farther down, but you're right, I probably screwed something up here.
ln -sfn /nix/var/nix/profiles/per-user/root/channels $mountPoint/root/.nix-defexpr/channels | ||
|
||
# Grub needs an mtab. | ||
ln -sfn /proc/mounts $mountPoint/etc/mtab |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it belongs a bit better to nixos-install
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep!
I really, really don't want to complicate the installer with Also, Nix 1.12 should enable us to simplify the installer quite a bit. We will be able to use |
@edolstra can you explain how this is complicating anything, if I get rid of the fakechroot? I'd call it better separation of concerns, and was quite clear about why I'm doing this and what functionality it enables. Just to reiterate, I literally cannot build disk images on several platforms. Either they take half an hour on EC2, or simply do not work under at least two local virtualization solutions. I understand wanting to keep few moving parts, but this seems like a refactor that splits the logic out with a sensible separation, so the whole "this is more complicated because it's different" thing just feels vague and unactionable. I'm "complicating" the installer (by removing 75% of its code?) precisely because you told me that I shouldn't duplicate the logic in it in my last PR attempting to build images in a sensible way. You seem to be leaving me in a conundrum where you don't want me to "complicate" the installer, but don't want me to duplicate its logic, but technology prevents me from actually using it to build images. Can you tell me what you expect me to do? Edit: just TBC, most of the "diff" lines in this commit just arise from moving logic from one script to another (which has explicit comments as to its purpose and contract) so I can better reuse it. So I really don't get what's more complicated than before, outside of the |
To put this in perspective, I've been paying packet.net $0.40 an hour for the past several days just so I could test this and fix the installer tests, because I have no Linux boxes of my own, I can't run this inside a VM on my Mac, and it's completely untenable to do any of this on EC2 (if I want to get it done this year at least). |
@copumpkin I think you wanted to ping @edolstra 😄 |
For anyone following, @edolstra and I hashed out our concerns on IRC, and I'd summarize them as follows:
I actually think that in the slightly longer term, there's a good resolution that will make Until then, @edolstra has said he won't veto this PR if others think this is worthwhile. I'm going to post to the mailing list and see if we can get a sample of opinions. I know of at least a few people who have been bitten by the virtualization issues, but it'd be nice to hear more. If you have thoughts on the matter, please chime in here, either with 👍 (if you support merging this PR) or 👎 (if you don't) on this comment, and if you have thoughts to add in support of your thumb, please do. |
My position: separating phases in multiple scripts that do almost the same as the current script but have clearer functionality is a nice thing to have. My personal interests: I am currently using something that reuses parts of NixOS code but has significant differences (NixPkgs is still the main package source). Personally, I am obviously interested in maximal separation of concern in NixOS code. Basically, I can help people on IRC with parts of NixOS that are useful or at least runnable outside literal NixOS, and being able to help people is nice… [Ideally I would also hope that eventually there can be explicit support for convenient booting NixOS and non-NixOS Nix-based systems sharing a store and updating different profiles. This would allow to create handcoded proof-of-concept bootscripts to show off a single feature and then integrate them into NixOS if anyone cares. This is not in scope of the current PR, though.] |
To be fair this script didn't even invoke |
I've been (intermittently) following this PR with much anticipation.
This is huge to me. I think the costs in terms of discrepancies can be minimized, and what's left is made up for in speed and portability. In concept, I'm a big 👍 on this. It looks like we have a path forward without using (Where "we" mostly means @copumpkin, I suspect -- I'll mostly be cheering from the sidelines on this one) |
nixos/tests/installer.nix
Outdated
@@ -123,6 +123,9 @@ let | |||
$machine->succeed("type -tP ps | tee /dev/stderr") =~ /.nix-profile/ | |||
or die "nix-env failed"; | |||
|
|||
# This should do nothing | |||
$machine->succeed("nixos-rebuild switch >&2"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, nixos-rebuild switch
does not do nothing. In all cases, even if this is the same as the previous system, it will still create a new generation to the system profile, update the symbolic link, and run the activation script.
Maybe, what you should be testing is the the current-system
still targets the same path in the /nix/store
, to check the the output is the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up taking this out since it doesn't really do much, and I don't want to test basic functionality of nixos-rebuild switch
in this test.
Thanks for all the support! Anyway, almost all feedback is now incorporated as of my push just now. Will make a few more tweaks soon (to do with which closures I pass around and how) but I think we're pretty close. |
@copumpkin Can you re-request review from me when you think this is merge ready? |
Will do, thanks! |
9aaa698
to
26b75bb
Compare
@shlevy requested another review. I think this is basically ready, except for @dezgeg's RAM concern which I'd like to address in a separate commit, and will write another VM test for limited RAM to check it. All installer tests currently pass. One thing I noticed after rebasing onto master (perhaps it was happening before too?) is that the |
I guess the |
mkdir -m 0755 -p $mountPoint/bin | ||
# !!! assuming that @shell@ is in the closure | ||
ln -sf @shell@ $mountPoint/bin/sh | ||
# TODO: do I need to set NIX_SUBSTITUTERS here or is the --option binary-caches above enough? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does anyone know what this does and whether it's necessary? My understanding is that we only used it before to handle the copy-from-other-stores.pl
substituter, but I'm not really sure.
mkdir -m 1775 -p $mountPoint/nix/store | ||
|
||
# TODO: move this back | ||
# chown @root_uid@:@nixbld_gid@ $mountPoint/nix/store |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I welcome thoughts about this. My sense is that it should go into nixos-install
, but I'm not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If sandboxing is enabled, is root:root
more sensible ownership for the store than root:nixbld
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, but we can't assume it's always enabled. I wish we could!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure that all the bind mounts will give the desired effect even with sandboxing enabled, mind you. If you don't assume sandboxing, then nixos-prepare-root.sh
should create a non-broken Nix situation that includes performing chown
, so moving chown
back from the specific installation script makes a lot of sense…
This should be ready for review if anyone's feeling adventurous 😄 |
The key distinction I'm drawing is that there's a component that deals with the store of the machine being built, and another component for the store building it. The inner part of it assumes nothing from the builder (doesn't need chroot or root powers) so it can run comfortably inside a Nix build, as well as nixos-rebuild. I have some upcoming work that will use that to significantly speed up and streamline image builds for NixOS, especially on virtualized hosts like EC2, but it's also a reasonable speedup on native hosts.
26b75bb
to
d990aa7
Compare
I've moved the I also added some other checks to the installer tests to make sure other things are being done properly. For a while we generated images with invalid Nix registration data, so I added a verify, and I also added a non-root user to make sure that the daemon is set up correctly and that users can use it to run builds. |
patchPhase = '' | ||
substituteInPlace src/libexpr/json-to-value.cc \ | ||
--replace 'std::less<Symbol>, gc_allocator<Value *>' \ | ||
'std::less<Symbol>, gc_allocator<std::pair<const Symbol, Value *> >' | ||
|
||
sed -i '/if (settings.readOnlyMode) {/a curSchema = getSchema();' src/libstore/local-store.cc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copumpkin I was curious about this, too. Perhaps a comment here pointing at a specific Nix issue, or just a short description, would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was hoping for a 1.11.9 to go out, but I'll add a comment until then
A bug that's fixed upstream but I need a 1.11.9 release :) basically the
forward schema compatibility in 1.11 missed a corner case and failed during
read-only evaluations with nix-instantiate. On my phone right now but if
you check recently closed issues/PRs on Nix repo you'll see it.
…On Sun, Apr 16, 2017 at 17:25 Shea Levy ***@***.***> wrote:
***@***.**** approved this pull request.
------------------------------
In pkgs/tools/package-management/nix/default.nix
<#23026 (comment)>:
> patchPhase = ''
substituteInPlace src/libexpr/json-to-value.cc \
--replace 'std::less<Symbol>, gc_allocator<Value *>' \
'std::less<Symbol>, gc_allocator<std::pair<const Symbol, Value *> >'
+
+ sed -i '/if (settings.readOnlyMode) {/a curSchema = getSchema();' src/libstore/local-store.cc
What's this?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#23026 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAAKP8PlraC3ZSCW6ce1QYWqfHAwq1H2ks5rwodVgaJpZM4MGnd5>
.
|
The moment we get a .9 release we can remove that entire patch phase since
the other thing in there is also fixed.
…On Sun, Apr 16, 2017 at 17:31 Daniel Peebles ***@***.***> wrote:
A bug that's fixed upstream but I need a 1.11.9 release :) basically the
forward schema compatibility in 1.11 missed a corner case and failed during
read-only evaluations with nix-instantiate. On my phone right now but if
you check recently closed issues/PRs on Nix repo you'll see it.
On Sun, Apr 16, 2017 at 17:25 Shea Levy ***@***.***> wrote:
> ***@***.**** approved this pull request.
> ------------------------------
>
> In pkgs/tools/package-management/nix/default.nix
> <#23026 (comment)>:
>
> > patchPhase = ''
> substituteInPlace src/libexpr/json-to-value.cc \
> --replace 'std::less<Symbol>, gc_allocator<Value *>' \
> 'std::less<Symbol>, gc_allocator<std::pair<const Symbol, Value *> >'
> +
> + sed -i '/if (settings.readOnlyMode) {/a curSchema = getSchema();' src/libstore/local-store.cc
>
> What's this?
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <#23026 (review)>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/AAAKP8PlraC3ZSCW6ce1QYWqfHAwq1H2ks5rwodVgaJpZM4MGnd5>
> .
>
|
Okay, it looks there's a lot of support for merging this. I'm going to go ahead and merge, and if there are any outstanding objections after this we can revert the commit. The follow-up PR is #24964 |
@copumpkin: it seems very likely that this merge is what breaks the build of ova and thus blocks nixos-unstable.
I haven't really read this PR nor the followup. |
BTW, the followup PR fixes the build. |
Yep I mentioned that in the followup PR, sorry! Was reluctant to put effort into fixing build of those images if I was about to scrap that code by merging the other PR.
… On Apr 23, 2017, at 04:41, Vladimír Čunát ***@***.***> wrote:
BTW, the followup PR fixes the build.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
OK to me as long as it's not going to block the channel for days. |
Okay I'll look into fixing it if I can't get the other PR merged soon :)
@shlevy do you have a moment to look?
…On Sun, Apr 23, 2017 at 11:20 Vladimír Čunát ***@***.***> wrote:
OK to me as long as it's not going to block the channel for days.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#23026 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AAAKP-y3016VK23ix7-B2PHphKoRoiLGks5ry2whgaJpZM4MGnd5>
.
|
The key distinction I'm drawing is that there's a component that deals with the store of the machine being built, and another component for the store building it. The inner part of it assumes nothing from the builder (doesn't need chroot or root powers) so it can run comfortably inside a Nix build, as well as nixos-rebuild. I have some upcoming work that will use that to significantly speed up and streamline image builds for NixOS, especially on virtualized hosts like EC2, but it's also a
reasonable speedup on native hosts.
This still doesn't quite work (the installer tests don't pass) but this is the general shape of what I'd like to use, so would appreciate any early feedback anyone has.Diff-wise, it might be productive to also diff
nixos-prepare-root.sh
against the oldnixos-install.sh
, since much of the stuff that I deleted from the latter ended up in the former, often unchanged. I don't know how to make GitHub do that for me, but if it can, please post a comment.cc @edolstra @domenkozar @shlevy @obadz @cleverca22
Motivation for this change
The current NixOS image building machinery needs to jump through quite a few hoops to use the current
nixos-install
inside a build, so it fires up a VM and behaves impurely inside it. While that's convenient, in practice it's a bit finicky, doing some work on the host, some work on nix database inside the VM, and other work on the nix store/db inside the filesystem being built.This PR is the first step to fixing that, with another PR afterwards that will use
nixos-prepare-root
from the image building machinery. See also #21943 (comment) for more about how it works.Things done
(nix.useSandbox on NixOS,
or option
build-use-sandbox
innix.conf
on non-NixOS)
nix-shell -p nox --run "nox-review wip"
./result/bin/
)