nixos/tests: Use a patched QEMU for testing #20500
The reason to patch QEMU is that with latest Nix, tests like "printing" or "misc" fail because they expect the store paths to be owned by uid 0 and gid 0.

Starting with NixOS/nix@5e51ffb, Nix builds inside of a new user namespace. Unfortunately this also means that bind-mounted store paths that are part of the derivation's inputs are no longer owned by uid 0 and gid 0 but by uid 65534 and gid 65534.

This in turn causes things like sudo or cups to fail with errors about insecure file permissions.

So in order to avoid that, let's make sure the VM always gets files owned by uid 0 and gid 0 and does a no-op when doing a chmod on a store path.

In addition, this adds a virtualisation.qemu.program option so that we can make sure that we only use the patched version if we're *really* running NixOS VM tests (that is, whenever we have imported test-instrumentation.nix).

Tested against the "misc" and "printing" tests.

Signed-off-by: aszlig <aszlig@redmoonstudios.org>
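To make the intended behaviour in the commit message concrete, here is a small Python sketch of the two rules it describes: the VM always sees files as owned by 0/0, and a chmod on a store path does nothing. This is only a model of the semantics, not the QEMU patch itself. The names `StatResult`, `guest_stat`, `guest_chmod` and the `/nix/store/` prefix constant are assumptions made up for the illustration.

```python
from dataclasses import dataclass, replace

# Assumption for illustration: the store lives under /nix/store.
STORE_PREFIX = "/nix/store/"


@dataclass(frozen=True)
class StatResult:
    """Minimal stand-in for the stat() fields that matter here."""
    uid: int
    gid: int
    mode: int


def guest_stat(host_stat: StatResult) -> StatResult:
    """Return what the VM should see: same mode, but always owned by 0/0."""
    return replace(host_stat, uid=0, gid=0)


def guest_chmod(path: str, mode: int, apply_chmod) -> bool:
    """Skip chmod for store paths and forward everything else.

    Returns True if the request was forwarded to apply_chmod.
    """
    if path.startswith(STORE_PREFIX):
        return False  # no-op: store paths are left untouched
    apply_chmod(path, mode)
    return True
```

With this model, a host file owned by 65534/65534 is reported to the guest as 0/0 with an unchanged mode, which is the property the strict checks in sudo and cups depend on.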
Cc: @domenkozar because this should surface on Hydra deployments in the wild with #19396.
Also I'm open to ideas to solve this in a better way.
Well, the easy solution is not to use user namespaces. I'd prefer not to have to patch qemu (and who knows how many other packages) to work around bugs caused by user namespaces.
@edolstra: These problems are not only surfacing with user namespaces but also on non-NixOS without user namespaces. I guess the reason we didn't get bug reports yet is that the number of people running NixOS tests on non-NixOS systems is close to zero. For the issue related to
@edolstra: Also, there is another option to fix this in a generic way, which I forgot above: Use seccomp to force st_uid/st_gid to be 0 on
Why does it have to be 0 only for store paths? Can't we force everything to be 0?
@edolstra: Hm, good point... going to implement it.
@edolstra: Okay, this is very tricky to implement, because if we want to emulate the

This would lead to an infinite recursion, but usually the way to properly implement this with seccomp is to match only on specific arguments of

If we go for the second implementation, we could use a tracer to pause and re-inject the emulated syscall from outside the sandbox. But having a tracer attached to each process not only has an impact on performance but also can cause side-effects in terms of

Both methods however have another big drawback: We'd lose cross-platform compatibility of

On 32-bit x86 we can just swap

And on top of all that, we still have a build environment that's different from the one we have in older Nix, so we will see builds failing on newer Nix where they have succeeded in older Nix versions.

So all in all, I think this is not worth it, so it might be a good idea to either remove user namespaces altogether or make it the new standard for Nix builders (which also means that we need to do patches like this very PR).

Speaking of builders, using user namespaces indeed has one particular advantage: We could implement a real
Our NixOS test runner is using virtio to make the store available to the VMs within the test. Since NixOS/nix@5e51ffb we now build with user namespaces and we have a mapping from the uid/gid of the current build user to a uid/gid within the newly created user namespace. In NixOS/nix#1131 this is going to be 0/0 again but even using 1000/100 before (in NixOS/nix@ff0c0b6) doesn't avoid the following problem.

So the problem here is that the bind-mounted file systems within the build process will have a different uid/gid mapping and this in turn leads to test failures like these:

What happens here is that the programs tested here (sudo and cups) are doing strict permission checks and bail out because the paths that are mounted within the builder sandbox now have uid/gid 65534 and thus don't match with the user that the process is running (uid/gid 0) as.

From the perspective of namespacing this makes sense, because the uid 0 within the user namespace isn't the same as uid 0 from the upper namespace.
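The mapping behaviour described above can be sketched in a few lines of Python. The sketch models how a host id is translated through a user namespace id map (the `inside outside length` line format of `/proc/<pid>/uid_map`) and falls back to the overflow id 65534 when no range covers it. The host build user uid 30001, the single-entry map, and the `passes_ownership_check` helper are assumptions chosen for illustration; the real checks in sudo and cups are more involved.

```python
OVERFLOW_ID = 65534  # default value of /proc/sys/kernel/overflowuid on Linux


def parse_id_map(text):
    """Parse uid_map/gid_map content: lines of 'inside outside length'."""
    return [tuple(int(field) for field in line.split())
            for line in text.splitlines() if line.strip()]


def to_namespace_id(host_id, id_map):
    """Translate a host uid/gid into the namespace, or the overflow id if unmapped."""
    for inside, outside, length in id_map:
        if outside <= host_id < outside + length:
            return inside + (host_id - outside)
    return OVERFLOW_ID


def passes_ownership_check(file_uid, process_uid):
    """Hypothetical stand-in for a strict check: the owner must be the running user."""
    return file_uid == process_uid


# Assumed example mapping: host build user 30001 appears as uid 0 inside.
id_map = parse_id_map("0 30001 1\n")
builder_uid = to_namespace_id(30001, id_map)  # the build user itself
store_owner = to_namespace_id(0, id_map)      # host root, which owns the store
```

Here the build user ends up as uid 0 inside the namespace, while host root (the owner of the store paths) has no entry in the map and shows up as 65534, so an owner-equals-user check fails even though both sides are "root" in their own namespace.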
Right now it seems that the official Hydra doesn't yet run with a recent enough Nix version to be hitting this problem, but it will eventually get there as well.
So the following options come to mind to address this:

- Patch sudo and cups to not do restrictive checks on paths within the Nix store.
- Use bindfs to override the uid/gid (very slow, because it's implemented as FUSE).
- Patch QEMU so that the VM always sees uid/gid 0 in the results of stat().

With this PR I've chosen the latter, because it's the most generic and least invasive approach. However we still need to fix build failures of some programs (for example Go's tests fail because of this https://headcounter.org/hydra/build/1455527/nixlog/1/raw, but I haven't looked into it in detail yet).
Cc: @edolstra