Spurious (unrelated) XDEV errors while building things for which you cannot "bootstrap" #8395

Closed
RaitoBezarius opened this issue May 24, 2023 · 1 comment · Fixed by #8399
@RaitoBezarius
Member

Describe the bug

Take commit exposed by https://hydra.nixos.org/build/218780863.
Rebuild one of the tests.
Observe EXDEV failures.

Why is this happening? I don't understand why Nix is trying to renameat2 a chroot path to a Nix store path, given that nix-build is called with --store /mnt.

Now, if you apply this nice patch (pardon the brutality):

From 4886c5db115ac4c8d8a78dbaa62729f070efe6f9 Mon Sep 17 00:00:00 2001
From: Raito Bezarius <masterancpp@gmail.com>
Date: Wed, 24 May 2023 21:16:44 +0200
Subject: [PATCH] localStore: do not cleanup for disk full situations for now

---
 src/libstore/build/local-derivation-goal.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/libstore/build/local-derivation-goal.cc b/src/libstore/build/local-derivation-goal.cc
index 7929cfe35..0c78710db 100644
--- a/src/libstore/build/local-derivation-goal.cc
+++ b/src/libstore/build/local-derivation-goal.cc
@@ -315,6 +315,7 @@ void LocalDerivationGoal::cleanupPostChildKill()
 
 bool LocalDerivationGoal::cleanupDecideWhetherDiskFull()
 {
+    return false;
     bool diskFull = false;
 
     /* Heuristically check whether the build failure may have
-- 
2.40.1

With it, you get a clearer idea of the actual problem: kbd.dev is missing from system.extraDependencies (we should definitely have an ensureClosure so that we do not have to care about that; unfortunately, we do not). This was fixed in NixOS/nixpkgs#229826, but due to Git and multiple changes, I think the fix was lost, which led me to believe that an unrelated PR was the cause.

Steps To Reproduce

  1. Just build the simple installer test on nixpkgs commit 36226e3b93c5e7db9110392fb0242ca95c331530.
  2. See error.

Expected behavior

A clear error about the actual problem, possibly with warnings that renameat2 was not possible, but not fatal errors like these.

nix-env --version output

Reproducible on 2.12, 2.15.

Additional context

We spent a tortuous time debugging this with GDB, Nix symbols, strace, etc.
It seems like there are some EXDEV errors in the sandbox / chroot helper / something, and I have seen @thufschmitt fixing some of them for Podman in relation to overlay filesystems.

Note that NixOS tests in a VM run on an overlaid /nix/store with a writable part (tmpfs) and a read-only part coming from the host via 9p.

Priorities

Add 👍 to issues you find important.

@thufschmitt
Member

That indeed seems very similar to #6280 (the original issue about EXDEV on podman containers). There's a trivial workaround on the Nix side, which is to replace this renameFile call with moveFile (which will fall back to a copy if the rename call fails). That might be a bit costly if it happens too often, though, so I'd also like to understand why it happens and whether we can avoid it in a more principled way.
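
To make the workaround concrete, here is a minimal sketch of a "rename, then fall back to copy on EXDEV" helper. It is not Nix's actual moveFile implementation: the name `moveWithFallback` and its boolean return value are assumptions made for illustration, and it uses only the standard `std::filesystem` API (C++17).

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not Nix's moveFile): move `from` to `to`.
// Returns true if a plain rename was enough, false if the copy fallback ran.
bool moveWithFallback(const fs::path & from, const fs::path & to)
{
    assert(!from.empty() && !to.empty());

    std::error_code ec;
    fs::rename(from, to, ec);
    if (!ec)
        return true; // cheap path: rename within one filesystem

    // Only EXDEV ("cross-device link") is recoverable here;
    // every other failure is reported to the caller.
    if (ec != std::errc::cross_device_link)
        throw fs::filesystem_error("cannot rename", from, to, ec);

    // Expensive path: copy the whole tree, then delete the source.
    fs::copy(from, to,
        fs::copy_options::recursive | fs::copy_options::copy_symlinks);
    fs::remove_all(from);
    return false;
}
```

The return value makes the cost visible: a caller could log or count how often the fallback triggers, which is the "too often" concern mentioned above.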

thufschmitt added a commit to tweag/nix that referenced this issue May 25, 2023
For some reason (probably linked to
https://www.kernel.org/doc/html/latest/filesystems/overlayfs.html?highlight=overlayfs#renaming-directories),
renaming the output paths of a derivation from their sandbox location
(`/nix/store/abc-foo.drv.chroot/nix/store/def-foo`) to their final one
(`/nix/store/def-foo`) fails in some scenarios with `EXDEV` (“invalid
cross-device link”).

Work around that by falling back to copying them in that case, which is
much less efficient, but correct.

Fix NixOS#8395
thufschmitt added a commit to tweag/nix that referenced this issue May 25, 2023
When encountering a build error, Nix moves the output paths out of the
chroot into their final location (for “easier debugging of build
failures”). However this was broken for chroot stores as it was moving
it to the _logical_ location, not the _physical_ one.

Fix it by moving to the physical (_real_) location.

Fix NixOS#8395
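
The logical/physical distinction in the commit message above can be sketched as follows, assuming a chroot store opened with `--store /mnt`, where the logical store directory `/nix/store` lives physically under `/mnt/nix/store`. The `StoreDirs` struct and the `physicalPathFor` function are hypothetical names for this illustration and are not taken from the Nix sources.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical description of a chroot store's two views of its directory.
struct StoreDirs {
    std::string logicalDir;  // what derivations see, e.g. "/nix/store"
    std::string physicalDir; // where it lives on disk, e.g. "/mnt/nix/store"
};

// Map a logical store path to its physical (real) location.
// Throws if the path is not inside the logical store directory.
std::string physicalPathFor(const StoreDirs & store, const std::string & logicalPath)
{
    assert(!store.logicalDir.empty());

    const std::string prefix = store.logicalDir + "/";
    if (logicalPath.compare(0, prefix.size(), prefix) != 0)
        throw std::invalid_argument("not a store path: " + logicalPath);

    // Keep the "/<hash>-<name>" suffix and swap the directory in front of it.
    return store.physicalDir + logicalPath.substr(store.logicalDir.size());
}
```

With `StoreDirs{"/nix/store", "/mnt/nix/store"}`, the output `/nix/store/def-foo` maps to `/mnt/nix/store/def-foo`, which is the destination the failed outputs should be moved to rather than the logical `/nix/store/def-foo`.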
github-actions bot pushed a commit that referenced this issue May 28, 2023
When encountering a build error, Nix moves the output paths out of the
chroot into their final location (for “easier debugging of build
failures”). However this was broken for chroot stores as it was moving
it to the _logical_ location, not the _physical_ one.

Fix it by moving to the physical (_real_) location.

Fix #8395

(cherry picked from commit d16a199)
github-actions bot pushed a commit that referenced this issue Jun 6, 2023
When encountering a build error, Nix moves the output paths out of the
chroot into their final location (for “easier debugging of build
failures”). However this was broken for chroot stores as it was moving
it to the _logical_ location, not the _physical_ one.

Fix it by moving to the physical (_real_) location.

Fix #8395

(cherry picked from commit d16a199)