Bad file descriptor
when building content addressed derivation
#6516
Comments
@thufschmitt You mentioned this same problem almost two years ago. Could you please confirm that this is still a bug and that it doesn't depend on my particular configuration or derivations? Forgive me for pinging you directly, but I need to be sure that this is currently broken. Again, I would like to help, but this is my first time trying to read Nix's source code and I fear this bug isn't easy to fix. I can't even imagine what is causing it.
@aciceri I have these occasionally (though it might not be the same cause as the original one; it had mostly disappeared until a couple of months ago). But I couldn’t manage to reproduce it in a deterministic-ish way :(
I don’t think so. I think the error is “just” that some fds get closed too early or too late (in a totally non-deterministic fashion), but things seem to work well when that doesn’t happen.
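For readers unfamiliar with the error itself: "Bad file descriptor" is the message for the EBADF errno, which the kernel returns when a process uses a descriptor number that is no longer open. The following is a minimal Python sketch of that failure mode, not Nix's actual code path; the function name `write_after_close` is made up for illustration.

```python
import errno
import os


def write_after_close() -> int:
    """Close the write end of a pipe, then write to it; return the resulting errno."""
    read_fd, write_fd = os.pipe()
    os.close(write_fd)  # the descriptor is closed "too early"
    try:
        os.write(write_fd, b"x")  # use of the descriptor after it was closed
    except OSError as exc:
        return exc.errno
    finally:
        os.close(read_fd)
    return 0


if __name__ == "__main__":
    code = write_after_close()
    # Print the symbolic errno name and the matching OS error message.
    print(errno.errorcode.get(code, code), "-", os.strerror(code))
```

In a multi-threaded program the closed number may already have been reused by another thread, in which case the write silently goes to the wrong file instead of failing, which is one reason such bugs tend to be sporadic.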
I'm getting this error quite regularly. I have the ngi0 cache enabled, which I think may be causing this issue to occur more often. I also see this at one place in the journal:
Is there a way to enable more debug logging to maybe catch more of what's happening internally? I get this error quite often (especially now that I'm building a lot of things), so I'd like to help figure out where the issue lies.
a possibly related coredump
another coredump that happened while trying to build ca-derivations
@Mindavi I can't see any connection with the ngi0 cache to be honest, what do you mean? These errors happen while building derivations, so the more Nix can fetch from the cache, the lower the chances of these errors occurring. However, I would really like ca-derivations to work too, and I'm available to work on this, but this is the first time I put my hands on
It also seems to happen without the ngi0 cache: https://github.com/helsinki-systems/harmonia/runs/7183636987?check_suite_focus=true
This also seems to happen without using content-addressed derivations.
I can reliably reproduce this when using ca-derivations. Is there any way I can debug this? This is quite annoying.
This now happens to me repeatedly even when not using CA derivations (when building stuff) and is really annoying. It is probably difficult to debug, as there is no information about which type of file descriptor has this "use-after-free"-like problem. Maybe such a diagnostic should be added.
Just removing
that will just cause nix to quickly run out of file descriptors, I suppose (given it opens thousands of them in mere seconds regularly)
Note: I've successfully worked around this by doing
{
  systemd.extraConfig = "DefaultLimitNOFILE=1048576";
}
Seems like the default of 1024 is causing issues, but I'm not sure it's worth fixing if just increasing it fixes it.
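To check which open-file limit a process actually runs with, one option is a small sketch like the one below, using Python's standard `resource` module. It reports the limits of the Python process itself (inherited from the shell or service that started it), not those of a running nix-daemon; the helper name `nofile_limits` is hypothetical.

```python
import resource
from typing import Tuple


def nofile_limits() -> Tuple[int, int]:
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    # The soft limit is the one that is enforced; the hard limit is its ceiling.
    print(f"soft limit: {soft}, hard limit: {hard}")
```

Running it before and after changing the systemd default shows whether newly started sessions and services picked up the new value.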
It doesn't seem to solve the problem entirely, unfortunately. Perhaps it needs to be a very high value to avoid it.
Can confirm that even outside of this bug Nix frequently runs into problems with a low
Still seeing this, maybe it helps that I'm now building with ubsan enabled. The crash from nix-daemon
I demangled some of the symbols from this and it seems to be pointing to somewhere here: Lines 667 to 687 in 05d0892
I have been working on debugging this, but haven't been able to reliably reproduce it. I got the error myself quite a few times when I first enabled CA derivations, but now it's not happening at all. Anyone who is still consistently getting the error: can you make a small example flake that (somewhat) reliably triggers it? (with
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/content-addressed-nix-call-for-testers/12881/217
Describe the bug
When I try to build ca derivations I get sporadic errors about "bad file descriptor"s. Sometimes they are built and sometimes not.
Steps To Reproduce
Same command again gives another output
I've built nix from master (the latest commit), but I had the same problem with current nixos-unstable's nix (2.8) and even with nixUnstable on stable (2.5).
Additional context
Sometimes I get "core dumped" in the nix daemon logs but not necessarily.