Nix install runs out of memory and fails when binary package size > RAM size #1969
This is a major issue because installing something like CUDA 9.0 (> 2 GB) on a medium machine with 4 GB RAM fails...
This should partially be addressed by #619 / #1754. However, another reason for the memory-use regression is that evaluation and building are now done in a single process (i.e. nix-build is no longer a wrapper around nix-instantiate and nix-store -r). A solution might be to force a Boehm GC run after evaluation.
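As a toy illustration of that last suggestion (a stand-in, not Nix code; it only assumes the Boehm GC C API from <gc/gc.h>):

```cpp
#include <gc/gc.h>   // Boehm GC public API; link with -lgc
#include <cstddef>
#include <cstdio>

int main() {
    GC_INIT();

    // Stand-in for the evaluation phase: allocate GC-managed memory
    // that becomes garbage as soon as "evaluation" finishes.
    for (int i = 0; i < 100000; i++)
        (void) GC_MALLOC(1024);

    // The suggestion above: force a full collection after evaluation,
    // before the (now in-process) build phase starts.
    GC_gcollect();

    // The heap stays mapped, but the reclaimed space becomes reusable
    // by the build phase instead of growing the process further.
    printf("heap: %zu bytes, free: %zu bytes\n",
           (size_t) GC_get_heap_size(), (size_t) GC_get_free_bytes());
}
```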
Thanks @edolstra
Indeed, this also happens on something more "lightweight", when compiling GCC 5. There is a memory leak somewhere: c-family/.deps/c-gimplify.TPo ../../gcc-5.5.0/gcc/c-family/c-gimplify.c: error: out of memory
NixOS/nix#1969 for Nix 2.0 migration
Boehm only governs eval memory, which is peanuts compared to the memory used by copying around std::strings. Profiling shows that copying paths is by far the largest contributor to peak memory usage: it requires holding both the compressed data (from the binary cache) and the decompressed data in memory at the same time (briefly, but nevertheless). Our usage of std::string::append appears particularly painful, FWIW; decompression appends to a string two pages at a time (on glibc)... :(

This isn't addressed by the linked issues, although it is similar in spirit, so perhaps it can be handled similarly. (We have "Sink"s for compression, but decompression, by far the more common case for everyone that isn't Hydra, decompresses into a string instead of being a Source or something.) I poked at this for a few hours yesterday but didn't find a clean and satisfactory way to improve it.

This is particularly problematic for the above reason (the nar and nar.xz need to be held in memory concurrently), but especially because downloading is done by worker threads, making the memory requirement something like cores * max(nar + nar.xz), which I suspect is what causes the memory usage problems reported. Swap helps, but at some point Nix is responsible for using disk for huge paths instead of manipulating them in-memory.
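For illustration, a minimal sketch (not Nix's actual code) of what a Source/Sink-shaped decompressor could look like, using liblzma's documented streaming API: decompressed data is pushed to a consumer in fixed-size chunks, so only two small buffers stay resident instead of the whole nar plus nar.xz.

```cpp
#include <lzma.h>      // liblzma; link with -llzma
#include <cstdio>
#include <functional>
#include <stdexcept>

// Push-style consumer, in the spirit of Nix's Sink: each decompressed
// chunk is handed over immediately instead of being appended to one
// ever-growing std::string.
using Sink = std::function<void(const unsigned char *, size_t)>;

// Decompress an .xz stream from `in`, feeding fixed-size chunks to `sink`.
// Peak memory stays at two small buffers regardless of the NAR's size.
void decompressXzToSink(FILE * in, Sink sink) {
    lzma_stream strm = LZMA_STREAM_INIT;
    if (lzma_stream_decoder(&strm, UINT64_MAX, 0) != LZMA_OK)
        throw std::runtime_error("cannot initialise xz decoder");

    unsigned char inbuf[64 * 1024], outbuf[64 * 1024];
    lzma_action action = LZMA_RUN;

    while (true) {
        if (strm.avail_in == 0 && !feof(in)) {
            strm.next_in = inbuf;
            strm.avail_in = fread(inbuf, 1, sizeof inbuf, in);
            if (feof(in)) action = LZMA_FINISH;
        }
        strm.next_out = outbuf;
        strm.avail_out = sizeof outbuf;
        lzma_ret ret = lzma_code(&strm, action);
        sink(outbuf, sizeof outbuf - strm.avail_out);  // emit this chunk
        if (ret == LZMA_STREAM_END) break;
        if (ret != LZMA_OK) throw std::runtime_error("xz decoding failed");
    }
    lzma_end(&strm);
}
```

A caller could hand in a sink that writes each chunk straight to disk, so peak memory stops scaling with path size, which also covers the huge-path case discussed next.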
"Swap helps but at some point Nix is responsible for using disk for huge paths instead of manipulating them in-memory." I agree; something like that sounds reasonable to me. Unpacking a NAR > 2 GB in memory and hoping swap will handle it looks like a dangerous solution to me.
Even if swap does handle it, uncompressing to RAM and then writing to disk, instead of writing to disk directly, can push out quite a bit of disk cache that would otherwise survive.
Maybe useful: http://stxxl.org
Just hit this with CI that needs cudatoolkit. Is there a workaround other than downgrading Nix or adding swap?
copyStorePath() now pipes the output of srcStore->narFromPath() directly into dstStore->addToStore(). The sink used by the former is converted into a source usable by the latter using boost::coroutine2. This is based on [1].

This reduces the maximum resident size of

$ nix build --store ~/my-nix/ /nix/store/b0zlxla7dmy1iwc3g459rjznx59797xy-binutils-2.28.1 --substituters file:///tmp/binary-cache-xz/ --no-require-sigs

from 418592 KiB to 53416 KiB. (The previous commit also reduced the runtime from ~4.2s to ~3.4s, not sure why.) A further improvement will be to download files into a Sink.

[1] master...Mathnerd314:dump-fix-coroutine#diff-dcbcac55a634031f9cc73707da6e4b18

Issue #1969.
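The sink-to-source conversion described in that commit can be illustrated with a self-contained toy (the producer below stands in for srcStore->narFromPath() and the loop for dstStore->addToStore(); names are illustrative, not Nix's actual code):

```cpp
#include <boost/coroutine2/coroutine.hpp>
#include <functional>
#include <iostream>
#include <string>

using Sink = std::function<void(const std::string &)>;   // push-style, like Nix's Sink
using Coro = boost::coroutines2::coroutine<std::string>;

int main() {
    // Hypothetical producer standing in for srcStore->narFromPath():
    // it *pushes* chunks into a sink and knows nothing about pulling.
    auto producer = [](Sink sink) {
        for (int i = 0; i < 3; i++)
            sink("chunk " + std::to_string(i) + "\n");
    };

    // Wrap it in a pull_type coroutine: each sink() call suspends the
    // producer and yields exactly one chunk to the consumer, so at most
    // one chunk is buffered instead of the whole NAR.
    Coro::pull_type source([&](Coro::push_type & yield) {
        producer([&](const std::string & chunk) { yield(chunk); });
    });

    // Consumer standing in for dstStore->addToStore(): pulls on demand.
    for (const auto & chunk : source)
        std::cout << chunk;
}
```

The design point is that neither side has to change shape: the push-style producer keeps calling its sink, while boost::coroutine2 suspends it after every chunk so the consumer pulls data on demand.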
The recent commits should address the worst of this (yay!!). Can this be closed? Also... 2.1, or whatever is next, soon-ish? :D
Sadly this fix is still very much needed. As of 0cb1e52 I am still unable to build a derivation depending on a 15 GB tarball (an FPGA toolchain provided by a hardware vendor) on a machine with 32 GB of RAM.
This reduces memory consumption of nix copy --from https://cache.nixos.org --to ~/my-nix /nix/store/95cwv4q54dc6giaqv6q6p4r02ia2km35-blender-2.79 from 176 MiB to 82 MiB. (The remaining memory is probably due to xz decompression overhead.) Issue NixOS#1681. Issue NixOS#1969.
@bgamari How are you depending on that tarball? If it's via a path reference (e.g. …
@edolstra indeed it is via a path reference (since the tarball must be downloaded manually due to vendor login requirements).
@bgamari I would consider …
dup of #1681?
Fixes `error: out of memory` of `nix-store --serve --write` when receiving packages via SSH (and perhaps other sources). See NixOS#1681 NixOS#1969 NixOS#1988 NixOS/nixpkgs#38808.

Performance improvement on `nix-store --import` of a 2.2 GB cudatoolkit closure:

When the store path already exists:
Before: 10.82user 2.66system 0:20.14elapsed 66%CPU (0avgtext+0avgdata 12556maxresident)k
After: 11.43user 2.94system 0:16.71elapsed 86%CPU (0avgtext+0avgdata 4204664maxresident)k

When the store path doesn't yet exist (after `nix-store --delete`):
Before: 11.15user 2.09system 0:13.26elapsed 99%CPU (0avgtext+0avgdata 4204732maxresident)k
After: 5.27user 1.48system 0:06.80elapsed 99%CPU (0avgtext+0avgdata 12032maxresident)k

The reduction is 4200 MB -> 12 MB RAM usage, and it also takes less time.
Try out #2206
What does the 'backlog' label mean? It's not a priority? You can add me to the list of users who consider this issue a blocker for Nix 2.0, and I hope to see a release with the fixes soon. (I'm currently dealing with FPGA toolchains of ~8 GiB.)
I believe the backlog label just means it hasn't been triaged yet.
https://travis-ci.com/cachix/cachix/jobs/147861767 still has the issue, snippets:
Nix has 6 GB of RAM available:
and it needs to unpack 4 GB in total, while I run it with …
Just to make sure: you're either running single-user, or the daemon is 2.1.2 as well?
AFAIK it's still single-user on Linux :)
This is because this NAR is corrupt:
Note the missing NAR header, which looks like this:
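For reference, a NAR begins with the magic string nix-archive-1, serialized in Nix's wire format as a little-endian 64-bit length (13) followed by the bytes, zero-padded to a multiple of 8. A minimal sketch (not Nix's actual code) that checks a file for that header:

```cpp
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iostream>

// Checks whether a file starts with the NAR magic, "nix-archive-1",
// serialized as Nix serializes strings: little-endian u64 length, then
// the bytes, zero-padded up to a multiple of 8 (13 -> 16 bytes).
bool looksLikeNar(const char * path) {
    std::ifstream f(path, std::ios::binary);
    uint64_t len = 0;
    f.read(reinterpret_cast<char *>(&len), sizeof len);  // assumes little-endian host
    if (!f || len != 13) return false;
    char magic[16] = {};
    f.read(magic, 16);
    return f && std::memcmp(magic, "nix-archive-1\0\0\0", 16) == 0;
}

int main(int argc, char ** argv) {
    if (argc != 2) {
        std::cerr << "usage: " << argv[0] << " <file.nar>\n";
        return 2;
    }
    std::cout << (looksLikeNar(argv[1]) ? "NAR header present\n"
                                        : "NAR header missing\n");
}
```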
Thank you @edolstra - I'll add checks to prevent this.
So this is fixed in Nix 2.1; my bug had a similar error but a different cause.
In the case of a binary installation where the size of the NAR > the size of the RAM, I hit an issue where Nix 2.0 simply runs out of memory and aborts.
This was triggered by a simple nix-build and did not happen before Nix 2.0.