
nix copy uses too much memory #1681

Open
LisannaAtHome opened this issue Nov 15, 2017 · 49 comments
Labels: new-cli (relating to the "nix" command), performance, stale

@LisannaAtHome

I'm running nix copy in runInLinuxVM, and for any nontrivial closure the VM runs out of memory during the copy. I left the VM's memory at the default 512 MiB. I could obviously give the VM more memory, but that doesn't scale when copying complex derivations with many dependencies.

I suggest adding an option to load and copy the contents of the paths one at a time, or, even better, a way to specify an upper bound on the memory used while copying.
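
(A possible interim workaround, assuming per-path memory use is acceptable: copy the closure one store path at a time, trading speed for a smaller working set. The ssh://target host name here is a placeholder.)

  $ for p in $(nix-store -qR ./result); do nix copy --to ssh://target "$p"; done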

@copumpkin
Member

Intuitively it feels like it should be possible for this to run in constant memory. What am I missing?
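
(For intuition, a constant-memory copy is just a fixed-size read/write loop. A minimal C++ sketch, deliberately simplified and not Nix's actual classes:)

  // Sketch only: peak memory is bounded by one fixed buffer, no matter
  // how large the NAR is. Buffering the whole NAR in a std::string is
  // what makes memory scale with path size instead.
  #include <cstddef>

  struct Source { virtual size_t read(char * buf, size_t len) = 0; }; // returns 0 at EOF
  struct Sink   { virtual void write(const char * buf, size_t len) = 0; };

  void streamingCopy(Source & from, Sink & to)
  {
      char buf[64 * 1024];
      while (size_t n = from.read(buf, sizeof buf))
          to.write(buf, n);
  }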

@lheckemann
Member

I'm encountering this issue with a single path — nix copy, nix-store --import, and a number of other commands I've tried all fail to import the path. Would be great to know if there's any way at all I can import it…

@LisannaAtHome
Author

LisannaAtHome commented Mar 17, 2018

Possibly related to #1969? Some patches that might improve things here have gone in recently: 48662d1, 3e6b194

edolstra added a commit to edolstra/nix that referenced this issue Mar 26, 2018
Continuation of 97002b6. This makes
the daemon use constant memory. For example, it reduces the daemon's
maximum RSS on

  $ nix copy --from ~/my-nix --to daemon /nix/store/1n7x0yv8vq6zi90hfmian84vdhd04bgp-blender-2.79a

from 264 MiB to 7 MiB.

We now use a TunnelSource to prevent the connection from ending up in
an undefined state if an exception is thrown while the NAR is being
sent.

Issue NixOS#1681.
edolstra added a commit to edolstra/nix that referenced this issue Mar 27, 2018
This reduces memory consumption of

  nix copy --from file://... --to ~/my-nix /nix/store/95cwv4q54dc6giaqv6q6p4r02ia2km35-blender-2.79

from 514 MiB to 18 MiB for an uncompressed binary cache, and from 192
MiB to 53 MiB for a bzipped binary cache. It may also be faster
because fetching can happen concurrently with decompression/writing.

Continuation of 48662d1.

Issue NixOS#1681.
edolstra added a commit to edolstra/nix that referenced this issue Mar 27, 2018
This reduces memory consumption of

  nix copy --from https://cache.nixos.org --to ~/my-nix /nix/store/95cwv4q54dc6giaqv6q6p4r02ia2km35-blender-2.79

from 176 MiB to 82 MiB. (The remaining memory is probably due to xz
decompression overhead.)

Issue NixOS#1681.
Issue NixOS#1969.
@shlevy shlevy added the backlog label Apr 1, 2018
@Ralith

Ralith commented Apr 2, 2018

I see commits purporting to address this for a number of different cases, but none concerning uploads to an S3 bucket. Copying a 2.8 GB store path to an S3 bucket took nearly 4 GB of memory and more than twenty minutes at 100% CPU. Has that been fixed?

dtzWill pushed two commits to dtzWill/nix that referenced this issue Apr 4, 2018 (the same two commit messages as above).
@andrewchambers

Hitting this issue trying to do something like: nixos-rebuild build; nix copy ./result --to ssh://low_ram_machine

@dtzWill will those experimental changes help with ssh copy?

@edolstra
Member

@Ralith I'm probably not going to make S3BinaryCacheStore do uploads in constant space. It might not even be supported by aws-sdk-cpp.

I assume the 100% CPU is caused by compression, which you can disable.
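
(For reference, compression is a parameter of the binary cache store, so something along these lines should turn it off; the bucket name and path are placeholders, and the parameter spelling is worth checking against the manual for your Nix version:)

  $ nix copy --to 's3://example-bucket?compression=none' /nix/store/...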

@copumpkin
Member

copumpkin commented Apr 12, 2018

FWIW I too am another big-upload-to-S3 guy using nix copy 😄

It would surprise me if aws-sdk-cpp didn't support it, given that S3 supports almost arbitrarily large objects and multipart uploads. If someone figured out how to implement it, would you accept the PR?
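
(For what it's worth, aws-sdk-cpp does expose S3's multipart API, so a bounded-memory upload looks feasible. A rough sketch, with bucket, key, and part size invented for illustration and error handling omitted; not a definitive implementation:)

  // Sketch: streaming upload via the S3 multipart API, one ~5 MiB part
  // at a time, so peak memory is one part regardless of object size.
  #include <aws/core/Aws.h>
  #include <aws/core/utils/memory/stl/AWSStringStream.h>
  #include <aws/s3/S3Client.h>
  #include <aws/s3/model/CreateMultipartUploadRequest.h>
  #include <aws/s3/model/UploadPartRequest.h>
  #include <aws/s3/model/CompletedPart.h>
  #include <aws/s3/model/CompletedMultipartUpload.h>
  #include <aws/s3/model/CompleteMultipartUploadRequest.h>
  #include <istream>
  #include <vector>

  void uploadStreaming(Aws::S3::S3Client & s3, std::istream & in)
  {
      const char * bucket = "example-bucket";   // placeholder
      const char * key = "nar/example.nar.xz";  // placeholder

      Aws::S3::Model::CreateMultipartUploadRequest createReq;
      createReq.SetBucket(bucket);
      createReq.SetKey(key);
      auto uploadId = s3.CreateMultipartUpload(createReq).GetResult().GetUploadId();

      Aws::S3::Model::CompletedMultipartUpload completed;
      const size_t partSize = 5 * 1024 * 1024;  // S3's minimum part size
      std::vector<char> buf(partSize);

      for (int partNo = 1; ; ++partNo) {
          in.read(buf.data(), partSize);
          auto n = in.gcount();
          if (n == 0) break;

          Aws::S3::Model::UploadPartRequest partReq;
          partReq.SetBucket(bucket);
          partReq.SetKey(key);
          partReq.SetUploadId(uploadId);
          partReq.SetPartNumber(partNo);
          auto body = Aws::MakeShared<Aws::StringStream>("part");
          body->write(buf.data(), n);
          partReq.SetBody(body);

          Aws::S3::Model::CompletedPart part;
          part.SetPartNumber(partNo);
          part.SetETag(s3.UploadPart(partReq).GetResult().GetETag());
          completed.AddParts(part);
      }

      Aws::S3::Model::CompleteMultipartUploadRequest finishReq;
      finishReq.SetBucket(bucket);
      finishReq.SetKey(key);
      finishReq.SetUploadId(uploadId);
      finishReq.SetMultipartUpload(completed);
      s3.CompleteMultipartUpload(finishReq);
  }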

@Ralith

Ralith commented Apr 13, 2018

I assume the 100% CPU is caused by compression, which you can disable.

It seems very strange that it would take twenty minutes on my i7-4980HQ, even so. 2.8GB is big but it's not that big.

@edolstra
Member

IIRC xz compression can easily take that long.

@coretemp

This is what I am seeing too:

a...........> copying path '/nix/store/fl3mcaqqk2vg0dmk01dfbs6nbm5skpzc-systemd-237' from 'https://cache.nixos.org'...
a...........> error: out of memory

The main problem I see is that it merely says "out of memory" instead of reporting how much it tried to allocate and how much was available before the allocation. Copying data should run in constant space, as others have already mentioned.

If compression raises the memory requirements beyond what is actually needed, that is a problem too, because it drives up hosting costs for no reason other than the initial deployment.

Before the deployment, at least 300 MB was available on host a.

@dtzWill
Member

dtzWill commented Apr 23, 2018

FWIW it looks like they do support streaming at least for fetches:

https://sdk.amazonaws.com/cpp/api/LATEST/index.html

(Near end, look for IOStreams).

Hopefully upload supports something similar.

Seconded re: xz compression taking that long. There's an option somewhere to enable parallel xz compression if you have idle cores. IIRC the result will be slightly bigger for the same compression level (see the example below).

Anyway, if someone tackled the API spelunking, would it be welcome? Or is there a reason it would have problems or be a bad idea?

EDIT: oops, I think we already use the stream thing, although at a glance it looks like we pull it all into a string; that seems resolvable. Anyway, fetching from S3 is probably not as important.
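
(Regarding the parallel xz option mentioned above: if I'm remembering the right knob, it's a parameter on binary cache stores, along these lines; the exact spelling and supported stores are worth checking against the manual for your Nix version:)

  $ nix copy --to 's3://example-bucket?parallel-compression=true' ./result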

@lheckemann
Member

As far as I can tell, the fixes in 2.0.1 still don't really fix the issue.

@edolstra
Member

edolstra commented May 3, 2018

@lheckemann IIRC we didn't cherry-pick any memory improvements in 2.0.1. You need master for some of the fixes or my experimental branch for the rest.

@lheckemann
Member

Oh, that would explain it! Any chance they could be included in a 2.0.2 release? There have been so many complaints about this issue on IRC, and I've run into it myself more times than I'd like.

@SebastianCallh

Does "nixops deploy" use this? I get out of memory during deploy, even though I have several gigabytes free (both on disk and working memory) which is odd. Just wondering if this is addressed here or should be investigated further.

@coretemp

@SebastianCallh you are not specifying which machine goes out of memory, so I assume you don't realize it refers to the machine you are deploying to. The solution to this is to add 512 MB of swap (see the sketch below).

Perhaps I might commit some of my changes that fix this in an AWS environment where t2.nanos are used, but only if people with commit access are interested in them.
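
(On NixOS, a minimal sketch of that swap workaround in configuration.nix; the size is in MiB, the file path is arbitrary, and NixOS creates the file if it does not exist:)

  swapDevices = [ { device = "/swapfile"; size = 512; } ];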

@SebastianCallh

@coretemp That was the machine I was referring to. The machine being deployed to has plenty of both disk and working memory to spare when the error occurs.

edolstra added a commit that referenced this issue May 30, 2018 (the same constant-memory daemon / TunnelSource commit message as above).
@domenkozar
Member

coretemp has since been banned, so we can unlock.

@nh2
Contributor

nh2 commented Jun 27, 2019

I have backported @edolstra's memory fixes to Nix 2.0.4 (because I'm still using that in one place):

2.0.4...nh2:nh2-2.0.4-issue-1681-cherry-pick

Note this fixes the case where the machine that's running nixops runs out of memory.

@edolstra edolstra reopened this Jun 27, 2019
@nh2
Contributor

nh2 commented Jun 27, 2019

I think this issue is solved in Nix 2.2, at least for my use cases (my RAM problems in nixops disappear with my backport, including #38808).

But it would make sense to ask the subscribers to this issue whether anyone has observed further nix copy or nix-copy-closure related memory problems since these commits landed.

If not, we can probably close this.

(There is still #2774, which reports the problem on 2.2 and is relatively recent.)

So, does anybody here still have memory problems with current Nix?

@AleXoundOS

AleXoundOS commented Jun 29, 2019

So, does anybody here still have memory problems with current nix?

I have.
I'm the author of #2774, and I've even slowly started writing my own solution to the problem of downloading a binary cache using a reasonable amount of RAM. Also, here at my work, the lack of a ready-to-use mirroring solution is the main issue currently preventing our company from using NixOS, since no internet connection is possible and everything needs to be downloaded beforehand.


@tazjin
Member

tazjin commented Oct 8, 2019

So, does anybody here still have memory problems with current nix?

Yes, on Nix 2.2.2 I'm still seeing several GB of memory usage when substituting large paths (e.g. GHC) from a cache (as part of a larger build). This is problematic for running Nixery on something like Cloud Run where memory is hard-capped at 2GB.

I haven't yet tried this with 2.3 to see if it makes a difference, but it's on the todo-list.

Edit: I won't be able to test this with 2.3 easily, as it no longer works in gVisor even with my SQLite connection patch. Might get around to more advanced debugging during the weekend ...

@nagisa

nagisa commented Dec 11, 2019

I have observed this when copying a locally built output to an HTTP cache:

nix copy --to 'http://localhost:3000' /nix/store/HASH-NAME-v0.1.0 --option narinfo-cache-negative-ttl 0 --option narinfo-cache-positive-ttl 0

and have observed nix copy consume approximately the same amount of memory as the data copied. That is, nix copy reported 8 GB for all the outputs it copied, and I have seen the nix copy process consume approximately as much.

The memory usage slowly but surely rises towards that number (and never goes down) as nix copy is compressing outputs.

@nagisa

nagisa commented Dec 11, 2019

I think what happens here is that nix copy stores the compressed result in memory and then sends it all out in one go, rather than streaming the data out as it compresses the nar.xz.

EDIT: nix version 2.3.1
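
(If that hypothesis is right, the fix is the usual buffering-to-streaming change: emit each compressed chunk as soon as the encoder produces it. A self-contained liblzma sketch of that shape, writing to a FILE * where the real code would feed the in-flight HTTP request; this is an illustration, not Nix's actual code:)

  // Sketch: streaming xz compression with bounded memory. Peak usage is
  // the two fixed buffers plus the encoder's own state, independent of
  // input size. Error handling kept minimal for brevity.
  #include <lzma.h>
  #include <cstdint>
  #include <cstdio>

  void xzStream(FILE * in, FILE * out)
  {
      lzma_stream strm = LZMA_STREAM_INIT;
      lzma_easy_encoder(&strm, 6 /* preset */, LZMA_CHECK_CRC64);

      uint8_t inbuf[64 * 1024], outbuf[64 * 1024];
      lzma_action action = LZMA_RUN;
      strm.next_out = outbuf;
      strm.avail_out = sizeof outbuf;

      while (true) {
          if (strm.avail_in == 0 && action == LZMA_RUN) {
              strm.next_in = inbuf;
              strm.avail_in = fread(inbuf, 1, sizeof inbuf, in);
              if (strm.avail_in == 0) action = LZMA_FINISH;
          }
          lzma_ret ret = lzma_code(&strm, action);
          if (strm.avail_out == 0 || ret == LZMA_STREAM_END) {
              // Emit this chunk immediately instead of accumulating it.
              fwrite(outbuf, 1, sizeof outbuf - strm.avail_out, out);
              strm.next_out = outbuf;
              strm.avail_out = sizeof outbuf;
          }
          if (ret == LZMA_STREAM_END) break;
      }
      lzma_end(&strm);
  }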

@stale

stale bot commented Feb 13, 2021

I marked this as stale due to inactivity. → More info

@AriFordsham

Is there a plan to fix this for nix copy?

@stale stale bot removed the stale label Aug 16, 2021
@Ericson2314
Member

I think it is fixed on master.

@AriFordsham

@Ericson2314 I mean the specific issue of copying --to file://, as documented in #2774. It doesn't seem to be fixed, even on master; I have recorded my measurements there.

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixosstag.fcio.net/t/code-of-conduct-or-whatever/2134/21

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2-nov-2021-discourse-migration-to-flying-circus/15752/7

@stale

stale bot commented May 2, 2022

I marked this as stale due to inactivity.

@stale stale bot added the stale label May 2, 2022
@fricklerhandwerk fricklerhandwerk added new-cli Relating to the "nix" command performance labels Sep 13, 2022