
Chroot builds are slow #179

Closed · edolstra opened this issue Dec 2, 2013 · 99 comments

@edolstra (Member) commented Dec 2, 2013

Chroot builds have a significant overhead. For instance, this expression:

```nix
with import <nixpkgs> {};
with lib;
let deps = map (n: runCommand "depM-${toString n}" {} "touch $out") (range 1 100);
in runCommand "foo" { x = deps; } "touch $out"
```

(i.e. a trivial build with 100 trivial dependencies) takes 4.7s to build on my laptop without chroot, but 39.6s with chroot.

The main overhead seems to be in setting up the chroot (i.e. all the bind mounts), but the various CLONE_* flags also have significant overhead.

Unfortunately, there is not much we can do about this since it's all in the kernel, but it does mean we can't enable chroot builds by default on NixOS.

This is on Linux 3.4.70.
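
For context, the setup work being measured looks roughly like the sketch below (hypothetical code, not Nix's actual implementation): new namespaces via the CLONE_* flags, then one bind mount per input store path, so setup cost grows with the size of the input closure.

```c
/* Hypothetical sketch of per-build sandbox setup, not Nix's actual code:
 * new namespaces plus one bind mount per input store path. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>

int setup_sandbox(const char *chrootDir, const char *inputs[], int n) {
    /* Each CLONE_* flag asks the kernel to build a fresh namespace. */
    if (unshare(CLONE_NEWNS | CLONE_NEWIPC | CLONE_NEWUTS) == -1) {
        perror("unshare");
        return -1;
    }
    /* One bind mount per input path: O(closure size) mount() calls,
     * which is where the 100-dependency example above pays its cost.
     * (Assumes the target directories already exist under chrootDir.) */
    for (int i = 0; i < n; i++) {
        char target[4096];
        snprintf(target, sizeof target, "%s%s", chrootDir, inputs[i]);
        if (mount(inputs[i], target, NULL, MS_BIND, NULL) == -1) {
            perror("mount");
            return -1;
        }
    }
    return 0;
}
```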

@vcunat (Member) commented Dec 2, 2013

Oh, not good. Are there some other standard sandboxing options? (except for LD_PRELOADing some libc hooks)

@edolstra (Member, Author) commented Dec 2, 2013

I can imagine a cheaper chroot that just bind-mounts the entire Nix store. Maybe we could even put an ACL on /nix/store to deny "r" but not "x" permission to nixbld users. That way builds can only access store paths that they already know.

Also, maybe it's faster on newer kernels.
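
A coarse chmod(2) analogue of that ACL idea (hypothetical; a real version would set a POSIX ACL for the nixbld group rather than dropping the read bit globally):

```c
/* Sketch: 'x' without 'r' on a directory allows lookup of names you
 * already know, but forbids listing the directory with readdir(). */
#include <stdio.h>
#include <sys/stat.h>

int main(void) {
    /* 0711: owner rwx; group/other keep traversal ('x') but lose 'r'. */
    if (chmod("/nix/store", 0711) == -1) {
        perror("chmod /nix/store");
        return 1;
    }
    return 0;
}
```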

@vcunat (Member) commented Dec 2, 2013

Well, I don't think packages try to find things by listing /nix/store. In general we could deny "r" on it for everyone, but I fail to see any significant gain.

Maybe providing some cheap variant of chroot by default would be a good compromise (with the possibility of switching to stronger guarantees).

@peti (Member) commented Dec 16, 2013

The benefits of chrooted builds are far more significant than the performance cost. Chroot builds should absolutely be enabled on NixOS by default!

@alexanderkjeldaas commented Mar 5, 2014

Are the bind mounts done in parallel?

@edolstra (Member, Author) commented Mar 5, 2014

No.

@domenkozar (Member) commented Oct 28, 2014

@edolstra I'd still prefer purity/determinism over performance and enable chroots on Linux by default.

@domenkozar (Member) commented Oct 28, 2014

On my machine (SSD, running kernel 3.14):

```
real 0m27.129s
user 0m0.139s
sys  0m0.038s
```

@vcunat (Member) commented Oct 30, 2014

@iElectric: are you sure your measurement is correct? It shows mostly waiting and no real work. Or is that because the work is in fact done in another process?

@wmertens (Contributor) commented Oct 30, 2014

👍 for a mini chroot that has all of the Nix store. This could be reused, no? The same chroot for all builds?

@domenkozar (Member) commented Oct 30, 2014

@vcunat I'd say most of the time it's waiting for nix-daemon IO.

@aristidb (Contributor) commented Jan 7, 2015

Computers have become faster in the past 2 years. We should re-evaluate whether the speed is really worth the significant impurities.

Note that the default NixOS Hydra not using chroot leads to packages "randomly" failing to build locally for those who do use it.

So at least Hydra should enable it.

@edolstra (Member, Author) commented Jan 7, 2015

Hydra does use it. It's the other way around: users like me might not have it enabled and think that a package builds properly when it doesn't. (This happened today with a PHP update, which turned out to do a download during its build.)

@domenkozar (Member) commented Jan 7, 2015

Yes, leaving our determinism promise aside for the sake of avoiding some small overhead.

@benley (Member) commented Jan 7, 2015

When nix sets up chroots, is most of the time spent setting up bind mounts? Or does it do a lot of file copying too? If the latter, have you considered using something like Cowdancer (http://www.netfort.gr.jp/~dancer/software/cowdancer.html.en) to get copy-on-write bind mounts? It's low-overhead and fast to set up. Debian uses it in cowbuilder/pbuilder, which makes for an excellent ephemeral-chrooted build system.

@vcunat (Member) commented Jan 7, 2015

@benley: COW isn't needed, as everything accessible in the chroot is read-only anyway. From the comments it seems no one has analyzed precisely what the main cost is, but bind mounts are suspected (and they were probably never meant to be used this massively).

@copumpkin (Member) commented Jan 16, 2015

Has anyone looked into proot for this purpose?

@edolstra (Member, Author) commented Jan 19, 2015

"PRoot uses the CPU emulator QEMU user-mode to execute transparently guest programs." I doubt that's faster than bind mounts :-)

@vcunat (Member) commented Jan 19, 2015

Yeah, PRoot might be faster to set up, but it sounds significantly slower for running longer builds (which happen a lot). Various other preloading solutions might also slow down system calls, although probably not as much.

@copumpkin (Member) commented Jan 19, 2015

@edolstra Oh sorry, my understanding was that it only used QEMU when the guest was of a different architecture.

@benley (Member) commented Jan 19, 2015

I believe proot only uses qemu when it's running binaries from a non-native architecture. The proot website is fairly clear about that, unless I'm badly misinterpreting it: http://proot.me/

@benley (Member) commented Jan 19, 2015

It does still intercept system calls in userland, and it's going to have some unavoidable speed overhead.

@alexanderkjeldaas commented Jan 20, 2015

Isn't it documented to use ptrace? If so, it will signal the controlling process and wait for a command on every syscall that is intercepted.

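For illustration, a minimal ptrace-based tracer showing the mechanism in question: the tracee stops at every intercepted syscall, at both entry and exit, and each stop is a round trip through the tracer (a hypothetical sketch; PRoot's actual implementation is more involved).

```c
/* Count the syscall-boundary stops incurred while tracing /bin/true. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("/bin/true", "true", (char *)NULL);
        _exit(127);
    }
    int status;
    long stops = 0;
    waitpid(child, &status, 0);  /* stop at the initial exec */
    while (1) {
        /* resume until the next syscall entry or exit... */
        ptrace(PTRACE_SYSCALL, child, NULL, NULL);
        /* ...and block until the child stops there again */
        waitpid(child, &status, 0);
        if (WIFEXITED(status))
            break;
        stops++;
    }
    printf("ptrace stops for /bin/true: %ld\n", stops);
    return 0;
}
```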

@wmertens (Contributor) commented Jan 22, 2015

My sophisticated web searches (i.e. "proot benchmark") didn't turn up anything. Has anybody tried it yet?


@cedric-vincent commented Feb 24, 2015

Hello all,

I confirm that PRoot uses QEMU only to run non-native binaries, and that it is currently based on ptrace, which is known to cause a significant slowdown. However, to decrease this slowdown as much as possible, PRoot uses process_vm_readv/process_vm_writev (available on Linux 3.2+) and seccomp mode 2 (available on Linux 3.5+). For information, here are the figures I published when I enabled seccomp mode 2 in PRoot:

https://github.com/cedric-vincent/PRoot/blob/v5.1.0/doc/proot/changelog.txt#L510

My suggestion is to give PRoot a try if your kernel version is 3.5 or greater, and if it's not too difficult to replace the calls to "chroot" and "mount --bind" in your scripts with a call to "proot". If PRoot is not fast enough, that will likely be fixed in the future using kernel namespaces (available on Linux 3.8+).

Regards,
Cedric.
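
To illustrate the seccomp-mode-2 optimization mentioned above: a BPF filter can flag only the syscalls the tracer cares about (SECCOMP_RET_TRACE) and let everything else run natively, so the two-stops-per-syscall cost is paid only for intercepted calls. A hypothetical sketch of the general mechanism, not PRoot's actual code, intercepting only open(2):

```c
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

/* Flag open(2) for a ptrace-based tracer; allow all other syscalls
 * to run at full speed without stopping the tracee. */
static int install_filter(void) {
    struct sock_filter filter[] = {
        /* load the syscall number */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
                 offsetof(struct seccomp_data, nr)),
        /* if nr == SYS_open: return TRACE; else: return ALLOW */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_open, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRACE),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof filter / sizeof filter[0],
        .filter = filter,
    };
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) == -1)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

int main(void) {
    if (install_filter() == -1) {
        perror("seccomp");
        return 1;
    }
    /* from here on, only open(2) wakes the tracer */
    return 0;
}
```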

@Ericson2314 (Member) commented Sep 29, 2015

Seems like using Linux namespaces would dovetail with the pure Darwin stdenv work. All the better if they are faster than chroots.

@copumpkin (Member) commented Sep 29, 2015

They actually already use Linux namespaces; "chroot" is a bad name for them.

@benley (Member) commented Sep 29, 2015

Heh, in that case NixOS should call them Containers and pick up some buzzword publicity points. "Build all the things in containers!" containers containers containers containers containers. ;-)

@nh2 (Contributor) commented Jun 14, 2018

Then I'm missing the problem somehow; doesn't building a derivation usually take much longer than the 24 ms measured here?

@zimbatm (Member) commented Jun 14, 2018

It depends on the type of derivation. writeText and writeScript, for example, are fast, and there the overhead is not negligible. If Nix wants to compete with project-level build systems like Bazel, this is going to be a limitation.

@ryantrinkle

This comment has been minimized.

Copy link
Contributor

commented Jun 14, 2018

@nh2 Currently, most derivations are large, but there are many situations where breaking things up more finely would be good. For example, I use Nix for precompressing assets to be served in my company's web apps. To do this incrementally, it makes sense to have a derivation per file (or even per (file, compression method) pair), but for small files that's quite slow. I doubt the sandboxing is the only overhead there, but it is definitely a non-trivial factor, given that, e.g., gzipping a 3 kB file takes ~1 ms.

@zimbatm (Member) commented Jun 15, 2018

It would be good to look at how Bazel does it as they are facing similar problems.

@nh2 (Contributor) commented Jun 15, 2018

Thanks for the explanations!

Maybe we should use a selective approach until Linux namespace setup is very fast. While sandboxing every derivation is certainly desirable, it would already be a huge benefit if we could, for starters, sandbox "the average build" of nixpkgs libraries and applications. For example, I'd be very happy to pay a 24 ms overhead if in turn my 5-hour Chromium build is guaranteed to be pure. But right now it's full sandboxing or no sandboxing.

Another point: the nsenter benchmark at #179 (comment) measures a 4 ms mean time. However, that is so close to Linux's process startup overhead that the number is probably not very meaningful. For example, just printing the help text with time nsenter --help > /dev/null takes anywhere between 1 and 4 ms on my computer.

We should probably benchmark whatever nsenter does in a loop in C to get meaningful numbers for that.
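
Something like the following, a sketch of that in-a-loop benchmark, timing unshare(CLONE_NEWNS) itself (needs root, since creating mount namespaces is privileged):

```c
/* Time the syscall that sandbox setup starts with, without the
 * 1-4 ms fork/exec floor measured above. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <time.h>

int main(void) {
    enum { N = 1000 };
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < N; i++) {
        /* each call creates (and joins) a fresh mount namespace */
        if (unshare(CLONE_NEWNS) == -1) {
            perror("unshare");
            return 1;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double us = ((t1.tv_sec - t0.tv_sec) * 1e9 +
                 (t1.tv_nsec - t0.tv_nsec)) / 1e3 / N;
    printf("unshare(CLONE_NEWNS): %.1f us/call\n", us);
    return 0;
}
```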

@ryantrinkle (Contributor) commented Jun 15, 2018

FWIW, here are the results on my machine (the same one I used for the prior benchmarks) for nsenter --help >/dev/null:

```
» nix run -f channel:nixos-unstable bench -c bench "nsenter --help >/dev/null" -o unshare.html
[4 copied (3.8 MiB), 11.5 MiB DL]
benchmarking nsenter --help >/dev/null
time                 3.108 ms   (3.088 ms .. 3.126 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 3.222 ms   (3.188 ms .. 3.289 ms)
std dev              161.1 μs   (93.24 μs .. 275.8 μs)
variance introduced by outliers: 31% (moderately inflated)
```

And here it is for true:

```
benchmarking true
time                 2.470 ms   (2.452 ms .. 2.492 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 2.456 ms   (2.445 ms .. 2.469 ms)
std dev              37.67 μs   (31.60 μs .. 46.29 μs)
```

So sandboxing is about an order of magnitude slower than running a minimal command. I definitely agree that this amount of time is not important for most use cases today.

@zimbatm (Member) commented Jun 16, 2018

And building a stdenv.mkDerivation is also going to execute bash, which stat(2)s rc and profile files all over the place.

@edolstra (Member, Author) commented Aug 2, 2018

@zimbatm Yes, but Nix does not require the use of stdenv.mkDerivation.

BTW on Linux 4.17 I get a 37% slowdown in the test mentioned in #179 (comment). That's a big improvement over the 742% slowdown in 2013...

@copumpkin (Member) commented Aug 2, 2018

Any idea what a good threshold for "acceptable" would be? I doubt it'll ever be zero-cost, but purity-by-default is a big win IMO and I'd be willing to pay a slight cost for it. Especially since the benchmark you're citing mostly affects tiny derivations, not big builds. Even one smallish build will completely eclipse a ton of small slowdowns on unit files and NixOS configuration files.

@zimbatm (Member) commented Aug 3, 2018

Having sandboxing turned on by default would be great. It would reduce the number of nixpkgs submissions that don't compile, and the number of user reports, and we'd be able to trim the PR and issue templates. That being said, if Nix is running inside a Docker container it won't work, as Docker containers don't support cgroups by default.

Back on the subject of sandboxing: is it possible to re-use sandboxes between runs? If sandboxes could be re-used, they could also be pooled, with pool size = maxJobs.

@Ericson2314 (Member) commented Aug 3, 2018

We must be sound now. We must compete with the likes of Bazel on granularity soon. That's how I see it.

@edolstra (Member, Author) commented Aug 3, 2018

@copumpkin I think the 37% slowdown is okay-ish, though obviously not ideal.

@zimbatm No, I don't think sandboxes can be reused. The main overhead seems to be setting up the mount namespace, which is necessarily different for each build (since they have different input closures). Of course, you could bind-mount all of /nix/store, but that reduces security a bit (since even if it has -wx permission, builders would be able to access path contents if they know the store path).

@7c6f434c (Member) commented Aug 3, 2018

(comment minimized)

@ryantrinkle (Contributor) commented Aug 3, 2018

@7c6f434c Good question! I think we would need to benchmark bind mounting to see.

@zimbatm (Member) commented Aug 3, 2018

Another motivation to enforce sandboxing is that we could get rid of the nixbld\d+ users. Each sandbox gets its own PID namespace, so they could all run with the same uid/gid. That would greatly limit the footprint Nix has on non-NixOS systems.

@7c6f434c (Member) commented Aug 3, 2018

(comment minimized)

@edolstra (Member, Author) commented Aug 3, 2018

In ad1c827 I implemented automatic UID allocation from a range. You would still want to ensure that the UID range doesn't clash with any existing accounts, though it's unlikely people have UIDs in the range 872415232+...
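
As background, user namespaces can decouple the uid seen inside the sandbox from whatever outer uid was allocated, via /proc/self/uid_map. A minimal, hypothetical sketch of that mechanism (not the code in ad1c827):

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    uid_t outer = getuid();
    if (unshare(CLONE_NEWUSER) == -1) {
        perror("unshare");
        return 1;
    }
    /* deny setgroups first (required before gid_map since Linux 3.19) */
    int fd = open("/proc/self/setgroups", O_WRONLY);
    if (fd >= 0) { write(fd, "deny", 4); close(fd); }
    /* map our outer uid to uid 1000 inside the namespace */
    char buf[64];
    int n = snprintf(buf, sizeof buf, "1000 %u 1\n", (unsigned)outer);
    fd = open("/proc/self/uid_map", O_WRONLY);
    if (fd < 0 || write(fd, buf, n) != n) {
        perror("uid_map");
        return 1;
    }
    close(fd);
    printf("inside namespace: uid=%u\n", (unsigned)getuid()); /* 1000 */
    return 0;
}
```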

@7c6f434c (Member) commented Aug 4, 2018

Well, if the range is configurable it should be easy to move outside the ranges used by other tools; definitely simpler than listing eight build users in global passwd. Thanks.

@domenkozar (Member) commented Nov 2, 2018

@edolstra I wanted to enable sandboxing for Docker after @garbas's talk, but really that road leads back to Nix doing it by default for a good overall experience. Given that we're at the okay-ish threshold now, and that kernel 4.19, which will be the next LTS, has been released, can we turn sandboxing on by default? :)

@copumpkin (Member) commented Nov 2, 2018

Why Docker? I missed the talk, but intuitively it feels like a step backwards.

@domenkozar (Member) commented Nov 2, 2018

@copumpkin Just to enable sandboxing for https://github.com/NixOS/docker, since it helps sandbox networking during Nix builds.

@zimbatm (Member) commented Nov 2, 2018

Does Nix sandboxing work inside Docker now?

@dtzWill (Contributor) commented Nov 3, 2018

(comment minimized)

@copumpkin (Member) commented Nov 3, 2018

Oh sorry, I misread and thought you wanted to change our sandboxing mechanism to use Docker, rather than get our sandboxing to work inside a Docker container 😄 sorry!

@dtzWill (Contributor) commented Nov 3, 2018

(comment minimized)

@copumpkin (Member) commented Nov 3, 2018

oh, I see, thanks!

@edolstra closed this in 812e393 on Nov 7, 2018

@domenkozar (Member) commented Nov 7, 2018

🎉

@Ericson2314 (Member) commented Apr 10, 2019

#2759 is a wildly different idea for maybe-even-faster sandboxing.
