nixos/qemu-vm: Allow building a Nix store image instead of relying on 9p + more #140792

Merged
11 commits merged into NixOS:master from talyz:qemu-vm-build-root-image on Oct 28, 2021

Conversation

@talyz (Contributor) commented Oct 6, 2021

Motivation for this change

The goal of this PR is to speed up some IO-heavy tests by building a Nix store disk image instead of using 9p to access the host store.

To this end, the following things are done (a usage sketch follows the list):

  • In make-disk-image.nix:
    • Add the onlyNixStore argument, which allows building an image containing only the Nix store
    • Reintroduce the installBootLoader argument, which seemed to have been accidentally removed
    • Add the copyChannel argument, to control whether a default nix channel is copied to the image
    • Add the additionalPaths argument to allow additional paths to be copied to the image's nix store
  • In qemu-vm.nix:
    • Add the useNixStoreImage option, which controls whether an image is used rather than 9p
    • Rename the pathsInNixDB option to the more general additionalPaths, since its purpose is different when building an image
  • In trivial-builders.nix:
    • Add writeStringReferencesToFile, a builder which extracts a string's references to derivations and paths and writes them to a text file, removing the input string itself from the dependency graph
  • In testing-python.nix:
    • Make sure all derivations referenced by the test script are added to additionalPaths, since they need to be explicitly made available when building an image
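
A rough usage sketch of how these pieces fit together (not taken verbatim from the PR; the package and test script below are only illustrative):

    import ./make-test-python.nix ({ pkgs, ... }: {
      name = "io-heavy-example";

      nodes.machine = { ... }: {
        # Build an ext4 image containing the Nix store instead of sharing the host store over 9p
        virtualisation.useNixStoreImage = true;
        # Extra store paths to make available in the image (formerly pathsInNixDB)
        virtualisation.additionalPaths = [ pkgs.hello ];
        virtualisation.cores = 4;
      };

      testScript = ''
        machine.wait_for_unit("multi-user.target")
        machine.succeed("${pkgs.hello}/bin/hello")
      '';
    })

Roughly speaking, writeStringReferencesToFile turns the test script into a small text file that lists (and therefore references) exactly the store paths the script mentions, which is what lets those paths be added to additionalPaths without dragging the script text itself into the dependency graph.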

These changes are then put to use in the discourse and gitlab tests. With my laptop (AMD Ryzen 7 PRO 2700U) as the reference system, I see the following test run times:

  • Discourse

    • No change:
      25 mins, 49 secs

    • Building a Nix store image:
      4 mins, 44 secs

    • Building a Nix store image and bumping the core count to 4:
      3 mins, 6 secs

  • GitLab

    • No change:
      Times out after 28 mins

    • Building a Nix store image:
      7 mins, 48 secs

    • Building a Nix store image and bumping the core count to 4:
      7 mins, 17 secs

The times include the time it takes to build the image (~40 secs for discourse and ~1 min, 20 secs for gitlab).

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • 21.11 Release Notes (or backporting 21.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@roberth (Member) commented Oct 7, 2021

I wonder if something like fscache could be a solution. Potentially it gives the best of both worlds: not building an entire image, yet avoiding 9p for repeated accesses. Could you compare performance with cache=fscache instead of cache=loose?
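
For reference, a sketch of how that comparison might be run by locally editing the 9p host-store mount, assuming it is declared roughly like the following (the exact definition lives in qemu-vm.nix):

    # host store shared over 9p; the experiment is swapping cache=loose for cache=fscache
    fileSystems."/nix/store" = {
      device = "store";   # 9p mount tag of the host store share
      fsType = "9p";
      options = [ "trans=virtio" "version=9p2000.L" "cache=fscache" ];
      neededForBoot = true;
    };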

@talyz (Contributor, Author) commented Oct 7, 2021

@roberth Tried setting cache=fscache now: for discourse, it gave me a run time of 25 mins, 45 secs, so probably within error.

@mohe2015 (Contributor) commented Oct 7, 2021

I have no idea but could https://virtio-fs.gitlab.io/ be an alternative?

@talyz (Contributor, Author) commented Oct 7, 2021

@mohe2015 Yeah, I was considering it, but figured it wouldn't give quite the same performance boost, plus it needs a userspace daemon running on the host, so this seemed like the safer bet. It could certainly be interesting to try, though - if the benchmarks are correct, it should give a decent boost over 9p :)

@knedlsepp (Member) commented:
I'm curious about the downsides of this. Will this increase disk usage or network traffic on hydra or the binary-cache? Do we need to worry about that?

@andir (Member) commented Oct 8, 2021

> I have no idea but could https://virtio-fs.gitlab.io/ be an alternative?

That currently requires cooperation from the host OS, as virtiofs needs a daemon running as root (and a socket available to QEMU during execution).
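
For context, a rough sketch of the wiring virtio-fs would need, following the upstream howto (socket path, ids and tag here are only illustrative); the host-side daemon is exactly the part that needs root and host cooperation:

    # host, as root, outside the build sandbox:
    #   virtiofsd --socket-path=/tmp/vhostqemu -o source=/nix/store
    # guest side:
    #   mount -t virtiofs nix-store /nix/store
    virtualisation.qemu.options = [
      "-chardev socket,id=store,path=/tmp/vhostqemu"
      "-device vhost-user-fs-pci,chardev=store,tag=nix-store"
      # vhost-user requires guest RAM to be backed by shared memory
      "-object memory-backend-memfd,id=mem,size=1024M,share=on"
      "-numa node,memdev=mem"
    ];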

@mohe2015 (Contributor) commented Oct 8, 2021

Yeah, that was already mentioned above. As far as I can tell, this solution has the disadvantage of higher disk usage, but the performance is just so much better.

@talyz (Contributor, Author) commented Oct 8, 2021

@knedlsepp AFAIK, this should only increase disk usage on the builder. The disk image doesn't end up in the test's closure and I don't see why any of it should end up in the binary cache. Maybe someone with more insight into hydra can chime in, though :)

@talyz talyz requested a review from dpausp October 9, 2021 09:05
@talyz (Contributor, Author) commented Oct 15, 2021

Anything more to do here? Any objections to merging it?

@roberth (Member) left a review comment:

Found a potential simplification. Otherwise lgtm!

Review thread: nixos/lib/testing-python.nix (outdated, resolved)
@ofborg ofborg bot requested a review from roberth October 15, 2021 17:31
@talyz (Contributor, Author) commented Oct 28, 2021

Okay, now it's done!

shared = { source = ''"''${SHARED_DIR:-$TMPDIR/xchg}"''; target = "/tmp/shared"; };
nix-store = mkIf (!cfg.useNixStoreImage) {
source = builtins.storeDir;
target = "/nix/store";
A reviewer (Member) commented on the diff:

Suggested change:
- target = "/nix/store";
+ target = "/nix/store"; # should be storeDir, but is not supported in scripts yet

@talyz (Contributor, Author) replied:

Well, technically, it should be set to whatever storeDir was set to in the nix derivation used in the vm, which currently isn't exposed, but could be added to its passthru attributes.
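
For illustration, the exposure described here might look something like this (hypothetical, not part of the PR):

    # in the nix package expression:
    passthru = { inherit storeDir; };
    # in qemu-vm.nix, instead of the hard-coded string:
    target = config.nix.package.storeDir or "/nix/store";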

The reviewer (Member) replied:

If it was possible, that would be the solution, but you can't choose storeDir from inside the Nix language.

@talyz (Contributor, Author) replied:

Oh yeah, you're right. I guess the reasonable thing to do then is just un-hardcode all the values.

@ofborg ofborg bot requested a review from roberth October 28, 2021 10:38
@yu-re-ka yu-re-ka merged commit 33b3dae into NixOS:master Oct 28, 2021
@talyz talyz deleted the qemu-vm-build-root-image branch October 28, 2021 10:56
roberth added a commit to hercules-ci/nixpkgs that referenced this pull request Oct 31, 2021
…n useNixStoreImage = true

The involved test was nixosTests.nextcloud.basic21.

It has a testScript that is strict in nodes.nextcloud.config.system.build.vm,
in assertions about imagemagick being in the system closure.

The recursion was introduced in 329a446 from
NixOS#140792
@roberth (Member) commented Oct 31, 2021

While #144014 breaks the infinite recursion when useNixStoreImage is disabled, the same could occur in a test that does use it. We can use two basic techniques:

  • shadow known bad config values using // and whatnot. This allows the expression to fail before it would enter into infinite recursion.
  • don't pass the real nodes into testScript, but a very similar set of nodes with additionalPaths forced to [] via mkForce.

The first has no performance impact. The second does.
If we combine them, we only have a performance impact when necessary, but we lose a bit of safety when it comes to new options in the future.
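
A minimal sketch of the first technique (names are hypothetical, not from the PR), shadowing a known-bad value in the nodes handed to testScript so evaluation fails loudly instead of recursing:

    # the nextcloud case recursed through nodes.<name>.config.system.build.vm,
    # so shadow that attribute in the copy of the nodes given to the test script
    nodesForTestScript = lib.mapAttrs
      (name: node: node // {
        config = node.config // {
          system = node.config.system // {
            build = node.config.system.build // {
              vm = throw "testScript must not force system.build.vm when useNixStoreImage is set";
            };
          };
        };
      })
      nodes;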

@talyz (Contributor, Author) commented Nov 1, 2021

@roberth It seems like hydra doesn't want to build the disk images due to their size. Raising the max_output_size is probably the solution. Do you know if this is doable? Also, do you know if the disk image would end up in the cache even though it's not part of the final closure?

@roberth (Member) commented Nov 1, 2021

> Raising the max_output_size

This may be a hard sell.

> if the disk image would end up in the cache even though it's not part of the final closure?

Not 100% sure, but I believe it caches everything it builds. You generally don't want to throw away your tools.

@talyz (Contributor, Author) commented Nov 1, 2021

> Raising the max_output_size
>
> This may be a hard sell.

A shame if it's true, since that would mean all tests using this will always fail :/

> if the disk image would end up in the cache even though it's not part of the final closure?
>
> Not 100% sure, but I believe it caches everything it builds. You generally don't want to throw away your tools.

True, but I would have guessed that all tools we want to keep are primary inputs anyway. It would be interesting to know if there are any attributes that control this, like preferLocalBuild or something.

@talyz (Contributor, Author) commented Nov 2, 2021

Yep, everything ends up in the cache, it seems. At least I'm able to fetch stuff that shouldn't be referenced by any build's final closure. Unless the outputs are GC'd at some point, they would eat up quite a lot of disk space and be essentially useless.

@roberth (Member) commented Nov 2, 2021

A potential solution is to override this setting in release.nix.
There's a bunch of wiring in between that needs to pass this bit of info along, but that could be done.
Invocations via pkgs.nixosTests would be unaffected by this, but it only works as long as Hydra doesn't build package tests (which it eventually should: #136026).
Maybe it needs to be a developer-only feature for now :(

@flokli (Contributor) commented Nov 2, 2021 via email

@talyz (Contributor, Author) commented Nov 2, 2021

Well, currently, hydra fails to build the disk images since they are larger than max_output_size, so there shouldn't be any need to revert in order to save space. If we want the tests to be able to build on hydra, however, something needs to change. To me, it feels like the proper solution would be to add an attribute which tells hydra not to bother caching the output, dontCache or something. I'm not sure how feasible that would be, though.

It's all a bit ironic, since making this build better on hydra was one of my goals with this: the GitLab aarch64 tests were almost constantly timing out and the x86_64 test required ~20 mins to run. At least it made the situation better on ofborg, where both aarch64 and x86_64 timed out previously.

@kvtb (Contributor) commented Feb 23, 2022

9p read performance is not so bad; what hurts in those Ruby-on-Rails cases is opendir and stat.
NixOS might add some extension to 9p to improve this, given the read-only and same-timestamp properties of the Nix store.
Once upon a time there was a NixOS patch to the 9p kernel module.

@kvtb (Contributor) commented Feb 23, 2022

As for virtiofs requiring root...
Well, nix build also requires root, either directly or with the help of nix-daemon running as root.
So nix-daemon could set up virtiofs on request, just like the libvirt daemon does.

@roberth (Member) commented Jul 12, 2022

Another possibility is to create an image not for the entire system, but only for the paths that matter for performance.
The /nix/store would then be an overlay of a block filesystem on top of 9p. This can reduce the size of the filesystem.
It also removes the need for putting the testScript dependencies in an image, solving the recursion problem more elegantly and without eval overhead.
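
To make that concrete, a rough sketch of the mount layout such a setup might use (paths and device names are illustrative, not an implementation from this thread):

    # small image with only the performance-critical paths, plus the full host store via 9p
    fileSystems."/nix/.hot-store" = { device = "/dev/vdb"; fsType = "ext4"; };
    fileSystems."/nix/.host-store" = {
      device = "store";
      fsType = "9p";
      options = [ "trans=virtio" "version=9p2000.L" "cache=loose" ];
    };
    # read-only overlay: lookups hit the image first and fall back to 9p
    fileSystems."/nix/store" = {
      device = "overlay";
      fsType = "overlay";
      options = [ "lowerdir=/nix/.hot-store/nix/store:/nix/.host-store" ];
    };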

When the filesystem is smaller, it may also be feasible to generate it at the last minute, in the test runner, so that it doesn't have to be stored in the store. If the filesystem generation is quick enough, that'd be worthwhile. This can be done with mkfs.ext4 -d, which will be faster than make-disk-image.nix because it lacks virtualisation overhead, and even faster than make-ext4-fs.nix.

Instead of virtualisation.useNixStoreImage = true; we'd say

virtualisation.copyStorePaths = [ config.services.gitlab.package ];

So our options for improving this are:

  • Optimize 9p directory operations in Linux.
  • Add virtiofs support to nix-daemon and the test driver. This breaks in a non-root sandbox, so 9p would still have to be supported as a fallback. Deciding at eval time is bad UX; deciding in the sandbox is impure, and generating an fstab for it would probably be tricky.
  • Put only part of the system closure in the image.

@roberth (Member) commented Jul 14, 2022

I've done some exploration with adding an overlay containing a subset of derivations, with squashfs and ext4.

Conclusion so far: a non-stored squashfs is a viable replacement for a stored ext4 image in the gitlab test, but not quite as efficient as what could be achieved by patching mkfs.ext4 -d.

Squashfs work in #181537.

Findings

  • a last-minute squashfs is feasible. Costs ~6s on my system for a 2G+ gitlab image with many .rb files
  • an overlay with an image containing only the gitlab packages is 40% slower than an image containing the entire store
  • ext4 has an option to load a directory into a filesystem image without virtualisation, but it doesn't support the selection of individual paths like gensquashfs and archivers do
  • squashfs has flags that optimize not just image creation but runtime performance. Compression gave a 25% slowdown.
  • a squashfs image containing the whole system toplevel seems to be equally fast as squashing only the gitlab packages
  • simulating an ext4 construction with mkfs.ext4 -d ./copied-paths (i.e. not counting the copying of paths needed to work around the missing optimization in mkfs.ext4 -d) seems about 4× slower than mksquashfs (6 s → 23 s). This is still ~100 s faster than the overlay approach (about 25%), and only 8% slower than a saved ext4 image.

Open questions:

  • How about a complete squashfs image with all store paths? The ext4 numbers suggest that savings are possible.
  • How do these performance numbers hold up on the hardware of ofborg and hydra.nixos.org, or contributors' hardware? I think I'm willing to make assumptions though, as long as a typical contributor laptop benefits from the change.
  • How does all of this compare to possible performance optimizations in the 9p kernel module?
  • How does squashfs perform without an overlay? An optimization that applies when writableStore = false.

My data and some notes:

ext4 store image (this PR baseline)
277 seconds as reported by the test driver
+ ? seconds to build the image

9p host store + reversed overlay (ineffective) (probably close to baseline before this PR)
1198 seconds as reported by the test driver

9p host store + squashfs overlay
524 seconds as reported by the test driver
+ ? seconds to build the image

9p host store + uncompressed squashfs overlay
416 seconds as reported by the test driver
+ 6 seconds to build the image

9p host store + erofs overlay (default options)
386 seconds as reported by the test driver
+ 5 seconds to build the image

erofs (writableStore = false)
241 seconds as reported by the test driver
+ 7 seconds to build the image

9p host store + ext4 overlay
393 seconds as reported by the test driver
+ 23 seconds to build the image, excluding cp -r storepaths .

9p host store + uncompressed squashfs overlay with entire toplevel in the squashfs
410s including 5s for building the image

Hardware:

AMD Ryzen 9 3900 12-Core
ssd: SAMSUNG MZQLB1T9HAJR-00007
host store: btrfs

jsoo1 pushed a commit to awakesecurity/nixpkgs that referenced this pull request Oct 19, 2022
jsoo1 pushed a commit to awakesecurity/nixpkgs that referenced this pull request Oct 21, 2022