Lazy trees v2 #13225
Conversation
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/determinate-nix-3-5-introducing-lazy-trees/64350/7
This adds a setting `lazy-trees` that causes flake inputs to be "mounted" as virtual filesystems on top of /nix/store as random "virtual" store paths. Only when the store path is actually used as a dependency of a store derivation do we materialize ("devirtualize") the input by copying it to its content-addressed location in the store.

String contexts determine when devirtualization happens. One wrinkle is that there are cases where we had store paths without proper contexts, in particular when the user does `toString <path>` (where `<path>` is a source tree in the Nix store) and passes the result to a derivation. This usage was always broken, since it can result in derivations that lack correct references. But to ensure that we don't change evaluation results, we introduce a new type of context that results in devirtualization but not in store references. We also now print a warning about this.
Running the test suite uncovered a couple of failures: https://github.com/NixOS/nix/actions/runs/15130602489/job/42530714771?pr=13235
Particularly nixpkgsLibTests.
Using randomness or even fingerprints for placeholders makes the language non-deterministic and impure.
Some things we can do, in approximate order of increasing cost/benefit:
- copy to the store on demand (and concurrently)
- have lazy path values
- have rope-structured strings with lazily computed fixed-length substrings to represent lazy hashing
- extend the language with weaker interfaces that perform better, e.g. as described in #10689 (comment)
With those solutions in mind, I don't think we have to ruin the language semantics with any kind of "entropy", including fingerprints.
Path equality and ordering (as observable to users) must remain identical, and we have plenty of options for making it lazier without sacrificing determinism or compatibility.
@roberth I understand your concerns, but I think it would be hard to make improvements if changes are not allowed incrementally. I also think that having large pull requests lingering negatively impacts other pull requests, where we become too afraid to apply large-scale refactorings. Would you be fine with the setting being replaced by a compile-time flag?
How about guarding it behind an experimental flag and letting more people use it first?
It's already a flag that is not enabled by default, so that should be covered. Commonly we use experimental flags for new features that might change over time. This flag should not introduce any new features (though potentially bugs), so I don't think it needs to be experimental in this case. The documentation of the flag should probably mention the current randomness, however.
@Mic92 Incremental improvements are allowed, but this is not an improvement. If any contributor, including Nix team members, wants to make breaking changes to the language, they should discuss that with the team first, and it is unlikely to result in an actual change to the language, as Nix promises reproducible behavior on old expressions. Large PRs are ok if they're refactors, especially if they're previously agreed upon. However, this PR is not a refactor. It changes the semantics of the Nix language significantly, and for worse.
What would be the purpose, except enabling a bug? Any kind of flag, compile time or otherwise, has a real cost in terms of maintenance. Also we shouldn't be seducing users into bad workarounds. We've seen how that ends - or, I guess, doesn't end. I appreciate that you try to find a middle ground @Mic92, but this is not a negotiation. We don't balance out insufficient work with personal authority or upvotes. Good work gets merged.
    fetchers::Input & input, const fetchers::Input & originalInput, ref<SourceAccessor> accessor, bool requireLockable)
{
    auto storePath = settings.lazyTrees
        ? StorePath::random(input.getName())
For context this is the line the concern is about.
Path equality and ordering (as observable to users) must remain identical, and we have plenty of options for making it lazier without sacrificing determinism or compatibility.
Concretely, an easy incremental solution would be to just hash the source without copying, and eat the cost of that. It may still avoid the more expensive copying operation.
By producing the correct string in the first place, we never expose a wrong string at all.
More optimizations can be applied later, with the usual condition that they don't regress the language semantics.
(e.g. the list I commented in my review, or something else)
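The "hash the source without copying" step can be sketched as follows. This is a simplified deterministic tree hash for illustration only, not Nix's actual NAR serialization, and the function name is made up:

```python
import hashlib
import os

def tree_hash(path: str) -> str:
    """Hash a directory tree in place, without copying it anywhere.

    Entries are visited in sorted order, so the result depends only on
    the tree's contents, not on filesystem enumeration order.
    """
    h = hashlib.sha256()
    if os.path.isdir(path):
        h.update(b"dir\0")
        for name in sorted(os.listdir(path)):
            h.update(name.encode() + b"\0")
            h.update(tree_hash(os.path.join(path, name)).encode())
    else:
        h.update(b"file\0")
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 16), b""):
                h.update(chunk)
    return h.hexdigest()
```

Since the hash is derived purely from the contents, two checkouts of the same tree get the same identifier, which is the determinism property being argued for here.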
For debugging, behind a flag, it might be interesting to put an `X` instead of the last Nix32 hash character, indicating that some code path didn't realise its string context as it should have. Other than that, I see no use for string replacements, because of the whole language-semantics thing: we can't have an observable wrong string, so there won't be anything to replace beyond what we're doing for ca-derivations.
Would it be beneficial to use a "dummy" hash, like all `a`'s, here instead of something random? The names all come from the flake inputs, right? So we would end up with them being unique anyway, unless I am misunderstanding this section, which is quite possible. We would simply have to keep the dummy hash stable to maintain compatibility, then.
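That idea, sketched in Python (the function and constant names are illustrative, not Nix's API; only the 32-character digest drawn from the nix32 alphabet mirrors real store paths):

```python
# The "nix32" alphabet used in store path digests (no e, o, u, t).
NIX32 = "0123456789abcdfghijklmnpqrsvwxyz"
DUMMY_DIGEST = "a" * 32  # stable placeholder; 'a' is a valid nix32 character
assert all(c in NIX32 for c in DUMMY_DIGEST)

def virtual_store_path(input_name: str) -> str:
    """Deterministic virtual store path: uniqueness comes from the
    input name alone, so no randomness enters evaluation and repeated
    evaluations see identical strings."""
    return f"/nix/store/{DUMMY_DIGEST}-{input_name}"
```

Note the trade-off: two distinct inputs that happened to share a name would map to the same virtual path, so this only works if input names are unique, as suggested above.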
Concretely, an easy incremental solution would be to just hash the source without copying, and eat the cost of that. It may still avoid the more expensive copying operation. By having the correct string in the first place, we're not exposing the wrong string in the first place.
Still seems quite slow compared to the 2s I get with the current implementation:
% time find $HOME/git/nixpkgs -type f | xargs cat >/dev/null
find $HOME/git/nixpkgs -type f 0.18s user 1.36s system 8% cpu 18.286 total
xargs cat > /dev/null 0.37s user 8.25s system 43% cpu 19.634 total
And nixpkgs is not even the biggest repository. For people doing data-science or game development, asset sizes can be way beyond what we have in nixpkgs (i.e. 40G for datasets/game assets).
Could the hashing be cached by `mtime`? `stat`-ing a big monorepo is still slow, but much faster than hashing it.
If the content is a Git repository, could we reuse the information in the Git index to hash faster?
Yes and no.
Yes to speeding up our internal "git status" call. See this comment for further improving the performance of that.
A NAR hash isn't a Merkle hash, unfortunately, so intermediate hashes like this don't contribute to the current hashing scheme, but we can add a flake attribute to pick a non-NAR hash, such as the one provided by the `git-hashing` feature.
Could the hashing be cached by `mtime`? `stat`-ing a big monorepo is still slow but much faster than hashing it.

What about this good idea? I use that approach in programs that need to know the hashes of input files but where hashing is slow. Keep a cache `hash(abspath) -> (inode, mtime, hash(contents))` somewhere. That way you know when you need to re-hash the contents: re-hashing at the speed of `stat()`. It also allows gracefully handling silly systems such as some Windows filesystems where mtimes have second resolution (do the real hash when the file is too young). Include the file size if you're extra paranoid that the system lies.
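A minimal sketch of that cache, using an in-memory dict (a real implementation would persist it, and could also record the file size as suggested above):

```python
import hashlib
import os

# Cache: absolute path -> (inode, mtime in nanoseconds, content hash).
_cache: dict = {}

def cached_file_hash(abspath: str) -> str:
    """Return the sha256 of a file, re-reading its contents only when
    the (inode, mtime) pair no longer matches the cached entry."""
    st = os.stat(abspath)
    entry = _cache.get(abspath)
    if entry is not None and entry[0] == st.st_ino and entry[1] == st.st_mtime_ns:
        return entry[2]  # cache hit: one stat() call, no file read
    with open(abspath, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    _cache[abspath] = (st.st_ino, st.st_mtime_ns, digest)
    return digest
```

On a warm cache this does one `stat()` per file and zero reads, so hashing a tree runs at roughly the speed of a bare `stat` traversal.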
For nixpkgs without the `.git` folder specifically, I want to point out that on my SSD, getting the directory names (`find`) is the bottleneck: it takes 3 seconds, no matter whether we do nothing more, or `stat`, or `cat`, or `sha256sum`. (A serial single `cat` or `sha256sum` takes 5 seconds, but that goes away when using 2 threads.)
# on nixpkgs checkout on SSD; using `find *` instead of `find .` to exclude `.git`
# Just find (no stat, just `getdents64` + directory open/closes)
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && command time find * -type f > /dev/null
# 0.07user 0.65system 0:03.62elapsed 20%CPU (0avgtext+0avgdata 4864maxresident)k
# Sequential single cat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find * -type f | command time xargs cat > /dev/null
# 0.06user 1.18system 0:05.42elapsed 22%CPU (0avgtext+0avgdata 2592maxresident)k
# Sequential single stat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find * -type f | command time xargs stat > /dev/null
# 0.40user 0.66system 0:03.63elapsed 29%CPU (0avgtext+0avgdata 3904maxresident)k
# Sequential batched stat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find * -type f | command time xargs -n 1000 stat > /dev/null
# 0.39user 0.63system 0:03.58elapsed 28%CPU (0avgtext+0avgdata 3896maxresident)k
# Sequential batched cat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find * -type f | command time xargs -n 1000 cat > /dev/null
# 0.06user 1.21system 0:05.41elapsed 23%CPU (0avgtext+0avgdata 2340maxresident)k
# Parallel batched stat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find * -type f | command time xargs -P 10 -n 1000 stat > /dev/null
# 0.39user 0.64system 0:03.60elapsed 28%CPU (0avgtext+0avgdata 3900maxresident)k
# Parallel batched cat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find * -type f | command time xargs -P 10 -n 1000 cat > /dev/null
# 0.08user 1.18system 0:03.68elapsed 34%CPU (0avgtext+0avgdata 2340maxresident)k
# Sequential stat in single find process
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && command time find * -type f -printf "%s\n" > /dev/null
# 0.06user 0.88system 0:03.85elapsed 24%CPU (0avgtext+0avgdata 4708maxresident)k
# Sequential single sha256sum
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find * -type f | command time xargs sha256sum > /dev/null
# 0.16user 0.82system 0:05.17elapsed 19%CPU (0avgtext+0avgdata 3888maxresident)k
# Parallel single sha256sum (-P 2)
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find * -type f | command time xargs -P 2 sha256sum > /dev/null
# 0.19user 0.79system 0:03.80elapsed 25%CPU (0avgtext+0avgdata 3904maxresident)k
# Parallel single sha256sum (-P 10)
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find * -type f | command time xargs -P 10 sha256sum > /dev/null
# 0.21user 0.77system 0:03.77elapsed 26%CPU (0avgtext+0avgdata 3896maxresident)k
When including `.git` (`find .`), stat-based hash caching is already useful, by a factor of 3:
# Just find (no stat, just `getdents64` + directory open/closes)
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && command time find . -type f > /dev/null
# 0.06user 0.66system 0:03.71elapsed 19%CPU (0avgtext+0avgdata 4864maxresident)k
# Sequential single cat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find . -type f | command time xargs cat > /dev/null
# 0.08user 4.04system 0:11.82elapsed 34%CPU (0avgtext+0avgdata 2456maxresident)k
# Sequential single stat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find . -type f | command time xargs stat > /dev/null
# 0.44user 0.71system 0:03.76elapsed 30%CPU (0avgtext+0avgdata 3916maxresident)k
# Sequential batched stat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find . -type f | command time xargs -n 1000 stat > /dev/null
# 0.44user 0.80system 0:03.79elapsed 32%CPU (0avgtext+0avgdata 3900maxresident)k
# Sequential batched cat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find . -type f | command time xargs -n 1000 cat > /dev/null
# 0.09user 4.01system 0:12.12elapsed 33%CPU (0avgtext+0avgdata 2216maxresident)k
# Parallel batched stat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find . -type f | command time xargs -P 10 -n 1000 stat > /dev/null
# 0.47user 0.77system 0:03.76elapsed 33%CPU (0avgtext+0avgdata 3900maxresident)k
# Parallel batched cat
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find . -type f | command time xargs -P 10 -n 1000 cat > /dev/null
# 0.10user 4.10system 0:08.11elapsed 51%CPU (0avgtext+0avgdata 2324maxresident)k
# Sequential stat in single find process
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && command time find . -type f -printf "%s\n" > /dev/null
# 0.07user 0.92system 0:04.00elapsed 24%CPU (0avgtext+0avgdata 4864maxresident)k
# Sequential single sha256sum
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find . -type f | command time xargs sha256sum > /dev/null
# 3.03user 3.58system 0:12.38elapsed 53%CPU (0avgtext+0avgdata 3888maxresident)k
# Parallel single sha256sum (-P 2)
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find . -type f | command time xargs -P 2 sha256sum > /dev/null
# 3.05user 3.52system 0:09.64elapsed 68%CPU (0avgtext+0avgdata 3888maxresident)k
# Parallel single sha256sum (-P 10)
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' && find . -type f | command time xargs -P 10 sha256sum > /dev/null
# 3.11user 3.46system 0:08.83elapsed 74%CPU (0avgtext+0avgdata 3888maxresident)k
It's a bit odd that we get a factor >= 3x, because `.git` only contains 16% of the files:
find * -type f | wc -l
# 48567
find . -type f | wc -l
# 56328
When caches are not dropped, `stat` remains 3x faster than `cat`, and is as fast as just `find` without `stat`:
# Just find (no stat, just `getdents64` + directory open/closes)
command time find . -type f > /dev/null
# 0.05user 0.25system 0:00.31elapsed 99%CPU (0avgtext+0avgdata 4796maxresident)k
# Sequential single cat
find . -type f | command time xargs cat > /dev/null
# 0.06user 1.48system 0:01.55elapsed 100%CPU (0avgtext+0avgdata 2352maxresident)k
# Sequential single stat
find . -type f | command time xargs stat > /dev/null
# 0.40user 0.52system 0:00.92elapsed 101%CPU (0avgtext+0avgdata 3900maxresident)k
# Sequential batched stat
find . -type f | command time xargs -n 1000 stat > /dev/null
# 0.45user 0.54system 0:00.98elapsed 101%CPU (0avgtext+0avgdata 3900maxresident)k
# Sequential batched cat
find . -type f | command time xargs -n 1000 cat > /dev/null
# 0.07user 1.48system 0:01.54elapsed 100%CPU (0avgtext+0avgdata 2212maxresident)k
# Parallel batched stat
find . -type f | command time xargs -P 10 -n 1000 stat > /dev/null
# 0.42user 0.61system 0:00.31elapsed 331%CPU (0avgtext+0avgdata 3900maxresident)k
# Parallel batched cat
find . -type f | command time xargs -P 10 -n 1000 cat > /dev/null
# 0.07user 1.52system 0:00.90elapsed 176%CPU (0avgtext+0avgdata 2212maxresident)k
# Sequential stat in single find process
command time find . -type f -printf "%s\n" > /dev/null
# 0.05user 0.29system 0:00.35elapsed 99%CPU (0avgtext+0avgdata 4864maxresident)k
# Sequential single sha256sum
find . -type f | command time xargs sha256sum > /dev/null
# 2.89user 1.30system 0:04.19elapsed 100%CPU (0avgtext+0avgdata 3888maxresident)k
# Parallel single sha256sum (-P 2)
find . -type f | command time xargs -P 2 sha256sum > /dev/null
# 2.89user 1.31system 0:03.79elapsed 111%CPU (0avgtext+0avgdata 3952maxresident)k
# Parallel single sha256sum (-P 10)
find . -type f | command time xargs -P 10 sha256sum > /dev/null
# 2.86user 1.35system 0:03.77elapsed 111%CPU (0avgtext+0avgdata 3932maxresident)k
I think this sufficiently proves that adding `stat()` will not make lazy trees slower, and it should solve the nondeterminism problem, because we get the real hash via caching.
For the "40G for datasets/game assets" use case, `stat`-based hash caching will of course have even bigger benefits.
Hey @nh2, thank you for the stat stats!
The hash cache you propose, `hash(abspath) -> (inode, mtime, hash(contents))`, seems like a good fit for making sure the stat cache is virtually always accurate.
I'd like to stress, though, that since normal, non-`git-hashing` store path hashes are not Merkle hashes, we don't get to use those hashes as part of the store path. Nonetheless, they're useful!
Step 1. Determine more reliably whether or not to NAR-hash the entire thing. This only pays off if no files have changed, and slows the other cases a bit due to the extra bookkeeping.
Step 2. Produce an internal Merkle hash (e.g. using `git-hashing`), which can serve as an alternate cache key for the evaluation cache, so that we can avoid the NAR hash in cases where we don't really need it for proper evaluation semantics, e.g. `nixpkgs#legacyPackages`, but not (for now) NixOS, which needs paths for module deduplication. In that case a copy will still be triggered by `toString m.file` or some similar code. This requires that flakes are loaded from lazy path values, not store path strings as in this PR.
Also, we may need this and/or something like it to avoid unnecessarily repeating Git clean filter calls, but we aren't doing that yet (#13464).
adding stat() will not make lazy trees slower

This was already true from a design/requirements perspective, assuming the current evaluation caching approach, which requires that we know what's in the entire Git workdir, similar to `git status`, which also needs to perform all the stat calls. Calling stat more than exactly once per file should be considered an implementation detail.
It's good to know the cost of adding extra stat calls, in case we need to temporarily implement it with extra calls. Doing it with fewer than one stat per Git workdir file is unfortunately not an option (unless we want to get some inotify daemon involved, but I don't think we want those extra moving parts?).
The proper solution is to see how we can either make libgit2's `git status` equivalent faster and/or more useful, or, if the complexity of a persistent cache is an issue for libgit2 upstream, we could look to replace it in our libfetchers by using lower-level libgit2 operations in ways we see fit.
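For reference on the `git-hashing` direction mentioned above: Git's object hashes are Merkle hashes, where a blob hash covers a file's content and a tree hash covers its entries' (mode, name, child-hash) triples, so unchanged subtrees keep their hashes. The blob case is small enough to show in full (this is the actual `git hash-object` format):

```python
import hashlib

def git_blob_hash(content: bytes) -> str:
    """Compute a Git blob ID: sha1 over the header "blob <size>\\0"
    followed by the raw content, exactly as `git hash-object` does."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()
```

Because of this structure, a tool walking a repository can skip re-hashing any subtree whose entry hash it already knows, which is what makes the Git index useful as a hashing shortcut.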
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2025-05-21-nix-team-meeting-minutes-228/65204/1
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/changelog-lazy-trees-and-security-improvements/66033/8
Motivation

This adds a setting `lazy-trees` that causes flake inputs to be "mounted" as virtual filesystems on top of /nix/store as random "virtual" store paths. Only when the store path is actually used as a dependency of a store derivation do we materialize ("devirtualize") the input by copying it to its content-addressed location in the store.

String contexts determine when devirtualization happens. One wrinkle is that there are cases where we had store paths without proper contexts, in particular when the user does `toString <path>` (where `<path>` is a source tree in the Nix store) and passes the result to a derivation. This usage was always broken, since it can result in derivations that lack correct references. But to ensure that we don't change evaluation results, we introduce a new type of context that results in devirtualization but not in store references. We also now print a warning about this.

Supersedes/incorporates #12432 and #12915. It replaces #6530. The big difference with #6530 is that the latter treated each source tree as its own root filesystem, which caused a lot of backward compatibility issues that required ugly hacks. With the new approach, lazy source trees appear under `/nix/store`, just using a virtual store path that doesn't exist in the real filesystem.

TODO: run the test suite with `lazy-trees = true`.

Context
Add 👍 to pull requests you find important.
The Nix maintainer team uses a GitHub project board to schedule and track reviews.