Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to realise CA derivations during queryMissing #8622

Merged
merged 1 commit into from Aug 10, 2023

Conversation

pwaller
Copy link
Contributor

@pwaller pwaller commented Jul 1, 2023

This enables nix to correctly report what will be fetched in the case
that everything is a cache hit.

Note however that if an intermediate build of something which is not
cached could still cause products to end up being substituted if the
intermediate build results in a CA path which is in the cache.

Fixes #8615.

Signed-off-by: Peter Waller p@pwaller.net

Motivation

  • Give a better user experience
  • Improve performance
  • educe the number of HTTP requests to caches.

Context

See #8615. TL;DR: If you use content-addressed derivations, the behaviour of cache hits was poor, and nix would spend quite a bit of time to incorrectly report that things needed to be built when they were cache hits.

This is my first contribution to nix. I've applied some basic level of reasoning to the fix, but you should assume the logic of the first implementation is completely incorrect until proven otherwise: please help me to get it correct.

I also have not yet had time to write a test for this, which I intend to do, but I wanted to send the PR early to avoid wasting time on this if the approach is somehow wrong.

Please see the TODO notes within the PR and comment on them. I intend to remove these before taking the PR out of draft.

Checklist for maintainers

Maintainers: tick if completed or explain if not relevant

  • agreed on idea
  • agreed on implementation strategy
  • tests, as appropriate
    • functional tests - tests/**.sh
    • unit tests - src/*/tests
    • integration tests - tests/nixos/*
  • documentation in the manual
  • documentation in the internal API docs
  • code and comments are self-explanatory
  • commit message explains why the change was made
  • new feature or incompatible change: updated release notes

Priorities

Add 馃憤 to pull requests you find important.

@github-actions github-actions bot added the tests label Jul 1, 2023
src/libstore/misc.cc Outdated Show resolved Hide resolved
src/libstore/misc.cc Outdated Show resolved Hide resolved
src/libstore/misc.cc Outdated Show resolved Hide resolved
@roberth roberth added ca-derivations Derivations with content addressed outputs store Issues and pull requests concerning the Nix store labels Jul 1, 2023
@Ericson2314
Copy link
Member

BTW an alternative approach I've thought of is trying to add some sort of "dry run" mode to the main builder logic. Then we wouldn't need effectively two implementations of the same thing.

@pwaller
Copy link
Contributor Author

pwaller commented Jul 3, 2023

BTW an alternative approach I've thought of is trying to add some sort of "dry run" mode to the main builder logic. Then we wouldn't need effectively two implementations of the same thing.

@Ericson2314 Hmm- IIRC nix-build currently reports the set of things that will need building/fetching upfront. Would your suggestion extend to running that logic under dry-run there beforehand?

So far as I understand, that approach would be quite a bit more extensive of a change with more conseqeunces, I think. At least, it's not something I would want to attempt to implement. What do you think to getting this in as a simpler fix first, or is that a no-go?

The fix in this PR does make a dramatic difference for me as it stands, so I would be quite happy to see an improvement land soon even if there is a better/simpler implementation that could be done in the future which makes it obsolete. What do you think? Am I overestimating the difficulty of your proposal, is it something that conceivably go into the next version of nix?

@thufschmitt thufschmitt self-assigned this Jul 3, 2023
Copy link
Member

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

The logic looks essentially good, just a few things (that you mostly already noticed)

src/libstore/misc.cc Outdated Show resolved Hide resolved
src/libstore/misc.cc Outdated Show resolved Hide resolved
src/libstore/misc.cc Outdated Show resolved Hide resolved
Comment on lines 218 to 229
if (!bfd.outputs.contains(outputName))
continue;

bool found = false;
for (auto &sub : getDefaultSubstituters()) {
auto realisation = sub->queryRealisation({hash, outputName});
if (!realisation)
continue;
invalid.insert(realisation->outPath);
found = true;
}
if (!found) {
knownOutputPaths = false;
break;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this run in parallel (using the thread pool we already have)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this for a moment. There is already a level of paralleism at play, which is doPath is running in a thread along other doPaths. The paralellism benefit would be if each output's static output has could be queried in parallel for realisation. My following reasoning assumes I've understood the purpose of what you're suggesting.

I can see the potential benefit of that. I have derivations with hundreds of outputs. I like the software going fast.

However, if any of those outputs fail to realise, we must build the derivation. We potentially put in flight a lot of useless work in the negative case. In practice for most derivations, there are only a handful of outputs. Processing the static output hashes in serial allows us to stop early, and stopping early would save requesting the realisations when we have to build anyway. The most time that will be saved in the typical positive case would be 'O(typical # outputs)' of round trips, i.e. a few.

I also observe that for my usecase of tens of derivation outputs, the performance is not unreasonable as it stands.

Any other thoughts here?

@github-actions github-actions bot removed the store Issues and pull requests concerning the Nix store label Jul 7, 2023
@pwaller
Copy link
Contributor Author

pwaller commented Jul 7, 2023

Aside: I ran headlong into #6623 trying to get the test working. It would be really nice if nix copy could query for local realisations which it knows about and copy those realisations to the cache as well. The fact that you can't copy an output path like nix copy --to cache ./result is surprising and confusing even if it makes sense from a "living in a content addressed world" perspective.

@pwaller pwaller marked this pull request as ready for review July 7, 2023 09:35
@pwaller pwaller requested a review from edolstra as a code owner July 7, 2023 09:35
@pwaller
Copy link
Contributor Author

pwaller commented Aug 8, 2023

Friendly one-month review ping. Please let me know if that's not appropriate, or if there is anything else I can do to help progress this. I currently have time to follow up for a short while, and less after that, but I am very keen to help see this behaviour improved.

Copy link
Member

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the ping @pwaller (and sorry for not coming back to this earlier).

Looks good, let's ship it!

@github-actions github-actions bot added the with-tests Issues related to testing. PRs with tests have some priority label Aug 9, 2023
@thufschmitt
Copy link
Member

(I've just force-pushed the branch to unstuck the CI)

@pwaller
Copy link
Contributor Author

pwaller commented Aug 9, 2023

It's got a failing test, I could use a hand if you can see what's missing. I'm also suffering from a broken build[*]. Additionally, if I take a fresh clone of nix and bootstrap/configure it, I get:

checking for gtest_main... yes
checking for rapidcheck/gtest.h... no
configure: error: librapidcheck is not found.

The broken build I'm getting is that I'm unable to run my nix binary:

/home/pwaller/.local/src/nix/i/bin/nix: /nix/store/4nlgxhb09sdr51nc9hdm8az5b08vzkgx-glibc-2.35-163/lib/libc.so.6: version `GLIBC_ABI_DT_RELR' not found (required by /nix/store/37jp8qw8shf64bs4z0x036ji980bf2nr-glibc-x86_64-unknown-linux-gnu-2.37-8/lib/librt.so.1)

Not sure why this started failing. I have been playing with building nix using LLVM, but those experiments live entirely outside my current development environment and it's repeatable even when I do a make clean (in my old nix tree where configure doesn't fail...).

Is this a problem of nix develop purity, I wonder? How do I get a 'clean' develop shell where I can run commands so I can be sure they aren't contaminated?

@pwaller
Copy link
Contributor Author

pwaller commented Aug 9, 2023

Aah, finally found it. LD_DEBUG=all to the rescue. The outputs directory was included via the runpath and had a contaminated shared object in it.

Don't understand however why librapidcheck is now failing during configure.

@pwaller
Copy link
Contributor Author

pwaller commented Aug 9, 2023

I just ran the tests on 32d3052 locally and can't reproduce the failure given by ubuntu-latest. https://github.com/NixOS/nix/actions/runs/5805240629/job/15736024216?pr=8622

@thufschmitt is this for me to dig into or can I leave it with you? I'm not sure where to begin right now, I guess I should try and reproduce in an ubuntu environment?

This enables nix to correctly report what will be fetched in the case
that everything is a cache hit.

Note however that if an intermediate build of something which is not
cached could still cause products to end up being substituted if the
intermediate build results in a CA path which is in the cache.

Fixes NixOS#8615.

Signed-off-by: Peter Waller <p@pwaller.net>
auto-merge was automatically disabled August 9, 2023 19:57

Head branch was pushed to by a user without write access

@pwaller
Copy link
Contributor Author

pwaller commented Aug 9, 2023

@thufschmitt I've introduced tests/ca/build-cache.sh which disables the cache test on the older daemons, rather than disabling all tests/ca/build.sh with older daemons.

@thufschmitt thufschmitt merged commit a1fdc68 into NixOS:master Aug 10, 2023
8 checks passed
@pwaller pwaller deleted the issue-8615 branch August 10, 2023 10:08
@fricklerhandwerk fricklerhandwerk added the UX The way in which users interact with Nix. Higher level than UI. label Jan 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ca-derivations Derivations with content addressed outputs tests UX The way in which users interact with Nix. Higher level than UI. with-tests Issues related to testing. PRs with tests have some priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

--dry-run logic isn't implemented for ca-derivations
5 participants