contentAddressedByDefault = true; is very slow to start building packages, blocked on many .doi fetches #5367

Closed
trofi opened this issue Oct 9, 2021 · 5 comments
Labels
bug ca-derivations Derivations with content addressed outputs stale

Comments

@trofi
Contributor

trofi commented Oct 9, 2021

When I change a bootstrap package (say, bash) it takes minutes before any package gets built. Probably due to two things:

  1. excessive and redundant realisation polling
  2. serial realisation polling

Here is my benchmark:

        $ nix edit -f . bash # add unused environment variable, like FOO="2"
        $ time nix build -f nixos system --arg config '{ contentAddressedByDefault = true; }'

I would expect bash to start building within seconds (that is what happens with contentAddressedByDefault = false;).

What actually happens is a long queue of polls:

        $ nix build -f nixos system --arg config '{ contentAddressedByDefault = true; }' --debug --verbose |& ts "%.T"
        ...
        21:26:19.225831 starting download of https://cache.nixos.org/realisations/sha256:0e0653852bc6e6c1b1ba8683a0386c529b1e26fb193f55e240f2abc6727b0f96!out.doi
        21:26:19.225843 verify TLS: Nix CA file = '/etc/ssl/certs/ca-certificates.crt'
        21:26:19.318660 finished download of 'https://cache.nixos.org/realisations/sha256:0e0653852bc6e6c1b1ba8683a0386c529b1e26fb193f55e240f2abc6727b0f96!out.doi'; curl status = 0, HTTP status = 404, body = 3 bytes
        ...
        21:26:19.319178 verify TLS: Nix CA file = '/etc/ssl/certs/ca-certificates.crt'
        21:26:19.415872 finished download of 'https://cache.nixos.org/realisations/sha256:0e82e0337e7c29da8ccfeb6e017db1708bbd4a8b48038a85e5b957429ee68a67!dev.doi'; curl status = 0, HTTP status = 404, body = 3 bytes
        ...

Note: each subsequent verify TLS / finished pair takes about 100ms (roughly the round trip to the closest CDN node for me). Before bash starts building, nix appears to fetch ~2000 .doi files, which takes about 3-4 minutes. I suspect it is a lot worse for people who are further away from the cache.
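As a rough sanity check on those numbers, the two timestamps of the first request in the log above can be turned into a per-request latency and multiplied by the approximate request count. This is only a back-of-the-envelope sketch: the count of 2000 is the estimate from the report, and the calculation assumes the requests run strictly one after another.

```python
from datetime import datetime

# Timestamps copied from the first request in the log above.
started = datetime.strptime("21:26:19.225831", "%H:%M:%S.%f")
finished = datetime.strptime("21:26:19.318660", "%H:%M:%S.%f")

# Duration of one round trip, in seconds.
per_request = (finished - started).total_seconds()

# Rough estimate from the report, not a measured value.
request_count = 2000

# Assumption: requests are issued serially, so the latencies simply add up.
total_seconds = per_request * request_count

print(f"per request: {per_request * 1000:.1f} ms")
print(f"total: {total_seconds:.0f} s (~{total_seconds / 60:.1f} min)")
```

Under these assumptions the total comes out at roughly three minutes, which is in line with the observed 3-4 minutes of polling.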

As I understand it, once bash is built none of these lookups are needed anyway: nix notices that bash's content is unchanged and reuses the existing paths. That said, it also makes sense that a successful .doi substitution could make rebuilding bash unnecessary.

Given that we need to download so many .doi files it would be nice to have higher parallelism fetching those. Or maybe have a batch API to be able to query available .doi in larger batches?
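To illustrate what "higher parallelism" could look like, here is a minimal sketch that probes many keys through a thread pool instead of one at a time. It is not how nix is implemented: `probe_all`, the `fetch` callback, and the simulated latency are all hypothetical, and the sketch assumes every key is known up front and can be queried independently.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def probe_all(
    keys: Iterable[str],
    fetch: Callable[[str], Optional[bytes]],
    max_workers: int = 32,
) -> Dict[str, Optional[bytes]]:
    """Call fetch(key) for every key using a pool of worker threads.

    `fetch` is a hypothetical callback that returns the body of a
    realisation file, or None when the cache answers 404.
    """
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with keys.
        results = list(pool.map(fetch, keys))
    return dict(zip(keys, results))


def fake_fetch(key: str) -> Optional[bytes]:
    """Stand-in for one network round trip that always misses."""
    time.sleep(0.05)
    return None


if __name__ == "__main__":
    demo_keys = [f"sha256:{i:064x}!out" for i in range(64)]
    t0 = time.monotonic()
    found = probe_all(demo_keys, fake_fetch, max_workers=32)
    elapsed = time.monotonic() - t0
    misses = sum(1 for body in found.values() if body is None)
    print(f"{misses} misses in {elapsed:.2f} s")
```

With a pool like this, the wall-clock cost is bounded by the number of batches rather than the number of keys, provided the lookups really are independent of each other.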

CC @regnat

@trofi trofi added the bug label Oct 9, 2021
@thufschmitt
Member

Yes, that bit is utterly inefficient. I was kinda hoping that merely leveraging the sqlite “binary-cache cache” would be enough to make this OK, but it indeed isn’t in practice.
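For readers unfamiliar with the idea, a local lookup cache of this kind can be sketched as follows. This is a simplified illustration and not nix's actual schema or code: the `LookupCache` class, its table layout, and the `negative_ttl` parameter are assumptions made for the example. It also shows why such a cache cannot help on its own: the first lookup of every key still pays a full round trip.

```python
import sqlite3
import time
from typing import Callable, Optional


class LookupCache:
    """Remember the outcome of remote lookups, including misses."""

    def __init__(self, path: str = ":memory:", negative_ttl: float = 3600.0):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS lookups ("
            " key TEXT PRIMARY KEY, body BLOB, fetched_at REAL NOT NULL)"
        )
        self.negative_ttl = negative_ttl

    def get(
        self,
        key: str,
        fetch: Callable[[str], Optional[bytes]],
        now: Optional[float] = None,
    ) -> Optional[bytes]:
        """Return the cached answer for key, calling fetch only when needed."""
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT body, fetched_at FROM lookups WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            body, fetched_at = row
            # Hits are kept forever here; misses expire after negative_ttl.
            if body is not None or now - fetched_at < self.negative_ttl:
                return body
        body = fetch(key)  # None means the server answered 404
        self.db.execute(
            "INSERT OR REPLACE INTO lookups VALUES (?, ?, ?)", (key, body, now)
        )
        self.db.commit()
        return body
```

A second call to `get` with the same key inside the TTL window returns immediately without touching the network, so repeated evaluations get cheaper while a cold start stays as slow as before.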

> Given that we need to download so many .doi files it would be nice to have higher parallelism fetching those

Yes. I don’t know whether the work of #5324 directly applies to the fetching of the doi files (but if it doesn’t, it should be extended for that).
We should also try to concurrently resolve the derivation locally and fetch it remotely (though that might be a bit more involved given the current shape of the code)

> maybe have a batch API to be able to query available .doi in larger batches?

Unfortunately that’s not really possible, both because the doi depend on each other (we can’t fetch one until we’ve resolved all of its dependencies) and because the binary cache can be (and is in most instances) a “dumb” static server
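The dependency constraint rules out a single up-front batch, but lookups that do not depend on each other could in principle still run side by side. The sketch below shows that idea with Python's `graphlib`: it processes a dependency graph in "waves", where each wave contains only nodes whose dependencies are already resolved. The `resolve_in_waves` function, the `resolve` callback, and the example graph are hypothetical, and a real resolver would also need to pass the resolved dependency outputs into each lookup.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def resolve_in_waves(
    deps: Dict[str, Set[str]],
    resolve: Callable[[str], object],
    max_workers: int = 16,
) -> List[List[str]]:
    """Resolve every node, running independent nodes concurrently.

    `deps` maps a (hypothetical) derivation name to the names it depends on.
    Returns the waves in the order they were processed.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves: List[List[str]] = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Nodes whose dependencies have all been marked done.
            ready = sorted(sorter.get_ready())
            # One wave: all ready nodes are looked up in parallel.
            list(pool.map(resolve, ready))
            sorter.done(*ready)
            waves.append(ready)
    return waves


if __name__ == "__main__":
    example = {
        "glibc": set(),
        "bash": {"glibc"},
        "coreutils": {"glibc"},
        "system": {"bash", "coreutils"},
    }
    for wave in resolve_in_waves(example, lambda name: name):
        print(wave)
```

With enough workers, the number of sequential round trips in this scheme is governed by the depth of the graph rather than by its total node count.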

@edolstra edolstra added the ca-derivations Derivations with content addressed outputs label Oct 11, 2021
@Kha
Contributor

Kha commented Oct 11, 2021

> Yes. I don’t know whether the work of #5324 directly applies to the fetching of the doi files (but if it doesn’t, it should be extended for that).

Not directly, unfortunately. Substituter querying is done ahead of the actual build in a single function, specifically so that it can run efficiently in parallel; my PR merely makes the thread pool size for that configurable. But from your description it sounds like this approach is not applicable here.

@thufschmitt
Member

I've opened #5472, which makes this slightly better (unfortunately, it’s still far from being as fast as it should be imho, but that was the lowest-hanging fruit I could find).

@stale

stale bot commented May 2, 2022

I marked this as stale due to inactivity.

@stale stale bot added the stale label May 2, 2022
@trofi
Contributor Author

trofi commented May 2, 2022

#5472 (and maybe something on top of it) sped things up for me from 3-4 minutes of query time down to 7 seconds. Let's declare it good enough.

@trofi trofi closed this as completed May 2, 2022