make it possible to add s3 tag to nars when uploading to s3, to allow different retention policy for debug symbols #8080
base: master
This is an interesting idea. One problem I can imagine is that Nix caches hold the invariant that if a store path exists in the cache, all of its dependencies are also in that same cache. Expiring debug outputs therefore assumes that no store path will ever depend on a debug output. I'm not sure that is true, which makes this a rather dangerous proposal. We might be able to introduce that as a new invariant, but otherwise we can't really know whether anything refers to a store path without a garbage collection process.
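The invariant described above can be stated as a small check. The following is a minimal sketch in Python, not how Nix implements it; the `cache` mapping and the path names are hypothetical stand-ins for the references recorded in each narinfo.

```python
def missing_references(cache):
    """Map each cached path to those of its references that are absent from the cache.

    `cache` maps a store path name to the list of store paths it references.
    An empty result means the cache is closed under references.
    """
    holes = {}
    for path, refs in cache.items():
        absent = sorted(r for r in refs if r not in cache)
        if absent:
            holes[path] = absent
    return holes


# Hypothetical cache in which a debug output was expired while still referenced.
example_cache = {
    "app": ["libfoo", "libfoo-debug"],
    "libfoo": [],
}

print(missing_references(example_cache))
```

A cache that expires debug outputs keeps passing this check only if nothing else in the cache references them, which is the assumption questioned above.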
This invariant is certainly necessary for stores, but does it exist for binary caches as well?
Yes, the invariant is held for binary caches as well. (Which are technically implemented as a store, but that is not why.)
Could you expand on which parts of Nix rely on the fact that binary caches are closed under transitive dependencies? I tried to reproduce the issue myself, and failed.
{ config, pkgs, lib, ... }:
{
  services.nginx = {
    enable = true;
    virtualHosts."incomplete.binary.cache" = {
      default = true;
      locations = {
        "/".proxyPass = "https://cache.nixos.org";
        "/ihkcrpmhw7v8gss4zhdfx5zbvxpan06i.narinfo".return = "404";
        "/nar/1k1bsjrvbwfjp26civp4grsxnizmdnccx5xrq41wd881zwij6q12.nar.xz".return = "404";
      };
    };
  };
}
Then attempt to build NixOS 16.03's sl:
As you can see, it transparently rebuilds the missing path, and even respects topological order by substituting, then building, then substituting again. In this experiment at least, Nix proved fully resilient to binary caches with holes.
I really like the idea of having the ability to fetch debug symbols from cache.nixos.org. I'm not too keen on this implementation, because it's unnecessarily limited to serving this single use case. Allowing attaching extra metadata to store paths has a huge amount of potential for other uses, and attaching the tag based on this very specific shape of output path is an unfortunately specific mechanism. I'd much prefer if the policy producing metadata like this were kept outside Nix itself:
Regarding the store invariants: these are important especially for something like debug info; I can see the rebuilding behaviour you observed resulting in debug info which doesn't actually match the binaries fetched from the cache, and ends up being worse than useless. We could probably avoid breaking this invariant on the nixpkgs side by not allowing package outputs to reference their dependencies' debug outputs in practice, though I'd much prefer a solution that guarantees the preservation of the invariant as opposed to "hopefully" providing it.
Consider first the case where we put this metadata (in our case, "this is debug info") in a file outside the nar (be it the narinfo or some other file). To be able to remove debuginfo from the binary cache, we would then need to fetch all narinfos from the dawn of time, because the listing APIs of S3 seem extremely limited (it does not even seem possible to fetch files by date). To get an idea of how slow that is: nix-index takes eight and a half minutes on my machine to fetch 92000 such metadata files. These 92000 paths are an approximation of "all that is in nixpkgs", and with the staging cycle we produce at least one new batch of 92000 roughly every two weeks for nixos-unstable, and another for stable. NixOS used to be smaller in the past, but still, after 10 years this suggests it would take nearly three days to download one metadata file per store path in the binary cache. Putting it inside the nar like Final possibility: putting this info inside the db of hydra: well, it's exactly as ad hoc as my solution, and to some extent by opening this PR against nix I'm modifying the part of hydra that uploads to S3, so it's not fundamentally different nor less ugly.
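The "nearly three days" figure can be checked with a back-of-the-envelope calculation. This sketch only uses the numbers quoted in the comment; treating the batch size as constant over the whole period is a simplification, so the result is an upper bound (NixOS used to be smaller).

```python
# Figures quoted above; the constant names are made up for this sketch.
MINUTES_PER_BATCH = 8.5      # nix-index fetching ~92000 narinfos
BATCHES_PER_YEAR = 52 / 2    # one staging cycle roughly every two weeks
CHANNELS = 2                 # nixos-unstable plus one stable channel
YEARS = 10


def estimated_fetch_days(years=YEARS):
    """Days needed to fetch one narinfo per store path at the quoted rate."""
    batches = years * BATCHES_PER_YEAR * CHANNELS
    return batches * MINUTES_PER_BATCH / (60 * 24)


print(f"{estimated_fetch_days():.1f} days")
```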
You suggest we could interpret
Note that this problem (rebuilding debug info a second time results in mismatched binary vs debuginfo) already exists today with Nix upholding the invariant, see #7756. This was deemed a bug in nixpkgs for not having reproducible builds, and not a responsibility of Nix. Besides, the current situation is:
With debuginfo removed from the cache after, say, 6 months, the situation you deplore is:
This is bad, but exactly as bad as right now. On the other hand, we get usable debuginfo for 6 months. This is not a regression.
I'm suggesting that this metadata could be applied as S3 tags at upload time, much like you do here -- my main goal with the suggestion was to avoid limiting ourselves to this very specific debug tag based on a very restrictive convention as opposed to being defined by the expression or its build output -- as an example, what if we want to do the same for JS source maps? I could easily imagine those not ending up in lib/debug but still being a separate output that we don't want to keep forever, and it wouldn't be great if we would then need to implement the logic for that in Nix itself rather than in the builders.
Regarding evaluation-time metadata: there's probably a sensible scheme through which one could attach such metadata per output as well. However, I don't think it's super relevant here and more of a related thing on my wishlist.
Triaged in the Nix team meeting 2023-04-10:
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2023-04-10-nix-team-meeting-minutes-47/27357/1
Thank you for your feedback. I pushed an implementation of your idea. If a nar contains /nix-support/tags.json that contains
I fixed the merge conflicts
if a nar contains a nix-support/tags.json file of the form {key: string value}, then the nar file (plus .narinfo and other accompanying files) is tagged as "nix:key" -> "value".
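The mapping described in this commit message can be sketched as follows. This is an illustration in Python, not the code from the PR: the function names and the `hint` key are hypothetical, and rejecting non-string values is an assumption about how malformed files might be handled. The only outside fact used is that S3 accepts object tags as a URL-encoded query string in the `x-amz-tagging` header.

```python
import json
from urllib.parse import urlencode


def s3_tags_from_tags_json(text, prefix="nix:"):
    """Turn the contents of nix-support/tags.json into prefixed S3 object tags."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("tags.json must contain a JSON object")
    tags = {}
    for key, value in data.items():
        # Assumption: only string values are accepted, per "{key: string value}".
        if not isinstance(value, str):
            raise ValueError(f"tag {key!r} must have a string value")
        tags[prefix + key] = value
    return tags


def tagging_header(tags):
    """Encode tags as a URL query string, the form used by x-amz-tagging."""
    return urlencode(sorted(tags.items()))


tags = s3_tags_from_tags_json('{"hint": "debug"}')
print(tags)
print(tagging_header(tags))
```

Keeping the policy in a file produced by the builder means a new kind of tag needs no change to the uploader, which is the point raised earlier in the discussion.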
I fixed the merge conflicts.
FWIW, the archivist team is keeping consolidated dumps of all narinfos. Ingesting all ~205 million of them took a couple of hours and around €100 in S3 requests. We have about a quarter million DELETE requests are billed alike to GETs (and faster to run), so expiring a few million paths is not particularly costly or complex. It's also worth noting that there are
Naively using object expiration will expire objects that are still in recent closures but haven't been rebuilt recently.
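One way to avoid that pitfall is to expire only candidates that fall outside the closure of recent roots. The sketch below assumes the reference graph is available as an in-memory mapping (for example, built from a narinfo dump); the function names and path names are hypothetical, and this is not something S3 object expiration can do on its own.

```python
def closure(roots, references):
    """Return every path reachable from `roots` through the references map."""
    seen = set()
    stack = list(roots)
    while stack:
        path = stack.pop()
        if path in seen:
            continue
        seen.add(path)
        stack.extend(references.get(path, ()))
    return seen


def safe_to_expire(candidates, recent_roots, references):
    """Candidates that no recent closure still depends on."""
    live = closure(recent_roots, references)
    return sorted(p for p in candidates if p not in live)


# Hypothetical graph: libfoo-debug is old but still used by a recent build.
references = {
    "app-2": ["libfoo", "libfoo-debug"],
    "libfoo": [],
    "libfoo-debug": [],
    "libold-debug": [],
}

print(safe_to_expire(["libfoo-debug", "libold-debug"], ["app-2"], references))
```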
Nothing in Nix cache retrieval depends on nars being in
Thanks for updating the approach. Let's add some docs regarding the new nix-support file and convention.
Thinking out loud now:
- this is basically metadata, and can be used to have additional attributes in a NAR model that does not have them.
- It cannot be used for single-file store objects.
- other possibilities?
- pros/cons for other realms than debugging and s3?
It's very cool that the archivist team has real numbers about the storage cost of debuginfo.
That's a very good point, and I had not thought about that. I expect this does not affect debuginfo too much, as the staging cycle ensures there is a world rebuild every month approximately. But that would definitely be relevant if we wanted to expire sources this way.
It's possible to encode these tags on file:/// binary caches with xattrs, but I fail to see an application. I will write documentation later this week.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/nixos-s3-long-term-resolution-phase-1/36493/1
hint=debug
when uploading to s3
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2024-01-26-nix-team-meeting-minutes-118/38851/1
Hey @Ericson2314
This allows setting a different retention policy for debug symbols in S3-backed binary caches.
Motivation
We currently have debug symbols for very few libraries in NixOS. Recompiling everything to get debug symbols is non-trivial and time-consuming.
The main objection (NixOS/nixpkgs#18530) to enabling debug symbols on a large scale is that they are heavy (on my system, about 50% larger on average than the original libraries), so storage costs for the official binary cache would be unacceptable. One possible idea is to keep debug symbols for a shorter period than other packages (which are currently stored forever), for example 7 months (the support duration of a release).
This implies being able to know which nars to remove 7 months from now.
.narinfo files contain the name of the store path. A naive approach could be to download all narinfos, look for store paths ending in -debug, and send DELETE requests to Amazon, but this would take forever and cost many requests.
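For illustration, the per-file step of the naive approach could look like the sketch below. It only classifies an already downloaded narinfo; the sample text uses made-up hashes, and the helper names are hypothetical.

```python
def parse_narinfo(text):
    """Parse the "Key: value" lines of a narinfo file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    return fields


def is_debug_output(narinfo_text):
    """True if the StorePath field names a path ending in -debug."""
    store_path = parse_narinfo(narinfo_text).get("StorePath", "")
    return store_path.endswith("-debug")


# Made-up example; the hashes are placeholders, not real store paths.
sample = """StorePath: /nix/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-example-1.0-debug
URL: nar/bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb.nar.xz
Compression: xz
"""

print(is_debug_output(sample))
```

The check itself is cheap; the cost discussed above comes from having to download every narinfo before it can be applied.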
A better approach is to rely on the ability of S3 to selectively expire objects after a delay:
By prefix: it would be possible to move debug-related nars to another directory instead of nar/, and point narinfo files to it. Although it is undocumented, this breaks the format of binary caches, and I fear it may have unexpected consequences. For this reason I tried the approach of tagging debuginfo-related nars at upload time.
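With tagged objects, the retention policy itself becomes a bucket lifecycle rule filtered on the tag. The sketch below builds such a rule as plain data in the shape S3 lifecycle configurations use (`Rules`, `Filter`, `Tag`, `Expiration`); the tag key `nix:hint`, the value `debug`, and the 210-day delay are assumptions for illustration, not values confirmed by this PR.

```python
import json


def tag_expiration_rule(tag_key, tag_value, days):
    """Build one lifecycle rule expiring objects that carry the given tag."""
    return {
        "ID": f"expire-{tag_value}-after-{days}-days",
        "Filter": {"Tag": {"Key": tag_key, "Value": tag_value}},
        "Status": "Enabled",
        "Expiration": {"Days": days},
    }


# Assumed tag and delay: roughly 7 months, the support duration of a release.
lifecycle = {"Rules": [tag_expiration_rule("nix:hint", "debug", 7 * 30)]}

print(json.dumps(lifecycle, indent=2))
```

The resulting document would be applied by whoever administers the bucket, so no change to the binary cache layout is needed.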
In terms of implementation, binary caches are given a hint when uploading a file, and this hint is set to debug for debug nars.
Only the S3 implementation uses the hint currently. This does not really add special casing for debug nars, as there was already an option about them (index-debug-info).
Note: in my opinion this is of limited usefulness if the people in charge of the official binary cache don't like this approach. Can someone confirm that hydra does not use multipart uploads? I did not find a way to make this work with multipart uploads.
Context
Alternatives: keep living without debug info on NixOS forever. More seriously, see the discussions at NixOS/nixpkgs#18530
Checklist for maintainers
Maintainers: tick if completed or explain if not relevant
tests/**.sh
src/*/tests
tests/nixos/*
Priorities
Add 👍 to pull requests you find important.