Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC 0062] Content-addressed paths #62

Merged
merged 34 commits into from
Jan 12, 2022
Merged

[RFC 0062] Content-addressed paths #62

merged 34 commits into from
Jan 12, 2022

Conversation

thufschmitt
Copy link
Member

@thufschmitt thufschmitt commented Dec 11, 2019

A baby step towards #17

rendered

Copy link
Member

@grahamc grahamc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial commentary :)

rfcs/0060-content-addressed-paths.md Outdated Show resolved Hide resolved
rfcs/0060-content-addressed-paths.md Outdated Show resolved Hide resolved
rfcs/0060-content-addressed-paths.md Outdated Show resolved Hide resolved
rfcs/0060-content-addressed-paths.md Outdated Show resolved Hide resolved
- Each (non content-adressed) derivation will have two outputs: A `static` one
computed at evaluation time and a `dynamic` one computed from the dynamic
outputs of its dependencies. These outputs may be identical if the derivation
doesn't (transitively) depend on any ca derivation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we fold the existing fixed output logic into this? e.g. even today two different derivations that produce the same output can be substituted within some downstream derivation without changing its output path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean by "fold" here. If you mean "make fixed-output derivations a special-case of content-addressed ones" then yes, we could in theory (just adding a check at the end that the content-hash is the expected one), but I'm not sure that this is desirable in practice (since the mechanism for CA derivations is inherently more complex than the one for fixed-output ones). Otoh they can (and do in the prototype) share a bit of implementation − in particular fixed-output and CA derivations are stored the same way in the db.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing about the hashDerivationModulo that @edolstra mentions is content addressable derivations already evaluate two another derivation, just like CA ones. But we don't see that derivation today as it is just used to calculate downstream store paths. I think we should expose that derivation. Then the complexity is about the same, and things are more uniform.

- Here again, we notice that `dynamic(transitivelyDependent.drv)` is the same as before,
so we don't build anything

## nix-build process
Copy link
Member

@Ericson2314 Ericson2314 Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of this stuff sounds normative? If so, it should be moved outside of "examples" back into "detailed design".

Théophane Hufschmitt and others added 3 commits December 11, 2019 17:32
@Ericson2314
Copy link
Member

Ericson2314 commented Dec 11, 2019

I nominate myself to be a shepherd. Very interested this stufff, both for its own sake and because the same mechanism would be reused for #40. [Between the prototype for this and existing recursive Nix feature, I'd hope to very easily bang out a prototype for #40.]

- Some derivations can be marked as content-adressed (ca), in which case their
output will be moved to a path `ca` determined only by its content after the
build
- Each (non content-adressed) derivation will have two outputs: A `static` one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing because above it says that each derivation has one output. It's more accurate to say that there are two derivations (that differ in their output paths).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW the names static and dynamic are ambiguous because they're already used for static/dynamic linking (in particular builds might have a static output for .a files...).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about the names because it's indeed a very overloaded term, but that also matches the common usage when it comes to language semanics ("static" knowledge that can be known by just looking at the source code vs. dynamic stuff that is only available at runtime)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

evaluation-time vs. build-time seems to be less ambiguous to me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe in 2 names because there isn't just one dynamic hash. As more CA deps are built, we can incrementally "improve" the drv. It's better to speak of normal forms, or an arbitrary expression vs value description.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can, but why would you want to?

Purely-evaluation-time version can be computed without any builds and is a natural database key. Fully-updated version is used to determine the actual output path for the build. Why would we ever compute a partial substitution?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. When you do a build, you want your deps in normal form, but you yourself unchanged: "almost normal form".

  2. I just thought of another: when you ask the remote builder, you ask with the partial substitution which has all your existing substitutions. Then you can be sure the server won't respond with a conflict! This is really good because even if the server doesn't agree with you that your maximally unevaluated derivation has that content hash, maybe another derivation has that content hash?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I think in this case this actually is the normal form for build. This is the final form for non-CA, and a necessary build form for CA, their CA version is a different story.

  2. I think the most useful case of what you describe is that you have a different original derivation but some CA dependencies happen to have the same content. It would be a nice feature for a binary cache to provide, but we ither get nontrivial query logic, or combinatorial explosion, and I think neither should be mandated before we have the easy case working.

There can also be something about nondeterminisim and a CA derivation output switching between a small number of possible content sets, but I think we should encourage as much caution as possible for the first version, and if somethings slips through it should be highly visible (and get reported)

rfcs/0060-content-addressed-paths.md Outdated Show resolved Hide resolved
To allow this, we add a new type of store path: aliases paths.
These paths don't actually exist in the store, just in the database and point to
another path (so they are morally symlinks, but inside the db rather than
on-disk)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this shouldn't be a StorePath -> StorePath mapping but a DrvOutputId -> StorePath mapping. This eliminates the suggestion that the paths in the domain of this mapping are real paths. Nix already does something like this in hashDerivationModulo() to compute the "static" store path outputs of a derivation, by replacing the inputDrvs with IDs that don't change for fixed-output derivations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks this begins to answer my #62 (comment)

Copy link
Member

@Ericson2314 Ericson2314 Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say DrvOutputId -> DrvOutputId even. hashDerivationModulo and the thing that "upgrades" (possibly nested!) CA derivations to use their output hash should be extremely similar.

We can then have a separate normalized DrvOutputId -> StorePath mapping.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks better indeed, although I'm not sure how you define DrvOutputId. My understanding is that it should essentially be the "hash" part of the path. Is that it or is there something more?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's basically it. It's a hash over the derivation graph, which is essentially what the hash part of non-CA output paths is.

- Each (non content-adressed) derivation will have two outputs: A `static` one
computed at evaluation time and a `dynamic` one computed from the dynamic
outputs of its dependencies. These outputs may be identical if the derivation
doesn't (transitively) depend on any ca derivation
Copy link
Member

@edolstra edolstra Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are 3 types of derivations:

  1. Non-CA derivations that only depend on non-CA derivations.
  2. Non-CA derivations that depend on at least one CA derivation.
  3. CA derivations.

I'm not sure if 2) is useful to support. It already behaves different from 1) in that the output paths cannot be known in advance. So then you may as well make the outputs content-addressable. So maybe be it's easier to require that only CA derivations may depend on CA derivations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this because there is analogous "context" vs "redex" situations for #40. Surely normal derivations would depend on ret-cont ones, so we should have normal derivations that depends on CA ones. Also matches how we have normal derivations that depend on fixed-output ones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this model CA derivations should be treated the same as we would impure derivations if we had them; we do not know whether they are reproducible and thus we should propagate that uncertainty. In this case that would be done by marking all derivations downstream as CA. With fixed-output ones it works because of the hash. Thus, CA derivations that check against a passed-in hash would result in non-CA downstream, but without hash in CA downstream.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FRidh that doesn't work because downstream ones can have cycles. The similarity is that "naive" downstream one doesn't have have a corresponding build, only the normalized drvs map to builds.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should drop support for 2) as

  1. I think that could severely impair the usefulness of this
  2. Supporting that doesn't really add any complexity

To expand a bit:

  1. The original use-case for the prototype was a derivation (call it d0) deep into a derivation tree that had its inputs frequently changing, but its output rarely, and that was a dependency of a lot of other (expensive) things. We knew that d0 was totally deterministic, so we could easily make it CA (which saved a lot of building), but we couldn't make much assumptions about its referrers, so we couldn't make them CA.

    I guess that could be transposed to nixpkgs by replacing d0 by e.g. glibc or anything in stdenv (assuming that it's deterministic). Having CA derivations "poison" their referrers would mean that we would have to mark the whole of nixpkgs as CA which will probably not be possible.

  2. The way I see things, we only have 2 types of derivations

    1. CA derivations (that depend on zero or more CA derivations)
    2. Non-CA derivations (that depend on zero or more CA derivations)

    It just happens that the semantics of non-CA derivations that depend on zero CA derivations match the current semantics, but there's absolutely no extra-complexity involved compared to only supporting Non-CA derivations that don't depend on CA derivations).

    Another way to look at it is that being CA and depending on CA are two orthogonal features, so if we support CA-depending-on-CA and non-CA-depending-on-non-CA, we also support "for free" non-CA-depending-on-CA (and CA-depending-on-non-CA)

Copy link
Member

@7c6f434c 7c6f434c Dec 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the objection to making something CA?

Compressed manpages, JARs or any other such thing present in the output. Note that marking something as CA might require a careful study of either the build system or the output.

Also, of course, hard to debug issues with determinism could be a problem…

But if any dependency can be a CA derivation, and this propagates upwards to change the output paths of its referrers, then you don't have this property anymore.

Maybe specifically that should be cached, indeed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. What's the objection to making something CA?

The main reason I can think of is non-determinism − I know there are caching models that can handle that just fine, but implementing these are out-of-scope for this RFC

2. It does add complexity, and a runtime cost. Currently, for a non-CA derivation, you can look at the top-level .drv file to know its output paths. So for instance, to substitute a derivation, you don't have to look at the dependencies (and `SubstitutionGoal` doesn't). But if any dependency can be a CA derivation, and this propagates upwards to change the output paths of its referrers, then you don't have this property anymore. This would also mean that `Store::buildDerivation()` would no longer work because it doesn't have access to the input derivations.

This doesn't have to be true I think (although I'm not 100% sure wrt #62 (comment)):

The current behavior (in the prototype I mean) is that the remote cache knows about aliases path, so if /nix/store/abc-foo is an alias for /nix/store/def-foo, the cache will happily serve a narinfo for abc-foo containing aliasOf: def-foo, and SubstitutionGoal will just add def-foo to its dependent goals as if it were another dependency. The same thing should also hold for Store::buildDerivation() unless I miss something.

There's indeed a slight runtime overhead (because we have to fetch an extra narinfo for the alias), but it seems very minor (and marking everything as CA wouldn't make this any faster)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I can think of is non-determinism

Right, so that means that the output path of a non-CA derivation can differ between builds. But that's already the case since the CA dependencies of a non-CA derivation can change. So you can't use things like binary caches anyway.

Copy link
Member

@Ericson2314 Ericson2314 Dec 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to be able to make packages content-addressable one by one independently. If zlib doesn't have a self-reference, for example, then I want to make it so without having to also convert all its reverse dependencies. That would be a disaster.

The binary cache interaction doesn't seem too bad? It's basically "You asked for drv0, I map drv0 to drv1 under substitutions (a to a_hash, b to b_hash, ....z to z_hash). I have a build for drv1." The client can see if any of those content-hash assignments clash with any of its own, and take the drv1 build and the hash assignments if it is OK with them. As long as we keep track of all hash assignments, and stop doing whatever we are doing if we notice a conflict, things won't go wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I can think of is non-determinism

Right, so that means that the output path of a non-CA derivation can differ between builds.

I am not sure I follow: if all CA derivations are carefully checked to be deterministic, then the output paths of their reverse-dependencies won't change.


# Unresolved questions

[unresolved]: #unresolved-questions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that we should ensure is that CA derivations can be built by untrusted users without the input drvs being available. This is currently not the case for non-CA derivations: Store::buildDerivation(BasicDerivation) can only be used by trusted users, which is annoying for Hydra builders.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A major open issue is garbage collection of the aliases table. It's not clear when entries should be deleted. The paths in the domain are fake so we can't use them for expiration. The paths in the codomain could be used (i.e. if a path is GC'ed, we delete the alias entries that map to it) but it's not clear whether that's desirable since you may want to bring back the path via substitution in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unprivileged builds is more a future work than an unresolved issue (since it's just something that we can build on top of that). I very quickly mention it (Investigate the consequences in term of privileges requirements), but if you think it's worth expanding, I can add more on that topic

it's not clear whether that's desirable since you may want to bring back the path via substitution in the future

What do you mean by that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned it because we need to keep it in mind to avoid a design of derivations that makes unprivileged builds impossible (i.e. avoid the mistake of hashDerivationModulo).

What do you mean by that?

If you garbage-collect the output of a CA derivation, and then build the CA derivation again, you could avoid building it by fetching the output from a binary cache, if you remembered what the CA output paths were.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could avoid building it by fetching the output from a binary cache, if you remembered what the CA output paths were.

Just to stress: this knowledge would allow fetching even when the binary cache doesn't know of the derivation I am trying to build but the output happens to be the same

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this mistake of hashDerivationModulo?

@edolstra

This comment has been minimized.

Makes it more in line with other "magic" attributes like
`__structuredAttributes`

Also fix the orthograph
@edolstra edolstra changed the title [RFC 0060] Content-addressed paths [RFC 0062] Content-addressed paths Dec 12, 2019
@Mic92 Mic92 added status: open for nominations Open for shepherding team nominations and removed status: new labels Dec 12, 2019
@edolstra
Copy link
Member

I'd like to nominate myself for the RFC shepherd team.

particular, the caching model needs some modifications (see [caching]);

- We specify that only a sub-category of derivations can safely be marked as
`contentAddressed`, but there's no way to enforce these restricitions;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bigger problem than it might look like, as it means that trivial updates can break the CA marking for reasons not worth mentioning in the upstream changelog.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely :)

Maybe that could be clearly stated, but the original scope of this work was to be able to mark very specific derivations that were clearly guaranteed to be deterministic, in which case the problem was less important

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the question «why not just propagate CA» shows that writing more is a good idea.

I do think that stressing the limitations in a few key places is also a nice thing to do (people should be able to apply RFC as passed, not what was intended and not what was discussed, after all… we should not treat ourselves worse than we treat computers!)

@7c6f434c
Copy link
Member

What is the worst-case impact of improper CA marking?

Nondeterminism can apparently break the use of the binary cache for all downstream packages. A CPU-dependent nondeterminism could even do it in an annoying hard to debug way.

Compressed self-references can probably break an unpredictable subset of functionality, but this is a general problem for careless maintenance of a fragile package.


# Unresolved questions

[unresolved]: #unresolved-questions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have functionality that allows to build a CA package twice with different apparent output paths, and optionally with different parallelism settings? The build of the package obviously fails if the CA unification doesn't lead to the same result.

Should we mandate that Hydra uses this functionality? Should it be on by default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per https://github.com/NixOS/rfcs/pull/62/files#r357243841 I think we can deal with non-deterministic derivation just fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, for binary cache transparency it is much better if you can build something locally, then regain connectivity and fetch stuff from a cache, then fetch stuff from a different cache, then build some more locally, etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in my mind that's equally risky with and without content addressable derivations. The only difference is one lets you know if something goes wrong, and one doesn't.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Build nondeterminism doesn't introduce significant behaviour changes, so as long as the expectations are not broken (yeah, we install you into this output path and your dependencies into those paths, and that is not going to change), it will be mostly usable. There are a few CPU-dependent optimisations from time to time, they are annoying.

With CA things are actually moved around, so even though everything would still work when assembled together, the assembling part will be failing. It is Nix, not the code that is built by Nix, that would fail to do things because of nondeterminism.

Copy link
Member

@Ericson2314 Ericson2314 Dec 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think trying to keep going despite non-determinism incoherence is a misfeature. You can always evict your own CA mappings (can keep the builds themselves for easy "rollback") and align with cache.reflex-frp.org and keep going.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if things are marked CA, of course it is a good idea to catch failures. But what you propose will not catch much, because a typical derivation is only built once (ever) by Hydra, later Hydra will use the binary cache. Also my proposal includes feeding different «apparent» output paths to the same build with the same dependencies, which has a better chance of discovering compressed self-references.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am OK with running --check and trying different termporary output paths. Catching non-determinism I don't think is important, because it's really clashes that we care about. However, catching self-references is important as we have to be able to move the thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, varying the output paths is something that doesn't follow from anything Nix does, so it has to be spelled explicitly.

thesis), but for the sake of simplicity, this RFC simply forbids to mark a
derivation as ca if its build is not deterministic (although there's no real
way to check that so it's up to the author of the derivation to ensure that it
is the case).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can skip this. If we track all the evaluation steps, we have all the information to ensure a binary cache isn't given anything that clashes with ourself. Maybe the first prototype will discover these errrors lazily, but it should discover them

Ericson2314 added a commit to Ericson2314/nix-rfcs that referenced this pull request Dec 12, 2019
Ericson2314 added a commit to Ericson2314/nix-rfcs that referenced this pull request Dec 12, 2019
@Ericson2314
Copy link
Member

Ericson2314 commented Dec 12, 2019

I fixed up my formalization in https://github.com/Ericson2314/nix-rfcs/blob/ret-cont/rfcs/0000-ret-cont-recursive-nix.md#detailed-design such that I think it would be really easy to use here. Essentially, we can have as many immediate-reduction-* rules as we like, so we would have something like:

drv : Drv
drv.outputs = { "out" = ...; }
build : StorePath
drv ? __contentAddressable == true
assoc(drv0, "out", build)
-------------------------------------------------- immediate-reduction-content-addressable
c(drv0, stubDrv(hash(build)))

This is extremely similar to the fixed-output rule, except the hash comes from the build instead of the fixed output. The downstream rewriting of derivations is exactly the same, and doesn't require those to be content addressed any more than fixed outputs require their downstream derivations to be fixed output.

Note I am assuming a single output. If we have multiple outputs, this is fine, but we'll have to change reducesTo so that different outputs of the same derivation can reduce in different ways.

@nomeata
Copy link

nomeata commented Dec 12, 2019

The rendered link is 404?

@7c6f434c
Copy link
Member

@nomeata it was left behind when updating the RFC number.

@thufschmitt
Copy link
Member Author

@nomeata thanks, I've updated the description with the correct one

@fogti
Copy link
Contributor

fogti commented Aug 15, 2021

probably relevant: NixOS/nixpkgs#133470


If the store knows about the given derivation output, it will return the associated realisation, otherwise it will return `None`.

- The substitution loop in Nix fist calls this method to ask the remote for the

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- The substitution loop in Nix fist calls this method to ask the remote for the
- The substitution loop in Nix first calls this method to ask the remote for the

@Ericson2314
Copy link
Member

When we have a meeting on this? I don't think we have an RFC meeting yet.

I suppose the first problem is we don't know what the life cycle for experimental features ought to be? Maybe we need a separate RFC on that?

My gut instinct is RFC is for stabilization, but before we can stabilize CA derivations I think we need:

  1. Improve UX with CA derivations hydra#875 Hydra support
  2. Plan to Build CA version of nixpkgs with hydra.nixos.org? (Do we have the budget?)
  3. Decide how much of Nixpkgs needs to build successfully before we are confident in the feature.
  4. (Opt) decide how long after acceptance Nixpkgs should be switched over.

In particular, I want to tie making this not experimental to "it's working at scale and only a matter of time before Nixpkgs uses it". I don't want this to be like structured attrs where we stabilize it and then have yet to make the switch.

I am not actually that worried there is some terrible "mistake" we made that will make this unsuitable for Nixpkgs, but rather I think this is a good opportunity to get Nixpkgs and Nix contributors planning something together, because right now the contributors are shockingly disjoint.

@Ericson2314
Copy link
Member

I have made channel #nix-rfc-62:matrix.org so the shepherds can shepherd the shepherds :). (I am not sure whether these are supposed to be public or private, but @tomberek made 106's public, so I am following that.)

Simplify the paperwork and just get this to FCP because right now it’s stuck in a hole
@edolstra
Copy link
Member

We just had a call on this RFC and decided to go to FCP, with the text of the RFC reflecting that this is an experimental feature for now. There was some discussion about whether there should be another RFC to make it a stable feature, but this is probably best addressed in an RFC about the process for experimental features.

@edolstra edolstra added status: FCP in Final Comment Period and removed status: in discussion labels Dec 10, 2021
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/rfc-0062-fcp-content-addressed-paths/16531/1

@edolstra edolstra merged commit bcdca7c into NixOS:master Jan 12, 2022
edolstra added a commit that referenced this pull request Jan 12, 2022
* Initial draft "ret-cont" recursive Nix

* Fix typos and finish trailing sentance

* Switch to advocating temp store rather than daemon socket

Also:

 - Allow fixed output builds (in that temp store)

 - Clean up drawbacks and alternatives.

* ret-cont-recursive-nix: Fix typo

Thanks @siddharthist!

Co-Authored-By: Ericson2314 <git@JohnEricson.me>

* ret-cont-recursive-nix: Fix typo

Thanks @Mic92

Co-Authored-By: Ericson2314 <git@JohnEricson.me>

* ret-cont-recursive-nix: Fix typo

Thanks @globin

Co-Authored-By: Ericson2314 <git@JohnEricson.me>

* ret-cont-recursive-nix: Fix typo

Thanks @siddharthist!

Co-Authored-By: Ericson2314 <git@JohnEricson.me>

* ret-cont-recursive-nix: Clean up motivation, adding examples

* ret-cont-recursive-nix: Improve syntax highlighting

* Do a lousy job formalizing the detailed design

Break off some previously inline observations into their own subsection.

* ret-cont-recursive-nix: Mention `builtins.exec` in alternatives

* ret-cont-recursive-nix: Fix typo

Thanks @Mic92!

Co-Authored-By: Ericson2314 <git@JohnEricson.me>

* ret-cont-recursive-nix: Remove dangling "$o"

The later examples with `nix-instantiate` automatically get all the outputs of the final rewritten drv, so there's no output iteration needed. `"$o"` was mistakenly copied over from the earlier examples.

Thanks to @ocharles for asking me the question that led me to this. Hopefully this change answers that?

* Update rfcs/0000-ret-cont-recursive-nix.md

Co-Authored-By: Domen Kožar <domen@enlambda.com>

* ret-cont-recursive: Fix typo

about -> out

* ret-cont: Add examples and expand future work

* ret-cont: Fix syntax error

No `let`, so don't need `in`.

* ret-cont: Mention Ninja's upcomming `dyndep` and C++ oppertunity

* ret-cont: Fix missing explicit `outputs` and `__recursive`

This was in the "wrapper" derivation example.

* ret-cont: "caching builds" -> "caching evaluation"

We already cache builds just fine, thanks!

Thanks @globin for catching

* ret-cont: Improve formalism and reference #62

* drv-build-drv: Start drafting from old ret-cont-recursive-nix RFC

* drv-buiild-drv: WIP rewrite

* plan-dynamism: Rewrite RFC yet again

* plan-dynamism: Rename file accordingly

* plan-dynanism: Fix typo

Thanks @mweinelt.

* plan-dynanism: Fix formalism slightly

* Apply suggestions from code review

Thanks!

Co-authored-by: Rosario Pulella <Rosariopulella@gmail.com>
Co-authored-by: Ninjatrappeur <NinjaTrappeur@users.noreply.github.com>

* plan-dynamism: `Buildables` -> `DerivedPathsWithHints`

Thanks @sternenseemann for catching.

* plan-dynamism: Add semantics and examples for `!` syntax

* plan-dynamism: Too many dashes in `--derivation`

* plan-dynanism: Put pupose of text hashing before name

* Apply suggestions from code review

Co-authored-by: Ollie Charles <ollie@ocharles.org.uk>

* Apply suggestions from code review

Co-authored-by: Ollie Charles <ollie@ocharles.org.uk>

* Apply suggestions from code review

* Update rfcs/0000-plan-dynanism.md

* plan-dynanism: Fix bad sentence

Thanks @roberth!

* plan-dynamism: Number the two parts

* plan-dynamism: Rip out part 2

There is more to do I'm sure, but I wanted to get the ball rolling.

* plan-dynamism: New motivation

* plan-dynamism: Fix typo

Thanks @L-as

* TEMP PLES AMEND

* [RFC 0092] Rename file

* [RFC 0092] Fix YAML header

* [RFC 0092] Rewrite summary

* [RFC 0092] Add link to documentation

* [RFC 0092] Rewrite example section

* [RFC 0092] Small fix

* [RFC 0092] Rewrite drawbacks and alternatives

* [RFC 0092] Improve alternatives section

* [RFC 0092] Fix syntax error

* [RFC 0092] Small change

* [RFC 0092] Remove unnecessary file

* [RFC 0092] Add comment about IFD

* [RFC 0092] Fix typo

* Update rfcs/0092-plan-dynamism.md

Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>

* plan-dynamism-experiment: Make clear is experimental

* plan-dynamism-experiment: Fix typo

Thanks @L-as

Co-authored-by: Langston Barrett <langston.barrett@gmail.com>
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
Co-authored-by: Robin Gloster <mail@glob.in>
Co-authored-by: Domen Kožar <domen@enlambda.com>
Co-authored-by: Rosario Pulella <Rosariopulella@gmail.com>
Co-authored-by: Ninjatrappeur <NinjaTrappeur@users.noreply.github.com>
Co-authored-by: Ollie Charles <ollie@ocharles.org.uk>
Co-authored-by: Las Safin <me@las.rs>
Co-authored-by: Eelco Dolstra <edolstra@gmail.com>
@edolstra
Copy link
Member

Thanks everybody, RFC 62 is merged!

@edolstra edolstra added status: accepted and removed status: FCP in Final Comment Period labels Jan 12, 2022
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/tweag-nix-dev-update-2/6808/4

KAction pushed a commit to KAction/rfcs that referenced this pull request Apr 13, 2024
* CAP RFC: First draft

* typo

* Apply @grahamc's suggestions

Co-Authored-By: Graham Christensen <graham@grahamc.com>

* nix code -> Nix expression

* Break-up the big introduction paragraph

As suggested in https://github.com/NixOS/rfcs/pull/62/files#r356694585

* Rename to match the PR number

* Rename the drv attribute to __contentAddressed

Makes it more in line with other "magic" attributes like
`__structuredAttributes`

Also fix the orthograph

* Mention the GC issue

* Remove the ambiguity on what an `output` is

* Replace aliases paths by a pathOf mapping

* Move the example after the design description

* Rephrase the design

In particular, replace `static` and `dynamic` by `symbolic` and
`resolved`

* Add shepherd team

* Rewrite the RFC to account for the RFC meeting comments

- Add the notion of `drvOutputId`
- Replace the alias paths by a `PathOf(drvOutputId)` function
- Mention the notion of "truster" in the `PathOf` function

* Add a section about leaking output paths

* Refine the design summary

* Rename dependency-addressed into input-addressed

And define the term at the begining of the RFC

* minor fixup after comments

* Apply suggestions from code review

Co-authored-by: asymmetric <lorenzo@mailbox.org>
Co-authored-by: Profpatsch <philip.patsch@tweag.io>

* Update rfcs/0062-content-addressed-paths.md

* Update the terminology to match the in the implementation

* Reword the detailed design presentation

* Quote some strings in the yaml frontmatter

* Add a design paragraph about the remote caching

And update the “unresolved questions” to take it into account

* Lift the determinism requirement

It's not required anymore by the current remote caching semantics

* Typo

Co-authored-by: davidak <davidak@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: davidak <davidak@users.noreply.github.com>

* Rewrite the RFC

This is mostly a full-rewrite of the RFC to

1. Make it more “incremental”: A first part just describes the minimal
   model upon which everything is based, and a second one shows
   different extensions of this model to add more features
2. Remove the big ugly examples that don’t add much value because they
   aren’t really readable
3. Add a python pseudo-code pseudo-formalisation of the RFC.
  This is imho both more readable and precise than nested bullet-points
  of handwaved language

* Make the python samples a bit more pythonic

Co-authored-by: zseri <zseri.devel@ytrizja.de>

* Explicit that unresolved dependencies are eval-time

* Prettify

* Make the end-goal an experiment

Simplify the paperwork and just get this to FCP because right now it’s stuck in a hole

Co-authored-by: Théophane Hufschmitt <regnat@users.noreply.github.com>
Co-authored-by: Graham Christensen <graham@grahamc.com>
Co-authored-by: John Ericson <John.Ericson@Obsidian.Systems>
Co-authored-by: asymmetric <lorenzo@mailbox.org>
Co-authored-by: Profpatsch <philip.patsch@tweag.io>
Co-authored-by: Jörg Thalheim <Mic92@users.noreply.github.com>
Co-authored-by: Eelco Dolstra <edolstra@gmail.com>
Co-authored-by: davidak <davidak@users.noreply.github.com>
Co-authored-by: zseri <zseri.devel@ytrizja.de>
KAction pushed a commit to KAction/rfcs that referenced this pull request Apr 13, 2024
* Initial draft "ret-cont" recursive Nix

* Fix typos and finish trailing sentance

* Switch to advocating temp store rather than daemon socket

Also:

 - Allow fixed output builds (in that temp store)

 - Clean up drawbacks and alternatives.

* ret-cont-recursive-nix: Fix typo

Thanks @siddharthist!

Co-Authored-By: Ericson2314 <git@JohnEricson.me>

* ret-cont-recursive-nix: Fix typo

Thanks @Mic92

Co-Authored-By: Ericson2314 <git@JohnEricson.me>

* ret-cont-recursive-nix: Fix typo

Thanks @globin

Co-Authored-By: Ericson2314 <git@JohnEricson.me>

* ret-cont-recursive-nix: Fix typo

Thanks @siddharthist!

Co-Authored-By: Ericson2314 <git@JohnEricson.me>

* ret-cont-recursive-nix: Clean up motivation, adding examples

* ret-cont-recursive-nix: Improve syntax highlighting

* Do a lousy job formalizing the detailed design

Break off some previously inline observations into their own subsection.

* ret-cont-recursive-nix: Mention `builtins.exec` in alternatives

* ret-cont-recursive-nix: Fix typo

Thanks @Mic92!

Co-Authored-By: Ericson2314 <git@JohnEricson.me>

* ret-cont-recursive-nix: Remove dangling "$o"

The later examples with `nix-instantiate` automatically get all the outputs of the final rewritten drv, so there's no output iteration needed. `"$o"` was mistakenly copied over from the earlier examples.

Thanks to @ocharles for asking me the question that led me to this. Hopefully this change answers that?

* Update rfcs/0000-ret-cont-recursive-nix.md

Co-Authored-By: Domen Kožar <domen@enlambda.com>

* ret-cont-recursive: Fix typo

about -> out

* ret-cont: Add examples and expand future work

* ret-cont: Fix syntax error

No `let`, so don't need `in`.

* ret-cont: Mention Ninja's upcomming `dyndep` and C++ oppertunity

* ret-cont: Fix missing explicit `outputs` and `__recursive`

This was in the "wrapper" derivation example.

* ret-cont: "caching builds" -> "caching evaluation"

We already cache builds just fine, thanks!

Thanks @globin for catching

* ret-cont: Improve formalism and reference NixOS#62

* drv-build-drv: Start drafting from old ret-cont-recursive-nix RFC

* drv-buiild-drv: WIP rewrite

* plan-dynamism: Rewrite RFC yet again

* plan-dynamism: Rename file accordingly

* plan-dynanism: Fix typo

Thanks @mweinelt.

* plan-dynanism: Fix formalism slightly

* Apply suggestions from code review

Thanks!

Co-authored-by: Rosario Pulella <Rosariopulella@gmail.com>
Co-authored-by: Ninjatrappeur <NinjaTrappeur@users.noreply.github.com>

* plan-dynamism: `Buildables` -> `DerivedPathsWithHints`

Thanks @sternenseemann for catching.

* plan-dynamism: Add semantics and examples for `!` syntax

* plan-dynamism: Too many dashes in `--derivation`

* plan-dynanism: Put pupose of text hashing before name

* Apply suggestions from code review

Co-authored-by: Ollie Charles <ollie@ocharles.org.uk>

* Apply suggestions from code review

Co-authored-by: Ollie Charles <ollie@ocharles.org.uk>

* Apply suggestions from code review

* Update rfcs/0000-plan-dynanism.md

* plan-dynanism: Fix bad sentence

Thanks @roberth!

* plan-dynamism: Number the two parts

* plan-dynamism: Rip out part 2

There is more to do I'm sure, but I wanted to get the ball rolling.

* plan-dynamism: New motivation

* plan-dynamism: Fix typo

Thanks @L-as

* TEMP PLES AMEND

* [RFC 0092] Rename file

* [RFC 0092] Fix YAML header

* [RFC 0092] Rewrite summary

* [RFC 0092] Add link to documentation

* [RFC 0092] Rewrite example section

* [RFC 0092] Small fix

* [RFC 0092] Rewrite drawbacks and alternatives

* [RFC 0092] Improve alternatives section

* [RFC 0092] Fix syntax error

* [R