---
feature: overriding
start-date: 2021-03-18
author: Yann Hamdaoui
---

# Overriding

This document is a proposal for an overriding mechanism in Nickel. It is expected to evolve as we put these ideas into practice, but it shall serve as a design and implementation baseline.

Related issues: #103, #240, #255, #279.

## Context

### Definition of the problem

Overriding is the ability to take an existing configuration - in our case, a record - and update a specific part of it, such as a field or the field of a nested record.

A naive and unergonomic way of doing it is to repeat all the original values in a new record:

```nickel
let record = {
  a = 1,
  b = "str",
  c = false
} in
{
  a = record.a,
  b = record.b,
  c = true
}
```

This quickly becomes unmanageable for large records, which are common in Nix and Nixpkgs. In functional languages, this is usually better done using what's called functional update. Functional update is an operation that takes the original record, one or several fields to update and the corresponding new values, and returns a new, updated record. It has the same semantics as our first snippet, but doesn't require rewriting all the unchanged fields.

It can have a builtin syntax, such as OCaml's with ({record with field = new_value}), Haskell's record update (record {field = newValue}), Nix's // operator (record // {field = newValue}), or Rust's struct update syntax (RecordDataType {field: new_value, ..record}). There are more advanced programming techniques that make updating deeply nested records ergonomic, such as lenses in Haskell, but these rely too heavily on advanced language and typing features to be practical in Nickel.

In Nickel, we can already do a functional update using merging, although the updated field must have been marked as default:

```nickel
let record = {
  a = 1,
  b = "str",
  c | default = false,
} in
record & {c = true}
```

As explained in the next section though, this is not satisfying.

### Overriding recursive records

Nickel's records are different from those of OCaml, Haskell or Rust. They are lazy and recursive by default. They are thus better understood as codata rather than data. Take the following example:

```nickel
let record = {
  protocol | default = `Http,
  port | default = protocol |> match {
    `Http => 80,
    `Ftp => 21,
    _ => 8181,
  },
} in
record & { protocol = `Ftp }
```

Here, our final record will be {protocol = `Ftp, port = 80}. port is an expression depending on protocol, but it is not updated in the new version, which goes against what one could intuitively expect from a recursive record. Such recursive updating is a pattern actively used in Nix, and is probably a common scenario for configurations in general: if I patch a base configuration by changing an option, I usually want the other values depending on this option to be updated as well.
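
For concreteness, here is a sketch of the behavior we would like to obtain instead (the desired semantics proposed in this document, not current Nickel behavior):

```nickel
# Hypothetical, desired behavior: the recursive field `port` is recomputed
# against the overridden value of `protocol`.
record & { protocol = `Ftp }
# => { protocol = `Ftp, port = 21 }
```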

## Comparing override mechanisms

We are going to review various overriding mechanisms of Nix and other related languages. Let us sketch some general traits of overriding mechanisms as a framework for comparison:

  • (ERG): Ergonomics. A mechanism is ergonomic if it avoids complex encodings for overridable records, if it doesn't require extra library functions, and so on. It would ideally work out of the box with the native records of the language.
  • (NEST): Nested field overriding. The ease of overriding nested fields, such as the value of bar in {foo = {bar = "baz"}}. Ideally, overriding nested fields is no harder than top-level fields.
  • (COMP): Composability. Overriding is composable when one can seamlessly apply several overrides to an initial record.
  • (EXPR): Expressivity. An expressive mechanism is one that allows expressing and doing more things. For example, it can be the ability to drop some fields, to access the previous version of the data (super), and so on.

### Overriding in Nix

Nix expressions don't have a built-in way of overriding recursive records while automatically updating the dependent recursive fields either. Since the feature is still actively needed, it is provided by mechanisms implemented in library code.

Keeping a language lean is usually a good design guideline: to provide an expressive yet simple set of features upon which others can be built as libraries. However, in the case of Nix, overriding is a fundamental pattern, and having it implemented in user code leads to some general well-known issues:

  • (ERG) Definition, update or field access is not as ergonomic as with vanilla native records.
  • Several competing mechanisms have been implemented in slightly different contexts. Overriding in general is already hard for beginners, but having to learn several mechanisms, all similar but still different, is even harder.
  • User-land implementations don't have first-class access to information such as source locations and metavalues. In a dynamic language, this can make good error reporting much harder.
  • It is potentially harder to make user-land implementations efficient.

See this gist for more details. We continue with an overview of existing mechanisms in Nix and related languages.

#### Nixpkgs overriding

The basic idea is to represent a recursive record explicitly as a function from a self record to the final record (here, in Nix syntax):

```nix
r = rec {
  a = 1;
  b = a + 1;
}
# r is represented as:
rRepr = self: {
  a = 1;
  b = self.a + 1;
}
```

self is a self-reference, akin to this in object-oriented languages. It is computed as a fixpoint, simply realized by auto-application in Nix, thanks to laziness:

```nix
r = rRepr r
```

The explicit dependency on self gives the ability to provide a different definition for self.a in the final value. Overriding is achieved exactly by functionally updating the a field of the self parameter before passing it to the original representation:

```nix
let extension = {a = 2;}; in
# The fixpoint of result is { a = 2; b = 3; }
resultRepr = self: (rRepr (self // extension)) // extension
```

The second outer update ensures that the final result is also set to a = 2, and not only the a appearing in b.

Some details are left out, but this is the gist of it. See also the Nix pill on overriding or this article on fixpoints in Nix.

##### Limits

  • (ERG) Uses a specific representation, rather than handling good old plain records (although the actual representation in Nixpkgs is more ergonomic than a plain function).

  • (NEST) Overriding nested attribute sets is painful. If one does the naive thing, the whole subrecord is erased:

    ```nix
    let rRepr = self: {
      a = {b = self.a.c;};
    }; in
    let extension = {a = {c = 2;};}; in
    rExt = let fixpoint = rRepr (fixpoint // extension); in
    fixpoint // extension
    # Gives {a = {c = 2;};} instead of expected {a = {b = 2; c = 2;};}
    ```

#### Nixpkgs overlays

Overlays can be seen as a sequence of transformations from a base record, each layer having access to a super reference to the previous layer and the self reference to the final value.

Take two consecutive transformations:

  1. Set the a field to 1
  2. Set the b field to 1, add 1 to the a field

```nix
let baseRepr = self: {c = self.a + self.b;}; in
let overlay1 = self: super: {a = 1;}; in
let overlay2 = self: super: {b = 1; a = super.a + 1;}; in
let applyOverlays = self:
  let base = baseRepr self; in
  let first = base // overlay1 self base; in
  let second = first // overlay2 self first; in
  second; in
let fixpoint = applyOverlays fixpoint; in fixpoint
```

In practice, the super // .. and fixpoint parts can be factored out into dedicated helper functions.

##### Advantages

  • (EXPR): The explicit representation of layers by super and self gives the user extensive control.
  • (COMP): Composition is first-class.

##### Limits

  • (ERG) Users still need to manipulate representations.

  • order-dependency: The result is order-dependent. Applying both self: super: {a = super.a + 1} and self: super: {a = super.a / 2} can give two different results depending on which one is the first layer. This also means that overrides must be grouped by layers, which is not necessarily the most logical structure.

  • (NEST) Overriding nested fields is still clumsy. For example, to override lib.firefoxVersion:

    ```nix
    self: super: { lib = (super.lib or {}) // { firefoxVersion = ...; }; }
    ```

#### NixOS module system

The NixOS module system takes a different approach. There, all the basic blocks — the modules — are merged together in an unspecified order. What decides the priority of one option over another are attributes explicitly stated in the modules themselves. The declaration of options (type, priority, etc.) and their assignment must be made separately.

Compared to overlays, the explicit reference to super has disappeared. This makes it look like modules can only override one value with another, instead of combining different pieces of data together.

This apparent shortcoming is solved by custom merge functions, which can redefine how to combine the different values of a field of a configuration. By default, when merging two list values, the module system only knows how to replace one value with the other, because there's no canonical and commutative way of merging two lists. However, the user can specify that these lists should in fact be concatenated, resulting in the possibility of defining a list of paths piece by piece:

```nix
# Some module
{
  paths = mkOption {
    type = types.listOf types.path;
  };
}

# Config
[..]
someModule.paths = [foo/bar];

# Other config
[..]
someModule.paths = [/bar/baz];

# Resulting config: [foo/bar /bar/baz]
```

##### Advantages

  • order-independence: as opposed to overlays, modules can be merged in any order, and the final result will be the same.

##### Limits

  • (EXPR): Despite custom merge functions, this model is less powerful than overlays, which have access to an explicit reference to super. Modules, on the other hand, are all merged at the same level in an unspecified order.

    For example, the following has no equivalent in a merging model:

    ```nix
    let overlay = self: super: {b = self.c + 1; a = super.c + 1;}; in #...
    ```

However, modules are quite capable in practice. Removing fields and being order-dependent as in the overlay example above is not necessarily a good idea, and this limitation of expressivity may in fact be a good thing.

### Overriding elsewhere

#### CUE

CUE allows a form of late-binding for recursive attributes:

```console
$ cat test.cue
fields: {
 a: int
 b: a + 1
} & {
 a: 2
}
$ cue eval test.cue
fields: {
    a: 2
    b: 3
}
```

Combined with default values, this provides an overriding mechanism:

```console
$ cat test2.cue
fields: {
 a: int | *1
 b: a + 1
} & {
 a: 2
}
$ cue eval test2.cue
fields: {
    a: 2
    b: 3
}
```

The basics of this overriding mechanism are close in spirit to what we want to achieve in this proposal. This is similar to Nickel's current default behavior, but with the desired late-bound merging. The expressivity is however too limited to replace e.g. the NixOS module system.

##### Advantages

  • (ERG): Overriding works seamlessly with native records. Note however that it requires fields to be explicitly marked as overridable (default).

##### Limits

  • (EXPR): One can only replace a value with another one. It is not possible, for example, to add one to the previous value.
  • (COMP): Overriding a record a second time is not possible.

#### Jsonnet

Jsonnet refers to its overriding mechanism as inheritance, implemented by the + operator on objects:

```jsonnet
local obj = {
  name: "Alice",
  greeting: "Hello, " + self.name,
};
[
  obj,
  obj + { name: "Bob" },
  obj + { greeting: super.greeting + "!"},
  obj + { name: "Bob", greeting: super.greeting + "!"},
]
```

This is similar to the Nix operator //, but it does recursive overriding in the expected way out of the box. The extension can access the previous version in the same way as Nixpkgs overlays, using the super keyword.

##### Advantages

  • (ERG): Integrated with native records.
  • (COMP): Overriding is easily iterated.
  • (EXPR): Achieves the same level of expressivity as overlays, thanks to the special keywords self and super.

##### Limits

  • order-dependency: As with overlays, the order of application of overrides matters.

### Taking a step back

While not exactly the same, all these overriding mechanisms are based on the same underlying principles:

  1. Represent recursive records (explicitly or implicitly) as a function of self (and super in some cases)
  2. Compute the combination as a fixpoint of several recursive records, giving back a standard record.

As hinted by Jsonnet's terminology, the semantics of recursive records strikingly resembles the semantics of objects and classes in OOP: replace records with objects, fields with methods, and overriding with inheritance. This is not so surprising: there's actually a history of encoding objects as recursive records in functional languages, going back to 1988 [Cardelli] (see the introduction of The Recursive Record Semantics of Objects Revisited for a good overview).

This is also mentioned in the README of POP (an object system in Nix), where the author observes that overriding mechanisms in Nix (and Jsonnet, for that matter) are a simplified lazy object system (simplified because objects lack proper state and there is no distinction between classes and instances). Their logical conclusion is to embrace this fact and design a proper object system helped by the existing literature, rather than reinventing the wheel. Similarly, Nix overlays can be seen as a single-inheritance mechanism.

Inheritance-based overriding imposes an order on the overrides. A single level of inheritance is usually fine, but a complex hierarchy can become hard to maintain and to reason about.

The NixOS module system is designed differently. It is based on merging: the configuration is created by combining a set of unordered records following specific rules. Of course, there's still a need for ordering information somewhere, but it is rather expressed as priorities. This system has the advantage of making merging commutative (in contrast with inheritance or the // operator), as in CUE, and of untying data definition from precedence specification: one can define a module where each field has a different priority, if it makes sense to group them logically. With inheritance, values are required to be grouped in layers of the same priority, instead of logically.

Summing up the differences between inheritance-based and merge-based mechanisms:

  • order-dependency: Inheritance is order-dependent: the chain of extensions must be defined with the precise order they will be applied in mind, and definitions must be grouped accordingly. On the other hand, merging is commutative, and the precedence information is encoded as priorities. Thus, overriding by merging can be defined using stand-alone pieces of data, although the behavior of priorities is not local.

  • (EXPR): Merged records only have access to the final computed fixpoint self, while objects have access to the previous stage of extension via super. However, as in the NixOS module system, it is possible to address this issue using custom merge functions. This is a bit less expressive, but in a good way: it forces the merge strategy to be uniform along each field, while mechanisms like overlays or inheritance can do pretty much anything.

## Proposal overview

We basically propose to adopt the same kind of merge, priority and custom merge function-based scheme as the NixOS module system. One big difference with the NixOS module system is that it would be built into the language, usable for any native record whatsoever, and well integrated with the rest of the language features. The implementation would have first-class access to locations, the AST, the memory layout, the metadata, and so on.

We first describe the ideas at a high level. Then, we review the issues and challenges of this approach, and propose concrete solutions to overcome or mitigate them. Finally, the precise operational semantics is laid out in a dedicated section.

### Recursive records & merging

This section defines the semantics, rather than an actual efficient implementation.

As before, it is useful to see recursive records represented as functions - or constructors - that take a self parameter and return a final non-recursive record, in the same way as the original overriding mechanism of Nixpkgs (the made-up syntax def := value is used to emphasize that we are defining new objects):

```
r = {
  a = 1;
  b = a + 1;
}
# Definition of the representation of r
repr(r) := fun self => {
  a = 1;
  b = self.a + 1;
}
```

Field access amounts to computing a fixpoint:

```
repr(r).foo := let fix = repr(r) fix in fix.foo
```

Merging simply merges the underlying representations:

```
repr(r1) & repr(r2) := fun self => repr(r1) self & repr(r2) self
```
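
As an illustration, here is an informal derivation (in the same := notation) showing how the recursive field b of a merged record ends up seeing the overridden value of a:

```
r1 = { a | default = 1, b = a + 1 }
r2 = { a = 2 }

repr(r1) & repr(r2)
  := fun self => { a | default = 1, b = self.a + 1 } & { a = 2 }
  := fun self => { a = 2, b = self.a + 1 }

# Field access computes the fixpoint of the merged constructor:
(repr(r1) & repr(r2)).b := let fix = { a = 2, b = fix.a + 1 } in fix.b
# => 3
```

(The resolution of a to 2 relies on the priority system described below.)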

#### Priorities

Merge is fundamentally commutative. The problem is, not all fields should be treated the same: some are default values that ought to be overridden, some are high-priority values that ought to override. The solution is to use a priority system, encoding the precedence of each value while retaining commutativity (only the priority annotation counts, not the order in which values are merged).

Priorities have drawbacks. One is non-locality: the final result depends on the priorities of the other values a field is merged with, which are potentially defined elsewhere. However, this can be mitigated by good defaults and an adapted set of priorities, such as a bottom default, a top force, and an infinite range of integer priorities in between, the priority when none is provided being 0. Doing so, it's easy to just use default or force to override or be overridden by anything, without knowing any precise integer priority. Integer priorities still give freedom with an infinite supply of levels if required.
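
A minimal sketch of how these levels would interact (using the priority keyword introduced later in this document; the results shown are the intended semantics):

```nickel
{a | default = 1} & {a = 2}              # => {a = 2}: default is below 0
{a | priority 5 = 1} & {a = 2}           # => {a = 1}: 5 is above 0
{a | force = 1} & {a | priority 10 = 2}  # => {a = 1}: force beats any integer
```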

#### Custom merge functions

Custom merge functions would be specified by a merge metavalue attribute. In order to enforce commutativity, they would receive their arguments in a form independent of the merge order, such as having the type {lower: Dyn, higher: Dyn, priority: <Different, Equal>} -> Dyn. If both arguments have the same priority, their order is not specified, and may even be randomized by the interpreter. The priority field indicates when this is the case, if it needs special handling.

```nickel
let mergeLists :
  forall a. {lower: List a, higher: List a, priority: <Different, Equal>} -> List a
  = fun args => args.lower @ args.higher in

let Contract = {
  path | List Str
       | merge mergeLists
       | doc "A list of paths to search in."
} in

let block1 | #Contract = {
  path = ["/usr/local/bin"]
} in
let block2 = {
  path = ["/bin"]
} in

# { path = ["/usr/local/bin", "/bin"] }
block1 & block2
```

## Issues & challenges

The proposal raises some questions. While none seems insuperable, users will need to have a good mental picture of the system, and we should be careful to avoid ending up with a complicated system full of ad-hoc fixes to edge cases.

### Implementation

Recomputing the fixpoint at each field access is wasteful, because recursive records as functions satisfy the following property: as long as the record is not merged, the self argument is constant. Thus, we can memoize the fixpoint and only invalidate self on a merge. Also, fields that do not depend on a recursive variable can be hoisted out of the function, using a representation like:

```
repr(r) := {
  a = 1;
  b = fun self => self.a + 1;
}
```

Concretely, each field of a record may either be a thunk as usual, in the case of non-recursive expressions like a, or a thunk together with the original expression in the recursive case. All record operations except merge operate on the thunk. A merge operation, on the other hand, restores the original expression.

There are more potential optimizations, but this first step should be a reasonable trade-off between implementation complexity and performance. See #103 for more details.
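
For instance, here is a sketch of the observable behavior that memoization must preserve (intended semantics; builtins.seq is used as a strictness primitive, as elsewhere in this document):

```nickel
let r = { a | default = 1, b = a + 1 } in
builtins.seq r.b       # force `b`: the fixpoint is memoized, `b`'s thunk holds 2
  ((r & { a = 5 }).b)  # the merge restores the original `self.a + 1` for `b`
# => 6, not the stale memoized 2
```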

### Scoping

Should a record be able to access a yet undefined field because it is expected to be provided by a subsequent merge? Two possible approaches:

  • dynamic scoping: records can reference fields that are not explicitly defined locally, such as:

    ```nickel
    {a = b} & {b = 1}
    ```

    Dynamic scoping has a number of issues, and is usually considered bad practice.

  • lexical scoping: as is currently the case, require self-referenced fields to be defined locally. Note that thanks to contracts, one can require the existence of a field without defining it. For example, we could write the previous example as:

    ```nickel
    {a = b, b | Num} & {b = 1}
    ```

    Explicitly stating the fields whose presence is assumed is also better practice in general.

This RFC proposes to adopt lexical scoping. We could have an even lighter syntax, such as {a = b, b} & {b = 1} for requiring the presence of a field b.
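
To summarize the proposed rule with expected results (hypothetical outputs):

```nickel
{a = b} & {b = 1}           # error: `b` is unbound under lexical scoping
{a = b, b | Num} & {b = 1}  # ok: `b` is declared locally; => {a = 1, b = 1}
```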

### Priorities

#### Levels

This RFC proposes to adopt the following ordered set of priorities:

  • default is the bottom element
  • integer priorities are in the middle
  • force is the top element

That is, Priorities := {default} \/ {n | n integer} \/ {force}, with default <= ... <= -1 <= 0 <= 1 <= ... <= force.

If not specified, the normal priority (the priority assigned by default, not to be confused with the default priority) is 0. This provides an infinite supply of priorities both below ({default} \/ {n | n < 0}) and above ({force} \/ {n | n > 0}).

Integer priorities are specified using the priority keyword. Defining more than one priority in the same meta-value is an error.

Example:

```nickel
{
  foo | Num
      | default = 1,

  bar | Str,
  # equivalent to `bar | Str | priority 0`

  baz.boo.bor | priority -4 = "value",

  final | force = `CantOverrideMe,
}
```

#### Recursive priorities

As noted in #240, configurations should be easily overridable, and the approach outlined so far can annoyingly end up requiring configurations to be written with either default or force everywhere.

This RFC proposes to add recursive (or "leafy", or "push down") priorities, as described in #279. We define the new meta-values default rec and force rec, whose semantics are defined as:

  • eval(expr | default rec): by case analysis on eval(expr):

    • {field1 = value1, .., fieldn = valuen} | annots: {field1 = (value1 | default rec), .., fieldn = (valuen | default rec)} | annots
    • v | annots if v is not a record: v | defaulted(annots), where defaulted(annots) is defined below.
  • defaulted(annots):

    • if annots contains the priority metavalue force, then defaulted(annots) := annots
    • otherwise, let annots' | prio be the decomposition of annots into a priority prio (possibly empty) and the other metavalues; then defaulted(annots) := annots' | default

That is, default rec recursively overwrites all the priorities of the leaves of a record with default, except for force, which is left untouched (metavalues are written in a liberal way, in that they can be empty).

force rec is defined similarly, except that it erases all priorities, even default ones. The names default rec/force rec are just suggestions.

Example:

```nickel
let neutralConf = {
  foo = 1,
  bar.baz = "stuff",
  bar.blorg = false,
}

let defaulted | default rec = neutralConf
# ^ Will evaluate to:
# {
#   foo | default = 1,
#   bar = {
#     baz | default = "stuff",
#     blorg | default = false,
#   },
# }
# This is different from `neutralConf | default`! The latter version
# would be overridden as a whole, as illustrated below.

defaulted & {bar.baz = "shapoinkl"}
# ^ Gives the expected:
# {
#   foo | default = 1,
#   bar = {
#     baz = "shapoinkl",
#     blorg | default = false,
#   },
# }
# While

(neutralConf | default) & {bar.baz = "shapoinkl"}
# ^ This gives only:
# {bar.baz = "shapoinkl"}
```
This way, an existing definition (arbitrarily complex: it could be the root of all Nixpkgs) can easily (and lazily) be turned into a fully overriding or overridable configuration. A possible extension is to have a user-provided function that is mapped over priorities: default rec/force rec would just be special cases of this.

### Custom merge functions 2

In the case of the NixOS module system, all the configuration is merged in one final phase that conceptually evaluates the merge AST.

In Nickel, merging can be done partially and observed by expressions. The natural semantics of custom merge functions should be to affect all subsequent merging, as for contract application. Indeed, we want to specify a contract or a custom merge function once and for all, not to repeat it at every field definition:

```nickel
let add = fun args => args.lower + args.higher in
# {a = 3}
{a | merge add = 1} & {a = 1} & {a = 1}
```

But what about {a = 1} & {a | merge add = 1} & {a = 2}? If a merge annotation only affects terms on the right-hand side, this breaks commutativity:

```nickel
let r1 = {a = 1} in
let r2 = {a = 1} in
let r3 = {a | merge add = 1} in

# {a = 2}
r1 & r2 & r3

# {a = 3}
r3 & r1 & r2
```

Possible solutions:

  1. Make n-ary merging behave differently from repeated binary merging, by applying the custom merge function to all arguments. This doesn't solve the commutativity problem per se: we would also have to decide that in ordinary merging, custom functions don't affect subsequent merges anymore. This raises yet other problems, one being finding a good distinctive syntax. More generally, it sounds confusing and ad hoc.

  2. What seems the right, but not trivial, solution: symmetrize custom merging and make it "distribute backward" as well. That is (simplifying by forgetting about priority and lower/higher arguments):

    ```
    val1 & val2 & (val3 | merge func)
    <=> (val1 | merge func) & val2 & val3
    <=> func (func val1 val2) val3
    // instead of the naive
    <=/=> func (val1 & val2) val3
    ```

This RFC proposes solution 2. This raises some difficulties:

  • What if val1 & val2 is in a thunk that happens to be forced in between? What should let x = val1 & val2 in x & (val3 | merge func) evaluate to, or let x = val1 & val2 in builtins.seq x (x & (val3 | merge func))?

    In some sense, we would like to have a call-by-name semantics rather than a call-by-need.

    • (a) In the case of record fields, overriding has to solve the same problem of remembering an original expression along an evaluated version. If we allow custom merge annotations only on record fields, it is possible to recover the original merge AST (abstract syntax tree) and interpret it with the custom merge function as wanted. This may prove a bit tricky to implement in practice, for the merge expression val1 & val2 may be obfuscated by some program transformations or evaluation artifacts, but doable.

    • (b) A more extreme take is to generalize this to any term: a merge expression would behave like a lazy datatype Merge(t1,t2) with respect to evaluation. Evaluating it would amount to automatically applying eval : Merge(t1,t2) -> t1 & t2 with a potential merge function, which wouldn't erase the original top-level thunk Merge(t1,t2). This is in fact how default values are currently handled. Doing so, we don't need to restrict custom merge functions to record fields. This incurs an additional cost for merging, though (remembering all the original ASTs of merge expressions), even when one doesn't actually use custom merge functions.

We propose to directly implement the general solution (b).
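
Under solution (b), a merge expression is kept as a lazy Merge(t1,t2) node even inside thunks, so a later merge annotation distributes backward over it. A sketch of the intended behavior, reusing add from above:

```nickel
let add = fun args => args.lower + args.higher in
let x = {a = 1} & {a = 1} in
# Even if `x` has been forced in between, its thunk still holds the original
# Merge({a = 1}, {a = 1}) node, so the custom function applies to all three values:
x & {a | merge add = 1}
# => {a = add (add 1 1) 1} = {a = 3}
```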

## Operational semantics

This section defines the operational semantics of merging with the choices previously described.

The choices made in the custom merge functions section, in particular symmetrizing merge with respect to custom merge function annotations, require the evaluation process to:

  1. Determine the abstract syntax tree (AST) of a merge expression
  2. Decide the merge function to use
  3. Interpret the tree with the given function

Let us define the notion of a merge tree. The definition determines the area of influence of a merge function annotation, answering the following questions:

```nickel
let add = fun x y => x + y in

let var = 1 & 1 in
var & (1 | merge add) # result?
(var | merge add) & 1 # result?

((1 & 1) + (1 & 1)) & (1 | merge add) # result?
((1 & 1) + (1 & 1) | merge add) & 1 # result?

# file: somefile.ncl
1 & 1
# file: other.ncl
(import "somefile") & (1 | merge add) # result?
(import "somefile" | merge add) & 1 # result?
```

In the following, we will write "meta"-code (think of the code of the Nickel interpreter) in ML pseudo-code. To distinguish this meta-code from Nickel expressions, we use the quote syntax [| exp |] to denote a Nickel expression. For example, hasAnnot [| f (1 + 1) |] = false means that the value of the hasAnnot (meta-)function on the Nickel expression f (1 + 1) is false.

### Merge tree

A merge tree is a binary tree whose nodes are labelled with metadata. Leaves are labelled with both metadata and a Nickel expression e in WHNF (weak head normal form, that is, evaluated). In this context, a metadata is a (meta-)record whose field names correspond to meta attributes (merge, priority, default, contracts, and so on). Fields are optional, except priority, which must always be defined.

```
data Metadata = Metadata {
  priority :: Priority,
  merge :: Option Expression,
  contracts :: Option (List Contract),
  // ...
}

data AbsMergeTree a =
  Merge a a
  | Exp Expression

data MergeTree = MergeTree (AbsMergeTree MergeTree, Metadata)
```

The function mergeTree: Expression -> MergeTree computes the merge tree of an expression. It needs to evaluate leaves to see if they themselves contain merge expressions, such that let x = a & b in x & c and a & b & c have the same merge tree: that is, merge trees commute with evaluation.

```
# we maintain an environment of bindings in `env`

metaData [| e | attr = val, ... |] ::=
  { attr = val, ... }                    if priority is set
  { priority = normal, attr = val, ...}  otherwise

mergeTree e ::= (absMergeTree e, metaData e)

absMergeTree [| e1 & e2 |] = Merge(mergeTree(e1), mergeTree(e2))

// whnf = weak head normal form, result of evaluation
absMergeTree [| whnf |] @ e = Exp(e)

// Should mergeTree cross import boundaries? Probably not
absMergeTree [| import path |] @ e = Exp(e)

// All other cases: weakly evaluate, then take the merge tree of the result
absMergeTree e = absMergeTree (weakEval e)

// weakEval is defined exactly as standard evaluation, except that it stops at
// merge expressions, as if they were a lazy datatype in weak head normal form

weakEval [| e1 & e2 |] @ e = e

// all other cases are defined exactly as for eval
weakEval e = ...
```

### Semantics

Now that we have defined merge trees, we need to extract a potential merge function from them:

```
type MergeFunction =
  {value: Dyn, priority: Priority} ->
  {value: Dyn, priority: Priority} -> Dyn

extractMergeFuns : MergeTree -> List MergeFunction

extractMergeFuns (Merge(t1, t2), _meta) = extractMergeFuns t1 @ extractMergeFuns t2
extractMergeFuns (Exp(e), meta) = [f] if meta.merge == Some([| f |])
                                  []  otherwise

mergeFunction : MergeTree -> Result MergeFunction ()
mergeFunction t = let funs = extractMergeFuns t in
  if lists.length funs == 0 then
    Ok(__builtinMerge) // the standard `&` merge function
  else if lists.length funs == 1 then
    Ok(head funs)
  else
    Err(())
```

And the interpretation of a merge tree by a merge function:

```
// can be extended, as long as it is a partially ordered set
Priority = Number | -inf | +inf | ...

interpret (Exp(e), _meta) f = e
interpret (Merge((ast_1, meta_1), (ast_2, meta_2)), _meta) f =
    f {value = interpret (ast_i, meta_i) f, priority = meta_i.priority}
      {value = interpret (ast_j, meta_j) f, priority = meta_j.priority}
    where
      i, j s.t. meta_i.priority <= meta_j.priority
```

We can finally define the evaluation of merge:

```
eval [| e1 & e2 |] =
  let t = mergeTree [| e1 & e2 |] in
  match mergeFunction t with
  | Ok(f) => interpret t f
  | Err(()) => error // more than one custom merge function in the tree
```

Please keep in mind that mergeTree (e1 & e2) refers to the original merge tree of this expression: it needs to remain accessible to the implementation even after evaluation.

In practice, we can't update a thunk that contains e1 & e2 with the result of the evaluation. This is already the case with default values currently: default (1 + 1) isn't updated to 2, but to default 2, otherwise the semantics would change. A good view on this is that the semantics is inherently call-by-name, and that any caching mechanism (including call-by-need) is a practical semantics-preserving optimization.
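
For instance (a sketch of the intended behavior):

```nickel
let x = (1 + 1) | default in
# The thunk of `x` may be updated to `2 | default`, but never to a bare `2`:
# the `default` annotation must survive evaluation for overriding to work.
{a = x} & {a = 3}
# => {a = 3}
```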