Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polymorphic contracts and contract propagation seem fundamentally incompatible #1228

Closed
yannham opened this issue Apr 4, 2023 · 4 comments · Fixed by #1272
Closed

Polymorphic contracts and contract propagation seem fundamentally incompatible #1228

yannham opened this issue Apr 4, 2023 · 4 comments · Fixed by #1272
Labels

Comments

@yannham
Copy link
Member

yannham commented Apr 4, 2023

#1141 made dictionary contracts (or, should I say, dictionary type, as we will see that the difference matters) behave more lazily, and be overriding friendly. The gist is that {foo = ..., bar = ..., baz = ...} | {_ : T} is now strictly equivalent to write {foo | T = ..., bar | T = ..., baz | T = ...}. This unveiled a subtle but important question about the interaction between polymorphic contracts and the fact merging propagates contracts.

Propagation

You can read more into RFC005, but propagation in Nickel means that if I merge {foo | Number | default = 5} with {foo = "a"}, the contract will propagate to "a" and blame:

nickel> {foo | Number | default = 5} & {foo = "a"}
error: contract broken by a value
  ┌─ repl-input-3:1:39
  │
1 │ {foo | Number | default = 5} & {foo = "a"}
  │        ------                         ^^^ applied to this expression
  │        │                               
  │        expected type
  │
  ┌─ <unknown> (generated by evaluation):1:1
  │
1 │ "a"
  │ --- evaluated to this value

This is inspired from CUE and NixOS, in the spirit that merging can only refine the values and strengthen the contraints (in practice, the contracts).

Polymorphic contracts

Nickel implements polymorphic contracts which enforce parametricity. This is done thanks to a technique called sealing. Anything which enters a polymorphic function as a generic value is sealed, and this value is only unsealed when exiting the function, ensuring that a polymorphic function can't inspect a generic argument: it can only pass it around to other functions (with a matching type) or return it.

The issue

Here is the problematic case which appeared after #1141. Recall the type signature of record.insert:

insert : forall a. String -> a -> {_: a} -> {_: a}

Now, let's run the following example on current master:

nickel> let singleton_foo = fun value => record.insert "foo" value {}
in (singleton_foo {sub_left = 1}) & {foo.sub_right = 2} 
error: contract broken by a function
   ┌─ <stdlib/record.ncl>:65:29
   │
65 │       : forall a. String -> a -> { _ : a } -> { _ : a }
   │                             - expected type of an argument of an inner call
   │
   ┌─ <unknown> (generated by evaluation):1:1
   │
 1 │ x
   │ - evaluated to this value
   │
   = This error may happen in the following situation:
   [...]

This program looks totally valid (and it is), yet it blames record.insert for breaking the polymorphic contract. Here is what happens, when calling to record.insert:

  1. As a polymorphic argument, {sub_left = 1} is sealed, becoming seal key_a {sub_left = 1}
  2. {_: a} is applied to {}, to no effect
  3. The call to the %record_insert% primop produces {foo = seal key_a {sub_left = 1}}
  4. The outer dictionary contract is applied (this time corresponding to an unseal operation), giving {foo | Unseal key_a = seal key_a {sub_left = 1}}.

So far, so good, and if we stopped here, we would be fine. However, because of the new semantics of dictionary contract and propagation, when we merge with {foo = {sub_right = 2}}, the contract attached to foo is lazily propagated, giving:

{foo | Unseal key_a = (seal key_a {sub_left = 1}) & {sub_right = 2}}

Now, when we try to evaluate this, the interpreter has to force seal key_a {sub_left = 1} and complains that we are trying to use a sealed value. Intuitively, we would like to get something like {foo = (unseal key_a (seal key_a {sub_left = 1})) & {sub_right = 2}} instead.

Has this anything to do with dictionaries?

In fact, this issue hasn't much to do with dictionary contracts per se. It looks like we could reproduce this issue on records with a statically-known shape:

nickel>
let set_foo : forall a. a -> {foo : a} =
  fun x => {foo = x}
in
  (set_foo {sub_left = 1}) & {foo.sub_right = 2}

{ foo = { sub_left = 1, sub_right = 2 } }

nickel>

Ugh. It doesn't complain when using record types. The reason is that, although it might mostly be incidental, or even accidental, record types - once transformed as contracts - behave in a different way as record contracts (if we used | instead of :). The checks are more eager (with respect to missing/extra record fields), they have to handle the polymorphic tail, and they basically record.map the contracts onto the field value, instead of using the mechanism of storing lazy contracts at the position of fields. In particular, the field contracts derived from a record type don't propagate. Which turns out to be the right thing in this specific case.

Using record contracts

Ok, so let's transform the example to use record contracts instead:

nickel> 
let set_foo : forall a. a -> {foo | a} =
  fun x => {foo = x}
in
  (set_foo {sub_left = 1}) & {foo.sub_right = 2}
error: unbound identifier
  ┌─ repl-input-6:2:37
  │
2 │ let set_foo : forall a. a -> {foo | a} =
  │                                     ^ this identifier is unbound

Oh. When implementing RFC002, we made the choice to scope type variables only through static types. That is, in a term forall a. body, all the direct descendant of body in the AST (body included) which are syntactically static types (as defined by RFC002) will have a as a type variable in scope. However, this stops at any node that is not a static type. {foo | T} isn't a static type, because of the use of |. So the variable doesn't cross the boundary.

Thus, currently, the parser syntactically rejects the use of a record contract with free type variables inside, which appears to prevent problematic example to even be parsed in the first place.

So, all in all, we can't trigger the issue with records only. This is due to a bunch of reason, but mostly that record types and record contracts have a different semantics and capabilities, although part of it might be a bit accidental. Record contracts are lazy and propagate but they can't embed any polymorphic contract (well, they can have a forall x . ... inside, but that's a not a problem: the issue is with free type variable descending from a polymorphic contract). Record types, on the other hand, can include polymorphic subcontracts, but they don't propagate. This explains why we had to wait for the change of semantics of the dictionary type, which somehow combines the two (it allows polymorphic contracts, because it is first and foremost a static type, but its implementation is lazy and it's getting propagated), to unveil the interaction.

Solutions

The heart of the problem is not to be minimized. It really seems like polymorphic contracts are fundamentally incompatible with lazy propagation: propagation ends up applying sealing and unsealing contracts to terms that had nothing to do with the polymorphic call, and it feels like type variables somehow escape their scope, in some sense. It's clearly incorrect, but what to do is not totally clear. Here are some leads:

The radical one: annihilation of polymorphic contracts

A radical solution is to get rid of polymorphic sealing and unsealing altogether. It's a complex part of the contract system, has been interacting in non trivial ways with recursive overriding as well (see e.g. #1175), so it's not clear if they're paying for their complexity. This makes contracts check less things, but would allow to unify type and contracts (at least at runtime) in an indistinguishable way.

Endorse the difference between static types and contracts

Another point of view is that the (seemingly partly accidental) distinction between contracts derived from static types and pure record contracts is not that absurd in the end. This distinction might correspond to two different usages of validation, in the same way as since RFC005, {foo | Number = 5} and {foo = (5 | Number)} don't have the same semantics anymore: the latter is not a metadata, but a mere contract annotation, and doesn't propagate through merging.

For example, would it be very intuitive that calling a function like swap_foo_bar_num : {bar : Number, foo : Number} -> {bar : Number, foo : Number} would suddenly forbid subsequent overriding of foo or bar with anything else than a number? This seems to be a local contract, concerned with this specific function call.

So, maybe the idea that static types correspond to local checks, that don't propagate/contaminate out of their scope, while record contracts correspond to metadata, which do, isn't so silly. In that case, one potential solution to our problem would be to introduce a dictionary contract variant {_ | T}, which would behave as the current dictionary type operationally, would propagate, but would also not be able to be used with free type variables. The static variant {_ : T} would then be reverted to its previous semantics, which is to mostly map a contract onto a record, eagerly. Note that the issue of static record types or dictionary types not preserving the recursivity of records is a real issue (one of the original motivation of the change), but is orthogonal: we can very much preserve recursivity while preventing the descendant contracts from propagating.

@vkleen
Copy link
Member

vkleen commented Apr 5, 2023

Regarding our way forward, I would like to see a dictionary contract syntax {_ | T} anyway. I've had informal discussions with people unfamiliar with Nickel and writing record contracts with |, but needing to write dictionary contracts as {_ : T} even outside a statically typed context was confusing to them.

Once we have this separate syntax it would be a relatively small change to revert the contract conversion of {_ : T} to before #1141 and keep the new lazy semantics for {_ | T}. So, I guess, this is just a long winded way of saying I prefer the second of your two alternatives 😅

@yannham
Copy link
Member Author

yannham commented Apr 6, 2023

I have another question now: let's say we apply the second proposal. How should behave a contract like {foo : Number, bar | String}? Should the number contract of Foo propagate or not? In fact I've always found it dubious to use a type annotation on something that doesn't have a definition (outside of writing a record type, of course), or mixing : and | . Hence I would err on the side of the this doesn't have any effect, and issue a warning, or even an error on parsing.

@vkleen
Copy link
Member

vkleen commented Apr 6, 2023

Yes, {foo : Number, bar | String} doesn't seem like a sensible thing to write in my opinion as well. I would be in favor of making that syntactically invalid, say by requiring a type annotation in a record to always be part of a definition with =?

@yannham
Copy link
Member Author

yannham commented Apr 19, 2023

Update: for now, the decision was made to go with the approach explained above. That is, there is a distinction between type and contracts, which is that contracts propagate, will types don't. We will also add a variant of the dictionary contract {_ | Type}, which will correspond to the current dictionary contract, and the dictionary type {_ : Type} will correspond to mapping type annotations : onto the values, which won't propagate.

vkleen added a commit that referenced this issue Sep 22, 2023
Because of #1228 we syntactically forbid contracts from containing free
type variables. This was implemented in #1271 and #1272. As a result,
the double contract application in #1194 became dead code. This PR
removes the `%dualize%` primop, the `dualize` field in `Label` and the
double application logic when constructing recursive environments for
records.
github-merge-queue bot pushed a commit that referenced this issue Sep 22, 2023
Because of #1228 we syntactically forbid contracts from containing free
type variables. This was implemented in #1271 and #1272. As a result,
the double contract application in #1194 became dead code. This PR
removes the `%dualize%` primop, the `dualize` field in `Label` and the
double application logic when constructing recursive environments for
records.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants